Bug #21141
Status: closed
rgw_file: incorrect lane lock behavior in evict_block()
% Done: 0%
Source:
Tags:
Backport: jewel, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
On Aug. 24, Supriti Singh sent a link to the following crash report:
/usr/bin/ganesha.nfsd -f /etc/ganesha/ganesha.conf -F
*** Caught signal (Segmentation fault) **
in thread 7f74437fe700 thread_name:ganesha.nfsd
ceph version v12.1.2-550-g59afe41c49 (59afe41c491258cd4f9d215e53b86b0c2c62b138) luminous (rc)
1: (()+0x44c529) [0x7f7537e92529]
2: (()+0x10b10) [0x7f753b2e2b10]
3: (()+0x124a0) [0x7f753b2e44a0]
4: (cohort::lru::LRU<std::mutex>::evict_block()+0x42) [0x7f7537bdc642]
5: (rgw::RGWLibFS::lookup_fh(rgw::RGWFileHandle*, char const*, unsigned int)+0x37c) [0x7f7537be2fbc]
6: (rgw::RGWLibFS::stat_leaf(rgw::RGWFileHandle*, char const*, rgw_fh_type, unsigned int)+0x9b5) [0x7f7537bcb3b5]
7: (rgw_lookup()+0xb1) [0x7f7537bcb9d1]
8: (()+0x48bc) [0x7f753819f8bc]
9: (()+0x4a43) [0x7f753819fa43]
10: (rgw::RGWReaddirRequest::send_response()+0x371) [0x7f7537be3a11]
11: (rgw::RGWLibProcess::process_request(rgw::RGWLibRequest*, rgw::RGWLibIO*)+0x4a6) [0x7f7537bbe066]
12: (rgw::RGWLibProcess::process_request(rgw::RGWLibRequest*)+0xed) [0x7f7537bbef2d]
13: (rgw::RGWFileHandle::readdir(bool (*)(char const*, void*, unsigned long, unsigned int), void*, unsigned long*, bool*, unsigned int)+0x5ef) [0x7f7537bd016f]
14: (()+0x4bf3) [0x7f753819fbf3]
15: (mdcache_populate_dir_chunk()+0x24b) [0x54b182]
16: (mdcache_readdir_chunked()+0x754) [0x54bc2e]
17: /usr/bin/ganesha.nfsd() [0x53c4ca]
18: (fsal_readdir()+0x28b) [0x436311]
19: (nfs4_op_readdir()+0x55a) [0x47e0a4]
20: (nfs4_Compound()+0xa1e) [0x462629]
21: (nfs_rpc_execute()+0x1d53) [0x44f1c6]
22: /usr/bin/ganesha.nfsd() [0x44f9d0]
23: /usr/bin/ganesha.nfsd() [0x50bac7]
24: (()+0x8744) [0x7f753b2da744]
25: (clone()+0x6d) [0x7f753abc3aad]
2017-08-18 10:27:39.615573 7f74437fe700 -1 *** Caught signal (Segmentation fault) **
in thread 7f74437fe700 thread_name:ganesha.nfsd
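The root cause appears to be missing logic to take the lane lock at line 137 of cohort_lru.h. The unlock paths are intact, so this thread could unlock a lane lock it never acquired, possibly one held on another path. For a std::mutex, unlocking a mutex the calling thread does not own is undefined behavior.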
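To illustrate the lock pairing described above, here is a minimal sketch of a lane scan in which every release is tied to an acquisition by the same thread. It is not the code in cohort_lru.h: `Entry`, `Lane`, and `evict_one` are hypothetical, simplified stand-ins, and the reclaim test (`refcnt == 1`) is an assumption made for the example.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical, simplified stand-ins; not the types used in cohort_lru.h.
struct Entry {
  int id;
  int refcnt;  // assumed: 1 means only the cache holds a reference
};

struct Lane {
  std::mutex lock;
  std::deque<Entry> q;  // back() is the least recently used entry
};

// Scan the lanes for an evictable entry. The guard locks the lane before
// lane.q is inspected and unlocks it on every exit path from the loop body,
// so an unlock can never run without a matching lock on this thread.
bool evict_one(std::vector<Lane>& lanes, Entry& out) {
  for (Lane& lane : lanes) {
    std::lock_guard<std::mutex> guard(lane.lock);
    if (!lane.q.empty() && lane.q.back().refcnt == 1) {
      out = lane.q.back();
      lane.q.pop_back();
      return true;  // guard releases the lane lock here
    }
    // guard releases the lane lock at the end of each iteration
  }
  return false;
}
```

With manual `lock()`/`unlock()` calls, omitting the `lock()` while keeping the `unlock()` calls compiles cleanly and fails only at run time; a scoped guard makes that mismatch impossible to write. Code that must drop and retake the lock mid-scan could use `std::unique_lock` instead.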