Bug #21141

rgw_file: incorrect lane lock behavior in evict_block()

Added by Matt Benjamin about 2 months ago. Updated about 1 month ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 08/25/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: jewel luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No

Description

On Aug. 24, Supriti Singh sent a link to:

/usr/bin/ganesha.nfsd -f /etc/ganesha/ganesha.conf -F
  *** Caught signal (Segmentation fault) **
    in thread 7f74437fe700 thread_name:ganesha.nfsd
    ceph version v12.1.2-550-g59afe41c49 (59afe41c491258cd4f9d215e53b86b0c2c62b138) luminous (rc)
    1: (()+0x44c529) [0x7f7537e92529]
    2: (()+0x10b10) [0x7f753b2e2b10]
    3: (()+0x124a0) [0x7f753b2e44a0]
    4: (cohort::lru::LRU<std::mutex>::evict_block()+0x42) [0x7f7537bdc642]
    5: (rgw::RGWLibFS::lookup_fh(rgw::RGWFileHandle*, char const*, unsigned int)+0x37c) [0x7f7537be2fbc]
    6: (rgw::RGWLibFS::stat_leaf(rgw::RGWFileHandle*, char const*, rgw_fh_type, unsigned int)+0x9b5) [0x7f7537bcb3b5]
    7: (rgw_lookup()+0xb1) [0x7f7537bcb9d1]
    8: (()+0x48bc) [0x7f753819f8bc]
    9: (()+0x4a43) [0x7f753819fa43]
    10: (rgw::RGWReaddirRequest::send_response()+0x371) [0x7f7537be3a11]
    11: (rgw::RGWLibProcess::process_request(rgw::RGWLibRequest*, rgw::RGWLibIO*)+0x4a6) [0x7f7537bbe066]
    12: (rgw::RGWLibProcess::process_request(rgw::RGWLibRequest*)+0xed) [0x7f7537bbef2d]
    13: (rgw::RGWFileHandle::readdir(bool (*)(char const*, void*, unsigned long, unsigned int), void*, unsigned long*, bool*, unsigned int)+0x5ef) [0x7f7537bd016f]
    14: (()+0x4bf3) [0x7f753819fbf3]
    15: (mdcache_populate_dir_chunk()+0x24b) [0x54b182]
    16: (mdcache_readdir_chunked()+0x754) [0x54bc2e]
    17: /usr/bin/ganesha.nfsd() [0x53c4ca]
    18: (fsal_readdir()+0x28b) [0x436311]
    19: (nfs4_op_readdir()+0x55a) [0x47e0a4]
    20: (nfs4_Compound()+0xa1e) [0x462629]
    21: (nfs_rpc_execute()+0x1d53) [0x44f1c6]
    22: /usr/bin/ganesha.nfsd() [0x44f9d0]
    23: /usr/bin/ganesha.nfsd() [0x50bac7]
    24: (()+0x8744) [0x7f753b2da744]
    25: (clone()+0x6d) [0x7f753abc3aad]
    2017-08-18 10:27:39.615573 7f74437fe700 -1 *** Caught signal (Segmentation fault) **
    in thread 7f74437fe700 thread_name:ganesha.nfsd

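The root cause appears to be missing logic to take the lane lock at cohort_lru.h, line 137. The unlock paths are intact, so this thread could unlock a lock that it never acquired and that is held on another path. What happens then depends on the implementation; for std::mutex, unlocking a mutex the calling thread does not own is undefined behavior.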
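
To illustrate the lock pairing described above, here is a minimal sketch of a lane-scanning eviction loop in which the lane lock is taken before the queue is examined, so that every unlock path releases a lock this thread actually holds. It is not the code from cohort_lru.h: the names Lane, kLanes, and evict_block_sketch, the int payload, and the lane count are all hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>

// Hypothetical stand-in for one LRU lane: a lock plus a queue whose
// back() is the least recently used entry.
struct Lane {
    std::mutex lock;
    std::list<int> q;
};

constexpr std::size_t kLanes = 4;  // assumed lane count, illustration only

// Scan the lanes round-robin starting at `start` and evict one LRU entry.
// Returns true and stores the entry in `victim`, or returns false if every
// lane is empty. The lane lock is acquired before the queue is touched, so
// each unlock below is paired with a lock owned by this thread.
bool evict_block_sketch(std::array<Lane, kLanes>& lanes,
                        std::size_t start, int& victim) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        Lane& lane = lanes[(start + i) % kLanes];
        // The step the report describes as missing: take the lane lock.
        std::unique_lock<std::mutex> guard(lane.lock);
        if (lane.q.empty()) {
            continue;  // guard's destructor releases the lock
        }
        victim = lane.q.back();
        lane.q.pop_back();
        assert(guard.owns_lock());
        guard.unlock();  // early unlock is safe because we own the lock
        return true;
    }
    return false;
}
```

Without the lock acquisition at the top of the loop body, the explicit unlock on the eviction path would release a mutex that this thread does not own, which is the failure mode the report describes.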


Related issues

Copied to rgw - Backport #21185: luminous: rgw_file: incorrect lane lock behavior in evict_block() Resolved
Copied to rgw - Backport #21186: jewel: rgw_file: incorrect lane lock behavior in evict_block() Resolved

History

#2 Updated by Matt Benjamin about 2 months ago

  • Status changed from Need Review to Pending Backport

#3 Updated by Nathan Cutler about 2 months ago

  • Copied to Backport #21185: luminous: rgw_file: incorrect lane lock behavior in evict_block() added

#4 Updated by Nathan Cutler about 2 months ago

  • Copied to Backport #21186: jewel: rgw_file: incorrect lane lock behavior in evict_block() added

#5 Updated by Nathan Cutler about 1 month ago

  • Status changed from Pending Backport to Resolved
