


Backport #13512

Updated by Nathan Cutler about 7 years ago

 This is happening at startup in a small minority of test runs. 


 The ceph-osd daemons are starting, their logs are happily spinning away, but they're not getting as far as sending their boot messages to the mon. 

 I caught one in the act and tried to attach a debugger, but gdb hung. I then tried running a fresh OSD process under a debugger, and that hung on Ctrl-C. 

 I happened to notice that the host mira106 had some dead krbd volumes (presumably from some other test, see #13510). 

 It seems highly likely that the OSD process is hanging inside get_device_by_uuid. For some reason the heartbeat map doesn't care that this thread is hanging. 
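
 One possible reason a heartbeat check could miss a hung thread, sketched below as an illustration only: a deadline-based watchdog can only flag workers that armed a timeout before blocking. The HeartbeatMap class, its method names, and the worker names here are hypothetical and are not Ceph's actual implementation. Whether this is what happens in the OSD startup path is an assumption, not something confirmed in this ticket.

```python
import time


class HeartbeatMap:
    """Toy deadline-based watchdog (hypothetical; not Ceph's HeartbeatMap)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # worker name -> absolute deadline, or None if no timeout is armed
        self._deadlines = {}

    def add_worker(self, name):
        # A newly registered worker has no deadline until it arms one.
        self._deadlines[name] = None

    def reset_timeout(self, name, grace):
        # The worker promises to check in again within `grace` seconds.
        self._deadlines[name] = self._clock() + grace

    def clear_timeout(self, name):
        # The worker is idle; stop watching it.
        self._deadlines[name] = None

    def unhealthy(self):
        # Only workers with an armed, expired deadline are reported.
        now = self._clock()
        return sorted(
            name
            for name, deadline in self._deadlines.items()
            if deadline is not None and now > deadline
        )


def demo():
    now = [0.0]  # fake clock so the example is deterministic
    hb = HeartbeatMap(clock=lambda: now[0])
    hb.add_worker("op_thread")
    hb.add_worker("startup_thread")

    # op_thread arms a 15 s deadline, then stalls.
    hb.reset_timeout("op_thread", grace=15)
    # startup_thread blocks (say, in a device lookup) without arming anything.

    now[0] = 60.0  # one minute later, neither thread has made progress
    return hb.unhealthy()
```

 In this sketch both threads are stuck, but demo() reports only op_thread; startup_thread never armed a deadline, so the watchdog has nothing to compare against.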
