Bug #42680

open

crash in thread 7f6a445ee700 thread_name:devicehealth

Added by Tomasz Torcz over 4 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

It seems ceph-mgr crashes in the devicehealth thread. I think it may be related to scraping.

   -18> 2019-11-07 12:42:25.884 7f6a47df5700  4 mgr.server handle_report from 0x555590146900 mon,sandrider                                                 
   -17> 2019-11-07 12:42:25.885 7f6a47df5700  4 mgr.server maybe_ready initial report from osd 5                                                           
   -16> 2019-11-07 12:42:25.886 7f6a47df5700  4 mgr.server maybe_ready still waiting for 4 osds to report in before PGMap is ready                         
   -15> 2019-11-07 12:42:25.886 7f6a47df5700  4 mgr.server handle_report from 0x555590018000 osd,4                                                         
   -14> 2019-11-07 12:42:25.887 7f6a47df5700  4 mgr.server handle_report from 0x555590149600 osd,3                                                         
   -13> 2019-11-07 12:42:25.888 7f6a47df5700  4 mgr.server handle_report from 0x555590681b00 osd,0                                                         
   -12> 2019-11-07 12:42:25.889 7f6a47df5700  4 mgr.server handle_report from 0x555590681680 osd,1                                                         
   -11> 2019-11-07 12:42:25.893 7f6a47df5700  4 mgr.server maybe_ready initial report from osd 4                                                           
   -10> 2019-11-07 12:42:25.893 7f6a47df5700  4 mgr.server maybe_ready still waiting for 3 osds to report in before PGMap is ready                         
    -9> 2019-11-07 12:42:25.893 7f6a47df5700  4 mgr.server maybe_ready initial report from osd 3                                                           
    -8> 2019-11-07 12:42:25.893 7f6a47df5700  4 mgr.server maybe_ready still waiting for 2 osds to report in before PGMap is ready                         
    -7> 2019-11-07 12:42:25.894 7f6a47df5700  4 mgr.server maybe_ready initial report from osd 0                                                           
    -6> 2019-11-07 12:42:25.894 7f6a47df5700  4 mgr.server maybe_ready still waiting for 1 osds to report in before PGMap is ready                         
    -5> 2019-11-07 12:42:25.894 7f6a47df5700  4 mgr.server maybe_ready initial report from osd 1                                                           
    -4> 2019-11-07 12:42:25.894 7f6a47df5700  4 mgr.server maybe_ready all osds have reported, sending PG state to mon                                     
    -3> 2019-11-07 12:42:25.895 7f6a47df5700  0 log_channel(cluster) log [DBG] : pgmap v2: 225 pgs: 225 active+clean; 90 GiB data, 144 GiB used, 1.6 TiB / 1.7 TiB avail
    -2> 2019-11-07 12:42:25.895 7f6a47df5700 10 monclient: _send_mon_message to mon.naib at v2:[2001:470:71:68d:e3f5:39b2:1578:f7ae]:3300/0                
    -1> 2019-11-07 12:42:25.899 7f6a33c97700  3 client.4425728 may_lookup 0x55558f990580 = 0                                                               
     0> 2019-11-07 12:42:26.262 7f6a445ee700 -1 *** Caught signal (Segmentation fault) **                                                                  
 in thread 7f6a445ee700 thread_name:devicehealth                             

 ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)                                                                          
 1: (()+0x14950) [0x7f6a5b80a950]
 2: (rados_write_op_omap_set()+0x164) [0x7f6a4f8c0074]                       
 3: (()+0xb8c21) [0x7f6a4fa8ac21]
 4: (()+0x17d842) [0x7f6a5bb93842]
 5: (PyVectorcall_Call()+0x70) [0x7f6a5bb42df0]                              
 6: (()+0x452d6) [0x7f6a4fa172d6]
 7: (()+0x8bbc9) [0x7f6a4fa5dbc9]
 8: (_PyObject_MakeTpCall()+0x230) [0x7f6a5bb3d440]                          
 9: (()+0xd187e) [0x7f6a5bae787e]                                            
 10: (_PyEval_EvalFrameDefault()+0x5199) [0x7f6a5bba64f9]                    
 11: (_PyFunction_Vectorcall()+0xfa) [0x7f6a5bb6ff0a]
 12: (_PyEval_EvalFrameDefault()+0x852) [0x7f6a5bba1bb2]
 13: (_PyFunction_Vectorcall()+0xfa) [0x7f6a5bb6ff0a]                                                                                                      
 14: (_PyEval_EvalFrameDefault()+0x852) [0x7f6a5bba1bb2]                                                                                                   
 15: (_PyFunction_Vectorcall()+0xfa) [0x7f6a5bb6ff0a]                                                                                                      
 16: (()+0x175eab) [0x7f6a5bb8beab]                                                                                                                        
 17: (()+0x127d14) [0x7f6a5bb3dd14]                                                                                                                        
 18: (()+0x1fc6cf) [0x7f6a5bc126cf]                                          
 19: (PyObject_CallMethod()+0xc0) [0x7f6a5bc1d770]                                                                                                         
 20: (PyModuleRunner::serve()+0x66) [0x55558a35da76]                                                                                                       
 21: (PyModuleRunner::PyModuleRunnerThread::entry()+0x1dd) [0x55558a35e31d]                                                                                
 22: (()+0x94e2) [0x7f6a5b7ff4e2]                                                                                                                          
 23: (clone()+0x43) [0x7f6a5b3ca623]                                                                                                                       
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.                                                               

--- logging levels ---                                                                                                                                     
   0/ 5 none                                                                                                                                               
   0/ 1 lockdep                                                                                                                                            
   0/ 1 context                                                                                                                                            
   1/ 1 crush                                                                                                                                              
   1/ 5 mds                                                                                                                                                
   1/ 5 mds_balancer                                                                                                                                       
   1/ 5 mds_locker                                                                                                                                         
   1/ 5 mds_log                                                                                                                                            
   1/ 5 mds_log_expire
   1/ 5 mds_migrator                                                                                                                                       
   0/ 1 buffer                                                                                                                                             
   0/ 1 timer                                                                                                                                              
   0/ 1 filer                                                                
   0/ 1 striper                       
   0/ 1 objecter                                                                                                                                           
   0/ 5 rados                    
   0/ 5 rbd                                                                  
   0/ 5 rbd_mirror               
   0/ 5 rbd_replay                
   0/ 5 journaler                                                            
   0/ 5 objectcacher             
   0/ 5 client                   
   9/ 9 osd                                                                  
   0/ 5 optracker                                                            
   0/ 5 objclass                                                             
   1/ 3 filestore    
   1/ 3 journal                                                                                                                                            
   0/ 0 ms                                                                                                                                                 
   9/ 9 mon                                                                                                                                                
   0/10 monc                                                                                                                                               
   1/ 5 paxos                                                                                                                                              
   0/ 5 tp                                                                                                                                                 
   1/ 5 auth                                                                 
   1/ 5 crypto                                                                                                                                             
   1/ 1 finisher                                                                                                                                           
   1/ 1 reserver                                                                                                                                           
   1/ 5 heartbeatmap                                                                                                                                       
   1/ 5 perfcounter                                                                                                                                        
   1/ 5 rgw                                                                                                                                                
   1/ 5 rgw_sync                                                                                                                                           
   1/10 civetweb                                                                                                                                           
   1/ 5 javaclient                                                                                                                                         
   1/ 5 asok                                                                                                                                               
   1/ 1 throttle                                                                                                                                           
   0/ 0 refs                                                                                                                                               
   1/ 5 xio                                                                                                                                                
   1/ 5 compressor                                                                                                                                         
   1/ 5 bluestore                                                                                                                                          
   1/ 5 bluefs                                                                                                                                             
   1/ 3 bdev          
   1/ 5 kstore                                                                                                                                             
   4/ 5 rocksdb                                                                                                                                            
   4/ 5 leveldb                                                                                                                                            
   4/ 5 memdb                                                                
   1/ 5 kinetic                       
   1/ 5 fuse                                                                                                                                               
   1/ 5 mgr                      
   1/ 5 mgrc                                                                 
   1/ 5 dpdk                     
   1/ 5 eventtrace                
  -2/-2 (syslog threshold)                                                   
  -1/-1 (stderr threshold)       
  max_recent     10000           
  max_new         1000                                                       
  log_file /var/log/ceph/ceph-mgr.naib.log                                   
--- end dump of recent events ---  

This is on Fedora Rawhide, package version ceph-mgr-14.2.4-1.fc32.x86_64. The cluster has 6 OSDs on SATA HDDs (no SSD, no NVMe).
This may be related to Bug #42578.
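
For triage, it can help to see which backtrace frames carry a resolved symbol. The following is a minimal sketch, not part of Ceph: the names `FRAME_RE`, `parse_frames` and `first_named_frame` are hypothetical, and the regular expression assumes frames are formatted exactly like the ones in the dump above.

```python
import re

# Assumed frame format, taken from the dump in this report, e.g.
#   " 2: (rados_write_op_omap_set()+0x164) [0x7f6a4f8c0074]"
# Unresolved frames have an empty symbol: " 1: (()+0x14950) [0x7f6a5b80a950]"
FRAME_RE = re.compile(
    r"^\s*(\d+): \((.*)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]\s*$"
)


def parse_frames(text):
    """Return one dict per backtrace frame found in text (hypothetical helper)."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match is None:
            continue  # not a frame line (log entry, note, blank line)
        index, symbol, offset, address = match.groups()
        # Drop the trailing "()"; a bare "()" means the symbol was not resolved.
        name = symbol[:-2] if symbol.endswith("()") else symbol
        frames.append({
            "index": int(index),
            "symbol": name or None,
            "offset": int(offset, 16),
            "address": int(address, 16),
        })
    return frames


def first_named_frame(frames):
    """Return the innermost frame that has a symbol name, or None."""
    for frame in frames:
        if frame["symbol"] is not None:
            return frame
    return None


# A few frames copied from the backtrace in this report.
SAMPLE = """\
 1: (()+0x14950) [0x7f6a5b80a950]
 2: (rados_write_op_omap_set()+0x164) [0x7f6a4f8c0074]
 3: (()+0xb8c21) [0x7f6a4fa8ac21]
 20: (PyModuleRunner::serve()+0x66) [0x55558a35da76]
"""

if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    named = first_named_frame(frames)
    print(named["index"], named["symbol"])
```

Run against the full dump, this points at `rados_write_op_omap_set` as the innermost named frame. The frames without a name still need the executable or `objdump` output to interpret, as the NOTE in the dump says.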


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #42082: pybind/rados: set_omap() crash on py3 (Resolved, Sage Weil, 09/27/2019)

Actions #1

Updated by Brad Hubbard over 4 years ago

I think this is more likely to be a duplicate of https://tracker.ceph.com/issues/42082

Actions #2

Updated by Brad Hubbard over 4 years ago

  • Related to Bug #42082: pybind/rados: set_omap() crash on py3 added