Bug #20657

closed

apparent mgr deadlock in handle_report

Added by Sage Weil almost 7 years ago. Updated over 6 years ago.

Status: Can't reproduce
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2017-07-17 17:58:00.483541 7f5065ffe700  1 -- 172.21.15.94:6800/69755 <== mon.? 172.21.15.94:0/71232 10 ==== mgrreport(mon.a +0-0 packed 1734) v4 ==== 1756+0+0 (108168292 0 0) 0x5572db936400 con 0x5572dbba74e0
2017-07-17 17:58:00.483575 7f5065ffe700  4 mgr.server handle_report from 0x5572dbba74e0 mon,a
2017-07-17 17:58:00.483581 7f5065ffe700 20 mgr.server handle_report updating existing DaemonState for mon,a
2017-07-17 17:58:00.483584 7f5065ffe700 20 mgr update loading 0 new types, 0 old types, had 592 types, got 1734 bytes of data
2017-07-17 17:58:00.550258 7f5065ffe700 10 mgr.server ms_handle_reset unregistering osd.1  session 0x5572dafcdc00 con 0x5572db9baa20
2017-07-17 17:58:00.550412 7f5065ffe700 10 mgr.server ms_handle_reset unregistering osd.2  session 0x5572dafcdf80 con 0x5572db9bb320
2017-07-17 17:58:00.550536 7f5065ffe700 10 mgr.server ms_handle_reset unregistering osd.0  session 0x5572dafcd5e0 con 0x5572db2ff200
2017-07-17 17:58:01.516782 7f506a107700  1 -- 172.21.15.94:0/2148420327 mark_down 0x5572dbba4a20 -- 0x5572dbb09400
2017-07-17 17:58:01.516954 7f506a107700  1 -- 172.21.15.94:0/2148420327 --> 172.21.15.94:6790/0 -- auth(proto 0 26 bytes epoch 3) v1 -- ?+0 0x5572dbb0c000 con 0x5572db9bb320
2017-07-17 17:58:01.516975 7f5069105700  0 -- 172.21.15.94:0/2148420327 >> 172.21.15.94:6790/0 pipe(0x5572dba3c000 sd=8 :0 s=1 pgs=0 cs=0 l=0 c=0x5572db9bb320).fault
2017-07-17 17:58:02.268131 7f5068002700 10 mgr tick tick
2017-07-17 17:58:02.268143 7f5068002700  1 mgr send_beacon active
2017-07-17 17:58:02.268222 7f5068002700 10 mgr send_beacon sending beacon as gid 4098 modules dashboard,restful,status,zabbix
2017-07-17 17:58:02.268234 7f5068002700 10 mgr tick 
2017-07-17 17:58:02.268237 7f5068002700 10 mgr update_delta_stats  v61

But thread 7f5065ffe700 does not come back after this. The test's pg dump shortly afterwards times out, and the test fails.

/a/sage-2017-07-17_17:34:45-rados-wip-sage-testing-distro-basic-smithi/1408489
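
The "thread does not come back" observation can be checked mechanically on a larger log. The following is a minimal triage sketch, not part of Ceph: it assumes each log line has the layout seen in the excerpt above (timestamp, hex thread id, debug level, message), and the helper names and the 1-second threshold are made up for illustration. A thread going quiet is only a hint of a stall, not proof of a deadlock.

```python
import re
from datetime import datetime

# Assumed line layout, inferred from the excerpt in this report:
# "<date> <time.micros> <thread id (hex)> <debug level> <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\s+([0-9a-f]+)\s+(-?\d+)\s+(.*)$"
)


def last_seen(lines):
    """Map each thread id to the (timestamp, message) of its last log line."""
    seen = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that does not look like a log line
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        seen[m.group(2)] = (stamp, m.group(4))
    return seen


def silent_threads(lines, min_gap_seconds=1.0):
    """Return thread ids whose last line is at least min_gap_seconds
    older than the newest line in the log (hypothetical threshold)."""
    seen = last_seen(lines)
    if not seen:
        return []
    end = max(stamp for stamp, _ in seen.values())
    return sorted(
        tid
        for tid, (stamp, _) in seen.items()
        if (end - stamp).total_seconds() >= min_gap_seconds
    )


# A few lines copied from the excerpt above.
SAMPLE = [
    "2017-07-17 17:58:00.483575 7f5065ffe700  4 mgr.server handle_report from 0x5572dbba74e0 mon,a",
    "2017-07-17 17:58:00.550536 7f5065ffe700 10 mgr.server ms_handle_reset unregistering osd.0  session 0x5572dafcd5e0 con 0x5572db2ff200",
    "2017-07-17 17:58:01.516782 7f506a107700  1 -- 172.21.15.94:0/2148420327 mark_down 0x5572dbba4a20 -- 0x5572dbb09400",
    "2017-07-17 17:58:02.268131 7f5068002700 10 mgr tick tick",
    "2017-07-17 17:58:02.268237 7f5068002700 10 mgr update_delta_stats  v61",
]

if __name__ == "__main__":
    for tid in silent_threads(SAMPLE):
        stamp, message = last_seen(SAMPLE)[tid]
        print(f"{tid} last logged at {stamp}: {message}")
```

On these sample lines only 7f5065ffe700 is flagged, with its last message being the ms_handle_reset for osd.0. On a full mgr log the threshold would need tuning, since idle worker threads also go quiet.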

Actions #1

Updated by Sage Weil over 6 years ago

  • Status changed from 12 to Can't reproduce

I suspect this was fixed by commit f236b5e78338f85a0ac82440ae44cdf4db83a04f.
