Bug #38490
openmds: multimds stuck
Status: New
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Sorry for the vague $subject; I'm not sure what's wrong yet.
2019-02-26 18:57:12.716 7f6806098700 1 -- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] --> [v2:172.21.15.36:6800/36393,v1:172.21.15.36:6801/36393] -- mgrreport(unknown.b +0-0 packed 1366) v7 -- 0x55810979c300 con 0x558109071000
2019-02-26 18:57:15.308 7f680b0a2700 1 -- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6838/3486520481,v1:172.21.15.145:6839/3486520481] conn(0x558108391000 msgr2=0x55810912a580 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 33
2019-02-26 18:57:15.308 7f680b0a2700 1 -- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6838/3486520481,v1:172.21.15.145:6839/3486520481] conn(0x558108391000 msgr2=0x55810912a580 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2019-02-26 18:57:15.308 7f680b0a2700 1 --2- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6838/3486520481,v1:172.21.15.145:6839/3486520481] conn(0x558108391000 0x55810912a580 crc :-1 s=READY pgs=11 cs=0 l=0 rx=0 tx=0).handle_read_frame_preamble_main read frame length and tag failed r=-1 ((1) Operation not permitted)
2019-02-26 18:57:15.308 7f680b0a2700 1 --2- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6838/3486520481,v1:172.21.15.145:6839/3486520481] conn(0x558108391000 0x55810912a580 unknown :-1 s=READY pgs=11 cs=0 l=0 rx=0 tx=0)._fault with nothing to send, going to standby
2019-02-26 18:57:15.358 7f680a0a0700 1 -- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6834/2671704713,v1:172.21.15.145:6835/2671704713] conn(0x558109170400 msgr2=0x558108f38580 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 38
2019-02-26 18:57:15.358 7f680a0a0700 1 -- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6834/2671704713,v1:172.21.15.145:6835/2671704713] conn(0x558109170400 msgr2=0x558108f38580 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2019-02-26 18:57:15.358 7f680a0a0700 1 --2- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6834/2671704713,v1:172.21.15.145:6835/2671704713] conn(0x558109170400 0x558108f38580 crc :-1 s=READY pgs=13 cs=0 l=0 rx=0 tx=0).handle_read_frame_preamble_main read frame length and tag failed r=-1 ((1) Operation not permitted)
2019-02-26 18:57:15.358 7f680a0a0700 1 --2- [v2:172.21.15.145:6836/1016281496,v1:172.21.15.145:6837/1016281496] >> [v2:172.21.15.145:6834/2671704713,v1:172.21.15.145:6835/2671704713] conn(0x558109170400 0x558108f38580 unknown :-1 s=READY pgs=13 cs=0 l=0 rx=0 tx=0)._fault with nothing to send, going to standby
From: /ceph/teuthology-archive/pdonnell-2019-02-26_07:49:50-multimds-wip-pdonnell-testing-20190226.051327-distro-basic-smithi/3641251/remote/smithi145/log/ceph-mds.b.log.gz
I suspect this may be some messenger2 issue.
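One way to check that suspicion might be to count how often msgr2 connections (the `--2-` lines) hit `_fault`, grouped by peer. The following is only a rough sketch: the regular expression is inferred from the log lines quoted in this report, and `FAULT_RE` and `count_msgr2_faults` are hypothetical names, not part of Ceph.

```python
import re
from collections import Counter

# Hypothetical pattern inferred only from the log excerpt in this report:
# "<date> <time> <thread> <level> --2- [local addrs] >> [peer addrs] conn(...)._fault <msg>"
FAULT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<thread>[0-9a-f]+)\s+\d+ --2- "
    r"\[(?P<local>[^\]]+)\] >> \[(?P<peer>[^\]]+)\] "
    r"conn\((?P<conn>0x[0-9a-f]+) .*\)\._fault (?P<msg>.*)$"
)


def count_msgr2_faults(lines):
    """Count msgr2 `_fault` log entries per peer address vector."""
    faults = Counter()
    for line in lines:
        match = FAULT_RE.match(line.strip())
        if match:
            faults[match.group("peer")] += 1
    return faults
```

Assuming a local copy of the compressed log, this could be driven with `gzip.open(path, "rt")` as the `lines` argument. Peers that fault repeatedly while the MDS is stuck would be the first connections to look at.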