Bug #24040

mds: assert in CDir::_committed

Added by zs 吴 7 months ago. Updated 7 months ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version:
Start date: 05/07/2018
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS): crash
Pull request ID:

Description

2018-05-08 07:19:07.713963 7f56986c6700 -1 *** Caught signal (Aborted) **
 in thread 7f56986c6700 thread_name:fn_anonymous

 ceph version 11.2.1 (e0354f9d3b1eea1d75a7dd487ba8098311be38a7)
 1: (()+0x533e4e) [0x557f9b6efe4e]
 2: (()+0x11390) [0x7f56a3b6b390]
 3: (gsignal()+0x38) [0x7f56a215d428]
 4: (abort()+0x16a) [0x7f56a215f02a]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x264) [0x557f9b7a8304]
 6: (CDir::_committed(int, unsigned long)+0x10ff) [0x557f9b5fc8bf]
 7: (MDSIOContextBase::complete(int)+0x98) [0x557f9b677ce8]
 8: (Finisher::finisher_thread_entry()+0x49e) [0x557f9b7a74ae]
 9: (()+0x76ba) [0x7f56a3b616ba]
 10: (clone()+0x6d) [0x7f56a222f3dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
   -38> 2018-05-08 07:19:07.664427 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 53 0x55806c39e680 osd_op_reply(32401 200.000d9322 [write 3278952~878868 [fadvise_dontneed]] v129403'47429 uv47429 ondisk = 0) v7
   -37> 2018-05-08 07:19:07.664446 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 53 ==== osd_op_reply(32401 200.000d9322 [write 3278952~878868 [fadvise_dontneed]] v129403'47429 uv47429 ondisk = 0) v7 ==== 132+0+0 (3192774713 0 0) 0x55806c39e680 con 0x557fa4e01000
   -36> 2018-05-08 07:19:07.666709 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 54 0x55806c39e680 osd_op_reply(32428 200.000d9322 [write 4157820~1903 [fadvise_dontneed]] v129403'47430 uv47430 ondisk = 0) v7
   -35> 2018-05-08 07:19:07.666724 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 54 ==== osd_op_reply(32428 200.000d9322 [write 4157820~1903 [fadvise_dontneed]] v129403'47430 uv47430 ondisk = 0) v7 ==== 132+0+0 (2151410342 0 0) 0x55806c39e680 con 0x557fa4e01000
   -34> 2018-05-08 07:19:07.666781 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 55 0x55806c39e680 osd_op_reply(32437 200.000d9322 [write 4159723~1903 [fadvise_dontneed]] v129403'47431 uv47431 ondisk = 0) v7
   -33> 2018-05-08 07:19:07.666789 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 55 ==== osd_op_reply(32437 200.000d9322 [write 4159723~1903 [fadvise_dontneed]] v129403'47431 uv47431 ondisk = 0) v7 ==== 132+0+0 (3752260763 0 0) 0x55806c39e680 con 0x557fa4e01000
   -32> 2018-05-08 07:19:07.666804 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 56 0x55806c39e680 osd_op_reply(32438 200.000d9322 [write 4161626~1903 [fadvise_dontneed]] v129403'47432 uv47432 ondisk = 0) v7
   -31> 2018-05-08 07:19:07.666810 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 56 ==== osd_op_reply(32438 200.000d9322 [write 4161626~1903 [fadvise_dontneed]] v129403'47432 uv47432 ondisk = 0) v7 ==== 132+0+0 (1949150409 0 0) 0x55806c39e680 con 0x557fa4e01000
   -30> 2018-05-08 07:19:07.666823 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 57 0x55806c39e680 osd_op_reply(32440 200.000d9322 [write 4163529~1903 [fadvise_dontneed]] v129403'47433 uv47433 ondisk = 0) v7
   -29> 2018-05-08 07:19:07.666828 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 57 ==== osd_op_reply(32440 200.000d9322 [write 4163529~1903 [fadvise_dontneed]] v129403'47433 uv47433 ondisk = 0) v7 ==== 132+0+0 (2642351671 0 0) 0x55806c39e680 con 0x557fa4e01000
   -28> 2018-05-08 07:19:07.666849 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 58 0x55806c39e680 osd_op_reply(32443 200.000d9322 [write 4165432~1903 [fadvise_dontneed]] v129403'47434 uv47434 ondisk = 0) v7
   -27> 2018-05-08 07:19:07.666858 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 58 ==== osd_op_reply(32443 200.000d9322 [write 4165432~1903 [fadvise_dontneed]] v129403'47434 uv47434 ondisk = 0) v7 ==== 132+0+0 (1356124725 0 0) 0x55806c39e680 con 0x557fa4e01000
   -26> 2018-05-08 07:19:07.666889 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 59 0x55806c39e680 osd_op_reply(32448 200.000d9322 [write 4167335~1903 [fadvise_dontneed]] v129403'47435 uv47435 ondisk = 0) v7
   -25> 2018-05-08 07:19:07.666895 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 59 ==== osd_op_reply(32448 200.000d9322 [write 4167335~1903 [fadvise_dontneed]] v129403'47435 uv47435 ondisk = 0) v7 ==== 132+0+0 (399617321 0 0) 0x55806c39e680 con 0x557fa4e01000
   -24> 2018-05-08 07:19:07.666908 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 60 0x55806c39e680 osd_op_reply(32451 200.000d9322 [write 4169238~1903 [fadvise_dontneed]] v129403'47436 uv47436 ondisk = 0) v7
   -23> 2018-05-08 07:19:07.666913 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 60 ==== osd_op_reply(32451 200.000d9322 [write 4169238~1903 [fadvise_dontneed]] v129403'47436 uv47436 ondisk = 0) v7 ==== 132+0+0 (3746815010 0 0) 0x55806c39e680 con 0x557fa4e01000
   -22> 2018-05-08 07:19:07.666926 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 61 0x55806c39e680 osd_op_reply(32457 200.000d9322 [write 4171141~1903 [fadvise_dontneed]] v129403'47437 uv47437 ondisk = 0) v7
   -21> 2018-05-08 07:19:07.666932 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 61 ==== osd_op_reply(32457 200.000d9322 [write 4171141~1903 [fadvise_dontneed]] v129403'47437 uv47437 ondisk = 0) v7 ==== 132+0+0 (2158704419 0 0) 0x55806c39e680 con 0x557fa4e01000
   -20> 2018-05-08 07:19:07.667589 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 62 0x55806c39e680 osd_op_reply(32458 200.000d9322 [write 4173044~1903 [fadvise_dontneed]] v129403'47438 uv47438 ondisk = 0) v7
   -19> 2018-05-08 07:19:07.667601 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 62 ==== osd_op_reply(32458 200.000d9322 [write 4173044~1903 [fadvise_dontneed]] v129403'47438 uv47438 ondisk = 0) v7 ==== 132+0+0 (2156969519 0 0) 0x55806c39e680 con 0x557fa4e01000
   -18> 2018-05-08 07:19:07.667669 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 63 0x55806c39e680 osd_op_reply(32461 200.000d9322 [write 4174947~4665 [fadvise_dontneed]] v129403'47439 uv47439 ondisk = 0) v7
   -17> 2018-05-08 07:19:07.667678 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 63 ==== osd_op_reply(32461 200.000d9322 [write 4174947~4665 [fadvise_dontneed]] v129403'47439 uv47439 ondisk = 0) v7 ==== 132+0+0 (180666404 0 0) 0x55806c39e680 con 0x557fa4e01000
   -16> 2018-05-08 07:19:07.668189 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 64 0x55806c39e680 osd_op_reply(32463 200.000d9322 [write 4179612~2723 [fadvise_dontneed]] v129403'47440 uv47440 ondisk = 0) v7
   -15> 2018-05-08 07:19:07.668201 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 64 ==== osd_op_reply(32463 200.000d9322 [write 4179612~2723 [fadvise_dontneed]] v129403'47440 uv47440 ondisk = 0) v7 ==== 132+0+0 (3306234165 0 0) 0x55806c39e680 con 0x557fa4e01000
   -14> 2018-05-08 07:19:07.668360 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 65 0x55806c39e680 osd_op_reply(32465 200.000d9322 [write 4182335~1903 [fadvise_dontneed]] v129403'47441 uv47441 ondisk = 0) v7
   -13> 2018-05-08 07:19:07.668371 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 65 ==== osd_op_reply(32465 200.000d9322 [write 4182335~1903 [fadvise_dontneed]] v129403'47441 uv47441 ondisk = 0) v7 ==== 132+0+0 (1018918278 0 0) 0x55806c39e680 con 0x557fa4e01000
   -12> 2018-05-08 07:19:07.670505 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 66 0x55806c39e680 osd_op_reply(32467 200.000d9322 [write 4184238~1903 [fadvise_dontneed]] v129403'47442 uv47442 ondisk = 0) v7
   -11> 2018-05-08 07:19:07.670520 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 66 ==== osd_op_reply(32467 200.000d9322 [write 4184238~1903 [fadvise_dontneed]] v129403'47442 uv47442 ondisk = 0) v7 ==== 132+0+0 (892943844 0 0) 0x55806c39e680 con 0x557fa4e01000
   -10> 2018-05-08 07:19:07.670579 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 67 0x55806c39e680 osd_op_reply(32470 200.000d9322 [write 4186141~1903 [fadvise_dontneed]] v129403'47443 uv47443 ondisk = 0) v7
    -9> 2018-05-08 07:19:07.670589 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 67 ==== osd_op_reply(32470 200.000d9322 [write 4186141~1903 [fadvise_dontneed]] v129403'47443 uv47443 ondisk = 0) v7 ==== 132+0+0 (2608651745 0 0) 0x55806c39e680 con 0x557fa4e01000
    -8> 2018-05-08 07:19:07.670628 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 68 0x55806c39e680 osd_op_reply(32472 200.000d9322 [write 4188044~1903 [fadvise_dontneed]] v129403'47444 uv47444 ondisk = 0) v7
    -7> 2018-05-08 07:19:07.670637 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 68 ==== osd_op_reply(32472 200.000d9322 [write 4188044~1903 [fadvise_dontneed]] v129403'47444 uv47444 ondisk = 0) v7 ==== 132+0+0 (3534289336 0 0) 0x55806c39e680 con 0x557fa4e01000
    -6> 2018-05-08 07:19:07.670719 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 69 0x55806c39e680 osd_op_reply(32474 200.000d9322 [write 4189947~1903 [fadvise_dontneed]] v129403'47445 uv47445 ondisk = 0) v7
    -5> 2018-05-08 07:19:07.670729 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 69 ==== osd_op_reply(32474 200.000d9322 [write 4189947~1903 [fadvise_dontneed]] v129403'47445 uv47445 ondisk = 0) v7 ==== 132+0+0 (2223777515 0 0) 0x55806c39e680 con 0x557fa4e01000
    -4> 2018-05-08 07:19:07.670850 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 70 0x55806c39e680 osd_op_reply(32477 200.000d9322 [write 4191850~1903 [fadvise_dontneed]] v129403'47446 uv47446 ondisk = 0) v7
    -3> 2018-05-08 07:19:07.670859 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 70 ==== osd_op_reply(32477 200.000d9322 [write 4191850~1903 [fadvise_dontneed]] v129403'47446 uv47446 ondisk = 0) v7 ==== 132+0+0 (3129659317 0 0) 0x55806c39e680 con 0x557fa4e01000
    -2> 2018-05-08 07:19:07.671045 7f569fe72700  5 -- 192.168.153.100:6825/221877648 >> 192.168.153.111:6810/4758 conn(0x557fa4e01000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=657 cs=1 l=1). rx osd.268 seq 71 0x55806c39e680 osd_op_reply(32479 200.000d9322 [write 4193753~551 [fadvise_dontneed]] v129403'47447 uv47447 ondisk = 0) v7
    -1> 2018-05-08 07:19:07.671055 7f569fe72700  1 -- 192.168.153.100:6825/221877648 <== osd.268 192.168.153.111:6810/4758 71 ==== osd_op_reply(32479 200.000d9322 [write 4193753~551 [fadvise_dontneed]] v129403'47447 uv47447 ondisk = 0) v7 ==== 132+0+0 (2376346672 0 0) 0x55806c39e680 con 0x557fa4e01000
     0> 2018-05-08 07:19:07.713963 7f56986c6700 -1 *** Caught signal (Aborted) **
 in thread 7f56986c6700 thread_name:fn_anonymous

 ceph version 11.2.1 (e0354f9d3b1eea1d75a7dd487ba8098311be38a7)
 1: (()+0x533e4e) [0x557f9b6efe4e]
 2: (()+0x11390) [0x7f56a3b6b390]
 3: (gsignal()+0x38) [0x7f56a215d428]
 4: (abort()+0x16a) [0x7f56a215f02a]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x264) [0x557f9b7a8304]
 6: (CDir::_committed(int, unsigned long)+0x10ff) [0x557f9b5fc8bf]
 7: (MDSIOContextBase::complete(int)+0x98) [0x557f9b677ce8]
 8: (Finisher::finisher_thread_entry()+0x49e) [0x557f9b7a74ae]
 9: (()+0x76ba) [0x7f56a3b616ba]
 10: (clone()+0x6d) [0x7f56a222f3dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 newstore
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 kinetic
   1/ 5 fuse
   1/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-mds.851PM-153-100.log
--- end dump of recent events ---
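
The backtrace above says the executable is needed to interpret it. As a rough aid for whoever picks this up, the sketch below parses frames in the `N: (symbol+0xOFF) [0xADDR]` form and derives an offset for the CDir::_committed frame relative to the binary's load address. It rests on assumptions that are not confirmed in this report: that frame 1, `()+0x533e4e`, lies in the same ceph-mds module as frame 6, and that its `+0x...` value is an offset from that module's load base. The names `FRAME_RE`, `parse_frames`, `base` and `rel` are made up for illustration.

```python
import re

# Three frames copied from the backtrace in this report.
TRACE = """\
 1: (()+0x533e4e) [0x557f9b6efe4e]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x264) [0x557f9b7a8304]
 6: (CDir::_committed(int, unsigned long)+0x10ff) [0x557f9b5fc8bf]
"""

# index: (symbol+0xoffset) [0xaddress]
FRAME_RE = re.compile(r"^\s*(\d+): \((.*)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]$")

def parse_frames(text):
    """Return a list of (index, symbol, offset, address) tuples."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            idx, symbol, offset, addr = m.groups()
            frames.append((int(idx), symbol, int(offset, 16), int(addr, 16)))
    return frames

frames = parse_frames(TRACE)

# Assumption: frame 1 has no symbol, so its offset is relative to the
# load base of the module it belongs to (assumed here to be ceph-mds).
_, _, off1, addr1 = frames[0]
base = addr1 - off1

# Offset of the CDir::_committed return address within that module.
_, symbol, _, addr6 = frames[2]
rel = addr6 - base

print(symbol, hex(rel))
```

If those assumptions hold, the resulting offset is the kind of value that could be looked up in a matching 11.2.1 ceph-mds binary with debug symbols to find which assert in CDir::_committed fired; the report itself does not include the assert message.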

History

#1 Updated by John Spray 7 months ago

Thanks for the report - it looks like you're using an 11.x ("kraken") version, which is no longer receiving bug fixes.

Could you re-test with a 12.x ("luminous") release of Ceph?

#2 Updated by John Spray 7 months ago

  • Project changed from Ceph to fs
  • Subject changed from mds Always rejoin to mds assert in CDir::_committed (kraken)

#3 Updated by Patrick Donnelly 7 months ago

  • Subject changed from mds assert in CDir::_committed (kraken) to mds: assert in CDir::_committed
  • Status changed from New to Need More Info
  • Target version set to v14.0.0
  • Affected Versions v11.2.1 added
  • Labels (FS) crash added

#4 Updated by Patrick Donnelly 7 months ago

  • Description updated
