Bug #24039

closed

MDSTableServer.cc: 62: FAILED assert(g_conf->mds_kill_mdstable_at != 1)

Added by Patrick Donnelly almost 6 years ago. Updated almost 6 years ago.

Status: Closed
Priority: Urgent
Assignee: -
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Assertion: /build/ceph-13.0.2-2109-g5eb2a2b/src/mds/MDSTableServer.cc: 62: FAILED assert(g_conf->mds_kill_mdstable_at != 1)
ceph version 13.0.2-2109-g5eb2a2b (5eb2a2b10a839dcc72dbc16c6f66898fe5bede13) mimic (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fea4e337172]
 2: (()+0x2e4337) [0x7fea4e337337]
 3: (MDSTableServer::handle_prepare(MMDSTableRequest*)+0x328) [0x558dea504638]
 4: (MDSRank::handle_deferrable_message(Message*)+0x109) [0x558dea2d22b9]
 5: (MDSRank::_dispatch(Message*, bool)+0x62b) [0x558dea2defbb]
 6: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x558dea2df745]
 7: (MDSDaemon::ms_dispatch(Message*)+0xd3) [0x558dea2ca673]
 8: (DispatchQueue::entry()+0xb92) [0x7fea4e3afc62]
 9: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fea4e44dd0d]
 10: (()+0x76ba) [0x7fea4dbcc6ba]
 11: (clone()+0x6d) [0x7fea4d3f541d]
11 jobs: ['2474541', '2474811', '2474991', '2474481', '2475021', '2474721', '2474931', '2474961', '2474601', '2474661', '2474511']
suites intersection: ['frag_enable.yaml', 'fuse-default-perm-no.yaml}', 'mount/fuse.yaml', 'multimds/basic/{begin.yaml', 'overrides/{basic/{debug.yaml', 'q_check_counter/check_counter.yaml', 'tasks/cephfs_test_snapshots.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/3-mds.yaml', 'clusters/9-mds.yaml', 'frag_enable.yaml', 'fuse-default-perm-no.yaml}', 'inline/no.yaml', 'inline/yes.yaml', 'mount/fuse.yaml', 'multimds/basic/{begin.yaml', 'objectstore-ec/bluestore-comp-ec-root.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/bluestore.yaml', 'objectstore-ec/filestore-xfs.yaml', 'overrides/{basic/{debug.yaml', 'q_check_counter/check_counter.yaml', 'tasks/cephfs_test_snapshots.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']

Assertion: /build/ceph-13.0.2-2109-g5eb2a2b/src/mds/MDSTableClient.cc: 74: FAILED assert(g_conf->mds_kill_mdstable_at != 9)
ceph version 13.0.2-2109-g5eb2a2b (5eb2a2b10a839dcc72dbc16c6f66898fe5bede13) mimic (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f8ac3f2d172]
 2: (()+0x2e4337) [0x7f8ac3f2d337]
 3: (MDSTableClient::handle_request(MMDSTableRequest*)+0xf9d) [0x561e84d4392d]
 4: (MDSRank::handle_deferrable_message(Message*)+0x9db) [0x561e84b14b8b]
 5: (MDSRank::_dispatch(Message*, bool)+0x62b) [0x561e84b20fbb]
 6: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x561e84b21745]
 7: (MDSDaemon::ms_dispatch(Message*)+0xd3) [0x561e84b0c673]
 8: (DispatchQueue::entry()+0xb92) [0x7f8ac3fa5c62]
 9: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f8ac4043d0d]
 10: (()+0x76ba) [0x7f8ac37c26ba]
 11: (clone()+0x6d) [0x7f8ac2feb41d]
5 jobs: ['2474766', '2474856', '2474496', '2474886', '2474466']
suites intersection: ['frag_enable.yaml', 'fuse-default-perm-no.yaml}', 'inline/yes.yaml', 'mount/kclient.yaml', 'multimds/basic/{begin.yaml', 'overrides/{basic/{debug.yaml', 'q_check_counter/check_counter.yaml', 'tasks/cephfs_test_snapshots.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/3-mds.yaml', 'clusters/9-mds.yaml', 'frag_enable.yaml', 'fuse-default-perm-no.yaml}', 'inline/yes.yaml', 'mount/kclient.yaml', 'multimds/basic/{begin.yaml', 'objectstore-ec/bluestore-comp.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'objectstore-ec/bluestore.yaml', 'objectstore-ec/filestore-xfs.yaml', 'overrides/{basic/{debug.yaml', 'q_check_counter/check_counter.yaml', 'tasks/cephfs_test_snapshots.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
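
The "suites intersection" and "suites union" lists above can be read as set operations over the facets in each job's description. The following is a minimal sketch of that idea, assuming each description is split on whitespace (which would explain the stray braces in entries such as 'multimds/basic/{begin.yaml'). The job names and descriptions here are hypothetical and much shorter than the real ones.

```python
# Hypothetical job descriptions; real ones list many more facets.
descriptions = {
    "job-a": "multimds/basic/{begin.yaml clusters/3-mds.yaml mount/fuse.yaml}",
    "job-b": "multimds/basic/{begin.yaml clusters/9-mds.yaml mount/fuse.yaml}",
}

# Assumption: a facet is any whitespace-separated token of the description.
facets = {job: set(desc.split()) for job, desc in descriptions.items()}

# Facets shared by every failing job.
intersection = sorted(set.intersection(*facets.values()))
# Facets seen in at least one failing job.
union = sorted(set.union(*facets.values()))

print("suites intersection:", intersection)
print("suites union:", union)
```

Facets that appear only in the union (here the two cluster sizes) vary across the failing jobs, so they are less likely to be what triggers the failure.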

From: http://pulpito.ceph.com/pdonnell-2018-05-04_03:45:51-multimds-master-testing-basic-smithi/


Related issues: 1 (0 open, 1 closed)

Related to CephFS - Bug #24111: mds didn't update file's max_size (Resolved, Zheng Yan, 05/14/2018)

Actions #1

Updated by Patrick Donnelly almost 6 years ago

  • Description updated (diff)
Actions #2

Updated by Zheng Yan almost 6 years ago

These are intentional crashes triggered by the table transaction test.
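
For readers unfamiliar with this kind of test, here is a minimal sketch of the kill-point pattern that the two assertions suggest: a config value names a numbered stage, and the daemon deliberately aborts when it reaches that stage. This is an illustration in Python, not the actual C++ code in MDSTableServer.cc. The Conf class, the kill_point helper, and the default of 0 meaning "no kill point armed" are assumptions.

```python
class Conf:
    # Assumption: 0 means no kill point is armed.
    mds_kill_mdstable_at = 0


def kill_point(conf, stage):
    """Abort on purpose when the configured kill stage is reached."""
    if conf.mds_kill_mdstable_at == stage:
        raise AssertionError(f"FAILED assert(mds_kill_mdstable_at != {stage})")


def handle_prepare(conf):
    # Stage 1 mirrors the assertion seen in the handle_prepare backtrace.
    kill_point(conf, 1)
    return "prepared"


conf = Conf()
print(handle_prepare(conf))      # normal run: no kill point armed

conf.mds_kill_mdstable_at = 1    # the test arms stage 1
try:
    handle_prepare(conf)
except AssertionError as exc:
    print("intentional crash:", exc)
```

With this pattern an assertion failure at the armed stage is the expected outcome, so the backtraces alone do not indicate a defect.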

Actions #3

Updated by Patrick Donnelly almost 6 years ago

Right, there's something else wrong with the test.

Actions #4

Updated by Zheng Yan almost 6 years ago

The fsstress failure looks like a new bug.

The pjd failure is similar to http://tracker.ceph.com/issues/23327.
The two dead tasks were caused by an OSD crash.

Actions #5

Updated by Zheng Yan almost 6 years ago

The fsstress failure: http://pulpito.ceph.com/pdonnell-2018-05-04_03:45:51-multimds-master-testing-basic-smithi/2474517/

In the MDS log:

2018-05-04T05:04:27.287 INFO:tasks.workunit.client.0.smithi099.stdout:4/920: rename d9/dc/d76/fe2 to d9/d70/d71/dbc/d12c/dff/d117/f146 0

The line above is the last output from thread 4 of fsstress. The next operation (the one that got stuck) should be "4/921: dwrite d9/d70/f26 [1568768,77824] 0".

2018-05-04 05:04:27.303 16359700  7 mds.6.locker issue_caps loner client.4410 allowed=pAsxLsXsxFsxcrwbl, xlocker allowed=pAsxLsXsxFsxcrwbl, others allowed=pLs on [inode 0x10000000030 [...2,head] /client.0/tmp/fsstress-smithi09917626/p4/d9/d70/f26 auth v432 snaprealm=0x1d2660d0 dirtyparent s=4117808 n(v0 rc2018-05-04 05:00:10.656484 b4117808 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) caps={4410=pAsLsXsFscr/pAsxXsxFxwb@30},l=4410 | ptrwaiter=0 request=0 lock=0 importingcaps=0 caps=1 remoteparent=0 truncating=0 dirtyparent=1 replicated=0 dirty=1 waiter=0 authpin=0 0x1d265910]
2018-05-04 05:04:27.305 16359700 20 mds.6.locker  client.4410 pending pAsLsXsFscr allowed pAsxLsXsxFsxcrwbl wanted pAsxXsxFxwb
2018-05-04 05:04:27.305 16359700  7 mds.6.locker    sending MClientCaps to client.4410 seq 31 new pending pAsxLsXsxFsxcrwb was pAsLsXsFscr

The MDS did issue Fw caps to the client, so it's likely a kernel bug.
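
The cap strings in the log lines above can be compared mechanically. This is a rough sketch, assuming an uppercase letter opens a cap group (A, L, X, F), the lowercase letters after it are that group's flags, and a leading 'p' stands alone. The parse_caps helper is hypothetical and not part of Ceph.

```python
def parse_caps(caps):
    """Split a cap string such as 'pAsLsXsFscr' into {group: set(flags)}."""
    groups = {}
    current = None
    for ch in caps:
        if ch.isupper():
            current = ch                  # start a new group, e.g. 'F'
            groups[current] = set()
        elif current is None:
            groups.setdefault(ch, set())  # standalone cap such as 'p'
        else:
            groups[current].add(ch)       # flag within the current group
    return groups


# Strings copied from the mds.6.locker log lines above.
old_pending = parse_caps("pAsLsXsFscr")
new_pending = parse_caps("pAsxLsXsxFsxcrwb")
wanted = parse_caps("pAsxXsxFxwb")

# Flags newly granted by the MClientCaps message (seq 31), per group.
newly_issued = {g: new_pending[g] - old_pending.get(g, set()) for g in new_pending}
file_caps_added = newly_issued["F"]

# Flags the client wanted but still does not have after the grant.
still_missing = {g: wanted[g] - new_pending.get(g, set()) for g in wanted}

print("file caps added:", sorted(file_caps_added))
print("still missing:", {g: sorted(v) for g, v in still_missing.items() if v})
```

Under these parsing assumptions the new pending set includes the 'w' flag in the F group and covers everything in the wanted set, which matches the statement that the MDS issued Fw.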
Actions #6

Updated by Zheng Yan almost 6 years ago

The pjd failure: http://pulpito.ceph.com/pdonnell-2018-05-04_03:45:51-multimds-master-testing-basic-smithi/2475062/

2018-05-04 10:54:12.282 7f924b8db700  1 -- 172.21.15.111:6817/1152369955 <== client.4404 172.21.15.84:0/4267712957 898 ==== client_request(client.4404:550 create #0x1000000014b/fstest_f3851bfcb5dcb053a62a905849154936 2018-05-04 10:54:12.456706 caller_uid=0, caller_gid=0{}) v2 ==== 205+0+0 (1330219802 0 0) 0x555d6b754a40 con 0x555d6af54700

...

2018-05-04 10:54:12.310 7f924b8db700  1 -- 172.21.15.111:6817/1152369955 <== client.4404 172.21.15.84:0/4267712957 900 ==== client_request(client.4404:551 link #0x1000000014b/fstest_1de0b7d39f5c7bc97f732e35531530e1 #0x1000000014b/fstest_f3851bfcb5dcb053a62a905849154936 2018-05-04 10:54:12.484692 caller_uid=0, caller_gid=0{}) v2 ==== 244+0+0 (1462308990 0 0) 0x555d6b754d80 con 0x555d6af54700

The two requests arrive only about 28 ms apart, so it's a short sleep.

Actions #7

Updated by Zheng Yan almost 6 years ago

  • Related to Bug #24111: mds didn't update file's max_size added
Actions #8

Updated by Zheng Yan almost 6 years ago

  • Status changed from New to Closed

Created a new ticket for the fsstress hang: http://tracker.ceph.com/issues/24111

Closing this one.
