Bug #20337
test_rebuild_simple_altpool triggers MDS assertion
Status: Closed
Description
- The test code does a self.fs.wait_for_daemons() (test_data_scan.py:425), but there is also an other_fs in this case that should be waited for as well.
- The scrub_path command is not checking that the daemon is active before executing.
We should fix the test not to trigger this case, and at the same time fix the MDS to reject the command in this situation instead of hitting an assertion when it tries to repair metadata.
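A minimal sketch of the first proposed fix, waiting on every filesystem under test rather than only self.fs. The Filesystem class below is a hypothetical stub standing in for the qa framework's real class (qa/tasks/cephfs/filesystem.py), which polls a live cluster:

```python
# Hypothetical stub modeling the qa-suite Filesystem object. The real
# wait_for_daemons() polls the MDS map until all ranks for *this*
# filesystem are up:active; here we only record that it was called.
class Filesystem:
    def __init__(self, name):
        self.name = name
        self.waited = False

    def wait_for_daemons(self):
        self.waited = True


def wait_for_all(filesystems):
    # The fix: wait on every filesystem involved in the test, not just
    # self.fs, so a still-starting other_fs cannot race with scrub_path.
    for fs in filesystems:
        fs.wait_for_daemons()


fs = Filesystem("cephfs")
other_fs = Filesystem("cephfs_recovery")
wait_for_all([fs, other_fs])
assert fs.waited and other_fs.waited
```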
ceph version 12.0.3-1728-g9742c3e (9742c3ee3c45045a4b7853b252ed427748214bb6) luminous (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f62ad116f30]
 2: (MDLog::_submit_entry(LogEvent*, MDSLogContextBase*)+0x164) [0x7f62ad078404]
 3: (Locker::scatter_writebehind(ScatterLock*)+0x863) [0x7f62acf71f73]
 4: (Locker::simple_sync(SimpleLock*, bool*)+0x558) [0x7f62acf75da8]
 5: (Locker::file_eval(ScatterLock*, bool*)+0x3e6) [0x7f62acf7b6f6]
 6: (Locker::try_eval(SimpleLock*, bool*)+0x6ee) [0x7f62acf7c89e]
 7: (Locker::wrlock_finish(SimpleLock*, MutationImpl*, bool*)+0x26e) [0x7f62acf7f30e]
 8: (Locker::_drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x22c) [0x7f62acf82c7c]
 9: (Locker::drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x76) [0x7f62acf83046]
 10: (MDCache::repair_inode_stats_work(boost::intrusive_ptr<MDRequestImpl>&)+0x9ca) [0x7f62acee996a]
 11: (MDCache::repair_inode_stats(CInode*)+0x73) [0x7f62acee9ba3]
 12: (()+0x4a2016) [0x7f62acffb016]
 13: (Continuation::_continue_function(int, int)+0x1aa) [0x7f62ad02c00a]
 14: (()+0x4c2583) [0x7f62ad01b583]
 15: (()+0x4c4aa6) [0x7f62ad01daa6]
 16: (Continuation::_continue_function(int, int)+0x1aa) [0x7f62ad02c00a]
 17: (Continuation::Callback::finish(int)+0x10) [0x7f62ad02c0f0]
 18: (Context::complete(int)+0x9) [0x7f62acdfc859]
 19: (MDSIOContextBase::complete(int)+0xa4) [0x7f62ad062ba4]
 20: (Finisher::finisher_thread_entry()+0x1c5) [0x7f62ad115ef5]
 21: (()+0x7dc5) [0x7f62aac48dc5]
 22: (clone()+0x6d) [0x7f62a9d2d73d]
Updated by Douglas Fuller almost 7 years ago
- Status changed from New to Need More Info
- Assignee changed from Douglas Fuller to John Spray
wait_for_daemons should wait for every daemon regardless of filesystem. Is there a failure log I can look at?
Updated by John Spray almost 7 years ago
I'm not seeing how Filesystem.are_daemons_healthy waits for daemons outside the filesystem: it inspects daemons in self.get_mds_map().
Here's the failure: http://pulpito.ceph.com/jspray-2017-06-15_20:24:16-fs-wip-jcsp-testing-20170615-distro-basic-smithi/1292005
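To illustrate the point above with a hypothetical model (the dict layout below only approximates a real MDS map): a health check that reads one filesystem's MDS map can report healthy while a daemon belonging to another filesystem is still not active.

```python
# Hypothetical model of a per-filesystem health check. Like
# Filesystem.are_daemons_healthy, it only sees the MDS map of the
# filesystem it was asked about.
def are_daemons_healthy(mds_map):
    # "Healthy" here means every rank in *this* map is up:active.
    return all(info["state"] == "up:active"
               for info in mds_map["info"].values())


fs_map = {"info": {"gid1": {"state": "up:active"}}}
other_fs_map = {"info": {"gid2": {"state": "up:replay"}}}

# The check on fs passes even though other_fs's daemon is still in
# replay, so waiting only on self.fs is not enough.
assert are_daemons_healthy(fs_map)
assert not are_daemons_healthy(other_fs_map)
```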
Updated by John Spray almost 7 years ago
- Status changed from Need More Info to New
Updated by Douglas Fuller almost 7 years ago
- Assignee changed from John Spray to Douglas Fuller
Updated by Douglas Fuller almost 7 years ago
Updated by Patrick Donnelly over 6 years ago
- Status changed from Resolved to Pending Backport
- Backport set to luminous
This should resolve this failure in 12.2.1 testing:
Updated by Nathan Cutler over 6 years ago
@Patrick: So both https://github.com/ceph/ceph/pull/16305 and https://github.com/ceph/ceph/pull/17849 need to be backported to luminous?
Updated by Nathan Cutler over 6 years ago
- Copied to Backport #21490: luminous: test_rebuild_simple_altpool triggers MDS assertion added
Updated by Patrick Donnelly over 6 years ago
Nathan Cutler wrote:
@Patrick: So both https://github.com/ceph/ceph/pull/16305 and https://github.com/ceph/ceph/pull/17849 need to be backported to luminous?
Yes
Updated by Nathan Cutler over 6 years ago
- Status changed from Pending Backport to Resolved