Bug #53645
MDCache::shutdown_pass: ceph_assert(!migrator->is_importing())
Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I'm running a pinning/multimds thrash test (see stressfs.sh attached) on a 3-node test cluster and occasionally see this crash while stopping:
-10> 2021-12-16 15:50:15.760 7f4ab1070700 7 mds.2.cache shutdown_pass
-9> 2021-12-16 15:50:15.760 7f4ab1070700 10 mds.2.cache shutdown_export_strays 0x61d ''
-8> 2021-12-16 15:50:15.760 7f4ab1070700 7 mds.2.cache trim bytes_used=41kB limit=2GB reservation=0.05% count=18446744073709551615
-7> 2021-12-16 15:50:15.760 7f4ab1070700 7 mds.2.cache trim_lru trimming 18446744073709551615 items from LRU size=10 mid=0 pintail=10 pinned=10
-6> 2021-12-16 15:50:15.760 7f4ab1070700 7 mds.2.cache trim_lru trimmed 0 items
-5> 2021-12-16 15:50:15.760 7f4ab1070700 5 mds.2.cache lru size now 10/0
-4> 2021-12-16 15:50:15.760 7f4ab1070700 7 mds.2.cache looking for subtrees to export to mds0
-3> 2021-12-16 15:50:15.760 7f4ab1070700 10 mds.2.log trim_all: 1/0/0
-2> 2021-12-16 15:50:15.760 7f4ab1070700 10 mds.2.log _trim_expired_segments waiting for 20758/71894891982 to expire
-1> 2021-12-16 15:50:15.761 7f4ab1070700 -1 /builddir/build/BUILD/ceph-14.2.22/src/mds/MDCache.cc: In function 'bool MDCache::shutdown_pass()' thread 7f4ab1070700 time 2021-12-16 15:50:15.760517
/builddir/build/BUILD/ceph-14.2.22/src/mds/MDCache.cc: 7786: FAILED ceph_assert(!migrator->is_importing())
ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x7f4abcc3282a]
2: (()+0x274a44) [0x7f4abcc32a44]
3: (MDCache::shutdown_pass()+0x13a7) [0x55d286352e87]
4: (MDSRankDispatcher::tick()+0x2a8) [0x55d2862375c8]
5: (FunctionContext::finish(int)+0x30) [0x55d2862212e0]
6: (Context::complete(int)+0xd) [0x55d28621f46d]
7: (SafeTimer::timer_thread()+0x19c) [0x7f4abcd0e76c]
8: (SafeTimerThread::entry()+0x11) [0x7f4abcd10171]
9: (()+0x817a) [0x7f4abaa1217a]
10: (clone()+0x43) [0x7f4ab9466d83]
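For anyone comparing this against other MDS crash dumps, here is a minimal sketch of pulling the failed assertion (source file, line, expression) out of dump text like the above. The helper name `find_failed_assert` and the regular expression are my own assumptions based only on the `FAILED ceph_assert(...)` line shown in this report; they are not part of Ceph or any Ceph tooling, and other releases may format the line differently.

```python
import re

# Hypothetical helper, not part of Ceph: matches lines shaped like
#   <path>.cc: <line>: FAILED ceph_assert(<expr>)
# as seen in the dump above. The lookahead lets the expression contain
# nested parentheses, ending at the first ")" followed by whitespace or end of text.
ASSERT_RE = re.compile(
    r"(?P<file>\S+\.(?:cc|h)): (?P<line>\d+): FAILED "
    r"(?P<expr>ceph_assert\(.*?\))(?=\s|$)"
)


def find_failed_assert(text):
    """Return the first failed assertion in `text` as a dict, or None."""
    m = ASSERT_RE.search(text)
    if m is None:
        return None
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "expr": m.group("expr"),
    }


# Excerpt copied from the crash dump in this report.
sample = (
    "/builddir/build/BUILD/ceph-14.2.22/src/mds/MDCache.cc: In function "
    "'bool MDCache::shutdown_pass()' thread 7f4ab1070700 "
    "/builddir/build/BUILD/ceph-14.2.22/src/mds/MDCache.cc: 7786: FAILED "
    "ceph_assert(!migrator->is_importing()) ceph version 14.2.22"
)

result = find_failed_assert(sample)
print(result)
```

The extracted file, line and expression can then be used as a grouping key when checking whether several dumps hit the same assertion.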
The MDS log is at ceph-post-file: 358f9d7b-5953-4a7d-818a-092d1a645b3c
There have been some changes in shutdown_pass related to ephemeral pinning, so we'll try this on a Pacific test cluster next to see whether the bug is still present.