Bug #53645
MDCache::shutdown_pass: ceph_assert(!migrator->is_importing())
Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I'm running a pinning/multimds thrash test (see stressfs.sh attached) on a 3 node test cluster and occasionally seeing this crash while stopping:
-10> 2021-12-16 15:50:15.760 7f4ab1070700  7 mds.2.cache shutdown_pass
 -9> 2021-12-16 15:50:15.760 7f4ab1070700 10 mds.2.cache shutdown_export_strays 0x61d ''
 -8> 2021-12-16 15:50:15.760 7f4ab1070700  7 mds.2.cache trim bytes_used=41kB limit=2GB reservation=0.05% count=18446744073709551615
 -7> 2021-12-16 15:50:15.760 7f4ab1070700  7 mds.2.cache trim_lru trimming 18446744073709551615 items from LRU size=10 mid=0 pintail=10 pinned=10
 -6> 2021-12-16 15:50:15.760 7f4ab1070700  7 mds.2.cache trim_lru trimmed 0 items
 -5> 2021-12-16 15:50:15.760 7f4ab1070700  5 mds.2.cache lru size now 10/0
 -4> 2021-12-16 15:50:15.760 7f4ab1070700  7 mds.2.cache looking for subtrees to export to mds0
 -3> 2021-12-16 15:50:15.760 7f4ab1070700 10 mds.2.log trim_all: 1/0/0
 -2> 2021-12-16 15:50:15.760 7f4ab1070700 10 mds.2.log _trim_expired_segments waiting for 20758/71894891982 to expire
 -1> 2021-12-16 15:50:15.761 7f4ab1070700 -1 /builddir/build/BUILD/ceph-14.2.22/src/mds/MDCache.cc: In function 'bool MDCache::shutdown_pass()' thread 7f4ab1070700 time 2021-12-16 15:50:15.760517
/builddir/build/BUILD/ceph-14.2.22/src/mds/MDCache.cc: 7786: FAILED ceph_assert(!migrator->is_importing())

ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x7f4abcc3282a]
 2: (()+0x274a44) [0x7f4abcc32a44]
 3: (MDCache::shutdown_pass()+0x13a7) [0x55d286352e87]
 4: (MDSRankDispatcher::tick()+0x2a8) [0x55d2862375c8]
 5: (FunctionContext::finish(int)+0x30) [0x55d2862212e0]
 6: (Context::complete(int)+0xd) [0x55d28621f46d]
 7: (SafeTimer::timer_thread()+0x19c) [0x7f4abcd0e76c]
 8: (SafeTimerThread::entry()+0x11) [0x7f4abcd10171]
 9: (()+0x817a) [0x7f4abaa1217a]
 10: (clone()+0x43) [0x7f4ab9466d83]
mds log is at ceph-post-file: 358f9d7b-5953-4a7d-818a-092d1a645b3c
There have been some changes in shutdown_pass related to ephemeral pinning, so we'll try this on a pacific test cluster next to see if it is still a bug.
Files
Updated by Venky Shankar over 2 years ago
Dan, thanks for the report. Please let us know if you hit this in pacific.
Updated by Dan van der Ster over 2 years ago
- Affected Versions v15.2.15 added
Still present in octopus:
2022-01-13T16:35:30.607+0100 7f1c07330700  2 mds.2.cache Memory usage:  total 2930892, rss 208844, heap 323792, baseline 323792, 0 / 12 inodes have caps, 0 caps, 0 caps per inode
2022-01-13T16:35:30.984+0100 7f1c09b35700 -1 /builddir/build/BUILD/ceph-15.2.15/src/mds/MDCache.cc: In function 'bool MDCache::shutdown_pass()' thread 7f1c09b35700 time 2022-01-13T16:35:30.976714+0100
/builddir/build/BUILD/ceph-15.2.15/src/mds/MDCache.cc: 7922: FAILED ceph_assert(!migrator->is_importing())

ceph version 15.2.15-2 (2dfb18841cfecc2f7eb7eb2afd65986ca4d95985) octopus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f1c13515ee0]
 2: (()+0x27e0fa) [0x7f1c135160fa]
 3: (MDCache::shutdown_pass()+0x1771) [0x56402b685f81]
 4: (MDSRankDispatcher::tick()+0x298) [0x56402b5670b8]
 5: (Context::complete(int)+0xd) [0x56402b55015d]
 6: (SafeTimer::timer_thread()+0x1b7) [0x7f1c135f0d57]
 7: (SafeTimerThread::entry()+0x11) [0x7f1c135f2331]
 8: (()+0x817f) [0x7f1c1233117f]
 9: (clone()+0x43) [0x7f1c10d85d83]