Bug #18046

filestore: sync_entry timed out, causing osd crash

Added by huang jun over 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello, I got the filestore sync_entry thread hung, and the OSD crashed after 600s.
Here is the sync_entry thread backtrace:

(gdb) thread 192
[Switching to thread 192 (Thread 0x7fe425457700 (LWP 1165749))]
#0 0x00007fe4340c3f4d in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007fe4340c3f4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007fe4340bfd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007fe4340bfc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000abe12f in Mutex::Lock (this=this@entry=0x7fe4318d0c88, no_lockdep=no_lockdep@entry=false) at common/Mutex.cc:92
#4 0x00000000009f5385 in Locker (m=..., this=<synthetic pointer>) at common/Mutex.h:115
#5 FileJournal::committed_thru (this=0x7fe4318d0700, seq=43449) at os/FileJournal.cc:1601
#6 0x00000000009017ae in JournalingObjectStore::ApplyManager::commit_finish (this=this@entry=0x7fe4318651c8) at os/JournalingObjectStore.cc:247
#7 0x00000000008b7a00 in FileStore::sync_entry (this=0x7fe431865000) at os/FileStore.cc:3900
#8 0x00000000008efc7d in FileStore::SyncThread::entry (this=<optimized out>) at os/FileStore.h:173
#9 0x00007fe4340bddc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fe432ba117d in clone () from /lib64/libc.so.6
(gdb) f 5
#5 FileJournal::committed_thru (this=0x7fe4318d0700, seq=43449) at os/FileJournal.cc:1601
1601 Mutex::Locker locker(write_lock);
(gdb) p write_lock
$2 = {name = 0xc8a36c "FileJournal::write_lock", id = -1, recursive = false, lockdep = true, backtrace = false, _m = {__data = {__lock = 2, __count = 0, __owner = 1165754, __nusers = 1,
__kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\272\311\021\000\001\000\000\000\002", '\000' <repeats 22 times>,
__align = 2}, nlock = 1, locked_by = 140617846187776, cct = 0x7fe431835100, logger = 0x7fe431813d80}
(gdb) thread 492
[Switching to thread 492 (Thread 0x7fe424c56700 (LWP 1165754))]
#0 0x00007fe432bf0aae in __memcpy_ssse3_back () from /lib64/libc.so.6
(gdb) bt
#0 0x00007fe432bf0aae in __memcpy_ssse3_back () from /lib64/libc.so.6
#1 0x0000000000b85389 in ceph::buffer::list::rebuild (this=0x7fe424c557f0, nb=...) at common/buffer.cc:1223
#2 0x0000000000b85b2d in ceph::buffer::list::rebuild_aligned_size_and_memory (this=this@entry=0x7fe424c55ce0, align_size=4096, align_memory=4096) at common/buffer.cc:1271
#3 0x0000000000b85c77 in ceph::buffer::list::rebuild_aligned (this=this@entry=0x7fe424c55ce0, align=<optimized out>) at common/buffer.cc:1233
#4 0x0000000000b85c8e in ceph::buffer::list::rebuild_page_aligned (this=this@entry=0x7fe424c55ce0) at common/buffer.cc:1280
#5 0x00000000009ead1e in FileJournal::align_bl (this=this@entry=0x7fe4318d0700, pos=7729348608, bl=...) at os/FileJournal.cc:960
#6 0x00000000009f2ca5 in FileJournal::do_write (this=this@entry=0x7fe4318d0700, bl=...) at os/FileJournal.cc:1020
#7 0x00000000009f73eb in FileJournal::write_thread_entry (this=0x7fe4318d0700) at os/FileJournal.cc:1245
#8 0x00000000008ed8dd in FileJournal::Writer::entry (this=<optimized out>) at os/FileJournal.h:338
#9 0x00007fe4340bddc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fe432ba117d in clone () from /lib64/libc.so.6
(gdb) f 5
#5 0x00000000009ead1e in FileJournal::align_bl (this=this@entry=0x7fe4318d0700, pos=7729348608, bl=...) at os/FileJournal.cc:960
960 bl.rebuild_page_aligned();
(gdb) p bl
$3 = (ceph::bufferlist &) @0x7fe424c55ce0: {_buffers = empty std::list, _len = 210534400, _memcopy_count = 0, append_buffer = {_raw = 0x7fe429046080, _off = 0, _len = 3904}, last_p = {
bl = 0x7fe424c55ce0, ls = 0x7fe424c55ce0, off = 0, p = {_raw = , _off = 688152704, _len = 32740}, p_off = 0}, static CLAIM_DEFAULT = 0, static CLAIM_ALLOW_NONSHAREABLE = 1}

We use jemalloc-3.6.0-1.el7.x86_64 as the memory allocator.
The Ceph version is 0.94.5.
OS version:
Linux server2 4.1.12-37.5.1.el7uek.x86_64 #2 SMP Wed Jun 15 14:46:02 CST 2016 x86_64 x86_64 x86_64 GNU/Linux
What could cause this problem? Thank you.
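
The two backtraces show the commit path (sync_entry -> commit_finish -> committed_thru) blocked on FileJournal::write_lock while the journal write thread holds that same lock inside align_bl()/rebuild_page_aligned() on a ~200 MB bufferlist; the 600s in the report is consistent with the filestore commit timeout aborting the OSD when a sync takes too long. A minimal, self-contained sketch of that pattern (illustrative only, not Ceph code; the sleep stands in for the stalled rebuild):

// Illustrative sketch (not Ceph code): one thread holds a lock across a long
// stall, the way write_thread_entry holds FileJournal::write_lock during
// align_bl(), while a second thread (standing in for sync_entry ->
// committed_thru) can do nothing but wait for the same lock.
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex write_lock;  // stand-in for FileJournal::write_lock

void writer() {
  std::lock_guard<std::mutex> l(write_lock);
  // Stand-in for a stalled rebuild_page_aligned(): the lock stays held for the
  // entire stall (allocation plus a ~200 MB copy in the real backtrace).
  std::this_thread::sleep_for(std::chrono::seconds(2));
}

void committer() {
  auto start = std::chrono::steady_clock::now();
  std::lock_guard<std::mutex> l(write_lock);  // blocks until writer releases it
  auto waited = std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::steady_clock::now() - start);
  std::cout << "committer waited " << waited.count() << " ms for write_lock\n";
}

int main() {
  std::thread w(writer);
  std::this_thread::sleep_for(std::chrono::milliseconds(100));  // let writer take the lock first
  std::thread c(committer);
  w.join();
  c.join();
  return 0;
}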

#1

Updated by huang jun over 7 years ago

Can we do this?

diff --git a/src/os/filestore/FileJournal.cc b/src/os/filestore/FileJournal.cc
index a0985ff..016f606 100644
--- a/src/os/filestore/FileJournal.cc
+++ b/src/os/filestore/FileJournal.cc
@@ -1045,13 +1045,15 @@ void FileJournal::do_write(bufferlist& bl)
   off64_t pos = write_pos;

   // Adjust write_pos
-  align_bl(pos, bl);
   write_pos += bl.length();
   if (write_pos >= header.max_size)
     write_pos = write_pos - header.max_size + get_top();

   write_lock.Unlock();

+  // align bl after unlock the write_lock
+  align_bl(pos, bl);
+
#2

Updated by Samuel Just over 7 years ago

Hmm, I'm not sure about that. Why would align_bl take that long to begin with? Is this reproducible? If so, can you add debugging to determine what is being passed in?
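
One way to approach that question outside of Ceph is a small, self-contained microbenchmark (illustrative only; the 210534400-byte size is taken from the bufferlist in the core dump above) that separates the cost of the page-aligned allocation from the cost of the copy, roughly the work rebuild_page_aligned() performs while write_lock is held:

// Illustrative microbenchmark: time a ~200 MB page-aligned allocation and the
// memcpy into it separately.
#include <chrono>
#include <cstring>
#include <iostream>
#include <stdlib.h>  // posix_memalign
#include <vector>

int main() {
  const size_t len = 210534400;     // _len of the bufferlist in the core dump
  std::vector<char> src(len, 'x');  // stand-in for the unaligned buffers

  auto t0 = std::chrono::steady_clock::now();
  void* dst = nullptr;
  if (posix_memalign(&dst, 4096, len) != 0) {
    std::cerr << "allocation failed\n";
    return 1;
  }
  auto t1 = std::chrono::steady_clock::now();
  std::memcpy(dst, src.data(), len);
  auto t2 = std::chrono::steady_clock::now();

  using ms = std::chrono::duration<double, std::milli>;
  std::cout << "alloc: " << ms(t1 - t0).count() << " ms, "
            << "copy: " << ms(t2 - t1).count() << " ms\n";
  free(dst);
  return 0;
}

If the allocation dominates on the affected host, that points at the allocator or the kernel's memory-reclaim behavior rather than the copy itself.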

#3

Updated by huang jun over 7 years ago

It may be related to our configuration: we set vm.zone_reclaim_mode=3, which will block memory allocation if there is not enough free memory in the local zone.
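
For reference, a small standalone sketch (illustrative only; it simply reads the standard procfs path) that reports the current value and decodes the bitmask described in the kernel's vm sysctl documentation:

// Illustrative: read vm.zone_reclaim_mode and decode the bitmask
// (1 = reclaim in the local zone, 2 = write out dirty pages during reclaim,
//  4 = swap pages during reclaim).
#include <fstream>
#include <iostream>

int main() {
  std::ifstream f("/proc/sys/vm/zone_reclaim_mode");
  int mode = -1;
  if (!(f >> mode)) {
    std::cerr << "could not read /proc/sys/vm/zone_reclaim_mode\n";
    return 1;
  }
  std::cout << "vm.zone_reclaim_mode = " << mode << "\n";
  if (mode & 1) std::cout << "  local zone reclaim enabled\n";
  if (mode & 2) std::cout << "  dirty pages written out during reclaim\n";
  if (mode & 4) std::cout << "  pages swapped during reclaim\n";
  return 0;
}

With mode 3, an allocation in a nearly full local zone can block on reclaim and dirty-page writeback instead of falling back to another zone, which fits a long stall inside align_bl() while write_lock is held.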

#4

Updated by huang jun over 7 years ago

We changed vm.zone_reclaim_mode to 0 and ran the same tests; the problem has not turned up since.

#5

Updated by Greg Farnum almost 7 years ago

  • Status changed from New to Closed

I don't think we can do anything at a micro level about running out of memory inside of mutexes.
