Bug #9545

closed

filestore stuck in journal->should_commit_now() loop on shutdown

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #9768: ceph-osd mkfs hangs (Duplicate, Loïc Dachary, 10/14/2014)

Actions #1

Updated by Sage Weil over 9 years ago

  • Assignee set to Sage Weil
  • Priority changed from Normal to Urgent
  • Source changed from other to Q/A

sync_entry is looping on the same seq while the main thread waits in umount. The journal's should_commit_now() is stuck returning true.
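To make the failure mode concrete, here is a minimal, self-contained model of a "commit again" loop that spins when the journal keeps asking for a commit but the sequence number never advances. This is only a sketch under stated assumptions: ModelJournal, LoopResult, run_sync_loop, the full flag, and the max_iters cap are all hypothetical names invented for illustration. It is not the FileStore or FileJournal code, and it does not describe the fix that was merged.

```cpp
#include <cstdint>

// Hypothetical stand-in for the journal. NOT Ceph's FileJournal.
struct ModelJournal {
    bool full = true;                 // assumed reason for "commit now"
    std::uint64_t committed_seq = 0;  // last sequence number committed

    // Models a should_commit_now()-style query.
    bool should_commit_now() const { return full; }

    // Assumption: the "commit now" condition clears only when a commit
    // actually advances the committed sequence number.
    void committed_thru(std::uint64_t seq) {
        if (seq > committed_seq) {
            committed_seq = seq;
            full = false;
        }
    }
};

struct LoopResult {
    bool exited;          // true if the loop ended on its own
    unsigned iterations;  // number of passes through the loop
};

// Models a sync thread that re-commits for as long as the journal asks.
// max_iters exists only so that this sketch terminates; a real thread
// would spin forever, and a join on it would never return.
LoopResult run_sync_loop(ModelJournal& j, std::uint64_t applied_seq,
                         unsigned max_iters) {
    unsigned n = 0;
    while (n < max_iters) {
        ++n;
        j.committed_thru(applied_seq);  // same seq each pass: no progress
        if (!j.should_commit_now()) {
            return {true, n};
        }
    }
    return {false, n};  // cap reached: the loop was stuck
}
```

With applied_seq equal to the already committed seq, the flag never clears and the loop runs until the cap is hit. With a newer seq it exits after one pass. In the model, a thread stuck in the first case is what a join in the shutdown path would block on.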

Actions #2

Updated by Sage Weil over 9 years ago

Thread 14 (Thread 0x7f6cfb791700 (LWP 18611)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007f6d01e773cb in ?? () from /usr/lib/x86_64-linux-gnu/liblttng-ust.so.0
#2  0x00007f6d00d50182 in start_thread (arg=0x7f6cfb791700) at pthread_create.c:312
#3  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 13 (Thread 0x7f6cfaf90700 (LWP 18612)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007f6d01e773cb in ?? () from /usr/lib/x86_64-linux-gnu/liblttng-ust.so.0
#2  0x00007f6d00d50182 in start_thread (arg=0x7f6cfaf90700) at pthread_create.c:312
#3  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 12 (Thread 0x7f6cfa78f700 (LWP 18613)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000ad9c4b in ceph::log::Log::entry (this=0x4790000) at log/Log.cc:345
#2  0x00007f6d00d50182 in start_thread (arg=0x7f6cfa78f700) at pthread_create.c:312
#3  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 11 (Thread 0x7f6cf9815700 (LWP 18614)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000ba4ff3 in WaitUntil (when=..., mutex=..., this=0x4795328) at ./common/Cond.h:71
#2  WaitInterval (interval=..., mutex=..., cct=<optimized out>, this=0x4795328) at ./common/Cond.h:79
#3  CephContextServiceThread::entry (this=0x47952b0) at common/ceph_context.cc:58
#4  0x00007f6d00d50182 in start_thread (arg=0x7f6cf9815700) at pthread_create.c:312
#5  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 10 (Thread 0x7f6cf9014700 (LWP 18615)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000009cad70 in Wait (mutex=..., this=0x47c8940) at ./common/Cond.h:55
#2  WBThrottle::get_next_should_flush (this=this@entry=0x47c8828, next=next@entry=0x7f6cf9013c40) at os/WBThrottle.cc:142
#3  0x00000000009cb81a in WBThrottle::entry (this=0x47c8828) at os/WBThrottle.cc:160
#4  0x00007f6d00d50182 in start_thread (arg=0x7f6cf9014700) at pthread_create.c:312
#5  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 9 (Thread 0x7f6cf8813700 (LWP 18616)):
#0  FileStore::sync_entry (this=0x47c8000) at os/FileStore.cc:3530
#1  0x0000000000963c6d in FileStore::SyncThread::entry (this=<optimized out>) at os/FileStore.h:173
#2  0x00007f6d00d50182 in start_thread (arg=0x7f6cf8813700) at pthread_create.c:312
#3  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 8 (Thread 0x7f6cf8012700 (LWP 18617)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x000000000076ea5e in Cond::Wait (this=0x471fae0, mutex=...) at ./common/Cond.h:55
#2  0x0000000000a684eb in FileJournal::write_thread_entry (this=0x471f500) at os/FileJournal.cc:1169
#3  0x00000000009620bd in FileJournal::Writer::entry (this=<optimized out>) at os/FileJournal.h:333
#4  0x00007f6d00d50182 in start_thread (arg=0x7f6cf8012700) at pthread_create.c:312
#5  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 7 (Thread 0x7f6cf7811700 (LWP 18618)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000ad0c0c in Wait (mutex=..., this=0x47c8080) at ./common/Cond.h:55
#2  Finisher::finisher_thread_entry (this=0x47c8020) at common/Finisher.cc:81
#3  0x00007f6d00d50182 in start_thread (arg=0x7f6cf7811700) at pthread_create.c:312
#4  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x7f6cf7010700 (LWP 18619)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000b81b0b in WaitUntil (when=..., mutex=..., this=0x47c8cb8) at common/Cond.h:71
#2  WaitInterval (interval=..., mutex=..., cct=<optimized out>, this=0x47c8cb8) at common/Cond.h:79
#3  ThreadPool::worker (this=0x47c8c40, wt=0x479a490) at common/WorkQueue.cc:146
#4  0x0000000000b82f50 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#5  0x00007f6d00d50182 in start_thread (arg=0x7f6cf7010700) at pthread_create.c:312
#6  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 5 (Thread 0x7f6cf680f700 (LWP 18620)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000b81b0b in WaitUntil (when=..., mutex=..., this=0x47c8cb8) at common/Cond.h:71
#2  WaitInterval (interval=..., mutex=..., cct=<optimized out>, this=0x47c8cb8) at common/Cond.h:79
#3  ThreadPool::worker (this=0x47c8c40, wt=0x479a430) at common/WorkQueue.cc:146
#4  0x0000000000b82f50 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#5  0x00007f6d00d50182 in start_thread (arg=0x7f6cf680f700) at pthread_create.c:312
#6  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7f6cf600e700 (LWP 18621)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000ad0c0c in Wait (mutex=..., this=0x47c8b60) at ./common/Cond.h:55
#2  Finisher::finisher_thread_entry (this=0x47c8b00) at common/Finisher.cc:81
#3  0x00007f6d00d50182 in start_thread (arg=0x7f6cf600e700) at pthread_create.c:312
#4  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7f6cf580d700 (LWP 18622)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000ad0c0c in Wait (mutex=..., this=0x47c8520) at ./common/Cond.h:55
#2  Finisher::finisher_thread_entry (this=0x47c84c0) at common/Finisher.cc:81
#3  0x00007f6d00d50182 in start_thread (arg=0x7f6cf580d700) at pthread_create.c:312
#4  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7f6cf500c700 (LWP 18623)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000b79e5c in Wait (mutex=..., this=0x47c8710) at common/Cond.h:55
#2  SafeTimer::timer_thread (this=0x47c8700) at common/Timer.cc:112
#3  0x0000000000b7bb8d in SafeTimerThread::entry (this=<optimized out>) at common/Timer.cc:38
#4  0x00007f6d00d50182 in start_thread (arg=0x7f6cf500c700) at pthread_create.c:312
#5  0x00007f6cfcb5238d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7f6d022b1900 (LWP 18610)):
#0  0x00007f6d00d5166b in pthread_join (threadid=140106002413312, thread_return=0x0) at pthread_join.c:92
#1  0x0000000000b71f52 in Thread::join (this=this@entry=0x47c87e0, prval=prval@entry=0x0) at common/Thread.cc:139
#2  0x00000000009282e3 in FileStore::umount (this=0x47c8000) at os/FileStore.cc:1598
#3  0x000000000063ed27 in main (argc=<optimized out>, argv=<optimized out>) at tools/ceph_objectstore_tool.cc:2542

Actions #3

Updated by Sage Weil over 9 years ago

  • Status changed from New to Fix Under Review

Actions #4

Updated by Samuel Just over 9 years ago

  • Status changed from Fix Under Review to 7

Actions #5

Updated by Samuel Just over 9 years ago

  • Status changed from 7 to Resolved