Bug #172

closed

OSD and MDS crash on rm -r

Added by Wido den Hollander almost 14 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%

Description

I'm still using my test script which unpacks the kernel source and then removes it again with a few steps in between.

Right now the copy back from the snapshot goes fine, but the rm of the original kernel files afterwards fails.

Attached you will find various files with straces and logs in them; I'll try to point out the scenario:

The script runs fine and the copy back from the snapshot goes fine, except for the messages in "ceph_client_script_log.txt".

Why can't those files be found? That seems like a different bug to me?

Well, the bug of the stalling cp seems to have been fixed, but while removing the files afterwards the MDS crashes first and, 15 seconds later, all 5 OSDs go down as well.

In the MDS strace and OSD strace I've added a stat of /core; there you can see that the MDS core dump is older than the OSD core dump, so the MDS crashed first.
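
The ordering argument above can be sketched in a few lines of Python. This is only an illustration, not part of the attached files: the function name `crash_order` and the idea of passing a mapping of daemon names to core file paths are assumptions, and it presumes the core files are reachable from one machine with their modification times preserved. The comparison is only meaningful if the clocks of the hosts involved are in sync.

```python
import os
from typing import Dict, List, Tuple


def crash_order(core_paths: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (daemon, mtime) pairs sorted with the oldest core dump first.

    core_paths maps a daemon label (hypothetical, e.g. "mds1") to the path
    of its core file. The modification time reported by stat is used as an
    approximation of the crash time.
    """
    stamped = [(name, os.stat(path).st_mtime) for name, path in core_paths.items()]
    # Oldest modification time first, i.e. the daemon that crashed first.
    return sorted(stamped, key=lambda item: item[1])


# Hypothetical usage, with paths that would have to exist locally:
#   for name, mtime in crash_order({"mds1": "cores/mds1_core", "osd0": "cores/osd0_core"}):
#       print(name, mtime)
```

The first entry of the returned list is the daemon whose core dump is oldest, which in the scenario described here would be the MDS.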

After gathering all this information I started my OSDs and the crashed MDS (192.168.6.206) again. While doing so, my cluster started to recover, but then mds0 (192.168.6.205) crashed. (Core and log are also attached in mds0_*.)

I'm doing a clean mkcephfs right now and running the same test again. I expect the same result, since this is the second time it has happened today.

My environment:
  • Branch: unstable ( 7c0df0540700fe2816470f5cc2a2fc7a130e4456 )
  • OS: Ubuntu 10.04 (AMD64)
  • Kernel: 2.6.34

Files

mds1_crash_strace.txt (2.09 KB), Wido den Hollander, 06/03/2010 05:11 AM
osd_crash_log.txt (63.9 KB), Wido den Hollander, 06/03/2010 05:11 AM
osd_crash_trace.txt (9.66 KB), Wido den Hollander, 06/03/2010 05:11 AM
mds0_crash_strace.txt (2.37 KB), Wido den Hollander, 06/03/2010 05:11 AM
ceph_client_test_script.sh (840 Bytes), Wido den Hollander, 06/03/2010 05:11 AM
ceph_client_script_log.txt (1.51 KB), Wido den Hollander, 06/03/2010 05:11 AM
ceph_client_ps.txt (632 Bytes), Wido den Hollander, 06/03/2010 05:11 AM
ceph_client_kernel_log.txt (1.67 KB), Wido den Hollander, 06/03/2010 05:11 AM
mds0_crash_log.txt (20 KB), Wido den Hollander, 06/03/2010 05:11 AM
mds1_crash_log.txt (26.9 KB), Wido den Hollander, 06/03/2010 05:11 AM
mds0_crash_second_run_log.txt (26 KB), MDS crash of the second run, Wido den Hollander, 06/04/2010 06:22 AM