Bug #202

closed

OSD crash during reads from cluster

Added by Wido den Hollander almost 14 years ago. Updated over 13 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%


Description

Today I noticed one OSD crashing during read operations (rsync) from my cluster.

I don't know if it matters, but the crashes started after I added an extra OSD to my cluster. That OSD hasn't been added to the CRUSH map yet.

The full log can be found at: http://zooi.widodh.nl/ceph/ceph05.10199.gz

The backtrace from gdb:

root@ceph05:~# gdb /usr/lib/debug/usr/bin/cosd /core.ceph05.10199 
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" 
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/lib/debug/usr/bin/cosd...done.
[New Thread 10250]
[New Thread 10251]
[New Thread 10253]
[New Thread 10252]
[New Thread 10254]
[New Thread 10262]
[New Thread 10255]
[New Thread 10264]
[New Thread 10256]
[New Thread 10265]
[New Thread 10258]
[New Thread 10260]
[New Thread 10267]
[New Thread 10266]
[New Thread 10269]
[New Thread 10268]
[New Thread 10271]
[New Thread 10270]
[New Thread 10275]
[New Thread 10272]
[New Thread 10277]
[New Thread 10279]
[New Thread 10278]
[New Thread 10281]
[New Thread 10283]
[New Thread 10284]
[New Thread 10286]
[New Thread 10285]
[New Thread 10287]
[New Thread 10291]
[New Thread 10199]
[New Thread 10292]
[New Thread 10239]
[New Thread 10201]
[New Thread 10244]
[New Thread 10202]
[New Thread 10246]
[New Thread 10234]
[New Thread 10247]
[New Thread 10235]
[New Thread 10248]
[New Thread 10236]
[New Thread 10238]
[New Thread 10241]
[New Thread 10243]
[New Thread 10245]
[New Thread 10249]
[New Thread 10240]
[New Thread 10237]

warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
Core was generated by `/usr/bin/cosd -i 5 -c /etc/ceph/ceph.conf'.
Program terminated with signal 6, Aborted.
#0  0x00007f60ddddca75 in ?? ()
(gdb) bt
#0  0x00007f60ddddca75 in ?? ()
#1  0x00007f60ddde05c0 in ?? ()
#2  0x0000000000000000 in ?? ()
(gdb) 

After the crash I restarted the OSD, but a few minutes later it crashed again, this time with a bit more information in the log. The log of the second crash can be found at: http://zooi.widodh.nl/ceph/ceph05.10671.gz

The backtrace of the second crash was the same as the first one.
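
Both traces stop at unresolved `??` frames, so they say little about where the abort happened. One possible cause (an assumption, not verified against this core) is that gdb was given the separate debug file under /usr/lib/debug instead of the executable named in the "Core was generated by" line. The sketch below only prints a gdb command that loads /usr/bin/cosd and dumps a backtrace for every thread. The helper name `bt_command` is hypothetical; `-batch`, `-ex` and `thread apply all bt` are standard gdb options and commands.

```shell
# Hypothetical helper: print (do not run) a gdb command that dumps a
# backtrace for every thread recorded in a core file.
bt_command() {
    binary="$1"   # executable that produced the core, e.g. /usr/bin/cosd
    core="$2"     # core file, e.g. /core.ceph05.10199
    printf 'gdb -batch -ex "thread apply all bt" %s %s\n' "$binary" "$core"
}

# Paths taken from the gdb session above.
bt_command /usr/bin/cosd /core.ceph05.10199
```

Printing the command first lets it be checked before it is run on the OSD host. Whether the frames resolve still depends on matching debug symbols being installed for the exact cosd build that crashed.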
