Bug #3773

closed

mds crashed at LogEvent::decode

Added by Tamilarasi muthamizhan over 11 years ago. Updated almost 8 years ago.

Status:
Can't reproduce
Priority:
High
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)

I had a cluster (burnupi06, burnupi07, burnupi08) running ceph version 0.56 and upgraded the nodes one by one to v0.56.1 while driving I/O from the client.

The MDS crashed when it was upgraded and restarted.

0> 2013-01-09 11:35:36.314638 7fa65d038700 -1 mds/LogEvent.cc: In function 'static LogEvent* LogEvent::decode(ceph::bufferlist&)' thread 7fa65d038700 time 2013-01-09 11:35:34.715866
mds/LogEvent.cc: 95: FAILED assert(p.end())
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
1: (LogEvent::decode(ceph::buffer::list&)+0x9ff) [0x6b308f]
2: (MDLog::_replay_thread()+0x2d8) [0x69e678]
3: (MDLog::ReplayThread::entry()+0xd) [0x4c7c5d]
4: (()+0x7e9a) [0x7fa664f2ae9a]
5: (clone()+0x6d) [0x7fa663dcbcbd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
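
For readers unfamiliar with this kind of failure: the failed check, assert(p.end()), appears to verify that the decoder's iterator reached the end of the buffer, i.e. that no undecoded bytes remain after the event is decoded. The sketch below illustrates that kind of check in Python. It is not Ceph's code; the record layout (a u32 type, a u32 length, then the payload) and the name decode_event are hypothetical and chosen only for illustration.

```python
import struct
from typing import Tuple


def decode_event(buf: bytes) -> Tuple[int, bytes]:
    """Decode one hypothetical record: u32 type, u32 payload length, payload."""
    offset = 0
    # Read the fixed-size header (little-endian, two unsigned 32-bit ints).
    event_type, length = struct.unpack_from("<II", buf, offset)
    offset += 8
    payload = buf[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    offset += length
    # Analogue of assert(p.end()): the whole buffer must have been consumed.
    assert offset == len(buf), "trailing bytes after decoded event"
    return event_type, payload
```

With this layout, a well-formed record decodes cleanly, while the same record followed by even one extra byte trips the final assertion. A mismatch between the encoding the writer used and the one the reader expects is one way to end up with leftover bytes, though this report does not establish the cause here.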

ubuntu@burnupi06:~$ sudo cat /etc/ceph/ceph.conf
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
debug ms = 1

[osd]
osd journal size = 1000
filestore xattr use omap = true
debug osd = 20

[osd.1]
host = burnupi06

[osd.2]
host = burnupi06

[osd.3]
host = burnupi07

[osd.4]
host = burnupi07

[osd.5]
host = burnupi08

[osd.6]
host = burnupi08

[mon.a]
host = burnupi06
mon addr = 10.214.133.8:6789

[mon.b]
host = burnupi07
mon addr = 10.214.134.38:6789

[mon.c]
host = burnupi08
mon addr = 10.214.134.36:6789

[mds.a]
host = burnupi08

[client.radosgw.gateway]
host = burnupi06
keyring = /etc/ceph/keyring.radosgw.gateway
rgw socket path = /tmp/radosgw.sock
log file = /var/log/ceph/radosgw.log

I am leaving the cluster in its current state.
