Bug #3948 (closed): problems from leveldb static linkage and leveldb downgrade

Added by Corin Langosch about 11 years ago. Updated about 11 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Two days ago I upgraded one of my osds to 0.48.3 (see http://tracker.ceph.com/issues/3797) and everything worked fine so far. However, tonight this single osd crashed. I never had such a crash before. There were no network problems of any kind, and the whole cluster was only very lightly loaded during the past few hours. There was also no clock drift (all ntp daemons up and in sync). Are there any changes between .2 and .3 which could cause this (it seems filestore related - a leveldb problem?)? Can I restart the osd or should I recreate it?


Files

ceph-osd.8.log (1.73 MB), Corin Langosch, 01/28/2013 10:50 PM
ceph-logs.tar.gz (833 KB), Corin Langosch, 01/31/2013 07:40 PM

Related issues: 1 (0 open, 1 closed)

Blocked by Ceph - Bug #3945: osd: dynamically link to leveldb (Resolved, 01/28/2013)

Actions #1

Updated by Sage Weil about 11 years ago

  • Status changed from New to Need More Info

Corin-

Just restart the osd, and check dmesg for any kernel malfeasance... that is usually what triggers this. Were there any other unusual circumstances?
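
As a rough sketch of the checks and restart described above (the osd id is taken from this report; the sysvinit-style service call is an assumption and may differ per distro):

# Look for kernel-level trouble (I/O errors, hung tasks, OOM kills) around the crash time
dmesg | tail -n 200
grep -iE 'error|hung|out of memory' /var/log/kern.log

# Restart only the crashed osd (osd.8 here); /etc/init.d/ceph restart osd.8 should work the same way
service ceph restart osd.8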

Actions #2

Updated by Corin Langosch about 11 years ago

Hi Sage,

does it matter that the OSD has now been down for around 1-2 days, or will it just pick up any changes made to the data/cluster so far?

I double checked syslog, the kernel log and dmesg, and there was not a single error or anything strange. Our cluster is also quite well monitored (each machine has a newrelic and munin client, and we also have internal smokeping to measure latencies) and nothing indicates any errors, hangs, load peaks, ...

When I re-enable the osd within the next few hours (after your ok), which debug settings should I enable so we can track down the bug in case it occurs again after a few days? I won't upgrade any other osds until I know 0.48.3 is as stable as 0.48.2 has always been for us... :)

Thanks,
Corin

Actions #3

Updated by Corin Langosch about 11 years ago

Hi Sage!

Today I was brave and upgraded two more nodes (one has 1 osd, the other 3 osds). It worked for some time, but then suddenly 2 of the upgraded osds crashed and my whole cluster hung. I tried restarting them, but they crashed again after a few seconds. Please note that none of the "old" osds crashed. So I downgraded all osds to .2, restarted them and phew... the cluster recovered fine, and so far everything seems to work again.

So there must be a really nasty new bug which was not present in .2, and I strongly recommend against upgrading.

Please find the logs of the osds I upgraded attached.

Corin

Actions #4

Updated by Samuel Just about 11 years ago

Both osd.7 and osd.15 have corrupted leveldb state. It's likely related to downgrading and then upgrading leveldb.

Actions #5

Updated by Corin Langosch about 11 years ago

Well, after downgrading them they seem to work stably again. If it's related to leveldb, then upgrading leveldb as this patch does is not possible. It's not the downgrading that causes the trouble.

Actions #6

Updated by Corin Langosch about 11 years ago

After the downgrade my cluster is still stable and no osd has crashed so far.

What can I do to upgrade to the latest argonaut? I'd even prefer upgrading to the latest bobtail, but I assume it has the same problem. And I don't really want to try, because I guess I wouldn't be able to downgrade afterwards.

If there's anything I can do to help you track down and fix this bug please let me know.

Actions #7

Updated by Sage Weil about 11 years ago

Corin Langosch wrote:

After the downgrade my cluster is still stable and no osd has crashed so far.

What can I do to upgrade to the latest argonaut? I'd even prefer upgrading to the latest bobtail, but I assume it has the same problem. And I don't really want to try, because I guess I wouldn't be able to downgrade afterwards.

If there's anything I can do to help you track down and fix this bug please let me know.

#3945 is tracking the dynamic linking to leveldb. Once that is done, we can cherry-pick it for your cluster against bobtail. Alternatively, I can push a branch with the newer leveldb (the same version quantal has) linked in statically as a temporary fix.

It sounds like the up/downgrade may have caused issues on some of your leveldb databases, though, so it may be necessary to migrate data off those osds and replace them. It probably depends on whether you're in a hurry to upgrade.
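
For reference, draining one of the affected osds (osd.7 from comment #4) before replacing it might look roughly like this; a generic sketch using the standard CLI, not something taken from the ticket itself:

# Mark the osd out so its placement groups are re-replicated onto other osds
ceph osd out 7

# Watch recovery until all pgs are active+clean again
ceph -w

# Then stop the daemon and remove the osd from the crush map, auth database and osd map
service ceph stop osd.7
ceph osd crush remove osd.7
ceph auth del osd.7
ceph osd rm 7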

Actions #8

Updated by Corin Langosch about 11 years ago

It's not really urgent, but being able to upgrade to the latest argonaut (and, if that works for 2-3 days, to the latest bobtail) within the next 1-2 weeks would be great. Bobtail has several fixes I'm quite in need of (for example, Windows KVM guests no longer crash, as #3521 has been fixed). It just has to be as rock solid as 0.48.2 currently seems to be :)

Do you really think it's only the leveldb data? I mean, if the data were corrupted, why does the downgraded version work perfectly fine with it (for several days now, on several osds)? I'd be quite stunned if leveldb didn't check whether the on-disk data format is supported and just returned wrong data to the user. Normally a program with invalid data just crashes (and a database especially should take data integrity really seriously). Should I report this on the leveldb tracker?

Actions #9

Updated by Sage Weil about 11 years ago

  • Subject changed from osd crashed with FAILED assert(0 == "hit suicide timeout") to problems from leveldb static linkage and leveldb downgrade
  • Category set to OSD
Actions #10

Updated by Ian Colle about 11 years ago

  • Status changed from Need More Info to In Progress
  • Assignee set to Anonymous

We'll create a branch off of Bobtail with the fix for #3945.

Actions #11

Updated by Corin Langosch about 11 years ago

Will you also add checks to ensure ceph doesn't start if the on disk data is incompatible with the version of the library in use?

Actions #12

Updated by Sage Weil about 11 years ago

Unfortunately there doesn't appear to be a way to detect that with leveldb's current API. (If there were, I would expect leveldb to check for incompatible up/downgrades internally, and we wouldn't hit this issue in the first place. :)

Future releases, starting with v0.58, will use the installed libleveldb. And we'll make a bobtail branch for you so you can safely stick with v0.56.x.

Actions #13

Updated by Anonymous about 11 years ago

Created branch bobtail-leveldb with leveldb changes.

Actions #14

Updated by Corin Langosch about 11 years ago

Great, thanks. Can I upgrade only a single osd (out of around 16) to this branch and keep the others running argonaut 0.48.2? If the upgraded osd runs stably for a couple of days, I'd then upgrade them all.

Actions #15

Updated by Sage Weil about 11 years ago

Yes

Actions #16

Updated by Anonymous about 11 years ago

  • Status changed from In Progress to Resolved
Actions #17

Updated by Corin Langosch about 11 years ago

I just tried to update one of my osds, but it doesn't start. I tried it twice:

ceph version 0.56.3-28-g2457211 (24572111607b3f2a89c2db8bd4acd5f9bf3fd22c)

2013-03-08 19:05:16.420338 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is supported and appears to work
2013-03-08 19:05:16.420403 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-03-08 19:05:16.424565 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount did NOT detect btrfs
2013-03-08 19:05:16.429690 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-03-08 19:05:16.429956 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount found snaps <>
2013-03-08 19:05:16.541282 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount: enabling WRITEAHEAD journal mode: btrfs not detected
2013-03-08 19:05:16.549733 7f3218cba780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 19: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:05:16.666846 7f3218cba780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 19: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:05:16.698963 7f3218cba780 1 journal close /var/lib/ceph/osd/ceph-8/journal
2013-03-08 19:05:16.717722 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is supported and appears to work
2013-03-08 19:05:16.717746 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-03-08 19:05:16.718180 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount did NOT detect btrfs
2013-03-08 19:05:16.723162 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-03-08 19:05:16.723806 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount found snaps <>
2013-03-08 19:05:16.732368 7f3218cba780 0 filestore(/var/lib/ceph/osd/ceph-8) mount: enabling WRITEAHEAD journal mode: btrfs not detected
2013-03-08 19:05:16.744101 7f3218cba780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 26: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:05:16.745044 7f3218cba780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 26: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:05:16.778834 7f3218cba780 0 osd.8 13882 crush map has features 262144, adjusting msgr requires for clients
2013-03-08 19:05:16.778877 7f3218cba780 0 osd.8 13882 crush map has features 262144, adjusting msgr requires for osds
2013-03-08 19:05:17.701661 7f3218cba780 1 journal close /var/lib/ceph/osd/ceph-8/journal
2013-03-08 19:05:17.702858 7f3218cba780 -1 ** ERROR: osd init failed: (95) Operation not supported
2013-03-08 19:06:06.117936 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is supported and appears to work
2013-03-08 19:06:06.117989 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-03-08 19:06:06.118961 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount did NOT detect btrfs
2013-03-08 19:06:06.122584 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-03-08 19:06:06.122864 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount found snaps <>
2013-03-08 19:06:06.127184 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount: enabling WRITEAHEAD journal mode: btrfs not detected
2013-03-08 19:06:06.145721 7f73dd9d1780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 18: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:06:06.189452 7f73dd9d1780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 18: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:06:06.196632 7f73dd9d1780 1 journal close /var/lib/ceph/osd/ceph-8/journal
2013-03-08 19:06:06.217373 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is supported and appears to work
2013-03-08 19:06:06.217403 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-03-08 19:06:06.218319 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount did NOT detect btrfs
2013-03-08 19:06:06.231650 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-03-08 19:06:06.231868 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount found snaps <>
2013-03-08 19:06:06.236043 7f73dd9d1780 0 filestore(/var/lib/ceph/osd/ceph-8) mount: enabling WRITEAHEAD journal mode: btrfs not detected
2013-03-08 19:06:06.243854 7f73dd9d1780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 26: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:06:06.244595 7f73dd9d1780 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 26: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2013-03-08 19:06:06.246056 7f73dd9d1780 0 osd.8 13882 crush map has features 262144, adjusting msgr requires for clients
2013-03-08 19:06:06.246065 7f73dd9d1780 0 osd.8 13882 crush map has features 262144, adjusting msgr requires for osds
2013-03-08 19:06:06.298262 7f73dd9d1780 1 journal close /var/lib/ceph/osd/ceph-8/journal
2013-03-08 19:06:06.298551 7f73dd9d1780 -1 ** ERROR: osd init failed: (95) Operation not supported

Actions #18

Updated by Corin Langosch about 11 years ago

Seems like I can't even downgrade anymore... :-( I'll delete all files of this osd and restart it using the old version of ceph to get my cluster clean again.
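
Recreating the wiped osd would look roughly like this (a sketch only; the --mkkey/ceph auth step assumes cephx is enabled and may not be needed otherwise):

service ceph stop osd.8
rm -rf /var/lib/ceph/osd/ceph-8/*               # wipe the corrupted store
ceph-osd -i 8 --mkfs --mkjournal --mkkey        # recreate an empty object store, journal and key
ceph auth add osd.8 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-8/keyring
service ceph start osd.8                        # the osd then backfills from its peers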

Actions #19

Updated by Anonymous about 11 years ago

This might be an authentication issue. That is one of the possible causes for the osd init failed error. I believe the defaults changed between Argonaut and Bobtail.

Actions #20

Updated by Greg Farnum about 11 years ago

This is the wrong area for it to be authentication. Sam probably has a better idea of what precisely it is; I would need to go code diving. Usually it's something silly like xattrs on ext4, but I assume that's not the issue here.

Actions #21

Updated by Corin Langosch about 11 years ago

I'm using XFS everywhere. Kernel is stock ubuntu quantal (3.5.x) on these systems.

Actions #22

Updated by Sage Weil about 11 years ago

Can you reproduce the EOPNOTSUPP with debug filestore = 20 and debug osd = 20?
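
Since the osd dies during init, one way to capture those settings is to run it in the foreground with the debug options on the command line (a sketch; the equivalent ceph.conf entries are shown as comments and the osd id is taken from the log above):

# Equivalent ceph.conf settings under [osd.8]:
#   debug osd = 20
#   debug filestore = 20

# Run the osd in the foreground, log to stderr, and capture the output
ceph-osd -i 8 -d --debug-osd 20 --debug-filestore 20 2>&1 | tee osd8-debug.log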
