Bug #13815

closed

OSDs failed after upgrade from 0.80.10 to 0.94.5

Added by Kosta Velikov over 8 years ago. Updated over 8 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%


Description

Hello,
A few hours after upgrading from Firefly 0.80.10 to 0.94.5, 2 OSDs died simultaneously. The upgrade was done with ceph-deploy and reported no errors. My setup is 5 nodes with 4 OSDs each, plus 3 MONs. Since the OSDs died, 19 PGs have been stale:

cluster b2434ea3-90d7-4aa7-b6ee-f5bc6b642c0c
health HEALTH_WARN
19 pgs stale
19 pgs stuck stale
57 requests are blocked > 32 sec
too many PGs per OSD (327 > max 300)
monmap e1: 3 mons at {compute-1=10.16.0.111:6789/0,compute-2=10.16.0.112:6789/0,controller-1=10.16.0.101:6789/0}
election epoch 7564, quorum 0,1,2 controller-1,compute-1,compute-2
osdmap e11993: 20 osds: 18 up, 18 in
pgmap v31723603: 2944 pgs, 19 pools, 3586 GB data, 751 kobjects
7106 GB used, 43026 GB / 50132 GB avail
2925 active+clean
19 stale+active+clean
client io 1113 kB/s wr, 394 op/s
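
A side note on the `too many PGs per OSD (327 > max 300)` warning: the figure is consistent with the counts in the status output if one assumes that every pool keeps two replicas. That replica count is an assumption, suggested only by the two-OSD acting sets in the health detail further down; the status output itself does not state it. A minimal sketch of the arithmetic:

```python
# Rough check of the "too many PGs per OSD (327 > max 300)" warning.
# ASSUMPTION: every pool has replication size 2; the status output does not say so.
total_pgs = 2944        # from pgmap: "2944 pgs, 19 pools"
osds_in = 18            # from osdmap: "20 osds: 18 up, 18 in"
assumed_replicas = 2    # hypothetical value, see the note above

# Each PG is stored once per replica, so PG copies are spread over the "in" OSDs.
pg_copies_per_osd = total_pgs * assumed_replicas // osds_in
print(pg_copies_per_osd)
```

With these inputs the result matches the 327 reported in the warning. The warning is independent of the OSD crash itself.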

Both OSDs show similar errors when I try to start them:

/usr/bin/ceph-osd --cluster=ceph -i 16 -f
starting osd.16 at :/0 osd_data /var/lib/ceph/osd/ceph-16 /var/lib/ceph/osd/ceph-16/journal
2015-11-17 00:06:30.897000 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 1f//head//14 in index: (2) No such file or directory
os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f10094f2900 time 2015-11-17 00:06:30.934662
os/FileStore.cc: 2757: FAILED assert(0 == "unexpected error")
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc60eb]
2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
3: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x92a3a4]
4: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94355b]
5: (FileStore::mount()+0x3bb6) [0x9139f6]
6: (OSD::init()+0x259) [0x6c59b9]
7: (main()+0x2860) [0x6527e0]
8: (__libc_start_main()+0xf5) [0x7f10065e9ec5]
9: /usr/bin/ceph-osd() [0x66b887]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-11-17 00:06:30.898621 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 1b//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.898725 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 2b//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.898855 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 22//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.898950 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 39//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.899045 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 33//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.899140 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4a//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.899686 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 41//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.899782 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 5a//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.899876 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 53//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.899976 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 65//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900072 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 62//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900221 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 7b//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900331 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 8e//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900392 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 84//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900434 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find ae//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900476 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find ac//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900521 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a6//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900563 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a7//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900605 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a4//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900647 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a3//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900689 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find c5//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.900730 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e5//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.913951 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e1//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.914068 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.914197 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.920081 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 13//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920242 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 12//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920339 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 2b//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920439 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 3f//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920533 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 39//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920629 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 38//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920723 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 36//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920841 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4f//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.920937 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4e//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.932507 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4b//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.932603 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 43//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.932682 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 6f//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.932763 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 6a//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.932842 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 69//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933025 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 63//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933172 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 76//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933250 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 86//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933332 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 84//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933409 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find ac//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933513 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a8//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933590 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a2//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933667 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find b6//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933745 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find b2//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933824 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find c7//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.933906 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e9//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.934362 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e//head//15 in index: (2) No such file or directory
2015-11-17 00:06:30.939899 7f10094f2900 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f10094f2900 time 2015-11-17 00:06:30.934662
os/FileStore.cc: 2757: FAILED assert(0 == "unexpected error")
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc60eb]
2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
3: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x92a3a4]
4: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94355b]
5: (FileStore::mount()+0x3bb6) [0x9139f6]
6: (OSD::init()+0x259) [0x6c59b9]
7: (main()+0x2860) [0x6527e0]
8: (__libc_start_main()+0xf5) [0x7f10065e9ec5]
9: /usr/bin/ceph-osd() [0x66b887]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
-466> 2015-11-17 00:06:30.897000 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 1f//head//14 in index: (2) No such file or directory
-456> 2015-11-17 00:06:30.898621 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 1b//head//14 in index: (2) No such file or directory
-449> 2015-11-17 00:06:30.898725 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 2b//head//14 in index: (2) No such file or directory
-442> 2015-11-17 00:06:30.898855 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 22//head//14 in index: (2) No such file or directory
-435> 2015-11-17 00:06:30.898950 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 39//head//14 in index: (2) No such file or directory
-428> 2015-11-17 00:06:30.899045 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 33//head//14 in index: (2) No such file or directory
-421> 2015-11-17 00:06:30.899140 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4a//head//14 in index: (2) No such file or directory
-414> 2015-11-17 00:06:30.899686 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 41//head//14 in index: (2) No such file or directory
-407> 2015-11-17 00:06:30.899782 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 5a//head//14 in index: (2) No such file or directory
-400> 2015-11-17 00:06:30.899876 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 53//head//14 in index: (2) No such file or directory
-393> 2015-11-17 00:06:30.899976 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 65//head//14 in index: (2) No such file or directory
-386> 2015-11-17 00:06:30.900072 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 62//head//14 in index: (2) No such file or directory
-379> 2015-11-17 00:06:30.900221 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 7b//head//14 in index: (2) No such file or directory
-372> 2015-11-17 00:06:30.900331 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 8e//head//14 in index: (2) No such file or directory
-365> 2015-11-17 00:06:30.900392 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 84//head//14 in index: (2) No such file or directory
-358> 2015-11-17 00:06:30.900434 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find ae//head//14 in index: (2) No such file or directory
-351> 2015-11-17 00:06:30.900476 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find ac//head//14 in index: (2) No such file or directory
-344> 2015-11-17 00:06:30.900521 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a6//head//14 in index: (2) No such file or directory
-337> 2015-11-17 00:06:30.900563 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a7//head//14 in index: (2) No such file or directory
-330> 2015-11-17 00:06:30.900605 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a4//head//14 in index: (2) No such file or directory
-323> 2015-11-17 00:06:30.900647 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a3//head//14 in index: (2) No such file or directory
-316> 2015-11-17 00:06:30.900689 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find c5//head//14 in index: (2) No such file or directory
-309> 2015-11-17 00:06:30.900730 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e5//head//14 in index: (2) No such file or directory
-302> 2015-11-17 00:06:30.913951 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e1//head//14 in index: (2) No such file or directory
-295> 2015-11-17 00:06:30.914068 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e//head//14 in index: (2) No such file or directory
-288> 2015-11-17 00:06:30.914197 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4//head//14 in index: (2) No such file or directory
-185> 2015-11-17 00:06:30.920081 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 13//head//15 in index: (2) No such file or directory
-178> 2015-11-17 00:06:30.920242 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 12//head//15 in index: (2) No such file or directory
-171> 2015-11-17 00:06:30.920339 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 2b//head//15 in index: (2) No such file or directory
-164> 2015-11-17 00:06:30.920439 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 3f//head//15 in index: (2) No such file or directory
-157> 2015-11-17 00:06:30.920533 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 39//head//15 in index: (2) No such file or directory
-150> 2015-11-17 00:06:30.920629 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 38//head//15 in index: (2) No such file or directory
-143> 2015-11-17 00:06:30.920723 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 36//head//15 in index: (2) No such file or directory
-136> 2015-11-17 00:06:30.920841 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4f//head//15 in index: (2) No such file or directory
-129> 2015-11-17 00:06:30.920937 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4e//head//15 in index: (2) No such file or directory
-122> 2015-11-17 00:06:30.932507 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 4b//head//15 in index: (2) No such file or directory
-115> 2015-11-17 00:06:30.932603 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 43//head//15 in index: (2) No such file or directory
-108> 2015-11-17 00:06:30.932682 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 6f//head//15 in index: (2) No such file or directory
-101> 2015-11-17 00:06:30.932763 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 6a//head//15 in index: (2) No such file or directory
-94> 2015-11-17 00:06:30.932842 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 69//head//15 in index: (2) No such file or directory
-87> 2015-11-17 00:06:30.933025 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 63//head//15 in index: (2) No such file or directory
-80> 2015-11-17 00:06:30.933172 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 76//head//15 in index: (2) No such file or directory
-73> 2015-11-17 00:06:30.933250 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 86//head//15 in index: (2) No such file or directory
-66> 2015-11-17 00:06:30.933332 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 84//head//15 in index: (2) No such file or directory
-59> 2015-11-17 00:06:30.933409 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find ac//head//15 in index: (2) No such file or directory
-52> 2015-11-17 00:06:30.933513 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a8//head//15 in index: (2) No such file or directory
-45> 2015-11-17 00:06:30.933590 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find a2//head//15 in index: (2) No such file or directory
-38> 2015-11-17 00:06:30.933667 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find b6//head//15 in index: (2) No such file or directory
-31> 2015-11-17 00:06:30.933745 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find b2//head//15 in index: (2) No such file or directory
-24> 2015-11-17 00:06:30.933824 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find c7//head//15 in index: (2) No such file or directory
-17> 2015-11-17 00:06:30.933906 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e9//head//15 in index: (2) No such file or directory
-10> 2015-11-17 00:06:30.934362 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find e//head//15 in index: (2) No such file or directory
0> 2015-11-17 00:06:30.939899 7f10094f2900 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f10094f2900 time 2015-11-17 00:06:30.934662
os/FileStore.cc: 2757: FAILED assert(0 == "unexpected error")
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc60eb]
2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
3: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x92a3a4]
4: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94355b]
5: (FileStore::mount()+0x3bb6) [0x9139f6]
6: (OSD::init()+0x259) [0x6c59b9]
7: (main()+0x2860) [0x6527e0]
8: (__libc_start_main()+0xf5) [0x7f10065e9ec5]
9: /usr/bin/ceph-osd() [0x66b887]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (Aborted) **
in thread 7f10094f2900
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: /usr/bin/ceph-osd() [0xacd7ba]
2: (()+0x10340) [0x7f1008180340]
3: (gsignal()+0x39) [0x7f10065fecc9]
4: (abort()+0x148) [0x7f10066020d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f1006f106b5]
6: (()+0x5e836) [0x7f1006f0e836]
7: (()+0x5e863) [0x7f1006f0e863]
8: (()+0x5eaa2) [0x7f1006f0eaa2]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0xbc62d8]
10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
11: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x92a3a4]
12: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94355b]
13: (FileStore::mount()+0x3bb6) [0x9139f6]
14: (OSD::init()+0x259) [0x6c59b9]
15: (main()+0x2860) [0x6527e0]
16: (__libc_start_main()+0xf5) [0x7f10065e9ec5]
17: /usr/bin/ceph-osd() [0x66b887]
2015-11-17 00:06:31.177389 7f10094f2900 -1 *** Caught signal (Aborted) **
in thread 7f10094f2900
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: /usr/bin/ceph-osd() [0xacd7ba]
2: (()+0x10340) [0x7f1008180340]
3: (gsignal()+0x39) [0x7f10065fecc9]
4: (abort()+0x148) [0x7f10066020d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f1006f106b5]
6: (()+0x5e836) [0x7f1006f0e836]
7: (()+0x5e863) [0x7f1006f0e863]
8: (()+0x5eaa2) [0x7f1006f0eaa2]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0xbc62d8]
10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
11: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x92a3a4]
12: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94355b]
13: (FileStore::mount()+0x3bb6) [0x9139f6]
14: (OSD::init()+0x259) [0x6c59b9]
15: (main()+0x2860) [0x6527e0]
16: (__libc_start_main()+0xf5) [0x7f10065e9ec5]
17: /usr/bin/ceph-osd() [0x66b887]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
0> 2015-11-17 00:06:31.177389 7f10094f2900 -1 *** Caught signal (Aborted) **
in thread 7f10094f2900
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
1: /usr/bin/ceph-osd() [0xacd7ba]
2: (()+0x10340) [0x7f1008180340]
3: (gsignal()+0x39) [0x7f10065fecc9]
4: (abort()+0x148) [0x7f10066020d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f1006f106b5]
6: (()+0x5e836) [0x7f1006f0e836]
7: (()+0x5e863) [0x7f1006f0e863]
8: (()+0x5eaa2) [0x7f1006f0eaa2]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x278) [0xbc62d8]
10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
11: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x92a3a4]
12: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94355b]
13: (FileStore::mount()+0x3bb6) [0x9139f6]
14: (OSD::init()+0x259) [0x6c59b9]
15: (main()+0x2860) [0x6527e0]
16: (__libc_start_main()+0xf5) [0x7f10065e9ec5]
17: /usr/bin/ceph-osd() [0x66b887]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted (core dumped)
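
Because the same messages appear twice in the output (once as they are logged, once in the recent-events dump with the `-NNN>` prefixes), it can help to reduce the log to the distinct identifiers that FileStore could not find. The following is only a sketch: the function names are made up for illustration, and reading the last `/`-separated field as a pool id is an assumption about the identifier format, not something the log states.

```python
import re
from collections import Counter

# Matches the FileStore "could not find <object> in index" messages from the log above.
MISSING_RE = re.compile(r"could not find (\S+) in index: \(2\) No such file or directory")

def missing_objects(log_text):
    """Return the distinct identifiers reported missing, in first-seen order."""
    seen = {}
    for match in MISSING_RE.finditer(log_text):
        seen.setdefault(match.group(1), None)
    return list(seen)

def count_by_last_field(objects):
    """Count identifiers by their final '/'-separated field (ASSUMED to be a pool id)."""
    return Counter(obj.rsplit("/", 1)[-1] for obj in objects)

# Three lines copied from the log above; the second repeats the first with a dump prefix.
sample = """\
2015-11-17 00:06:30.897000 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 1f//head//14 in index: (2) No such file or directory
-466> 2015-11-17 00:06:30.897000 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 1f//head//14 in index: (2) No such file or directory
2015-11-17 00:06:30.920081 7f10094f2900 -1 filestore(/var/lib/ceph/osd/ceph-16) could not find 13//head//15 in index: (2) No such file or directory
"""

objs = missing_objects(sample)
print(objs)
print(count_by_last_field(objs))
```

Run against the full OSD log, this gives a compact list of what journal replay tripped over before the assert in `FileStore::_do_transaction`.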

The OS is Ubuntu 14.04 with kernel 3.16.0-38-generic. As far as I can tell, the disks are OK: no data has been deleted, the partitions are mounted under /var/lib/ceph/..., and the xfs_check and xfs_repair tools showed no problems.
Where do I go from here?

Some more info:

# ceph osd tree
    ID WEIGHT   TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY
    -1 54.39981 root default
    -2 10.87999     host ceph-sucker-01
     0  2.71999         osd.0               up  1.00000          1.00000
     1  2.71999         osd.1             down        0          1.00000
     2  2.71999         osd.2               up  1.00000          1.00000
     3  2.71999         osd.3               up  1.00000          1.00000
    -3 10.87994     host ceph-sucker-02
     4  2.71999         osd.4               up  1.00000          1.00000
     5  2.71999         osd.5               up  1.00000          1.00000
     6  2.71999         osd.6               up  1.00000          1.00000
     7  2.71999         osd.7               up  1.00000          1.00000
    -4 10.87994     host ceph-sucker-04
     9  2.71999         osd.9               up  1.00000          1.00000
    10  2.71999         osd.10              up  1.00000          1.00000
    11  2.71999         osd.11              up  1.00000          1.00000
     8  2.71999         osd.8               up  1.00000          1.00000
    -5 10.87999     host ceph-sucker-05
    12  2.71999         osd.12              up  1.00000          1.00000
    13  2.71999         osd.13              up  1.00000          1.00000
    14  2.71999         osd.14              up  1.00000          1.00000
    15  2.71999         osd.15              up  1.00000          1.00000
    -6 10.87994     host ceph-sucker-03
    16  2.71999         osd.16            down        0          1.00000
    17  2.71999         osd.17              up  1.00000          1.00000
    18  2.71999         osd.18              up  1.00000          1.00000
    19  2.71999         osd.19              up  1.00000          1.00000
# ceph health detail
    HEALTH_WARN 19 pgs stale; 19 pgs stuck stale; 73 requests are blocked > 32 sec; 5 osds have slow requests; too many PGs per OSD (327 > max 300)
    pg 21.1e is stuck stale for 9769.562513, current state stale+active+clean, last acting [1,16]
    pg 30.27 is stuck stale for 9769.562588, current state stale+active+clean, last acting [1,16]
    pg 4.24 is stuck stale for 9769.562553, current state stale+active+clean, last acting [1,16]
    pg 24.55 is stuck stale for 9769.562623, current state stale+active+clean, last acting [1,16]
    pg 33.58 is stuck stale for 9769.562680, current state stale+active+clean, last acting [16,1]
    pg 29.18 is stuck stale for 9769.562524, current state stale+active+clean, last acting [16,1]
    pg 6.158 is stuck stale for 9769.562721, current state stale+active+clean, last acting [16,1]
    pg 23.6f is stuck stale for 9769.562688, current state stale+active+clean, last acting [16,1]
    pg 6.18d is stuck stale for 9769.562731, current state stale+active+clean, last acting [1,16]
    pg 26.0 is stuck stale for 9769.562562, current state stale+active+clean, last acting [1,16]
    pg 25.29 is stuck stale for 9769.562603, current state stale+active+clean, last acting [1,16]
    pg 31.6a is stuck stale for 9769.562696, current state stale+active+clean, last acting [1,16]
    pg 6.1fa is stuck stale for 9769.562752, current state stale+active+clean, last acting [1,16]
    pg 21.61 is stuck stale for 9769.562700, current state stale+active+clean, last acting [16,1]
    pg 6.8e is stuck stale for 9769.562719, current state stale+active+clean, last acting [1,16]
    pg 32.79 is stuck stale for 9769.562673, current state stale+active+clean, last acting [16,1]
    pg 31.72 is stuck stale for 9769.562695, current state stale+active+clean, last acting [16,1]
    pg 4.9e is stuck stale for 9769.562731, current state stale+active+clean, last acting [1,16]
    pg 4.6d is stuck stale for 9769.562694, current state stale+active+clean, last acting [1,16]
    13 ops are blocked > 16777.2 sec
    52 ops are blocked > 4194.3 sec
    2 ops are blocked > 2097.15 sec
    6 ops are blocked > 131.072 sec
    1 ops are blocked > 16777.2 sec on osd.3
    4 ops are blocked > 16777.2 sec on osd.7
    8 ops are blocked > 4194.3 sec on osd.7
    6 ops are blocked > 131.072 sec on osd.7
    6 ops are blocked > 16777.2 sec on osd.10
    35 ops are blocked > 4194.3 sec on osd.10
    4 ops are blocked > 4194.3 sec on osd.18
    2 ops are blocked > 2097.15 sec on osd.18
    2 ops are blocked > 16777.2 sec on osd.19
    5 ops are blocked > 4194.3 sec on osd.19
    5 osds have slow requests
    too many PGs per OSD (327 > max 300)
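
All 19 stuck PGs list only osd.1 and osd.16 in their last acting set, and those are the two OSDs shown as down in the tree above. A small sketch of how that can be checked mechanically from the `ceph health detail` text; the helper name is hypothetical and the regular expression is written against the line format shown here, which may differ in other releases.

```python
import re
from collections import Counter

# Matches lines such as:
#   pg 21.1e is stuck stale for 9769.562513, current state ..., last acting [1,16]
STALE_RE = re.compile(r"pg (\S+) is stuck stale .* last acting \[([\d,]+)\]")

def stale_pg_osds(health_detail):
    """Map each stuck-stale PG id to the set of OSD ids in its last acting set."""
    result = {}
    for line in health_detail.splitlines():
        match = STALE_RE.search(line)
        if match:
            result[match.group(1)] = {int(x) for x in match.group(2).split(",")}
    return result

# Two lines copied from the health detail above.
sample = """\
pg 21.1e is stuck stale for 9769.562513, current state stale+active+clean, last acting [1,16]
pg 33.58 is stuck stale for 9769.562680, current state stale+active+clean, last acting [16,1]
"""

pgs = stale_pg_osds(sample)
# How many stale PGs each OSD appears in.
osd_counts = Counter(osd for osds in pgs.values() for osd in osds)
print(sorted(pgs))
print(sorted(osd_counts.items()))
```

If every stale PG maps to the same pair of OSDs, as it does here, no surviving OSD holds a copy of those PGs, which is why they stay stale until one of the two OSDs starts again.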

Related issues 1 (0 open, 1 closed)

Is duplicate of Ceph - Bug #12428: garbage data in osd data dir crashes ceph-objectstore-tool (Can't reproduce, 07/22/2015)

#1

Updated by Kosta Velikov over 8 years ago

It turns out this is the same issue as http://tracker.ceph.com/issues/12428.
The workaround from that issue fixed it.

#2

Updated by Nathan Cutler over 8 years ago

  • Tracker changed from Tasks to Bug
  • Project changed from Stable releases to Ceph
#3

Updated by Nathan Cutler over 8 years ago

  • Is duplicate of Bug #12428: garbage data in osd data dir crashes ceph-objectstore-tool added
#4

Updated by Nathan Cutler over 8 years ago

  • Status changed from New to Duplicate