Bug #8047
0.79: new OSD crashed within minutes
Status: Closed
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression:
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
On 0.79 I added a new OSD (on btrfs). Shortly after rebalancing began, the newly added OSD crashed:
-5> 2014-04-09 16:44:26.998155 7f82d7908700 1 -- 192.168.0.2:6819/13171 <== osd.1 192.168.0.250:0/14833 75 ==== osd_ping(ping e14837 stamp 2014-04-09 16:44:26.980489) v2 ==== 47+0+0 (2626240657 0 0) 0x7f82f4a8a380 con 0x7f82eb58edc0
-4> 2014-04-09 16:44:26.998175 7f82d7908700 1 -- 192.168.0.2:6819/13171 --> 192.168.0.250:0/14833 -- osd_ping(ping_reply e14837 stamp 2014-04-09 16:44:26.980489) v2 -- ?+0 0x7f82f370a8c0 con 0x7f82eb58edc0
-3> 2014-04-09 16:44:27.212032 7f82dda55700 0 filestore(/var/lib/ceph/osd/ceph-11) error (1) Operation not permitted not handled on operation 32 (4143.0.0, or op 0, counting from 0)
-2> 2014-04-09 16:44:27.212056 7f82dda55700 0 filestore(/var/lib/ceph/osd/ceph-11) unexpected error code
-1> 2014-04-09 16:44:27.212058 7f82dda55700 0 filestore(/var/lib/ceph/osd/ceph-11) transaction dump:
{ "ops": [
    { "op_num": 0, "op_name": "omap_setkeys", "collection": "meta", "oid": "516b9a0b\/pglog_2.63\/0\/\/-1", "attr_lens": { "0000014837.00000000000000491457": 172}},
    { "op_num": 1, "op_name": "collection_setattr", "collection": "2.63_head", "name": "info", "length": 1},
    { "op_num": 2, "op_name": "omap_setkeys", "collection": "meta", "oid": "16ef7597\/infos\/head\/\/-1", "attr_lens": { "2.63_biginfo": 138, "2.63_epoch": 4, "2.63_info": 650}},
    { "op_num": 3, "op_name": "omap_rmkeys", "collection": "meta", "oid": "516b9a0b\/pglog_2.63\/0\/\/-1"},
    { "op_num": 4, "op_name": "omap_setkeys", "collection": "meta", "oid": "516b9a0b\/pglog_2.63\/0\/\/-1", "attr_lens": { "0000014837.00000000000000491457": 172, "can_rollback_to": 12}}]}
0> 2014-04-09 16:44:27.238338 7f82dda55700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f82dda55700 time 2014-04-09 16:44:27.227975
os/FileStore.cc: 2540: FAILED assert(0 == "unexpected error")
 ceph version 0.79 (4c2d73a5095f527c3a2168deb5fa54b3c8991a6e)
 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xb7c) [0x7f82e9f505ac]
 2: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x7f82e9f542dc]
 3: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x160) [0x7f82e9f54460]
 4: (ThreadPool::worker(ThreadPool::WorkThread*)+0xaf1) [0x7f82ea10fb91]
 5: (ThreadPool::WorkThread::entry()+0x10) [0x7f82ea110a80]
 6: (()+0x8062) [0x7f82e91f6062]
 7: (clone()+0x6d) [0x7f82e771da3d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
 0/ 5 none
 0/ 1 lockdep
 0/ 1 context
 1/ 1 crush
 1/ 5 mds
 1/ 5 mds_balancer
 1/ 5 mds_locker
 1/ 5 mds_log
 1/ 5 mds_log_expire
 1/ 5 mds_migrator
 0/ 1 buffer
 0/ 1 timer
 0/ 1 filer
 0/ 1 striper
 0/ 1 objecter
 0/ 5 rados
 0/ 5 rbd
 0/ 5 journaler
 0/ 5 objectcacher
 0/ 5 client
 0/ 5 osd
 0/ 5 optracker
 0/ 5 objclass
 1/ 3 filestore
 1/ 3 keyvaluestore
 1/ 3 journal
 0/ 5 ms
 1/ 5 mon
 0/10 monc
 1/ 5 paxos
 0/ 5 tp
 1/ 5 auth
 1/ 5 crypto
 1/ 1 finisher
 1/ 5 heartbeatmap
 1/ 5 perfcounter
 1/ 5 rgw
 1/ 5 javaclient
 1/ 5 asok
 1/ 1 throttle
 -2/-2 (syslog threshold)
 -1/-1 (stderr threshold)
 max_recent 10000
 max_new 1000
 log_file /var/log/ceph/ceph-osd.11.log
--- end dump of recent events ---
2014-04-09 16:44:27.345195 7f82dda55700 -1 *** Caught signal (Aborted) **
 in thread 7f82dda55700
 ceph version 0.79 (4c2d73a5095f527c3a2168deb5fa54b3c8991a6e)
 1: (()+0x59d26f) [0x7f82ea03e26f]
 2: (()+0xf880) [0x7f82e91fd880]
 3: (gsignal()+0x39) [0x7f82e766d3a9]
 4: (abort()+0x148) [0x7f82e76704c8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f82e7f5a5e5]
 6: (()+0x5e746) [0x7f82e7f58746]
 7: (()+0x5e773) [0x7f82e7f58773]
 8: (()+0x5e9b2) [0x7f82e7f589b2]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1f2) [0x7f82ea11ecd2]
 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xb7c) [0x7f82e9f505ac]
 11: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x7f82e9f542dc]
 12: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x160) [0x7f82e9f54460]
 13: (ThreadPool::worker(ThreadPool::WorkThread*)+0xaf1) [0x7f82ea10fb91]
 14: (ThreadPool::WorkThread::entry()+0x10) [0x7f82ea110a80]
 15: (()+0x8062) [0x7f82e91f6062]
 16: (clone()+0x6d) [0x7f82e771da3d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
0> 2014-04-09 16:44:27.345195 7f82dda55700 -1 *** Caught signal (Aborted) **
 in thread 7f82dda55700
 ceph version 0.79 (4c2d73a5095f527c3a2168deb5fa54b3c8991a6e)
 1: (()+0x59d26f) [0x7f82ea03e26f]
 2: (()+0xf880) [0x7f82e91fd880]
 3: (gsignal()+0x39) [0x7f82e766d3a9]
 4: (abort()+0x148) [0x7f82e76704c8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f82e7f5a5e5]
 6: (()+0x5e746) [0x7f82e7f58746]
 7: (()+0x5e773) [0x7f82e7f58773]
 8: (()+0x5e9b2) [0x7f82e7f589b2]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1f2) [0x7f82ea11ecd2]
 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xb7c) [0x7f82e9f505ac]
 11: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x7f82e9f542dc]
 12: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x160) [0x7f82e9f54460]
 13: (ThreadPool::worker(ThreadPool::WorkThread*)+0xaf1) [0x7f82ea10fb91]
 14: (ThreadPool::WorkThread::entry()+0x10) [0x7f82ea110a80]
 15: (()+0x8062) [0x7f82e91f6062]
 16: (clone()+0x6d) [0x7f82e771da3d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
 0/ 5 none
 0/ 1 lockdep
 0/ 1 context
 1/ 1 crush
 1/ 5 mds
 1/ 5 mds_balancer
 1/ 5 mds_locker
 1/ 5 mds_log
 1/ 5 mds_log_expire
 1/ 5 mds_migrator
 0/ 1 buffer
 0/ 1 timer
 0/ 1 filer
 0/ 1 striper
 0/ 1 objecter
 0/ 5 rados
 0/ 5 rbd
 0/ 5 journaler
 0/ 5 objectcacher
 0/ 5 client
 0/ 5 osd
 0/ 5 optracker
 0/ 5 objclass
 1/ 3 filestore
 1/ 3 keyvaluestore
 1/ 3 journal
 0/ 5 ms
 1/ 5 mon
 0/10 monc
 1/ 5 paxos
 0/ 5 tp
 1/ 5 auth
 1/ 5 crypto
 1/ 1 finisher
 1/ 5 heartbeatmap
 1/ 5 perfcounter
 1/ 5 rgw
 1/ 5 javaclient
 1/ 5 asok
 1/ 1 throttle
 -2/-2 (syslog threshold)
 -1/-1 (stderr threshold)
 max_recent 10000
 max_new 1000
 log_file /var/log/ceph/ceph-osd.11.log
--- end dump of recent events ---
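For triage, the failing step can be read off the log mechanically: the filestore error line gives the errno and the index of the op within the transaction ("op 0, counting from 0"), and the transaction dump lists the ops by `op_num`. The sketch below is not part of Ceph. The function name `failing_op` and the regular expression are illustrative assumptions, and the two input strings are copied from the log above. It only cross-references the log with itself and does not diagnose why the filesystem returned the error.

```python
import errno
import json
import re

# Copied from the "-3>" entry of the log above.
ERROR_LINE = (
    "filestore(/var/lib/ceph/osd/ceph-11) error (1) Operation not permitted "
    "not handled on operation 32 (4143.0.0, or op 0, counting from 0)"
)

# Copied from the "transaction dump:" entry. Raw string so the "\/" JSON
# escapes reach the JSON parser unchanged.
TRANSACTION_DUMP = r'''
{ "ops": [
  { "op_num": 0, "op_name": "omap_setkeys", "collection": "meta",
    "oid": "516b9a0b\/pglog_2.63\/0\/\/-1",
    "attr_lens": { "0000014837.00000000000000491457": 172}},
  { "op_num": 1, "op_name": "collection_setattr", "collection": "2.63_head",
    "name": "info", "length": 1},
  { "op_num": 2, "op_name": "omap_setkeys", "collection": "meta",
    "oid": "16ef7597\/infos\/head\/\/-1",
    "attr_lens": { "2.63_biginfo": 138, "2.63_epoch": 4, "2.63_info": 650}},
  { "op_num": 3, "op_name": "omap_rmkeys", "collection": "meta",
    "oid": "516b9a0b\/pglog_2.63\/0\/\/-1"},
  { "op_num": 4, "op_name": "omap_setkeys", "collection": "meta",
    "oid": "516b9a0b\/pglog_2.63\/0\/\/-1",
    "attr_lens": { "0000014837.00000000000000491457": 172,
                   "can_rollback_to": 12}}]}
'''

# Hypothetical pattern, written against the wording of this one log line.
ERROR_RE = re.compile(
    r"error \((\d+)\) .*? not handled on operation (\d+) "
    r"\(.*?or op (\d+), counting from 0\)"
)


def failing_op(error_line, dump_text):
    """Match the filestore error line to the op it refers to in the dump."""
    match = ERROR_RE.search(error_line)
    if match is None:
        raise ValueError("error line does not have the expected shape")
    err, opcode, index = (int(group) for group in match.groups())

    ops = json.loads(dump_text)["ops"]
    op = next(o for o in ops if o["op_num"] == index)

    return {
        "errno": errno.errorcode.get(err, str(err)),  # symbolic errno name
        "opcode": opcode,                             # numeric code from the log
        "op_num": index,                              # position in the transaction
        "op_name": op["op_name"],
        "collection": op["collection"],
        "oid": op.get("oid"),
    }


if __name__ == "__main__":
    for key, value in failing_op(ERROR_LINE, TRANSACTION_DUMP).items():
        print(f"{key}: {value}")
```

Applied to this log, the sketch points at op 0 of the transaction, the `omap_setkeys` on the `meta` collection, failing with EPERM.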