Bug #8643


0.80.1: OSD crash: osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))

Added by Dmitry Smirnov almost 10 years ago. Updated almost 10 years ago.

Status:
Closed
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

OSD (0.80.1) crashed and has been unusable ever since, hitting a similar crash soon after every restart:

    -1> 2014-06-23 08:34:07.440615 7fecc3d85700  0 osd.3 38905 crush map has features 2578087936000, adjusting msgr requires for mons
     0> 2014-06-23 08:34:07.540053 7fecbe57a700 -1 osd/ECBackend.cc: In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)' thread 7fecbe57a700 time 2014-06-23 08:34:07.366630
osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (()+0x55e1cb) [0x7fecd9f181cb]
 2: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > > >, RecoveryMessages*)+0x8c7) [0x7fecd9f190d7]
 3: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x93) [0x7fecd9f27203]
 4: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x4a) [0x7fecd9f0c07a]
 5: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x972) [0x7fecd9f10772]
 6: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x186) [0x7fecd9f183f6]
 7: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7fecd9d9202b]
 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x374) [0x7fecd9bd9104]
 9: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1cf) [0x7fecd9bf4fdf]
 10: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x7fecd9c38c6c]
 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1390) [0x7fecda063ab0]
 12: (ThreadPool::WorkThread::entry()+0x10) [0x7fecda064760]
 13: (()+0x80ca) [0x7fecd910e0ca]
 14: (clone()+0x6d) [0x7fecd762fffd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.3.log
--- end dump of recent events ---
2014-06-23 08:34:07.614542 7fecbe57a700 -1 *** Caught signal (Aborted) **
 in thread 7fecbe57a700

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (()+0x5d44c2) [0x7fecd9f8e4c2]
 2: (()+0xf8f0) [0x7fecd91158f0]
 3: (gsignal()+0x37) [0x7fecd757f407]
 4: (abort()+0x148) [0x7fecd7582508]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x175) [0x7fecd7e6ad65]
 6: (()+0x5edd6) [0x7fecd7e68dd6]
 7: (()+0x5ee21) [0x7fecd7e68e21]
 8: (()+0x5f039) [0x7fecd7e69039]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1e3) [0x7fecda0731e3]
 10: (()+0x55e1cb) [0x7fecd9f181cb]
 11: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > > >, RecoveryMessages*)+0x8c7) [0x7fecd9f190d7]
 12: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x93) [0x7fecd9f27203]
 13: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x4a) [0x7fecd9f0c07a]
 14: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x972) [0x7fecd9f10772]
 15: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x186) [0x7fecd9f183f6]
 16: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7fecd9d9202b]
 17: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x374) [0x7fecd9bd9104]
 18: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1cf) [0x7fecd9bf4fdf]
 19: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x7fecd9c38c6c]
 20: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1390) [0x7fecda063ab0]
 21: (ThreadPool::WorkThread::entry()+0x10) [0x7fecda064760]
 22: (()+0x80ca) [0x7fecd910e0ca]
 23: (clone()+0x6d) [0x7fecd762fffd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
     0> 2014-06-23 08:34:07.614542 7fecbe57a700 -1 *** Caught signal (Aborted) **
 in thread 7fecbe57a700

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (()+0x5d44c2) [0x7fecd9f8e4c2]
 2: (()+0xf8f0) [0x7fecd91158f0]
 3: (gsignal()+0x37) [0x7fecd757f407]
 4: (abort()+0x148) [0x7fecd7582508]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x175) [0x7fecd7e6ad65]
 6: (()+0x5edd6) [0x7fecd7e68dd6]
 7: (()+0x5ee21) [0x7fecd7e68e21]
 8: (()+0x5f039) [0x7fecd7e69039]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1e3) [0x7fecda0731e3]
 10: (()+0x55e1cb) [0x7fecd9f181cb]
 11: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > > >, RecoveryMessages*)+0x8c7) [0x7fecd9f190d7]
 12: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x93) [0x7fecd9f27203]
 13: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x4a) [0x7fecd9f0c07a]
 14: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x972) [0x7fecd9f10772]
 15: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x186) [0x7fecd9f183f6]
 16: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7fecd9d9202b]
 17: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x374) [0x7fecd9bd9104]
 18: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1cf) [0x7fecd9bf4fdf]
 19: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x7fecd9c38c6c]
 20: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1390) [0x7fecda063ab0]
 21: (ThreadPool::WorkThread::entry()+0x10) [0x7fecda064760]
 22: (()+0x80ca) [0x7fecd910e0ca]
 23: (clone()+0x6d) [0x7fecd762fffd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.3.log
--- end dump of recent events ---
2014-06-23 08:35:52.549587 7f300e0f97c0  0 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process ceph-osd, pid 8214
2014-06-23 08:35:52.601996 7f300e0f97c0  0 filestore(/var/lib/ceph/osd/ceph-3) mount detected btrfs
2014-06-23 08:35:52.759509 7f300e0f97c0  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: FIEMAP ioctl is supported and appears to work
2014-06-23 08:35:52.759524 7f300e0f97c0  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2014-06-23 08:35:53.310845 7f300e0f97c0  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2014-06-23 08:35:53.310950 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: CLONE_RANGE ioctl is supported
2014-06-23 08:35:54.084294 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: SNAP_CREATE is supported
2014-06-23 08:35:54.099785 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: SNAP_DESTROY is supported
2014-06-23 08:35:54.100296 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: START_SYNC is supported (transid 215110)
2014-06-23 08:35:54.437508 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: WAIT_SYNC is supported
2014-06-23 08:35:54.457076 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: SNAP_CREATE_V2 is supported
2014-06-23 08:35:55.101438 7f300e0f97c0  0 filestore(/var/lib/ceph/osd/ceph-3) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2014-06-23 08:35:55.106213 7f300e0f97c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2014-06-23 08:35:55.106248 7f300e0f97c0  1 journal _open /var/lib/ceph/osd/ceph-3/journal fd 21: 4294967296 bytes, block size 4096 bytes, directio = 1, aio = 0
2014-06-23 08:35:55.355847 7f300e0f97c0  1 journal _open /var/lib/ceph/osd/ceph-3/journal fd 21: 4294967296 bytes, block size 4096 bytes, directio = 1, aio = 0
2014-06-23 08:35:59.636873 7f300e0f97c0  1 journal close /var/lib/ceph/osd/ceph-3/journal
2014-06-23 08:35:59.637686 7f300e0f97c0  0 filestore(/var/lib/ceph/osd/ceph-3) mount detected btrfs
2014-06-23 08:35:59.813314 7f300e0f97c0  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: FIEMAP ioctl is supported and appears to work
2014-06-23 08:35:59.813333 7f300e0f97c0  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2014-06-23 08:36:01.187740 7f300e0f97c0  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2014-06-23 08:36:01.187843 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: CLONE_RANGE ioctl is supported
2014-06-23 08:36:01.898978 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: SNAP_CREATE is supported
2014-06-23 08:36:01.899616 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: SNAP_DESTROY is supported
2014-06-23 08:36:01.900004 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: START_SYNC is supported (transid 215116)
2014-06-23 08:36:02.293731 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: WAIT_SYNC is supported
2014-06-23 08:36:02.311463 7f300e0f97c0  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: SNAP_CREATE_V2 is supported
2014-06-23 08:36:02.872252 7f300e0f97c0  0 filestore(/var/lib/ceph/osd/ceph-3) mount: WRITEAHEAD journal mode explicitly enabled in conf
2014-06-23 08:36:02.876856 7f300e0f97c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2014-06-23 08:36:02.876923 7f300e0f97c0  1 journal _open /var/lib/ceph/osd/ceph-3/journal fd 23: 4294967296 bytes, block size 4096 bytes, directio = 1, aio = 0
2014-06-23 08:36:02.901217 7f300e0f97c0  1 journal _open /var/lib/ceph/osd/ceph-3/journal fd 23: 4294967296 bytes, block size 4096 bytes, directio = 1, aio = 0
2014-06-23 08:36:02.959723 7f300e0f97c0  0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
2014-06-23 08:36:02.964149 7f300e0f97c0  0 osd.3 38905 crush map has features 2303210029056, adjusting msgr requires for clients
2014-06-23 08:36:02.964162 7f300e0f97c0  0 osd.3 38905 crush map has features 2578087936000, adjusting msgr requires for mons
2014-06-23 08:36:02.964166 7f300e0f97c0  0 osd.3 38905 crush map has features 2578087936000, adjusting msgr requires for osds
2014-06-23 08:36:02.964179 7f300e0f97c0  0 osd.3 38905 load_pgs
2014-06-23 08:36:28.804657 7f300e0f97c0  0 osd.3 38905 load_pgs opened 172 pgs
2014-06-23 08:36:28.859886 7f2ff971a700  0 osd.3 38905 ignoring osdmap until we have initialized
2014-06-23 08:36:28.859995 7f2ff971a700  0 osd.3 38905 ignoring osdmap until we have initialized
2014-06-23 08:36:28.951890 7f300e0f97c0  0 osd.3 38905 done with init, starting boot process
2014-06-23 08:36:30.167677 7f2ff8718700  0 osd.3 38910 crush map has features 2578087936000, adjusting msgr requires for mons
2014-06-23 08:36:31.245243 7f2ff8718700  0 osd.3 38911 crush map has features 2578087936000, adjusting msgr requires for mons
2014-06-23 08:40:49.971715 7f2ff971a700  0 osd.3 38912 crush map has features 2578087936000, adjusting msgr requires for mons
2014-06-23 08:40:50.152739 7f2ff270c700 -1 osd/ECBackend.cc: In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)' thread 7f2ff270c700 time 2014-06-23 08:40:50.150396
osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (()+0x55e1cb) [0x7f300e6a21cb]
 2: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > > >, RecoveryMessages*)+0x8c7) [0x7f300e6a30d7]
 3: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x93) [0x7f300e6b1203]
 4: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x4a) [0x7f300e69607a]
 5: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x972) [0x7f300e69a772]
 6: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x186) [0x7f300e6a23f6]
 7: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7f300e51c02b]
 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x374) [0x7f300e363104]
 9: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1cf) [0x7f300e37efdf]
 10: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0
 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1390) [0x7f300e7edab0]
 12: (ThreadPool::WorkThread::entry()+0x10) [0x7f300e7ee760]
 13: (()+0x80ca) [0x7f300d8980ca]
 14: (clone()+0x6d) [0x7f300bdb9ffd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-10000> 2014-06-23 08:40:25.049314 7f2ff270c700  5 -- op tracker -- , seq: 10201, time: 2014-06-23 08:40:25.049314, event: started, request: osd_sub_op(client.4205572.1:25905681 2.13 23243613/rb.0.31cd20.238e1f29.00000000a21b/head//2 [] v 38911'18861948 snapset=0=[]:[] snapc=0=[]) v10
 -9999> 2014-06-23 08:40:25.049383 7f2ff270c700  5 -- op tracker -- , seq: 10201, time: 2014-06-23 08:40:25.049383, event: started, request: osd_sub_op(client.4205572.1:25905681 2.13 23243613/rb.0.31cd20.238e1f29.00000000a21b/head//2 [] v 38911'18861948 snapset=0=[]:[] snapc=0=[]) v10
...
...
...
    -3> 2014-06-23 08:40:50.150251 7f2ff8718700  5 -- op tracker -- , seq: 10864, time: 2014-06-23 08:40:50.150251, event: started, request: pg_query(14.27s0 epoch 38912) v3
    -2> 2014-06-23 08:40:50.150272 7f2ff8718700  5 -- op tracker -- , seq: 10864, time: 2014-06-23 08:40:50.150272, event: done, request: pg_query(14.27s0 epoch 38912) v3
    -1> 2014-06-23 08:40:50.150310 7f2ff2f0d700  1 -- 192.168.0.2:6806/8214 --> 192.168.0.250:6830/13227 -- pg_notify(14.27s0(1) epoch 38912) v5 -- ?+0 0x7f3038877880 con 0x7f3034e9d1e0
     0> 2014-06-23 08:40:50.152739 7f2ff270c700 -1 osd/ECBackend.cc: In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)' thread 7f2ff270c700 time 2014-06-23 08:40:50.150396
osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (()+0x55e1cb) [0x7f300e6a21cb]
 2: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > > >, RecoveryMessages*)+0x8c7) [0x7f300e6a30d7]
 3: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x93) [0x7f300e6b1203]
 4: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x4a) [0x7f300e69607a]
 5: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x972) [0x7f300e69a772]
 6: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x186) [0x7f300e6a23f6]
 7: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7f300e51c02b]
 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x374) [0x7f300e363104]
 9: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1cf) [0x7f300e37efdf]
 10: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x7f300e3c2c6c]
 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1390) [0x7f300e7edab0]
 12: (ThreadPool::WorkThread::entry()+0x10) [0x7f300e7ee760]
 13: (()+0x80ca) [0x7f300d8980ca]
 14: (clone()+0x6d) [0x7f300bdb9ffd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.3.log
--- end dump of recent events ---
2014-06-23 08:40:50.234635 7f2ff270c700 -1 *** Caught signal (Aborted) **
 in thread 7f2ff270c700

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (()+0x5d44c2) [0x7f300e7184c2]
 2: (()+0xf8f0) [0x7f300d89f8f0]
 3: (gsignal()+0x37) [0x7f300bd09407]
 4: (abort()+0x148) [0x7f300bd0c508]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x175) [0x7f300c5f4d65]
 6: (()+0x5edd6) [0x7f300c5f2dd6]
 7: (()+0x5ee21) [0x7f300c5f2e21]
 8: (()+0x5f039) [0x7f300c5f3039]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1e3) [0x7f300e7fd1e3]
 10: (()+0x55e1cb) [0x7f300e6a21cb]
 11: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, boost::optional<std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > > >, RecoveryMessages*)+0x8c7) [0x7f300e6a30d7]
 12: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x93) [0x7f300e6b1203]
 13: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x4a) [0x7f300e69607a]
 14: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x972) [0x7f300e69a772]
 15: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x186) [0x7f300e6a23f6]
 16: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7f300e51c02b]
 17: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x374) [0x7f300e363104]
 18: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1cf) [0x7f300e37efdf]
 19: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x7f300e3c2c6c]
 20: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1390) [0x7f300e7edab0]
 21: (ThreadPool::WorkThread::entry()+0x10) [0x7f300e7ee760]
 22: (()+0x80ca) [0x7f300d8980ca]
 23: (clone()+0x6d) [0x7f300bdb9ffd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.3.log
--- end dump of recent events ---
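
For context on the failed assert: `aligned_logical_offset_to_chunk_offset()` maps a stripe-aligned logical offset to a per-shard chunk offset, and the OSD asserts that the data a shard returned for recovery (`pop.data.length()`) equals the chunk-aligned size implied by the recovery-progress delta. A minimal sketch of that relation, with hypothetical stripe parameters (the real values come from the pool's erasure-code profile, not from this report):

```python
# Sketch of the relation behind the failed assert (parameters are hypothetical).
# In an EC pool, each logical stripe of `stripe_width` bytes is split into data
# chunks of `chunk_size` bytes, so a stripe-aligned logical offset scales to a
# per-shard chunk offset by the ratio chunk_size / stripe_width.

def aligned_logical_offset_to_chunk_offset(logical_offset, stripe_width, chunk_size):
    # The offset must be stripe-aligned, mirroring the OSD's assumption.
    assert logical_offset % stripe_width == 0
    return (logical_offset // stripe_width) * chunk_size

# Example: 2 data chunks of 4 KiB each -> stripe_width = 8 KiB.
stripe_width = 8192
chunk_size = 4096

# Suppose recovery progressed from logical offset 0 to 16 KiB ...
delta = 16384 - 0
expected = aligned_logical_offset_to_chunk_offset(delta, stripe_width, chunk_size)

# ... then each shard should have returned exactly 8 KiB of chunk data.
# The assert fires (and the OSD aborts) when pop.data.length() != expected,
# i.e. when a shard's recovery read comes back with an unexpected length.
print(expected)
```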


Files

ceph-osd.3.log.xz (5.73 MB) ceph-osd.3.log.xz Dmitry Smirnov, 06/23/2014 12:24 PM
ceph-osd.0.log.txt.xz (1.28 MB) ceph-osd.0.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.1.log.txt.xz (1.14 MB) ceph-osd.1.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.2.log.txt.xz (530 KB) ceph-osd.2.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.3.log.txt.xz (1.54 MB) ceph-osd.3.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.4.log.txt.xz (2.49 MB) ceph-osd.4.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.5.log.txt.xz (1.98 MB) ceph-osd.5.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.6.log.txt.xz (534 KB) ceph-osd.6.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.7.log.txt.xz (108 KB) ceph-osd.7.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.9.log.txt.xz (256 KB) ceph-osd.9.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.10.log.txt.xz (61.5 KB) ceph-osd.10.log.txt.xz Dmitry Smirnov, 06/24/2014 09:57 PM
ceph-osd.11.log.txt.xz (5.3 MB) ceph-osd.11.log.txt.xz Dmitry Smirnov, 06/24/2014 10:06 PM
ceph-osd.12.log.txt.xz (667 KB) ceph-osd.12.log.txt.xz Dmitry Smirnov, 06/24/2014 10:06 PM
ceph-osd.10.log.keyvaluestore-dev.xz (13.2 KB) ceph-osd.10.log.keyvaluestore-dev.xz Dmitry Smirnov, 07/01/2014 12:25 PM

Related issues: 1 (0 open, 1 closed)

Related to Ceph - Bug #8660: pg in forever "down+peering" state (Closed, 06/25/2014)
