Bug #8161

closed

osd/ECBackend.cc: 475: FAILED assert(r == 0)

Added by Sage Weil about 10 years ago. Updated almost 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-04-18_21:29:10-rados:thrash-testing-testing-basic-plana/202248

   -11> 2014-04-19 07:59:20.825284 7f9173af6700 20 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] get_with_id: 3.73s0 got id 1203
   -10> 2014-04-19 07:59:20.825326 7f9173af6700 10 osd.1 881 dequeue_op 0x4b8aa50 prio 127 cost 3148728 latency 0.580285 MOSDPGPushReply(3.73s0 881 [PushReplyOp(bfbf9173/plana6611054-46/df//3),PushReplyOp(bfbf9173/plana6611054-46/1ba//3),PushReplyOp(bfbf9173/plana6611054-46/1fd//3)]) v2 pg pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]]
    -9> 2014-04-19 07:59:20.825366 7f9173af6700  5 -- op tracker -- , seq: 9437, time: 2014-04-19 07:59:20.825366, event: reached_pg, request: MOSDPGPushReply(3.73s0 881 [PushReplyOp(bfbf9173/plana6611054-46/df//3),PushReplyOp(bfbf9173/plana6611054-46/1ba//3),PushReplyOp(bfbf9173/plana6611054-46/1fd//3)]) v2
    -8> 2014-04-19 07:59:20.825379 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] handle_message: MOSDPGPushReply(3.73s0 881 [PushReplyOp(bfbf9173/plana6611054-46/df//3),PushReplyOp(bfbf9173/plana6611054-46/1ba//3),PushReplyOp(bfbf9173/plana6611054-46/1fd//3)]) v2
    -7> 2014-04-19 07:59:20.825413 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] continue_recovery_op: continuing RecoveryOp(hoid=bfbf9173/plana6611054-46/df//3 v=354'205 missing_on=1(0),3(2),5(1) missing_on_shards=^@,^A,^B recovery_info=ObjectRecoveryInfo(bfbf9173/plana6611054-46/df//3@354'205, copy_subset: [], clone_subset: {}) recovery_progress=ObjectRecoveryProgress(!first, data_recovered_to:1052672, data_complete:false, omap_recovered_to:, omap_complete:true) pending_read=0 obc refcount=1 state=WRITING waiting_on_pushes= extent_requested=0,1052672
    -6> 2014-04-19 07:59:20.825459 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] continue_recovery_op: WRITING continue RecoveryOp(hoid=bfbf9173/plana6611054-46/df//3 v=354'205 missing_on=1(0),3(2),5(1) missing_on_shards=^@,^A,^B recovery_info=ObjectRecoveryInfo(bfbf9173/plana6611054-46/df//3@354'205, copy_subset: [], clone_subset: {}) recovery_progress=ObjectRecoveryProgress(!first, data_recovered_to:1052672, data_complete:false, omap_recovered_to:, omap_complete:true) pending_read=0 obc refcount=1 state=IDLE waiting_on_pushes= extent_requested=0,1052672
    -5> 2014-04-19 07:59:20.825494 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] get_min_avail_to_read_shards: checking acting 1(0)
    -4> 2014-04-19 07:59:20.825523 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] get_min_avail_to_read_shards: checking acting 3(2)
    -3> 2014-04-19 07:59:20.825550 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] get_min_avail_to_read_shards: checking acting 5(1)
    -2> 2014-04-19 07:59:20.825577 7f9173af6700 10 osd.1 pg_epoch: 881 pg[3.73s0( v 864'438 lc 344'204 (0'0,864'438] local-les=861 n=3 ec=10 les/c 861/839 860/860/860) [1,5,3] r=0 lpr=860 pi=817-859/3 rops=3 crt=853'432 mlcod 344'204 active+recovering m=6 u=6 snaptrimq=[1d7~1,1db~1,1e8~1,1fb~1,202~1,206~1,20f~3,213~2,216~5]] get_min_avail_to_read_shards: checking missing_loc 5(0)
     0> 2014-04-19 07:59:20.832997 7f9173af6700 -1 osd/ECBackend.cc: In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)' thread 7f9173af6700 time 2014-04-19 07:59:20.825606
osd/ECBackend.cc: 475: FAILED assert(r == 0)

 ceph version 0.79-260-gf8f3ca4 (f8f3ca452c39f004fe158f9efb5978f16960aae9)
 1: (ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)+0x2070) [0x924290]
 2: (ECBackend::handle_recovery_push_reply(PushReplyOp&, pg_shard_t, RecoveryMessages*)+0xb4) [0x925664]
 3: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x285) [0x929e65]
 4: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1ee) [0x7c244e]
 5: (OSD::dequeue_op(TrackedIntPtr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1a2) [0x61a5c2]
 6: (OSD::OpWQ::_process(TrackedIntPtr<PG>, ThreadPool::TPHandle&)+0x604) [0x635ec4]
 7: (ThreadPool::WorkQueueVal<std::pair<TrackedIntPtr<PG>, std::tr1::shared_ptr<OpRequest> >, TrackedIntPtr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xa6) [0x67c496]
 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa5be66]
 9: (ThreadPool::WorkThread::entry()+0x10) [0xa5dc70]
 10: (()+0x7e9a) [0x7f91892d1e9a]
 11: (clone()+0x6d) [0x7f91878923fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Actions #1

Updated by Samuel Just about 10 years ago

  • Status changed from New to 12
Actions #2

Updated by Samuel Just about 10 years ago

  • Assignee set to Samuel Just
Actions #3

Updated by Samuel Just about 10 years ago

  • Status changed from 12 to 7
Actions #4

Updated by Sage Weil almost 10 years ago

  • Status changed from 7 to Resolved