Bug #8046

closed

osd/ReplicatedPG.h: 666: FAILED assert(got) in get_rw_locks()

Added by Sage Weil about 10 years ago. Updated about 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-08_02:30:14-rados-firefly-distro-basic-plana/178780

  -107> 2014-04-08 15:28:21.344154 7f44a6b2a700 10 osd.1 366 dequeue_op 0x35615a0 prio 63 cost 53 latency 0.003421 osd_op(client.4120.0:2649 plana1714517-48 [copy-from ver 0] 3.a09693a0 RETRY=1 snapc 11f=[11f,11e,11c,111,10c,ff] ack+ondisk+retry+write e366) v4 pg pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]]
  -106> 2014-04-08 15:28:21.344196 7f44a6b2a700  5 -- op tracker -- , seq: 3351, time: 2014-04-08 15:28:21.344196, event: reached_pg, request: osd_op(client.4120.0:2649 plana1714517-48 [copy-from ver 0] 3.a09693a0 RETRY=1 snapc 11f=[11f,11e,11c,111,10c,ff] ack+ondisk+retry+write e366) v4
  -105> 2014-04-08 15:28:21.344216 7f44a6b2a700 20 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] op_has_sufficient_caps pool=3 (unique_pool_0 ) owner=0 need_read_cap=0 need_write_cap=1 need_class_read_cap=0 need_class_write_cap=0 -> yes
  -104> 2014-04-08 15:28:21.344251 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] handle_message: 0x35615a0
  -103> 2014-04-08 15:28:21.344277 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] do_op osd_op(client.4120.0:2649 plana1714517-48 [copy-from ver 0] 3.a09693a0 RETRY=1 snapc 11f=[11f,11e,11c,111,10c,ff] ack+ondisk+retry+write e366) v4 may_write -> write-ordered flags ack+ondisk+retry+write
  -102> 2014-04-08 15:28:21.344325 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] get_object_context: found obc in cache: 0x3585000
  -101> 2014-04-08 15:28:21.344355 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] get_object_context: 0x3585000 a09693a0/plana1714517-48/head//3 rwstate(write n=1 w=1) oi: a09693a0/plana1714517-48/head//3(0'0 unknown.0.0:0 wrlock_by=unknown.0.0:0 s 0 uv0) ssc: 0x3a2a1c0 snapset: 11f=[11f,11e,11c,111,10c,ff]:[11f]
  -100> 2014-04-08 15:28:21.344393 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] find_object_context a09693a0/plana1714517-48/head//3 @head oi=a09693a0/plana1714517-48/head//3(0'0 unknown.0.0:0 wrlock_by=unknown.0.0:0 s 0 uv0)
   -99> 2014-04-08 15:28:21.344437 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] get_object_context: found obc in cache: 0x3be4080
   -98> 2014-04-08 15:28:21.344468 7f44a6b2a700 10 osd.1 pg_epoch: 366 pg[3.20( v 366'104 (0'0,366'104] local-les=366 n=3 ec=7 les/c 366/366 359/365/229) [1]/[1,2] r=0 lpr=365 crt=206'66 lcod 320'101 mlcod 0'0 active+remapped snaptrimq=[fe~1,10b~1,110~1,11a~2,11d~1]] get_object_context: 0x3be4080 a09693a0/plana1714517-48/snapdir//3 rwstate(write n=1 w=0) oi: a09693a0/plana1714517-48/snapdir//3(366'103 client.4120.0:2646 [] s 0 uv0) ssc: 0x3a2a1c0 snapset: 11f=[11f,11e,11c,111,10c,ff]:[11f]
     0> 2014-04-08 15:28:21.407554 7f44a6b2a700 -1 osd/ReplicatedPG.h: In function 'bool ReplicatedPG::get_rw_locks(ReplicatedPG::OpContext*)' thread 7f44a6b2a700 time 2014-04-08 15:28:21.344510
osd/ReplicatedPG.h: 666: FAILED assert(got)

 ceph version 0.79-42-g010dff1 (010dff12c38882238591bb042f8e497a1f7ba020)
 1: (ReplicatedPG::get_rw_locks(ReplicatedPG::OpContext*)+0x43b) [0x83712b]
 2: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x2522) [0x81da52]
 3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x692) [0x7bcdd2]
 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34a) [0x619d9a]
 5: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x628) [0x634e88]
 6: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x67a6ec]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa56936]
 8: (ThreadPool::WorkThread::entry()+0x10) [0xa58740]
 9: (()+0x7e9a) [0x7f44bc3d2e9a]
 10: (clone()+0x6d) [0x7f44ba9933fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Actions #1

Updated by Sage Weil about 10 years ago

The write locks are held on both head and snapset because a previous op (a delete) has been committed but not yet applied.

Actions #2

Updated by Sage Weil about 10 years ago

  • Status changed from 12 to 7
  • Assignee set to Sage Weil
Actions #3

Updated by Sage Weil about 10 years ago

  • Status changed from 7 to Resolved