Bug #2070

osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())

Added by Sage Weil over 8 years ago. Updated over 8 years ago.

Status: Duplicate
Priority: High
Assignee: -
Category: OSD
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

ubuntu@teuthology:/a/nightly_coverage_2012-02-15-b/12164

osd/ReplicatedPG.cc: In function 'void ReplicatedPG::sub_op_modify(OpRequest*)' thread 7fcac64ac700 time 2012-02-15 16:16:35.645072
osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
 ceph version 0.41-350-ge32668f (commit:e32668f8b83abad74e858d9e9fffbd456968a918)
 1: (ReplicatedPG::sub_op_modify(OpRequest*)+0x10b2) [0x4c03b2]
 2: (ReplicatedPG::do_sub_op(OpRequest*)+0xbb) [0x4db36b]
 3: (OSD::dequeue_op(PG*)+0x121) [0x547a31]
 4: (ThreadPool::worker()+0xa28) [0x619a78]
 5: (ThreadPool::WorkThread::entry()+0xd) [0x57adfd]
 6: (()+0x7971) [0x7fcad5a3d971]
 7: (clone()+0x6d) [0x7fcad40c892d]
 ceph version 0.41-350-ge32668f (commit:e32668f8b83abad74e858d9e9fffbd456968a918)
 1: (ReplicatedPG::sub_op_modify(OpRequest*)+0x10b2) [0x4c03b2]
 2: (ReplicatedPG::do_sub_op(OpRequest*)+0xbb) [0x4db36b]
 3: (OSD::dequeue_op(PG*)+0x121) [0x547a31]
 4: (ThreadPool::worker()+0xa28) [0x619a78]
 5: (ThreadPool::WorkThread::entry()+0xd) [0x57adfd]
 6: (()+0x7971) [0x7fcad5a3d971]
 7: (clone()+0x6d) [0x7fcad40c892d]
*** Caught signal (Aborted) **


  kernel:
    sha1: 27772dabb75b1072a81c0215b61b066bf8810f6c
  nuke-on-error: true
  overrides:
    ceph:
      conf:
        osd:
          osd op complaint time: 120
      coverage: true
      fs: btrfs
      log-whitelist:
      - clocks not synchronized
      - old request
      sha1: e32668f8b83abad74e858d9e9fffbd456968a918
  roles:
  - - mon.a
    - osd.0
    - osd.1
    - osd.2
  - - mds.a
    - client.0
    - osd.3
    - osd.4
    - osd.5
  tasks:
  - chef: null
  - ceph:
      log-whitelist:
      - wrongly marked me down or wrong addr
  - thrashosds: null
  - rados:
      clients:
      - client.0
      objects: 50
      op_weights:
        delete: 50
        read: 100
        snap_create: 50
        snap_remove: 50
        snap_rollback: 50
        write: 100
      ops: 4000


Related issues

Duplicates Ceph - Bug #2075: osd: recover_got assert (Resolved, 02/16/2012)

Associated revisions

Revision 344c2022
Added by Sage Weil over 8 years ago

osd: fix up argument to PG::init()

Commit cefa55b288b40e17ade9875493dd94de52ac22bf moved PG initialization
into init(), but passed acting for both up and acting args. This led to
confusion between primary and replica.

Also fix debug print so that the output is useful.

Fixes: #2075, #2070
Signed-off-by: Sage Weil <>
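
The argument mix-up described in the commit message can be illustrated with a small sketch. This is not Ceph's code: `ToyPG`, `init_buggy`, and `init_fixed` are hypothetical names, and the two-argument `init()` signature is an assumption based only on the wording "up and acting args". The sketch only shows how the recorded up set ends up wrong when acting is passed twice; the commit message does not spell out how that produced the primary/replica confusion.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a placement group; NOT Ceph's real PG class.
struct ToyPG {
  std::vector<int> up;      // OSDs the map says should hold the PG
  std::vector<int> acting;  // OSDs currently serving the PG

  // Hypothetical signature: up set first, acting set second.
  void init(const std::vector<int>& new_up,
            const std::vector<int>& new_acting) {
    up = new_up;
    acting = new_acting;
  }
};

// The mistake described in the commit message: acting passed for both args.
inline void init_buggy(ToyPG& pg, const std::vector<int>& /*up*/,
                       const std::vector<int>& acting) {
  pg.init(acting, acting);
}

// The corrected call: each set goes to its own parameter.
inline void init_fixed(ToyPG& pg, const std::vector<int>& up,
                       const std::vector<int>& acting) {
  pg.init(up, acting);
}
```

With up = {1, 2} and acting = {2, 1}, the buggy call leaves the PG believing its up set is {2, 1}, so any later decision that compares up against acting starts from wrong data.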

History

#1 Updated by Sage Weil over 8 years ago

also hit this on ubuntu@teuthology:/a/nightly_coverage_2012-02-15-b/12169

#2 Updated by Sage Weil over 8 years ago

if i had to guess this is related to the pg init() refactor. not much to be found from the core, except that pg->state == 0 (hence, !is_active()).

would be nice to reproduce this with some logs
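
Why `pg->state == 0` implies `!is_active()` can be seen from a minimal sketch, assuming the PG state is a bitmask and `is_active()` tests a single "active" bit. The names in the `toy` namespace and the bit values are illustrative assumptions, not taken from the Ceph source.

```cpp
#include <cstdint>

namespace toy {

// Illustrative bit assignments; the real values may differ.
constexpr std::uint32_t STATE_ACTIVE = 1u << 1;
constexpr std::uint32_t STATE_CLEAN = 1u << 2;

struct PGState {
  std::uint32_t state = 0;  // no bits set, as seen in the core dump

  // True if any bit in mask m is set in the state word.
  bool state_test(std::uint32_t m) const { return (state & m) != 0; }

  // Active means the "active" bit is set.
  bool is_active() const { return state_test(STATE_ACTIVE); }
};

}  // namespace toy
```

A state word of 0 has no bits set, so the active test fails regardless of which bit encodes "active", and an `assert(is_active())` on such a PG aborts.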

#3 Updated by Sage Weil over 8 years ago

ubuntu@teuthology:/a/nightly_coverage_2012-02-16-b/12294

#4 Updated by Sage Weil over 8 years ago

ubuntu@teuthology:/a/nightly_coverage_2012-02-18-a/12494

#5 Updated by Sage Weil over 8 years ago

  • Status changed from New to Duplicate

ok i didn't observe this crash and trace it back, but i'm almost certain it's the same as #2075.

344c20220345197c03fbaf46e2c1289d81a0a14f
