Bug #19902

closed

osd/OSD.h: 706: FAILED assert(removed) in PG::unreg_next_scrub

Added by Sage Weil almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Immediate
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: jewel,kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

es --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/osd.3.log --time-stamp=yes --tool=memcheck ceph-osd -f --cluster ceph -i 3  UID: 0
2017-05-10T05:25:58.630 INFO:tasks.ceph.osd.3.smithi034.stderr: -4063> 2017-05-10 05:25:53.328233 3444e700 -1 osd.3 30 *** Got signal Terminated ***
2017-05-10T05:25:58.630 INFO:tasks.ceph.osd.3.smithi034.stderr: -1117> 2017-05-10 05:25:57.808672 3444e700 -1 osd.3 31 shutdown
2017-05-10T05:25:58.636 INFO:teuthology.orchestra.run.smithi001:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight'
2017-05-10T05:25:58.692 INFO:tasks.ceph.osd.3.smithi034.stderr:   -73> 2017-05-10 05:25:58.190585 1d7be700 -1 /mnt/jenkins/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.0.1-2198-g8760432/rpm/el7/BUILD/ceph-12.0.1-2198-g8760432/src/osd/OSD.h: In function 'void OSDService::unreg_pg_scrub(spg_t, utime_t)' thread 1d7be700 time 2017-05-10 05:25:58.138388
2017-05-10T05:25:58.692 INFO:tasks.ceph.osd.3.smithi034.stderr:/mnt/jenkins/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.0.1-2198-g8760432/rpm/el7/BUILD/ceph-12.0.1-2198-g8760432/src/osd/OSD.h: 706: FAILED assert(removed)
2017-05-10T05:25:58.692 INFO:tasks.ceph.osd.3.smithi034.stderr:
2017-05-10T05:25:58.692 INFO:tasks.ceph.osd.3.smithi034.stderr: ceph version 12.0.1-2198-g8760432 (87604323e6d350d533c826fdb91f2052a11d3de9)
2017-05-10T05:25:58.693 INFO:tasks.ceph.osd.3.smithi034.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0xae6630]
2017-05-10T05:25:58.693 INFO:tasks.ceph.osd.3.smithi034.stderr: 2: (PG::unreg_next_scrub()+0x1f3) [0x660443]
2017-05-10T05:25:58.693 INFO:tasks.ceph.osd.3.smithi034.stderr: 3: (PrimaryLogPG::on_shutdown()+0x135) [0x704ae5]
2017-05-10T05:25:58.693 INFO:tasks.ceph.osd.3.smithi034.stderr: 4: (PrimaryLogPG::on_removal(ObjectStore::Transaction*)+0x283) [0x6d3273]
2017-05-10T05:25:58.694 INFO:tasks.ceph.osd.3.smithi034.stderr: 5: (OSD::_remove_pg(PG*)+0x49) [0x5c1349]
2017-05-10T05:25:58.694 INFO:tasks.ceph.osd.3.smithi034.stderr: 6: (OSD::consume_map()+0x428) [0x5c92f8]
2017-05-10T05:25:58.694 INFO:tasks.ceph.osd.3.smithi034.stderr: 7: (OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*)+0x79e) [0x5ca1be]
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr: 8: (C_OnMapCommit::finish(int)+0x17) [0x618c57]
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr: 9: (Context::complete(int)+0x9) [0x5de219]
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr: 10: (Finisher::finisher_thread_entry()+0x198) [0xae4098]
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr: 11: (()+0x7dc5) [0xc729dc5]
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr: 12: (clone()+0x6d) [0xd87573d]
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2017-05-10T05:25:58.695 INFO:tasks.ceph.osd.3.smithi034.stderr:

/a/sage-2017-05-10_03:08:19-rados-wip-sage-testing2---basic-smithi/1119809

Related issues 2 (0 open, 2 closed)

Copied to Ceph - Backport #19915: jewel: osd/OSD.h: 706: FAILED assert(removed) in PG::unreg_next_scrub (Resolved, Alexey Sheplyakov)
Copied to Ceph - Backport #19916: kraken: osd/OSD.h: 706: FAILED assert(removed) in PG::unreg_next_scrub (Resolved, Alexey Sheplyakov)
Actions #1

Updated by Kefu Chai almost 7 years ago

  • Assignee set to Kefu Chai
Actions #2

Updated by Kefu Chai almost 7 years ago

  • Status changed from New to Fix Under Review
  • Backport set to jewel,kraken
 -1143> 2017-05-10 05:25:57.790514 1d7be700 10 osd.3 31 _committed_osd_maps 32..32
 ..
 -1118> 2017-05-10 05:25:57.804659 3444e700  0 osd.3 31 prepare_to_stop starting shutdown
 -1117> 2017-05-10 05:25:57.808672 3444e700 -1 osd.3 31 shutdown
 ..
 -1028> 2017-05-10 05:25:57.917169 3444e700 10 osd.3 pg_epoch: 31 pg[34.5( v 31'3 (0'0,31'3] local-lis/les=29/31 n=1 ec=29 lis/c 29/29 les/c/f 31/31/0 29/29/29) [2,3] r=1 lpr=31 luod=0'0 crt=31'2 lcod 31'1 active] on_shutdown
  ..
  -365> 2017-05-10 05:25:58.062739 3444e700 10 osd.3 pg_epoch: 31 pg[17.4( v 24'2 (0'0,24'2] local-lis/les=27/29 n=1 ec=20 lis/c 27/27 les/c/f 29/29/0 27/27/27) [3,1] r=0 lpr=27 crt=24'2 lcod 24'1 mlcod 24'1 active+clean] on_shutdown
  -352> 2017-05-10 05:25:58.065410 3444e700 10 osd.3 pg_epoch: 31 pg[29.1( v 29'1 (0'0,29'1] local-lis/les=26/28 n=1 ec=26 lis/c 26/26 les/c/f 28/28/0 26/26/26) [2,3] r=1 lpr=28 luod=0'0 crt=29'1 lcod 0'0 active] on_shutdown
  -340> 2017-05-10 05:25:58.067891 3444e700 10 osd.3 pg_epoch: 31 pg[27.7( v 29'3 (0'0,29'3] local-lis/les=27/29 n=0 ec=25 lis/c 27/27 les/c/f 29/29/0 27/27/27) [3,1] r=0 lpr=27 crt=29'3 lcod 29'2 mlcod 29'2 active+clean] on_shutdown
  -327> 2017-05-10 05:25:58.070515 3444e700 10 osd.3 pg_epoch: 31 pg[25.5( v 30'8 (0'0,30'8] local-lis/les=24/27 n=0 ec=24 lis/c 24/24 les/c/f 27/27/0 24/24/24) [3,5] r=0 lpr=24 crt=30'8 lcod 29'7 mlcod 29'7 active+clean] on_shutdown
  -314> 2017-05-10 05:25:58.073209 3444e700 10 osd.3 pg_epoch: 31 pg[24.4( empty local-lis/les=23/25 n=0 ec=23 lis/c 23/23 les/c/f 25/25/0 23/23/23) [3,4] r=0 lpr=23 crt=0'0 mlcod 0'0 active+clean] on_shutdown
  -301> 2017-05-10 05:25:58.076383 3444e700 10 osd.3 pg_epoch: 31 pg[25.6( empty local-lis/les=24/27 n=0 ec=24 lis/c 24/24 les/c/f 27/27/0 24/24/24) [2,3] r=1 lpr=27 crt=0'0 active] on_shutdown
  -289> 2017-05-10 05:25:58.078623 3444e700 10 osd.3 pg_epoch: 31 pg[29.2( v 31'2 (0'0,31'2] local-lis/les=27/29 n=2 ec=26 lis/c 27/27 les/c/f 29/29/0 27/27/26) [3,2] r=0 lpr=27 crt=31'2 lcod 30'1 mlcod 30'1 active+clean] on_shutdown
  -276> 2017-05-10 05:25:58.081325 3444e700 10 osd.3 pg_epoch: 31 pg[24.7( empty local-lis/les=23/25 n=0 ec=23 lis/c 23/23 les/c/f 25/25/0 23/23/23) [3,2] r=0 lpr=23 crt=0'0 mlcod 0'0 active+clean] on_shutdown
  -263> 2017-05-10 05:25:58.084186 3444e700 10 osd.3 pg_epoch: 31 pg[20.0( empty local-lis/les=21/24 n=0 ec=21 lis/c 21/21 les/c/f 24/24/0 21/21/21) [4,3] r=1 lpr=24 crt=0'0 active] on_shutdown
  // it took 0.4 seconds to acquire the osd_lock
  ..
  -225> 2017-05-10 05:25:58.109649 1d7be700 10 osd.3 31  advance to epoch 32 (<= last 32 <= newest_map 32)
  ..

  -220> 2017-05-10 05:25:58.110989 1d7be700  7 osd.3 32 consume_map version 32
  ..
  -156> 2017-05-10 05:25:58.122924 1d7be700 10 osd.3 pg_epoch: 31 pg[27.0( empty local-lis/les=27/29 n=0 ec=25 lis/c 27/27 les/c/f 29/29/0 27/27/25) [2,3] r=1 lpr=29 crt=0'0 active] on_removal
  -155> 2017-05-10 05:25:58.123363 1d7be700 10 log is not dirty
  -154> 2017-05-10 05:25:58.123490 1d7be700 10 osd.3 pg_epoch: 31 pg[27.0( empty lb MIN (bitwise) local-lis/les=27/29 n=0 ec=25 lis/c 27/27 les/c/f 29/29/0 27/27/25) [2,3] r=1 lpr=29 crt=0'0 active] on_shutdown
  -153> 2017-05-10 05:25:58.123706 1d7be700 10 osd.3 pg_epoch: 31 pg[27.0( empty lb MIN (bitwise) local-lis/les=27/29 n=0 ec=25 lis/c 27/27 les/c/f 29/29/0 27/27/25) [2,3] r=1 lpr=29 crt=0'0 active] cancel_copy_ops
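
One reading of the excerpt above is that the shutdown thread (3444e700) and the map-commit thread (1d7be700) each end up calling on_shutdown on a PG, so the scrub registration is dropped twice and the second attempt trips assert(removed). The toy model below only illustrates that sequence. Every name in it (ScrubRegistry, the PG class, the guard flag, replay) is a hypothetical stand-in and not Ceph code. The guard is just one possible way to make the second call harmless and does not claim to reproduce the change in the linked pull request.

```python
class ScrubRegistry:
    """Toy stand-in for an OSD-wide scrub schedule (hypothetical, not Ceph code)."""

    def __init__(self):
        self._jobs = set()

    def reg(self, pgid, stamp):
        self._jobs.add((stamp, pgid))

    def unreg(self, pgid, stamp):
        # Same shape as assert(removed): removing an entry that is not
        # registered is treated as a programming error.
        removed = (stamp, pgid) in self._jobs
        self._jobs.discard((stamp, pgid))
        if not removed:
            raise AssertionError("FAILED assert(removed)")


class PG:
    """Toy placement group that registers itself for scrubbing."""

    def __init__(self, pgid, registry, guard=False):
        self.pgid = pgid
        self.registry = registry
        self.guard = guard          # assumption: skip if already unregistered
        self.scrub_stamp = None

    def reg_next_scrub(self, stamp):
        self.registry.reg(self.pgid, stamp)
        self.scrub_stamp = stamp

    def unreg_next_scrub(self):
        if self.guard and self.scrub_stamp is None:
            return                  # nothing registered, nothing to remove
        self.registry.unreg(self.pgid, self.scrub_stamp)
        self.scrub_stamp = None

    def on_shutdown(self):
        self.unreg_next_scrub()

    def on_removal(self):
        # As in the backtrace: on_removal ends up calling on_shutdown.
        self.on_shutdown()


def replay(guard):
    """Replay the suspected ordering: shutdown first, then map-driven removal."""
    registry = ScrubRegistry()
    pg = PG("27.0", registry, guard=guard)
    pg.reg_next_scrub(stamp=100)
    pg.on_shutdown()                # shutdown thread walks the PGs
    try:
        pg.on_removal()             # consume_map removes the same PG afterwards
    except AssertionError as exc:
        return str(exc)
    return "ok"
```

Calling replay(guard=False) returns the assertion message, while replay(guard=True) returns "ok", because the second unregistration becomes a no-op.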

https://github.com/ceph/ceph/pull/15040

Actions #3

Updated by Kefu Chai almost 7 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #4

Updated by Alexey Sheplyakov almost 7 years ago

  • Copied to Backport #19915: jewel: osd/OSD.h: 706: FAILED assert(removed) in PG::unreg_next_scrub added
Actions #5

Updated by Alexey Sheplyakov almost 7 years ago

  • Copied to Backport #19916: kraken: osd/OSD.h: 706: FAILED assert(removed) in PG::unreg_next_scrub added
Actions #6

Updated by Kefu Chai almost 7 years ago

  • Assignee deleted (Kefu Chai)
Actions #7

Updated by Nathan Cutler over 6 years ago

  • Status changed from Pending Backport to Resolved