Project

General

Profile

Bug #19943

osd: enoent on snaptrimmer

Added by Sage Weil 2 months ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Immediate
Assignee:
Category:
Snapshots
Target version:
-
Start date:
05/16/2017
Due date:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc:
No
Component(RADOS):
OSD

Description

   -15> 2017-05-16 03:54:15.567351 7f2462a12700 20 osd.1 op_wq(3) _process 2.3 item PGQueueable(PGSnapTrim prio 5 cost 1048576 e814) pg 0x56090e3b4000
   -14> 2017-05-16 03:54:15.567357 7f2462a12700 10 osd.1 pg_epoch: 814 pg[2.3( v 812'4221 (714'3846,812'4221] local-lis/les=810/812 n=47 ec=10 lis/c 810/810 les/c/f 812/812/0 808/810/810) [1,5] r=0 lpr=810 crt=812'4221 lcod 812'4219 mlcod 812'4219 active+clean+snaptrim snaptrimq=[263~1,27b~4,2a7~1,2ab~d,2ba~4,2bf~1]] snap_trimmer posting
   -13> 2017-05-16 03:54:15.567384 7f2462a12700 10 osd.1 pg_epoch: 814 pg[2.3( v 812'4221 (714'3846,812'4221] local-lis/les=810/812 n=47 ec=10 lis/c 810/810 les/c/f 812/812/0 808/810/810) [1,5] r=0 lpr=810 crt=812'4221 lcod 812'4219 mlcod 812'4219 active+clean+snaptrim snaptrimq=[263~1,27b~4,2a7~1,2ab~d,2ba~4,2bf~1]] SnapTrimmer state<Trimming/AwaitAsyncWork>: AwaitAsyncWork: trimming snap 263
   -12> 2017-05-16 03:54:15.567407 7f2462a12700 15 filestore(/var/lib/ceph/osd/ceph-1) get_omap_iterator meta/#-1:c0371625:::snapmapper:0#
   -11> 2017-05-16 03:54:15.567621 7f2462a12700 15 filestore(/var/lib/ceph/osd/ceph-1) get_omap_iterator meta/#-1:c0371625:::snapmapper:0#
   -10> 2017-05-16 03:54:15.567690 7f2462a12700 15 filestore(/var/lib/ceph/osd/ceph-1) get_omap_iterator meta/#-1:c0371625:::snapmapper:0#
    -9> 2017-05-16 03:54:15.567743 7f2462a12700 15 filestore(/var/lib/ceph/osd/ceph-1) get_omap_iterator meta/#-1:c0371625:::snapmapper:0#
    -8> 2017-05-16 03:54:15.567794 7f2462a12700 15 filestore(/var/lib/ceph/osd/ceph-1) get_omap_iterator meta/#-1:c0371625:::snapmapper:0#
    -7> 2017-05-16 03:54:15.567847 7f2462a12700 10 osd.1 pg_epoch: 814 pg[2.3( v 812'4221 (714'3846,812'4221] local-lis/les=810/812 n=47 ec=10 lis/c 810/810 les/c/f 812/812/0 808/810/810) [1,5] r=0 lpr=810 crt=812'4221 lcod 812'4219 mlcod 812'4219 active+clean+snaptrim snaptrimq=[263~1,27b~4,2a7~1,2ab~d,2ba~4,2bf~1]] SnapTrimmer state<Trimming/AwaitAsyncWork>: AwaitAsyncWork react trimming 2:fa704231:::smithi05621570-490:269
    -6> 2017-05-16 03:54:15.567870 7f2462a12700 10 osd.1 pg_epoch: 814 pg[2.3( v 812'4221 (714'3846,812'4221] local-lis/les=810/812 n=47 ec=10 lis/c 810/810 les/c/f 812/812/0 808/810/810) [1,5] r=0 lpr=810 crt=812'4221 lcod 812'4219 mlcod 812'4219 active+clean+snaptrim snaptrimq=[263~1,27b~4,2a7~1,2ab~d,2ba~4,2bf~1]] get_object_context: obc NOT found in cache: 2:fa704231:::smithi05621570-490:269
    -5> 2017-05-16 03:54:15.567880 7f2462a12700 15 filestore(/var/lib/ceph/osd/ceph-1) getattr 2.3_head/#2:fa704231:::smithi05621570-490:269# '_'
    -4> 2017-05-16 03:54:15.567933 7f2462a12700 10 filestore(/var/lib/ceph/osd/ceph-1) error opening file /var/lib/ceph/osd/ceph-1/current/2.3_head/DIR_F/smithi05621570-490__269_8C420E5F__2 with flags=2: (2) No such file or directory
    -3> 2017-05-16 03:54:15.567946 7f2462a12700 10 filestore(/var/lib/ceph/osd/ceph-1) getattr 2.3_head/#2:fa704231:::smithi05621570-490:269# '_' = -2
    -2> 2017-05-16 03:54:15.567950 7f2462a12700 10 osd.1 pg_epoch: 814 pg[2.3( v 812'4221 (714'3846,812'4221] local-lis/les=810/812 n=47 ec=10 lis/c 810/810 les/c/f 812/812/0 808/810/810) [1,5] r=0 lpr=810 crt=812'4221 lcod 812'4219 mlcod 812'4219 active+clean+snaptrim snaptrimq=[263~1,27b~4,2a7~1,2ab~d,2ba~4,2bf~1]] get_object_context: no obc for soid 2:fa704231:::smithi05621570-490:269 and !can_create
    -1> 2017-05-16 03:54:15.567959 7f2462a12700 -1 osd.1 pg_epoch: 814 pg[2.3( v 812'4221 (714'3846,812'4221] local-lis/les=810/812 n=47 ec=10 lis/c 810/810 les/c/f 812/812/0 808/810/810) [1,5] r=0 lpr=810 crt=812'4221 lcod 812'4219 mlcod 812'4219 active+clean+snaptrim snaptrimq=[263~1,27b~4,2a7~1,2ab~d,2ba~4,2bf~1]] trim_object could not find coid 2:fa704231:::smithi05621570-490:269
     0> 2017-05-16 03:54:15.571025 7f2462a12700 -1 *** Caught signal (Aborted) **
 in thread 7f2462a12700 thread_name:tp_osd_tp

 ceph version 12.0.1-2422-g0d194cc (0d194cc84cf22302e6f62c1cba71d85c99637937)
 1: (()+0x98c027) [0x5609043fd027]
 2: (()+0x10330) [0x7f2480cd0330]
 3: (gsignal()+0x37) [0x7f247fcf4c37]
 4: (abort()+0x148) [0x7f247fcf8028]
 5: (PrimaryLogPG::trim_object(bool, hobject_t const&)+0x1250) [0x56090408a370]
 6: (PrimaryLogPG::AwaitAsyncWork::react(PrimaryLogPG::DoSnapWork const&)+0x8a6) [0x56090408bcc6]
 7: (boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x90) [0x5609040eab40]
 8: (PrimaryLogPG::snap_trimmer(unsigned int)+0x1b6) [0x560904046ea6]
 9: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfd2) [0x560903f3bcc2]
 10: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x955) [0x56090443e3c5]
 11: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x560904440520]
 12: (()+0x8184) [0x7f2480cc8184]

snapmapper shouldn't have htat object:
teuthology:1182436  03:58 PM $ grep 2:fa704231:::smithi05621570-490 remote/*/log/*osd.1* | grep snap_mapper
2017-05-16 03:37:48.951783 7fa1e5d9c700 20 snap_mapper.add_oid 2:fa704231:::smithi05621570-490:4 3,4
2017-05-16 03:37:48.951955 7fa1e5d9c700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:4 3,4
2017-05-16 03:37:51.765773 7fa1e5d9c700 20 snap_mapper.update_snaps 2:fa704231:::smithi05621570-490:4 4 was 
2017-05-16 03:37:51.765946 7fa1e5d9c700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:4 3,4
2017-05-16 03:37:51.765963 7fa1e5d9c700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:4 4
2017-05-16 03:37:53.772318 7fa1e5d9c700 20 snap_mapper.remove_oid 2:fa704231:::smithi05621570-490:4
2017-05-16 03:37:53.772325 7fa1e5d9c700 20 snap_mapper._remove_oid 2:fa704231:::smithi05621570-490:4
2017-05-16 03:37:53.772476 7fa1e5d9c700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:4 4
2017-05-16 03:37:53.772483 7fa1e5d9c700 20 snap_mapper.clear_snaps 2:fa704231:::smithi05621570-490:4
2017-05-16 03:39:50.166759 7fa1e2d96700 20 snap_mapper.add_oid 2:fa704231:::smithi05621570-490:5b 3d,53,56,58,59,5a,5b
2017-05-16 03:39:50.166875 7fa1e2d96700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:5b 3d,53,56,58,59,5a,5b
2017-05-16 03:39:58.302018 7fa1e0d92700 20 snap_mapper.remove_oid 2:fa704231:::smithi05621570-490:5b
2017-05-16 03:39:58.302021 7fa1e0d92700 20 snap_mapper._remove_oid 2:fa704231:::smithi05621570-490:5b
2017-05-16 03:39:58.302093 7fa1e0d92700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:5b 3d,53,56,58,59,5a,5b
2017-05-16 03:39:58.302097 7fa1e0d92700 20 snap_mapper.clear_snaps 2:fa704231:::smithi05621570-490:5b
2017-05-16 03:45:06.148733 7f9301842700 20 snap_mapper.add_oid 2:fa704231:::smithi05621570-490:131 12c
2017-05-16 03:45:06.148904 7f9301842700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:131 12c
2017-05-16 03:45:12.490071 7f9301842700 20 snap_mapper.remove_oid 2:fa704231:::smithi05621570-490:131
2017-05-16 03:45:12.490077 7f9301842700 20 snap_mapper._remove_oid 2:fa704231:::smithi05621570-490:131
2017-05-16 03:45:12.490253 7f9301842700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:131 12c
2017-05-16 03:45:12.490260 7f9301842700 20 snap_mapper.clear_snaps 2:fa704231:::smithi05621570-490:131
2017-05-16 03:48:12.572931 7f92fe03b700 20 snap_mapper.remove_oid 2:fa704231:::smithi05621570-490:head
2017-05-16 03:48:12.572933 7f92fe03b700 20 snap_mapper._remove_oid 2:fa704231:::smithi05621570-490:head
2017-05-16 03:52:15.264425 7f92ff03d700 20 snap_mapper.add_oid 2:fa704231:::smithi05621570-490:269 200,210,215,22c,240,245,249,24a,24b,24e,24f,251,259,260,261,262,263
2017-05-16 03:52:15.264573 7f92ff03d700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:269 200,210,215,22c,240,245,249,24a,24b,24e,24f,251,259,260,261,262,263
2017-05-16 03:52:41.833040 7f9301842700 20 snap_mapper.add_oid 2:fa704231:::smithi05621570-490:279 26e,275,277
2017-05-16 03:52:41.833244 7f9301842700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:279 26e,275,277
2017-05-16 03:52:52.392636 7f92fa033700 20 snap_mapper.remove_oid 2:fa704231:::smithi05621570-490:269
2017-05-16 03:52:52.392639 7f92fa033700 20 snap_mapper._remove_oid 2:fa704231:::smithi05621570-490:269
2017-05-16 03:52:52.392783 7f92fa033700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:269 200,210,215,22c,240,245,249,24a,24b,24e,24f,251,259,260,261,262,263
2017-05-16 03:52:52.392790 7f92fa033700 20 snap_mapper.clear_snaps 2:fa704231:::smithi05621570-490:269
2017-05-16 03:53:11.338655 7f9301842700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:279 26e,275,277
2017-05-16 03:53:13.940511 7f9301842700 20 snap_mapper.remove_oid 2:fa704231:::smithi05621570-490:279
2017-05-16 03:53:13.940514 7f9301842700 20 snap_mapper._remove_oid 2:fa704231:::smithi05621570-490:279
2017-05-16 03:53:13.940631 7f9301842700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:279 26e,275,277
2017-05-16 03:53:13.940636 7f9301842700 20 snap_mapper.clear_snaps 2:fa704231:::smithi05621570-490:279
2017-05-16 03:53:40.165676 7f9301842700 20 snap_mapper.add_oid 2:fa704231:::smithi05621570-490:2a5 27b,27c,27d,27e,284,28c,292,2a0,2a3,2a4
2017-05-16 03:53:40.165785 7f9301842700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:2a5 27b,27c,27d,27e,284,28c,292,2a0,2a3,2a4
2017-05-16 03:53:44.700164 7f9301842700 20 snap_mapper.update_snaps 2:fa704231:::smithi05621570-490:2a5 27b,27c,27d,27e,284,28c,2a0,2a3,2a4 was 
2017-05-16 03:53:44.700268 7f9301842700 20 snap_mapper.get_snaps 2:fa704231:::smithi05621570-490:2a5 27b,27c,27d,27e,284,28c,292,2a0,2a3,2a4
2017-05-16 03:53:44.700277 7f9301842700 20 snap_mapper.set_snaps 2:fa704231:::smithi05621570-490:2a5 27b,27c,27d,27e,284,28c,2a0,2a3,2a4

note that 269 was deleted at 03:52:52.392790

filestore, rocksdb (recent).
/a/sage-2017-05-16_02:42:05-rados-wip-sage-testing2---basic-smithi/1182436

History

#1 Updated by Greg Farnum 2 months ago

Clone 269 was trimmed but it corresponds to a lot of other snapshots, so the object shouldn't be removed until all the snapshots have been trimmed. I wonder if we have some terrible ordering bug when trimming multiple snapshots simultaneously...(although I thought it all went into a nicely ordered queue).

#2 Updated by Sage Weil 2 months ago

  • Subject changed from osd: enoent on snaptrimmer (filestore, rocksdb) to osd: enoent on snaptrimmer
  • Priority changed from Urgent to Immediate

/a/yuriw-2017-05-15_22:59:10-rados-wip-yuri-testing_2017_5_16-distro-basic-smithi/1181575 (bluestore)

#3 Updated by Sage Weil 2 months ago

/a/sage-2017-05-24_18:40:38-rados-wip-sage-testing2---basic-smithi/1224933

#4 Updated by Brad Hubbard 2 months ago

/a/bhubbard-2017-05-24_05:25:43-rados-wip-badone-testing---basic-smithi/1224546/teuthology.log

#6 Updated by Sage Weil about 2 months ago

/a/sage-2017-05-27_03:43:09-rados-wip-sage-testing2---basic-smithi/1235419

#7 Updated by Sage Weil about 2 months ago

/a/sage-2017-05-27_03:43:09-rados-wip-sage-testing2---basic-smithi/1235222

#8 Updated by Sage Weil about 2 months ago

/a/sage-2017-05-31_18:45:30-rados-wip-sage-testing---basic-smithi/1248735

#9 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-01_21:44:07-rados-wip-sage-testing---basic-smithi/1253654
teuthology:1253654 02:23 AM $ cat summary.yaml
description: rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml
backoff/peering.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml
msgr-failures/osd-delay.yaml msgr/simple.yaml objectstore/bluestore-comp.yaml rados.yaml
rocksdb.yaml thrashers/mapgap.yaml workloads/cache-pool-snaps-readproxy.yaml}

#10 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-05_14:47:27-rados-wip-sage-testing-distro-basic-smithi/1260344
teuthology:1260344 06:24 PM $ cat summary.yaml
description: rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml
backoff/normal.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml
msgr-failures/fastclose.yaml msgr/async.yaml objectstore/bluestore-comp.yaml rados.yaml
rocksdb.yaml thrashers/mapgap.yaml workloads/cache-snaps.yaml}

#11 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-05_14:47:27-rados-wip-sage-testing-distro-basic-smithi/1260424
teuthology:1260424 06:25 PM $ cat summary.yaml
description: rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml
backoff/peering_and_degraded.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml}
d-require-luminous/at-mkfs.yaml msgr-failures/osd-delay.yaml msgr/simple.yaml objectstore/filestore-btrfs.yaml
rados.yaml rocksdb.yaml thrashers/mapgap.yaml workloads/cache-pool-snaps-readproxy.yaml}

#12 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-05_22:19:51-rados-wip-sage-testing-distro-basic-smithi/1262583

rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml backoff/normal.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml msgr-failures/fastclose.yaml msgr/async.yaml objectstore/bluestore-comp.yaml rados.yaml rocksdb.yaml thrashers/mapgap.yaml workloads/cache-snaps.yaml}

#13 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-05_22:19:51-rados-wip-sage-testing-distro-basic-smithi/1262663

rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml backoff/peering_and_degraded.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml msgr-failures/osd-delay.yaml msgr/simple.yaml objectstore/filestore-btrfs.yaml rados.yaml rocksdb.yaml thrashers/mapgap.yaml workloads/cache-pool-snaps-readproxy.yaml}

#14 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-06_21:54:14-rados-wip-sage-testing-distro-basic-smithi/1265627

rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml backoff/peering.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml msgr-failures/osd-delay.yaml msgr/async.yaml objectstore/filestore-xfs.yaml rados.yaml rocksdb.yaml thrashers/default.yaml workloads/cache-snaps.yaml}

#15 Updated by Sage Weil about 2 months ago

/a/sage-2017-06-07_16:25:35-rados-wip-sage-testing2-distro-basic-smithi/1268182

rados/thrash-erasure-code/{ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml fast/normal.yaml leveldb.yaml msgr-failures/few.yaml objectstore/bluestore.yaml rados.yaml thrashers/morepggrow.yaml workloads/ec-rados-plugin=jerasure-k=2-m=1.yaml}

#16 Updated by Sage Weil about 1 month ago

  • Status changed from New to Verified

with snap trim whiteout fix applied,

/a/sage-2017-06-12_20:56:37-rados-wip-sage-testing-distro-basic-smithi/1280668
rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml backoff/peering_and_degraded.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-require-luminous/at-mkfs.yaml msgr-failures/few.yaml msgr/random.yaml objectstore/bluestore-comp.yaml rados.yaml rocksdb.yaml thrashers/default.yaml workloads/cache-snaps.yaml}

#17 Updated by Sage Weil about 1 month ago

WTH. I've seen two cases where the object exists in snapmapper a different pool (cache tiering), but I think this is just coincidence; in ohter cases it does not. In all cases, I do not see a log message showing that the item was ever added, but we're turning it up during iteration.

Also it is not crazy that it would be there in some form. In /a/sage-2017-06-05_22:19:51-rados-wip-sage-testing-distro-basic-smithi/1262663 i see that the SnapSet references it. It's just weird that the snap_mapper doesn't show that we inserted it.

#18 Updated by Sage Weil about 1 month ago

  • Assignee set to Sage Weil

#19 Updated by Sage Weil about 1 month ago

  • Assignee deleted (Sage Weil)

log with more debugging at /a/sage-2017-06-14_03:38:53-rados:thrash-wip-19943---basic-smithi/1284145/ceph-osd.5.log

#20 Updated by Sage Weil about 1 month ago

  • Status changed from Verified to In Progress

#21 Updated by Sage Weil about 1 month ago

  • Assignee set to Sage Weil

#22 Updated by Greg Farnum about 1 month ago

  • Project changed from Ceph to RADOS
  • Category set to Snapshots
  • Component(RADOS) OSD added

#23 Updated by Sage Weil about 1 month ago

/a/sage-2017-06-17_13:41:40-rados-wip-sage-testing-distro-basic-smithi/1297478

#24 Updated by Sage Weil about 1 month ago

teuthology:1297525  04:39 PM $ grep MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 remote/*/log/*osd.1*
2017-06-17 16:09:14.100866 7f5313b48700 20 snap_mapper.add_oid set MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..
                "MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..": 372,
2017-06-17 16:09:14.103746 7f5313b48700 30 bluestore(/var/lib/ceph/osd/ceph-1) _omap_setkeys  0x0000000000000402'.MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..' <- MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..
2017-06-17 16:09:22.161808 7f530fb40700 20 snap_mapper._remove_oid rm MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..
2017-06-17 16:09:22.162428 7f530fb40700 30 bluestore(/var/lib/ceph/osd/ceph-1) _omap_rmkeys  rm 0x0000000000000402'.MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..' <- MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..
2017-06-17 16:13:14.592448 7f32abc3f700 20 snap_mapper.get_next_objects_to_trim get_next(MAP_000000000000009F_0000000000000002.01A6E605.ab.smithi02829988-54..) returns 0 MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..,buffer::list(len=372,
2017-06-17 16:13:14.592452 7f32abc3f700 20 snap_mapper.get_next_objects_to_trim MAP_000000000000009F_0000000000000002.092A3760.d1.smithi02829988-317 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo..

i.e., the key was deleted 3 minutes earlier, the osd later restarted, and then we saw it enumerated.

#25 Updated by Sage Weil about 1 month ago

confirmed same thing in another run. on osd startup, fsck shows the key that was deleted.

teuthology:1304114  01:40 PM $ grep MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300 remote/*/log/*osd.2*
2017-06-20 07:09:02.819665 7f202fd99700 20 snap_mapper.add_oid set MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..
                "MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..": 71,
2017-06-20 07:09:02.821391 7f202fd99700 30 bluestore(/var/lib/ceph/osd/ceph-2) _omap_setkeys  0x0000000000000402'.MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..' <- MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..
2017-06-20 07:09:16.012266 7f202d594700 20 snap_mapper._remove_oid rm MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..
2017-06-20 07:09:16.042832 7f202d594700 30 bluestore(/var/lib/ceph/osd/ceph-2) _omap_rmkeys  rm 0x0000000000000402'.MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..' <- MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..
2017-06-20 07:11:28.238685 7f2580a60b80 30 bluestore(/var/lib/ceph/osd/ceph-2) fsck  got 0x0000000000000402'.MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..' -> MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..
2017-06-20 07:11:36.799675 7f2561a18700 20 bluestore.OmapIteratorImpl(0x7f2598b0bf00) valid is at 0x0000000000000402'.MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..'
2017-06-20 07:11:36.799687 7f2561a18700 20 snap_mapper.get_next_objects_to_trim get_next(MAP_000000000000018F_0000000000000002.39BA15B9.1c2.smithi07213161-414..) returns 0 MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..,buffer::list(len=71,
2017-06-20 07:11:36.799763 7f2561a18700 20 bluestore.OmapIteratorImpl(0x7f2598b09780) valid is at 0x0000000000000402'.MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..'
2017-06-20 07:11:36.799774 7f2561a18700 20 snap_mapper.get_next_objects_to_trim get_next(MAP_000000000000018F_0000000000000002.7) returns 0 MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..,buffer::list(len=71,
2017-06-20 07:11:36.799780 7f2561a18700 20 snap_mapper.get_next_objects_to_trim MAP_000000000000018F_0000000000000002.7208BC40.251.smithi07213161-300..

/a/sage-2017-06-20_06:19:52-rados-wip-sage-testing-distro-basic-smithi/1304114

#26 Updated by Sage Weil about 1 month ago

no, i'm an idiot, ceph-objectstore-tool is doing it and it's noted in a different log file. sheesh.

#27 Updated by Sage Weil about 1 month ago

  • Status changed from In Progress to Need Review

#28 Updated by Sage Weil about 1 month ago

  • Status changed from Need Review to Resolved

Also available in: Atom PDF