Bug #13381
closed
osd/SnapMapper.cc: 282: FAILED assert(check(oid)) on hammer->jewel upgrade
Description
-14> 2015-10-05 19:30:49.183599 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0 = 0
-13> 2015-10-05 19:30:49.183608 7f8e7f200700 20 snap_mapper.remove_oid 0/3473053c/1000e081c63.0000009d/head
-12> 2015-10-05 19:30:49.183612 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0
-11> 2015-10-05 19:30:49.183645 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0 = 0
-10> 2015-10-05 19:30:49.183654 7f8e7f200700 20 snap_mapper.remove_oid 0/c273053c/100010365d9.00000000/head
-9> 2015-10-05 19:30:49.183658 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0
-8> 2015-10-05 19:30:49.183690 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0 = 0
-7> 2015-10-05 19:30:49.183699 7f8e7f200700 20 snap_mapper.remove_oid 0/a6f3053c/100005b91ae.00000000/head
-6> 2015-10-05 19:30:49.183703 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0
-5> 2015-10-05 19:30:49.183736 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0 = 0
-4> 2015-10-05 19:30:49.183745 7f8e7f200700 20 snap_mapper.remove_oid 0/cef3053c/1000ec7a4fc.00000000/head
-3> 2015-10-05 19:30:49.183749 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0
-2> 2015-10-05 19:30:49.183782 7f8e7f200700 15 filestore(/var/lib/ceph/osd/ceph-40) omap_get_values meta/-1/a468ec03/snapmapper/0 = 0
-1> 2015-10-05 19:30:49.183791 7f8e7f200700 20 snap_mapper.remove_oid 0/d2200e36/1000a89628c.00000000/head
0> 2015-10-05 19:30:49.186168 7f8e7f200700 -1 osd/SnapMapper.cc: In function 'int SnapMapper::remove_oid(const hobject_t&, MapCacher::Transaction<std::basic_string<char>, ceph::buffer::list>*)' thread 7f8e7f200700 time 2015-10-05 19:30:49.183793
osd/SnapMapper.cc: 282: FAILED assert(check(oid))
ceph version 9.0.3-2033-g3570ec6 (3570ec612a6e85169007e50533ea56c152a23f8e)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f8eae279c7b]
2: (SnapMapper::remove_oid(hobject_t const&, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x1ed) [0x7f8eadd5eecd]
3: (remove_dir(CephContext*, ObjectStore*, SnapMapper*, OSDriver*, ObjectStore::Sequencer*, coll_t, std::shared_ptr<DeletingState>, bool*, ThreadPool::TPHandle&)+0x454) [0x7f8eadcc2dc4]
4: (OSD::RemoveWQ::_process(std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, ThreadPool::TPHandle&)+0x1f4) [0x7f8eadcc3834]
5: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> > >::_void_process(void*, ThreadPool::TPHandle&)+0x10a) [0x7f8eadd16eba]
6: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0x7f8eae26b6f6]
7: (ThreadPool::WorkThread::entry()+0x10) [0x7f8eae26c5c0]
8: (()+0x8182) [0x7f8eac630182]
9: (clone()+0x6d) [0x7f8eaa97747d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
this is mira093 osd.40 and osd.39, after upgrading from hammer -> infernalis.
pg is 0.53c
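As a rough cross-check of the log in the description, the hashes printed by snap_mapper.remove_oid can be compared against the PG seed 0x53c. This is only a sketch: it assumes the middle field of `0/<hash>/<name>/head` is the object's 32-bit hash in hex and that PG membership is decided by the low bits of that hash. The helper names and the mask width of 11 bits (the smallest width that can hold 0x53c) are illustrative assumptions, not values read from the OSD.

```python
# Hashes copied from the snap_mapper.remove_oid lines in the description.
LOGGED_HASHES = ["3473053c", "c273053c", "a6f3053c", "cef3053c", "d2200e36"]

PG_SEED = 0x53C   # from "pg is 0.53c"
MASK_BITS = 11    # assumption: smallest width that can represent 0x53c


def low_bits_match(hash_hex: str, seed: int, bits: int) -> bool:
    """Return True if the low `bits` bits of the hash equal those of the seed."""
    mask = (1 << bits) - 1
    return (int(hash_hex, 16) & mask) == (seed & mask)


def misplaced(hashes, seed=PG_SEED, bits=MASK_BITS):
    """List the hashes that do not belong to the PG under this simple model."""
    return [h for h in hashes if not low_bits_match(h, seed, bits)]


if __name__ == "__main__":
    for h in LOGGED_HASHES:
        verdict = "ok" if low_bits_match(h, PG_SEED, MASK_BITS) else "MISMATCH"
        print(h, verdict)
```

Under this model only d2200e36, the object the assert fires on, fails the comparison; the four objects removed before it all end in 53c.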
Updated by Sage Weil over 8 years ago
looks like an object is in the wrong directory
root@mira093:/var/lib/ceph/osd/ceph-40/current/0.53c_head# ls -al | grep 1000a89628c
-rw-r--r-- 1 root root 48 Aug 26 09:56 1000a89628c.00000000__head_D2200E36__0
root@mira093:/var/lib/ceph/osd/ceph-40/current/0.53c_head# attr -l 1000a89628c.00000000__head_D2200E36__0
Attribute "cephos.spill_out" has a 2 byte value for 1000a89628c.00000000__head_D2200E36__0
Attribute "ceph._" has a 255 byte value for 1000a89628c.00000000__head_D2200E36__0
Attribute "ceph._parent" has a 347 byte value for 1000a89628c.00000000__head_D2200E36__0
Attribute "ceph.snapset" has a 31 byte value for 1000a89628c.00000000__head_D2200E36__0
root@mira093:/var/lib/ceph/osd/ceph-40/current/0.53c_head# attr -q -h 1000a89628c.00000000__head_D2200E36__0
attr: invalid option -- 'h'
Unrecognized option: ?
Usage: attr [-LRSq] -s attrname [-V attrvalue] pathname # set value
       attr [-LRSq] -g attrname pathname # get value
       attr [-LRSq] -r attrname pathname # remove attr
       attr [-LRq] -l pathname # list attrs
       -s reads a value from stdin and -g writes a value to stdout
root@mira093:/var/lib/ceph/osd/ceph-40/current/0.53c_head# attr -q -g ceph._ 1000a89628c.00000000__head_D2200E36__0 > /tmp/a
root@mira093:/var/lib/ceph/osd/ceph-40/current/0.53c_head# ceph-dencoder import /tmp/a type object_info_t decode dump_json
{
    "oid": {
        "oid": "1000a89628c.00000000",
        "key": "",
        "snapid": -2,
        "hash": 3525316150,
        "max": 0,
        "pool": 0,
        "namespace": ""
    },
    "version": "469776'609414",
    "prior_version": "408114'595463",
    "last_reqid": "mds.0.122:96384116",
    "user_version": 609414,
    "size": 48,
    "mtime": "2015-04-29 03:23:53.397117",
    "local_mtime": "2015-04-29 03:23:54.774958",
    "lost": 0,
    "flags": 52,
    "snaps": [],
    "truncate_seq": 0,
    "truncate_size": 0,
    "data_digest": 3949395613,
    "omap_digest": 4294967295,
    "watchers": {}
}
root@mira093:/var/lib/ceph/osd/ceph-40/current/0.53c_head# printf "%x\n" 3525316150
d2200e36
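The session above shows two things: the hash stored in the object's `ceph._` xattr agrees with the hash embedded in its filename, and neither fits the 0.53c_head directory it sits in. A minimal sketch of that comparison follows. It assumes the eight hex digits after `_head_` in the filename are the object hash; the regular expression and the function name `filename_hash` are illustrative, not part of any Ceph tool.

```python
import re

# Values copied from the shell session above.
FILENAME = "1000a89628c.00000000__head_D2200E36__0"
OBJECT_INFO_HASH = 3525316150   # "hash" field from ceph-dencoder dump_json
PG_DIR_SEED = 0x53C             # from the directory name 0.53c_head

# Assumption: the 8 hex digits following "_head_" are the hash that was
# encoded into the on-disk name.
HASH_IN_NAME = re.compile(r"_head_([0-9A-Fa-f]{8})_")


def filename_hash(name: str) -> int:
    """Extract the hash field from a filestore-style object filename."""
    m = HASH_IN_NAME.search(name)
    if m is None:
        raise ValueError(f"no hash field found in {name!r}")
    return int(m.group(1), 16)


if __name__ == "__main__":
    h = filename_hash(FILENAME)
    print("filename hash  :", format(h, "x"))
    print("xattr hash     :", format(OBJECT_INFO_HASH, "x"))
    print("low 3 hex chars:", format(h & 0xFFF, "x"), "vs pg", format(PG_DIR_SEED, "x"))
```

The object is self-consistent (name and xattr agree on d2200e36), which supports the reading that the file is simply in the wrong PG directory rather than corrupted.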
Updated by Sage Weil over 8 years ago
same assert was triggering several days before the infernalis upgrade, so this is nothing new.
Updated by Sage Weil over 8 years ago
- Priority changed from Urgent to Normal
I think this is just cruft from something that happened ages ago.
Updated by Samuel Just over 8 years ago
- Assignee set to Samuel Just
- Priority changed from Normal to Urgent
Updated by Samuel Just over 8 years ago
- Assignee deleted (Samuel Just)
- Priority changed from Urgent to Normal
Updated by Sage Weil over 8 years ago
Sage Weil wrote:
I think this is just cruft from something that happened ages ago.
2015-04-29 03:23:54.774958, specifically
Updated by Yuri Weinstein over 8 years ago
Updated by Sage Weil over 8 years ago
- Subject changed from osd/SnapMapper.cc: 282: FAILED assert(check(oid)) to osd/SnapMapper.cc: 282: FAILED assert(check(oid)) on hammer->jewel upgrade
- Priority changed from Normal to Urgent
- Source changed from Development to Q/A
Updated by Yuri Weinstein over 8 years ago
Also on hammer -> infernalis
Run: http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2015-12-22_17:02:01-upgrade:hammer-x-infernalis-distro-basic-openstack/
Job: 48665
Logs: http://teuthology.ovh.sepia.ceph.com/teuthology/teuthology-2015-12-22_17:02:01-upgrade:hammer-x-infernalis-distro-basic-openstack/48665/teuthology.log
2015-12-22T19:27:53.872 INFO:tasks.thrashosds:joining thrashosds
2015-12-22T19:27:54.013 INFO:tasks.ceph.osd.1.target091046.stderr:osd/SnapMapper.cc: In function 'int SnapMapper::remove_oid(const hobject_t&, MapCacher::Transaction<std::basic_string<char>, ceph::buffer::list>*)' thread 7f64a689d700 time 2015-12-22 19:27:53.913624
2015-12-22T19:27:54.013 INFO:tasks.ceph.osd.1.target091046.stderr:osd/SnapMapper.cc: 282: FAILED assert(check(oid))
2015-12-22T19:27:54.020 INFO:tasks.ceph.osd.1.target091046.stderr: ceph version 9.2.0-25-gf480cea (f480cea217008fa7b1e476d30dcb13023e6431d1)
2015-12-22T19:27:54.020 INFO:tasks.ceph.osd.1.target091046.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f64cb45d27b]
2015-12-22T19:27:54.020 INFO:tasks.ceph.osd.1.target091046.stderr: 2: (SnapMapper::remove_oid(hobject_t const&, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x1ed) [0x7f64caf4ea1d]
2015-12-22T19:27:54.021 INFO:tasks.ceph.osd.1.target091046.stderr: 3: (remove_dir(CephContext*, ObjectStore*, SnapMapper*, OSDriver*, ObjectStore::Sequencer*, coll_t, std::shared_ptr<DeletingState>, bool*, ThreadPool::TPHandle&)+0x454) [0x7f64caeb2084]
2015-12-22T19:27:54.021 INFO:tasks.ceph.osd.1.target091046.stderr: 4: (OSD::RemoveWQ::_process(std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, ThreadPool::TPHandle&)+0x1f4) [0x7f64caeb2af4]
2015-12-22T19:27:54.021 INFO:tasks.ceph.osd.1.target091046.stderr: 5: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> > >::_void_process(void*, ThreadPool::TPHandle&)+0x10a) [0x7f64caf062ea]
2015-12-22T19:27:54.021 INFO:tasks.ceph.osd.1.target091046.stderr: 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0x7f64cb44ecf6]
2015-12-22T19:27:54.022 INFO:tasks.ceph.osd.1.target091046.stderr: 7: (ThreadPool::WorkThread::entry()+0x10) [0x7f64cb44fbc0]
2015-12-22T19:27:54.022 INFO:tasks.ceph.osd.1.target091046.stderr: 8: (()+0x8182) [0x7f64c9a7b182]
2015-12-22T19:27:54.022 INFO:tasks.ceph.osd.1.target091046.stderr: 9: (clone()+0x6d) [0x7f64c7dc247d]
2015-12-22T19:27:54.022 INFO:tasks.ceph.osd.1.target091046.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Yuri Weinstein over 8 years ago
- Release set to infernalis
- Release set to jewel
- ceph-qa-suite upgrade/hammer-x added
Updated by Sage Weil over 8 years ago
2015-12-22 19:27:53.913613 7f64a689d700 20 snap_mapper.remove_oid -1/00000000/temp_4.40s0_0_34564_1/head
could this be David's bug where the temp object isn't getting cleaned up?
Updated by Sage Weil over 8 years ago
- Related to Bug #13862: pgs stuck inconsistent after infernalis upgrade added
Updated by David Zafman over 8 years ago
This object has a hash and is in pool 0. Doesn't seem related to 13862. Don't know why check() failed.
1000a89628c.00000000__head_D2200E36__0
This looks like a Hammer temporary object with no hash and pool == -1. The 13862 fix will attempt to remove it.
-1/00000000/temp_4.40s0_0_34564_1/head
The fix for 13862 will remove this second kind of object with a simple remove transaction rather than recursive_remove_collection(), which calls mapper.remove_oid() (presumably to handle snapshots in the transaction) before doing the object remove transaction. It is mapper.remove_oid() that runs into the assert(), so 13862 is at least a partial fix for this bug.
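To make the distinction between the two objects concrete, here is a small sketch that parses the `pool/hash/name/snap` form printed by snap_mapper.remove_oid and applies the rule of thumb from this comment (no hash and pool == -1 suggests a Hammer temporary object). The field layout is inferred from the log lines quoted in this ticket, and `parse_logged_object` and `looks_like_hammer_temp` are hypothetical helper names, not Ceph functions.

```python
from typing import NamedTuple


class LoggedObject(NamedTuple):
    pool: int
    hash: int
    name: str
    snap: str


def parse_logged_object(text: str) -> LoggedObject:
    """Parse 'pool/hash/name/snap' as it appears in the remove_oid log lines."""
    pool, hash_hex, rest = text.split("/", 2)
    # Split the snap off the right so a '/' inside the name would be kept.
    name, snap = rest.rsplit("/", 1)
    return LoggedObject(int(pool), int(hash_hex, 16), name, snap)


def looks_like_hammer_temp(obj: LoggedObject) -> bool:
    """Heuristic from the comment above: no hash and pool == -1."""
    return obj.pool == -1 and obj.hash == 0


if __name__ == "__main__":
    for line in (
        "0/d2200e36/1000a89628c.00000000/head",
        "-1/00000000/temp_4.40s0_0_34564_1/head",
    ):
        obj = parse_logged_object(line)
        print(obj.name, "temp" if looks_like_hammer_temp(obj) else "regular")
```

Only the second object matches the heuristic, which is consistent with the 13862 fix covering the temp-object case but not the misplaced object in pool 0.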
Updated by Yuri Weinstein over 8 years ago
- ceph-qa-suite upgrade/firefly-hammer-x added
Also in run:
http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2016-01-16_11:18:01-upgrade:firefly-hammer-x-infernalis-distro-basic-openstack/
Job: ['3764']
Log: http://teuthology.ovh.sepia.ceph.com/teuthology/teuthology-2016-01-16_11:18:01-upgrade:firefly-hammer-x-infernalis-distro-basic-openstack/3764/teuthology.log
2016-01-16T13:04:00.315 INFO:tasks.workunit.client.0.target078138.stdout:done waiting
2016-01-16T13:04:00.324 INFO:tasks.workunit.client.0.target078138.stdout:test/librados/TestCase.cc:336: Failure
2016-01-16T13:04:00.324 INFO:tasks.workunit.client.0.target078138.stdout:Value of: completion->get_return_value()
2016-01-16T13:04:00.325 INFO:tasks.workunit.client.0.target078138.stdout: Actual: -2
2016-01-16T13:04:00.325 INFO:tasks.workunit.client.0.target078138.stdout:Expected: 0
2016-01-16T13:04:00.432 INFO:tasks.workunit.client.0.target078138.stdout:[ FAILED ] LibRadosMiscPP.CopyScrubPP (88783 ms)
2016-01-16T13:04:00.500 INFO:tasks.ceph.mon.a.target078137.stderr:2016-01-16 13:04:00.460412 7f6223890700 -1 mon.a@1(peon).osd e1882 update_from_paxos full map CRC mismatch, resetting to canonical
2016-01-16T13:04:00.517 INFO:tasks.workunit.client.0.target078138.stdout:[----------] 18 tests from LibRadosMiscPP (111523 ms total)
2016-01-16T13:04:00.518 INFO:tasks.workunit.client.0.target078138.stdout:
2016-01-16T13:04:00.519 INFO:tasks.workunit.client.0.target078138.stdout:[----------] Global test environment tear-down
2016-01-16T13:04:00.519 INFO:tasks.workunit.client.0.target078138.stdout:[==========] 24 tests from 4 test cases ran. (115730 ms total)
2016-01-16T13:04:00.519 INFO:tasks.workunit.client.0.target078138.stdout:[ PASSED ] 23 tests.
2016-01-16T13:04:00.519 INFO:tasks.workunit.client.0.target078138.stdout:[ FAILED ] 1 test, listed below:
2016-01-16T13:04:00.520 INFO:tasks.workunit.client.0.target078138.stdout:[ FAILED ] LibRadosMiscPP.CopyScrubPP
2016-01-16T13:04:00.520 INFO:tasks.workunit.client.0.target078138.stdout:
2016-01-16T13:04:00.520 INFO:tasks.workunit.client.0.target078138.stdout: 1 FAILED TEST
2016-01-16T13:04:00.530 INFO:tasks.workunit:Stopping ['rados/test-upgrade-v9.0.1.sh', 'cls'] on client.0...
2016-01-16T13:04:00.530 INFO:teuthology.orchestra.run.target078138:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0 /home/ubuntu/cephtest/clone'
2016-01-16T13:04:00.537 INFO:tasks.ceph.osd.1.target078137.stderr:osd/SnapMapper.cc: In function 'int SnapMapper::remove_oid(const hobject_t&, MapCacher::Transaction<std::basic_string<char>, ceph::buffer::list>*)' thread 7f3611c9c700 time 2016-01-16 13:04:00.492339
2016-01-16T13:04:00.537 INFO:tasks.ceph.osd.1.target078137.stderr:osd/SnapMapper.cc: 282: FAILED assert(check(oid))
2016-01-16T13:04:00.537 INFO:tasks.ceph.osd.1.target078137.stderr: ceph version 9.2.0-39-g1296c2b (1296c2baef3412f462ee2124af747a892ea8b7a9)
2016-01-16T13:04:00.538 INFO:tasks.ceph.osd.1.target078137.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f363393a255]
2016-01-16T13:04:00.538 INFO:tasks.ceph.osd.1.target078137.stderr: 2: (SnapMapper::remove_oid(hobject_t const&, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x215) [0x7f363340ae15]
2016-01-16T13:04:00.538 INFO:tasks.ceph.osd.1.target078137.stderr: 3: (remove_dir(CephContext*, ObjectStore*, SnapMapper*, OSDriver*, ObjectStore::Sequencer*, coll_t, std::shared_ptr<DeletingState>, bool*, ThreadPool::TPHandle&)+0x454) [0x7f363336a334]
2016-01-16T13:04:00.538 INFO:tasks.ceph.osd.1.target078137.stderr: 4: (OSD::RemoveWQ::_process(std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, ThreadPool::TPHandle&)+0x207) [0x7f363336adb7]
2016-01-16T13:04:00.539 INFO:tasks.ceph.osd.1.target078137.stderr: 5: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> > >::_void_process(void*, ThreadPool::TPHandle&)+0x11a) [0x7f36333c042a]
2016-01-16T13:04:00.539 INFO:tasks.ceph.osd.1.target078137.stderr: 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa76) [0x7f363392b806]
2016-01-16T13:04:00.539 INFO:tasks.ceph.osd.1.target078137.stderr: 7: (ThreadPool::WorkThread::entry()+0x10) [0x7f363392c6d0]
2016-01-16T13:04:00.539 INFO:tasks.ceph.osd.1.target078137.stderr: 8: (()+0x7dc5) [0x7f36319cedc5]
2016-01-16T13:04:00.540 INFO:tasks.ceph.osd.1.target078137.stderr: 9: (clone()+0x6d) [0x7f363027521d]
2016-01-16T13:04:00.540 INFO:tasks.ceph.osd.1.target078137.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Yuri Weinstein over 8 years ago
Updated by Sage Weil about 8 years ago
That last failure included the 13862 partial fix (although it was an upgrade from 0.94.5).
Updated by David Zafman about 8 years ago
- Status changed from 12 to Duplicate
Actually, the jewel upgrade was to sha1 e7107f87151e9674502088bbe2d183ff4528647f which didn't include the 13862 fix.
Please re-run upgrading to current Jewel (currently 4b97cd7bf515aabb4474b223486291029f6644ac) or v10.0.3 or later.
Updated by Dan Mick about 8 years ago
This happened again on the lrc with the same pg. Sam helped me remove the offending object from all the OSDs that went down (about 12 of them) and then all the acting set OSDs.
Updated by Nathan Cutler almost 7 years ago
- Status changed from Duplicate to New
Reopening because this is now happening in jewel (10.2.8 integration branch). Something must have regressed.
Updated by Sage Weil almost 7 years ago
analyzing /a/smithfarm-2017-06-27_15:24:41-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/1332796 ...
- the object is 3:5c2fc0f8:::vpm04318680-33...
- currently in 3.as2
- failed assert because mask_bits = 5 and match = 10 (3.as2), but oid.hash & 31 is 26 because it should be in 3.1as2.
- split happened here:
2017-06-27 15:56:11.521298 7f9d38d03700 15 filestore(/var/lib/ceph/osd/ceph-4) _split_collection 3.as2_head bits: 5
2017-06-27 15:56:11.521335 7f9d38d03700 15 filestore(/var/lib/ceph/osd/ceph-4) collection_stat /var/lib/ceph/osd/ceph-4/current/3.as2_head
2017-06-27 15:56:11.521352 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) collection_stat /var/lib/ceph/osd/ceph-4/current/3.as2_head = 0
2017-06-27 15:56:11.521356 7f9d38d03700 15 filestore(/var/lib/ceph/osd/ceph-4) collection_stat /var/lib/ceph/osd/ceph-4/current/3.1as2_head
2017-06-27 15:56:11.521361 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) collection_stat /var/lib/ceph/osd/ceph-4/current/3.1as2_head = 0
2017-06-27 15:56:11.521369 7f9d38d03700 20 filestore dbobjectmap: seq is 692
2017-06-27 15:56:11.528297 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _set_global_replay_guard: 18301.0.1 done
2017-06-27 15:56:11.528321 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _set_replay_guard 18301.0.1 START
2017-06-27 15:56:11.529441 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _set_replay_guard 18301.0.1 done
2017-06-27 15:56:11.529463 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _set_replay_guard 18301.0.1 START
2017-06-27 15:56:11.530010 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _set_replay_guard 18301.0.1 done
2017-06-27 15:56:11.546513 7f9d38d03700 20 LFNIndex(/var/lib/ceph/osd/ceph-4/current/3.as2_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-4/current/3.as2_head/vpm04318680-15 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051061_25f195e8d3715936d04d_0_long
2017-06-27 15:56:11.549356 7f9d38d03700 20 LFNIndex(/var/lib/ceph/osd/ceph-4/current/3.as2_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-4/current/3.as2_head/vpm04318680-15 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051061_4d64ff7e5f2c878d95d2_0_long
2017-06-27 15:56:11.551752 7f9d38d03700 20 LFNIndex(/var/lib/ceph/osd/ceph-4/current/3.as2_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-4/current/3.as2_head/vpm04318680-15 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051061_89fe26e467ca09df10b5_0_long
2017-06-27 15:56:11.561472 7f9d38d03700 20 LFNIndex(/var/lib/ceph/osd/ceph-4/current/3.as2_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-4/current/3.as2_head/vpm04318680-33 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051061_961acae792ffce686b7c_0_long
2017-06-27 15:56:11.563680 7f9d38d03700 20 LFNIndex(/var/lib/ceph/osd/ceph-4/current/3.as2_head) lfn_unlink removing alt attr from /var/lib/ceph/osd/ceph-4/current/3.as2_head/vpm04318680-33 01234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495969798991001011021031041051061_a71554d0bbce848f9876_0_long
2017-06-27 15:56:11.564846 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _close_replay_guard 18301.0.1
2017-06-27 15:56:11.564860 7f9d38d03700 20 filestore dbobjectmap: seq is 693
2017-06-27 15:56:11.567148 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _close_replay_guard 18301.0.1 done
2017-06-27 15:56:11.567198 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _close_replay_guard 18301.0.1
2017-06-27 15:56:11.567221 7f9d38d03700 20 filestore dbobjectmap: seq is 693
2017-06-27 15:56:11.568748 7f9d38d03700 10 filestore(/var/lib/ceph/osd/ceph-4) _close_replay_guard 18301.0.1 done
but there is no useful debugging to see why the object didn't get moved over. :/
there is another pg splitting in another thread (3.1es1), but I don't see how it could interfere here.
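The mask_bits/match arithmetic in this analysis can be reproduced with a few lines. This is a sketch under stated assumptions, not the SnapMapper code: it models check() as a comparison of the low mask_bits bits of the hash against match, and it assumes that the `5c2fc0f8` in `3:5c2fc0f8:::` is the bit-reversed form of the 32-bit object hash. The pre-split width of 4 bits is also an assumption, inferred from the split log reporting `bits: 5`. All function names are illustrative.

```python
def check(obj_hash: int, mask_bits: int, match: int) -> bool:
    """Simplified model: low `mask_bits` bits of the hash must equal `match`."""
    mask = (1 << mask_bits) - 1
    return (obj_hash & mask) == (match & mask)


def reverse_bits32(x: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    return int(f"{x & 0xFFFFFFFF:032b}"[::-1], 2)


def pg_seed_for(obj_hash: int, bits: int) -> int:
    """PG seed the object maps to when `bits` low bits are significant."""
    return obj_hash & ((1 << bits) - 1)


# Assumption: the printed key is the bit-reversed hash.
PRINTED_KEY = 0x5C2FC0F8
OBJ_HASH = reverse_bits32(PRINTED_KEY)

if __name__ == "__main__":
    print("hash            :", format(OBJ_HASH, "08x"))
    print("hash & 31       :", OBJ_HASH & 31)
    print("seed at 4 bits  :", format(pg_seed_for(OBJ_HASH, 4), "x"))
    print("seed at 5 bits  :", format(pg_seed_for(OBJ_HASH, 5), "x"))
    print("check(5, 10)    :", check(OBJ_HASH, 5, 10))
```

With these assumptions the object matches 3.a at 4 bits but maps to 3.1a at 5 bits, so a check against mask_bits = 5 and match = 10 fails, which agrees with the numbers given in the analysis: the object should have moved to the child during the split.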
Updated by Sage Weil almost 7 years ago
I don't see any recent changes in jewel since 10.2.7, or anything between there and master that looks suspicious. I don't have the exact commit that was tested, though (ca7ab74ae7884f24983d94b729cc262108ff6aba is not in ceph-ci.git).
Updated by Nathan Cutler almost 7 years ago
@Sage Weil: Pushed ca7ab74ae7884f24983d94b729cc262108ff6aba to ceph-ci as "wip-13381"
Updated by Nathan Cutler almost 7 years ago
Re-running the test at http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-30_07:33:11-upgrade:hammer-x-wip-13381-distro-basic-smithi/
Test passed