Backport #21097

Updated by Abhishek Lekshmanan about 2 years ago

https://github.com/ceph/ceph/pull/17234

<pre>
2017-08-03 15:19:13.189356 7f05bcff9700 20 cr:s=0x7f053c004a30:op=0x7f053c0f3d90:21RGWReadMDLogEntriesCR: operate()
2017-08-03 15:19:13.189370 7f05bcff9700 20 cr:s=0x7f053c004a30:op=0x7f053c02f0c0:18RGWMetaSyncShardCR: operate()
2017-08-03 15:19:13.189378 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787951.872747_56.1:bucket.instance:haixjc-1:b2f89ca6-a10a-425c-ad79-82ba886bd6fe.4109.1:2017-08-03 15:19:11.872747
2017-08-03 15:19:13.189418 7f05bcff9700 20 cr:s=0x7f053c083170:op=0x7f053c0f3d90:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189423 7f05bcff9700 20 meta sync: skipping pending operation
2017-08-03 15:19:13.189440 7f05bcff9700 20 cr:s=0x7f053c004a30:op=0x7f053c02f0c0:18RGWMetaSyncShardCR: operate()
2017-08-03 15:19:13.189454 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.051313_57.1:bucket.instance:haixjc-1:b2f89ca6-a10a-425c-ad79-82ba886bd6fe.4109.1:2017-08-03 15:19:12.051313
2017-08-03 15:19:13.189457 7f05bcff9700 0 meta sync: ERROR: cannot start syncing 1_1501787952.051313_57.1. Duplicate entry?
2017-08-03 15:19:13.189460 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.111111_58.1:bucket:haixjc-1:2017-08-03 15:19:12.111111
2017-08-03 15:19:13.189480 7f05bcff9700 20 cr:s=0x7f053c083170:op=0x7f053c0f3d90:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189487 7f05bcff9700 20 run: stack=0x7f053c083170 is done
2017-08-03 15:19:13.189518 7f05bcff9700 20 cr:s=0x7f053c071a20:op=0x7f053c0a4450:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189521 7f05bcff9700 20 meta sync: skipping pending operation
2017-08-03 15:19:13.189529 7f05bcff9700 20 cr:s=0x7f053c004a30:op=0x7f053c02f0c0:18RGWMetaSyncShardCR: operate()
2017-08-03 15:19:13.189535 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.263821_59.1:bucket:haixjc-1:2017-08-03 15:19:12.263821
2017-08-03 15:19:13.189539 7f05bcff9700 0 meta sync: ERROR: cannot start syncing 1_1501787952.263821_59.1. Duplicate entry?
2017-08-03 15:19:13.189548 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.473367_60.1:bucket.instance:haixjc-2:b2f89ca6-a10a-425c-ad79-82ba886bd6fe.4109.2:2017-08-03 15:19:12.473367
2017-08-03 15:19:13.189572 7f05bcff9700 20 cr:s=0x7f053c071a20:op=0x7f053c0a4450:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189580 7f05bcff9700 20 run: stack=0x7f053c071a20 is done
2017-08-03 15:19:13.189585 7f05bcff9700 20 cr:s=0x7f053c0726e0:op=0x7f053c0f3d90:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189587 7f05bcff9700 20 meta sync: skipping pending operation
2017-08-03 15:19:13.189593 7f05bcff9700 20 cr:s=0x7f053c004a30:op=0x7f053c02f0c0:18RGWMetaSyncShardCR: operate()
2017-08-03 15:19:13.189598 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.611686_61.1:bucket.instance:haixjc-2:b2f89ca6-a10a-425c-ad79-82ba886bd6fe.4109.2:2017-08-03 15:19:12.611686
2017-08-03 15:19:13.189602 7f05bcff9700 0 meta sync: ERROR: cannot start syncing 1_1501787952.611686_61.1. Duplicate entry?
2017-08-03 15:19:13.189604 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.660282_62.1:bucket:haixjc-2:2017-08-03 15:19:12.660282
2017-08-03 15:19:13.189624 7f05bcff9700 20 cr:s=0x7f053c0726e0:op=0x7f053c0f3d90:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189630 7f05bcff9700 20 run: stack=0x7f053c0726e0 is done
2017-08-03 15:19:13.189636 7f05bcff9700 20 cr:s=0x7f053c068790:op=0x7f053c0a45c0:24RGWMetaSyncSingleEntryCR: operate()
2017-08-03 15:19:13.189637 7f05bcff9700 20 meta sync: skipping pending operation
2017-08-03 15:19:13.189643 7f05bcff9700 20 cr:s=0x7f053c004a30:op=0x7f053c02f0c0:18RGWMetaSyncShardCR: operate()
2017-08-03 15:19:13.189648 7f05bcff9700 20 meta sync: incremental_sync:1665: shard_id=0 log_entry: 1_1501787952.787951_65.1:bucket:haixjc-2:2017-08-03 15:19:12.787951
2017-08-03 15:19:13.189651 7f05bcff9700 0 meta sync: ERROR: cannot start syncing 1_1501787952.787951_65.1. Duplicate entry?
2017-08-03 15:19:13.189655 7f05bcff9700 4 meta sync: cr:s=0x7f053c004a30:op=0x7f053c02f0c0:18RGWMetaSyncShardCR: adjusting marker pos=1_1501787951.872747_56.1
2017-08-03 15:19:13.196648 7f05bcff9700 -1 /home/cbodley/ceph/src/rgw/rgw_sync.cc: In function 'void RGWMetaSyncShardCR::collect_children()' thread 7f05bcff9700 time 2017-08-03 15:19:13.189662
/home/cbodley/ceph/src/rgw/rgw_sync.cc: 1398: FAILED assert(prev_iter != pos_to_prev.end())

ceph version Development (no_version) luminous (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x137) [0x7f0630aebbb2]
2: (RGWMetaSyncShardCR::collect_children()+0x303) [0x5569581341bb]
3: (RGWMetaSyncShardCR::incremental_sync()+0x2862) [0x556958138dea]
4: (RGWMetaSyncShardCR::operate()+0x1ee) [0x556958133cd6]
5: (RGWCoroutinesStack::operate(RGWCoroutinesEnv*)+0x191) [0x556957e7af39]
6: (RGWCoroutinesManager::run(std::__cxx11::list<RGWCoroutinesStack*, std::allocator<RGWCoroutinesStack*> >&)+0x290) [0x556957e7cb04]
7: (RGWCoroutinesManager::run(RGWCoroutine*)+0xbc) [0x556957e7deb0]
8: (RGWRemoteMetaLog::run_sync()+0x175e) [0x55695812045c]
9: (RGWMetaSyncStatusManager::run()+0x1c) [0x556957f7470a]
10: (RGWMetaSyncProcessorThread::process()+0x1c) [0x556957f76f06]
11: (RGWRadosThread::Worker::entry()+0xe8) [0x556957f10dd2]
12: (Thread::entry_wrapper()+0xc1) [0x7f0630f372a3]
13: (Thread::_entry_func(void*)+0x18) [0x7f0630f371d8]
14: (()+0x773a) [0x7f0639b5173a]
15: (clone()+0x3f) [0x7f062cf83e0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</pre>

Further debugging shows that RGWMetaSyncShardCR is looping over the same entries a second time. The marker_tracker detects some of those duplicates (see the "Duplicate entry?" messages), but not others.
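
One possible reading of why only some duplicates are caught, sketched below under stated assumptions: if a tracker remembers only entries that are still in flight, then an entry that already finished looks new when the shard loops over it again, while an entry that is still pending is rejected. `PendingTracker`, `replay_second_pass` and the marker strings are hypothetical names for illustration only. This is not the actual marker tracker code from `rgw_sync.cc`, and the ticket does not confirm that this is the mechanism behind the assert.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified tracker (not the real RGW class):
// it remembers only the markers whose sync operation is still in flight.
class PendingTracker {
  std::map<std::string, int> pending;  // marker -> order in which it was started
  int next_seq = 0;

public:
  // Returns false when the marker is already in flight, which is the
  // situation that would be reported as "Duplicate entry?".
  bool start(const std::string& marker) {
    if (pending.count(marker) > 0) {
      return false;
    }
    pending[marker] = next_seq++;
    return true;
  }

  // Forgets the marker once its sync operation has completed.
  void finish(const std::string& marker) {
    pending.erase(marker);
  }
};

// Starts two entries, lets only the first one finish, then replays both
// as a second pass over the same log entries would.
// Returns whether each replayed entry was accepted by start().
std::vector<bool> replay_second_pass() {
  PendingTracker tracker;
  tracker.start("entry_a");
  tracker.start("entry_b");
  tracker.finish("entry_a");  // entry_a completes before the second pass

  bool a_accepted = tracker.start("entry_a");  // no longer pending: accepted again
  bool b_accepted = tracker.start("entry_b");  // still pending: rejected
  return {a_accepted, b_accepted};
}
```

In this sketch the replayed `entry_a` is accepted a second time while `entry_b` is rejected, which is the same mixed pattern of detected and undetected duplicates described above. Whether the real tracker behaves this way would need to be checked against the source.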
