Bug #49302
Huge amount of RGW crashes in the multisite setup with a backtrace (closed)
% Done:
0%
Source:
Tags:
Multisite, sync, rgw
Backport:
octopus pacific quincy
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Hi,
In my 3-datacenter multisite setup I have 110 RGW crashes altogether, with the following information:
{
  "backtrace": [
    "(()+0x12dd0) [0x7fb9b6defdd0]",
    "(RGWCoroutine::set_sleeping(bool)+0x10) [0x7fb9c1d274d0]",
    "(RGWOmapAppend::flush_pending()+0x4a) [0x7fb9c1d2d3da]",
    "(RGWOmapAppend::finish()+0x14) [0x7fb9c1d2d4d4]",
    "(RGWDataSyncShardCR::stop_spawned_services()+0x2f) [0x7fb9c1c6c9df]",
    "(RGWDataSyncShardCR::incremental_sync()+0x771) [0x7fb9c1c845d1]",
    "(RGWDataSyncShardCR::operate()+0x9d) [0x7fb9c1c87cdd]",
    "(RGWCoroutinesStack::operate(RGWCoroutinesEnv*)+0x67) [0x7fb9c1d27ac7]",
    "(RGWCoroutinesManager::run(std::__cxx11::list<RGWCoroutinesStack*, std::allocator<RGWCoroutinesStack*> >&)+0x271) [0x7fb9c1d288f1]",
    "(RGWCoroutinesManager::run(RGWCoroutine*)+0x8b) [0x7fb9c1d29b5b]",
    "(RGWRemoteDataLog::run_sync(int)+0x1ad) [0x7fb9c1c605bd]",
    "(RGWDataSyncProcessorThread::process()+0x46) [0x7fb9c1df2226]",
    "(RGWRadosThread::Worker::entry()+0x176) [0x7fb9c1dbab86]",
    "(()+0x82de) [0x7fb9b6de52de]",
    "(clone()+0x43) [0x7fb9b54fbe83]"
  ],
  "ceph_version": "15.2.7",
  "crash_id": "2021-02-15T09:44:29.206441Z_ac2988b1-57af-485e-8a76-99e08d017bff",
  "entity_name": "client.rgw.hk-cephmon-2s01.rgw0",
  "os_id": "centos",
  "os_name": "CentOS Linux",
  "os_version": "8 (Core)",
  "os_version_id": "8",
  "process_name": "radosgw",
  "stack_sig": "8f62d50897d7b1b190387523f6d687e60dbef4e6746b430310d721c5a558f3b5",
  "timestamp": "2021-02-15T09:44:29.206441Z",
  "utsname_hostname": "hk-cephmon-2s01",
  "utsname_machine": "x86_64",
  "utsname_release": "4.18.0-193.28.1.el8_2.x86_64",
  "utsname_sysname": "Linux",
  "utsname_version": "#1 SMP Thu Oct 22 00:20:22 UTC 2020"
}
The sync mechanism is suffering. I am not sure whether this is a bug or a setup issue.
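To check whether a large number of crash reports share one cause, the reports can be grouped by `stack_sig` and by the innermost named frame of the backtrace. The following is only a minimal sketch, assuming each report is a JSON object shaped like the one above; the helper names `frame_symbols` and `summarize` are hypothetical and not part of Ceph, and the sample backtrace is shortened for illustration.

```python
import re
from collections import Counter

# Matches frames such as "(RGWOmapAppend::finish()+0x14) [0x7fb9c1d2d4d4]".
FRAME_RE = re.compile(r"^\((?P<symbol>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def frame_symbols(backtrace):
    """Return the symbol of each frame, skipping frames without a name."""
    symbols = []
    for frame in backtrace:
        match = FRAME_RE.match(frame)
        # Unresolved frames appear as "(()+0x12dd0) [...]", i.e. symbol "()".
        if match and match.group("symbol") != "()":
            symbols.append(match.group("symbol"))
    return symbols


def summarize(reports):
    """Count reports per (stack_sig, innermost named frame) pair."""
    counts = Counter()
    for report in reports:
        symbols = frame_symbols(report.get("backtrace", []))
        top = symbols[0] if symbols else "<unknown>"
        counts[(report.get("stack_sig", "<none>"), top)] += 1
    return counts


# Shortened copy of the report from this ticket (first four frames only).
sample = {
    "backtrace": [
        "(()+0x12dd0) [0x7fb9b6defdd0]",
        "(RGWCoroutine::set_sleeping(bool)+0x10) [0x7fb9c1d274d0]",
        "(RGWOmapAppend::flush_pending()+0x4a) [0x7fb9c1d2d3da]",
        "(RGWOmapAppend::finish()+0x14) [0x7fb9c1d2d4d4]",
    ],
    "stack_sig": "8f62d50897d7b1b190387523f6d687e60dbef4e6746b430310d721c5a558f3b5",
}

if __name__ == "__main__":
    for (sig, top), count in summarize([sample]).most_common():
        print(f"{count:4d}  {sig[:12]}  {top}")
```

If nearly all reports fall under a single signature, that points to one code path (here the `RGWOmapAppend::finish()` call made from `RGWDataSyncShardCR::stop_spawned_services()`) rather than to several unrelated problems.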
Updated by Mule Te about 3 years ago
I have the same issue here. Another problem I notice is that old objects are not synced to the secondary zone. :(
Updated by Ist Gab about 3 years ago
Mule Te wrote:
I have the same issue here. Another problem I notice is that old objects are not synced to the secondary zone. :(
Yes, I have that one too.
Updated by Casey Bodley almost 2 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 46007
Updated by Casey Bodley almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to octopus pacific quincy
Updated by Backport Bot almost 2 years ago
- Copied to Backport #55457: pacific: Huge amount of RGW crashes in the multisite setup with a backtrace added
Updated by Backport Bot almost 2 years ago
- Copied to Backport #55458: quincy: Huge amount of RGW crashes in the multisite setup with a backtrace added
Updated by Backport Bot almost 2 years ago
- Copied to Backport #55459: octopus: Huge amount of RGW crashes in the multisite setup with a backtrace added
Updated by Casey Bodley over 1 year ago
- Has duplicate Bug #56920: crash: RGWCoroutinesStack::wakeup() added
Updated by Backport Bot over 1 year ago
- Tags changed from Multisite, sync, rgw to Multisite, sync, rgw backport_processed
Updated by Konstantin Shalygin over 1 year ago
- Status changed from Pending Backport to Resolved
- Tags changed from Multisite, sync, rgw backport_processed to Multisite, sync, rgw