Bug #22080

radosgw-admin data sync run crashes

Added by Abhishek Lekshmanan almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 11/08/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Seen on a cluster with a Jewel master zone and a Luminous secondary zone. It may be reproducible on a Luminous-Luminous cluster as well, but I haven't tried.

$ radosgw-admin data sync run --source-zone=az1
 ceph version 12.2.1-367-g40d92ddf14 (40d92ddf1435ebeea6d9c17464367ef9ad332f0e) luminous (stable)
 1: (ceph::BackTrace::BackTrace(int)+0x48) [0x55d8caafb994]
 2: (()+0x9f6b2b) [0x55d8caafab2b]
 3: (()+0x10b10) [0x7fdfaa0d6b10]
 4: (RGWDataSyncCR::operate()+0x1db) [0x55d8ca81fff9]
 5: (RGWCoroutinesStack::operate(RGWCoroutinesEnv*)+0x15b) [0x55d8ca85dced]
 6: (RGWCoroutinesManager::run(std::list<RGWCoroutinesStack*, std::allocator<RGWCoroutinesStack*> >&)+0x1fd) [0x55d8ca85f51f]
 7: (RGWCoroutinesManager::run(RGWCoroutine*)+0x9b) [0x55d8ca860847]
 8: (RGWRemoteDataLog::run_sync(int)+0xc1) [0x55d8ca802615]
 9: (RGWDataSyncStatusManager::run()+0x28) [0x55d8ca6dbeb8]
 10: (main()+0x17ae5) [0x55d8ca6c0579]
 11: (__libc_start_main()+0xf5) [0x7fdf9ddfa6e5]
 12: (_start()+0x29) [0x55d8ca69c7f9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Running this through gdb:

1456          data_sync_module = sync_env->sync_module->get_data_handler();
Missing separate debuginfos, use: zypper install krb5-debuginfo-1.12.5-9.1.x86_64 libblkid1-debuginfo-2.29.2-3.4.x86_64 libcurl4-debuginfo-7.37.0-23.1.x86_64 libexpat1-debuginfo-2.1.0-24.1.x86_64 libfreebl3-debuginfo-3.28.6-44.1.x86_64 libibverbs1-debuginfo-14-6.4.x86_64 libkeyutils1-debuginfo-1.5.9-7.13.x86_64 libldap-2_4-2-debuginfo-2.4.44-18.1.x86_64 libopenssl1_0_0-debuginfo-1.0.2j-10.1.x86_64 libpcre1-debuginfo-8.39-11.1.x86_64 libselinux1-debuginfo-2.5-4.17.x86_64 libsoftokn3-debuginfo-3.28.6-44.1.x86_64 liburcu0-debuginfo-debuginfo-0.8.8-5.3.x86_64 liburcu2-debuginfo-debuginfo-0.8.7-4.1.x86_64 libuuid1-debuginfo-2.29.2-3.4.x86_64 libz1-debuginfo-1.2.8-13.15.x86_64 mozilla-nspr-debuginfo-4.17-1.1.x86_64 mozilla-nss-debuginfo-3.33-2.1.x86_64
(gdb) bt
#0  0x0000555555c6fff9 in RGWDataSyncCR::operate (this=0x5555566037b0) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_data_sync.cc:1456
#1  0x0000555555cadced in RGWCoroutinesStack::operate (this=0x555556617a10, _env=0x7fffffff97b0) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_coroutine.cc:195
#2  0x0000555555caf51f in RGWCoroutinesManager::run (this=0x7fffffffb0a8, stacks=...) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_coroutine.cc:485
#3  0x0000555555cb0847 in RGWCoroutinesManager::run (this=0x7fffffffb0a8, op=0x5555566031c0) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_coroutine.cc:624
#4  0x0000555555c52615 in RGWRemoteDataLog::run_sync (this=0x7fffffffb0a8, num_shards=128) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_data_sync.cc:1643
#5  0x0000555555b2beb8 in RGWDataSyncStatusManager::run (this=0x7fffffffb050) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_data_sync.h:320
#6  0x0000555555b10579 in main (argc=7, argv=0x7fffffffda48) at /ssd/builds/cpp/ceph_cmake_new/src/rgw/rgw_admin.cc:6438

Line 1456 dereferences sync_env->sync_module, so a null sync_module would explain the crash. Possibly we don't initialize the sync-module instance in radosgw-admin (sync_modules_manager->create_instance in rados init)?
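
To illustrate the suspected failure mode at rgw_data_sync.cc:1456, here is a minimal standalone sketch of what happens when a sync-module pointer is never populated, and how a check before the dereference would turn the crash into an error return. The types and the get_handler_checked helper are hypothetical stand-ins, not the real RGW classes, and this is not the fix that was merged for this bug; only the member names sync_module and get_data_handler are taken from the gdb output above.

```cpp
#include <memory>

// Hypothetical stand-ins for the RGW types; only the member names
// sync_module and get_data_handler come from the gdb output.
struct DataHandler {};

struct SyncModuleInstance {
  DataHandler handler;
  DataHandler* get_data_handler() { return &handler; }
};

struct SyncEnv {
  // Stays null if the caller never creates a sync-module instance
  // (the suspected situation in radosgw-admin).
  std::shared_ptr<SyncModuleInstance> sync_module;
};

// Hypothetical helper: returns 0 and fills *out on success, or -1 when
// the sync module was never set up, instead of dereferencing null.
int get_handler_checked(const SyncEnv& env, DataHandler** out) {
  if (!env.sync_module) {
    return -1;
  }
  *out = env.sync_module->get_data_handler();
  return 0;
}
```

With a default-constructed SyncEnv the helper returns -1. An unchecked env.sync_module->get_data_handler() on the same object would dereference a null pointer, which would be consistent with the crash in RGWDataSyncCR::operate() if the hypothesis above holds.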


Related issues

Copied to rgw - Backport #23180: luminous: radosgw-admin data sync run crashes Resolved

History

#1 Updated by Abhishek Lekshmanan almost 2 years ago

  • Assignee set to Yehuda Sadeh

#2 Updated by Abhishek Lekshmanan almost 2 years ago

Seeing this in a Luminous-Luminous cluster as well.

#3 Updated by Kefu Chai almost 2 years ago

  • Status changed from New to Need Review

#4 Updated by Yehuda Sadeh almost 2 years ago

  • Status changed from Need Review to Testing

#5 Updated by Casey Bodley over 1 year ago

  • Status changed from Testing to Pending Backport
  • Backport set to luminous

#6 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #23180: luminous: radosgw-admin data sync run crashes added

#7 Updated by Nathan Cutler over 1 year ago

  • Status changed from Pending Backport to Resolved
