Project

General

Profile

Actions

Bug #54397

open

RadosGW Crash on Start

Added by Christian Kugler about 2 years ago. Updated about 1 year ago.

Status:
Triaged
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
startup services
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):

252a97ceebcd30e1396dbdf2f6872a71c40bcfa4a58e684a1ddcb07a1248237c
5ba32bd2dae20e6377ebba271c5bfaaef8d2f7fdce2467c88d25c2bf64fd7fca
89af7ee16dcaec4566fbb4c4c6c79d25d9f81901435eb76eacb3c7ee8e92064f


Description

Some of our gateways crash if restartet via systemctl restart ceph-radosgw@rgw07.service. After trying several times they started again. Restarting it again can work but not always.
I found a similar sounding crash: https://tracker.ceph.com/issues/52744

2022-02-24T10:13:36.378+0100 7f8ff0951700 -1 received  signal: Terminated from /sbin/init  (PID: 1) UID: 0
2022-02-24T10:13:36.378+0100 7f8ff0951700  1 handle_sigterm
2022-02-24T10:13:36.378+0100 7f8ff0951700  1 handle_sigterm set alarm for 120
2022-02-24T10:13:36.378+0100 7f8ff452c540 -1 shutting down
2022-02-24T10:15:06.514+0100 7ff2c58a1540  0 deferred set uid:gid to 64045:64045 (ceph:ceph)
2022-02-24T10:15:06.514+0100 7ff2c58a1540  0 ceph version 15.2.15 (efc1e7ef13525f02d58d97ccd62fec325323c618) octopus (stable), process radosgw, pid 990334
2022-02-24T10:15:06.514+0100 7ff2c58a1540  0 framework: beast
2022-02-24T10:15:06.514+0100 7ff2c58a1540  0 framework conf key: port, val: 7480
2022-02-24T10:15:06.514+0100 7ff2c58a1540  1 radosgw_Main not setting numa affinity
2022-02-24T10:15:16.754+0100 7ff2c58a1540  1 int RGWSI_Notify::robust_notify(RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):394 Notify failed on object .log:meta.history: (110) Connection timed out
2022-02-24T10:15:16.754+0100 7ff2c58a1540  1 int RGWSI_Notify::robust_notify(RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):397 Backtrace: :  ceph version 15.2.15 (efc1e7ef13525f02d58d97ccd62fec325323c618) octopus (stable)
 1: (RGWSI_Notify::distribute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWCacheNotifyInfo const&, optional_yield)+0xa7) [0x7ff2d0846fc7]
 2: (RGWSI_SysObj_Cache::distribute_cache(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw_raw_obj const&, ObjectCacheInfo&, int, optional_yield)+0x378) [0x7ff2d08536f8]
 3: (RGWSI_SysObj_Cache::write(rgw_raw_obj const&, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >&, bool, ceph::buffer::v15_2_0::list const&, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, optional_yield)+0x39c) [0x7ff2d0855bec]
 4: (RGWSI_SysObj::Obj::WOp::write(ceph::buffer::v15_2_0::list&, optional_yield)+0x3a) [0x7ff2d03c18fa]
 5: (rgw_put_system_obj(RGWSysObjectCtx&, rgw_pool const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::v15_2_0::list&, bool, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, optional_yield, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >*)+0x11f) [0x7ff2d07770df]
 6: (rgw_put_system_obj(RGWSysObjectCtx&, rgw_pool const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::v15_2_0::list&, bool, RGWObjVersionTracker*, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >*)+0x19) [0x7ff2d0777169]
 7: (RGWSI_MDLog::write_history(RGWMetadataLogHistory const&, RGWObjVersionTracker*, bool)+0x139) [0x7ff2d03bace9]
 8: (RGWSI_MDLog::init_oldest_log_period()+0x6c5) [0x7ff2d03bd825]
 9: (RGWSI_MDLog::do_start()+0xec) [0x7ff2d03bdedc]
 10: (RGWServiceInstance::start()+0x1a) [0x7ff2d03e5c7a]
 11: (RGWServices_Def::init(ceph::common::CephContext*, bool, bool, bool)+0xab3) [0x7ff2d03e76f3]
 12: (RGWServices::do_init(ceph::common::CephContext*, bool, bool, bool)+0x1a) [0x7ff2d03e915a]
 13: (RGWRados::initialize()+0xed) [0x7ff2d06daf8d]
 14: (RGWStoreManager::init_storage_provider(ceph::common::CephContext*, bool, bool, bool, bool, bool, bool)+0x9f) [0x7ff2d076ea7f]
 15: (radosgw_Main(int, char const**)+0xdde) [0x7ff2d0389cee]
 16: (__libc_start_main()+0xea) [0x7ff2cfc9dd0a]
 17: (_start()+0x2a) [0x55aab100509a]

2022-02-24T10:15:16.754+0100 7ff2c58a1540  1 int RGWSI_Notify::robust_notify(RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):413 Invalidating obj=.log:meta.history tries=0
2022-02-24T10:15:26.758+0100 7ff2c58a1540  1 int RGWSI_Notify::robust_notify(RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):418 invalidation attempt 0 failed: (110) Connection timed out
2022-02-24T10:15:26.758+0100 7ff2c58a1540  1 int RGWSI_Notify::robust_notify(RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yield):413 Invalidating obj=.log:meta.history tries=1

Files

rgw logs.rtf (141 KB) rgw logs.rtf Peter Razumovsky, 02/14/2023 03:15 PM

Related issues 8 (1 open7 closed)

Related to rgw - Bug #51911: crash: RGWSI_Notify::unwatchNew

Actions
Has duplicate rgw - Bug #54739: crash: RGWSI_Notify::unwatch(RGWSI_RADOS::Obj&, unsigned long)Duplicate

Actions
Has duplicate rgw - Bug #54722: crash: virtual ceph::common::RefCountedObject::~RefCountedObject(): assert(nref == 0)Duplicate

Actions
Has duplicate rgw - Bug #54719: crash: RGWSI_Notify::unwatch(RGWSI_RADOS::Obj&, unsigned long)Duplicate

Actions
Has duplicate rgw - Bug #54699: crash: RGWSI_Notify::finalize_watch()Duplicate

Actions
Has duplicate rgw - Bug #54694: crash: virtual ceph::common::RefCountedObject::~RefCountedObject(): assert(nref == 0)Duplicate

Actions
Has duplicate rgw - Bug #54683: crash: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)Duplicate

Actions
Has duplicate rgw - Bug #54637: crash: virtual ceph::common::RefCountedObject::~RefCountedObject(): assert(nref == 0)Duplicate

Actions
Actions

Also available in: Atom PDF