Bug #51652

open

heartbeat timeouts on filestore OSDs while deleting objects in upgrade:pacific-p2p-pacific

Added by Yuri Weinstein almost 3 years ago. Updated 6 months ago.

Status: New
Priority: Low
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):

Description

Run: https://pulpito.ceph.com/teuthology-2021-07-10_14:15:22-upgrade:pacific-p2p-pacific-distro-basic-smithi/
Job: 6262272
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2021-07-10_14:15:22-upgrade:pacific-p2p-pacific-distro-basic-smithi/6262272/teuthology.log

2021-07-10T15:19:21.538+0000 7f157aff9700 20 osd.0 pg_epoch: 1331 pg[8.e( v 1281'2407 (1203'2377,1281'2407] lb MIN local-lis/les=1241/1242 n=126 ec=472/472 lis/c=1241/1241 les/c/f=1242/1242/0 sis=1318) [] r=-1 lpr=1318 DELETING pi=[1241,1318)/1 crt=1281'2407 lcod 1262'2406 mlcod 0'0 unknown NOTIFY mbc={}] do_delete_work deleting 30 objects
2021-07-10T15:19:21.538+0000 7f157aff9700 10 log is not dirty
2021-07-10T15:19:21.538+0000 7f157aff9700 20 osd.0 1331 dispatch_context not up in osdmap
2021-07-10T15:19:21.542+0000 7f157aff9700  1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7f157aff9700' had timed out after 15.000000954s
2021-07-10T15:19:21.542+0000 7f157aff9700  1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7f157aff9700' had suicide timed out after 150.000000000s
2021-07-10T15:19:21.546+0000 7f157aff9700 -1 *** Caught signal (Aborted) **
 in thread 7f157aff9700 thread_name:tp_osd_tp

 ceph version 16.2.5-14-g6872adc9 (6872adc98a8f647027b52e7382347dcce81ecdf6) pacific (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7f159ef103c0]
 2: pthread_kill()
 3: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, std::chrono::time_point<ceph::coarse_mono_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >)+0x48a) [0x5563501729aa]
 4: (ceph::HeartbeatMap::clear_timeout(ceph::heartbeat_handle_d*)+0x63) [0x556350172bc3]
 5: (FileStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x8f0) [0x55634ff230f0]
 6: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x87) [0x55634fabbbe7]
 7: (OSD::dispatch_context(PeeringCtx&, PG*, std::shared_ptr<OSDMap const>, ThreadPool::TPHandle*)+0x1f3) [0x55634fa3f3a3]
 8: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x208) [0x55634fa85e98]
 9: (OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&)+0xc9) [0x55634fa86109]
 10: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x872) [0x55634fa70862]
 11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x403) [0x556350192a63]
 12: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x556350195884]
 13: /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f159ef04609]
 14: clone()
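
For anyone triaging similar runs, the following is a minimal sketch of how the heartbeat_map timeout lines above could be pulled out of an OSD log. It assumes the line layout shown in this report (timestamp, thread id, log level, then the message); the names HEARTBEAT_RE and find_heartbeat_timeouts are hypothetical helpers written for this illustration and are not part of Ceph or teuthology.

```python
import re

# Assumed layout, taken from the log excerpt in this report:
#   <timestamp> <thread id> <level> heartbeat_map <function> '<pool> thread <0x...>' had [suicide ]timed out after <N>s
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<tid>[0-9a-f]+)\s+-?\d+\s+heartbeat_map\s+\w+\s+"
    r"'(?P<pool>\S+) thread (?P<thread>0x[0-9a-f]+)'\s+"
    r"had (?P<suicide>suicide )?timed out after (?P<secs>[0-9.]+)s"
)


def find_heartbeat_timeouts(lines):
    """Return one dict per heartbeat_map timeout line found in `lines`."""
    events = []
    for line in lines:
        m = HEARTBEAT_RE.match(line)
        if m is None:
            continue  # not a heartbeat timeout line
        events.append({
            "timestamp": m.group("ts"),
            "pool": m.group("pool"),
            "thread": m.group("thread"),
            # True when the line reports the suicide timeout rather than the grace timeout
            "suicide": m.group("suicide") is not None,
            "seconds": float(m.group("secs")),
        })
    return events


if __name__ == "__main__":
    sample = [
        "2021-07-10T15:19:21.542+0000 7f157aff9700  1 heartbeat_map clear_timeout "
        "'OSD::osd_op_tp thread 0x7f157aff9700' had timed out after 15.000000954s",
        "2021-07-10T15:19:21.542+0000 7f157aff9700  1 heartbeat_map clear_timeout "
        "'OSD::osd_op_tp thread 0x7f157aff9700' had suicide timed out after 150.000000000s",
    ]
    for event in find_heartbeat_timeouts(sample):
        print(event)
```

Lines wrapped by another logger (for example, with an extra prefix in teuthology.log) would not match the anchored pattern as written; the regular expression would need adjusting for that case.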