Bug #40519

closed

rbd_mirror/ImageSyncThrottler.cc: 61: FAILED ceph_assert(m_queue.empty())

Added by Mykola Golub almost 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A crash was observed in one of the teuthology runs [1]:

2019-06-23T07:57:02.429 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr:/build/ceph-15.0.0-2068-g742df4f/src/tools/rbd_mirror/ImageSyncThrottler.cc: In function 'void rbd::mirror::ImageSyncThrottler<ImageCtxT>::start_op(const string&, Context*) [with ImageCtxT = librbd::ImageCtx; std::string = std::__cxx11::basic_string<char>]' thread 7f5c1aad7700 time 2019-06-23T07:57:02.429701+0000
2019-06-23T07:57:02.429 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr:/build/ceph-15.0.0-2068-g742df4f/src/tools/rbd_mirror/ImageSyncThrottler.cc: 61: FAILED ceph_assert(m_queue.empty())
2019-06-23T07:57:02.430 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: ceph version 15.0.0-2068-g742df4f (742df4fd43b2d170a24e2826fd033e137f79b855) octopus (dev)
2019-06-23T07:57:02.430 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x7f5c2e728037]
2019-06-23T07:57:02.430 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 2: (()+0x2d3221) [0x7f5c2e728221]
2019-06-23T07:57:02.431 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 3: (rbd::mirror::ImageSyncThrottler<librbd::ImageCtx>::start_op(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Context*)+0xa71) [0x55c7eb7b13c1]
2019-06-23T07:57:02.431 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 4: (rbd::mirror::InstanceWatcher<librbd::ImageCtx>::handle_sync_request(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Context*)+0x4ba) [0x55c7eb75b5ea]
2019-06-23T07:57:02.431 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 5: (rbd::mirror::InstanceWatcher<librbd::ImageCtx>::handle_payload(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rbd::mirror::instance_watcher::SyncRequestPayload const&, librbd::Watcher::C_NotifyAck*)+0x65) [0x55c7eb75c9b5]
2019-06-23T07:57:02.431 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 6: (rbd::mirror::InstanceWatcher<librbd::ImageCtx>::handle_notify(unsigned long, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&)+0x475) [0x55c7eb761d85]
2019-06-23T07:57:02.431 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 7: (librbd::Watcher::WatchCtx::handle_notify(unsigned long, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&)+0x5e) [0x55c7eb8c41be]
2019-06-23T07:57:02.431 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 8: (()+0xaac31) [0x7f5c376f6c31]
2019-06-23T07:57:02.432 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 9: (()+0xc355c) [0x7f5c3770f55c]
2019-06-23T07:57:02.432 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 10: (()+0x66119) [0x7f5c376b2119]
2019-06-23T07:57:02.432 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 11: (Finisher::finisher_thread_entry()+0x19d) [0x7f5c2e7bdefd]
2019-06-23T07:57:02.432 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 12: (()+0x76db) [0x7f5c2e23d6db]
2019-06-23T07:57:02.432 INFO:tasks.rbd_mirror.cluster2.client.mirror.2.smithi039.stderr: 13: (clone()+0x3f) [0x7f5c2db5a88f]

I believe the cause is incorrect duplicate handling in `ImageSyncThrottler::start_op`. The function already checks whether a request with the same id is among the in-flight (started) ops, but it does not check whether such a request is already in the queue (waiting).
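
To illustrate the missing check, here is a minimal sketch of a throttler whose `start_op` looks for a duplicate id both among the started ops and among the waiting ones. It is a simplified, hypothetical model and not the actual `ImageSyncThrottler` code: the class `ThrottlerSketch`, its members, the use of `std::function` callbacks instead of `Context*`, and the `kSuperseded` error code are all assumptions made for illustration, and locking is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified model of a sync throttler (NOT the Ceph class).
class ThrottlerSketch {
public:
  using Callback = std::function<void(int)>;
  static constexpr int kStarted = 0;
  static constexpr int kSuperseded = -1;  // illustrative error code (assumption)

  explicit ThrottlerSketch(std::size_t max_ops) : m_max_ops(max_ops) {}

  void start_op(const std::string& id, Callback on_start) {
    if (m_inflight.count(id) > 0) {
      // Duplicate of an already started op: nothing new to schedule.
      on_start(kStarted);
      return;
    }
    auto it = m_waiting.find(id);
    if (it != m_waiting.end()) {
      // Duplicate of an already queued op: keep the single queue entry,
      // let the newest callback wait, and fail the superseded one.
      std::swap(it->second, on_start);
      on_start(kSuperseded);
      return;
    }
    if (m_inflight.size() < m_max_ops) {
      // A free slot implies nobody is waiting; this mirrors the
      // invariant behind the failed assertion.
      assert(m_queue.empty());
      m_inflight.insert(id);
      on_start(kStarted);
      return;
    }
    m_queue.push_back(id);
    m_waiting.emplace(id, std::move(on_start));
  }

  void finish_op(const std::string& id) {
    if (m_inflight.erase(id) == 0 || m_queue.empty()) {
      return;
    }
    // Hand the freed slot to the oldest waiting op.
    std::string next = m_queue.front();
    m_queue.pop_front();
    Callback cb = std::move(m_waiting[next]);
    m_waiting.erase(next);
    m_inflight.insert(next);
    cb(kStarted);
  }

  std::size_t queued() const { return m_queue.size(); }
  std::size_t inflight() const { return m_inflight.size(); }

private:
  std::size_t m_max_ops;
  std::set<std::string> m_inflight;           // started ops
  std::list<std::string> m_queue;             // waiting ops, FIFO order
  std::map<std::string, Callback> m_waiting;  // callbacks of waiting ops
};
```

In this model a repeated request for a waiting id never adds a second queue entry, so each id is started at most once when slots free up and the queue drains in step with the in-flight set.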

[1] http://qa-proxy.ceph.com/teuthology/trociny-2019-06-21_06:39:59-rbd-wip-mgolub-testing-distro-basic-smithi/4055998/teuthology.log


Related issues 3 (0 open, 3 closed)

Copied to rbd - Backport #40592: luminous: rbd_mirror/ImageSyncThrottler.cc: 61: FAILED ceph_assert(m_queue.empty()) (Resolved, Mykola Golub)
Copied to rbd - Backport #40593: mimic: rbd_mirror/ImageSyncThrottler.cc: 61: FAILED ceph_assert(m_queue.empty()) (Resolved, Mykola Golub)
Copied to rbd - Backport #40594: nautilus: rbd_mirror/ImageSyncThrottler.cc: 61: FAILED ceph_assert(m_queue.empty()) (Resolved, Mykola Golub)