Bug #44161

closed

rbd_mirror/NamespaceReplayer.cc: 248: FAILED ceph_assert(m_image_map)

Added by Mykola Golub about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://qa-proxy.ceph.com/teuthology/trociny-2020-02-15_12:15:24-rbd-wip-mgolub-testing-distro-basic-smithi/4767054/teuthology.log

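If a `PoolReplayer` receives an "instance added" notification while it is activating a new `NamespaceReplayer`, at the point where the `NamespaceReplayer` has been initialized but has not yet acquired the leader, the `PoolReplayer` delivers the notification to this "partially activated" `NamespaceReplayer`. Its `m_image_map` is not set yet, so `handle_instances_added` hits the assertion failure. In the log below, `NamespaceReplayer` 0x565507ede820 completes `handle_init_instance_watcher` at 13:24:21.567 and receives `handle_instances_added` at 13:24:21.571.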

2020-02-15T13:24:21.567+0000 7f308e190700 10 rbd::mirror::InstanceWatcher: 0x565507f24380 handle_acquire_lock: r=0
2020-02-15T13:24:21.567+0000 7f308e190700 10 rbd::mirror::NamespaceReplayer: 0x565507ede820 handle_init_instance_watcher: r=0
2020-02-15T13:24:21.567+0000 7f308e190700 20 rbd::mirror::ServiceDaemon: 0x5655059d2a30 add_namespace: pool_id=2, namespace=ns1
2020-02-15T13:24:21.567+0000 7f3081176700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 handle_notify: notify_id=506806141388, handle=94923192575488, notifier_id=5470
2020-02-15T13:24:21.567+0000 7f3081176700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 handle_notify: our own notification, ignoring
2020-02-15T13:24:21.567+0000 7f308e190700 10 rbd::mirror::InstanceWatcher: 0x565507f24700 handle_register_watch: r=0
2020-02-15T13:24:21.567+0000 7f308e190700 10 rbd::mirror::InstanceWatcher: 0x565507f24700 acquire_lock: 
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 handle_notify_heartbeat: r=0
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 is_leader: 1
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 handle_notify_heartbeat: 4 acks received, 0 timed out
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::Instances: 0x565507315720 acked: instance_ids=[5470,5473,5483,5487]
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 schedule_timer_task: scheduling heartbeat after 5 sec (task 0x5655096e1e30)
2020-02-15T13:24:21.571+0000 7f308e190700  5 rbd::mirror::Instances: 0x565507315720 handle_acked: instance_ids=[5470,5473,5483,5487]
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::Instances: 0x565507315720 cancel_remove_task: 
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::Instances: 0x565507315720 schedule_remove_task: 
2020-02-15T13:24:21.571+0000 7f308e190700  5 rbd::mirror::Instances: 0x565507315720 notify_instances_added: instance_ids=[5483,5487]
2020-02-15T13:24:21.571+0000 7f308e190700  5 rbd::mirror::PoolReplayer: 0x565505a05000 handle_instances_added: instance_ids=[5483,5487]
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::LeaderWatcher: 0x5655072cd900 is_leader: 1
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::NamespaceReplayer: 0x5655073329c0 handle_instances_added: instance_ids=[5483,5487]
2020-02-15T13:24:21.571+0000 7f308e190700 10 rbd::mirror::NamespaceReplayer: 0x565507ede820 handle_instances_added: instance_ids=[5483,5487]
2020-02-15T13:24:21.571+0000 7f308e190700 -1 /build/ceph-15.1.0-765-g962e0c8/src/tools/rbd_mirror/NamespaceReplayer.cc: In function 'void rbd::mirror::NamespaceReplayer<ImageCtxT>::handle_instances_added(const std::vector<std::__cxx11::basic_string<char> >&) [with ImageCtxT = librbd::ImageCtx]' thread 7f308e190700 time 2020-02-15T13:24:21.574901+0000
/build/ceph-15.1.0-765-g962e0c8/src/tools/rbd_mirror/NamespaceReplayer.cc: 248: FAILED ceph_assert(m_image_map)

 ceph version 15.1.0-765-g962e0c8 (962e0c808320b33b5f7cbc3839c78e569d32b94e) octopus (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x154) [0x7f3093804aaa]
 2: (()+0x279c82) [0x7f3093804c82]
 3: (rbd::mirror::NamespaceReplayer<librbd::ImageCtx>::handle_instances_added(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)+0x247) [0x565504561cd7]
 4: (rbd::mirror::PoolReplayer<librbd::ImageCtx>::handle_instances_added(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)+0x27c) [0x56550451029c]
 5: (rbd::mirror::Instances<librbd::ImageCtx>::notify_instances_added(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)+0x13d) [0x56550459c7ad]
 6: (rbd::mirror::Instances<librbd::ImageCtx>::C_NotifyBase::finish(int)+0xa) [0x56550459936a]
 7: (ThreadPool::PointerWQ<Context>::_void_process(void*, ThreadPool::TPHandle&)+0x148) [0x565504527808]
 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xdab) [0x7f30938ec0bb]
 9: (ThreadPool::WorkThread::entry()+0x11) [0x7f30938ec881]
 10: (()+0x76db) [0x7f30933736db]
 11: (clone()+0x3f) [0x7f309235188f]
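
The race described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the rbd-mirror code and not the change made in the linked pull request: `ImageMapSketch`, `NamespaceReplayerSketch`, `PoolReplayerSketch`, `is_ready()` and `instance_count()` are hypothetical names introduced here. It models a namespace replayer whose image map exists only after the leader is acquired, and one possible way for the pool replayer to skip replayers that are still partially activated.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the image map, which exists only on the leader.
struct ImageMapSketch {
  std::vector<std::string> instances;
  void update_instances_added(const std::vector<std::string>& ids) {
    instances.insert(instances.end(), ids.begin(), ids.end());
  }
};

// Hypothetical, simplified model of a namespace replayer.
class NamespaceReplayerSketch {
public:
  // "Initialized, but leader not acquired yet": no image map at this point.
  void init() { m_initialized = true; }

  // Leader acquisition creates the image map.
  void handle_acquire_leader() {
    assert(m_initialized);
    m_image_map = std::make_unique<ImageMapSketch>();
  }

  // True only once the image map exists (assumed readiness check).
  bool is_ready() const { return m_image_map != nullptr; }

  // Same precondition as the failed assertion in the report.
  void handle_instances_added(const std::vector<std::string>& ids) {
    assert(m_image_map);
    m_image_map->update_instances_added(ids);
  }

  std::size_t instance_count() const {
    return m_image_map ? m_image_map->instances.size() : 0;
  }

private:
  bool m_initialized = false;
  std::unique_ptr<ImageMapSketch> m_image_map;
};

// Hypothetical pool replayer that forwards notifications per namespace.
class PoolReplayerSketch {
public:
  // Starts activating a namespace replayer; the leader is acquired later.
  NamespaceReplayerSketch& add_namespace(const std::string& name) {
    NamespaceReplayerSketch& replayer = m_replayers[name];
    replayer.init();
    return replayer;
  }

  // Forwards only to fully activated replayers; returns how many got it.
  // Forwarding unconditionally would trip the assert in the partially
  // activated replayer, which is the failure mode in this report.
  std::size_t handle_instances_added(const std::vector<std::string>& ids) {
    std::size_t delivered = 0;
    for (auto& kv : m_replayers) {
      if (!kv.second.is_ready()) {
        continue;  // still partially activated, skip it
      }
      kv.second.handle_instances_added(ids);
      ++delivered;
    }
    return delivered;
  }

private:
  std::map<std::string, NamespaceReplayerSketch> m_replayers;
};
```

In this model, a pool with one fully activated namespace and one that has only been initialized (like `ns1` in the log) delivers the notification to the first and skips the second. Whether skipped notifications need to be replayed after the leader is acquired is outside the scope of this sketch.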

Related issues 1 (0 open, 1 closed)

Copied to rbd - Backport #44261: nautilus: rbd_mirror/NamespaceReplayer.cc: 248: FAILED ceph_assert(m_image_map) (Status: Rejected, Assignee: Mykola Golub)
Actions #1

Updated by Mykola Golub about 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 33368
Actions #2

Updated by Jason Dillaman about 4 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #3

Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #44261: nautilus: rbd_mirror/NamespaceReplayer.cc: 248: FAILED ceph_assert(m_image_map) added
Actions #4

Updated by Nathan Cutler about 4 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
