Bug #45409

closed

rbd-mirror: src/librbd/journal/Replay.cc: 264: FAILED ceph_assert(!m_shut_down)

Added by Mykola Golub almost 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://qa-proxy.ceph.com/teuthology/trociny-2020-05-02_08:16:06-rbd-wip-mgolub-testing-distro-basic-smithi/5016384/

remote/smithi085/log/cluster2-client.mirror.2.43476.log.gz:

2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::InstanceReplayer: 0x55b6d9a7e8c0 release_image: global_image_id=a69573d6-b237-45f5-93ad-f7df82ca42cb
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::InstanceReplayer: 0x55b6d9a7e8c0 stop_image_replayer: 0x55b6debed400 global_image_id=a69573d6-b237-45f5-93ad-f7df82ca42cb, on_finish=0x55b6dedc1280
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] stop: on_finish=0x55b6df3fb740, manual=0, desc=
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] stop: interrupting replay
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] on_stop_journal_replay:
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] cancel_update_mirror_image_replay_status:
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] set_state_description: r=0, desc=
2020-05-03T19:56:44.225+0000 7f88c9d1b700 15 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] update_mirror_image_status: force=1, state=--
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] shut_down: r=0
2020-05-03T19:56:44.225+0000 7f88c9d1b700 15 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] shut_down: waiting for in-flight operations to complete
...
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::image_replayer::journal::Replayer: 0x55b6e1329380 replay_flush:
...
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::ImageReplayer: 0x55b6debed400 [3/a69573d6-b237-45f5-93ad-f7df82ca42cb] shut_down: r=0
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::image_replayer::journal::Replayer: 0x55b6e1329380 shut_down:
2020-05-03T19:56:44.225+0000 7f88c9d1b700 10 rbd::mirror::image_replayer::journal::Replayer: 0x55b6e1329380 shut_down_local_journal_replay:
2020-05-03T19:56:44.225+0000 7f88c9d1b700 -1 /build/ceph-16.0.0-1210-g9d31a602158/src/librbd/journal/Replay.cc: In function 'void librbd::journal::Replay<ImageCtxT>::shut_down(bool, Context*) [with ImageCtxT = librbd::ImageCtx]' thread 7f88c9d1b700 time 2020-05-03T19:56:44.231091+0000
/build/ceph-16.0.0-1210-g9d31a602158/src/librbd/journal/Replay.cc: 264: FAILED ceph_assert(!m_shut_down)

 ceph version 16.0.0-1210-g9d31a602158 (9d31a6021582356af6329c3977f82e0029ee421e) pacific (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x154) [0x7f88d0b82450]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f88d0b8262b]
 3: (librbd::journal::Replay<librbd::ImageCtx>::shut_down(bool, Context*)+0x3f3) [0x55b6d7013063]
 4: (rbd::mirror::image_replayer::journal::Replayer<librbd::ImageCtx>::shut_down_local_journal_replay()+0x6d) [0x55b6d6dc1e9d]
 5: (rbd::mirror::image_replayer::journal::Replayer<librbd::ImageCtx>::shut_down(Context*)+0xd0) [0x55b6d6dc9f10]
 6: (ThreadPool::PointerWQ<Context>::_void_process(void*, ThreadPool::TPHandle&)+0x147) [0x55b6d6cf8df7]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x9fa) [0x7f88d0c33baa]
 8: (ThreadPool::WorkThread::entry()+0x11) [0x7f88d0c34a91]
 9: (()+0x76db) [0x7f88d07016db]
 10: (clone()+0x3f) [0x7f88cf6df88f]

So, it looks like the journal replay shut_down was initiated while replay_flush was still being handled. replay_flush is called from handle_replay_ready to allocate a new local journal tag prior to processing, and it shuts down the journal replay as well, so the local journal replay was apparently shut down a second time and hit the assert.
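
To illustrate the suspected race, here is a minimal C++ sketch of two code paths that both shut down the same replay object, with a guard that lets only the first one through. It is not the fix from the pull request for this bug, and all class and method names (LocalReplay, Replayer, shut_down_local_replay_once, local_shut_down_count) are hypothetical stand-ins rather than librbd or rbd-mirror APIs. It also ignores any restart of the replay after a flush.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the local journal replay object. Like the
// code in the backtrace, it asserts that it is shut down at most once.
class LocalReplay {
public:
  void shut_down() {
    assert(!m_shut_down);  // analogous to ceph_assert(!m_shut_down)
    m_shut_down = true;
  }

private:
  bool m_shut_down = false;
};

// Hypothetical owner with two paths that both want to shut the replay down.
class Replayer {
public:
  // Path 1: flush requested while handling a ready replay entry.
  void replay_flush() { shut_down_local_replay_once(); }

  // Path 2: stop requested from outside.
  void shut_down() { shut_down_local_replay_once(); }

  // Number of times the underlying replay was actually shut down.
  int local_shut_down_count() const {
    std::lock_guard<std::mutex> locker(m_lock);
    return m_local_shut_down_count;
  }

private:
  void shut_down_local_replay_once() {
    std::lock_guard<std::mutex> locker(m_lock);
    if (m_local_replay_shut_down) {
      return;  // the other path got here first, nothing left to do
    }
    m_local_replay_shut_down = true;
    m_local_replay.shut_down();
    ++m_local_shut_down_count;
  }

  mutable std::mutex m_lock;
  LocalReplay m_local_replay;
  bool m_local_replay_shut_down = false;
  int m_local_shut_down_count = 0;
};
```

Calling replay_flush() and then shut_down() on one Replayer reaches LocalReplay::shut_down() only once. Without the guard flag, the second call would trip the assert in the same way as the log above.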


Related issues 1 (0 open, 1 closed)

Copied to rbd - Backport #45581: octopus: rbd-mirror: src/librbd/journal/Replay.cc: 264: FAILED ceph_assert(!m_shut_down) (Rejected)
Actions #1

Updated by Mykola Golub almost 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 34930
Actions #2

Updated by Jason Dillaman almost 4 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #3

Updated by Nathan Cutler almost 4 years ago

  • Copied to Backport #45581: octopus: rbd-mirror: src/librbd/journal/Replay.cc: 264: FAILED ceph_assert(!m_shut_down) added
Actions #4

Updated by Jason Dillaman almost 4 years ago

  • Status changed from Pending Backport to Resolved
  • Backport deleted (octopus)

Dropping octopus backport. It will be fixed via the backport for #45714.
