Bug #58172

get_rollback_snap_id throws bad variant access

Added by Christopher Hoffman about 2 months ago. Updated about 2 months ago.

Status:
Duplicate
Priority:
Normal
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

get_rollback_snap_id() throws std::bad_variant_access and dumps core when an image is force-promoted. Relevant frames from the backtrace:

#14 0x00007f53848c4a55 in std::__throw_bad_variant_access (__what=__what@entry=0x7f5384d26040 "std::get: wrong index for variant") at /usr/include/c++/12/variant:1324
#15 0x00007f53848c4a76 in std::__throw_bad_variant_access (__valueless=<optimized out>) at /usr/include/c++/12/variant:1332
#16 0x00007f5384a52097 in std::get<3ul, cls::rbd::UserSnapshotNamespace, cls::rbd::GroupSnapshotNamespace, cls::rbd::TrashSnapshotNamespace, cls::rbd::MirrorSnapshotNamespace, cls::rbd::UnknownSnapshotNamespace> (__v=...)
    at /usr/include/c++/12/variant:1689
#17 std::get<cls::rbd::MirrorSnapshotNamespace, cls::rbd::UserSnapshotNamespace, cls::rbd::GroupSnapshotNamespace, cls::rbd::TrashSnapshotNamespace, cls::rbd::MirrorSnapshotNamespace, cls::rbd::UnknownSnapshotNamespace> (__v=...)
    at /usr/include/c++/12/variant:1127
#18 librbd::mirror::snapshot::util::(anonymous namespace)::get_rollback_snap_id (it=..., end=..., rollback_snap_id=rollback_snap_id@entry=0x55619ffc96c0, cct=cct@entry=0x55619ff68940)
    at /sdb/choffman/code/fork/ceph/src/librbd/mirror/snapshot/Utils.cc:31
#19 0x00007f5384a5285b in librbd::mirror::snapshot::util::can_create_primary_snapshot<librbd::ImageCtx> (image_ctx=0x55619ffcef60, demoted=demoted@entry=false, force=force@entry=true, 
    requires_orphan=requires_orphan@entry=0x7f5379cae3af, rollback_snap_id=rollback_snap_id@entry=0x55619ffc96c0) at /sdb/choffman/code/fork/ceph/src/librbd/mirror/snapshot/Utils.cc:117
#20 0x00007f5384a4d86e in librbd::mirror::snapshot::PromoteRequest<librbd::ImageCtx>::send (this=0x55619ffc9690) at /sdb/choffman/code/fork/ceph/src/librbd/mirror/snapshot/PromoteRequest.cc:37
#21 0x00007f5384a463b8 in librbd::mirror::PromoteRequest<librbd::ImageCtx>::promote (this=this@entry=0x7f533c0063f0) at /sdb/choffman/code/fork/ceph/src/librbd/mirror/PromoteRequest.cc:83
#22 0x00007f5384a46543 in librbd::mirror::PromoteRequest<librbd::ImageCtx>::handle_get_info (this=0x7f533c0063f0, r=0) at /sdb/choffman/code/fork/ceph/src/librbd/mirror/PromoteRequest.cc:68

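The offending line in get_rollback_snap_id() (Utils.cc:31) calls std::get without first checking which alternative the variant holds: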

    auto mirror_ns = std::get<cls::rbd::MirrorSnapshotNamespace>(
      it->second.snap_namespace);
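
For illustration, the failure mode can be reproduced outside Ceph. The sketch below is not the Ceph code: the five structs are hypothetical, empty stand-ins for the cls::rbd namespace types named in frame #16, and the helpers unchecked_get_throws() and as_mirror() are invented for this example. It shows that std::get<T> throws std::bad_variant_access when the variant holds another alternative (for instance a user snapshot such as the snap0 created in the reproduction steps), while std::get_if<T> returns a null pointer instead.

```cpp
#include <variant>

// Hypothetical stand-ins for the cls::rbd snapshot namespace types.
// The real types carry more fields; only the variant layout matters here.
struct UserSnapshotNamespace {};
struct GroupSnapshotNamespace {};
struct TrashSnapshotNamespace {};
struct MirrorSnapshotNamespace { bool complete = false; };
struct UnknownSnapshotNamespace {};

// Same ordering of alternatives as in the backtrace (mirror is index 3).
using SnapshotNamespace = std::variant<UserSnapshotNamespace,
                                       GroupSnapshotNamespace,
                                       TrashSnapshotNamespace,
                                       MirrorSnapshotNamespace,
                                       UnknownSnapshotNamespace>;

// Unchecked access, as in the reported line: reports whether std::get
// throws std::bad_variant_access for the given namespace.
bool unchecked_get_throws(const SnapshotNamespace& ns) {
  try {
    (void)std::get<MirrorSnapshotNamespace>(ns);
    return false;
  } catch (const std::bad_variant_access&) {
    return true;
  }
}

// Checked access: std::get_if returns nullptr when the variant holds a
// different alternative, so the caller can skip non-mirror snapshots.
const MirrorSnapshotNamespace* as_mirror(const SnapshotNamespace& ns) {
  return std::get_if<MirrorSnapshotNamespace>(&ns);
}
```

A caller iterating over snapshots could test the result of as_mirror() and continue on nullptr. Whether that is the right fix for this bug is a separate question; the history below suggests the method may be removed altogether.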

Steps to reproduce

bin/rbd --cluster site-a snap create data/image1@snap0
timeout 1m bin/rbd --cluster site-a bench --io-type write --io-size 64K --io-threads 2 --io-total 50G --io-pattern seq data/image1
bin/rbd --cluster site-a mirror image snapshot data/image1
sleep 10s
bin/rbd --cluster site-b mirror image promote data/image1 --force


Related issues

Duplicates rbd - Bug #53537: [rbd-mirror] ceph_assert on rollback_snapshot info during a forcepromote while snapshot is syncing In Progress

History

#1 Updated by Ilya Dryomov about 2 months ago

It looks like Deepika also ran into this and fixed it in her PR for the "ceph_assert(info != nullptr)" crash:

https://github.com/ceph/ceph/pull/45375/commits/fe1295bdfd21be44c2ac479185368c486bd5df4f

The PR didn't get merged because, as discussed offline, there is more to this "ceph_assert(info != nullptr)" crash. The proper fix would likely remove the get_rollback_snap_id() method entirely.

#2 Updated by Ilya Dryomov about 2 months ago

  • Status changed from In Progress to Duplicate

#3 Updated by Ilya Dryomov about 2 months ago

  • Duplicates Bug #53537: [rbd-mirror] ceph_assert on rollback_snapshot info during a forcepromote while snapshot is syncing added
