Project

General

Profile

Bug #22932

[rbd-mirror] infinite loop is possible when formatting the status message

Added by Jason Dillaman 12 months ago. Updated 10 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
02/06/2018
Due date:
% Done:

0%

Source:
Tags:
Backport:
luminous,jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Per Mykola Golub:

When running rbd-mirror stress test against your PR (which I suppose
is unrelated to the issue) I was observing the rbd-mirror looping
forever in this loop:

template <typename I>
bool ReplayStatusFormatter<I>::calculate_behind_master_or_send_update() {
  ...

  while (master.tag_tid != mirror_tag_tid) {
    auto tag_it = m_tag_cache.find(master.tag_tid);
    if (tag_it == m_tag_cache.end()) {
      send_update_tag_cache(master.tag_tid, mirror_tag_tid);
      return false;
    }
    librbd::journal::TagData &tag_data = tag_it->second;
    m_entries_behind_master += master.entry_tid;
    master = cls::journal::ObjectPosition(0, tag_data.predecessor.tag_tid,
                                          tag_data.predecessor.entry_tid);
  }

The state was:

m_master_position = {object_number = 5, tag_tid = 5, entry_tid = 1}
m_mirror_position = {object_number = 3, tag_tid = 4, entry_tid = 1323}

m_tag_cache = std::map with 5 elements =
 {[0] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}},
  [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 0, entry_tid = 0}},
  [2] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [3] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [5] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 3, entry_tid = 1}}}}

So, `master.tag_tid` went through this endless sequence:

  5 -> 3 -> 1 -> 0 -> 2 -> 1 -> 0 -> 2 -> 1 ...

Related issues

Copied to rbd - Backport #22964: luminous: [rbd-mirror] infinite loop is possible when formatting the status message Resolved
Copied to rbd - Backport #22965: jewel: [rbd-mirror] infinite loop is possible when formatting the status message Resolved

History

#1 Updated by Jason Dillaman 12 months ago

The tag_tid values should always be increasing, so the "while (master.tag_tid != mirror_tag_tid)" loop could really just be "while (master.tag_tid >= mirror_tag_tid)" for safety. Additionally, the tag 5 predecessor is incorrectly pointing to tag 3 because it is using the cached client commit position instead of the current commit position after a replay of the journal.

#2 Updated by Mykola Golub 12 months ago

  • Status changed from New to In Progress
  • Assignee set to Mykola Golub

#3 Updated by Mykola Golub 12 months ago

Below is an extract from the log file for this particular case, with the comments that show how it ended up with that tag cache:

2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0]
2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=0
2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 1: [tid=1, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

A fake entry inserted to m_tag_cache:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 0, entry_tid = 0}}

2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=0, mirror_tag_tid=0
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0]
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0], m_entries_behind_master=3
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0]
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0], m_entries_behind_master=3
2018-02-06 15:53:16.582 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:16.582 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=3, tag_tid=2, entry_tid=895]
2018-02-06 15:53:16.582 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=3, tag_tid=2, entry_tid=895], m_entries_behind_master=0
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=3, mirror_tag_tid=2
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 3: [tid=3, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 01 00 00 00 00 00 00  |................|
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 3: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=1, entry_tid=3]]

Entry inserted to m_tag_cache:

 [3] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}}

2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=2

Why didn't it use the already inserted entry for tag_tid == 1? (the problem with send_update_tag_cache: it does not check the entry already exists in the cache)

2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 1: [tid=1, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

Why it log that decoded tag 1 has empty predecessor? I expected tag_data.predecessor.tag_tid == 2 (mirror_tag_tid) at the moment. (because predecessor.commit_valid == false).

It tried to insert to m_tag_cache a fake entry:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}}

but the insert failed due to previously inserted fake entry (with predecessor.tag_tid == 0).

2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=2, mirror_tag_tid=2
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=0, mirror_tag_tid=2
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:53:46.602 7fcf89ffb700 -1 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: error retrieving tag 0: (2) No such file or directory

It tried to retrieve tag 0, due to previously inserted entry to m_tag_cache for tid = 1 (with predecessor.tag_tid == 0)

2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 0: [mirror_uuid=, predecessor=[mirror_uuid=]]

A fake entry inserted to m_tag_cache:

 [0] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}}

2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=2, mirror_tag_tid=2
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432], m_entries_behind_master=-2428
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432], m_entries_behind_master=-2428
2018-02-06 15:54:16.638 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:16.638 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607]
2018-02-06 15:54:16.638 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607], m_entries_behind_master=0
2018-02-06 15:54:16.710 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:16.710 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607]
2018-02-06 15:54:16.710 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607], m_entries_behind_master=0
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=5, tag_tid=5, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=1323]
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=5, mirror_tag_tid=4
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 5: [tid=5, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 03 00 00 00 00 00 00  |................|
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 5: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=3, entry_tid=1]]

Entry inserted to m_tag_cache:

  [5] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 3, entry_tid = 1}}

2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=3, mirror_tag_tid=4
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 3: [tid=3, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 01 00 00 00 00 00 00  |................|
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 3: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=1, entry_tid=3]]
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 -1 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: error retrieving tag 1: (2) No such file or directory

Tag 1 does not arleady exist?

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

It tried to insert to m_tag_cache a fake entry:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 4, entry_tid = 0}}

but the insert failed due to previously inserted fake entry (with predecessor.tag_tid == 0).

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=4, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=5, tag_tid=5, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=1323]
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=2, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 2: [tid=2, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 01 00 00 00 00 00 00  |................|
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 2: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=1, entry_tid=3]]

Entry inserted to m_tag_cache:

 [2] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}}

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 -1 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: error retrieving tag 1: (2) No such file or directory

tag 1 does not arleady exist?

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

It tried to insert to m_tag_cache a fake entry:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 4, entry_tid = 0}}

but the insert failed due to previously inserted fake entry (with predecessor.tag_tid == 0).

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=4, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=5, tag_tid=5, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=1323]

At that moment m_tag_cache was:

m_tag_cache = 
 {[0] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}},
  [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 0, entry_tid = 0}},
  [2] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [3] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [5] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 3, entry_tid = 1}}}}

and it went to infinit loop with master.tag_tid iterating 5 -> 3 -> 1 -> 0 -> 2 -> 1 -> ...

#4 Updated by Mykola Golub 12 months ago

  • Status changed from In Progress to Need Review

#5 Updated by Jason Dillaman 11 months ago

  • Status changed from Need Review to Pending Backport

#6 Updated by Nathan Cutler 11 months ago

  • Copied to Backport #22964: luminous: [rbd-mirror] infinite loop is possible when formatting the status message added

#7 Updated by Nathan Cutler 11 months ago

  • Copied to Backport #22965: jewel: [rbd-mirror] infinite loop is possible when formatting the status message added

#8 Updated by Nathan Cutler 10 months ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF