Project

General

Profile

Bug #22932

[rbd-mirror] infinite loop is possible when formatting the status message

Added by Jason Dillaman about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
luminous,jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Per Mykola Golub:

When running rbd-mirror stress test against your PR (which I suppose
is unrelated to the issue) I was observing the rbd-mirror looping
forever in this loop:

template <typename I>
bool ReplayStatusFormatter<I>::calculate_behind_master_or_send_update() {
  ...

  while (master.tag_tid != mirror_tag_tid) {
    auto tag_it = m_tag_cache.find(master.tag_tid);
    if (tag_it == m_tag_cache.end()) {
      send_update_tag_cache(master.tag_tid, mirror_tag_tid);
      return false;
    }
    librbd::journal::TagData &tag_data = tag_it->second;
    m_entries_behind_master += master.entry_tid;
    master = cls::journal::ObjectPosition(0, tag_data.predecessor.tag_tid,
                                          tag_data.predecessor.entry_tid);
  }

The state was:

m_master_position = {object_number = 5, tag_tid = 5, entry_tid = 1}
m_mirror_position = {object_number = 3, tag_tid = 4, entry_tid = 1323}

m_tag_cache = std::map with 5 elements =
 {[0] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}},
  [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 0, entry_tid = 0}},
  [2] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [3] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [5] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 3, entry_tid = 1}}}}

So, `master.tag_tid` went through this endless sequence:

  5 -> 3 -> 1 -> 0 -> 2 -> 1 -> 0 -> 2 -> 1 ...

Related issues

Copied to rbd - Backport #22964: luminous: [rbd-mirror] infinite loop is possible when formatting the status message Resolved
Copied to rbd - Backport #22965: jewel: [rbd-mirror] infinite loop is possible when formatting the status message Resolved

History

#1 Updated by Jason Dillaman about 6 years ago

The tag_tid values should always be increasing, so the "while (master.tag_tid != mirror_tag_tid)" loop could really just be "while (master.tag_tid >= mirror_tag_tid)" for safety. Additionally, the tag 5 predecessor is incorrectly pointing to tag 3 because it is using the cached client commit position instead of the current commit position after a replay of the journal.

#2 Updated by Mykola Golub about 6 years ago

  • Status changed from New to In Progress
  • Assignee set to Mykola Golub

#3 Updated by Mykola Golub about 6 years ago

Below is an extract from the log file for this particular case, with the comments that show how it ended up with that tag cache:

2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0]
2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=0
2018-02-06 15:52:46.542 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 1: [tid=1, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

A fake entry inserted to m_tag_cache:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 0, entry_tid = 0}}

2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=0, mirror_tag_tid=0
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0]
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:52:46.546 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0], m_entries_behind_master=3
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0]
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:52:46.562 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=0, tag_tid=0, entry_tid=0], m_entries_behind_master=3
2018-02-06 15:53:16.582 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:16.582 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=3, tag_tid=2, entry_tid=895]
2018-02-06 15:53:16.582 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=3, tag_tid=1, entry_tid=3], m_mirror_position=[object_number=3, tag_tid=2, entry_tid=895], m_entries_behind_master=0
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=3, mirror_tag_tid=2
2018-02-06 15:53:46.598 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 3: [tid=3, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 01 00 00 00 00 00 00  |................|
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 3: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=1, entry_tid=3]]

Entry inserted to m_tag_cache:

 [3] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}}

2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=2

Why didn't it use the already inserted entry for tag_tid == 1? (the problem with send_update_tag_cache: it does not check the entry already exists in the cache)

2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 1: [tid=1, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

Why it log that decoded tag 1 has empty predecessor? I expected tag_data.predecessor.tag_tid == 2 (mirror_tag_tid) at the moment. (because predecessor.commit_valid == false).

It tried to insert to m_tag_cache a fake entry:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}}

but the insert failed due to previously inserted fake entry (with predecessor.tag_tid == 0).

2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=2, mirror_tag_tid=2
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=0, mirror_tag_tid=2
2018-02-06 15:53:46.598 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:53:46.602 7fcf89ffb700 -1 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: error retrieving tag 0: (2) No such file or directory

It tried to retrieve tag 0, due to previously inserted entry to m_tag_cache for tid = 1 (with predecessor.tag_tid == 0)

2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 0: [mirror_uuid=, predecessor=[mirror_uuid=]]

A fake entry inserted to m_tag_cache:

 [0] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}}

2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=2, mirror_tag_tid=2
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:53:46.602 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432], m_entries_behind_master=-2428
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432]
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: clearing tags not needed any more (below mirror position)
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: 0 entries cleared
2018-02-06 15:53:46.618 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=0, tag_tid=2, entry_tid=2432], m_entries_behind_master=-2428
2018-02-06 15:54:16.638 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:16.638 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607]
2018-02-06 15:54:16.638 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607], m_entries_behind_master=0
2018-02-06 15:54:16.710 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:16.710 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607]
2018-02-06 15:54:16.710 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 format: m_master_position=[object_number=1, tag_tid=3, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=607], m_entries_behind_master=0
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=5, tag_tid=5, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=1323]
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=5, mirror_tag_tid=4
2018-02-06 15:54:46.746 7fcfdcf4f700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 5: [tid=5, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 03 00 00 00 00 00 00  |................|
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 5: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=3, entry_tid=1]]

Entry inserted to m_tag_cache:

  [5] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 3, entry_tid = 1}}

2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=3, mirror_tag_tid=4
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 3: [tid=3, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 01 00 00 00 00 00 00  |................|
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 3: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=1, entry_tid=3]]
2018-02-06 15:54:46.746 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 -1 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: error retrieving tag 1: (2) No such file or directory

Tag 1 does not arleady exist?

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

It tried to insert to m_tag_cache a fake entry:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 4, entry_tid = 0}}

but the insert failed due to previously inserted fake entry (with predecessor.tag_tid == 0).

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=4, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=5, tag_tid=5, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=1323]
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=2, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: need to update tag cache
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: retrieved tag 2: [tid=2, tag_class=0, data=00000000  00 00 00 00 00 00 00 00  01 01 00 00 00 00 00 00  |................|
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 2: [mirror_uuid=, predecessor=[mirror_uuid=, tag_tid=1, entry_tid=3]]

Entry inserted to m_tag_cache:

 [2] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}}

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=1, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 -1 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: error retrieving tag 1: (2) No such file or directory

tag 1 does not arleady exist?

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: making fake tag
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 handle_update_tag_cache: decoded tag 1: [mirror_uuid=, predecessor=[mirror_uuid=]]

It tried to insert to m_tag_cache a fake entry:

 [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 4, entry_tid = 0}}

but the insert failed due to previously inserted fake entry (with predecessor.tag_tid == 0).

2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 send_update_tag_cache: master_tag_tid=4, mirror_tag_tid=4
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 get_or_send_update: 
2018-02-06 15:54:46.750 7fcf89ffb700 20 rbd::mirror::image_replayer::ReplayStatusFormatter: 0x7fcf2000d5e0 calculate_behind_master_or_send_update: m_master_position=[object_number=5, tag_tid=5, entry_tid=1], m_mirror_position=[object_number=3, tag_tid=4, entry_tid=1323]

At that moment m_tag_cache was:

m_tag_cache = 
 {[0] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 2, entry_tid = 0}},
  [1] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = false, tag_tid = 0, entry_tid = 0}},
  [2] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [3] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 1, entry_tid = 3}},
  [5] = {mirror_uuid = "", predecessor = {mirror_uuid = "", commit_valid = true, tag_tid = 3, entry_tid = 1}}}}

and it went to infinit loop with master.tag_tid iterating 5 -> 3 -> 1 -> 0 -> 2 -> 1 -> ...

#4 Updated by Mykola Golub about 6 years ago

  • Status changed from In Progress to Fix Under Review

#5 Updated by Jason Dillaman about 6 years ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Nathan Cutler about 6 years ago

  • Copied to Backport #22964: luminous: [rbd-mirror] infinite loop is possible when formatting the status message added

#7 Updated by Nathan Cutler about 6 years ago

  • Copied to Backport #22965: jewel: [rbd-mirror] infinite loop is possible when formatting the status message added

#8 Updated by Nathan Cutler almost 6 years ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF