Bug #44938
closed
[rbd-mirror] tx-only peer from heartbeat can race w/ CLI
Description
If a peer cluster's rbd-mirror daemon sends a heartbeat, it creates a tx-only peer entry. However, this can race with the rbd CLI in the tests when the CLI tries to add the same peer as an rx-tx peer.
Updated by Jason Dillaman about 4 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 34422
Updated by Mykola Golub about 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #45036: octopus: [rbd-mirror] tx-only peer from heartbeat can race w/ CLI added
Updated by Jason Dillaman about 4 years ago
NOTE: this will require a second commit to address since the first commit did not fully resolve the issue.
Updated by Jason Dillaman about 4 years ago
- Status changed from Pending Backport to Fix Under Review
Additional PR: https://github.com/ceph/ceph/pull/34573
Updated by Mykola Golub about 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Mykola Golub about 4 years ago
Additional PR: https://github.com/ceph/ceph/pull/34753
It should fix failures like this one: http://qa-proxy.ceph.com/teuthology/trociny-2020-04-24_15:23:36-rbd-wip-mgolub-testing-distro-basic-smithi/4981088/teuthology.log
Updated by Nathan Cutler almost 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
Updated by Jason Dillaman almost 4 years ago
- Status changed from Resolved to Pending Backport
Updated by Nathan Cutler almost 4 years ago
@Jason Borden The octopus backport PR for this issue was already merged. Is it missing something?
Updated by Jason Dillaman almost 4 years ago
Yes, see comment #7. It has an additional fix that we should backport. I'm just re-using these tracker tickets instead of opening new ones.
Updated by Nathan Cutler almost 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
Updated by Nathan Cutler over 3 years ago
Jason Dillaman wrote:
Yes, see comment #7. It has an additional fix that we should backport. I'm just re-using these tracker tickets instead of opening new ones.
I believe both commits are now in octopus:
commit f93516d78734740c30c5532e5032be1893004bb1
Author: Mykola Golub <mgolub@suse.com>
Date:   Sat Apr 25 08:36:25 2020 +0100

    qa/workunits/rbd: retry the addition of a mirror pool peer

    fb4311f5 has fixed this for setup, but "remove mirroring pool" test
    needs fixing too.

    Fixes: https://tracker.ceph.com/issues/44938
    Signed-off-by: Mykola Golub <mgolub@suse.com>
    (cherry picked from commit 7eced158a9a3c47cc408b35219b4428e97e018fb)

commit 4644cd663de27bd19b07eb8dca0153032060694b
Author: Jason Dillaman <dillaman@redhat.com>
Date:   Wed Apr 15 16:27:07 2020 -0400

    qa/workunits/rbd: retry the addition of a mirror pool peer

    We might race with the remote rbd-mirror daemon creating a tx-only peer
    when adding a new peer. Therefore, delete the tx-only peer and attempt
    to re-create it.

    Fixes: https://tracker.ceph.com/issues/44938
    Signed-off-by: Jason Dillaman <dillaman@redhat.com>
    (cherry picked from commit fb4311f597a98b6870d7895e6403fb32356bfbe9)
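For readers who want the retry idea from the commit messages in concrete form, here is a minimal Python sketch. It is not the actual qa/workunits/rbd change (that lives in the test scripts); `PeerStore`, `add_rx_tx_peer`, and the `before_attempt` hook are hypothetical names invented for illustration, and the peer list is modelled as an in-memory dict rather than a real pool. It only shows the pattern described above: if adding the rx-tx peer fails because a heartbeat already created an entry, delete that entry and try again.

```python
from dataclasses import dataclass, field


class PeerExistsError(Exception):
    """Raised when a peer for the same cluster is already registered."""


@dataclass
class PeerStore:
    """Hypothetical in-memory stand-in for a pool's mirror peer list."""
    peers: dict = field(default_factory=dict)  # cluster name -> direction

    def add(self, cluster: str, direction: str) -> None:
        if cluster in self.peers:
            raise PeerExistsError(cluster)
        self.peers[cluster] = direction

    def remove(self, cluster: str) -> None:
        self.peers.pop(cluster, None)


def add_rx_tx_peer(store, cluster, max_attempts=3, before_attempt=None):
    """Add an rx-tx peer; on conflict, delete the existing entry and retry.

    Returns the attempt number that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if before_attempt is not None:
            before_attempt(attempt)  # hook used below to simulate a heartbeat
        try:
            store.add(cluster, "rx-tx")
            return attempt
        except PeerExistsError:
            # Assumed cause: the remote daemon's heartbeat won the race and
            # left a tx-only entry. Remove it so the next attempt can succeed.
            store.remove(cluster)
    raise RuntimeError(f"could not add peer {cluster!r} after {max_attempts} attempts")


# Simulate the race: a heartbeat creates a tx-only entry just before attempt 1.
store = PeerStore()

def heartbeat_on_first_attempt(attempt):
    if attempt == 1:
        store.peers.setdefault("remote", "tx-only")

attempts_used = add_rx_tx_peer(store, "remote", before_attempt=heartbeat_on_first_attempt)
```

In this simulation the first attempt collides with the tx-only entry, the entry is removed, and the second attempt registers the peer as rx-tx. A bounded attempt count is a design choice of the sketch so that a persistent conflict surfaces as an error instead of looping forever.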