Bug #51225

open

rbd-mirror: image status count can be wrong

Added by Arthur Outhenin-Chalandre almost 3 years ago. Updated almost 3 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
octopus,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I spotted some cases where the image count is wrong when running `rbd mirror pool status my_pool --verbose`.

Here is one case, for instance:
```
[root@cephdev8 tmp.86MiiVMVX3]# rbd mirror pool status mirror --cluster cluster2 --verbose
2021-06-15T12:08:51.314+0000 7ff238a45200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:08:51.315+0000 7ff238a45200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:08:51.333+0000 7ff238a45200 -1 WARNING: all dangerous and experimental features are enabled.
health: WARNING
daemon health: UNKNOWN
image health: WARNING
images: 2 total
1 unknown
1 replaying

DAEMONS
none

IMAGES
split-brain:
global_id: 1339ae4b-a363-44ba-996b-5205d18f912b
state: down+unknown
description: status not found
last_update:
peer_sites:
name: cluster1
state: up+replaying
description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1623758639,"remote_snapshot_timestamp":1623758639,"replay_state":"idle"}
last_update: 2021-06-15 12:08:44
[root@cephdev8 tmp.86MiiVMVX3]# rbd mirror pool status mirror --cluster cluster2 --verbose
2021-06-15T12:09:55.165+0000 7f1f7784a200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:09:55.165+0000 7f1f7784a200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:09:55.183+0000 7f1f7784a200 -1 WARNING: all dangerous and experimental features are enabled.
health: WARNING
daemon health: UNKNOWN
image health: WARNING
images: 2 total
1 unknown
1 replaying

DAEMONS
none

IMAGES
split-brain:
global_id: 1339ae4b-a363-44ba-996b-5205d18f912b
state: down+unknown
description: status not found
last_update:
peer_sites:
name: cluster1
state: up+replaying
description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1623758639,"remote_snapshot_timestamp":1623758639,"replay_state":"idle"}
last_update: 2021-06-15 12:09:44
```

This happened while running the build from https://github.com/ceph/ceph/pull/41696, but I have seen it on my octopus cluster too.
I will try to provide a clear reproducer and/or search for a fix as soon as I can.
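
Until a proper reproducer exists, a rough way to flag the mismatch is to compare the total in the summary with the number of entries listed under IMAGES. The sketch below is only an illustration, not part of rbd: `check_image_count` is a hypothetical helper, it parses the plain-text output as pasted above, and it assumes every listed image has exactly one `global_id:` line.

```python
import re

def check_image_count(status_text):
    """Return (reported_total, listed_images) for verbose pool status text."""
    # Summary line, e.g. "images: 2 total"
    match = re.search(r"^\s*images:\s*(\d+)\s+total", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'images: N total' summary line found")
    reported = int(match.group(1))
    # Assumption: each image entry under IMAGES carries exactly one global_id line.
    _, _, images_section = status_text.partition("\nIMAGES")
    listed = len(re.findall(r"^\s*global_id:", images_section, re.MULTILINE))
    return reported, listed

# Trimmed copy of the output pasted in this report.
sample = """health: WARNING
daemon health: UNKNOWN
image health: WARNING
images: 2 total
1 unknown
1 replaying

DAEMONS
none

IMAGES
split-brain:
global_id: 1339ae4b-a363-44ba-996b-5205d18f912b
state: down+unknown
description: status not found
last_update:
peer_sites:
name: cluster1
state: up+replaying
last_update: 2021-06-15 12:08:44
"""

reported, listed = check_image_count(sample)
print(f"reported={reported} listed={listed} mismatch={reported != listed}")
# prints: reported=2 listed=1 mismatch=True
```

Running this against the output saved at each polling interval would show whether the summary and the listing ever agree.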

Actions #1

Updated by Arthur Outhenin-Chalandre almost 3 years ago

Looking at the code, I can't see any way that it could miscalculate the image count. I think the root cause is still https://tracker.ceph.com/issues/51031.
