Bug #51225
rbd-mirror: image status count can be wrong
Description
I spotted some cases where the image count reported by `rbd mirror pool status my_pool --verbose` is wrong.
Here is one such case:
```
[root@cephdev8 tmp.86MiiVMVX3]# rbd mirror pool status mirror --cluster cluster2 --verbose
2021-06-15T12:08:51.314+0000 7ff238a45200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:08:51.315+0000 7ff238a45200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:08:51.333+0000 7ff238a45200 -1 WARNING: all dangerous and experimental features are enabled.
health: WARNING
daemon health: UNKNOWN
image health: WARNING
images: 2 total
1 unknown
1 replaying
DAEMONS
none
IMAGES
split-brain:
global_id: 1339ae4b-a363-44ba-996b-5205d18f912b
state: down+unknown
description: status not found
last_update:
peer_sites:
name: cluster1
state: up+replaying
description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1623758639,"remote_snapshot_timestamp":1623758639,"replay_state":"idle"}
last_update: 2021-06-15 12:08:44
[root@cephdev8 tmp.86MiiVMVX3]# rbd mirror pool status mirror --cluster cluster2 --verbose
2021-06-15T12:09:55.165+0000 7f1f7784a200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:09:55.165+0000 7f1f7784a200 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-15T12:09:55.183+0000 7f1f7784a200 -1 WARNING: all dangerous and experimental features are enabled.
health: WARNING
daemon health: UNKNOWN
image health: WARNING
images: 2 total
1 unknown
1 replaying
DAEMONS
none
IMAGES
split-brain:
global_id: 1339ae4b-a363-44ba-996b-5205d18f912b
state: down+unknown
description: status not found
last_update:
peer_sites:
name: cluster1
state: up+replaying
description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1623758639,"remote_snapshot_timestamp":1623758639,"replay_state":"idle"}
last_update: 2021-06-15 12:09:44
```
This happened while running the build from https://github.com/ceph/ceph/pull/41696, but I have seen it on my Octopus cluster too.
I will try to provide a clear reproducer and/or search for a fix as soon as I can.
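Until a proper reproducer exists, a small check like the following might help flag suspicious dumps. It is only a sketch that parses the plain-text output shown above: the function name `summarize_status` is hypothetical, and counting `global_id:` lines as "one listed image" is an assumption based on this sample, not on the rbd source. For the dump above it shows the summary reporting 2 images while only one image is listed under IMAGES.

```python
import re


def summarize_status(text):
    """Return (reported_total, summed_states, listed_images) for one status dump.

    Hypothetical helper: it only understands the plain-text layout seen in
    this report and ignores indentation entirely.
    """
    reported_total = None
    summed_states = 0
    listed_images = 0
    in_summary = False
    for raw in text.splitlines():
        line = raw.strip()
        m = re.fullmatch(r"images:\s+(\d+)\s+total", line)
        if m:
            reported_total = int(m.group(1))
            in_summary = True
            continue
        if in_summary:
            # Per-state lines directly after "images: N total", e.g. "1 unknown".
            m = re.fullmatch(r"(\d+)\s+\S+", line)
            if m:
                summed_states += int(m.group(1))
                continue
            in_summary = False
        # Assumption: every image listed under IMAGES has exactly one global_id line.
        if line.startswith("global_id:"):
            listed_images += 1
    return reported_total, summed_states, listed_images


# Trimmed copy of the first dump from this report (WARNING lines omitted).
sample = """\
health: WARNING
daemon health: UNKNOWN
image health: WARNING
images: 2 total
1 unknown
1 replaying
DAEMONS
none
IMAGES
split-brain:
global_id: 1339ae4b-a363-44ba-996b-5205d18f912b
state: down+unknown
description: status not found
last_update:
peer_sites:
name: cluster1
state: up+replaying
last_update: 2021-06-15 12:08:44
"""

total, summed, listed = summarize_status(sample)
print(f"reported={total} summed_states={summed} listed={listed}")
```

Feeding it the saved output of `rbd mirror pool status <pool> --verbose` and comparing the three numbers makes it easy to collect mismatching dumps over time.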
Updated by Arthur Outhenin-Chalandre almost 3 years ago
Looking at the code, I can't see any way that it could miscalculate the image count. I think the root cause is still https://tracker.ceph.com/issues/51031.