Project

General

Profile

Bug #54448

[rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error

Added by Deepika Upadhyay about 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
octopus,pacific,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

this issue is reproducible with using 150 images with the reproducer script from: https://tracker.ceph.com/issues/55852

debug 2022-03-01T17:05:22.217+0000 7fa13ac4d700  5 librbd::SnapshotRemoveRequest: 0x564785038d80 trash_snap: 
debug 2022-03-01T17:05:22.363+0000 7fa12dc33700  5 librbd::Watcher: 0x564784d9e300 notifications_blocked: blocked=0
debug 2022-03-01T17:05:22.363+0000 7fa12dc33700 10 librbd::Watcher::C_NotifyAck 0x564787a0e2d0 C_NotifyAck: id=25460566169347, handle=94865195372928
debug 2022-03-01T17:05:22.363+0000 7fa12dc33700 10 librbd::ImageWatcher: 0x564784d9e300 remote snap_remove request: .mirror.primary.6e69a78f-b13b-4202-adf2-db6c669df6a9.35fcedad-b1d6-4f7b-848e-443639f0f057
debug 2022-03-01T17:05:22.363+0000 7fa12dc33700 20 librbd::ExclusiveLock: 0x564780fa2820 accept_request=0 (request_type=0)
debug 2022-03-01T17:05:22.363+0000 7fa12dc33700 10 librbd::Watcher::C_NotifyAck 0x564787a0e2d0 finish: r=0
debug 2022-03-01T17:05:22.366+0000 7fa14145a700 20 librbd::watcher::Notifier: 0x564781b94a00 handle_notify: r=-110
debug 2022-03-01T17:05:22.366+0000 7fa14145a700 20 librbd::watcher::Notifier: 0x564781b94a00 handle_notify: pending=0
debug 2022-03-01T17:05:22.366+0000 7fa148468700 20 librbd::watcher::Notifier: 0x564781b94a00 notify: pending=1debug 2022-03-01T17:05:22.366+0000 7fa148468700 10 rbd::mirror::InstanceWatcher: C_NotifyInstanceRequest: 0x564787c00d80 finish: r=-110
debug 2022-03-01T17:05:22.366+0000 7fa148468700 -1 rbd::mirror::InstanceWatcher: C_NotifyInstanceRequest: 0x564787c00d80 finish: resending after timeout
debug 2022-03-01T17:05:22.366+0000 7fa148468700 10 rbd::mirror::InstanceWatcher: C_NotifyInstanceRequest: 0x564787c00d80 send
debug 2022-03-01T17:05:22.366+0000 7fa148468700 10 rbd::mirror::InstanceWatcher: C_NotifyInstanceRequest: 0x564787c00d80 send: sending to 11037

debug 2022-03-01T17:05:22.367+0000 7fa141c5b700  5 librbd::Watcher: 0x564781029c00 notifications_blocked: blocked=0
debug 2022-03-01T17:05:22.367+0000 7fa141c5b700 10 librbd::Watcher::C_NotifyAck 0x56479abb6d70 C_NotifyAck: id=43748537016369, handle=94865132570880
debug 2022-03-01T17:05:22.367+0000 7fa141c5b700 10 rbd::mirror::InstanceWatcher: 0x564781029c00 handle_notify: notify_id=43748537016369, handle=94865132570880, notifier_id=11037
debug 2022-03-01T17:05:22.367+0000 7fa141c5b700 10 rbd::mirror::InstanceWatcher: 0x564781029c00 handle_payload: sync_start: instance_id=11037, request_id=3438
debug 2022-03-01T17:05:22.367+0000 7fa141c5b700 10 rbd::mirror::InstanceWatcher: 0x564781029c00 prepare_request: instance_id=11037, request_id=3438
debug 2022-03-01T17:05:22.367+0000 7fa141c5b700 10 rbd::mirror::InstanceWatcher: 0x564781029c00 prepare_request: duplicate for in-progress request
debug 2022-03-01T17:05:22.385+0000 7fa12d432700  5 librbd::Watcher: 0x56478534e300 notifications_blocked: blocked=0
debug 2022-03-01T17:05:22.385+0000 7fa12d432700 10 librbd::Watcher::C_NotifyAck 0x5647855bde00 C_NotifyAck: id=25460566169348, handle=94865161705088
debug 2022-03-01T17:05:22.385+0000 7fa12d432700 10 librbd::ImageWatcher: 0x56478534e300 remote snap_remove request: .mirror.primary.0251e40e-7a21-4b82-b7ba-d3daa42a08f5.ea71c53d-2759-4e0b-bf92-e25d5d76a6a6
debug 2022-03-01T17:05:22.385+0000 7fa12d432700 20 librbd::ExclusiveLock: 0x564780f9fa00 accept_request=0 (request_type=0)
debug 2022-03-01T17:05:22.385+0000 7fa12d432700 10 librbd::Watcher::C_NotifyAck 0x5647855bde00 finish: r=0
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::watcher::Notifier: 0x564784d9e380 handle_notify: r=0
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::watcher::Notifier: 0x564784d9e380 handle_notify: pending=0
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::image_watcher::NotifyLockOwner: 0x564785107b80 handle_notify: r=0
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::Operations: handle_remote_request: r=-30
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::ImageState: 0x564786eabe80 handle_update_notification: refresh_seq = 1, last_refresh = 0
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::ImageState: 0x564784c220e0 ImageUpdateWatchers::notify
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 20 librbd::ImageState: 0x564784c220e0 ImageUpdateWatchers::send_notify: handle=0, watcher=0x56479acd0760
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 15 librbd::mirror::snapshot::UnlinkPeerRequest: 0x5647864b6140 handle_remove_snapshot: r=-30
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 -1 librbd::mirror::snapshot::UnlinkPeerRequest: 0x5647864b6140 handle_remove_snapshot: failed to remove snapshot: (30) Read-only file system
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 15 librbd::mirror::snapshot::UnlinkPeerRequest: 0x5647864b6140 finish: r=-30
debug 2022-03-01T17:05:22.598+0000 7fa121c1b700 20 librbd::ImageState: 0x564784c220e0 ImageUpdateWatchers::handle_notify: handle=0, watcher=0x56479acd0760
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 10 rbd::mirror::image_replayer::snapshot::Replayer: 0x56479af89800 handle_unlink_peer: r=-30
debug 2022-03-01T17:05:22.598+0000 7fa12dc33700 -1 rbd::mirror::image_replayer::snapshot::Replayer: 0x56479af89800 handle_unlink_peer: failed to unlink local peer from remote image: (30) Read-only file system


Related issues

Duplicated by rbd - Bug #55852: [rbd-mirror] remote got demoted in non-primary without actually performing it Duplicate
Duplicated by rbd - Bug #54098: [rbd-mirror] hangs forever after split brain, while searching for demotion snapshot Duplicate
Copied to rbd - Backport #56166: octopus: [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error Resolved
Copied to rbd - Backport #56167: quincy: [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error Resolved
Copied to rbd - Backport #56168: pacific: [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error Resolved

History

#1 Updated by Deepika Upadhyay about 2 years ago

  • Subject changed from rbd_mirror: failed to unlink local peer from remote image (becomes read only image) to rbd_mirror: failed to unlink local peer from remote image

#2 Updated by Deepika Upadhyay almost 2 years ago

passing case:

2022-04-19T12:57:39.826+0000 7fbe1ddbb700 20 librbd::image_watcher::NotifyLockOwner: 0x7fbe1802d000 send_notify
2022-04-19T12:57:39.826+0000 7fbe1ddbb700 20 librbd::watcher::Notifier: 0x7fbe00008b20 notify: pending=1
2022-04-19T12:57:39.826+0000 7fbe1ddbb700  1 -- 172.17.0.4:0/250165805 --> [v2:192.168.122.61:6800/1040034008,v1:192.168.122.61:6801/1040034008] -- osd_op(unknown.0.0:26 2.1b 2:de02d232:::rbd_header.11174dfcc329:head [notify cookie 140454020590992 in=191b] snapc 0=[] ondisk+read+known_if_redirected e26552) v8 -- 0x7fbe0000f1e0 con 0x7fbe0c0023a0
2022-04-19T12:57:39.831+0000 7fbe1fdbf700  1 -- 172.17.0.4:0/250165805 <== osd.0 v2:192.168.122.61:6800/1040034008 26 ==== watch-notify(notify (1) cookie 140454020608784 notify 114039971934468 ret 0) v3 ==== 221+0+0 (crc 0 0 0) 0x7fbe1800b650 con 0x7fbe0c0023a0
2022-04-19T12:57:39.831+0000 7fbe1d5ba700  5 librbd::Watcher: 0x7fbe00008aa0 notifications_blocked: blocked=0
2022-04-19T12:57:39.831+0000 7fbe1d5ba700 10 librbd::Watcher::C_NotifyAck 0x7fbe00018330 C_NotifyAck: id=114039971934468, handle=140454020608784
2022-04-19T12:57:39.832+0000 7fbe1d5ba700 10 librbd::ImageWatcher: 0x7fbe00008aa0 remote snap_create request: [166795,140454020608784,1] [mirror state=non-primary, complete=1, mirror_peer_uuids=, primary_mirror_uuid=, primary_snap_id=head, last_copied_object_number=0, snap_seqs={}] .mirror.non_primary.eebb6525-0b65-4a45-88a5-98823578b5d6.bf70e461-0658-4011-a80c-db36e235315d 0
2022-04-19T12:57:39.832+0000 7fbe1d5ba700 20 librbd::ExclusiveLock: 0x7fbe0400d2e0 accept_request=0 (request_type=2)
2022-04-19T12:57:39.832+0000 7fbe1d5ba700 10 librbd::Watcher::C_NotifyAck 0x7fbe00018330 finish: r=0
2022-04-19T12:57:39.832+0000 7fbe1d5ba700  1 -- 172.17.0.4:0/250165805 --> [v2:192.168.122.61:6800/1040034008,v1:192.168.122.61:6801/1040034008] -- osd_op(unknown.0.0:27 2.1b 2:de02d232:::rbd_header.11174dfcc329:head [notify-ack in=20b] snapc 0=[] ondisk+read+known_if_redirected e26552) v8 -- 0x7fbe0400a750 con 0x7fbe0c0023a0
2022-04-19T12:57:39.833+0000 7fbe1fdbf700  1 -- 172.17.0.4:0/250165805 <== osd.0 v2:192.168.122.61:6800/1040034008 27 ==== osd_op_reply(26 rbd_header.11174dfcc329 [notify cookie 140454020590992 out=8b] v0'0 uv3112 ondisk = 0) v8 ==== 167+0+8 (crc 0 0 0) 0x7fbe1801f040 con 0x7fbe0c0023a0
2022-04-19T12:57:39.837+0000 7fbe1fdbf700  1 -- 172.17.0.4:0/250165805 <== osd.0 v2:192.168.122.61:6800/1040034008 28 ==== osd_op_reply(27 rbd_header.11174dfcc329 [notify-ack] v0'0 uv3112 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7fbe1801f040 con 0x7fbe0c0023a0
2022-04-19T12:57:39.838+0000 7fbe1fdbf700  1 -- 172.17.0.4:0/250165805 <== osd.0 v2:192.168.122.61:6800/1040034008 29 ==== watch-notify(notify_complete (2) cookie 140454020590992 notify 114039971934468 ret 0) v3 ==== 42+0+58 (crc 0 0 0) 0x7fbe1800b650 con 0x7fbe0c0023a0
2022-04-19T12:57:39.838+0000 7fbe1d5ba700 20 librbd::watcher::Notifier: 0x7fbe00008b20 handle_notify: r=0
2022-04-19T12:57:39.838+0000 7fbe1d5ba700 20 librbd::watcher::Notifier: 0x7fbe00008b20 handle_notify: pending=0
2022-04-19T12:57:39.838+0000 7fbe1d5ba700 20 librbd::image_watcher::NotifyLockOwner: 0x7fbe1802d000 handle_notify: r=0

failing case:
  294949 2022-05-05T15:15:00.591+0000 7f2f2cfe6700 20 librbd::image_watcher::NotifyLockOwner: 0x55cecfe9d600 handle_notify: r=0
  294950 2022-05-05T15:15:00.591+0000 7f2f2cfe6700 20 librbd::image_watcher::NotifyLockOwner: 0x55cecfe9d600 handle_notify client responded with r=-30

we propogate that and we have a read only file system now:
  294965 2022-05-05T15:15:00.632+0000 7f2f2cfe6700 10 rbd::mirror::image_replayer::snapshot::Replayer: 0x55ce9b247f80 handle_unlink_peer: r=-30
  294966 2022-05-05T15:15:00.632+0000 7f2f2cfe6700 -1 rbd::mirror::image_replayer::snapshot::Replayer: 0x55ce9b247f80 handle_unlink_peer: failed to unlink local peer from remote image: (30) Read-only file system
  294967 2022-05-05T15:15:00.632+0000 7f2f2cfe6700 20 librbd::ImageState: 0x55ce9c142f20 handle_update_notification: refresh_seq = 11, last_refresh = 10

#3 Updated by Deepika Upadhyay almost 2 years ago

  11381622:2022-05-18T12:24:24.578+0000 7ff947026700 10 rbd::mirror::image_replayer::snapshot::Replayer: 0x55ebcec71a80 scan_remote_mirror_snapshots: remote image demoted
  11381623:2022-05-18T12:24:24.578+0000 7ff947026700 10
  rbd::mirror::image_replayer::snapshot::Replayer: 0x55ebcec71a80
  notify_status_updated:
  11381624:2022-05-18T12:24:24.578+0000 7ff947026700 10
  librbd::managed_lock::GetLockerRequest: 0x55ebf30a5880 finish: r=-2
  11381625:2022-05-18T12:24:24.578+0000 7ff947026700 10
  librbd::managed_lock::AcquireRequest: 0x55ebf2c2c4b0 handle_get_locker: r=-2
  11381626:2022-05-18T12:24:24.578+0000 7ff947026700 20
  librbd::managed_lock::AcquireRequest: 0x55ebf2c2c4b0 handle_get_locker: no
  lockers detected
  11381627:2022-05-18T12:24:24.578+0000 7ff947026700 10
  librbd::managed_lock::AcquireRequest: 0x55ebf2c2c4b0 send_lock:
  entity=client.4143, cookie=auto 94471353268864
  11381628:2022-05-18T12:24:24.578+0000 7ff94101a700 20 librbd::image_watcher::NotifyLockOwner: 0x55ebcd6e0980 send_notify
  11381629:2022-05-18T12:24:24.578+0000 7ff94101a700 20 librbd::watcher::Notifier: 0x55ebd747a0c0 notify: pending=1                                                                                                                               
  11381630:2022-05-18T12:24:24.578+0000 7ff947026700  1 -- 172.21.10.3:0/2697207087 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:159117 2.3 2:c5e91ef4:::rbd_header.102fd0a24993:head [call lock.lock        in=75b] snapc 0=[] ondisk+write+known_if_redirected+supports_pool_eio e700) v8 -- 0x55ebcd3ac000 con 0x55ebcc3dd000    

seeing for 172.21.10.3:0/2697207087 on the other side of cluster we see exclusive lock retrived:
1174301:2022-05-18T12:25:42.548+0000 7f2214c0d700 10 librbd::managed_lock::GetLockerRequest: 0x5633fdc6acb0 handle_get_lockers: retrieved exclusive locker: client.4143@172.21.10.3:0/2697207087
  1174308:2022-05-18T12:25:42.548+0000 7f2214c0d700 10 librbd::managed_lock::GetLockerRequest: 0x5633fc3948c0 finish: r=0
  1174309:2022-05-18T12:25:42.548+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5634223caa50 handle_get_locker: r=0
  1174310:2022-05-18T12:25:42.548+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5634223caa50 send_lock: entity=client.4141, cookie=auto 94780546079360
  1174311:2022-05-18T12:25:42.548+0000 7f2214c0d700  1 -- 172.21.10.3:0/3359576215 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:35919 2.11 2:8d1185d5:::rbd_header.102f3abca166:  head [call lock.lock in=75b] snapc 0=[] ondisk+write+known_if_redirected+supports_pool_eio e732) v8 -- 0x5633c1959400 con 0x5633c3198800
  1174312:2022-05-18T12:25:42.549+0000 7f221cc1d700  1 -- 172.21.10.3:0/2750107484 <== osd.0 v2:172.21.10.3:6800/617678 57809 ==== osd_op_reply(52820 rbd_mirroring [call,call] v800'12509 uv12509 ondisk = 0) v8      ==== 199+0+0 (crc 0 0 0) 0x5633c3252b40 con 0x5633c257ec00
  1174313:2022-05-18T12:25:42.549+0000 7f221a418700 10 rbd::mirror::MirrorStatusUpdater 0x5633bcba9080 handle_update_task:
  1174314:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9080 try_remove_mirror_image_status: global_image_id=f4c1fd60-900d-4038-b4af-a99c741fc9ef, queue_update=0,           immediate_update=0
  1174315:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 exists: global_image_id=f4c1fd60-900d-4038-b4af-a99c741fc9ef
  1174316:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::ImageReplayer: 0x5633c359e900 [2/f4c1fd60-900d-4038-b4af-a99c741fc9ef] remove_image_status_remote: removing remote mirror image status
  1174317:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=f4c1fd60-900d-4038-b4af-a99c741fc9ef, queue_update=0,           immediate_update=0
  1174318:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9080 try_remove_mirror_image_status: global_image_id=8eb8b7c7-535b-48c2-9f12-e187d50826bf, queue_update=0,           immediate_update=0
  1174319:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 exists: global_image_id=8eb8b7c7-535b-48c2-9f12-e187d50826bf
  1174320:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::ImageReplayer: 0x5633c3486c00 [2/8eb8b7c7-535b-48c2-9f12-e187d50826bf] remove_image_status_remote: removing remote mirror image status
  1174321:2022-05-18T12:25:42.549+0000 7f221a418700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=8eb8b7c7-535b-48c2-9f12-e187d50826bf, queue_update=0,           immediate_update=0
  1174322:2022-05-18T12:25:42.549+0000 7f2216c11700  1 -- 172.21.10.3:0/3359576215 <== osd.0 v2:172.21.10.3:6812/620286 36895 ==== osd_op_reply(35916 rbd_mirroring [call,call,call,call,call] v732'4045 uv4045        ondisk = 0) v8 ==== 325+0+0 (crc 0 0 0) 0x5633c537b680 con 0x5633c3198800
  1174323:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 handle_update_task:
  1174324:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 queue_update_task:
  1174325:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=9c820a40-a985-426d-a103-326720b18f42, queue_update=0,           immediate_update=0
  1174326:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::ImageReplayer: 0x5633c3658f00 [2/9c820a40-a985-426d-a103-326720b18f42] handle_shut_down: stop complete
  1174327:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=a05fa673-2404-4d7a-8f66-fb4cd11337ea, queue_update=0,           immediate_update=0
  1174328:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::ImageReplayer: 0x5633c3553500 [2/a05fa673-2404-4d7a-8f66-fb4cd11337ea] handle_shut_down: stop complete
  1174329:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=6c1ac387-2640-4260-babd-d5ac8be358c9, queue_update=0,           immediate_update=0
  1174330:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::ImageReplayer: 0x5633c3486000 [2/6c1ac387-2640-4260-babd-d5ac8be358c9] handle_shut_down: stop complete
  1174331:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=cdb6f66c-02cc-4062-9bf0-51f429976757, queue_update=0,           immediate_update=0
  1174332:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::ImageReplayer: 0x5633c3814300 [2/cdb6f66c-02cc-4062-9bf0-51f429976757] handle_shut_down: stop complete
  1174333:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=473e6552-65b0-43f3-b784-af0c72cdfc3b, queue_update=0,           immediate_update=0
  1174334:2022-05-18T12:25:42.549+0000 7f221440c700 10 rbd::mirror::ImageReplayer: 0x5633c37dc300 [2/473e6552-65b0-43f3-b784-af0c72cdfc3b] handle_shut_down: stop complete
  1174335:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=bf9c7437-fce5-4106-b536-6b5aac4fc985, queue_update=0,           immediate_update=0
  1174336:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=f4c1fd60-900d-4038-b4af-a99c741fc9ef, queue_update=0,           immediate_update=0
  1174337:2022-05-18T12:25:42.549+0000 7f221440c700 15 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 try_remove_mirror_image_status: global_image_id=8eb8b7c7-535b-48c2-9f12-e187d50826bf, queue_update=0,

  ...
  1174338:2022-05-18T12:25:42.549+0000 7f2221426700 10 rbd::mirror::MirrorStatusUpdater 0x5633bcba9ce0 update_task:
  1174339:2022-05-18T12:25:42.550+0000 7f2216c11700  1 -- 172.21.10.3:0/3359576215 <== osd.0 v2:172.21.10.3:6812/620286 36896 ==== osd_op_reply(35918 rbd_header.102f1d074d92 [call] v732'1732 uv1716 ondisk = -16     ((16) Device or resource busy)) v8 ==== 167+0+0 (crc 0 0 0) 0x5633c537b680 con 0x5633c3198800
  1174340:2022-05-18T12:25:42.556+0000 7f2221426700  1 -- 172.21.10.3:0/3359576215 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:35920 2.6 2:67cae2ed:::rbd_mirroring:head [call   rbd.mirror_image_status_set in=146b,call rbd.mirror_image_status_set in=146b,call rbd.mirror_image_status_set in=146b] snapc 0=[] ondisk+write+known_if_redirected+supports_pool_eio e732) v8 -- 0x5633c36f9800      con 0x5633c3198800
  1174341:2022-05-18T12:25:42.556+0000 7f2216c11700  1 -- 172.21.10.3:0/3359576215 <== osd.0 v2:172.21.10.3:6812/620286 36897 ==== osd_op_reply(35919 rbd_header.102f3abca166 [call] v732'1699 uv1689 ondisk = -16     ((16) Device or resource busy)) v8 ==== 167+0+0 (crc 0 0 0) 0x5633c537b680 con 0x5633c3198800
  1174342:2022-05-18T12:25:42.556+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5633d1be4d20 handle_lock: r=-16
  1174343:2022-05-18T12:25:42.556+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5633d1be4d20 send_break_lock:
  1174344:2022-05-18T12:25:42.556+0000 7f2214c0d700 10 librbd::managed_lock::BreakRequest: 0x5633cf044780 send_get_watchers:
  1174345:2022-05-18T12:25:42.556+0000 7f2214c0d700  1 -- 172.21.10.3:0/3359576215 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:35921 2.2 2:477722c1:::rbd_header.102f1d074d92:   head [list-watchers] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e732) v8 -- 0x5633c3794400 con 0x5633c3198800
  1174346:2022-05-18T12:25:42.556+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5634223caa50 handle_lock: r=-16
  1174347:2022-05-18T12:25:42.556+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5634223caa50 send_break_lock:
  1174348:2022-05-18T12:25:42.556+0000 7f2214c0d700 10 librbd::managed_lock::BreakRequest: 0x5633c3cf8b40 send_get_watchers:
  1174349:2022-05-18T12:25:42.556+0000 7f2214c0d700  1 -- 172.21.10.3:0/3359576215 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:35922 2.11 2:8d1185d5:::rbd_header.102f3abca166:  head [list-watchers] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e732) v8 -- 0x5633c37b6000 con 0x5633c3198800
  1174350:2022-05-18T12:25:42.557+0000 7f2216c11700  1 -- 172.21.10.3:0/3359576215 <== osd.0 v2:172.21.10.3:6812/620286 36898 ==== osd_op_reply(35921 rbd_header.102f1d074d92 [list-watchers out=134b] v0'0 uv1716     ondisk = 0) v8 ==== 167+0+134 (crc 0 0 0) 0x5633c537b680 con 0x5633c3198800
  1174351:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::managed_lock::BreakRequest: 0x5633cf044780 handle_get_watchers: r=0
  1174352:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::managed_lock::BreakRequest: 0x5633cf044780 handle_get_watchers: watcher=[addr=172.21.10.3:0/2697207087, entity=client.4143]
    1174353:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::managed_lock::BreakRequest: 0x5633cf044780 handle_get_watchers: lock owner is still alive
  1174354:2022-05-18T12:25:42.557+0000 7f2216c11700  1 -- 172.21.10.3:0/3359576215 <== osd.0 v2:172.21.10.3:6812/620286 36899 ==== osd_op_reply(35922 rbd_header.102f3abca166 [list-watchers out=134b] v0'0 uv1689     ondisk = 0) v8 ==== 167+0+134 (crc 0 0 0) 0x5633c537b680 con 0x5633c3198800
  1174355:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::managed_lock::BreakRequest: 0x5633cf044780 handle_get_watchers: watcher=[addr=172.21.10.3:0/3359576215, entity=client.4141]
  1174356:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::managed_lock::BreakRequest: 0x5633cf044780 finish: r=-11
  1174357:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::managed_lock::AcquireRequest: 0x5633d1be4d20 handle_break_lock: r=-11
  1174358:2022-05-18T12:25:42.557+0000 7f221440c700  5 librbd::managed_lock::AcquireRequest: 0x5633d1be4d20 handle_break_lock: lock owner is still alive
  1174359:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::ManagedLock: 0x5633ffb1afd8 handle_acquire_lock: r=-11
  1174360:2022-05-18T12:25:42.557+0000 7f221440c700  5 librbd::ManagedLock: 0x5633ffb1afd8 handle_acquire_lock: unable to acquire exclusive lock
  1174361:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::managed_lock::BreakRequest: 0x5633c3cf8b40 handle_get_watchers: r=0
  1174362:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::managed_lock::BreakRequest: 0x5633c3cf8b40 handle_get_watchers: watcher=[addr=172.21.10.3:0/2697207087, entity=client.4143]

  but we end up with lock owner being alive.

  1174363:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::ExclusiveLock: 0x5633ffb1afc0 post_acquire_lock_handler: r=-11
  1174364:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::managed_lock::BreakRequest: 0x5633c3cf8b40 handle_get_watchers: lock owner is still alive
  1174365:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::ImageState: 0x5633d0b713f0 handle_prepare_lock_complete
  1174366:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::managed_lock::BreakRequest: 0x5633c3cf8b40 handle_get_watchers: watcher=[addr=172.21.10.3:0/3359576215, entity=client.4141]
  1174367:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::managed_lock::BreakRequest: 0x5633c3cf8b40 finish: r=-11
  1174368:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::managed_lock::AcquireRequest: 0x5634223caa50 handle_break_lock: r=-11
  1174369:2022-05-18T12:25:42.557+0000 7f2214c0d700  5 librbd::managed_lock::AcquireRequest: 0x5634223caa50 handle_break_lock: lock owner is still alive
  1174370:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::ManagedLock: 0x5633d8f166d8 handle_acquire_lock: r=-11
  1174371:2022-05-18T12:25:42.557+0000 7f2214c0d700  5 librbd::ManagedLock: 0x5633d8f166d8 handle_acquire_lock: unable to acquire exclusive lock
  1174372:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::ExclusiveLock: 0x5633d8f166c0 post_acquire_lock_handler: r=-11
  1174373:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::ManagedLock: 0x5633ffb1afd8 handle_post_acquire_lock: r=0
  1174374:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::ImageState: 0x5633c46c9ad0 handle_prepare_lock_complete
  1174375:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::Operations: handle_acquire_exclusive_lock: r=0
  1174376:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::ManagedLock: 0x5633d8f166d8 handle_post_acquire_lock: r=0
  1174377:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::Operations: handle_acquire_exclusive_lock: r=0
  1174378:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::ManagedLock: 0x5633ffb1afd8 is_lock_owner: =0
  1174379:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::ManagedLock: 0x5633d8f166d8 is_lock_owner: =0
  1174380:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::Operations: send_remote_request
  1174381:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::Operations: send_remote_request
  1174382:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::ManagedLock: 0x5633ffb1afd8 is_lock_owner: =0
  1174383:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::ManagedLock: 0x5633d8f166d8 is_lock_owner: =0
  1174384:2022-05-18T12:25:42.557+0000 7f221440c700 10 librbd::ImageWatcher: 0x5633d0c7d600 async request: [4141,94781923207424,6]
  1174385:2022-05-18T12:25:42.557+0000 7f2214c0d700 10 librbd::ImageWatcher: 0x5633e0bd1a80 async request: [4141,94780546079360,5]
  1174386:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::ImageWatcher: scheduling async request time out: [4141,94781923207424,6]
  1174387:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::ImageWatcher: scheduling async request time out: [4141,94780546079360,5]
  1174388:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::image_watcher::NotifyLockOwner: 0x563413fea780 send_notify
  1174389:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::image_watcher::NotifyLockOwner: 0x5633fc730900 send_notify
  1174390:2022-05-18T12:25:42.557+0000 7f2214c0d700 20 librbd::watcher::Notifier: 0x5633e0bd1b40 notify: pending=1
  1174391:2022-05-18T12:25:42.557+0000 7f221440c700 20 librbd::watcher::Notifier: 0x5633d0c7d6c0 notify: pending=1
  1174392:2022-05-18T12:25:42.557+0000 7f2214c0d700  1 -- 172.21.10.3:0/3359576215 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:35923 2.11 2:8d1185d5:::rbd_header.102f3abca166:
  1174393:2022-05-18T12:25:42.557+0000 7f221440c700  1 -- 172.21.10.3:0/3359576215 --> [v2:172.21.10.3:6812/620286,v1:172.21.10.3:6813/620286] -- osd_op(unknown.0.0:35924 2.2 2:477722c1:::rbd_header.102f1d074d92:h
  1174394:2022-05-18T12:25:42.558+0000 7f2216c11700  1 -- 172.21.10.3:0/3359576215 <== osd.0 v2:172.21.10.3:6812/620286 36900 ==== watch-notify(notify (1) cookie 94780546079360 notify 3143916082444 ret 0) v3 ====
  1174395:2022-05-18T12:25:42.558+0000 7f2214c0d700  5 librbd::Watcher: 0x5633e0bd1a80 notifications_blocked: blocked=0
  1174396:2022-05-18T12:25:42.558+0000 7f2214c0d700 10 librbd::Watcher::C_NotifyAck 0x5633c635fc70 C_NotifyAck: id=3143916082444, handle=94780546079360

  seeing on other side this notify :

    11982889:2022-05-18T12:27:05.273+0000 7ff947026700 20 librbd::managed_lock::BreakRequest: 0x55ebded15540 handle_get_watchers: watcher=[addr=172.21.10.3:0/2697207087, entity=client.4143]
    11982890:2022-05-18T12:27:05.273+0000 7ff947026700 20 librbd::managed_lock::BreakRequest: 0x55ebded15540 handle_get_watchers: watcher=[addr=172.21.10.3:0/3359576215, entity=client.4141]
    11982891:2022-05-18T12:27:05.273+0000 7ff947026700 20 librbd::managed_lock::BreakRequest: 0x55ebded15540 handle_get_watchers: watcher=[addr=172.21.10.3:0/1974536754, entity=client.5203]   

#4 Updated by Deepika Upadhyay almost 2 years ago

  • Related to Bug #55852: [rbd-mirror] remote got demoted in non-primary without actually performing it added

#5 Updated by Deepika Upadhyay almost 2 years ago

  • Description updated (diff)

#6 Updated by Ilya Dryomov almost 2 years ago

  • Subject changed from rbd_mirror: failed to unlink local peer from remote image to [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error
  • Status changed from New to In Progress
  • Assignee set to Ilya Dryomov

#7 Updated by Ilya Dryomov almost 2 years ago

  • Status changed from In Progress to Fix Under Review
  • Backport set to octopus,pacific,quincy
  • Pull request ID set to 46759

#8 Updated by Ilya Dryomov almost 2 years ago

  • Related to deleted (Bug #55852: [rbd-mirror] remote got demoted in non-primary without actually performing it)

#9 Updated by Ilya Dryomov almost 2 years ago

  • Duplicated by Bug #55852: [rbd-mirror] remote got demoted in non-primary without actually performing it added

#10 Updated by Ilya Dryomov almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport

#11 Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56166: octopus: [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error added

#12 Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56167: quincy: [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error added

#13 Updated by Backport Bot almost 2 years ago

  • Copied to Backport #56168: pacific: [rbd-mirror] "failed to unlink local peer from remote image" due to EROFS error added

#14 Updated by Ilya Dryomov almost 2 years ago

  • Duplicated by Bug #54098: [rbd-mirror] hangs forever after split brain, while searching for demotion snapshot added

#15 Updated by Ilya Dryomov over 1 year ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF