Project

General

Profile

Actions

Bug #36580

closed

jewel: Snapshots changing "underneath"

Added by Christian Theune over 5 years ago. Updated almost 5 years ago.

Status:
Won't Fix
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

we're in the process of finally upgrading our old Hammer cluster to Jewel. I know that Jewel is already EOL and we'll be updating to the next newer version soon, too, however, we have a blocking issue that we can't move on without fixing it.

Our backup leverages RBD snapshots. Our backup scrubbing noticed broken images because the backup wouldn't be identical to the origin in some places.

I was able to pin this down to this happening:

root@patty /srv/backy # rbd export rbd.hdd/test04.root@backy-7FxpY53tLdfLjEuahVxTBN - | md5sum
Exporting image: 100% complete...done.
c9e398b525a6d4c6f906e8af296c9e2f  -

This is the cluster we are using in development to test our upgrade and it did see a couple of weird intermediate situations (like accidentally downgrading half the cluster to Hammer). The snapshot seems to become stable once I restart the Qemu process and a friend who is using the same mechanisms on Jewel has not reported an issue like this.

I wonder how to track this down further and repair the cluster. We'll be rolling this out to our staging environment in a bit and I'll check there again, but maybe something comes to mind that you could recommend us to check ...

PS: I can't select "10.2.11" as the affected version.


Files

debug-snapshot.log (109 KB) debug-snapshot.log Christian Theune, 10/25/2018 10:24 AM
osd.0.log (496 KB) osd.0.log Christian Theune, 10/31/2018 07:46 AM
Actions #1

Updated by Christian Theune over 5 years ago

Sorry, I pressed submit too quickly, here's the actual problematic output I wanted to report:

root@patty /srv/backy # rbd export rbd.hdd/test04.root@backy-y7sgJyr9cUZNPPHPXK9QpC - | md5sum
Exporting image: 100% complete...done.
001b2ff56b0e63a49f12badb52f3378a  -
root@patty /srv/backy # rbd export rbd.hdd/test04.root@backy-y7sgJyr9cUZNPPHPXK9QpC - | md5sum
Exporting image: 100% complete...done.
ff83ba00cc5b3a44b9a9d474000f5ce2  -

Maybe the rbd info helps, too, but there's nothing exciting from our POV:

root@cartman02 ~ # rbd info rbd.hdd/test04.root
rbd image 'test04.root':
    size 30720 MB in 7680 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.13554b23d1b58ba
    format: 2
    features: layering, striping
    flags:
    stripe unit: 4096 kB
    stripe count: 1

As I mentioned, restarting the VM (new Qemu process) seems to help for a bit but it reappears after a while.

Also, please note that we're running with this "approved" patch:
https://github.com/ceph/ceph/pull/24423

Not sure whether that patch might be related ... I would guess/hope not, but hey ...

Actions #2

Updated by Jason Dillaman over 5 years ago

  • Status changed from New to Need More Info
  • Severity deleted (1 - critical)

@Christian: I am not really sure what you mean when you say restarting QEMU helps for a bit? It helps what for a bit? Are you saying that you have an active VM writing to "rbd.hdd/test04.root" and if you export the snapshot while the VM is writing you get different values each time -- and when you restart the VM the snapshot stays consistent?

Actions #3

Updated by Christian Theune over 5 years ago

Sorry for the confusion, but you guessed right, that's exactly what we're seeing ... :/

Actions #4

Updated by Jason Dillaman over 5 years ago

@Christian: ... and am I correct that you are creating a new snapshot while the VM running? What librbd1 version is the QEMU process using?

Actions #5

Updated by Christian Theune over 5 years ago

So the process currently shows 1.0.0:

nobody   20001  8.6  9.9 8457712 3288236 ?     Sl   Oct22 258:09 qemu-system-x86_64 -daemonize -nodefaults -name test04,process=kvm.test04 -chroot /srv/vm/test04 -runas nobody -serial file:/var/log/vm/test04.log -display vnc=127.0.0.1:4608 -pidfile /run/qemu.test04.pid -vga std -m 3072 -readconfig /run/qemu.test04.cfg
root@quimby ~ # lsof -p 20001 | grep librbd
kvm.test0 20001 nobody  mem    REG              253,3 10982752     8591 /usr/lib64/librbd.so.1.0.0
root@quimby ~ # eix ceph
[I] sys-cluster/ceph
     Available versions:  0.56.3 ~0.67.7 ~0.72.2-r3 **0.79 {M}0.94.10-r1[1] 10.2.11-r2^t[1] **9999 {babeltrace cephfs cryptopp debug fuse gtk jemalloc ldap +libaio libatomic libxfs libzfs lttng +nss (+)radosgw static-libs (+)tcmalloc test xfs zfs PYTHON_TARGETS="python2_7 python3_4 python3_5"}
     Installed versions:  10.2.11-r2^t[1](02:37:46 PM 10/04/2018)(fuse ldap libaio libatomic nss radosgw tcmalloc xfs -babeltrace -cephfs -cryptopp -debug -gtk -jemalloc -lttng -static-libs -test -zfs PYTHON_TARGETS="python2_7 python3_4 -python3_5")
     Homepage:            https://ceph.com/
     Description:         Ceph distributed filesystem

[1] "gocept" /usr/local/portage
root@quimby ~ # equery b  /usr/lib64/librbd.so.1.0.0
 * Searching for /usr/lib64/librbd.so.1.0.0 ...
sys-cluster/ceph-10.2.11-r2 (/usr/lib64/librbd.so.1.0.0)
root@quimby ~ # ls -lah /usr/lib64/librbd.so.1.0.0
-rwxr-xr-x 1 root root 11M Oct  4 14:35 /usr/lib64/librbd.so.1.0.0
root@quimby ~ #
Actions #6

Updated by Jason Dillaman over 5 years ago

@Christian: it looks like you have client libraries from "ceph-10.2.11-r2" installed -- I'm not sure where those libraries came from since the latest Jewel release was 10.2.9.

Also, am I correct that you are creating a new snapshot while the VM running?

Actions #7

Updated by Christian Theune over 5 years ago

Yeah, we're taking the snapshot with the VM running (that's the point of it, right? ;) )

I honestly hate correcting you, but I think:

https://github.com/ceph/ceph/tree/v10.2.11
and
https://download.ceph.com/tarballs/ceph-10.2.11.tar.gz

those agree there's a 10.2.11 :)

Actions #8

Updated by Jason Dillaman over 5 years ago

Christian Theune wrote:

Yeah, we're taking the snapshot with the VM running (that's the point of it, right? ;) )

Yup, should be safe. It sounds like for whatever reason, the rbd CLI is not able to ping the QEMU librbd instance to notify it that a snapshot was created. You can start by running "rbd --debug-rbd=20 snap create ..." and attaching the resulting log.

Is the rbd CLI also from v10.2.11?

those agree there's a 10.2.11 :)

Indeed -- turns out my local jewel repo wasn't up-to-date with its tags.

Actions #9

Updated by Christian Theune over 5 years ago

Just to be sure, I checked a couple of servers, but we're building Ceph with a single big package and all servers are (back) on Jewel:

Qemu Server:

root@quimby ~ # rbd --version
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

Client that runs our backup software:

root@patty ~ # rbd --version
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

Here's the output from creating a new snapshot. It doesn't always trigger and I'm running exports again, which will take a bit ...

root@quimby ~ # rbd --debug-rbd=20 snap create rbd.hdd/test04.root@foobar
2018-10-24 15:31:43.152594 7f6108052d80  5 librbd::AioImageRequestWQ: 0x55d1b52c6700 : ictx=0x55d1b52c4c40
2018-10-24 15:31:43.152610 7f6108052d80 20 librbd::ImageState: 0x55d1b52be190 open
2018-10-24 15:31:43.152614 7f6108052d80 10 librbd::ImageState: 0x55d1b52be190 0x55d1b52be190 send_open_unlock
2018-10-24 15:31:43.152628 7f6108052d80 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_v2_detect_header
2018-10-24 15:31:43.155530 7f60e4ff9700 10 librbd::image::OpenRequest: handle_v2_detect_header: r=0
2018-10-24 15:31:43.155540 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_v2_get_id
2018-10-24 15:31:43.156860 7f60e4ff9700 10 librbd::image::OpenRequest: handle_v2_get_id: r=0
2018-10-24 15:31:43.156874 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_v2_get_immutable_metadata
2018-10-24 15:31:43.160356 7f60e4ff9700 10 librbd::image::OpenRequest: handle_v2_get_immutable_metadata: r=0
2018-10-24 15:31:43.160373 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_v2_get_stripe_unit_count
2018-10-24 15:31:43.161836 7f60e4ff9700 10 librbd::image::OpenRequest: handle_v2_get_stripe_unit_count: r=0
2018-10-24 15:31:43.161859 7f60e4ff9700 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.13554b23d1b58ba format rbd_data.13554b23d1b58ba.%016llx
2018-10-24 15:31:43.161866 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_v2_apply_metadata: start_key=conf_
2018-10-24 15:31:43.163096 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 handle_v2_apply_metadata: r=0
2018-10-24 15:31:43.163117 7f60e4ff9700 20 librbd::ImageCtx: apply_metadata
2018-10-24 15:31:43.163386 7f60e4ff9700 20 librbd::ImageCtx: enabling caching...
2018-10-24 15:31:43.163389 7f60e4ff9700 20 librbd::ImageCtx: Initial cache settings: size=33554432 num_objects=10 max_dirty=25165824 target_dirty=16777216 max_dirty_age=1
2018-10-24 15:31:43.163498 7f60e4ff9700 10 librbd::ImageCtx:  cache bytes 33554432 -> about 1048 objects
2018-10-24 15:31:43.163556 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_register_watch
2018-10-24 15:31:43.163767 7f60e4ff9700 10 librbd::ImageWatcher: 0x7f60c0000f30 registering image watcher
2018-10-24 15:31:43.176240 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 handle_register_watch: r=0
2018-10-24 15:31:43.176248 7f60e4ff9700 10 librbd::image::OpenRequest: 0x55d1b52c6fe0 send_refresh
2018-10-24 15:31:43.176252 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 send_v2_get_mutable_metadata
2018-10-24 15:31:43.178256 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 handle_v2_get_mutable_metadata: r=0
2018-10-24 15:31:43.178274 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 send_v2_get_flags
2018-10-24 15:31:43.179751 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 handle_v2_get_flags: r=0
2018-10-24 15:31:43.179765 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 send_v2_get_snapshots
2018-10-24 15:31:43.181531 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 handle_v2_get_snapshots: r=0
2018-10-24 15:31:43.181553 7f60e4ff9700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 send_v2_apply
2018-10-24 15:31:43.181690 7f60d7fff700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 handle_v2_apply
2018-10-24 15:31:43.181704 7f60d7fff700 20 librbd::image::RefreshRequest: 0x7f60c0008a50 apply
2018-10-24 15:31:43.181732 7f60d7fff700 20 librbd::image::RefreshRequest: new snapshot id=374574 name=backy-hir62HRmwEW2iJfsRZwzoS size=32212254720
2018-10-24 15:31:43.181755 7f60d7fff700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 send_flush_aio
2018-10-24 15:31:43.181765 7f60d7fff700 10 librbd::image::RefreshRequest: 0x7f60c0008a50 handle_flush_aio: r=0
2018-10-24 15:31:43.181774 7f60d7fff700 10 librbd::image::OpenRequest: handle_refresh: r=0
2018-10-24 15:31:43.181780 7f60d7fff700 10 librbd::ImageState: 0x55d1b52be190 0x55d1b52be190 handle_open: r=0
2018-10-24 15:31:43.181860 7f6108052d80  5 librbd::Operations: 0x55d1b52c5c00 snap_create: snap_name=foobar
2018-10-24 15:31:43.181876 7f6108052d80 20 librbd::Operations: send_local_request
2018-10-24 15:31:43.181880 7f6108052d80  5 librbd::Operations: 0x55d1b52c5c00 execute_snap_create: snap_name=foobar
2018-10-24 15:31:43.181902 7f6108052d80  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 send_suspend_requests
2018-10-24 15:31:43.181906 7f6108052d80  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 send_suspend_aio
2018-10-24 15:31:43.181909 7f6108052d80  5 librbd::AioImageRequestWQ: block_writes: 0x55d1b52c4c40, num=1
2018-10-24 15:31:43.182033 7f60d7fff700  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 handle_suspend_aio: r=0
2018-10-24 15:31:43.182042 7f60d7fff700  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 send_allocate_snap_id
2018-10-24 15:31:43.944196 7f60d7fff700  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 handle_allocate_snap_id: r=0, snap_id=374576
2018-10-24 15:31:43.944219 7f60d7fff700  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 send_create_snap
2018-10-24 15:31:43.959012 7f60e4ff9700  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 handle_create_snap: r=0
2018-10-24 15:31:43.959023 7f60e4ff9700  5 librbd::SnapshotCreateRequest: 0x55d1b52c73e0 update_snap_context
2018-10-24 15:31:43.959036 7f60e4ff9700  5 librbd::AioImageRequestWQ: unblock_writes: 0x55d1b52c4c40, num=0
2018-10-24 15:31:43.959046 7f60e4ff9700 10 librbd::Request: 0x55d1b52c73e0 create_context_finisher
2018-10-24 15:31:43.959050 7f60e4ff9700 10 librbd::Request: 0x55d1b52c73e0 finish: r=0
2018-10-24 15:31:43.959053 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52be190 handle_update_notification: refresh_seq = 1, last_refresh = 0
2018-10-24 15:31:43.959057 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52c5a80 ImageUpdateWatchers::notify
2018-10-24 15:31:43.959059 7f60e4ff9700 10 librbd::ImageWatcher: 0x7f60c0000f30: notify_header_update
2018-10-24 15:31:43.959088 7f60e4ff9700 20 librbd::image_watcher::Notifier: notify: pending=1
2018-10-24 15:31:43.960203 7f60e4ff9700 10 librbd::ImageWatcher: 0x7f60c00090b0 C_NotifyAck start: id=3609640839413762, handle=140053514847008
2018-10-24 15:31:43.960212 7f60e4ff9700 10 librbd::ImageWatcher: 0x7f60c0000f30 image header updated
2018-10-24 15:31:43.960214 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52be190 handle_update_notification: refresh_seq = 2, last_refresh = 0
2018-10-24 15:31:43.960216 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52c5a80 ImageUpdateWatchers::notify
2018-10-24 15:31:43.960218 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52be190 flush_update_watchers
2018-10-24 15:31:43.960219 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52c5a80 ImageUpdateWatchers::flush
2018-10-24 15:31:43.960221 7f60e4ff9700 20 librbd::ImageState: 0x55d1b52c5a80 ImageUpdateWatchers::flush: completing flush
2018-10-24 15:31:43.960222 7f60e4ff9700 10 librbd::ImageWatcher: 0x7f60c000bee0 C_ResponseMessage: r=0
2018-10-24 15:31:43.960227 7f60e4ff9700 10 librbd::ImageWatcher: 0x7f60c00090b0 C_NotifyAck finish: id=3609640839413762, handle=140053514847008
2018-10-24 15:31:43.961047 7f60e4ff9700 20 librbd::image_watcher::Notifier: handle_notify: pending=0
2018-10-24 15:31:43.961102 7f60d7fff700 20 librbd::Operations: handle_local_request: r=0
2018-10-24 15:31:43.961131 7f6108052d80 20 librbd::ImageState: 0x55d1b52be190 close
2018-10-24 15:31:43.961136 7f6108052d80 10 librbd::ImageState: 0x55d1b52be190 0x55d1b52be190 send_close_unlock
2018-10-24 15:31:43.961151 7f6108052d80 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_shut_down_update_watchers
2018-10-24 15:31:43.961155 7f6108052d80 20 librbd::ImageState: 0x55d1b52be190 shut_down_update_watchers
2018-10-24 15:31:43.961156 7f6108052d80 20 librbd::ImageState: 0x55d1b52c5a80 ImageUpdateWatchers::shut_down
2018-10-24 15:31:43.961157 7f6108052d80 20 librbd::ImageState: 0x55d1b52c5a80 ImageUpdateWatchers::shut_down: completing shut down
2018-10-24 15:31:43.961170 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_shut_down_update_watchers: r=0
2018-10-24 15:31:43.961173 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_unregister_image_watcher
2018-10-24 15:31:43.961175 7f60d7fff700 10 librbd::ImageWatcher: 0x7f60c0000f30 unregistering image watcher
2018-10-24 15:31:43.966040 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_unregister_image_watcher: r=0
2018-10-24 15:31:43.966047 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_shut_down_aio_queue
2018-10-24 15:31:43.966050 7f60d7fff700  5 librbd::AioImageRequestWQ: shut_down: in_flight=0
2018-10-24 15:31:43.966057 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_shut_down_aio_queue: r=0
2018-10-24 15:31:43.966060 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_flush
2018-10-24 15:31:43.966064 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_flush: r=0
2018-10-24 15:31:43.966066 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_flush_readahead
2018-10-24 15:31:43.966069 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_flush_readahead: r=0
2018-10-24 15:31:43.966071 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_shut_down_cache
2018-10-24 15:31:43.966122 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_shut_down_cache: r=0
2018-10-24 15:31:43.966128 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 send_flush_op_work_queue
2018-10-24 15:31:43.966132 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_flush_op_work_queue: r=0
2018-10-24 15:31:43.966136 7f60d7fff700 10 librbd::image::CloseRequest: 0x55d1b52b7d20 handle_flush_image_watcher: r=0
2018-10-24 15:31:43.966155 7f60d7fff700 10 librbd::ImageState: 0x55d1b52be190 0x55d1b52be190 handle_close: r=0
Actions #10

Updated by Jason Dillaman over 5 years ago

Nothing looks out-of-place in that log dump. I think the next step would be to add the following to your QEMU host's "ceph.conf" (adjust as necessary to ensure QEMU can write to the paths):

[client]
debug rbd = 20
log file = /var/log/ceph/qemu-guest-$pid.log
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

Restart QEMU to pick up the changes and re-create a snapshot and the issue. Basically, in the logs for QEMU, I would expect to see logs like:

librbd::ImageWatcher: ... C_NotifyAck start: id=..., handle=...
librbd::ImageWatcher: ... image header updated
librbd::ImageState: ... handle_update_notification: refresh_seq = X, last_refresh = Y
...
librbd::ImageState: ... refresh
...
librbd::image::RefreshRequest: ... apply
librbd::image::RefreshRequest: ... new snapshot id=... name=... size=..." 
Actions #11

Updated by Christian Theune over 5 years ago

Right, the specific snapshot is fine, too:

root@quimby ~ # rbd export rbd.hdd/test04.root@foobar - | md5sum
Exporting image: 100% complete...done.
f3d6fbb5308bc38668694facd082c94b  -
root@quimby ~ # rbd export rbd.hdd/test04.root@foobar - | md5sum
Exporting image: 100% complete...done.
f3d6fbb5308bc38668694facd082c94b  -

So I'll add that debugging level to our config and apply that to the VMs that triggered most often. I may need to let this run for a few days until it happens again ....

Thanks so far, I'll get back to you when I have data.

Actions #12

Updated by Christian Theune over 5 years ago

Alright, here's a Qemu log from a snapshot that failed tonight. Unfortunately the snapshot was removed afterwards. Newer snapshots have been stable again.

I have more log data around, limiting this to the relevant time where the snapshot was taken.

Actions #13

Updated by Irek Fasikhov over 5 years ago

Hi
Today I see the same problem on the Hummer release. VM in status off!
When cloning a disk and running VM "Boot Error..."

[root@ceph08p24 ceph]# ceph -v
ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90)
[root@ceph01p24 log]# rbd --debug-rbd=20 snap create vm-238-disk-2@test2
2018-10-25 13:55:06.943015 7f2b7b6af7c0 20 librbd::ImageCtx: enabling caching...
2018-10-25 13:55:06.943069 7f2b7b6af7c0 20 librbd::ImageCtx: Initial cache settings: size=1342177280 num_objects=10 max_dirty=25165824 target_dirty=16777216 max_dirty_age=1
2018-10-25 13:55:06.943252 7f2b7b6af7c0 20 librbd: open_image: ictx = 0x431d660 name = 'vm-238-disk-2' id = '' snap_name = ''
2018-10-25 13:55:06.970412 7f2b7b6af7c0 20 librbd: detect format of vm-238-disk-2 : new
2018-10-25 13:55:06.973022 7f2b7b6af7c0 10 librbd::ImageCtx:  cache bytes 1342177280 -> about 2000 objects
2018-10-25 13:55:06.973027 7f2b7b6af7c0 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.d813933d1b58ba format rbd_data.d813933d1b58ba.%016llx
2018-10-25 13:55:06.973090 7f2b7b6af7c0 10 librbd::ImageWatcher: 0x4326be0 registering image watcher
2018-10-25 13:55:06.974009 7f2b7b6af7c0 20 librbd: ictx_refresh 0x431d660
2018-10-25 13:55:06.976979 7f2b7b6af7c0 20 librbd: new snapshot id=30347 name=test size=34359738368 features=1
2018-10-25 13:55:06.976993 7f2b7b6af7c0 20 librbd: new snapshot id=30340 name=snap2510 size=34359738368 features=1
2018-10-25 13:55:06.976996 7f2b7b6af7c0 20 librbd: new snapshot id=28130 name=prod size=34359738368 features=1
2018-10-25 13:55:06.976997 7f2b7b6af7c0 20 librbd: new snapshot id=27933 name=configure size=34359738368 features=1
2018-10-25 13:55:06.977023 7f2b7b6af7c0 20 librbd::ImageCtx: waiting for cache to be flushed
2018-10-25 13:55:06.977026 7f2b7b6af7c0 20 librbd::ImageCtx: finished flushing cache
2018-10-25 13:55:06.977036 7f2b7b6af7c0 20 librbd: snap_create 0x431d660 test2
2018-10-25 13:55:06.977037 7f2b7b6af7c0 20 librbd: ictx_check 0x431d660
2018-10-25 13:55:06.977043 7f2b7b6af7c0 20 librbd: snap_create_helper 0x431d660 test2
2018-10-25 13:55:06.977043 7f2b7b6af7c0 20 librbd: ictx_check 0x431d660
2018-10-25 13:55:06.977045 7f2b7b6af7c0 20 librbd::ImageCtx: waiting for cache to be flushed
2018-10-25 13:55:06.977046 7f2b7b6af7c0 20 librbd::ImageCtx: finished flushing cache
2018-10-25 13:55:08.149165 7f2b7b6af7c0 20 librbd: notify_change refresh_seq = 0 last_refresh = 0
2018-10-25 13:55:08.149724 7f2b6d542700 10 librbd::ImageWatcher: 0x7f2b2c0008c0 C_NotifyAck start: id=470036925909236, handle=70448752
2018-10-25 13:55:08.149737 7f2b6d542700 10 librbd::ImageWatcher: 0x4326be0 image header updated
2018-10-25 13:55:08.149741 7f2b6d542700 10 librbd::ImageWatcher: 0x7f2b2c0008c0 C_NotifyAck finish: id=470036925909236, handle=70448752
2018-10-25 13:55:08.150311 7f2b7b6af7c0 20 librbd: close_image 0x431d660
2018-10-25 13:55:08.150319 7f2b7b6af7c0 10 librbd::ImageCtx: canceling async requests: count=0
2018-10-25 13:55:08.150371 7f2b7b6af7c0 10 librbd::ImageWatcher: 0x4326be0 unregistering image watcher

[root@ceph01p24 log]# rbd export vm-238-disk-2@test2 - | md5sum
Exporting image: 100% complete...done.
ebdb00fba79265639f393eaf1ba8cbd8  -
[root@ceph01p24 log]# rbd export vm-238-disk-2@test2 - | md5sum
Exporting image: 100% complete...done.
ebdb00fba79265639f393eaf1ba8cbd8  -
[root@ceph01p24 log]# rbd export vm-238-disk-2@test2 - | md5sum
Exporting image: 100% complete...done.
018b732ef3b50299cae9d6b2c1e18522  -
[root@ceph01p24 log]# rbd export vm-238-disk-2@test2 - | md5sum
Exporting image: 100% complete...done.
018b732ef3b50299cae9d6b2c1e18522  -
Actions #14

Updated by Irek Fasikhov over 5 years ago

additional tests

[root@ceph01p24 logs]# rbd export vm-238-disk-1@snap2510_2  ./test --debug-rbd=20 >&  log
[root@ceph01p24 logs]# rbd export vm-238-disk-1@snap2510_2  ./test2 --debug-rbd=20 >&  log2
[root@ceph01p24 logs]# rbd export vm-238-disk-1@snap2510_2  ./test3 --debug-rbd=20 >&  log3

[root@ceph01p24 logs]# md5sum test
62f622c7f19035a1e5b917e0a9225c31  test
[root@ceph01p24 logs]# md5sum test2
4abec29385525b7cad755a9faae8649d  test2
[root@ceph01p24 logs]# md5sum test3
62f622c7f19035a1e5b917e0a9225c31  test3

[root@ceph01p24 logs]# cmp -l test2 test > cmp_log
[root@ceph01p24 logs]# cmp -l test3 test > cmp_log

The offset starts at 42207281152 bytes
Log files: https://yadi.sk/d/qaFwzM6Pxf54PA

Actions #15

Updated by Christian Theune over 5 years ago

I'm still trying to get some meaningful output and to conserve the state of a broken snapshot. The organic use does show it very sporadically and my tooling is too eager to clean up after itself. I'm currently trying this to find a broken one:

(rbd snap purge rbd.hdd/test23.root; for x in {1..100}; do echo $x; date; rbd snap create rbd.hdd/test23.root@test$x; rbd export rbd.hdd/test23.root@test$x - | md5sum; done) &> /tmp/testrace

This now runs the CLI rbd client with 'debug rbd = 20' as well as the VM. It will likely take a while as the image is 30GiB large and the logging causes quite some output to appear ...

Actions #16

Updated by Christian Theune over 5 years ago

Ah, and of course that's a stupid loop, I'll need to read it twice for each snapshot, so ....

Actions #17

Updated by Christian Theune over 5 years ago

(Arg, the previous comment had broken markup which I didn't catch. Feel free to delete it. Here it is in proper form.)

Alright. I got some data. Here's two log files: one from the batch loop creating snapshots, and one from the qemu librbd client log.

The command I used to run this was:

(rbd snap purge rbd.hdd/test23.root; for x in {1..100}; do echo $x; date; rbd snap create rbd.hdd/test23.root@test$x; rbd export rbd.hdd/test23.root@test$x - | md5sum; rbd export rbd.hdd/test23.root@test$x - | md5sum; done) &> /tmp/testrace

In parallel I ran a fio in the VM with this command:

while true; do fio --name=test --rw=randrw --bs=4k --iodepth=64 --direct=1 --numjobs=10 --time_based --runtime=10 --size=100m; sleep 50; done

Interestingly, each of the snapshots was now broken, here's an extract of the testrace log file:

10
Mon Oct 29 10:17:31 CET 2018
Exporting image: 100% complete...done.
d57cc0aaf0265a5f652a8ad6b9d35f5a  -
Exporting image: 100% complete...done.
0869e0bd7adec5af79f7cd2801dbf274  -
11
Mon Oct 29 10:23:59 CET 2018
Exporting image: 100% complete...done.
827a357f9131b09ddb853b6b9c7faf80  -
Exporting image: 100% complete...done.
c58551fe04e8893677eb45f564eb8efa  -
12
Mon Oct 29 10:30:24 CET 2018
Exporting image: 100% complete...done.
0dbf48a07f901093547ec4ba7ffefea5  -
Exporting image: 100% complete...done.
29587faad6695020a78389e4105a67b7  -
13
Mon Oct 29 10:36:52 CET 2018
Exporting image: 100% complete...done.
0b6b1ec856cc6c1b17e62bc2d0a41746  -
Exporting image: 26% complete...^C

I extracted the Qemu log file for the event of run 12 (Mon Oct 29 10:30:24 CET 2018), limiting it to the period of 10:30-10:40 to reduze the amount of data I'm throwing over the fence here. The testrace log contains all runs as it's not as easy to filter while including the echoed output next to the librbd output.

I still have the VM running in this state and also the snapshots are still available for me for any further poking ...

As the packed files are still larger, I've uploaded them here:
http://shared00.fe.rzob.gocept.net/~ctheune/ceph-36580/

Just in case: if you like/want/need I can also be available interactively (I'm in CET) via IRC, Zoom or other measures ...

Actions #18

Updated by Christian Theune over 5 years ago

(Man, really sorry for the spam. I keep mistyping and it's quite unfortunate that I can't use the Redmine edit feature here ...)

Alright, I did dig around a little further and unfortunately the data I sent was incomplete (puppet removed the debugging options while the test was running for the client). One of the earlier runs has an interesting part on the Qemu side of things.

The snapshot being created (round 3 in the loop) looks like this:

2018-10-29 09:31:40.875153 7f6926dbfd80  5 librbd::AioImageRequestWQ: 0x561ce47167b0 : ictx=0x561ce4714cf0
2018-10-29 09:31:40.875177 7f6926dbfd80 20 librbd::ImageState: 0x561ce470e240 open
2018-10-29 09:31:40.875181 7f6926dbfd80 10 librbd::ImageState: 0x561ce470e240 0x561ce470e240 send_open_unlock
2018-10-29 09:31:40.875202 7f6926dbfd80 10 librbd::image::OpenRequest: 0x561ce4717090 send_v2_detect_header
2018-10-29 09:31:40.877326 7f68fb7fe700 10 librbd::image::OpenRequest: handle_v2_detect_header: r=0
2018-10-29 09:31:40.877340 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 send_v2_get_id
2018-10-29 09:31:40.878407 7f68fb7fe700 10 librbd::image::OpenRequest: handle_v2_get_id: r=0
2018-10-29 09:31:40.878424 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 send_v2_get_immutable_metadata
2018-10-29 09:31:40.880878 7f68fb7fe700 10 librbd::image::OpenRequest: handle_v2_get_immutable_metadata: r=0
2018-10-29 09:31:40.880894 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 send_v2_get_stripe_unit_count
2018-10-29 09:31:40.881734 7f68fb7fe700 10 librbd::image::OpenRequest: handle_v2_get_stripe_unit_count: r=-8
2018-10-29 09:31:40.881765 7f68fb7fe700 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.b8e5fa2eb141f2 format rbd_data.b8e5fa2eb141f2.%016llx
2018-10-29 09:31:40.881774 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 send_v2_apply_metadata: start_key=conf_
2018-10-29 09:31:40.882622 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 handle_v2_apply_metadata: r=0
2018-10-29 09:31:40.882647 7f68fb7fe700 20 librbd::ImageCtx: apply_metadata
2018-10-29 09:31:40.882877 7f68fb7fe700 20 librbd::ImageCtx: enabling caching...
2018-10-29 09:31:40.882888 7f68fb7fe700 20 librbd::ImageCtx: Initial cache settings: size=33554432 num_objects=10 max_dirty=25165824 target_dirty=16777216 max_dirty_age=1
2018-10-29 09:31:40.882986 7f68fb7fe700 10 librbd::ImageCtx:  cache bytes 33554432 -> about 1048 objects
2018-10-29 09:31:40.883037 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 send_register_watch
2018-10-29 09:31:40.883224 7f68fb7fe700 10 librbd::ImageWatcher: 0x7f68e0000f30 registering image watcher
2018-10-29 09:31:40.893348 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 handle_register_watch: r=0
2018-10-29 09:31:40.893362 7f68fb7fe700 10 librbd::image::OpenRequest: 0x561ce4717090 send_refresh
2018-10-29 09:31:40.893368 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 send_v2_get_mutable_metadata
2018-10-29 09:31:40.894622 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 handle_v2_get_mutable_metadata: r=0
2018-10-29 09:31:40.894645 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 send_v2_get_flags
2018-10-29 09:31:40.895679 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 handle_v2_get_flags: r=0
2018-10-29 09:31:40.895696 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 send_v2_get_snapshots
2018-10-29 09:31:40.897269 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 handle_v2_get_snapshots: r=0
2018-10-29 09:31:40.897293 7f68fb7fe700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 send_v2_apply
2018-10-29 09:31:40.897400 7f68faffd700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 handle_v2_apply
2018-10-29 09:31:40.897408 7f68faffd700 20 librbd::image::RefreshRequest: 0x7f68e0008a50 apply
2018-10-29 09:31:40.897421 7f68faffd700 20 librbd::image::RefreshRequest: new snapshot id=375273 name=test2 size=32212254720
2018-10-29 09:31:40.897430 7f68faffd700 20 librbd::image::RefreshRequest: new snapshot id=375272 name=test1 size=32212254720
2018-10-29 09:31:40.897446 7f68faffd700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 send_flush_aio
2018-10-29 09:31:40.897456 7f68faffd700 10 librbd::image::RefreshRequest: 0x7f68e0008a50 handle_flush_aio: r=0
2018-10-29 09:31:40.897463 7f68faffd700 10 librbd::image::OpenRequest: handle_refresh: r=0
2018-10-29 09:31:40.897468 7f68faffd700 10 librbd::ImageState: 0x561ce470e240 0x561ce470e240 handle_open: r=0
2018-10-29 09:31:40.897559 7f6926dbfd80  5 librbd::Operations: 0x561ce4715cb0 snap_create: snap_name=test3
2018-10-29 09:31:40.897582 7f6926dbfd80 20 librbd::Operations: send_local_request
2018-10-29 09:31:40.897586 7f6926dbfd80  5 librbd::Operations: 0x561ce4715cb0 execute_snap_create: snap_name=test3
2018-10-29 09:31:40.897601 7f6926dbfd80  5 librbd::SnapshotCreateRequest: 0x561ce4717490 send_suspend_requests
2018-10-29 09:31:40.897604 7f6926dbfd80  5 librbd::SnapshotCreateRequest: 0x561ce4717490 send_suspend_aio
2018-10-29 09:31:40.897607 7f6926dbfd80  5 librbd::AioImageRequestWQ: block_writes: 0x561ce4714cf0, num=1
2018-10-29 09:31:40.897672 7f68faffd700  5 librbd::SnapshotCreateRequest: 0x561ce4717490 handle_suspend_aio: r=0
2018-10-29 09:31:40.897680 7f68faffd700  5 librbd::SnapshotCreateRequest: 0x561ce4717490 send_allocate_snap_id
2018-10-29 09:31:41.175544 7f68faffd700  5 librbd::SnapshotCreateRequest: 0x561ce4717490 handle_allocate_snap_id: r=0, snap_id=375274
2018-10-29 09:31:41.175567 7f68faffd700  5 librbd::SnapshotCreateRequest: 0x561ce4717490 send_create_snap
2018-10-29 09:31:41.220048 7f68fb7fe700  5 librbd::SnapshotCreateRequest: 0x561ce4717490 handle_create_snap: r=0
2018-10-29 09:31:41.220062 7f68fb7fe700  5 librbd::SnapshotCreateRequest: 0x561ce4717490 update_snap_context
2018-10-29 09:31:41.220079 7f68fb7fe700  5 librbd::AioImageRequestWQ: unblock_writes: 0x561ce4714cf0, num=0
2018-10-29 09:31:41.220093 7f68fb7fe700 10 librbd::Request: 0x561ce4717490 create_context_finisher
2018-10-29 09:31:41.220098 7f68fb7fe700 10 librbd::Request: 0x561ce4717490 finish: r=0
2018-10-29 09:31:41.220103 7f68fb7fe700 20 librbd::ImageState: 0x561ce470e240 handle_update_notification: refresh_seq = 1, last_refresh = 0
2018-10-29 09:31:41.220108 7f68fb7fe700 20 librbd::ImageState: 0x561ce4715b30 ImageUpdateWatchers::notify
2018-10-29 09:31:41.220111 7f68fb7fe700 10 librbd::ImageWatcher: 0x7f68e0000f30: notify_header_update
2018-10-29 09:31:41.220153 7f68fb7fe700 20 librbd::image_watcher::Notifier: notify: pending=1
2018-10-29 09:31:41.220979 7f68fb7fe700 10 librbd::ImageWatcher: 0x7f68e00092f0 C_NotifyAck start: id=3614189209780254, handle=140088411456288
2018-10-29 09:31:41.220990 7f68fb7fe700 10 librbd::ImageWatcher: 0x7f68e0000f30 image header updated
2018-10-29 09:31:41.220993 7f68fb7fe700 20 librbd::ImageState: 0x561ce470e240 handle_update_notification: refresh_seq = 2, last_refresh = 0
2018-10-29 09:31:41.220996 7f68fb7fe700 20 librbd::ImageState: 0x561ce4715b30 ImageUpdateWatchers::notify
2018-10-29 09:31:41.221000 7f68fb7fe700 20 librbd::ImageState: 0x561ce470e240 flush_update_watchers
2018-10-29 09:31:41.221002 7f68fb7fe700 20 librbd::ImageState: 0x561ce4715b30 ImageUpdateWatchers::flush
2018-10-29 09:31:41.221004 7f68fb7fe700 20 librbd::ImageState: 0x561ce4715b30 ImageUpdateWatchers::flush: completing flush
2018-10-29 09:31:41.221006 7f68fb7fe700 10 librbd::ImageWatcher: 0x7f68e0010ee0 C_ResponseMessage: r=0
2018-10-29 09:31:41.221014 7f68fb7fe700 10 librbd::ImageWatcher: 0x7f68e00092f0 C_NotifyAck finish: id=3614189209780254, handle=140088411456288
2018-10-29 09:31:41.221701 7f68fb7fe700 20 librbd::image_watcher::Notifier: handle_notify: pending=0
2018-10-29 09:31:41.221735 7f68faffd700 20 librbd::Operations: handle_local_request: r=0
2018-10-29 09:31:41.221851 7f6926dbfd80 20 librbd::ImageState: 0x561ce470e240 close
2018-10-29 09:31:41.221872 7f6926dbfd80 10 librbd::ImageState: 0x561ce470e240 0x561ce470e240 send_close_unlock
2018-10-29 09:31:41.221877 7f6926dbfd80 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_shut_down_update_watchers
2018-10-29 09:31:41.221889 7f6926dbfd80 20 librbd::ImageState: 0x561ce470e240 shut_down_update_watchers
2018-10-29 09:31:41.221893 7f6926dbfd80 20 librbd::ImageState: 0x561ce4715b30 ImageUpdateWatchers::shut_down
2018-10-29 09:31:41.221894 7f6926dbfd80 20 librbd::ImageState: 0x561ce4715b30 ImageUpdateWatchers::shut_down: completing shut down
2018-10-29 09:31:41.222032 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_shut_down_update_watchers: r=0
2018-10-29 09:31:41.222043 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_unregister_image_watcher
2018-10-29 09:31:41.222047 7f68faffd700 10 librbd::ImageWatcher: 0x7f68e0000f30 unregistering image watcher
2018-10-29 09:31:41.228356 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_unregister_image_watcher: r=0
2018-10-29 09:31:41.228369 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_shut_down_aio_queue
2018-10-29 09:31:41.228375 7f68faffd700  5 librbd::AioImageRequestWQ: shut_down: in_flight=0
2018-10-29 09:31:41.228383 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_shut_down_aio_queue: r=0
2018-10-29 09:31:41.228387 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_flush
2018-10-29 09:31:41.228392 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_flush: r=0
2018-10-29 09:31:41.228394 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_flush_readahead
2018-10-29 09:31:41.228397 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_flush_readahead: r=0
2018-10-29 09:31:41.228399 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_shut_down_cache
2018-10-29 09:31:41.228637 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_shut_down_cache: r=0
2018-10-29 09:31:41.228651 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 send_flush_op_work_queue
2018-10-29 09:31:41.228659 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_flush_op_work_queue: r=0
2018-10-29 09:31:41.228664 7f68faffd700 10 librbd::image::CloseRequest: 0x561ce4707dd0 handle_flush_image_watcher: r=0
2018-10-29 09:31:41.228688 7f68faffd700 10 librbd::ImageState: 0x561ce470e240 0x561ce470e240 han

The log output from the affected VM during this time is suprisingly silent for about 11 seconds starting just right before the snapshot creation:

2018-10-29 09:31:40.163998 7f40a3fff700 10 librbd::ImageCtx: prune_parent_extents image overlap 0, object overlap 0 from image extents []
2018-10-29 09:31:40.164007 7f40a3fff700 20 librbd::AioObjectRequest: send write 0x7f409c009330 rbd_data.b8e5fa2eb141f2.0000000000000625 1507328~4096
2018-10-29 09:31:40.164012 7f40a3fff700 20 librbd::AioObjectRequest: send_write 0x7f409c009330 rbd_data.b8e5fa2eb141f2.0000000000000625 1507328~4096 object exist 1 write_full 0
2018-10-29 09:31:40.164016 7f40a3fff700 20 librbd::AioObjectRequest: send_write 0x7f409c009330 rbd_data.b8e5fa2eb141f2.0000000000000625 1507328~4096 object exist 1
2018-10-29 09:31:40.168765 7f40b0ff9700 20 librbd::AioObjectRequest: write 0x7f409c009330 rbd_data.b8e5fa2eb141f2.0000000000000625 1507328~4096 should_complete: r = 0
2018-10-29 09:31:40.168785 7f40b0ff9700 20 librbd::AioObjectRequest: WRITE_FLAT
2018-10-29 09:31:40.168789 7f40b0ff9700 20 librbd::AioObjectRequest: complete 0x7f409c009330
2018-10-29 09:31:40.168793 7f40b0ff9700 20 librbd::AioCompletion: 0x5642ab904c00 complete_request: cb=1, pending=0
2018-10-29 09:31:40.168796 7f40b0ff9700 20 librbd::AioCompletion: 0x5642ab904c00 finalize: r=0, read_buf=0, real_bl=0
2018-10-29 09:31:40.168813 7f40b0ff9700 20 librbd::AsyncOperation: 0x5642ab904d38 finish_op
# XXX ?!?! ... what does this pause tell us?
2018-10-29 09:31:51.795708 7f40d751bb80 20 librbd::AioImageRequestWQ: aio_write: ictx=0x5642aa700d20, completion=0x5642ab7e0dc0, off=8425472, len=4096, flags=0
2018-10-29 09:31:51.795752 7f40d751bb80 20 librbd::AioImageRequestWQ: queue: ictx=0x5642aa700d20, req=0x5642aa6b03b0
2018-10-29 09:31:51.795851 7f4043fff700 20 librbd::AsyncOperation: 0x5642ab7e0ef8 start_op
2018-10-29 09:31:51.795863 7f4043fff700 20 librbd::AioImageRequestWQ: process: ictx=0x5642aa700d20, req=0x5642aa6b03b0
2018-10-29 09:31:51.795868 7f4043fff700 20 librbd::AioImageRequest: aio_write: ictx=0x5642aa700d20, completion=0x5642ab7e0dc0
2018-10-29 09:31:51.795894 7f4043fff700 20 librbd::AioCompletion: 0x5642ab7e0dc0 set_request_count: pending=1
2018-10-29 09:31:51.795899 7f4043fff700 20 librbd::AioImageRequest:  oid rbd_data.b87d27515f007c.0000000000000002 36864~4096 from [0,4096]
2018-10-29 09:31:51.795918 7f4043fff700 10 librbd::ImageCtx: prune_parent_extents image overlap 0, object overlap 0 from image extents []

Also note that those have been collected on the same host, so the snapshot creation log and the running VM log should all have proper micro timestamps stemming from the very same clock.

Actions #19

Updated by Christian Theune over 5 years ago

Alright, so you mentioned what the log should look like and ... tada ... the Qemu side doesn't show any indication of the `handle_update_notification` call. I went back a step and apparently Qemu has lost the watch:

root@quimby ~ # rbd status rbd.hdd/test23.root
Watchers: none

I guess I'll dig around for other bugs under what conditions watchers got lost in the past ...

Actions #20

Updated by Christian Theune over 5 years ago

Ok, this is going somewhere, apparently the watcher got lost a 'long time ago':

root@quimby ~ # zgrep -i "Watch" /var/log/ceph/client-27993.log
2018-10-24 16:04:15.240357 7f40b0ff9700 10 librbd::image::OpenRequest: 0x5642aa6ab240 send_register_watch
2018-10-24 16:04:15.240511 7f40b0ff9700 10 librbd::ImageWatcher: 0x7f408c000c10 registering image watcher
2018-10-24 16:04:15.246914 7f40b0ff9700 10 librbd::image::OpenRequest: 0x5642aa6ab240 handle_register_watch: r=0
2018-10-24 16:04:15.267749 7f4080ff9700 10 librbd::image::OpenRequest: 0x5642aa6de4f0 send_register_watch
2018-10-24 16:04:15.267872 7f4080ff9700 10 librbd::ImageWatcher: 0x7f405c002330 registering image watcher
2018-10-24 16:04:15.280427 7f4080ff9700 10 librbd::image::OpenRequest: 0x5642aa6de4f0 handle_register_watch: r=0
2018-10-24 16:04:15.303295 7f4050ff9700 10 librbd::image::OpenRequest: 0x5642aa701f90 send_register_watch
2018-10-24 16:04:15.303419 7f4050ff9700 10 librbd::ImageWatcher: 0x7f402c002330 registering image watcher
2018-10-24 16:04:15.313604 7f4050ff9700 10 librbd::image::OpenRequest: 0x5642aa701f90 handle_register_watch: r=0
2018-10-24 16:25:50.688595 7f40b0ff9700 -1 librbd::ImageWatcher: 0x7f408c000c10 image watch failed: 139915203449904, (107) Transport endpoint is not connected
2018-10-24 16:25:50.688617 7f40b0ff9700 10 librbd::ImageWatcher: 0x7f408c000c10 current lock owner: [0,0]
2018-10-24 16:25:50.688755 7f40a2ffd700 10 librbd::ImageWatcher: 0x7f408c000c10 re-registering image watch
2018-10-24 16:25:50.688787 7f40a2ffd700 10 librbd::image_watcher::RewatchRequest: 0x7f3f6c0025e0 unwatch
2018-10-24 16:25:50.690942 7f40b0ff9700 10 librbd::image_watcher::RewatchRequest: 0x7f3f6c0025e0 handle_unwatch r=0
2018-10-24 16:25:50.690956 7f40b0ff9700 10 librbd::image_watcher::RewatchRequest: 0x7f3f6c0025e0 rewatch
2018-10-24 16:25:50.695590 7f40b0ff9700 -1 librbd::ImageWatcher: 0x7f408c000c10 image watch failed: 139915203449904, (107) Transport endpoint is not connected
2018-10-24 16:25:50.695607 7f40b0ff9700 10 librbd::ImageWatcher: 0x7f408c000c10 current lock owner: [0,0]
2018-10-24 16:25:50.697635 7f40b0ff9700 10 librbd::image_watcher::RewatchRequest: 0x7f3f6c0025e0 handle_rewatch r=0
2018-10-24 16:25:50.697643 7f40b0ff9700 10 librbd::image_watcher::RewatchRequest: 0x7f3f6c0025e0 finish r=0
2018-10-24 16:25:50.697647 7f40b0ff9700 10 librbd::ImageWatcher: 0x7f408c000c10 handle_rewatch: r=0
2018-10-24 16:25:50.697650 7f40b0ff9700 10 librbd::ImageWatcher: 0x7f408c000c10 image header updated
2018-10-24 16:25:50.697663 7f40b0ff9700 20 librbd::ImageState: 0x5642aa6b9610 ImageUpdateWatchers::notify

Interestingly enough I'm stumbling over this issue that a colleague of mine who's been using a very similar approach towards snapshotting and creating backups:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1607694

Our cluster did show latency issues at that time:

2018-10-24 16:24:58.150218 osd.0 172.20.4.6:6801/873162 58 : cluster [WRN] 291 slow requests, 5 included below; oldest blocked for > 30.812638 secs
2018-10-24 16:24:58.150223 osd.0 172.20.4.6:6801/873162 59 : cluster [WRN] slow request 30.812638 seconds old, received at 2018-10-24 16:24:27.337492: osd_repop(client.98262053.0:15808331 616.26d 616:b6452d5c:::rbd_data.bc2139238e1f29.
2018-10-24 16:24:58.150228 osd.0 172.20.4.6:6801/873162 60 : cluster [WRN] slow request 30.786387 seconds old, received at 2018-10-24 16:24:27.363742: osd_repop(client.98262053.0:15808360 616.31b 616:d8c5a140:::rbd_data.bc2139238e1f29.
2018-10-24 16:24:58.150233 osd.0 172.20.4.6:6801/873162 61 : cluster [WRN] slow request 30.721019 seconds old, received at 2018-10-24 16:24:27.429111: osd_op(client.98262053.0:15808426 616.6eb57f8f rbd_data.bc2139238e1f29.0000000000000
2018-10-24 16:24:58.150236 osd.0 172.20.4.6:6801/873162 62 : cluster [WRN] slow request 30.585740 seconds old, received at 2018-10-24 16:24:27.564390: osd_repop(client.98262053.0:15808561 616.6d 616:b60be8ec:::rbd_data.bc2139238e1f29.0
2018-10-24 16:24:58.150239 osd.0 172.20.4.6:6801/873162 63 : cluster [WRN] slow request 30.502090 seconds old, received at 2018-10-24 16:24:27.648040: osd_op(client.98262053.0:15808646 616.4ec15934 rbd_data.bc2139238e1f29.0000000000001

So that does explain the silence in the Qemu side of things, until it resolves:

2018-10-24 16:25:50.161530 osd.0 172.20.4.6:6801/873162 370 : cluster [WRN] 331 slow requests, 5 included below; oldest blocked for > 46.603743 secs
2018-10-24 16:25:50.161542 osd.0 172.20.4.6:6801/873162 371 : cluster [WRN] slow request 31.342276 seconds old, received at 2018-10-24 16:25:18.819136: osd_repop(client.101899129.0:707984 616.9b 616:d91e929d:::rbd_data.ec4e6d6b8b4567.0
2018-10-24 16:25:50.161545 osd.0 172.20.4.6:6801/873162 372 : cluster [WRN] slow request 32.914782 seconds old, received at 2018-10-24 16:25:17.246629: osd_repop(client.101014336.0:14382038 616.3c6 616:63e61e9e:::rbd_data.9a59702ae8944
2018-10-24 16:25:50.161548 osd.0 172.20.4.6:6801/873162 373 : cluster [WRN] slow request 31.930717 seconds old, received at 2018-10-24 16:25:18.230694: osd_repop(client.101014336.0:14382081 616.298 616:1976fb5a:::rbd_data.9a59702ae8944
2018-10-24 16:25:50.161550 osd.0 172.20.4.6:6801/873162 374 : cluster [WRN] slow request 46.460521 seconds old, received at 2018-10-24 16:25:03.700891: osd_repop(client.101014336.0:14381870 616.120 616:04b11d99:::rbd_data.9a59702ae8944
2018-10-24 16:25:50.161553 osd.0 172.20.4.6:6801/873162 375 : cluster [WRN] slow request 46.297014 seconds old, received at 2018-10-24 16:25:03.864398: osd_repop(client.101014336.0:14381890 616.120 616:04b11d99:::rbd_data.9a59702ae8944

Side-note: earlier this day we were struggling with network issues (nf_conntrack problems but that was cleanly resolved around 6 hours earlier).

There seems to be something happening on OSD.0 at that point that is a bit strange. I see the queue growing but also IO happening. No complaints from the OSD that I can see.

Here's metrics for the OSD and affected disk:
https://stats.flyingcircus.io/grafana/dashboard/snapshot/960dBMeY76XiSDfUOinH6vs3s0Gv2Pes?orgId=1

We're logging with this on the OSDs:

debug filestore = 4
debug mon = 4
debug osd = 4
debug journal = 4
debug throttle = 4

So I guess there could be two things at play:

1. There's still a race condition somewhere.
2. We silently give up on the watcher but allow the RBD connection to continue working.

Apparently the VM was blocked already due to slow requests anyway, I guess we can just keep blocking until we do get a watcher ... not sure whether the 'Transport endpoint not connected' might be something completely different, though. I've seen various fixes/commits around this error message both due to race conditions and due to the targeted object changing unexpectedly.

Finding the source of the slow requests would also be interesting, but that appears to be maybe be a different story as those shouldn't result in a lost watch anyway ...

Actions #21

Updated by Christian Theune over 5 years ago

Ooooh, and it gets more mysterious even as I've stumbled over another bug report here that I was also already involved in a while ago:
http://tracker.ceph.com/issues/17913

It appears I've seen a watch-related issue a while ago, too, but we haven't seen it completely getting lost back then. (Also, that was still on Hammer and we're talking about Jewel here).

I guess I'll review change logs of the newer releases carefully for anything that indicates issues in this area that have been fixed that might be available for backporting ...

Actions #22

Updated by Jason Dillaman over 5 years ago

@Christian: sorry for the delay. It definitely seems clear that you lost your watch (which would definitely explain the changing snapshots), but it should have re-established itself. In your "/var/log/ceph/client-27993.log" log, was that for the test23 image? Was that for a Jewel-based librbd QEMU instance?

Actions #23

Updated by Christian Theune over 5 years ago

No worries, thanks for picking this up.

I've got everything still in the state from my analysis, so the client log does correspond to the machine and it's Jewel-based:

root@quimby /run/ceph # ceph --admin-daemon rbd-27993-94844326587664.asok version {"version":"10.2.11"}

Let me know if you need anything from the running system or more log data. I'll be hanging around as @theuni in #ceph-devel today.

Actions #24

Updated by Jason Dillaman over 5 years ago

@Christian: you have log for that VM from at least the 24th (when it lost and supposedly regained the watch) all the way to the 29th when it became apparent that the watch was lost? If so, was the 24th really the last time you saw anything from the "ImageWatcher"? If possible, it might be nice to re-run with "debug objecter = 20" and "debug ms = 1" for the VM so that it records the messages to and from the OSDs. It's quite possible there is some edge condition where perhaps the PG associated w/ the image header migrated or something similar and the watch was transparently lost and not reacquired.

Actions #25

Updated by Christian Theune over 5 years ago

Right. I've uploaded the full log here:
http://shared00.fe.rzob.gocept.net/~ctheune/ceph-36580/

I'm restarting the VM with the adjusted debug settings now. I'll leave debug rbd = 20 in.

Actions #26

Updated by Christian Theune over 5 years ago

Alright, I'm not able to reproduce this with the logging enabled.

Things I tried:

  • letting the VM just sit around
  • running fio on the OSD disks to enfore slow requests
  • causing OSDs to flap on/off
  • blocking storage access network on OSDs
  • blocking storage access network on KVM host
  • SIGSTOPing OSDs (with noout/nodown)

Things I noticed:

  • The original log up there would indicate that the messages would indicate a successful re-establishment of the watch (at 2018-10-24 16:25:50). (Looking at the code r=0 tells me that it should be a success.)
  • I didn't have any moving PGs at that point according to the cluster log.
  • Two OSDs (0 and 6) which are relevant did show a weird disk access pattern around that time and were piling up slow requests, so that would be a trigger of loosing and re-establishing the watch.
  • That means we actually don't really know when the watch was lost ... (except that we never hear about the watch on that log level again). This maybe indicates that the client didn't know that the watch was lost - which kinda makes sense because otherwise the client would be blocking writes. So the cluster somehow lost the watch that the client thinks it still has ... Maybe that theory is a path we could follow? Obviously there would be inconsistencies like the client still doing 'watch ping cookie' operations and something lying about it ...

Something I'll try next: identify maybe another long running VM that still has this issue and look at the data on the header object, maybe that indicates something ...

Actions #27

Updated by Christian Theune over 5 years ago

Alright, I found a VM that has been online for a while and where one of the disks is in the mentioned state.
I managed to increase the log level for that VM and have used the asok to set:

    "debug_ms": "1\/1",
    "debug_rbd": "20\/20",
    "debug_objecter": "20\/20",

Here's a bit of log output and what I'm missing is the `watch ping cookie` operation that I usually see with `rbd 20` on other connections ...

I'll keep this VM online and I can poke it interactivately. I'll be on IRC in case that helps.

Nov  7 11:19:08 clancy qemu-system-x86_64: 2018-11-07 11:19:08.664787 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6811/806558 -- ping magic: 0 v1 -- ?+0 0x7ff5cc00ab40 con 0x7ff66c01e5f0
Nov  7 11:19:09 clancy qemu-system-x86_64: 2018-11-07 11:19:09.927657 7ff6777fe700  1 -- 172.20.4.90:0/1264248106 <== mon.0 172.20.4.6:6789/0 87739 ==== osd_map(927855..927855 src has 927297..927855) v3 ==== 521+0+0 (3330975465 0 0) 0x7ff6400025a0 con 0x7ff66c04ae90
Nov  7 11:19:09 clancy qemu-system-x86_64: 2018-11-07 11:19:09.927685 7ff6777fe700 10 client.100603686.objecter ms_dispatch 0x564526ec8d90 osd_map(927855..927855 src has 927297..927855) v3
Nov  7 11:19:09 clancy qemu-system-x86_64: 2018-11-07 11:19:09.927704 7ff6777fe700  3 client.100603686.objecter handle_osd_map got epochs [927855,927855] > 927854
Nov  7 11:19:09 clancy qemu-system-x86_64: 2018-11-07 11:19:09.927708 7ff6777fe700  3 client.100603686.objecter handle_osd_map decoding incremental epoch 927855
Nov  7 11:19:09 clancy qemu-system-x86_64: 2018-11-07 11:19:09.927746 7ff6777fe700 10 client.100603686.objecter  checking linger op 3
Nov  7 11:19:09 clancy qemu-system-x86_64: 2018-11-07 11:19:09.927772 7ff6777fe700 20 client.100603686.objecter dump_active .. 0 homeless
Nov  7 11:19:13 clancy qemu-system-x86_64: 2018-11-07 11:19:13.665009 7ff677fff700 10 client.100603686.objecter tick
Nov  7 11:19:13 clancy qemu-system-x86_64: 2018-11-07 11:19:13.665034 7ff677fff700 10 client.100603686.objecter  pinging osd that serves lingering tid 3 (osd.6)
Nov  7 11:19:13 clancy qemu-system-x86_64: 2018-11-07 11:19:13.665116 7ff677fff700 10 client.100603686.objecter _maybe_request_map subscribing (onetime) to next osd map
Nov  7 11:19:13 clancy qemu-system-x86_64: 2018-11-07 11:19:13.665141 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6789/0 -- mon_subscribe({osdmap=927856}) v2 -- ?+0 0x7ff5cc0185e0 con 0x7ff66c04ae90
Nov  7 11:19:13 clancy qemu-system-x86_64: 2018-11-07 11:19:13.665165 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6811/806558 -- ping magic: 0 v1 -- ?+0 0x7ff5cc00c4f0 con 0x7ff66c01e5f0
Nov  7 11:19:15 clancy qemu-system-x86_64: 2018-11-07 11:19:15.432819 7ff6777fe700  1 -- 172.20.4.90:0/1264248106 <== mon.0 172.20.4.6:6789/0 87740 ==== osd_map(927856..927856 src has 927297..927856) v3 ==== 521+0+0 (1297441584 0 0) 0x7ff6400025a0 con 0x7ff66c04ae90
Nov  7 11:19:15 clancy qemu-system-x86_64: 2018-11-07 11:19:15.432847 7ff6777fe700 10 client.100603686.objecter ms_dispatch 0x564526ec8d90 osd_map(927856..927856 src has 927297..927856) v3
Nov  7 11:19:15 clancy qemu-system-x86_64: 2018-11-07 11:19:15.432865 7ff6777fe700  3 client.100603686.objecter handle_osd_map got epochs [927856,927856] > 927855
Nov  7 11:19:15 clancy qemu-system-x86_64: 2018-11-07 11:19:15.432869 7ff6777fe700  3 client.100603686.objecter handle_osd_map decoding incremental epoch 927856
Nov  7 11:19:15 clancy qemu-system-x86_64: 2018-11-07 11:19:15.432908 7ff6777fe700 10 client.100603686.objecter  checking linger op 3
Nov  7 11:19:15 clancy qemu-system-x86_64: 2018-11-07 11:19:15.432933 7ff6777fe700 20 client.100603686.objecter dump_active .. 0 homeless
Nov  7 11:19:18 clancy qemu-system-x86_64: 2018-11-07 11:19:18.665284 7ff677fff700 10 client.100603686.objecter tick
Nov  7 11:19:18 clancy qemu-system-x86_64: 2018-11-07 11:19:18.665309 7ff677fff700 10 client.100603686.objecter  pinging osd that serves lingering tid 3 (osd.6)
Nov  7 11:19:18 clancy qemu-system-x86_64: 2018-11-07 11:19:18.665319 7ff677fff700 10 client.100603686.objecter _maybe_request_map subscribing (onetime) to next osd map
Nov  7 11:19:18 clancy qemu-system-x86_64: 2018-11-07 11:19:18.665343 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6789/0 -- mon_subscribe({osdmap=927857}) v2 -- ?+0 0x7ff5cc00c4f0 con 0x7ff66c04ae90
Nov  7 11:19:18 clancy qemu-system-x86_64: 2018-11-07 11:19:18.665367 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6811/806558 -- ping magic: 0 v1 -- ?+0 0x7ff5cc00d560 con 0x7ff66c01e5f0
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.056274 7ff6777fe700  1 -- 172.20.4.90:0/1264248106 <== mon.0 172.20.4.6:6789/0 87741 ==== osd_map(927857..927857 src has 927297..927857) v3 ==== 521+0+0 (1385899805 0 0) 0x7ff6400025a0 con 0x7ff66c04ae90
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.056302 7ff6777fe700 10 client.100603686.objecter ms_dispatch 0x564526ec8d90 osd_map(927857..927857 src has 927297..927857) v3
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.056320 7ff6777fe700  3 client.100603686.objecter handle_osd_map got epochs [927857,927857] > 927856
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.056325 7ff6777fe700  3 client.100603686.objecter handle_osd_map decoding incremental epoch 927857
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.056367 7ff6777fe700 10 client.100603686.objecter  checking linger op 3
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.056393 7ff6777fe700 20 client.100603686.objecter dump_active .. 0 homeless
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.665481 7ff677fff700 10 client.100603686.objecter tick
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.665505 7ff677fff700 10 client.100603686.objecter  pinging osd that serves lingering tid 3 (osd.6)
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.665516 7ff677fff700 10 client.100603686.objecter _maybe_request_map subscribing (onetime) to next osd map
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.665541 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6789/0 -- mon_subscribe({osdmap=927858}) v2 -- ?+0 0x7ff5cc00c4f0 con 0x7ff66c04ae90
Nov  7 11:19:23 clancy qemu-system-x86_64: 2018-11-07 11:19:23.665568 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6811/806558 -- ping magic: 0 v1 -- ?+0 0x7ff5cc00c6e0 con 0x7ff66c01e5f0
Nov  7 11:19:28 clancy qemu-system-x86_64: 2018-11-07 11:19:28.665702 7ff677fff700 10 client.100603686.objecter tick
Nov  7 11:19:28 clancy qemu-system-x86_64: 2018-11-07 11:19:28.665724 7ff677fff700 10 client.100603686.objecter  pinging osd that serves lingering tid 3 (osd.6)
Nov  7 11:19:28 clancy qemu-system-x86_64: 2018-11-07 11:19:28.665734 7ff677fff700 10 client.100603686.objecter _maybe_request_map subscribing (onetime) to next osd map
Nov  7 11:19:28 clancy qemu-system-x86_64: 2018-11-07 11:19:28.665745 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6811/806558 -- ping magic: 0 v1 -- ?+0 0x7ff5cc018b10 con 0x7ff66c01e5f0
Nov  7 11:19:29 clancy qemu-system-x86_64: 2018-11-07 11:19:29.470507 7ff6777fe700  1 -- 172.20.4.90:0/1264248106 <== mon.0 172.20.4.6:6789/0 87742 ==== osd_map(927858..927858 src has 927297..927858) v3 ==== 521+0+0 (1912488452 0 0) 0x7ff6400025a0 con 0x7ff66c04ae90
Nov  7 11:19:29 clancy qemu-system-x86_64: 2018-11-07 11:19:29.470539 7ff6777fe700 10 client.100603686.objecter ms_dispatch 0x564526ec8d90 osd_map(927858..927858 src has 927297..927858) v3
Nov  7 11:19:29 clancy qemu-system-x86_64: 2018-11-07 11:19:29.470558 7ff6777fe700  3 client.100603686.objecter handle_osd_map got epochs [927858,927858] > 927857
Nov  7 11:19:29 clancy qemu-system-x86_64: 2018-11-07 11:19:29.470563 7ff6777fe700  3 client.100603686.objecter handle_osd_map decoding incremental epoch 927858
Nov  7 11:19:29 clancy qemu-system-x86_64: 2018-11-07 11:19:29.470602 7ff6777fe700 10 client.100603686.objecter  checking linger op 3
Nov  7 11:19:29 clancy qemu-system-x86_64: 2018-11-07 11:19:29.470628 7ff6777fe700 20 client.100603686.objecter dump_active .. 0 homeless
Nov  7 11:19:33 clancy qemu-system-x86_64: 2018-11-07 11:19:33.665939 7ff677fff700 10 client.100603686.objecter tick
Nov  7 11:19:33 clancy qemu-system-x86_64: 2018-11-07 11:19:33.665962 7ff677fff700 10 client.100603686.objecter  pinging osd that serves lingering tid 3 (osd.6)
Nov  7 11:19:33 clancy qemu-system-x86_64: 2018-11-07 11:19:33.665971 7ff677fff700 10 client.100603686.objecter _maybe_request_map subscribing (onetime) to next osd map
Nov  7 11:19:33 clancy qemu-system-x86_64: 2018-11-07 11:19:33.665993 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6789/0 -- mon_subscribe({osdmap=927859}) v2 -- ?+0 0x7ff5cc0199a0 con 0x7ff66c04ae90
Nov  7 11:19:33 clancy qemu-system-x86_64: 2018-11-07 11:19:33.666014 7ff677fff700  1 -- 172.20.4.90:0/1264248106 --> 172.20.4.6:6811/806558 -- ping magic: 0 v1 -- ?+0 0x7ff5cc018b10 con 0x7ff66c01e5f0
Nov  7 11:19:35 clancy qemu-system-x86_64: 2018-11-07 11:19:35.990272 7ff6777fe700  1 -- 172.20.4.90:0/1264248106 <== mon.0 172.20.4.6:6789/0 87743 ==== osd_map(927859..927859 src has 927297..927859) v3 ==== 521+0+0 (3446787442 0 0) 0x7ff6400025a0 con 0x7ff66c04ae90
Nov  7 11:19:35 clancy qemu-system-x86_64: 2018-11-07 11:19:35.990299 7ff6777fe700 10 client.100603686.objecter ms_dispatch 0x564526ec8d90 osd_map(927859..927859 src has 927297..927859) v3
Nov  7 11:19:35 clancy qemu-system-x86_64: 2018-11-07 11:19:35.990318 7ff6777fe700  3 client.100603686.objecter handle_osd_map got epochs [927859,927859] > 927858
Nov  7 11:19:35 clancy qemu-system-x86_64: 2018-11-07 11:19:35.990322 7ff6777fe700  3 client.100603686.objecter handle_osd_map decoding incremental epoch 927859
Nov  7 11:19:35 clancy qemu-system-x86_64: 2018-11-07 11:19:35.990360 7ff6777fe700 10 client.100603686.objecter  checking linger op 3
Nov  7 11:19:35 clancy qemu-system-x86_64: 2018-11-07 11:19:35.990386 7ff6777fe700 20 client.100603686.objecter dump_active .. 0 homeless
Actions #28

Updated by Jason Dillaman over 5 years ago

  • Subject changed from Snapshots changing "underneath" to jewel: Snapshots changing "underneath"

The watch is encountering back-to-back errors when re-registering which leaves it in an inconsistent state:

2018-11-07 10:44:46.099391 7f55117fa700 -1 librbd::ImageWatcher: 0x7f54f8001170 image watch failed: 140002914732224, (107) Transport endpoint is not connected
2018-11-07 10:44:46.099413 7f55117fa700 10 librbd::ImageWatcher: 0x7f54f8001170 current lock owner: [0,0]
2018-11-07 10:44:46.099626 7f54eb7fe700 10 librbd::ImageWatcher: 0x7f54f8001170 re-registering image watch
2018-11-07 10:44:47.607902 7f55117fa700 -1 librbd::ImageWatcher: 0x7f54f8001170 image watch failed: 140002914732224, (107) Transport endpoint is not connected
2018-11-07 10:44:47.607921 7f55117fa700 10 librbd::ImageWatcher: 0x7f54f8001170 current lock owner: [0,0]
2018-11-07 10:44:47.621431 7f55117fa700 10 librbd::ImageWatcher: 0x7f54f8001170 handle_rewatch: r=0

This code has been heavily refactored starting with luminous and the issue most likely no longer exists in non-EOLed releases.

Actions #29

Updated by Jason Dillaman almost 5 years ago

  • Status changed from Need More Info to Won't Fix

Jewel is EOL and issue is not present on later releases.

Actions

Also available in: Atom PDF