<p>Ceph: Linux kernel client - Bug #10352: libceph delayed work items aren't teared down properly<br />https://tracker.ceph.com/issues/10352?journal_id=46630<br />2015-01-20T04:46:59Z, Ilya Dryomov</p>
<p>This turned out to have nothing to do with how we tear down delayed_works; that code is fine.<br />The problem was rbd messing up refcounts left and right for clones with parent_overlap 0, leaving libceph state behind. We saw a bunch of evidence (e.g. <a class="issue tracker-1 status-3 priority-5 priority-high3 closed" title="Bug: krbd: EPERM from map-snapshot-io.sh (Resolved)" href="https://tracker.ceph.com/issues/9896">#9896</a> is probably related, I'll have to take a look) but never connected the dots, probably because fsx is currently the only test that exercises the parent_overlap code to such an extent.</p>
<p>Reproducer:</p>
<pre>
# cat delayed_work.sh
#!/bin/bash
MON=1 OSD=1 ./vstart.sh -l -n mon osd
./rbd create --image-format 2 --size 1 foo
./rbd snap create foo@snap
./rbd snap protect foo@snap
./rbd clone foo@snap bar
./rbd resize --allow-shrink --size 0 bar
./rbd resize --size 1 bar
DEV=$(./rbd map bar)
./rbd unmap $DEV
sleep 3
./stop.sh
sleep 3
MON=1 OSD=1 ./vstart.sh -l -n mon osd
sleep 13
rmmod rbd && rmmod libceph
# Bad kernel will crash with a bad page fault on an address corresponding to
# mon_client.c/delayed_work() or osd_client.c/handle_osds_timeout() in the
# unloaded libceph.ko within 10-20 seconds.
sleep 20
echo OK
</pre>
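<p>To check the outcome of the reproducer without watching the console, something like the following could be run afterwards. This is only a sketch: the function names are made up for illustration, and the two message patterns are assumptions about how x86 kernels word a page-fault oops, so adjust them for the kernel under test.</p>

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original reproducer.

# Reads kernel log text on stdin and succeeds (exit 0) if it contains a
# page-fault oops line. The patterns are assumptions about the message
# wording and may need adjusting for a given kernel version.
has_page_fault() {
    grep -qE 'unable to handle (kernel paging request|page fault)'
}

# Succeeds if the named module is absent from a /proc/modules-style
# listing. The listing file defaults to the real /proc/modules.
module_unloaded() {
    local name=$1 listing=${2:-/proc/modules}
    ! grep -q "^${name} " "$listing"
}
```

<p>For example, <code>dmesg | has_page_fault &amp;&amp; echo CRASHED</code> after the final sleep, and <code>module_unloaded libceph</code> to confirm that rmmod actually went through.</p>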
<p>Fixes posted:</p>
<p>[PATCH 1/3] rbd: fix rbd_dev_parent_get() when parent_overlap 0<br />[PATCH 2/3] rbd: drop parent_ref in rbd_dev_unprobe() unconditionally</p>
<p>https://tracker.ceph.com/issues/10352?journal_id=46632<br />2015-01-20T04:47:37Z, Ilya Dryomov</p>
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>7</i></li></ul>
<p>https://tracker.ceph.com/issues/10352?journal_id=47442<br />2015-02-04T12:18:33Z, Ilya Dryomov</p>
<ul><li><strong>Status</strong> changed from <i>7</i> to <i>Resolved</i></li></ul><p>In 3.19-rc7.</p>