Bug #39945: RBD I/O error leads to ghost-mapped RBD - Linux kernel client - Ceph

Actions

Copy link

Bug #39945

open

RBD I/O error leads to ghost-mapped RBD

Added by Cliff Pajaro almost 5 years ago. Updated over 4 years ago.

Status:

Need More Info

Priority:

Low

Assignee:

Ilya Dryomov

Category:

rbd

Target version:

Ceph - v12.2.12

% Done:

Source:

Community (user)

Tags:

Backport:

Regression:

Severity:

2 - major

Reviewed:

Affected Versions:

Ceph - v12.2.12

ceph-qa-suite:

Crash signature (v1):

Crash signature (v2):

Description

I have attached a couple of kernel logs (same system) showing the IO errors.

On the system that had the RBDs mapped, the RBDs don't appear in "rbd showmapped" and are not inside "/sys/bus/rbd/devices". Attempting "echo # > /sys/bus/rbd/remove" doesn't work.

The RBDs do appear in /sys/kernel/debug/block/rbd7 and /sys/kernel/debug/block/rbd8.
The RBDs appear when calling "rbd status".

Watchers:
        watcher=10.10.4.63:0/876863094 client.44641389 cookie=18446462598732841001

Watchers:
        watcher=10.10.4.63:0/876863094 client.44641389 cookie=18446462598733070026

It seems the only way to release the RBDs is to reboot the system. The RBDs can be mapped, mounted, read, etc. with no problems (attempted on a different system and on the same system).

# ps aux | grep [r]bd
root     3011042  0.0  0.0      0     0 ?        I<   Apr20   0:00 [rbd]
root     3373133  0.0  0.0      0     0 ?        I<   May09   0:00 [rbd8-tasks]
root     3957315  0.0  0.0      0     0 ?        I<   Apr26   0:00 [rbd7-tasks]
# cat /proc/3011042/stack 
[<0>] rescuer_thread+0x2ef/0x330
[<0>] kthread+0x111/0x130
[<0>] ret_from_fork+0x1f/0x30
[<0>] 0xffffffffffffffff
# cat /proc/3373133/stack        
[<0>] rescuer_thread+0x2ef/0x330
[<0>] kthread+0x111/0x130
[<0>] ret_from_fork+0x1f/0x30
[<0>] 0xffffffffffffffff
# cat /proc/3957315/stack       
[<0>] rescuer_thread+0x2ef/0x330
[<0>] kthread+0x111/0x130
[<0>] ret_from_fork+0x1f/0x30
[<0>] 0xffffffffffffffff

Version information:

# ceph version
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
# ceph --version
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
# uname -r
4.19.34-vanilla-cephsb-2

Files

Download all files

var_log_messages_rbd7.txt (7.98 KB) var_log_messages_rbd7.txt		Cliff Pajaro, 05/15/2019 05:43 PM
var_log_messages_rbd8.txt (8.19 KB) var_log_messages_rbd8.txt		Cliff Pajaro, 05/15/2019 05:43 PM

Actions

Copy link

Updated by Ilya Dryomov almost 5 years ago

Assignee set to Ilya Dryomov

On the system that had the RBDs mapped, the RBDs don't appear in "rbd showmapped" and are not inside "/sys/bus/rbd/devices". Attempting "echo # > /sys/bus/rbd/remove" doesn't work.

This is because someone issued "rbd unmap" for these devices:

__ioc_clear_queue+0x36/0x60
ioc_clear_queue+0x91/0xd0
blk_exit_queue+0x15/0x40
blk_cleanup_queue+0xd5/0x140
rbd_free_disk+0x19/0x40 [rbd]
rbd_dev_device_release+0x2c/0x50 [rbd]
do_rbd_remove.isra.18+0x197/0x220 [rbd]
kernfs_fop_write+0x105/0x180
__vfs_write+0x33/0x1d0
vfs_write+0xa9/0x1a0
ksys_write+0x4f/0xb0
do_syscall_64+0x3e/0xe0
entry_SYSCALL_64_after_hwframe+0x44/0xa9

"rbd unmap" removed them from the list of mapped devices and then crashed in the block layer:

BUG: unable to handle kernel paging request at ffffffff8139c2a0
RIP: 0010:ioc_destroy_icq+0x37/0xb0

Cliff, was it regular "rbd unmap" or "rbd unmap -o force"? Did you attempt to unmount ext4 before unmapping?

Just to get an idea of how to reproduce, where are the I/O errors coming from? Is this the kernel where you replaced the "write to read-only mapping" assert with an error?

Actions

Copy link

Updated by Ilya Dryomov over 4 years ago

Status changed from New to Need More Info
Priority changed from Normal to Low

Actions

Copy link

Updated by Cliff Pajaro over 4 years ago

Hi Ilya,

This issue has popped up again and I have a similar `rbd25-tasks` process still alive, with the rbd status command showing a watcher remaining.

In this case we have a python script using subprocess to call a normal `rbd unmap`.

When I finally take a look at the system (because it has an RBD that is stuck), the python script is waiting for the `rbd unmap` process to return:

Thread 0x7fdd0b69f700 (LWP 12982) "python3" 0x00007fdd0d50024e in __waitpid (pid=122759, stat_loc=0x7fdd0b69b614, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30

The `rbd unmap` process is in a zombie state. This process only goes away if the python script (and the java application that started it) are stopped. Even after the `rbd unmap` process disappears the `rbd25-tasks` process remains.

Actions

Copy link

Updated by Ilya Dryomov over 4 years ago

Hi Cliff,

What kernel are you running? Is it the same one or something newer?

Is there anything in dmesg? Previously there were I/O errors, and a "unable to handle kernel paging request" crash in the block layer.

Actions

Copy link

Updated by Cliff Pajaro over 4 years ago

Hi Ilya,

We are now running a newer kernel: 4.19.68-vanilla-ams-2

I just noticed traces in /var/log/messages.
Occurrence #1:

BUG: unable to handle kernel paging request at ffffffff814d68a0
PGD 240c067 P4D 240c067 PUD 240d063 PMD 14001e1
Oops: 0003 [#1] SMP NOPTI
CPU: 2 PID: 80583 Comm: rbd Tainted: G           O      4.19.68-vanilla-ams-2 #1
Hardware name: Quanta Cloud Technology Inc. L08A-AMS/L08A-AMS-MB, BIOS S1Q_3A08.NEO03 10/18/2016
RIP: 0010:ioc_destroy_icq+0x37/0xb0
Code: 89 fb 48 8b 6f 08 48 8b 50 18 48 63 b0 ac 01 00 00 48 8d 7d 28 4c 8b 22 e8 66 17 53 00 48 8b 43 28 48 85 c0 74 20 48 8b 53 20 <48> 89 10 48 85 d2 74 04 48 89 42 08 48 c7 43 20 00 00 00 00 48 c7
RSP: 0018:ffffc9000558bd28 EFLAGS: 00010086
RAX: ffffffff814d68a0 RBX: ffff88813106c000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000000d RDI: ffff888856505d28
RBP: ffff888856505d00 R08: 0000000000000000 R09: ffffffff814f9f80
R10: ffff8885c0ceaa00 R11: 0000000000000000 R12: ffffffff824eca20
R13: 0000000000000206 R14: ffffc9000558bf08 R15: ffff88872659d020
FS:  00007fe763d900c0(0000) GS:ffff88885f880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff814d68a0 CR3: 000000071dd2e006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __ioc_clear_queue+0x36/0x60
 ioc_clear_queue+0x91/0xd0
 blk_exit_queue+0x15/0x40
 blk_cleanup_queue+0xd5/0x140
 rbd_free_disk+0x19/0x40 [rbd]
 rbd_dev_device_release+0x2c/0x50 [rbd]
 do_rbd_remove.isra.23+0x197/0x220 [rbd]
 ? x86_setup_perfctr+0xe5/0x140
 kernfs_fop_write+0x116/0x190
 __vfs_write+0x36/0x1e0
 vfs_write+0xa9/0x1a0
 ksys_write+0x57/0xd0
 do_syscall_64+0x48/0x100
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe7648133fd
Code: 80 00 00 00 00 48 89 55 d8 48 89 75 e0 89 7d e8 e8 b8 fc ff ff 48 8b 55 d8 48 8b 75 e0 41 89 c0 8b 7d e8 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 48 89 45 e8 e8 ef fc ff ff 48 8b
RSP: 002b:00007ffdaedc1d70 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fe7648133fd
RDX: 0000000000000002 RSI: 00007ffdaedc1fa0 RDI: 0000000000000007
RBP: 00007ffdaedc1da0 R08: 0000000000000000 R09: 000055743cb70810
R10: fffffffffffff86a R11: 0000000000000293 R12: 00007ffdaedc1fa0
R13: 0000000000000007 R14: 00007ffdaedc1e90 R15: 00007ffdaedc1f90
Modules linked in: ipt_MASQUERADE usb_storage rbd libceph xt_UDPLB(O) xt_nat xt_multiport xt_addrtype iptable_security iptable_mangle iptable_raw iptable_nat nf_nat_ipv4 nf_nat bpfilter sb_edac x86_pkg_temp_thermal intel_pch_thermal ioatdma gkuart(O) usbserial dca
CR2: ffffffff814d68a0
---[ end trace 58804622e1ce6039 ]---
RIP: 0010:ioc_destroy_icq+0x37/0xb0
Code: 89 fb 48 8b 6f 08 48 8b 50 18 48 63 b0 ac 01 00 00 48 8d 7d 28 4c 8b 22 e8 66 17 53 00 48 8b 43 28 48 85 c0 74 20 48 8b 53 20 <48> 89 10 48 85 d2 74 04 48 89 42 08 48 c7 43 20 00 00 00 00 48 c7
RSP: 0018:ffffc9000558bd28 EFLAGS: 00010086
RAX: ffffffff814d68a0 RBX: ffff88813106c000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000000d RDI: ffff888856505d28
RBP: ffff888856505d00 R08: 0000000000000000 R09: ffffffff814f9f80
R10: ffff8885c0ceaa00 R11: 0000000000000000 R12: ffffffff824eca20
R13: 0000000000000206 R14: ffffc9000558bf08 R15: ffff88872659d020
FS:  00007fe763d900c0(0000) GS:ffff88885f880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff814d68a0 CR3: 000000071dd2e006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Occurrence #2 (completely different data center):

BUG: unable to handle kernel paging request at ffffffff814d68a0
PGD 240c067 P4D 240c067 PUD 240d063 PMD 14001e1
Oops: 0003 [#1] SMP NOPTI
CPU: 7 PID: 122759 Comm: rbd Tainted: G           O      4.19.68-vanilla-ams-2 #1
Hardware name: Quanta Cloud Technology Inc. L08A-AMS/L08A-AMS-MB, BIOS S1Q_3A08.NEO03 10/18/2016
RIP: 0010:ioc_destroy_icq+0x37/0xb0
Code: 89 fb 48 8b 6f 08 48 8b 50 18 48 63 b0 ac 01 00 00 48 8d 7d 28 4c 8b 22 e8 66 17 53 00 48 8b 43 28 48 85 c0 74 20 48 8b 53 20 <48> 89 10 48 85 d2 74 04 48 89 42 08 48 c7 43 20 00 00 00 00 48 c7
RSP: 0018:ffffc90004e1fd28 EFLAGS: 00010086
RAX: ffffffff814d68a0 RBX: ffff88812e45f000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000001c RDI: ffff888853617b88
RBP: ffff888853617b60 R08: 0000000000000000 R09: ffffffff814f9f80
R10: ffff8881207d79c0 R11: 0000000000000000 R12: ffffffff824eca20
R13: 0000000000000206 R14: ffffc90004e1ff08 R15: ffff888147b4a920
FS:  00007f5901f540c0(0000) GS:ffff88885f9c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff814d68a0 CR3: 0000000794d3c003 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __ioc_clear_queue+0x36/0x60
 ioc_clear_queue+0x91/0xd0
 blk_exit_queue+0x15/0x40
 blk_cleanup_queue+0xd5/0x140
 rbd_free_disk+0x19/0x40 [rbd]
 rbd_dev_device_release+0x2c/0x50 [rbd]
 do_rbd_remove.isra.23+0x197/0x220 [rbd]
 ? x86_setup_perfctr+0xe5/0x140
 kernfs_fop_write+0x116/0x190
 __vfs_write+0x36/0x1e0
 vfs_write+0xa9/0x1a0
 ksys_write+0x57/0xd0
 do_syscall_64+0x48/0x100
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f59029d73fd
Code: 80 00 00 00 00 48 89 55 d8 48 89 75 e0 89 7d e8 e8 b8 fc ff ff 48 8b 55 d8 48 8b 75 e0 41 89 c0 8b 7d e8 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 48 89 45 e8 e8 ef fc ff ff 48 8b
RSP: 002b:00007fffa6406050 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f59029d73fd
RDX: 0000000000000002 RSI: 00007fffa6406280 RDI: 0000000000000007
RBP: 00007fffa6406080 R08: 0000000000000000 R09: 0000558bee491860
R10: fffffffffffff86a R11: 0000000000000293 R12: 00007fffa6406280
R13: 0000000000000007 R14: 00007fffa6406170 R15: 00007fffa6406270
Modules linked in: ipt_MASQUERADE rbd libceph xt_nat xt_UDPLB(O) xt_multiport xt_addrtype iptable_security iptable_mangle iptable_raw iptable_nat nf_nat_ipv4 nf_nat bpfilter sb_edac x86_pkg_temp_thermal intel_pch_thermal gkuart(O) usbserial ioatdma dca
CR2: ffffffff814d68a0
---[ end trace f6752c709f61d42f ]---
RIP: 0010:ioc_destroy_icq+0x37/0xb0
Code: 89 fb 48 8b 6f 08 48 8b 50 18 48 63 b0 ac 01 00 00 48 8d 7d 28 4c 8b 22 e8 66 17 53 00 48 8b 43 28 48 85 c0 74 20 48 8b 53 20 <48> 89 10 48 85 d2 74 04 48 89 42 08 48 c7 43 20 00 00 00 00 48 c7
RSP: 0018:ffffc90004e1fd28 EFLAGS: 00010086
RAX: ffffffff814d68a0 RBX: ffff88812e45f000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000001c RDI: ffff888853617b88
RBP: ffff888853617b60 R08: 0000000000000000 R09: ffffffff814f9f80
R10: ffff8881207d79c0 R11: 0000000000000000 R12: ffffffff824eca20
R13: 0000000000000206 R14: ffffc90004e1ff08 R15: ffff888147b4a920
FS:  00007f5901f540c0(0000) GS:ffff88885f9c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff814d68a0 CR3: 0000000794d3c003 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

From what I can tell the stack trace is identical between the 2 new occurrences and what you pointed out above (from 7 months ago).

Actions

Copy link

Updated by Cliff Pajaro over 4 years ago

To reply to your questions from before:

was it regular "rbd unmap" or "rbd unmap -o force"?

This was a regular rbd unmap

Did you attempt to unmount ext4 before unmapping?

Yes, there are successful ext4 filesystem unmounts before rbd unmapping.

Just to get an idea of how to reproduce, where are the I/O errors coming from? Is this the kernel where you replaced the "write to read-only mapping" assert with an error?

This kernel does have a fix to prevent "trim/write to read-only devices". The I/O errors I reported before are no longer appearing (if I am recalling this correctly).

Actions

Copy link

Updated by Ilya Dryomov over 4 years ago

Looks like it crashed on a corrupted ioc->icq_list:

  18:   e8 66 17 53 00          callq  0x531783  # radix_tree_delete()
  1d:   48 8b 43 28             mov    0x28(%rbx),%rax  # icq->ioc_node->pprev
  21:   48 85 c0                test   %rax,%rax  # hlist_unhashed()
  24:   74 20                   je     0x46
  26:   48 8b 53 20             mov    0x20(%rbx),%rdx  # icq->ioc_node->next
  2a:*  48 89 10                mov    %rdx,(%rax)  # *pprev = next

 63 static void ioc_destroy_icq(struct io_cq *icq)
 64 {
 65         struct io_context *ioc = icq->ioc;
 66         struct request_queue *q = icq->q;
 67         struct elevator_type *et = q->elevator->type;
 68 
 69         lockdep_assert_held(&ioc->lock);
 70 
 71         radix_tree_delete(&ioc->icq_tree, icq->q->id);
 72         hlist_del_init(&icq->ioc_node);  <--
 73         list_del_init(&icq->q_node);

691 static inline void hlist_del_init(struct hlist_node *n)
692 {
693         if (!hlist_unhashed(n)) {
694                 __hlist_del(n);  <--
695                 INIT_HLIST_NODE(n);
696         }
697 }

674 static inline void __hlist_del(struct hlist_node *n)
675 {
676         struct hlist_node *next = n->next;
677         struct hlist_node **pprev = n->pprev;
678 
679         WRITE_ONCE(*pprev, next);  <--
680         if (next)
681                 next->pprev = pprev;
682 }

WRITE_ONCE(*pprev, next) is basically *pprev = next, and pprev is busted. It's notable that you got the same busted address (ffffffff814d68a0) in two completely different data centers.

Unfortunately, there is not much else I can do without a crash dump. This list is managed by the block layer, and so far I don't see anything wrong on the rbd side.

Where did this 4.19.68-vanilla-ams-2 kernel come from? Are you building your own kernels? If so, are there any modifications or just config changes?

Actions

Copy link

Updated by Cliff Pajaro over 4 years ago

Thanks for the explanation. Yes, we build our own kernels and we have some modifications.

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 585378bc988c..5516c52edecf 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3652,6 +3652,15 @@ static void rbd_queue_workfn(struct work_struct *work)
         goto err;
     }

+    /* snapshot mounts are readonly, so don't allow discards */
+    if (rbd_dev->spec->snap_id != CEPH_NOSNAP && op_type == OBJ_OP_DISCARD) {
+        result = -EBADF;
+        goto err;
+    } else if (rbd_dev->spec->snap_id != CEPH_NOSNAP && op_type == OBJ_OP_WRITE) {
+        result = -EROFS;
+        goto err;
+    }
+
     /* Ignore/skip any zero-length requests */

     if (!length) {

and a couple of things from upstream:

From: Ilya Dryomov <idryomov@gmail.com>

[ Upstream commit 187df76325af5d9e12ae9daec1510307797e54f0 ]

A bvec can now consist of multiple physically contiguous pages.
This means that bvec_iter_advance() can move to a different page while
staying in the same bvec (i.e. ->bi_bvec_done != 0).

The messenger works in terms of segments which can now be defined as
the smaller of a bvec and a page.  The "more bytes to process in this
segment" condition holds only if bvec_iter_advance() leaves us in the
same bvec _and_ in the same page.  On next bvec (possibly in the same
page) and on next page (possibly in the same bvec) we may need to set
->last_piece.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
---
 net/ceph/messenger.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index f7d7f32ac673..ef5216206bdf 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -870,6 +870,7 @@ static bool ceph_msg_data_bio_advance(struct ceph_msg_data_cursor *cursor,
                     size_t bytes)
 {
     struct ceph_bio_iter *it = &cursor->bio_iter;
+    struct page *page = bio_iter_page(it->bio, it->iter);

     BUG_ON(bytes > cursor->resid);
     BUG_ON(bytes > bio_iter_len(it->bio, it->iter));
@@ -881,7 +882,8 @@ static bool ceph_msg_data_bio_advance(struct ceph_msg_data_cursor *cursor,
         return false;   /* no more data */
     }

-    if (!bytes || (it->iter.bi_size && it->iter.bi_bvec_done))
+    if (!bytes || (it->iter.bi_size && it->iter.bi_bvec_done &&
+               page == bio_iter_page(it->bio, it->iter)))
         return false;    /* more bytes to process in this segment */

     if (!it->iter.bi_size) {
@@ -929,6 +931,7 @@ static bool ceph_msg_data_bvecs_advance(struct ceph_msg_data_cursor *cursor,
                     size_t bytes)
 {
     struct bio_vec *bvecs = cursor->data->bvec_pos.bvecs;
+    struct page *page = bvec_iter_page(bvecs, cursor->bvec_iter);

     BUG_ON(bytes > cursor->resid);
     BUG_ON(bytes > bvec_iter_len(bvecs, cursor->bvec_iter));
@@ -940,7 +943,8 @@ static bool ceph_msg_data_bvecs_advance(struct ceph_msg_data_cursor *cursor,
         return false;   /* no more data */
     }

-    if (!bytes || cursor->bvec_iter.bi_bvec_done)
+    if (!bytes || (cursor->bvec_iter.bi_bvec_done &&
+               page == bvec_iter_page(bvecs, cursor->bvec_iter)))
         return false;    /* more bytes to process in this segment */

     BUG_ON(cursor->last_piece);

From: Al Viro <viro@zeniv.linux.org.uk>

[ Upstream commit daf5cc27eed99afdea8d96e71b89ba41f5406ef6 ]

free the symlink body after the same RCU delay we have for freeing the
struct inode itself, so that traversal during RCU pathwalk wouldn't step
into freed memory.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
---
 fs/ceph/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 4055ab4d5c52..3e518c2ae2bf 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -524,6 +524,7 @@ static void ceph_i_callback(struct rcu_head *head)
     struct inode *inode = container_of(head, struct inode, i_rcu);
     struct ceph_inode_info *ci = ceph_inode(inode);

+    kfree(ci->i_symlink);
     kmem_cache_free(ceph_inode_cachep, ci);
 }

@@ -561,7 +562,6 @@ void ceph_destroy_inode(struct inode *inode)
         ceph_put_snap_realm(mdsc, realm);
     }

-    kfree(ci->i_symlink);
     while ((n = rb_first(&ci->i_fragtree)) != NULL) {
         frag = rb_entry(n, struct ceph_inode_frag, node);
         rb_erase(n, &ci->i_fragtree);
-- 
2.19.1

Actions

Copy link

Updated by Ilya Dryomov over 4 years ago

These modifications shouldn't have any effect on the block layer.

Perhaps try to configure kdump and capture a crash dump next time? It will allow us to examine ioc->icq_list list and related data structures for inconsistencies. The locking around them appears to be quite complex...

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Ceph » Linux kernel client

Custom queries

Bug #39945

RBD I/O error leads to ghost-mapped RBD

Updated by Ilya Dryomov almost 5 years ago

Updated by Ilya Dryomov over 4 years ago

Updated by Cliff Pajaro over 4 years ago

Updated by Ilya Dryomov over 4 years ago

Updated by Cliff Pajaro over 4 years ago

Updated by Cliff Pajaro over 4 years ago

Updated by Ilya Dryomov over 4 years ago

Updated by Cliff Pajaro over 4 years ago

Updated by Ilya Dryomov over 4 years ago