Bug #7327

Assertion failure in rbd_object_request_destroy()

Added by Nathaniel Filardo almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Ilya Dryomov
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Running Debian kernel 3.12.6 (Debian package linux-image-3.12-1-amd64_3.12.6-2_amd64) inside a Xen DomU, we occasionally trip over an assertion as follows. Is there anything more we can tell you that would be useful?

[94999.206638]
[94999.206638] Assertion failure in rbd_obj_request_destroy() at line 1867:
[94999.206638]
[94999.206638] rbd_assert(obj_request->img_request == NULL);
[94999.206638]
[94999.206697] ------------[ cut here ]------------
[94999.206701] kernel BUG at /build/linux-xS3nxO/linux-3.12.6/drivers/block/rbd.c:1867!
[94999.206706] invalid opcode: 0000 [#1] SMP
[94999.206712] Modules linked in: openafs(PO) cbc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc rbd libceph fuse evdev snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr coretemp ext4 crc16 mbcache jbd2 btrfs xor raid6_pq crc32c libcrc32c xen_netfront xen_blkfront
[94999.206743] CPU: 0 PID: 72 Comm: kworker/0:1H Tainted: P O 3.12-1-amd64 #1 Debian 3.12.6-2
[94999.206755] Workqueue: kblockd cfq_kick_queue
[94999.206762] task: ffff880000090040 ti: ffff880004012000 task.ti: ffff880004012000
[94999.206767] RIP: e030:[<ffffffffa024967a>] [<ffffffffa024967a>] rbd_obj_request_destroy+0x14a/0x150 [rbd]
[94999.206778] RSP: e02b:ffff880004013cf0 EFLAGS: 00010286
[94999.206782] RAX: 000000000000006e RBX: ffff8800f5526e40 RCX: 0000000000000000
[94999.206787] RDX: ffff8800fea0ff98 RSI: ffff8800fea0e4a8 RDI: 0000000000000200
[94999.206791] RBP: ffff8800657b76e0 R08: 0000000000000000 R09: 0000000000000123
[94999.206795] R10: 0720072007200720 R11: 0720072007200720 R12: ffff8800f5526d98
[94999.206799] R13: 0000000000000000 R14: ffff880051c9a180 R15: ffff8800f5526d98
[94999.206807] FS: 00007fdcaf90c700(0000) GS:ffff8800fea00000(0000) knlGS:0000000000000000
[94999.206812] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[94999.206816] CR2: 00007fcbf4114000 CR3: 00000000f4607000 CR4: 0000000000002660
[94999.206821] Stack:
[94999.206823] 0000000000000000 ffff8800657b76e0 00000314fa300000 ffffffffa024abc1
[94999.206831] 0000000700000009 0000000000080000 ffff88006cf67388 0000100000050000
[94999.206837] 0000000000300000 00000314fa300000 0000000000000001 0000000100000220
[94999.206844] Call Trace:
[94999.206850] [<ffffffffa024abc1>] ? rbd_img_request_fill+0x611/0x8b0 [rbd]
[94999.206857] [<ffffffffa024c49d>] ? rbd_request_fn+0x1dd/0x2b0 [rbd]
[94999.206863] [<ffffffff81240d0a>] ? __blk_run_queue+0x2a/0x40
[94999.206868] [<ffffffff8125bca4>] ? cfq_kick_queue+0x24/0x40
[94999.206875] [<ffffffff810746bd>] ? process_one_work+0x16d/0x420
[94999.206880] [<ffffffff81075286>] ? worker_thread+0x116/0x3b0
[94999.206885] [<ffffffff81075170>] ? rescuer_thread+0x330/0x330
[94999.206890] [<ffffffff8107b76f>] ? kthread+0xaf/0xc0
[94999.206896] [<ffffffff8107b6c0>] ? kthread_create_on_node+0x110/0x110
[94999.206902] [<ffffffff81499a8c>] ? ret_from_fork+0x7c/0xb0
[94999.206907] [<ffffffff8107b6c0>] ? kthread_create_on_node+0x110/0x110
[94999.206911] Code: a0 31 c0 e8 7a 0b 24 e1 0f 0b 48 c7 c1 38 f8 24 a0 ba 4b 07 00 00 48 c7 c6 e0 08 25 a0 48 c7 c7 d0 f6 24 a0 31 c0 e8 57 0b 24 e1 <0f> 0b 0f 1f 40 00 48 85 ff 74 2b 8b 87 a8 00 00 00 48 8d 97 a8
[94999.206951] RIP [<ffffffffa024967a>] rbd_obj_request_destroy+0x14a/0x150 [rbd]
[94999.206958] RSP <ffff880004013cf0>
[94999.206978] ---[ end trace 0318cd8e9f38ca0c ]---
[94999.207045] BUG: unable to handle kernel paging request at ffffffffffffffd8
[94999.207055] IP: [<ffffffff8107bcd7>] kthread_data+0x7/0x10
[94999.207060] PGD 180f067 PUD 1811067 PMD 0
[94999.207065] Oops: 0000 [#2] SMP
[94999.207069] Modules linked in: openafs(PO) cbc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc rbd libceph fuse evdev snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr coretemp ext4 crc16 mbcache jbd2 btrfs xor raid6_pq crc32c libcrc32c xen_netfront xen_blkfront
[94999.207096] CPU: 0 PID: 72 Comm: kworker/0:1H Tainted: P D O 3.12-1-amd64 #1 Debian 3.12.6-2
[94999.207111] task: ffff880000090040 ti: ffff880004012000 task.ti: ffff880004012000
[94999.207116] RIP: e030:[<ffffffff8107bcd7>] [<ffffffff8107bcd7>] kthread_data+0x7/0x10
[94999.207123] RSP: e02b:ffff880004013ab0 EFLAGS: 00010002
[94999.207126] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000005
[94999.207130] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880000090040
[94999.207135] RBP: ffff880000090040 R08: 0000000000000001 R09: 000000000000b65a
[94999.207139] R10: ffffffff81a7f800 R11: 0000000000000000 R12: ffff8800fea142c0
[94999.207143] R13: 0000000000000000 R14: ffff880000090030 R15: ffff880000090040
[94999.207149] FS: 00007fdcaf90c700(0000) GS:ffff8800fea00000(0000) knlGS:0000000000000000
[94999.207154] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[94999.207158] CR2: 0000000000000028 CR3: 00000000f4607000 CR4: 0000000000002660
[94999.207163] Stack:
[94999.207165] ffffffff810758a8 ffff880000090388 ffffffff814900be 00000000000142c0
[94999.207172] ffff880004013fd8 00000000000142c0 ffff880004013fd8 ffff8800000905a8
[94999.207178] ffff880000090290 ffff880000090030 ffff8800f58427c0 ffff880000090030
[94999.207184] Call Trace:
[94999.207188] [<ffffffff810758a8>] ? wq_worker_sleeping+0x8/0x80
[94999.207195] [<ffffffff814900be>] ? __schedule+0x45e/0x720
[94999.207200] [<ffffffff8105dcf7>] ? do_exit+0x6e7/0xa40
[94999.207205] [<ffffffff814932b8>] ? oops_end+0x98/0xe0
[94999.207211] [<ffffffff81012e51>] ? do_invalid_op+0x81/0xa0
[94999.207217] [<ffffffffa024967a>] ? rbd_obj_request_destroy+0x14a/0x150 [rbd]
[94999.207222] [<ffffffff814920d1>] ? _raw_spin_lock_irqsave+0x11/0x40
[94999.207228] [<ffffffff8108079d>] ? up+0xd/0x40
[94999.207233] [<ffffffff81491e1e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
[94999.207240] [<ffffffff810a2a88>] ? console_unlock+0x278/0x3f0
[94999.207246] [<ffffffff81102029>] ? irq_work_queue+0x69/0xb0
[94999.207251] [<ffffffff810a280b>] ? wake_up_klogd+0x2b/0x30
[94999.207255] [<ffffffff810a2dda>] ? vprintk_emit+0x1da/0x510
[94999.207261] [<ffffffff8149b1de>] ? invalid_op+0x1e/0x30
[94999.207266] [<ffffffffa024967a>] ? rbd_obj_request_destroy+0x14a/0x150 [rbd]
[94999.207272] [<ffffffffa024abc1>] ? rbd_img_request_fill+0x611/0x8b0 [rbd]
[94999.207278] [<ffffffffa024c49d>] ? rbd_request_fn+0x1dd/0x2b0 [rbd]
[94999.207283] [<ffffffff81240d0a>] ? __blk_run_queue+0x2a/0x40
[94999.207288] [<ffffffff8125bca4>] ? cfq_kick_queue+0x24/0x40
[94999.207294] [<ffffffff810746bd>] ? process_one_work+0x16d/0x420
[94999.207299] [<ffffffff81075286>] ? worker_thread+0x116/0x3b0
[94999.207303] [<ffffffff81075170>] ? rescuer_thread+0x330/0x330
[94999.207308] [<ffffffff8107b76f>] ? kthread+0xaf/0xc0
[94999.207313] [<ffffffff8107b6c0>] ? kthread_create_on_node+0x110/0x110
[94999.207318] [<ffffffff81499a8c>] ? ret_from_fork+0x7c/0xb0
[94999.207323] [<ffffffff8107b6c0>] ? kthread_create_on_node+0x110/0x110
[94999.207327] Code: 00 00 00 00 65 48 8b 04 25 40 c8 00 00 48 8b 80 f0 02 00 00 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 0f 1f 40 00 48 8b 87 f0 02 00 00 <48> 8b 40 d8 c3 0f 1f 40 00 48 83 ec 08 48 8b b7 f0 02 00 00 ba
[94999.207367] RIP [<ffffffff8107bcd7>] kthread_data+0x7/0x10
[94999.207372] RSP <ffff880004013ab0>
[94999.207375] CR2: ffffffffffffffd8
[94999.207378] ---[ end trace 0318cd8e9f38ca0d ]---
[94999.207382] Fixing recursive fault but reboot is needed!

History

#1 Updated by Ian Colle almost 8 years ago

  • Assignee set to Ilya Dryomov
  • Priority changed from Normal to Urgent

#2 Updated by Ilya Dryomov almost 8 years ago

Hi Nathaniel,

What kind of workload are you running? Have you noticed any pattern
wrt when it occurs, what might be triggering it, etc?

Can you also describe your setup? Things like filesystem on top of
rbd, rbd image size, are you using snapshots, clones, etc.

#3 Updated by Nathaniel Filardo almost 8 years ago

No pattern has emerged, except that it happened twice in rapid succession after a long period of not happening (weeks) and hasn't happened in the 48 hours since (here's hoping it continues to not happen).

The server in question has three RBDs mapped, all of which are being used to provide ext4 filesystems, mounted with options "rw,noatime,errors=remount-ro,stripe=1024,data=ordered". Two are 10T and one is 5T. The server is providing OpenAFS file services to our cluster, which makes it a relatively boring POSIX consumer (no xattrs or anything like that, even), though it might be a little fsync()-heavy relative to the norm.

There are no snapshots or clones of these RBDs.

Thanks!

#4 Updated by Ilya Dryomov almost 8 years ago

  • Status changed from New to Fix Under Review

OK, so this turned out to be just a busted error path. After finding
that exact Debian rbd.ko and doing some fiddling with the assembly, I can
tell that the problem was a jump to the out_partial label, which was
caused either by ENOMEM or a NULL return from bio_chain_clone_range().

Now, a NULL from bio_chain_clone_range() might itself be ENOMEM, but it
might also be a screw-up in our bio logic. Nathaniel, was this box
under memory pressure at those points by any chance?

An error path fix has been posted:

[PATCH] rbd: fix error paths in rbd_img_request_fill()

#5 Updated by Nathaniel Filardo almost 8 years ago

Ilya Dryomov wrote:

[snip]
Now, a NULL from bio_chain_clone_range() might itself be ENOMEM, but it
might also be a screw-up in our bio logic. Nathaniel, was this box
under memory pressure at those points by any chance?

I am not sure. It's probable that they would experience memory pressure
at times; they're fileservers, after all, and had 2G or 4G of RAM,
depending on which one we're talking about. But like I said, no clear
pattern of crashes had emerged, other than that it seemed like it would
happen as soon as I stopped paying attention. ;)

On the other hand, the same workload has been running in QEMU (using QEMU's
RBD support) without so much as a glitch in 512M of RAM, so maybe they don't
experience that much non-cache pressure after all. The actual userland
processes are not very large, in terms of resident size, but the filesystem
and block caches do fill up and stay that way.

#6 Updated by Ilya Dryomov almost 8 years ago

  • Status changed from Fix Under Review to Resolved

I see. In any case, it shouldn't crash anymore. If it pops up again,
you'll see something like

rbd: rbd<id>: read <how much> at <offset> result -12

for reads, and similar for writes.
