Bug #51529

open

[rbd-nbd] kernel BUG on "rbd-nbd unmap" during rbd-nbd.sh

Added by Ilya Dryomov almost 3 years ago. Updated almost 3 years ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

# test the hook is slow
cat > ${QUIESCE_HOOK} <<EOF
#/bin/sh
echo "test the hook is slow" >&2
sleep 7
EOF
rbd snap create ${POOL}/${IMAGE}@quiesce2
_sudo dd if=${DATA} of=${DEV} bs=1M count=1 oflag=direct

# test rbd-nbd_quiesce hook that comes with distribution
unmap_device ${DEV} ${PID}
2021-07-02T23:33:18.358 INFO:tasks.workunit.client.0.smithi194.stderr:+ rbd snap create rbd/testrbdnbd13664@quiesce2
Creating snap: 100% complete...done.
2021-07-02T23:33:26.370 INFO:tasks.workunit.client.0.smithi194.stderr:+ _sudo dd if=/tmp/tmp.fNfKGOUgrV/data of=/dev/nbd0 bs=1M count=1 oflag=direct
2021-07-02T23:33:26.370 INFO:tasks.workunit.client.0.smithi194.stderr:+ local cmd
2021-07-02T23:33:26.370 INFO:tasks.workunit.client.0.smithi194.stderr:++ id -u
2021-07-02T23:33:26.371 INFO:tasks.workunit.client.0.smithi194.stderr:+ '[' 1000 -eq 0 ']'
2021-07-02T23:33:26.371 INFO:tasks.workunit.client.0.smithi194.stderr:++ which dd
2021-07-02T23:33:26.371 INFO:tasks.workunit.client.0.smithi194.stderr:+ cmd=/usr/bin/dd
2021-07-02T23:33:26.371 INFO:tasks.workunit.client.0.smithi194.stderr:+ shift
2021-07-02T23:33:26.372 INFO:tasks.workunit.client.0.smithi194.stderr:+ sudo -nE /usr/bin/dd if=/tmp/tmp.fNfKGOUgrV/data of=/dev/nbd0 bs=1M count=1 oflag=direct
2021-07-02T23:33:26.372 INFO:tasks.workunit.client.0.smithi194.stderr:1+0 records in
2021-07-02T23:33:26.372 INFO:tasks.workunit.client.0.smithi194.stderr:1+0 records out
2021-07-02T23:33:26.372 INFO:tasks.workunit.client.0.smithi194.stderr:1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0490547 s, 21.4 MB/s
2021-07-02T23:33:26.372 INFO:tasks.workunit.client.0.smithi194.stderr:+ unmap_device /dev/nbd0 14770
2021-07-02T23:33:26.373 INFO:tasks.workunit.client.0.smithi194.stderr:+ local dev=/dev/nbd0
2021-07-02T23:33:26.373 INFO:tasks.workunit.client.0.smithi194.stderr:+ local pid=14770
2021-07-02T23:33:26.373 INFO:tasks.workunit.client.0.smithi194.stderr:+ _sudo rbd-nbd unmap /dev/nbd0
2021-07-02T23:33:26.373 INFO:tasks.workunit.client.0.smithi194.stderr:+ local cmd
2021-07-02T23:33:26.373 INFO:tasks.workunit.client.0.smithi194.stderr:++ id -u
2021-07-02T23:33:26.374 INFO:tasks.workunit.client.0.smithi194.stderr:+ '[' 1000 -eq 0 ']'
2021-07-02T23:33:26.374 INFO:tasks.workunit.client.0.smithi194.stderr:++ which rbd-nbd
2021-07-02T23:33:26.374 INFO:tasks.workunit.client.0.smithi194.stderr:+ cmd=/usr/bin/rbd-nbd
2021-07-02T23:33:26.374 INFO:tasks.workunit.client.0.smithi194.stderr:+ shift
2021-07-02T23:33:26.375 INFO:tasks.workunit.client.0.smithi194.stderr:+ sudo -nE /usr/bin/rbd-nbd unmap /dev/nbd0

2021-07-03T11:20:56.759 DEBUG:teuthology.task.console_log:Killing console logger for smithi194
[  489.637688] block nbd0: Device being setup by another task
[  497.624291] blk_update_request: I/O error, dev nbd0, sector 128 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[  497.634848] blk_update_request: I/O error, dev nbd0, sector 128 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  497.645054] Buffer I/O error on dev nbd0, logical block 128, async page read
[  497.652129] blk_update_request: I/O error, dev nbd0, sector 129 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[  497.662228] Buffer I/O error on dev nbd0, logical block 129, async page read
[  497.669298] Buffer I/O error on dev nbd0, logical block 130, async page read
[  497.676355] Buffer I/O error on dev nbd0, logical block 131, async page read
[  497.683420] Buffer I/O error on dev nbd0, logical block 132, async page read
[  497.690480] Buffer I/O error on dev nbd0, logical block 133, async page read
[  497.697533] Buffer I/O error on dev nbd0, logical block 134, async page read
[  497.704591] Buffer I/O error on dev nbd0, logical block 135, async page read
[  503.521713] block nbd0: Device being setup by another task
[  505.620547] block nbd0: Send disconnect failed -32
[  505.625744] blk_update_request: I/O error, dev nbd0, sector 130944 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[  505.636556] blk_update_request: I/O error, dev nbd0, sector 130944 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  505.647007] Buffer I/O error on dev nbd0, logical block 130944, async page read
[  505.654352] blk_update_request: I/O error, dev nbd0, sector 130945 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[  505.664721] Buffer I/O error on dev nbd0, logical block 130945, async page read
[  505.672042] Buffer I/O error on dev nbd0, logical block 130946, async page read
[  505.679360] Buffer I/O error on dev nbd0, logical block 130947, async page read
[  505.686691] Buffer I/O error on dev nbd0, logical block 130948, async page read
[  505.694008] Buffer I/O error on dev nbd0, logical block 130949, async page read
[  505.701333] Buffer I/O error on dev nbd0, logical block 130950, async page read
[  505.708654] Buffer I/O error on dev nbd0, logical block 130951, async page read

Entering kdb (current=0xffff90cd5bc89740, pid 10) on processor 0 Oops: (null)
due to oops @ 0xffffffff8e4d0042
CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.4.0-77-generic #86-Ubuntu
Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015
RIP: 0010:__noinstr_text_end+0x1b22/0x38b0
Code: 48 8d 4f 34 0f 0b 49 8d 4c 24 34 0f 0b 48 8d 4f 34 0f 0b 49 8d 0c 24 0f 0b 49 8d 8f dc 00 00 00 0f 0b 49 8d 8c 24 dc 00 00 00 <0f> 0b 49 8d 8c 24 dc 00 00 00 0f 0b 49 8d 8c 24 dc 00 00 00 0f 0b
RSP: 0018:ffffa6e6800bbdb8 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffc6e67f9d9ac0 RCX: ffff90cd54d4235c
RDX: 0000000000080700 RSI: 0000000000000000 RDI: ffff90cd56166858
RBP: ffffa6e6800bbdd8 R08: 0000000000000000 R09: 0000000000000001
R10: ffff90cd27459e00 R11: 0000000000000001 R12: ffff90cd54d42280
R13: ffff90cd4e41a0e0 R14: 0000000000000000 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff90cd5fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efffb7fea08 CR3: 000000015880a003 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 blk_mq_end_request+0x11b/0x130
 nbd_complete_rq+0x24/0x70 [nbd]
more>

http://qa-proxy.ceph.com/teuthology/yuriw-2021-07-02_22:41:48-rbd-pacific-distro-basic-smithi/6250362/teuthology.log


Related issues 1 (1 open, 0 closed)

Related to rbd - Bug #50905: [rbd-nbd] kernel lockup during rbd_fsx_nbd (New)

#1

Updated by Ilya Dryomov almost 3 years ago

Likely the BUG() in blk_mq_end_request():

void blk_mq_end_request(struct request *rq, blk_status_t error)
{
        if (blk_update_request(rq, error, blk_rq_bytes(rq)))
                BUG();
        __blk_mq_end_request(rq, error);
}

meaning that the request was not fully completed by a single blk_update_request() call (it returned true, so bytes were still outstanding). I haven't checked the kernel binary to confirm this, though.
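
To illustrate the return convention this guess depends on, here is a minimal userspace sketch. It is a toy model, not kernel code: `struct toy_request`, `toy_update_request()` and `toy_end_request_would_bug()` are hypothetical names made up for this note, and the real `blk_update_request()` walks the request's bios rather than a single byte counter. The sketch only shows that "update returned true" means "bytes are still outstanding", which is the condition under which `blk_mq_end_request()` calls `BUG()`. How the nbd request got into that state in this run is not established here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct request: tracks only the bytes still outstanding. */
struct toy_request {
    size_t bytes_left;
};

/*
 * Models the return convention of blk_update_request(): account for
 * nr_bytes of completed I/O, return true if the request still has bytes
 * outstanding, and return false once it is fully completed.
 */
static bool toy_update_request(struct toy_request *rq, size_t nr_bytes)
{
    if (nr_bytes >= rq->bytes_left) {
        rq->bytes_left = 0;
        return false;
    }
    rq->bytes_left -= nr_bytes;
    return true;
}

/*
 * Models the check in blk_mq_end_request(): the caller asserts that the
 * whole request is done, so leftover bytes are treated as fatal.
 * Returns true in the case where the kernel would hit BUG().
 */
static bool toy_end_request_would_bug(struct toy_request *rq, size_t completed_bytes)
{
    return toy_update_request(rq, completed_bytes);
}
```

With a 4096-byte toy request, completing all 4096 bytes returns false (no BUG), while completing only 512 bytes leaves 3584 outstanding and returns true, which corresponds to the BUG() path in the snippet above.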

#2

Updated by Ilya Dryomov almost 3 years ago

  • Related to Bug #50905: [rbd-nbd] kernel lockup during rbd_fsx_nbd added