Bug #5636
closed
krbd: crash in image refresh
% Done: 0%
Source: Q/A
Severity: 3 - minor
Description
dumpall is attached
[1]kdb> bt
Stack traceback for pid 19757
0xffff880223cadeb0 19757 2 1 1 R 0xffff880223cae338 *kworker/u64:3
 ffff88020b4e9cc8 0000000000000018 ffffffffa0248dbd ffff88014ea5f800
 ffff88020cb4bf60 0000000800000006 0000000800000006 ffff88020b4e9d18
 ffffffffa0248ec1 ffff88020c8e9360 ffff88020cb4bf60 0000000000000018
Call Trace:
 [<ffffffffa0248dbd>] ? rbd_dev_refresh+0x6d/0x130 [rbd]
 [<ffffffffa0248ec1>] ? rbd_watch_cb+0x41/0x140 [rbd]
 [<ffffffffa06a8d42>] ? do_event_work+0x52/0xc0 [libceph]
 [<ffffffff8105f3ea>] ? process_one_work+0x1da/0x540
 [<ffffffff8105f37f>] ? process_one_work+0x16f/0x540
 [<ffffffff810605cc>] ? worker_thread+0x11c/0x370
 [<ffffffff810604b0>] ? manage_workers.isra.20+0x2e0/0x2e0
 [<ffffffff8106728a>] ? kthread+0xea/0xf0
 [<ffffffff810671a0>] ? flush_kthread_worker+0x150/0x150
 [<ffffffff8164071c>] ? ret_from_fork+0x7c/0xb0
 [<ffffffff810671a0>] ? flush_kthread_worker+0x150/0x150
job was
ubuntu@teuthology:/a/teuthology-2013-07-15_01:01:11-kernel-next-testing-basic/67676$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 365b57b1317524bb0cdd15859a224ba1ab58d1d7
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      global:
        ms inject socket failures: 500
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 9baa66801ab02854c344eb2fd1a8da8c5806125b
  install:
    ceph:
      sha1: 9baa66801ab02854c344eb2fd1a8da8c5806125b
  s3tests:
    branch: next
  workunit:
    sha1: 9baa66801ab02854c344eb2fd1a8da8c5806125b
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- workunit:
    clients:
      all:
      - rbd/image_read.sh
Files
Updated by Sage Weil over 10 years ago
again on ubuntu@teuthology:/a/teuthology-2013-08-22_01:01:30-krbd-next-testing-basic-plana/1020
<1>[ 257.820476] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
<1>[ 257.828499] IP: [<ffffffffa00ffdde>] rbd_dev_refresh+0x8e/0x130 [rbd]
<4>[ 257.835061] PGD 0
<4>[ 257.837183] Oops: 0002 [#1] SMP
which is here:
	/* If it's a mapped snapshot, validate its EXISTS flag */
	rbd_exists_validate(rbd_dev);
	up_write(&rbd_dev->header_rwsem);
    5db5: 4c 89 ef              mov %r13,%rdi
    5db8: e8 00 00 00 00        callq 5dbd <rbd_dev_refresh+0x6d>
            5db9: R_X86_64_PC32 up_write-0x4
	if (mapping_size != rbd_dev->mapping.size) {
    5dbd: 4c 8b ab 98 01 00 00  mov 0x198(%rbx),%r13
    ^^^^^
    5dc4: 4d 39 f5              cmp %r14,%r13
    5dc7: 74 22                 je 5deb <rbd_dev_refresh+0x9b>
		sector_t size;
		size = (sector_t)rbd_dev->mapping.size / SECTOR_SIZE;
    5dc9: 49 c1 ed 09           shr $0x9,%r13
		dout("setting size to %llu sectors", (unsigned long long)size);
    5dcd: f6 05 00 00 00 00 04  testb $0x4,0x0(%rip) # 5dd4 <rbd_dev_refresh+0x84>
This runs as a work item, and we have just released the lock (header_rwsem) when rbd_dev->mapping.size is read. Maybe the rbd_dev reference went away?
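To illustrate the suspicion above, here is a minimal userspace sketch of one common guard against this kind of lifetime problem: the queued work holds its own reference to the object, so a teardown that runs in between cannot free it. All names (demo_dev, demo_queue_refresh, demo_refresh_work) are hypothetical; this is not the rbd code and is not claimed to be what the fix branch does. It is single-threaded and uses a plain counter, whereas kernel code would typically use an atomic mechanism such as kref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device object; NOT the real struct rbd_device. */
struct demo_dev {
    int refcount;             /* number of live references (not atomic: sketch only) */
    unsigned long long size;  /* mapped size in bytes */
    bool *freed_flag;         /* set to true on release so callers can observe it */
};

/* Create a device with one reference, owned by the caller. */
struct demo_dev *demo_dev_create(unsigned long long size, bool *freed_flag)
{
    struct demo_dev *dev = malloc(sizeof(*dev));

    if (!dev)
        return NULL;
    dev->refcount = 1;
    dev->size = size;
    dev->freed_flag = freed_flag;
    *freed_flag = false;
    return dev;
}

void demo_dev_get(struct demo_dev *dev)
{
    assert(dev->refcount > 0);
    dev->refcount++;
}

/* Drop one reference; free the object when the last one goes away. */
void demo_dev_put(struct demo_dev *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0) {
        *dev->freed_flag = true;
        free(dev);
    }
}

/* Queue side: pin the device on behalf of the pending work item. */
struct demo_dev *demo_queue_refresh(struct demo_dev *dev)
{
    demo_dev_get(dev);
    return dev;
}

/*
 * Work callback: safe to touch dev->size because the work item still
 * holds the reference taken in demo_queue_refresh(). The reference is
 * dropped only after the last access.
 */
unsigned long long demo_refresh_work(struct demo_dev *dev)
{
    unsigned long long sectors = dev->size / 512;

    demo_dev_put(dev);
    return sectors;
}
```

In this sketch, if the owner calls demo_dev_put() after demo_queue_refresh() but before demo_refresh_work() runs, the object stays alive until the work callback drops the final reference. Without the extra demo_dev_get(), the callback would read freed memory, which is the kind of failure suspected in the comment above.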
Updated by Josh Durgin over 10 years ago
- Status changed from New to Fix Under Review
- Assignee set to Josh Durgin
branch wip-rbd-bugs-shutdown-lock contains a few fixes
Updated by Sage Weil over 10 years ago
- Status changed from Fix Under Review to Resolved