Bug #18474
closed
oops in __unregister_request
% Done: 0%
Regression: No
Severity: 3 - minor
Description
I left an xfstests run going overnight, and when I came back I saw this oops on the console:
[14231.902682] run fstests generic/205 at 2017-01-09 23:11:24
[14339.010567] ceph: dropping unsafe request 18446621567284867072
[14339.011516] ------------[ cut here ]------------
[14339.012188] kernel BUG at fs/ceph/mds_client.c:576!
[14339.012816] invalid opcode: 0000 [#1] SMP
[14339.013338] Modules linked in: loop ceph(OE) libceph fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_raw ip6table_mangle ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 iptable_raw iptable_mangle iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables nfsd auth_rpcgss crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nfs_acl lockd ppdev joydev acpi_cpufreq parport_pc tpm_tis pcspkr parport i2c_piix4 tpm_tis_core tpm virtio_balloon qemu_fw_cfg grace sunrpc xfs libcrc32c virtio_blk virtio_net virtio_console qxl drm_kms_helper ata_generic ttm serio_raw crc32c_intel drm pata_acpi virtio_pci virtio_ring virtio floppy
[14339.026549] CPU: 1 PID: 29659 Comm: kworker/1:2 Tainted: G OE 4.9.0+ #13
[14339.027572] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
[14339.029143] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[14339.029916] task: ffff9094ea818000 task.stack: ffffb500420b4000
[14339.030676] RIP: 0010:[<ffffffffc05a41bd>] [<ffffffffc05a41bd>] __unregister_request+0x1ad/0x1b0 [ceph]
[14339.032275] RSP: 0018:ffffb500420b7c08 EFLAGS: 00010246
[14339.033134] RAX: 0000000000000000 RBX: ffff9094c012f400 RCX: 0000000000000000
[14339.034282] RDX: 0000000000000000 RSI: ffff9094efb87100 RDI: ffff9094efb87000
[14339.035108] RBP: ffffb500420b7c20 R08: 0000000000000000 R09: 0000000000000000
[14339.036325] R10: 0000000000000c00 R11: 0000000000000479 R12: ffff9094efb87000
[14339.037455] R13: ffff9094c012f408 R14: ffff9094efd00db8 R15: ffff9094c012f738
[14339.038532] FS: 0000000000000000(0000) GS:ffff9094ffc80000(0000) knlGS:0000000000000000
[14339.039827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14339.040674] CR2: 0000555bf06f56b8 CR3: 00000001311a5000 CR4: 00000000000406e0
[14339.041755] Stack:
[14339.042082] ffff9094efd00800 ffff9094c012f400 ffff9094efb87000 ffffb500420b7c60
[14339.043306] ffffffffc05a4247 ffff9094efb87008 ffff9094efb87000 0000000000000003
[14339.044514] ffff9094eee4af00 ffff9094efd00800 ffff9094eeea3420 ffffb500420b7d18
[14339.045718] Call Trace:
[14339.046109] [<ffffffffc05a4247>] cleanup_session_requests+0x87/0x130 [ceph]
[14339.047199] [<ffffffffc05a78ee>] dispatch+0x69e/0x16d0 [ceph]
[14339.048059] [<ffffffffc052920b>] ceph_con_workfn+0x5fb/0x2ef0 [libceph]
[14339.049099] [<ffffffffad0daa2c>] ? dequeue_task_fair+0x5ec/0x940
[14339.050035] [<ffffffffad05cf4e>] ? kvm_sched_clock_read+0x1e/0x30
[14339.051029] [<ffffffffad02fc49>] ? sched_clock+0x9/0x10
[14339.051807] [<ffffffffad025728>] ? __switch_to+0x2a8/0x5c0
[14339.052652] [<ffffffffad0bbe54>] process_one_work+0x184/0x430
[14339.053572] [<ffffffffad0bc14e>] worker_thread+0x4e/0x490
[14339.054364] [<ffffffffad0bc100>] ? process_one_work+0x430/0x430
[14339.055220] [<ffffffffad0bc100>] ? process_one_work+0x430/0x430
[14339.056336] [<ffffffffad0c1b59>] kthread+0xd9/0xf0
[14339.057270] [<ffffffffad0c1a80>] ? kthread_park+0x60/0x60
[14339.058275] [<ffffffffad0c1a80>] ? kthread_park+0x60/0x60
[14339.059293] [<ffffffffad80cb15>] ret_from_fork+0x25/0x30
[14339.060293] Code: 11 00 00 74 e6 48 8b 40 f8 49 89 84 24 f8 00 00 00 e9 96 fe ff ff 4c 8b a3 88 00 00 00 4d 85 e4 0f 85 56 ff ff ff e9 cb fe ff ff <0f> 0b 90 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 49
[14339.064785] RIP [<ffffffffc05a41bd>] __unregister_request+0x1ad/0x1b0 [ceph]
[14339.065884] RSP <ffffb500420b7c08>
[14339.067866] ---[ end trace 8a0bb333836080c5 ]---
...the debugger says:
(gdb) list *(__unregister_request+0x1ad)
0x231ed is in __unregister_request (fs/ceph/mds_client.c:576).
571		put_request_session(req);
572		ceph_unreserve_caps(req->r_mdsc, &req->r_caps_reservation);
573		kfree(req);
574	}
575
576	DEFINE_RB_FUNCS(request, struct ceph_mds_request, r_tid, r_node)
577
578	/*
579	 * lookup session, bump ref if found.
580	 *
Which may be the BUG_ON in DEFINE_RB_INSDEL_FUNCS:
BUG_ON(!RB_EMPTY_NODE(&t->nodefld));
...unfortunately I didn't get a vmcore, so it's hard to know for sure.
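For readers unfamiliar with this kind of assertion, here is a minimal userspace sketch of a node-state check like the BUG_ON quoted above. It is not the kernel's code: `demo_node`, `demo_clear_node`, `demo_insert`, and `demo_erase` are hypothetical names, a bare parent pointer stands in for the real rbtree linkage, and the helpers return false where a BUG_ON-style check would fire. The assumption is that an unlinked ("empty") node is marked by pointing its parent field back at itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rbtree node. A node counts as "empty"
 * (not linked into any tree) when its parent field points at itself. */
struct demo_node {
    struct demo_node *parent;
};

/* Mark a node as unlinked. */
static void demo_clear_node(struct demo_node *n)
{
    n->parent = n;
}

/* True when the node is not currently linked anywhere. */
static bool demo_node_empty(const struct demo_node *n)
{
    return n->parent == n;
}

/* Link n under root. Returns false where an insert-side
 * "node must be empty" assertion would fire. */
static bool demo_insert(struct demo_node *root, struct demo_node *n)
{
    if (!demo_node_empty(n))
        return false;   /* already linked: registered twice */
    n->parent = root;
    return true;
}

/* Unlink n. Returns false where an erase-side
 * "node must not be empty" assertion would fire. */
static bool demo_erase(struct demo_node *n)
{
    if (demo_node_empty(n))
        return false;   /* not linked: unregistered twice */
    demo_clear_node(n);
    return true;
}
```

Under this model, inserting a request that is already linked, or erasing one that is not linked (or no longer linked), trips the check. That is the class of double register/unregister bug such an assertion is meant to catch.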