Bug #18474

closed

oops in __unregister_request

Added by Jeff Layton over 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: fs/ceph
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I left an xfstests run going overnight, and when I came back I saw this oops on the console:

[14231.902682] run fstests generic/205 at 2017-01-09 23:11:24
[14339.010567] ceph:  dropping unsafe request 18446621567284867072
[14339.011516] ------------[ cut here ]------------
[14339.012188] kernel BUG at fs/ceph/mds_client.c:576!
[14339.012816] invalid opcode: 0000 [#1] SMP
[14339.013338] Modules linked in: loop ceph(OE) libceph fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_raw ip6table_mangle ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 iptable_raw iptable_mangle iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables nfsd auth_rpcgss crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nfs_acl lockd ppdev joydev acpi_cpufreq parport_pc tpm_tis pcspkr parport i2c_piix4 tpm_tis_core tpm virtio_balloon qemu_fw_cfg grace sunrpc xfs libcrc32c virtio_blk virtio_net virtio_console qxl drm_kms_helper ata_generic ttm serio_raw crc32c_intel drm pata_acpi virtio_pci virtio_ring virtio floppy
[14339.026549] CPU: 1 PID: 29659 Comm: kworker/1:2 Tainted: G           OE   4.9.0+ #13
[14339.027572] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
[14339.029143] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[14339.029916] task: ffff9094ea818000 task.stack: ffffb500420b4000
[14339.030676] RIP: 0010:[<ffffffffc05a41bd>]  [<ffffffffc05a41bd>] __unregister_request+0x1ad/0x1b0 [ceph]
[14339.032275] RSP: 0018:ffffb500420b7c08  EFLAGS: 00010246
[14339.033134] RAX: 0000000000000000 RBX: ffff9094c012f400 RCX: 0000000000000000
[14339.034282] RDX: 0000000000000000 RSI: ffff9094efb87100 RDI: ffff9094efb87000
[14339.035108] RBP: ffffb500420b7c20 R08: 0000000000000000 R09: 0000000000000000
[14339.036325] R10: 0000000000000c00 R11: 0000000000000479 R12: ffff9094efb87000
[14339.037455] R13: ffff9094c012f408 R14: ffff9094efd00db8 R15: ffff9094c012f738
[14339.038532] FS:  0000000000000000(0000) GS:ffff9094ffc80000(0000) knlGS:0000000000000000
[14339.039827] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14339.040674] CR2: 0000555bf06f56b8 CR3: 00000001311a5000 CR4: 00000000000406e0
[14339.041755] Stack:
[14339.042082]  ffff9094efd00800 ffff9094c012f400 ffff9094efb87000 ffffb500420b7c60
[14339.043306]  ffffffffc05a4247 ffff9094efb87008 ffff9094efb87000 0000000000000003
[14339.044514]  ffff9094eee4af00 ffff9094efd00800 ffff9094eeea3420 ffffb500420b7d18
[14339.045718] Call Trace:
[14339.046109]  [<ffffffffc05a4247>] cleanup_session_requests+0x87/0x130 [ceph]
[14339.047199]  [<ffffffffc05a78ee>] dispatch+0x69e/0x16d0 [ceph]
[14339.048059]  [<ffffffffc052920b>] ceph_con_workfn+0x5fb/0x2ef0 [libceph]
[14339.049099]  [<ffffffffad0daa2c>] ? dequeue_task_fair+0x5ec/0x940
[14339.050035]  [<ffffffffad05cf4e>] ? kvm_sched_clock_read+0x1e/0x30
[14339.051029]  [<ffffffffad02fc49>] ? sched_clock+0x9/0x10
[14339.051807]  [<ffffffffad025728>] ? __switch_to+0x2a8/0x5c0
[14339.052652]  [<ffffffffad0bbe54>] process_one_work+0x184/0x430
[14339.053572]  [<ffffffffad0bc14e>] worker_thread+0x4e/0x490
[14339.054364]  [<ffffffffad0bc100>] ? process_one_work+0x430/0x430
[14339.055220]  [<ffffffffad0bc100>] ? process_one_work+0x430/0x430
[14339.056336]  [<ffffffffad0c1b59>] kthread+0xd9/0xf0
[14339.057270]  [<ffffffffad0c1a80>] ? kthread_park+0x60/0x60
[14339.058275]  [<ffffffffad0c1a80>] ? kthread_park+0x60/0x60
[14339.059293]  [<ffffffffad80cb15>] ret_from_fork+0x25/0x30
[14339.060293] Code: 11 00 00 74 e6 48 8b 40 f8 49 89 84 24 f8 00 00 00 e9 96 fe ff ff 4c 8b a3 88 00 00 00 4d 85 e4 0f 85 56 ff ff ff e9 cb fe ff ff <0f> 0b 90 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 49
[14339.064785] RIP  [<ffffffffc05a41bd>] __unregister_request+0x1ad/0x1b0 [ceph]
[14339.065884]  RSP <ffffb500420b7c08>
[14339.067866] ---[ end trace 8a0bb333836080c5 ]---

...the debugger says:

(gdb) list *(__unregister_request+0x1ad)
0x231ed is in __unregister_request (fs/ceph/mds_client.c:576).
571        put_request_session(req);
572        ceph_unreserve_caps(req->r_mdsc, &req->r_caps_reservation);
573        kfree(req);
574    }
575    
576    DEFINE_RB_FUNCS(request, struct ceph_mds_request, r_tid, r_node)
577    
578    /*
579     * lookup session, bump ref if found.
580     *

That may be the BUG_ON in DEFINE_RB_INSDEL_FUNCS:

        BUG_ON(!RB_EMPTY_NODE(&t->nodefld));

...unfortunately I didn't get a vmcore, so it's hard to know for sure.
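
For readers unfamiliar with the check quoted above, here is a rough user-space sketch of the "empty node" convention that RB_EMPTY_NODE and RB_CLEAR_NODE rely on: a node that is not linked into a tree has its parent word pointing at itself. Everything named sketch_* is hypothetical and is not the kernel code. The helpers return -1 where the kernel helpers would BUG, and the sketch assumes the erase helper generated by the same macro carries the inverse check, BUG_ON(RB_EMPTY_NODE(...)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct rb_node: only the parent/color word. */
struct sketch_node {
    uintptr_t parent_color;
};

/* Like RB_CLEAR_NODE: mark the node as unlinked by pointing it at itself. */
static void sketch_clear(struct sketch_node *n)
{
    n->parent_color = (uintptr_t)n;
}

/* Like RB_EMPTY_NODE: true when the node is not linked into any tree. */
static bool sketch_empty(const struct sketch_node *n)
{
    return n->parent_color == (uintptr_t)n;
}

/*
 * Hypothetical insert: returns -1 where the kernel helper would hit
 * BUG_ON(!RB_EMPTY_NODE(...)). No real tree or rebalancing here; the
 * node is only marked as linked under `parent` (NULL for the root).
 */
static int sketch_insert(struct sketch_node *n, struct sketch_node *parent)
{
    if (!sketch_empty(n))
        return -1;
    n->parent_color = (uintptr_t)parent;
    return 0;
}

/*
 * Hypothetical erase: returns -1 for a node that is already unlinked
 * (assumed to correspond to the inverse BUG_ON in the erase helper).
 */
static int sketch_erase(struct sketch_node *n)
{
    if (sketch_empty(n))
        return -1;
    sketch_clear(n);
    return 0;
}

/* Erase the same node twice and return the result of the second erase. */
static int sketch_double_erase(void)
{
    struct sketch_node req;
    int rc;

    sketch_clear(&req);
    rc = sketch_insert(&req, NULL);
    assert(rc == 0);
    rc = sketch_erase(&req);
    assert(rc == 0);
    return sketch_erase(&req);
}
```

In this sketch, sketch_double_erase() returns -1: the second erase finds the node already unlinked. That is the kind of condition either check would catch, for example a request being unregistered twice or registered while still linked. Without a vmcore, whether that is what happened here remains a guess.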
