Bug #252
GFP at tcp_sendpage+0x327/0x5d3 (closed)
% Done: 0%
Description
Just saw this on both ceph2 and ceph4 while running bonnie.sh and (possibly) iozone. It also happened a few times earlier this week.
The client is on the master branch, eb2d68b502a3460a28ec21076ba759d38023131c, plus yehuda's module loading fix.
[254410.244871] ceph: osd1 10.3.14.128:6800 connection failed
[254413.248422] ceph: osd1 10.3.14.128:6800 connection failed
[254417.499979] ceph: osd1 10.3.14.128:6800 connection failed
[254420.503820] ceph: osd1 10.3.14.128:6800 connection failed
[254427.503082] ceph: osd1 10.3.14.128:6800 connection failed
[254438.510184] ceph: osd1 10.3.14.128:6800 connection failed
[254457.548634] ceph: osd1 10.3.14.128:6800 connection failed
[254492.553782] ceph: osd1 10.3.14.128:6800 connection failed
[254557.464357] ceph: tid 441784 timed out on osd1, will reset osd
[254559.556324] ceph: osd1 10.3.14.128:6800 connection failed
[254611.507868] ceph: mds0 hung
[254622.563055] ceph: tid 441784 timed out on osd1, will reset osd
[254682.654162] ceph: tid 441784 timed out on osd1, will reset osd
[254688.691700] ceph: mon0 10.3.14.136:6789 socket closed
[254688.697331] ceph: mon0 10.3.14.136:6789 session lost, hunting for new mon
[254688.950176] ceph: osd2 down
[254688.953154] ceph: osd8 down
[254688.956312] ceph: mon0 10.3.14.136:6789 session established
[254697.685943] ceph: osd3 down
[254697.688915] ceph: osd4 down
[254704.047740] ceph: osd8 up
[254704.050559] ceph: osd8 weight 0x10000 (in)
[254707.701106] ceph: osd2 up
[254707.703880] ceph: osd2 weight 0x10000 (in)
[254709.429926] ceph: get_reply unknown tid 441781 from osd7
[254712.649652] ceph: mds0 came back
[254712.653038] ceph: mds0 caps went stale, renewing
[254712.712709] ceph: osd4 up
[254712.715494] ceph: osd4 weight 0x10000 (in)
[254717.724337] ceph: osd3 up
[254717.727128] ceph: osd3 weight 0x10000 (in)
[254735.795747] general protection fault: 0000 [#1] PREEMPT SMP
[254735.797730] last sysfs file: /sys/kernel/uevent_seqnum
[254735.797730] CPU 0
[254735.797730] Modules linked in: aes_x86_64 aes_generic ceph fan ac battery ehci_hcd uhci_hcd container thermal processor button
[254735.797730]
[254735.797730] Pid: 2859, comm: ceph-msgr/0 Not tainted 2.6.35-rc3+ #44 PDSMi+/PDSMi
[254735.797730] RIP: 0010:[<ffffffff813df46b>] [<ffffffff813df46b>] tcp_sendpage+0x327/0x5d3
[254735.797730] RSP: 0018:ffff88011bccbbd0 EFLAGS: 00010246
[254735.797730] RAX: ffffffff8171b390 RBX: ffff88011bed2aa8 RCX: 000000000000fe88
[254735.797730] RDX: 6b6b6b6b6b6b6b6b RSI: 00000000000001c8 RDI: ffff88007dacd3e0
[254735.797730] RBP: ffff88011bccbc60 R08: 0000000000000000 R09: ffffffff813df19d
[254735.797730] R10: ffffffff810aa8dc R11: ffff8800a2ee69b8 R12: ffff88011c4adde0
[254735.797730] R13: 0000000000000d4c R14: 0000000000000000 R15: 00000000000002b4
[254735.797730] FS: 0000000000000000(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[254735.797730] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[254735.797730] CR2: 00007ff945586250 CR3: 000000011bde6000 CR4: 00000000000006f0
[254735.797730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[254735.797730] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[254735.797730] Process ceph-msgr/0 (pid: 2859, threadinfo ffff88011bcca000, task ffff88011bcd03d0)
[254735.797730] Stack:
[254735.797730] 0000000000000000 0000000000000000 0000000000000000 ffff88011bed2c80
[254735.797730] <0> 0000c04000000000 0000000000000d4c 000002b400004040 0000000000000000
[254735.797730] <0> 00008800000005a8 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b 0000000000000000
[254735.797730] Call Trace:
[254735.797730] [<ffffffff813a49bd>] kernel_sendpage+0x16/0x1f
[254735.797730] [<ffffffffa00f8758>] try_write+0x649/0xff4 [ceph]
[254735.797730] [<ffffffffa00f9bfa>] con_work+0x135/0x6b2 [ceph]
[254735.797730] [<ffffffff81048406>] worker_thread+0x1e8/0x2fa
[254735.797730] [<ffffffff810483ad>] ? worker_thread+0x18f/0x2fa
[254735.797730] [<ffffffff8102ce5c>] ? sub_preempt_count+0x92/0x9e
[254735.797730] [<ffffffffa00f9ac5>] ? con_work+0x0/0x6b2 [ceph]
[254735.797730] [<ffffffff8104b4c8>] ? autoremove_wake_function+0x0/0x38
[254735.797730] [<ffffffff8104821e>] ? worker_thread+0x0/0x2fa
[254735.797730] [<ffffffff8104b196>] kthread+0x7d/0x85
[254735.797730] [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
[254735.797730] [<ffffffff8144fc80>] ? restore_args+0x0/0x30
[254735.797730] [<ffffffff8104b119>] ? kthread+0x0/0x85
[254735.797730] [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
[254735.797730] Code: 00 45 85 c0 74 21 41 8b 94 24 a8 00 00 00 41 8d 46 ff 49 03 94 24 b0 00 00 00 48 98 48 c1 e0 04 44 01 6c 02 3c eb 61 48 8b 55 b8 <66> 83 3a 00 79 04 48 8b 52 10 8b 42 08 85 c0 75 04 0f 0b eb fe
[254735.797730] RIP [<ffffffff813df46b>] tcp_sendpage+0x327/0x5d3
[254735.797730] RSP <ffff88011bccbbd0>
[254736.074181] ---[ end trace a23ea86bf8dbfa65 ]---
[254778.094378] ceph: tid 424933 timed out on osd3, will reset osd
[254838.185488] ceph: tid 424933 timed out on osd3, will reset osd
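A note on reading the trace: RDX holds 6b6b6b6b6b6b6b6b, and 0x6b is the byte the kernel's slab debugging uses to fill freed objects (POISON_FREE). That would be consistent with a pointer being read out of already-freed memory, although the report itself does not state the root cause. The following is a minimal sketch, not part of the original report, of how one might scan an oops register dump for that pattern; the function name `poisoned_registers` and the regular expression are assumptions made for illustration.

```python
import re

# Byte written over freed slab objects when slab debugging is enabled
# (POISON_FREE in the kernel's include/linux/poison.h).
POISON_FREE = 0x6B

# Matches "NAME: <16 hex digits>" pairs such as "RDX: 6b6b6b6b6b6b6b6b".
# Hypothetical pattern for this sketch; it only targets 64-bit register dumps.
REG_RE = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def poisoned_registers(oops_text, poison=POISON_FREE):
    """Return names of registers whose value is the poison byte repeated 8 times."""
    pattern = f"{poison:02x}" * 8
    return [name for name, value in REG_RE.findall(oops_text) if value == pattern]


# Two register lines copied from the trace in this report.
SAMPLE = (
    "RAX: ffffffff8171b390 RBX: ffff88011bed2aa8 RCX: 000000000000fe88\n"
    "RDX: 6b6b6b6b6b6b6b6b RSI: 00000000000001c8 RDI: ffff88007dacd3e0\n"
)

if __name__ == "__main__":
    print(poisoned_registers(SAMPLE))
```

A hit like this only suggests where to look; confirming a use-after-free still requires checking which object the faulting code was dereferencing.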
Updated by Sage Weil almost 14 years ago
- Status changed from New to Resolved
Ah, finally. Fixed by commit:ed98adad3d87594c55347824e85137d1829c9e70