Bug #2261

closed

paging error in libceph after crashed osd comes back online

Added by Pim van Riezen about 12 years ago. Updated almost 12 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:
0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

RIP: e030:[<ffffffff8152256a>]  [<ffffffff8152256a>] ceph_con_send+0x6b/0xc1
RSP: e02b:ffff88001db19c40  EFLAGS: 00010287
RAX: ffff88001d75a878 RBX: ffff88001d75a800 RCX: ffff88001d63b020
RDX: ffff88001d71c000 RSI: ffff88001d75a800 RDI: ffff880011bb89c8
RBP: ffff880011bb8830 R08: ffff88001da4a238 R09: 0000000000000002
R10: 0000000000000000 R11: ffff88001db19a70 R12: ffff880011bb89a8
R13: ffff88001da4a190 R14: ffff88001dbbcfb7 R15: 0000000000000000
FS:  00007f5fe9a6b6e0(0000) GS:ffff88001ffd9000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f5fe92c9d40 CR3: 000000001d6b7000 CR4: 0000000000000660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:2 (pid: 1762, threadinfo ffff88001db18000, task ffff88001e1c3000)
Stack:
 ffff88001d63b000 ffff88001da4a190 ffff88001d63b020 ffffffff81524805
 ffff88001da4a218 ffff88001da4a1e8 ffff88001dbbcfbb ffffffff81524846
 ffff88001da4a190 0000000000000000 ffff880011bb8800 ffffffff81525623
Call Trace:
 [<ffffffff81524805>] ? __send_request+0x91/0xa6
 [<ffffffff81524846>] ? send_queued+0x2c/0x53
 [<ffffffff81525623>] ? ceph_osdc_handle_map+0x2a4/0x2f5
 [<ffffffff815256be>] ? dispatch+0x4a/0x252
 [<ffffffff8152113c>] ? try_read+0xdcf/0xee7
 [<ffffffff8102d833>] ? finish_task_switch+0x4f/0x97
 [<ffffffff8153f43e>] ? __schedule+0x691/0x71d
 [<ffffffff81521609>] ? con_work+0xb0/0xd18
 [<ffffffff8100648d>] ? xen_force_evtchn_callback+0x9/0xa
 [<ffffffff81006a62>] ? check_events+0x12/0x20
 [<ffffffff8104c005>] ? process_one_work+0x1ce/0x320
 [<ffffffff81465e78>] ? tcp_sendpage+0x5ae/0x5ae
 [<ffffffff81521559>] ? ceph_parse_ips+0x1b0/0x1b0
 [<ffffffff8104dd5b>] ? worker_thread+0x11d/0x202
 [<ffffffff8104dc3e>] ? gcwq_mayday_timeout+0x60/0x60
 [<ffffffff81050ec2>] ? kthread+0x7e/0x86
 [<ffffffff81547d44>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff81546e73>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff81540d21>] ? retint_restore_args+0x5/0x6
 [<ffffffff81547d40>] ? gs_change+0x13/0x13
Code: 46 50 74 04 0f 0b eb fe 4c 8d a7 78 01 00 00 c6 86 b2 00 00 00 01 4c 89 e7 e8 22 d6 01 00 48 8b 7b 78 48 8d 43 78 48 39 c7 74 04 <0f> 0b eb fe 48 8d 95 98 01 00 00 48 8b 72 08 e8 2a 2c dc ff 0f 
RIP  [<ffffffff8152256a>] ceph_con_send+0x6b/0xc1
 RSP <ffff88001db19c40>
---[ end trace b7cc2a6ca88b241c ]---
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff81050b83>] kthread_data+0x7/0xc
PGD 1915067 PUD 1916067 PMD 0 
Oops: 0000 [#2] SMP 
CPU 0

#1

Updated by Alex Elder about 12 years ago

  • Status changed from New to In Progress
  • Assignee set to Alex Elder
  • Priority changed from Normal to High

#2

Updated by Josh Durgin about 12 years ago

  • Description updated (diff)

#3

Updated by Alex Elder almost 12 years ago

No progress on this.

There has been a lot of work on the messenger code since this bug was
reported. One change that could conceivably have fixed this is:

libceph: flush msgr queue during mon_client shutdown
commit f3dea7ed

In any case, it would be very nice if we could reproduce this problem,
and then see if it still exists.

I think, though, that since we have no information about what led
to this, we might as well change its Status to "Can't reproduce".

I'll leave that up to Sage, though.

#4

Updated by Sage Weil almost 12 years ago

  • Status changed from In Progress to Can't reproduce

The osd_client refcounting bug fix may explain this one, too: commit 0d47766f14211a73eaf54cab234db134ece79f49

Anyway, I'll close it out!
