Bug #2868

closed

kclient: crash in __kick_osd_requests -> __reset_osd -> __remove_osd

Added by Sage Weil over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development

Description

[1]kdb> rd
ax: ffff88018dbf1018  bx: ffff880209677818  cx: ffff880208c1e019
dx: ffff88018dbf1018  si: ffff88018dbf1018  di: 0000000000000000
bp: ffff880223de7ba0  sp: ffff880223de7b90  r8: 0000000000000001
r9: 2222222222222222  r10: 2222222222222222  r11: 0000000000000000
r12: ffff880222559ab8  r13: ffff880222559958  r14: ffff88022205fd48
r15: ffff88022205f800  ip: ffffffff8131f076  flags: 00010286  cs: 00000010
ss: 00000018  ds: 00000018  es: 00000018  fs: 00000018  gs: 00000018

Stack traceback for pid 14261
0xffff88020aff0000    14261        2  1    1   R  0xffff88020aff0458 *kworker/1:3
<c> ffff880223de7b90<c> 0000000000000018<c> ffff88022205f800<c> ffff880222559958<c>
<c> ffff880223de7bc0<c> ffffffffa036ebec<c> ffff88022205f800<c> ffff880222559958<c>
<c> ffff880223de7bf0<c> ffffffffa036eff7<c> ffff88022205f800<c> ffff880222559958<c>
Call Trace:
 [<ffffffffa036ebec>] ? __remove_osd+0x3c/0xa0 [libceph]
 [<ffffffffa036eff7>] ? __reset_osd+0x137/0x180 [libceph]
 [<ffffffffa036f974>] ? __kick_osd_requests+0x34/0x210 [libceph]
 [<ffffffffa03700dc>] ? osd_reset+0x5c/0xa0 [libceph]
 [<ffffffffa036abc6>] ? con_work+0xe76/0x1620 [libceph]
 [<ffffffffa036eba1>] ? put_osd_con+0x11/0x20 [libceph]
 [<ffffffffa036aa8d>] ? con_work+0xd3d/0x1620 [libceph]
 [<ffffffffa0369d50>] ? try_read+0x1860/0x1860 [libceph]
 [<ffffffff8106d18a>] ? process_one_work+0x18a/0x510
 [<ffffffff8106d11e>] ? process_one_work+0x11e/0x510
 [<ffffffff8106d24e>] ? process_one_work+0x24e/0x510
 [<ffffffff8106d53c>] ? process_scheduled_works+0x2c/0x40
 [<ffffffff8106edfc>] ? worker_thread+0x27c/0x350
 [<ffffffff8106eb80>] ? manage_workers.isra.27+0x230/0x230
 [<ffffffff8107411e>] ? kthread+0xae/0xc0
 [<ffffffff810ae29d>] ? trace_hardirqs_on+0xd/0x10
[1]more> 
 [<ffffffff816368f4>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff8162cb50>] ? _raw_spin_unlock_irq+0x30/0x40
 [<ffffffff8162d370>] ? retint_restore_args+0x13/0x13
 [<ffffffff81074070>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff816368f0>] ? gs_change+0x13/0x13

        "name": "ubuntu@plana78.front.sepia.ceph.com", 
        "description": "/var/lib/teuthworker/archive/sage-2012-07-27_10:03:10-regression-wip-msgr-testing-basic/1005" 

kernel: &id001
  kdb: true
  sha1: 7f77b3063194c035c7ac6db634e300126d8f5896
nuke-on-error: true
overrides:
  ceph:
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 088e0a3a222ee78cca32b7073788a5713d66f89a
  workunit:
    sha1: 088e0a3a222ee78cca32b7073788a5713d66f89a
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana35.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtCEUU7ZjJxMBOYq9dlPZppwlahU15OIq1a/xqI0YIpAF7ZLf1eis04s6Kzvl45ooKP2J6DW598Lr9zTtFb6j9RoAkgIbIOkWZmLQjeZesi//7B8uXBw7niNzU1iBI83K8dPL2cwTsgwXXtoaD03pW9HCv4ADhH12/CsfoqNPfql7ze4JlI07seHmUqkBuBFxLVgdZDXb+Cjdg2sQnMS5qZo0OGBp2yfBla01Pk8V1sHGg39miF0EDJZyIOL2ziZOcm1f02WVkUAD8MMOtfQbf33DT/ya4Th20YtRMQxNLkUYG6rxcjutzILFKDr7ugbvOwtBhVS7qplqLWAwHdi0p
  ubuntu@plana76.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtJtDUOZRIhDIfUJTBD9WE3sYbUCFcKu199VUo9RMecazIMnNyCo+3W3efhamKAWayobQXtEko0WspOFXXjEyKKXPtExAtMmw/giNOvtbg/Pz1ghDMMwsFRlvsFD6PYt6xwrEZ/JRETnXishmSQHVC6Gel6deyYzY2sV+BpquslgufihCjVlJmUbUf2tVzUhc8Kxl2bWqwJcKzfSSD9ALoh3t8CWUKH/mvxMNnJv8vDHbtBO2rRUQjLIN1kAm0gsME6Dfg7AQPIDrLw8/ls+R9jWZye6tIbtiGBPRsWSaiVZzuydpgVZarXrwfwcihJs7e8NyT3B5EgszAQSgprYTp
  ubuntu@plana78.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5u611hXVaR3i+fQJ3IWIw9y4mhS0WrusfWBkk4lDeuEyaJy0SqxbN6D5GhB7KzLZfpyiVcmdFzxk4w9Kc3XgQpJ42mNx4EDTen5ndI8HL8GclRd71RaocVIerynAD94lCAVzLSQCRBOA3HLH35OITPMR+ztZG8zSXUhhdYpwCLTJVb2BE8bA47GzyhCvu4L1JYwnIRzVUrufwugSS4odhlBJvge++omtL+r3Cm8M8W3Ahow+Yk8Ni40m60j82CEs7/Mc95j/CXYIQzwJWkIeRoQMMgT3HmK9koJRzEBkK8FQS9KMnrEi6mJAa+8eL/AgVMvyxpz+W7z77cH5p2Kab
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds: null
- kclient: null
- workunit:
    clients:
      all:
      - suites/ffsb.sh

Related issues: 1 (0 open, 1 closed)

Related to Linux kernel client - Bug #2688: lockup on ffsb + thrashing (Duplicate, 07/02/2012)

Actions #1

Updated by Sage Weil over 11 years ago

  • Project changed from Ceph to Linux kernel client
Actions #2

Updated by Sage Weil over 11 years ago

  • Description updated (diff)
Actions #3

Updated by Sage Weil over 11 years ago

  • Status changed from New to 7

Hoping this was caused by the messenger locking problems; let's see if it pops up again.

Actions #4

Updated by Sage Weil over 11 years ago

  • Status changed from 7 to Resolved