Bug #2790

closed

libceph: crash in read_partial_message_section on ffsb

Added by Sage Weil almost 12 years ago. Updated almost 12 years ago.

Status: Duplicate
Priority: Urgent
Assignee:
Category: libceph
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description


[0]kdb> bt
Stack traceback for pid 4558
0xffff88020a66bf00     4558        2  1    0   R  0xffff88020a66c358 *kworker/0:2
 ffff8801121019e0 0000000000000018 0000000000000000 ffff88020a66bf00
 0000000000000000 ffffffff00000001 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 ffff88020a66bf00
Call Trace:
 [<ffffffff8108126c>] ? ttwu_stat+0x4c/0x140
 [<ffffffff810894c4>] ? __enqueue_entity+0x74/0x80
 [<ffffffff810812d8>] ? ttwu_stat+0xb8/0x140
 [<ffffffff8108126c>] ? ttwu_stat+0x4c/0x140
 [<ffffffffa04befaa>] ? con_work+0x1aba/0x2ed0 [libceph]
 [<ffffffffa04befaa>] ? con_work+0x1aba/0x2ed0 [libceph]
 [<ffffffff81507996>] ? kernel_recvmsg+0x46/0x60
 [<ffffffffa04bb778>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
 [<ffffffffa04bc73b>] ? read_partial_message_section.isra.17+0x6b/0xb0 [libceph]
 [<ffffffffa04bddbe>] ? con_work+0x8ce/0x2ed0 [libceph]
 [<ffffffff8108cee0>] ? load_balance+0xd0/0x7e0
 [<ffffffff8108da63>] ? idle_balance+0x133/0x180
 [<ffffffff8107fb28>] ? finish_task_switch+0x48/0x110
 [<ffffffffa04bd4f0>] ? ceph_msg_new+0x2e0/0x2e0 [libceph]
 [<ffffffff8106d18a>] ? process_one_work+0x18a/0x510
 [<ffffffff8106d11e>] ? process_one_work+0x11e/0x510
 [<ffffffff8106ecdf>] ? worker_thread+0x15f/0x350
 [<ffffffff8106eb80>] ? manage_workers.isra.27+0x230/0x230
 [<ffffffff8107411e>] ? kthread+0xae/0xc0
 [<ffffffff810ae29d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff816368f4>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff8162cb50>] ? _raw_spin_unlock_irq+0x30/0x40
 [<ffffffff8162d370>] ? retint_restore_args+0x13/0x13
 [<ffffffff81074070>] ? __init_kthread_worker+0x70/0x70

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-07-17_07:00:05-marginal-master-testing-basic/12842$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 381448be4f27edab3cdeea88bbf6670e19bf4b8a
nuke-on-error: true
overrides:
  ceph:
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: abe05a3fbbb120d8d354623258d9104584db66f7
  workunit:
    sha1: abe05a3fbbb120d8d354623258d9104584db66f7
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana25.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDuXajaQgHe9XnbLOzI8WWFYVz6+TnOiTzbkIJPGOZpzQEjnUtJraQIEt5ABSeovMjiEj+V4XvunfyuSmEd0H9giRSyjmCHTPGlpndfTeCdVtCBpNqf5GkUqHaEY1Hp57XPbya2rGlwtFm0NeIDYx6pfkejKnsTOUqwhgUb6950TRhjHQhMjFgyALSyfAm/4y6vGZfjm57+yyih6XgDkqWiiQ6Y/aJVR2n+iCzvqEzV7JSCU+Brn+k8IQLHho1fadYqc5PjYct5BaVlHcP6c+T8nJE/DvqGwZ4gQaVJcuWJiDfLOPPYo1g/0AFicxauLwVNJ6HFR9FjLLGtGU+2DcVN
  ubuntu@plana26.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDx0US96hot7gygZ69W4nxJQ9myYnn3I22YtOaSPe+yFWOPJVQUuOST+aw5K6JDcjdO2Gq0aS6s01mgoWpZlO/FVDKss7vZ2KjMp3uPkGMpDZarNbR3QTe5YZYrl7Wfw4pMu4jh92hCWJEzy5nH0H3X2YJhOd5BdOYz0P97qsMSPQGxhlvDBYBhDl9MLgsS3lKm/Js/OPLO+Uf3/SZceCjUqO2m3WsrJSiQJKh8XUWUu3z+6C1Wg6TXSSlA/jdVCiokDg7WYwPN9zMwzzGkGv+GUGHKMZaPGRZb9LQJLTBf/OjwRSgclAVdDc3vnZeYAS5+sDnt2grnJnlBd1rBUj3n
  ubuntu@plana30.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDYDpptyOTrVH3RqiwH5A//Q2CkkVz5dPTpd/s8qG/Q4EHVA4WDMu80pcDvdSewOfFJl83MEtDKKjuJOuEzI4OGn0DPptDN5wHC1OWrXqFMcIaWVe/KBYOdWEZbA7FECeXgEZR1Sid2bH7XDUE9AYalpS2/SmuuHEU1ObL6zSpAqoY6AIPCR6LgFrtxAqrYmIdpb8YfSuI5uPBv6qikl0yvam06WNerUNQ9lnZXFmFm1wBeicRvWH3jZ6w/xlQBIp/zG6k9IJa0vaLm+FqztLkDWri8Qz1dbdsz0bNjyzD6iRuDOpgmz0Kf8m2IjaJRgRgz2ARcOOdBJKmwnnW/knk5
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph: null
- kclient: null
- workunit:
    clients:
      all:
      - suites/ffsb.sh

Very reproducible.

Related issues 1 (0 open, 1 closed)

Related to Linux kernel client - Bug #2867: kclient: crash from ffsb in con_work -> kernel_sendmsg (Resolved, Sage Weil, 07/27/2012)
