Bug #5658

closed

kceph: warning: fs/ceph/inode.c:1000 ceph_fill_trace+0x760/0x900 [ceph]()

Added by Sage Weil almost 11 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I noticed this in dmesg:

[ 5354.344055] ------------[ cut here ]------------
[ 5354.348726] WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/ceph/inode.c:1000 ceph_fill_trace+0x760/0x900 [ceph]()
[ 5354.359820] Modules linked in: ceph libceph ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs reiserfs coretemp kvm_intel kvm nfsd nfs_acl exportfs auth_rpcgss ghash_clmulni_intel aesni_intel oid_registry nfs fscache ablk_helper cryptd lockd lrw sunrpc gf128mul glue_helper aes_x86_64 psmouse lpc_ich i7core_edac edac_core serio_raw mfd_core joydev microcode hed lp dcdbas parport hid_generic usbhid hid btrfs raid6_pq ixgbe dca ptp pps_core mdio mptsas mptscsih mptbase bnx2 scsi_transport_sas xor zlib_deflate crc32c_intel libcrc32c
[ 5354.408084] CPU: 5 PID: 4575 Comm: kworker/5:2 Tainted: G        W    3.10.0-ceph-00039-g77c8bf2 #1
[ 5354.417173] Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011
[ 5354.424700] Workqueue: ceph-msgr con_work [libceph]
[ 5354.429619]  ffffffffa0836290 ffff8802207f7a28 ffffffff816311b0 ffff8802207f7a68
[ 5354.437102]  ffffffff8103fae0 ffff88020cf08e00 ffff88020ad9d800 ffff880221819800
[ 5354.444573]  0000000000000000 ffff880220459800 ffff8802204cb400 ffff8802207f7a78
[ 5354.452060] Call Trace:
[ 5354.454527]  [<ffffffff816311b0>] dump_stack+0x19/0x1b
[ 5354.459697]  [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
[ 5354.465740]  [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
[ 5354.471599]  [<ffffffffa08158f0>] ceph_fill_trace+0x760/0x900 [ceph]
[ 5354.477985]  [<ffffffff81633cfc>] ? mutex_lock_nested+0x27c/0x360
[ 5354.484112]  [<ffffffffa08314a3>] ? dispatch+0xba3/0x1740 [ceph]
[ 5354.490159]  [<ffffffffa08314b8>] dispatch+0xbb8/0x1740 [ceph]
[ 5354.496029]  [<ffffffff81511196>] ? kernel_recvmsg+0x46/0x60
[ 5354.501716]  [<ffffffffa07dbe38>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
[ 5354.508450]  [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10
[ 5354.514313]  [<ffffffffa07df1f8>] con_work+0x1948/0x2d50 [libceph]
[ 5354.520530]  [<ffffffff81080c73>] ? idle_balance+0x133/0x180
[ 5354.526222]  [<ffffffff81071c08>] ? finish_task_switch+0x48/0x110
[ 5354.532336]  [<ffffffff81071c08>] ? finish_task_switch+0x48/0x110
[ 5354.538461]  [<ffffffff8105f37f>] ? process_one_work+0x16f/0x540
[ 5354.544492]  [<ffffffff8105f3ea>] process_one_work+0x1da/0x540
[ 5354.550355]  [<ffffffff8105f37f>] ? process_one_work+0x16f/0x540
[ 5354.556397]  [<ffffffff810605cc>] worker_thread+0x11c/0x370
[ 5354.561991]  [<ffffffff810604b0>] ? manage_workers.isra.20+0x2e0/0x2e0
[ 5354.568550]  [<ffffffff8106728a>] kthread+0xea/0xf0
[ 5354.573453]  [<ffffffff810671a0>] ? flush_kthread_worker+0x150/0x150
[ 5354.579841]  [<ffffffff8164071c>] ret_from_fork+0x7c/0xb0
[ 5354.585264]  [<ffffffff810671a0>] ? flush_kthread_worker+0x150/0x150
[ 5354.591648] ---[ end trace fbc981dc5d005af8 ]---
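
When reading a trace like the one above, it can help to separate the frames the kernel unwinder considers reliable from the ones prefixed with `?`, which are addresses found on the stack that may not be part of the actual call chain. The following is a minimal Python sketch for doing that; the regular expression and the function name `reliable_frames` are illustrative assumptions based on the line format seen in this log, not part of any kernel tooling.

```python
import re

# Matches lines such as:
#   [ 5354.471599]  [<ffffffffa08158f0>] ceph_fill_trace+0x760/0x900 [ceph]
#   [ 5354.477985]  [<ffffffff81633cfc>] ? mutex_lock_nested+0x27c/0x360
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"      # return address
    r"(?P<guess>\? )?"                   # '?' marks an unreliable frame
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"            # optional module name
)


def reliable_frames(dmesg_text):
    """Return the symbols of frames that are not marked with '?'."""
    frames = []
    for line in dmesg_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None or m.group("guess"):
            continue  # not a frame line, or an unreliable frame
        mod = m.group("mod")
        frames.append(f"{m.group('sym')} [{mod}]" if mod else m.group("sym"))
    return frames
```

Applied to the trace above, this should leave the chain dump_stack, warn_slowpath_common, warn_slowpath_null, ceph_fill_trace, dispatch, con_work, process_one_work, worker_thread, kthread, ret_from_fork, i.e. the warning fires while the messenger workqueue is dispatching an incoming message.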

The job was:
kernel:
  kdb: true
  sha1: 77c8bf2f972a9d6ff446c49a41678bf931bbee44
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mds:
        debug mds: 20
        debug ms: 1
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 884fa2fcb6d707b23317bab1da909586ddc27608
  ceph-deploy:
    conf:
      client:
        debug monc: 20
        debug ms: 1
        debug objecter: 20
        debug rados: 20
        log file: /var/log/ceph/ceph-..log
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: 884fa2fcb6d707b23317bab1da909586ddc27608
  s3tests:
    branch: next
  workunit:
    sha1: 884fa2fcb6d707b23317bab1da909586ddc27608
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
- - client.1
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- kclient:
  - client.0
- knfsd:
  - client.0
- nfs:
    client.1:
      options:
      - rw
      - hard
      - intr
      - nfsvers=3
      server: client.0
- workunit:
    clients:
      client.1:
      - suites/dbench-short.sh

The task is hung (though I didn't verify that the warning and the hang are related).


Actions #1

Updated by Zheng Yan almost 11 years ago

Server::handle_client_lookup_ino() calls Server::reply_request() with a non-NULL tracedn. I think tracedn should be NULL for LOOKUPINO and LOOKUPHASH.

Actions #2

Updated by Zheng Yan almost 11 years ago

The warning and the hang should be related: the kclient ignores the caps in the inode trace, and the MDS then waits forever when revoking those caps. Please try the attached patch.
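
To make the chain of events described in the two comments above easier to follow, here is a toy Python model of it. It is only a sketch under stated assumptions: `ToyMDS`, `ToyClient`, `Reply` and `simulate` are hypothetical names, the cap names are placeholders, and none of this is the actual MDS or kernel client code. It models just three things: a reply to an inode lookup that carries a dentry, a client that warns and skips the caps in that case, and an MDS whose revoke is never acknowledged as a result.

```python
from dataclasses import dataclass, field

# Ops that look an inode up by number or hash rather than by name
# (op names taken from the comment above).
INO_LOOKUP_OPS = {"LOOKUPINO", "LOOKUPHASH"}


@dataclass
class Reply:
    op: str
    ino: int
    caps: frozenset
    has_dentry: bool  # stands in for a non-NULL tracedn


@dataclass
class ToyMDS:
    dentry_in_ino_lookup_reply: bool  # True models the reported behaviour
    issued: dict = field(default_factory=dict)  # ino -> caps the MDS believes it issued

    def handle(self, op, ino, caps):
        self.issued[ino] = frozenset(caps)
        has_dentry = op not in INO_LOOKUP_OPS or self.dentry_in_ino_lookup_reply
        return Reply(op, ino, frozenset(caps), has_dentry)

    def revoke(self, client, ino):
        # True if the client acknowledged; False stands in for "waits forever".
        if client.ack_revoke(ino):
            del self.issued[ino]
            return True
        return False


@dataclass
class ToyClient:
    held: dict = field(default_factory=dict)  # ino -> caps the client recorded
    warnings: int = 0

    def fill_trace(self, reply):
        if reply.op in INO_LOOKUP_OPS and reply.has_dentry:
            self.warnings += 1  # stands in for the WARNING in dmesg
            return              # assumption: caps in the trace are not recorded
        self.held[reply.ino] = reply.caps

    def ack_revoke(self, ino):
        # Assumption: the toy client can only acknowledge caps it recorded.
        return self.held.pop(ino, None) is not None


def simulate(dentry_in_ino_lookup_reply):
    """Return (number of client warnings, whether the revoke completed)."""
    mds = ToyMDS(dentry_in_ino_lookup_reply)
    client = ToyClient()
    client.fill_trace(mds.handle("LOOKUPINO", ino=0x1000, caps={"pin", "shared"}))
    return client.warnings, mds.revoke(client, 0x1000)
```

In this model, `simulate(True)` yields one warning and a revoke that never completes, while `simulate(False)` (no dentry in the reply, as proposed in comment #1) yields no warning and a completed revoke. How the real client handles the trace and why the real MDS blocks are not modelled here.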

Actions #3

Updated by Sage Weil almost 11 years ago

  • Status changed from New to 7

Looks right; we'll see how it does tonight.

Actions #4

Updated by Sage Weil almost 11 years ago

  • Status changed from 7 to Resolved

Actions #5

Updated by Greg Farnum almost 8 years ago

  • Component(FS) kceph added