Bug #5658

kceph: warning: fs/ceph/inode.c:1000 ceph_fill_trace+0x760/0x900 [ceph]()

Added by Sage Weil over 10 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I noticed this in dmesg:

[ 5354.344055] ------------[ cut here ]------------
[ 5354.348726] WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/ceph/inode.c:1000 ceph_fill_trace+0x760/0x900 [ceph]()
[ 5354.359820] Modules linked in: ceph libceph ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs reiserfs coretemp kvm_intel kvm nfsd nfs_acl exportfs auth_rpcgss ghash_clmulni_intel aesni_intel oid_registry nfs fscache ablk_helper cryptd lockd lrw sunrpc gf128mul glue_helper aes_x86_64 psmouse lpc_ich i7core_edac edac_core serio_raw mfd_core joydev microcode hed lp dcdbas parport hid_generic usbhid hid btrfs raid6_pq ixgbe dca ptp pps_core mdio mptsas mptscsih mptbase bnx2 scsi_transport_sas xor zlib_deflate crc32c_intel libcrc32c
[ 5354.408084] CPU: 5 PID: 4575 Comm: kworker/5:2 Tainted: G        W    3.10.0-ceph-00039-g77c8bf2 #1
[ 5354.417173] Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011
[ 5354.424700] Workqueue: ceph-msgr con_work [libceph]
[ 5354.429619]  ffffffffa0836290 ffff8802207f7a28 ffffffff816311b0 ffff8802207f7a68
[ 5354.437102]  ffffffff8103fae0 ffff88020cf08e00 ffff88020ad9d800 ffff880221819800
[ 5354.444573]  0000000000000000 ffff880220459800 ffff8802204cb400 ffff8802207f7a78
[ 5354.452060] Call Trace:
[ 5354.454527]  [<ffffffff816311b0>] dump_stack+0x19/0x1b
[ 5354.459697]  [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
[ 5354.465740]  [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
[ 5354.471599]  [<ffffffffa08158f0>] ceph_fill_trace+0x760/0x900 [ceph]
[ 5354.477985]  [<ffffffff81633cfc>] ? mutex_lock_nested+0x27c/0x360
[ 5354.484112]  [<ffffffffa08314a3>] ? dispatch+0xba3/0x1740 [ceph]
[ 5354.490159]  [<ffffffffa08314b8>] dispatch+0xbb8/0x1740 [ceph]
[ 5354.496029]  [<ffffffff81511196>] ? kernel_recvmsg+0x46/0x60
[ 5354.501716]  [<ffffffffa07dbe38>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
[ 5354.508450]  [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10
[ 5354.514313]  [<ffffffffa07df1f8>] con_work+0x1948/0x2d50 [libceph]
[ 5354.520530]  [<ffffffff81080c73>] ? idle_balance+0x133/0x180
[ 5354.526222]  [<ffffffff81071c08>] ? finish_task_switch+0x48/0x110
[ 5354.532336]  [<ffffffff81071c08>] ? finish_task_switch+0x48/0x110
[ 5354.538461]  [<ffffffff8105f37f>] ? process_one_work+0x16f/0x540
[ 5354.544492]  [<ffffffff8105f3ea>] process_one_work+0x1da/0x540
[ 5354.550355]  [<ffffffff8105f37f>] ? process_one_work+0x16f/0x540
[ 5354.556397]  [<ffffffff810605cc>] worker_thread+0x11c/0x370
[ 5354.561991]  [<ffffffff810604b0>] ? manage_workers.isra.20+0x2e0/0x2e0
[ 5354.568550]  [<ffffffff8106728a>] kthread+0xea/0xf0
[ 5354.573453]  [<ffffffff810671a0>] ? flush_kthread_worker+0x150/0x150
[ 5354.579841]  [<ffffffff8164071c>] ret_from_fork+0x7c/0xb0
[ 5354.585264]  [<ffffffff810671a0>] ? flush_kthread_worker+0x150/0x150
[ 5354.591648] ---[ end trace fbc981dc5d005af8 ]---

The job was:
kernel:
  kdb: true
  sha1: 77c8bf2f972a9d6ff446c49a41678bf931bbee44
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mds:
        debug mds: 20
        debug ms: 1
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 884fa2fcb6d707b23317bab1da909586ddc27608
  ceph-deploy:
    conf:
      client:
        debug monc: 20
        debug ms: 1
        debug objecter: 20
        debug rados: 20
        log file: /var/log/ceph/ceph-..log
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: 884fa2fcb6d707b23317bab1da909586ddc27608
  s3tests:
    branch: next
  workunit:
    sha1: 884fa2fcb6d707b23317bab1da909586ddc27608
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
- - client.1
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- kclient:
  - client.0
- knfsd:
  - client.0
- nfs:
    client.1:
      options:
      - rw
      - hard
      - intr
      - nfsvers=3
      server: client.0
- workunit:
    clients:
      client.1:
      - suites/dbench-short.sh

and the task is hung (though I didn't verify that the warning and hang are related).
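
For context on where this fires: the stack shows ceph_fill_trace processing an MDS reply on the messenger workqueue. Below is a minimal, hypothetical C++ model of the kind of consistency check such a WARN expresses (none of these names come from fs/ceph/inode.c): a by-inode lookup carries no dentry in the request, so a dentry trace in the reply is unexpected.

#include <cstdio>

// Illustrative model only: these names are assumptions, not the real fields
// used by the kernel client's reply-trace handling in fs/ceph/inode.c.
enum mds_op { OP_LOOKUP, OP_LOOKUPINO, OP_LOOKUPHASH };

struct reply_trace {
    bool has_dentry;   // reply carried a (parent dir, dentry) pair
    bool has_inode;    // reply carried inode metadata
};

// A by-inode lookup (LOOKUPINO/LOOKUPHASH) has no dentry in the request,
// so a dentry trace in the reply is unexpected and gets flagged; this is
// the kind of condition a WARN_ON() in the fill-trace path expresses.
static bool fill_trace_ok(mds_op op, const reply_trace &t)
{
    bool by_inode = (op == OP_LOOKUPINO || op == OP_LOOKUPHASH);
    if (by_inode && t.has_dentry) {
        std::fprintf(stderr, "WARNING: unexpected dentry trace in reply\n");
        return false;
    }
    return true;
}

int main()
{
    reply_trace buggy{true,  true};   // what the unpatched MDS sent
    reply_trace fixed{false, true};   // what the patched MDS sends
    std::printf("buggy reply accepted: %d\n", fill_trace_ok(OP_LOOKUPINO, buggy));
    std::printf("fixed reply accepted: %d\n", fill_trace_ok(OP_LOOKUPINO, fixed));
    return 0;
}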

0001-mds-tracedn-should-be-NULL-for-LOOKUPINO-LOOKUPHASH-.patch (834 Bytes) Zheng Yan, 07/17/2013 07:07 PM

Associated revisions

Revision dd0246d2 (diff)
Added by Yan, Zheng over 10 years ago

mds: tracedn should be NULL for LOOKUPINO/LOOKUPHASH reply

Fixes: #5658
Signed-off-by: Yan, Zheng <>
Reviewed-by: Sage Weil <>

History

#1 Updated by Zheng Yan over 10 years ago

Server::handle_client_lookup_ino() calls Server::reply_request() with a non-NULL tracedn. I think tracedn should be NULL for LOOKUPINO and LOOKUPHASH.
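
A minimal sketch of the shape of that change, with stub types standing in for the real MDS classes (MDRequest, CInode, CDentry and the reply_request() signature here are assumptions, not code from mds/Server.cc): a by-inode lookup replies with the inode only and passes NULL as the tracedn.

#include <cstddef>
#include <cstdio>

// Stub types standing in for the real MDS classes; everything in this
// snippet is an illustrative assumption, not Ceph's actual code.
struct MDRequest {};
struct CInode  { long long ino; };
struct CDentry { const char *name; };

// Stand-in for Server::reply_request(): the last argument is the tracedn,
// the dentry that would be encoded into the reply trace sent to the client.
static void reply_request(MDRequest *, int result, CInode *in, CDentry *tracedn)
{
    std::printf("reply: r=%d ino=%lld tracedn=%s\n",
                result, in ? in->ino : -1LL,
                tracedn ? tracedn->name : "(none)");
}

// The shape of the fix: a LOOKUPINO/LOOKUPHASH reply carries the inode only,
// never a dentry trace, so tracedn is always NULL here.
static void handle_client_lookup_ino(MDRequest *mdr, CInode *in, CDentry *dn)
{
    (void)dn;                         // before the fix this was passed through
    reply_request(mdr, 0, in, NULL);  // after the fix: tracedn is NULL
}

int main()
{
    MDRequest mdr;
    CInode in{0x10000000001LL};
    CDentry dn{"some_name"};
    handle_client_lookup_ino(&mdr, &in, &dn);
    return 0;
}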

#2 Updated by Zheng Yan over 10 years ago

The warning and hang should be related: the kclient ignores caps in the inode trace, and the MDS waits forever when revoking these caps. Please try the attached patch.
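
A toy model of that mismatch (all names are illustrative, not Ceph's real capability code): the MDS records the caps it issued in the reply trace, but the client, having ignored that part of the trace, never records them and so never acks the revoke, leaving the MDS-side revoke waiting indefinitely.

#include <cstdio>
#include <set>

// The client only knows about caps it actually recorded; a cap that arrived
// in an ignored reply trace was never recorded, so a revoke for it is
// silently dropped and never acked.
struct Client {
    std::set<long long> caps_held;

    bool handle_revoke(long long ino) {
        return caps_held.erase(ino) != 0;   // false: no ack is ever sent
    }
};

// The MDS believes the client holds every cap it issued, including the ones
// sent in the LOOKUPINO reply trace, and waits for an ack before proceeding.
struct MDS {
    std::set<long long> caps_issued;

    bool revoke(Client &c, long long ino) {
        if (!caps_issued.count(ino))
            return true;                    // nothing to revoke
        bool acked = c.handle_revoke(ino);
        if (acked)
            caps_issued.erase(ino);
        return acked;                       // false == the revoke stalls
    }
};

int main()
{
    MDS mds;
    Client client;
    long long ino = 0x10000000001LL;

    // Caps issued as part of the unexpected inode trace: the MDS records
    // them, but the client ignored the trace and recorded nothing.
    mds.caps_issued.insert(ino);

    std::printf("revoke completed: %d\n", mds.revoke(client, ino));
    return 0;
}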

#3 Updated by Sage Weil over 10 years ago

  • Status changed from New to 7

Looks right; we'll see how it does tonight.

#4 Updated by Sage Weil over 10 years ago

  • Status changed from 7 to Resolved

#5 Updated by Greg Farnum over 7 years ago

  • Component(FS) kceph added
