Bug #15919

closed

BUG_ON in ceph_readdir() on dbench, fsstress

Added by Ilya Dryomov almost 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: fs/ceph
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Yuri reported three smithis stuck in kdb. kdb was misconfigured so I couldn't get anything reliable out of it (not even a partial stack trace), but I am fairly certain it was one of the BUG_ONs in ceph_readdir(). It was the same BUG_ON in all three instances. I did some more poking today and I think it was

BUG_ON(rde->offset < ctx->pos);

on line 501 - at least that's where I'd start my investigation.
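
For readers unfamiliar with the check, here is a rough user-space sketch of what that BUG_ON asserts: every directory entry about to be emitted must carry an offset at or past the current readdir position. The struct names, the helper names, and the rule that the position advances to offset + 1 after each entry are assumptions made for illustration; this is not the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures. Only the two
 * fields named in the BUG_ON are modelled. */
struct sketch_rde { int64_t offset; };
struct sketch_ctx { int64_t pos; };

/* True when BUG_ON(rde->offset < ctx->pos) would fire for this entry. */
static bool would_trip_bug_on(const struct sketch_rde *rde,
                              const struct sketch_ctx *ctx)
{
    return rde->offset < ctx->pos;
}

/* Walk the entries in order and return the index of the first one that
 * would trip the check, or -1 if none does.
 * Assumption: after an entry is emitted, the position moves just past it. */
static long first_violation(const struct sketch_rde *rdes, size_t n,
                            struct sketch_ctx *ctx)
{
    for (size_t i = 0; i < n; i++) {
        if (would_trip_bug_on(&rdes[i], ctx))
            return (long)i;
        ctx->pos = rdes[i].offset + 1;
    }
    return -1;
}
```

Under these assumptions, a reply whose entries arrive out of offset order (for example offsets 2, 5, 4) would trip the check on the third entry, while a reply with increasing offsets would pass.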

Actions #1

Updated by Ilya Dryomov almost 8 years ago

Here is a stack trace from an undamaged kdb:

sysname    Linux
release    4.6.0-rc3-ceph-15696-g9fe8e19
version    #1 SMP Wed May 18 02:01:35 PDT 2016
machine    x86_64
nodename   smithi003
domainname (none)
ccversion  CCVERSION
date       2016-05-18 17:54:23 tz_minuteswest 0
uptime     00:54
load avg   10.55 9.39 5.88

MemTotal:       32825452 kB
MemFree:        30333888 kB
Buffers:           58888 kB

Stack traceback for pid 8739
0xffff8808563bcd80     8739     8735  1    0   R  0xffff8808563bf000 *dbench
 ffff8808564dfd78 0000000000000018 ffffffffa052e090 ffff8807daa03178
 ffff8808564dfdb0 ffffffff8110353f ffff8808563bcd80 ffffffff817ec955
 ffff8808563bcd80 ffff8807daa03180 ffff8808564dfdd0 ffffffff811036b3
Call Trace:
 [<ffffffffa052e090>] ? ceph_readdir+0xf30/0x1310 [ceph]
 [<ffffffff8110353f>] ? mark_held_locks+0x6f/0x80
 [<ffffffff817ec955>] ? mutex_lock_killable_nested+0x375/0x420
 [<ffffffff811036b3>] ? trace_hardirqs_on_caller+0x163/0x1d0
 [<ffffffff8110372d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81279faf>] ? iterate_dir+0x8f/0x130
 [<ffffffff8127a4c5>] ? SyS_getdents+0x95/0x130
 [<ffffffff8127a150>] ? fillonedir+0x100/0x100
 [<ffffffff81003cfe>] ? do_syscall_64+0x6e/0x170
 [<ffffffff817efb3f>] ? entry_SYSCALL64_slow_path+0x25/0x25

ax: 0000000000000005  bx: ffff8808519221e0  cx: 0000000000000000
dx: 0000000000000000  si: ffff8808555bb477  di: 000000000097007e
bp: ffff8808564dfe68  sp: ffff8808564dfd78  r8: 0000000000000001
r9: 0000000000000008  r10: ffff8808563bd628  r11: 0000000000000000
r12: ffff8808564dfec0  r13: ffff8807daa030a0  r14: ffff880856a0c800
r15: 0000000000000004  ip: ffffffffa052df82  flags: 00010293  cs: 00000010
ss: 00000018  ds: 00000018  es: 00000018  fs: 00000018  gs: 00000018
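
As a cross-check on the trace above, the saved ip can be placed relative to ceph_readdir with simple arithmetic. The sketch below assumes the `?`-marked ceph_readdir+0xf30/0x1310 entry is a genuine address inside the function; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Base address of a symbol, given one address inside it and that
 * address's offset as printed in the trace (symbol+offset/size). */
static uint64_t symbol_base(uint64_t addr, uint64_t offset)
{
    return addr - offset;
}

/* Offset of ip within [base, base + size), or -1 if ip lies outside. */
static int64_t offset_in_symbol(uint64_t ip, uint64_t base, uint64_t size)
{
    if (ip < base || ip - base >= size)
        return -1;
    return (int64_t)(ip - base);
}
```

With the values from the trace, symbol_base(0xffffffffa052e090, 0xf30) gives 0xffffffffa052d160, and offset_in_symbol(0xffffffffa052df82, that base, 0x1310) gives 0xe22. So the ip falls at ceph_readdir+0xe22, inside the function, which is consistent with the crash being one of the BUG_ONs in ceph_readdir().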
Actions #2

Updated by Ilya Dryomov almost 8 years ago

  • Priority changed from Normal to Urgent
Actions #3

Updated by Zheng Yan almost 8 years ago

  • Status changed from New to 7

updated commit "ceph: using hash value to compose dentry offset"

Actions #4

Updated by Ilya Dryomov almost 8 years ago

  • Status changed from 7 to Resolved