Bug #46434

nautilus: osdc: FAILED ceph_assert(bh->waitfor_read.empty())

Added by Ramana Raja 27 days ago. Updated 26 days ago.

Status: New
Priority: High
Assignee: -
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: octopus,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: multimds
Component(FS): osdc
Labels (FS):
Pull request ID:
Crash signature:

Description

During Yuri's nautilus backport test run, the following failure occurred in the multimds suite's xfs tests, running in an Ubuntu xenial environment with FUSE clients.

2020-07-08T19:24:00.514 INFO:tasks.workunit.client.0.smithi124.stderr:+ ./fsx -z 1MB -N 50000 -p 10000 -l 1048576
2020-07-08T19:24:00.570 INFO:tasks.workunit.client.0.smithi124.stderr:fsx: main: filesystem does not support fallocate mode 0x8, disabling!
2020-07-08T19:24:00.570 INFO:tasks.workunit.client.0.smithi124.stderr:: Operation not supported
2020-07-08T19:24:00.571 INFO:tasks.workunit.client.0.smithi124.stderr:fsx: main: filesystem does not support fallocate mode 0x20, disabling!
2020-07-08T19:24:00.571 INFO:tasks.workunit.client.0.smithi124.stderr:: Operation not supported
2020-07-08T19:24:00.571 INFO:tasks.workunit.client.0.smithi124.stdout:skipping zero size read
2020-07-08T19:24:00.573 INFO:tasks.workunit.client.0.smithi124.stdout:skipping zero length punch hole
2020-07-08T19:24:00.574 INFO:tasks.workunit.client.0.smithi124.stdout:skipping zero size read
2020-07-08T19:24:00.574 INFO:tasks.workunit.client.0.smithi124.stdout:fallocating to largest ever: 0xbc839
2020-07-08T19:24:00.632 INFO:tasks.workunit.client.0.smithi124.stdout:fallocating to largest ever: 0xdc566
2020-07-08T19:24:00.652 INFO:tasks.workunit.client.0.smithi124.stdout:truncating to largest ever: 0xe437a
2020-07-08T19:24:00.787 INFO:tasks.workunit.client.0.smithi124.stdout:truncating to largest ever: 0xf5939
2020-07-08T19:24:00.801 INFO:tasks.workunit.client.0.smithi124.stdout:fallocating to largest ever: 0x100000
2020-07-08T19:24:10.300 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: In function 'void ObjectCacher::Object::discard(loff_t, loff_t, C_GatherBuilder*)' thread 7f3d2f7fe700 time 2020-07-08 19:24:10.305526
2020-07-08T19:24:10.300 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty())
2020-07-08T19:24:10.301 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: ceph version 14.2.10-29-gaf4ccda (af4ccdaffeffda4b9071601ab3a6baccd01ead14) nautilus (stable)
2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f3d8ca52732]
2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f3d8ca5290d]
2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 3: (ObjectCacher::Object::discard(long, long, C_GatherBuilderBase<Context, C_GatherBase<Context, Context> >*)+0x715) [0x529085]
2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 4: (ObjectCacher::_discard(ObjectCacher::ObjectSet*, std::vector<ObjectExtent, std::allocator<ObjectExtent> > const&, C_GatherBuilderBase<Context, C_GatherBase<Context, Context> >*)+0x166) [0x529256]
2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 5: (ObjectCacher::discard_writeback(ObjectCacher::ObjectSet*, std::vector<ObjectExtent, std::allocator<ObjectExtent> > const&, Context*)+0x6f) [0x53013f]
2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 6: (Client::_invalidate_inode_cache(Inode*, long, long)+0xfd) [0x46e1cd]
2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 7: (Client::_fallocate(Fh*, int, long, long)+0x275) [0x4cd0a5]
2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 8: (Client::ll_fallocate(Fh*, int, long, long)+0x209) [0x4ce219]
2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 9: ceph-fuse() [0x44ffab]
2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 10: (()+0x13a42) [0x7f3d95758a42]
2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 11: (()+0x15679) [0x7f3d9575a679]
2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 12: (()+0x11e38) [0x7f3d95756e38]
2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 13: (()+0x76ba) [0x7f3d8c56d6ba]
2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 14: (clone()+0x6d) [0x7f3d8bd9641d]
2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:2020-07-08 19:24:10.302 7f3d2f7fe700 -1 /build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: In function 'void ObjectCacher::Object::discard(loff_t, loff_t, C_GatherBuilder*)' thread 7f3d2f7fe700 time 2020-07-08 19:24:10.305526
2020-07-08T19:24:10.305 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty())
2020-07-08T19:24:10.305 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:
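
For triage, it can help to strip the teuthology prefix and pull out the failed assertion and the backtrace frames. The following is a minimal sketch, not part of any Ceph or teuthology tooling: the function name `parse_crash` and the regular expressions are assumptions inferred only from the log excerpt above, so other log layouts may need adjustments.

```python
import re

# Assumption: each captured line starts with "<timestamp> INFO:<task path>:",
# as in the excerpt above. This is inferred from the log, not from a spec.
PREFIX = re.compile(r"^\S+ INFO:[\w.\-]+:")
# Backtrace frame such as " 3: (ObjectCacher::Object::discard(...)+0x715) [0x529085]"
FRAME = re.compile(r"^\s*(\d+): (.*?)(?: \[(0x[0-9a-f]+)\])?$")
# Assertion line such as ".../ObjectCacher.cc: 636: FAILED ceph_assert(...)"
ASSERT = re.compile(r"(\S+): (\d+): FAILED (ceph_assert\(.*\))")


def parse_crash(log_text):
    """Return (assertion, frames) extracted from a teuthology log excerpt.

    assertion is a dict with keys file, line, expr (or None if not found).
    frames is a list of (index, symbol, address) tuples in log order.
    """
    assertion = None
    frames = []
    for raw in log_text.splitlines():
        # Drop the teuthology prefix, if present, and keep the payload.
        line = PREFIX.sub("", raw, count=1)
        m = ASSERT.search(line)
        if m:
            # The assertion is logged twice; keep only the first occurrence.
            if assertion is None:
                assertion = {
                    "file": m.group(1),
                    "line": int(m.group(2)),
                    "expr": m.group(3),
                }
            continue
        m = FRAME.match(line)
        if m:
            frames.append((int(m.group(1)), m.group(2), m.group(3)))
    return assertion, frames
```

Feeding the excerpt above to `parse_crash` yields the assertion location in `ObjectCacher.cc` plus the numbered frames, which makes it easier to compare this backtrace against the other failed jobs in the same run.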

More info here:
https://pulpito.ceph.com/yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5209626/

Yuri ran the multimds suite twice. Both times, the above issue was observed only in Ubuntu xenial environments.
https://pulpito.ceph.com/yuriw-2020-07-06_17:30:41-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/
https://pulpito.ceph.com/yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/

History

#1 Updated by Ramana Raja 27 days ago

  • Description updated (diff)

#2 Updated by Ramana Raja 27 days ago

  • Description updated (diff)

#3 Updated by Patrick Donnelly 26 days ago

  • Subject changed from nautilus: xfs test failure in multi-mds suite to nautilus: osdc: FAILED ceph_assert(bh->waitfor_read.empty())
  • Description updated (diff)
  • Priority changed from Normal to High
  • Target version set to v16.0.0
  • Backport set to octopus,nautilus
  • Component(FS) osdc added

Interesting that it reproduced in both runs. We've not seen this before, but I suspect it exists on master too.
