Bug #46434
osdc: FAILED ceph_assert(bh->waitfor_read.empty())
0%
Description
During Yuri's nautilus backport test run, hit the following failure error in multimds suite's xfs tests running in Ubuntu xenial environment with FUSE clients.
020-07-08T19:24:00.514 INFO:tasks.workunit.client.0.smithi124.stderr:+ ./fsx -z 1MB -N 50000 -p 10000 -l 1048576 2020-07-08T19:24:00.570 INFO:tasks.workunit.client.0.smithi124.stderr:fsx: main: filesystem does not support fallocate mode 0x8, disabling! 2020-07-08T19:24:00.570 INFO:tasks.workunit.client.0.smithi124.stderr:: Operation not supported 2020-07-08T19:24:00.571 INFO:tasks.workunit.client.0.smithi124.stderr:fsx: main: filesystem does not support fallocate mode 0x20, disabling! 2020-07-08T19:24:00.571 INFO:tasks.workunit.client.0.smithi124.stderr:: Operation not supported 2020-07-08T19:24:00.571 INFO:tasks.workunit.client.0.smithi124.stdout:skipping zero size read 2020-07-08T19:24:00.573 INFO:tasks.workunit.client.0.smithi124.stdout:skipping zero length punch hole 2020-07-08T19:24:00.574 INFO:tasks.workunit.client.0.smithi124.stdout:skipping zero size read 2020-07-08T19:24:00.574 INFO:tasks.workunit.client.0.smithi124.stdout:fallocating to largest ever: 0xbc839 2020-07-08T19:24:00.632 INFO:tasks.workunit.client.0.smithi124.stdout:fallocating to largest ever: 0xdc566 2020-07-08T19:24:00.652 INFO:tasks.workunit.client.0.smithi124.stdout:truncating to largest ever: 0xe437a 2020-07-08T19:24:00.787 INFO:tasks.workunit.client.0.smithi124.stdout:truncating to largest ever: 0xf5939 2020-07-08T19:24:00.801 INFO:tasks.workunit.client.0.smithi124.stdout:fallocating to largest ever: 0x100000 2020-07-08T19:24:10.300 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: In function 'void ObjectCacher::Object::discard(loff_t, loff_t, C_GatherBuilder*)' thread 7f3d2f7fe700 time 2020-07-08 19:24:10.305526 2020-07-08T19:24:10.300 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) 2020-07-08T19:24:10.301 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: ceph version 14.2.10-29-gaf4ccda (af4ccdaffeffda4b9071601ab3a6baccd01ead14) nautilus (stable) 2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f3d8ca52732] 2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f3d8ca5290d] 2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 3: (ObjectCacher::Object::discard(long, long, C_GatherBuilderBase<Context, C_GatherBase<Context, Context> >*)+0x715) [0x529085] 2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 4: (ObjectCacher::_discard(ObjectCacher::ObjectSet*, std::vector<ObjectExtent, std::allocator<ObjectExtent> > const&, C_GatherBuilderBase<Context, C_GatherBase<Context, Context> >*)+0x166) [0x529256] 2020-07-08T19:24:10.302 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 5: (ObjectCacher::discard_writeback(ObjectCacher::ObjectSet*, std::vector<ObjectExtent, std::allocator<ObjectExtent> > const&, Context*)+0x6f) [0x53013f] 2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 6: (Client::_invalidate_inode_cache(Inode*, long, long)+0xfd) [0x46e1cd] 2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 7: (Client::_fallocate(Fh*, int, long, long)+0x275) [0x4cd0a5] 2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 8: (Client::ll_fallocate(Fh*, int, long, long)+0x209) [0x4ce219] 2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 9: ceph-fuse() [0x44ffab] 2020-07-08T19:24:10.303 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 10: (()+0x13a42) [0x7f3d95758a42] 2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 11: (()+0x15679) [0x7f3d9575a679] 2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 12: (()+0x11e38) [0x7f3d95756e38] 2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 13: (()+0x76ba) [0x7f3d8c56d6ba] 2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr: 14: (clone()+0x6d) [0x7f3d8bd9641d] 2020-07-08T19:24:10.304 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:2020-07-08 19:24:10.302 7f3d2f7fe700 -1 /build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: In function 'void ObjectCacher::Object::discard(loff_t, loff_t, C_GatherBuilder*)' thread 7f3d2f7fe700 time 2020-07-08 19:24:10.305526 2020-07-08T19:24:10.305 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) 2020-07-08T19:24:10.305 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:
More info here,
https://pulpito.ceph.com/yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5209626/
Yuri ran the multimds suite twice. Both the times observed the above issue only in ubuntu xenial environments.
https://pulpito.ceph.com/yuriw-2020-07-06_17:30:41-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/
https://pulpito.ceph.com/yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/
Related issues
History
#1 Updated by Ramana Raja about 3 years ago
- Description updated (diff)
#2 Updated by Ramana Raja about 3 years ago
- Description updated (diff)
#3 Updated by Patrick Donnelly about 3 years ago
- Subject changed from nautilus: xfs test failure in multi-mds suite to nautilus: osdc: FAILED ceph_assert(bh->waitfor_read.empty())
- Description updated (diff)
- Priority changed from Normal to High
- Target version set to v16.0.0
- Backport set to octopus,nautilus
- Component(FS) osdc added
Interesting that it was reproducible for two runs. We've not seen this before but I'm suspicious that it probably exists on master too.
#5 Updated by Ramana Raja almost 3 years ago
Saw this issue again in a recent nautilus test run,
https://pulpito.ceph.com/yuriw-2020-10-05_22:19:52-multimds-wip-yuri3-testing-2020-10-05-1317-nautilus-distro-basic-smithi/5500338/
#6 Updated by Neha Ojha almost 3 years ago
#7 Updated by Patrick Donnelly almost 3 years ago
- Subject changed from nautilus: osdc: FAILED ceph_assert(bh->waitfor_read.empty()) to osdc: FAILED ceph_assert(bh->waitfor_read.empty())
Seems this started sometime around June 10th:
$ zgrep 'FAILED ceph_assert(bh->waitfor_read.empty())' *-2020-[01][0-9]-*multimds*/*/teu*log.gz teuthology-2020-07-08_04:15:02-multimds-master-testing-basic-smithi/5208385/teuthology.log.gz:2020-07-09T13:07:42.236 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi120.stderr:/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-3255-g02fc51deb7/rpm/el8/BUILD/ceph-16.0.0-3255-g02fc51deb7/src/osdc/ObjectCacher.cc: 583: FAILED ceph_assert(bh->waitfor_read.empty()) teuthology-2020-07-08_04:15:02-multimds-master-testing-basic-smithi/5208385/teuthology.log.gz:2020-07-09T13:07:42.274 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi120.stderr:/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-3255-g02fc51deb7/rpm/el8/BUILD/ceph-16.0.0-3255-g02fc51deb7/src/osdc/ObjectCacher.cc: 583: FAILED ceph_assert(bh->waitfor_read.empty()) teuthology-2020-07-08_04:15:02-multimds-master-testing-basic-smithi/5208385/teuthology.log.gz:2020-07-09T13:07:42.755 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi120.stderr:/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-3255-g02fc51deb7/rpm/el8/BUILD/ceph-16.0.0-3255-g02fc51deb7/src/osdc/ObjectCacher.cc: 583: FAILED ceph_assert(bh->waitfor_read.empty()) teuthology-2020-07-08_04:15:02-multimds-master-testing-basic-smithi/5208385/teuthology.log.gz:2020-07-09T13:07:43.219 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi120.stderr:/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-3255-g02fc51deb7/rpm/el8/BUILD/ceph-16.0.0-3255-g02fc51deb7/src/osdc/ObjectCacher.cc: 583: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-06-10_18:00:48-multimds-wip-yuri3-testing-2020-06-10-1530-nautilus-distro-basic-smithi/5134213/teuthology.log.gz:2020-06-10T20:27:29.170 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi066.stderr:/build/ceph-14.2.9-806-g042da806d81/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-06-10_18:00:48-multimds-wip-yuri3-testing-2020-06-10-1530-nautilus-distro-basic-smithi/5134213/teuthology.log.gz:2020-06-10T20:27:29.177 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi066.stderr:/build/ceph-14.2.9-806-g042da806d81/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-06-10_18:00:48-multimds-wip-yuri3-testing-2020-06-10-1530-nautilus-distro-basic-smithi/5134213/teuthology.log.gz:2020-06-10T20:27:29.201 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi066.stderr:/build/ceph-14.2.9-806-g042da806d81/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-06-10_18:00:48-multimds-wip-yuri3-testing-2020-06-10-1530-nautilus-distro-basic-smithi/5134213/teuthology.log.gz:2020-06-10T20:27:29.213 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi066.stderr:/build/ceph-14.2.9-806-g042da806d81/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-06_17:30:41-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5204031/teuthology.log.gz:2020-07-06T23:25:36.798 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi119.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-06_17:30:41-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5204031/teuthology.log.gz:2020-07-06T23:25:36.802 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi119.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-06_17:30:41-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5204031/teuthology.log.gz:2020-07-06T23:25:36.814 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi119.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-06_17:30:41-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5204031/teuthology.log.gz:2020-07-06T23:25:36.826 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi119.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5209626/teuthology.log.gz:2020-07-08T19:24:10.300 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5209626/teuthology.log.gz:2020-07-08T19:24:10.305 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5209626/teuthology.log.gz:2020-07-08T19:24:10.317 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-07-08_17:27:06-multimds-wip-yuri3-testing-2020-07-01-1707-nautilus-distro-basic-smithi/5209626/teuthology.log.gz:2020-07-08T19:24:10.328 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi124.stderr:/build/ceph-14.2.10-29-gaf4ccda/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-08-07_15:05:02-multimds-wip-yuri4-testing-2020-08-07-1350-nautilus-distro-basic-smithi/5308731/teuthology.log.gz:2020-08-07T19:22:55.917 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi164.stderr:/build/ceph-14.2.10-268-g8aa51f1562d/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-08-07_15:05:02-multimds-wip-yuri4-testing-2020-08-07-1350-nautilus-distro-basic-smithi/5308731/teuthology.log.gz:2020-08-07T19:22:55.921 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi164.stderr:/build/ceph-14.2.10-268-g8aa51f1562d/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-08-07_15:05:02-multimds-wip-yuri4-testing-2020-08-07-1350-nautilus-distro-basic-smithi/5308731/teuthology.log.gz:2020-08-07T19:22:55.941 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi164.stderr:/build/ceph-14.2.10-268-g8aa51f1562d/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty()) yuriw-2020-08-07_15:05:02-multimds-wip-yuri4-testing-2020-08-07-1350-nautilus-distro-basic-smithi/5308731/teuthology.log.gz:2020-08-07T19:22:55.949 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi164.stderr:/build/ceph-14.2.10-268-g8aa51f1562d/src/osdc/ObjectCacher.cc: 636: FAILED ceph_assert(bh->waitfor_read.empty())
It also shows up in master (but only July 8th??) so it's not just a Nautilus regression.
#8 Updated by Patrick Donnelly almost 3 years ago
- Status changed from New to In Progress
- Assignee set to Patrick Donnelly
#9 Updated by Patrick Donnelly almost 3 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 37732
#10 Updated by Patrick Donnelly almost 3 years ago
- Status changed from Fix Under Review to Pending Backport
#11 Updated by Nathan Cutler almost 3 years ago
- Copied to Backport #48097: nautilus: osdc: FAILED ceph_assert(bh->waitfor_read.empty()) added
#12 Updated by Nathan Cutler almost 3 years ago
- Copied to Backport #48098: octopus: osdc: FAILED ceph_assert(bh->waitfor_read.empty()) added
#13 Updated by Nathan Cutler over 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".