Bug #8332

closed

ceph-test-objectstore: bad return value in unlink

Added by Sage Weil almost 10 years ago. Updated almost 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-05-11_09:34:50-rados-firefly-testing-basic-plana/248775

2014-05-11T11:42:23.241 INFO:teuthology.orchestra.run.err:[10.214.131.30]: objects.size() is 2000
2014-05-11T11:42:23.472 INFO:teuthology.orchestra.run.err:[10.214.131.30]: listed.size() is 2000
2014-05-11T11:43:04.217 INFO:teuthology.orchestra.run.err:[10.214.131.30]: os/LFNIndex.cc: In function 'virtual int LFNIndex::unlink(const ghobject_t&)' thread 7fda816a0700 time 2014-05-11 11:43:04.215398
2014-05-11T11:43:04.217 INFO:teuthology.orchestra.run.err:[10.214.131.30]: os/LFNIndex.cc: 106: FAILED assert(r == 0)
2014-05-11T11:43:04.217 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  ceph version 0.80-12-gdb8873b (db8873b69c73b40110bf1512c114e4a0395671ab)
2014-05-11T11:43:04.217 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  1: (LFNIndex::unlink(ghobject_t const&)+0x1cd) [0x58044d]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  2: (FileStore::lfn_unlink(coll_t, ghobject_t const&, SequencerPosition const&, bool)+0x2e1) [0x54feb1]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  3: (FileStore::_remove(coll_t, ghobject_t const&, SequencerPosition const&)+0x172) [0x5501c2]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  4: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x3a9c) [0x56071c]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  5: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x74) [0x5627d4]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  6: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x29a) [0x562a8a]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x6499e6]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  8: (ThreadPool::WorkThread::entry()+0x10) [0x64b7f0]
2014-05-11T11:43:04.218 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  9: (()+0x7e9a) [0x7fda85660e9a]
2014-05-11T11:43:04.219 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  10: (clone()+0x6d) [0x7fda83e253fd]
2014-05-11T11:43:04.219 INFO:teuthology.orchestra.run.err:[10.214.131.30]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2014-05-11T11:43:04.219 INFO:teuthology.orchestra.run.err:[10.214.131.30]: 2014-05-11 11:43:04.216832 7fda816a0700 -1 os/LFNIndex.cc: In function 'virtual int LFNIndex::unlink(const ghobject_t&)' thread 7fda816a0700 time 2014-05-11 11:43:04.215398
2014-05-11T11:43:04.219 INFO:teuthology.orchestra.run.err:[10.214.131.30]: os/LFNIndex.cc: 106: FAILED assert(r == 0)

Actions #1

Updated by Samuel Just almost 10 years ago

Hmm, the assert fires after a failure injection and cleanup sequence. Trying to reproduce.

Actions #2

Updated by Samuel Just almost 10 years ago

  • Assignee set to Samuel Just

Reproducible; working on it.

Actions #3

Updated by Ian Colle almost 10 years ago

  • Status changed from New to In Progress

Actions #4

Updated by Samuel Just almost 10 years ago

get_info is returning -ENOENT for a collection that clearly exists (I checked the current/ directory manually). I've set up a teuthology task to run while capturing strace output, but it hasn't reproduced under strace yet.

Actions #5

Updated by Samuel Just almost 10 years ago

Ahah, the ENOENT is on the subdir which had been merged. The bug occurs when cleanup() happens after the subdir is merged and removed, but before the pending merge marker on the root is cleaned up. The solution is to interpret an ENOENT there as a completed merge.
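The failure window and the proposed handling can be sketched with a toy model. This is not the Ceph code (which is C++ in os/); it is a simplified illustration that uses plain directories, and the marker file name, the function names, and the directory layout are all hypothetical. It assumes a merge is recorded by a marker on the root, that the subdir's entries are then moved up and the subdir removed, and that the marker is cleared last. If a crash lands between the last two steps, cleanup() finds the marker but gets ENOENT on the subdir, and treats that as a merge that already completed.

```python
import errno
import os
import tempfile

# Hypothetical marker name; the real index stores its pending-merge state differently.
MARKER = "merge_in_progress"


def get_info(path):
    """Return 0 if path exists, otherwise a negative errno (e.g. -ENOENT)."""
    try:
        os.stat(path)
        return 0
    except OSError as e:
        return -e.errno


def start_merge(root, subdir):
    """Step 1: record on the root that subdir is being merged into it."""
    with open(os.path.join(root, MARKER), "w") as f:
        f.write(subdir)


def merge_and_remove(root, subdir):
    """Step 2: move every entry of root/subdir up into root, then remove subdir."""
    path = os.path.join(root, subdir)
    for name in os.listdir(path):
        os.rename(os.path.join(path, name), os.path.join(root, name))
    os.rmdir(path)


def cleanup(root):
    """Finish an interrupted merge, if any. Returns 0 or a negative errno."""
    marker = os.path.join(root, MARKER)
    try:
        with open(marker) as f:
            subdir = f.read()
    except FileNotFoundError:
        return 0  # no merge was pending

    r = get_info(os.path.join(root, subdir))
    if r == -errno.ENOENT:
        # The subdir was already merged and removed before the crash:
        # interpret ENOENT as a completed merge instead of an error.
        pass
    elif r < 0:
        return r  # any other error is still a real failure
    else:
        merge_and_remove(root, subdir)

    os.remove(marker)  # Step 3: clear the pending-merge marker
    return 0
```

To exercise the window, create a root with one subdir, call start_merge() and merge_and_remove(), skip the marker removal to simulate the crash, and then call cleanup(). It returns 0 and clears the marker rather than propagating -ENOENT to the caller.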

Actions #6

Updated by Samuel Just almost 10 years ago

  • Status changed from In Progress to 7

Actions #7

Updated by Samuel Just almost 10 years ago

  • Status changed from 7 to Pending Backport

Actions #8

Updated by Sage Weil almost 10 years ago

  • Status changed from Pending Backport to Resolved

Actions #9

Updated by Sage Weil almost 10 years ago

  • Status changed from Resolved to Pending Backport

Actions #10

Updated by Samuel Just almost 10 years ago

  • Status changed from Pending Backport to Resolved