Bug #47880
[journal] object recorder can race while lock is temporarily release for callbacks
Status: Resolved
Priority: Normal
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus,octopus
Regression: No
Severity: 3 - minor
Reviewed:
Description
The assertion "ceph_assert(m_in_flight_callbacks)" in "notify_handler_unlock" can fail if two callbacks race. It is possible that a flush request arrived after "handle_append_flushed" had dropped the lock for callbacks but before it could notify the handler.
#2 0x00007f46cf3b5877 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>, func=0x7f46dd5e3980 <journal::ObjectRecorder::notify_handler_unlock(std::unique_lock<std::mutex>&, bool)::__PRETTY_FUNCTION__> "void journal::ObjectRecorder::notify_handler_unlock(std::unique_lock<std::mutex>&, bool)") at /usr/src/debug/ceph-16.0.0-6312.gc1613349.el8.x86_64/src/common/assert.cc:75
#3 0x00007f46cf3b5a40 in ceph::__ceph_assert_fail (ctx=...) at /usr/src/debug/ceph-16.0.0-6312.gc1613349.el8.x86_64/src/common/assert.cc:80
#4 0x00007f46dd4acf7e in journal::ObjectRecorder::notify_handler_unlock (this=<optimized out>, locker=..., notify_overflowed=<optimized out>) at /usr/src/debug/ceph-16.0.0-6312.gc1613349.el8.x86_64/src/log/Entry.h:35
#5 0x00007f46dd4b2b69 in journal::ObjectRecorder::handle_append_flushed (this=0x7f4570002b10, tid=<optimized out>, r=<optimized out>) at /usr/src/debug/ceph-16.0.0-6312.gc1613349.el8.x86_64/src/journal/ObjectRecorder.cc:239
#6 0x00007f46dd4b3628 in Context::complete (r=<optimized out>, this=0x7f46883902b0) at /usr/src/debug/ceph-16.0.0-6312.gc1613349.el8.x86_64/src/include/Context.h:99
m_overflowed = true
m_object_closed = true
m_object_closed_notify = true <---- implies 'ObjectRecorder::closed()' was invoked while IO in-flight or callback in-flight
m_in_flight_callbacks = false
m_in_flight_tids = std::set with 1 element = {[0] = 231}
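For illustration only, the interleaving described above can be modeled with a small stand-alone sketch. This is not the ObjectRecorder code: "RecorderModel", "complete_with_callbacks", "simulate_race" and "simulate_no_race" are hypothetical names, the second completion path is an assumed stand-in for whatever path the flush request takes, and the overlap is forced deterministically on one thread rather than with real concurrency. It only shows how a single boolean flag can be cleared by one path while another path still expects it to be set.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, heavily simplified stand-in for journal::ObjectRecorder.
struct RecorderModel {
  std::mutex lock;
  bool in_flight_callbacks = false;   // mirrors m_in_flight_callbacks
  bool assertion_would_fail = false;  // records what ceph_assert would catch

  // Assumed shape of notify_handler_unlock(): the lock is held on entry,
  // the flag is expected to be set, then it is cleared and the lock dropped.
  void notify_handler_unlock(std::unique_lock<std::mutex>& locker) {
    if (!in_flight_callbacks) {
      assertion_would_fail = true;  // the real code would abort here
    }
    in_flight_callbacks = false;
    locker.unlock();
  }

  // Assumed shape of a completion path: set the flag, drop the lock while
  // callbacks run, re-take the lock, then notify the handler.
  template <typename Callback>
  void complete_with_callbacks(Callback&& during_unlocked_window) {
    std::unique_lock<std::mutex> locker(lock);
    in_flight_callbacks = true;
    locker.unlock();
    during_unlocked_window();  // callbacks run here without the lock
    locker.lock();
    notify_handler_unlock(locker);
  }
};

// A second completion path runs entirely inside the first path's unlocked
// window, so it clears the flag before the first path notifies the handler.
bool simulate_race() {
  RecorderModel rec;
  rec.complete_with_callbacks([&rec] {
    rec.complete_with_callbacks([] {});
  });
  return rec.assertion_would_fail;
}

// The same two paths run back to back, with no overlap.
bool simulate_no_race() {
  RecorderModel rec;
  rec.complete_with_callbacks([] {});
  rec.complete_with_callbacks([] {});
  return rec.assertion_would_fail;
}
```

In this model "simulate_race()" reports that the assertion would have fired, while "simulate_no_race()" does not, which is consistent with "m_in_flight_callbacks = false" in the dump above.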
Updated by Jason Dillaman over 3 years ago
- Status changed from In Progress to Fix Under Review
Updated by Mykola Golub over 3 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 3 years ago
- Copied to Backport #47886: octopus: [journal] object recorder can race while lock is temporarily release for callbacks added
Updated by Nathan Cutler over 3 years ago
- Copied to Backport #47887: nautilus: [journal] object recorder can race while lock is temporarily release for callbacks added
Updated by Ilya Dryomov almost 3 years ago
- Related to Bug #51100: object_recorder->is_closed() assert failure in JournalRecorder::open_object_set() in nautilus added
Updated by Ilya Dryomov about 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".