Bug #38395

closed

luminous: write following remove might access previous onode

Added by Igor Fedotov about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

So the sequence is as follows:
T1:
remove A

T2:
touch A
write A

In Luminous there is a chance that the onode for A is evicted from the cache (as no ref is taken) before T1 is committed to the DB.
The subsequent T2 might then read the outdated onode A from the DB and proceed with an invalid state for A.
This eventually results in, e.g., a duplicate removal from the allocator.
The final symptoms resemble #38049 and #36541, but the root cause is different.

This has been fixed in Mimic+ by
https://github.com/ceph/ceph/pull/18196/commits/0347518d02eda4e0e2da5241f1d77bc7304d59fb
and follow-ups.
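
The race described above can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions and is not BlueStore code: `ToyStore`, `Allocator`, `trim_cache`, and the `pin_until_commit` flag are hypothetical names, the allocator error is modeled as releasing the same blocks twice, and pinning the onode until commit is merely a stand-in for "taking a ref", not a description of the actual Mimic patch.

```python
class Allocator:
    """Toy allocator that records attempts to release a block twice."""

    def __init__(self, allocated):
        self.allocated = set(allocated)
        self.double_releases = []

    def release(self, blocks):
        for block in blocks:
            if block in self.allocated:
                self.allocated.remove(block)
            else:
                self.double_releases.append(block)


class ToyStore:
    """Hypothetical model: a DB, an onode cache, and deferred DB commits."""

    def __init__(self, pin_until_commit):
        self.db = {"A": [1, 2]}          # persisted onodes: name -> blocks
        self.cache = {}                  # in-memory onode cache
        self.pending_removals = []       # removals queued but not yet in the DB
        self.pinned = set()              # onodes protected from cache trimming
        self.pin_until_commit = pin_until_commit
        self.allocator = Allocator([1, 2])

    def get_onode(self, name):
        # On a cache miss, fall back to whatever the DB currently holds.
        if name not in self.cache:
            blocks = self.db.get(name)
            self.cache[name] = {
                "exists": blocks is not None,
                "blocks": list(blocks or []),
            }
        return self.cache[name]

    def remove(self, name):
        # T1: release the blocks and queue the DB deletion for later.
        onode = self.get_onode(name)
        if onode["exists"]:
            self.allocator.release(onode["blocks"])
            onode["exists"] = False
            onode["blocks"] = []
        self.pending_removals.append(name)
        if self.pin_until_commit:
            self.pinned.add(name)

    def touch_and_write(self, name, new_blocks):
        # T2: an overwrite releases whatever blocks the onode claims to own.
        onode = self.get_onode(name)
        self.allocator.release(onode["blocks"])
        self.allocator.allocated.update(new_blocks)
        onode["exists"] = True
        onode["blocks"] = list(new_blocks)

    def trim_cache(self):
        # Evict every onode that is not pinned.
        for name in list(self.cache):
            if name not in self.pinned:
                del self.cache[name]

    def commit(self):
        # Apply the queued removals to the DB and drop the pins.
        for name in self.pending_removals:
            self.db.pop(name, None)
            self.pinned.discard(name)
        self.pending_removals.clear()


def run(pin_until_commit):
    store = ToyStore(pin_until_commit)
    store.remove("A")                  # T1: remove A
    store.trim_cache()                 # cache trimmed before T1 reaches the DB
    store.touch_and_write("A", [3])    # T2: touch A, write A
    store.commit()                     # T1 finally lands in the DB
    return store.allocator.double_releases
```

In this model, `run(False)` reloads the stale onode from the DB and reports blocks 1 and 2 as released twice, while `run(True)` keeps the removed onode cached until the commit and reports no duplicate release.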


Files

debug-osd.10.log.gz (240 KB) Igor Fedotov, 02/20/2019 10:25 AM

#1

Updated by Igor Fedotov about 5 years ago

  • Status changed from New to 12
  • Assignee set to Igor Fedotov

#2

Updated by Igor Fedotov about 5 years ago

  • Status changed from 12 to Fix Under Review
  • Pull request ID set to 26540

#4

Updated by Igor Fedotov about 5 years ago

  • Status changed from Fix Under Review to Resolved