Bug #40706


races in ceph_uninline_data

Added by Jeff Layton almost 5 years ago. Updated over 4 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I've been staring at ceph_uninline_data() for a while now, and I think I've convinced myself that there are potential races here.

That function takes the i_ceph_lock and grabs the inline version. If the data still appears to be inlined, it then locks the page and starts writing
the data out to the OSDs. Once that's done, it unlocks the page. The caller then sets the inline version to CEPH_INLINE_NONE.

I think two tasks in the kernel could race and end up serializing on the page lock in there. The first task is in ceph_write_iter() doing an O_DIRECT write: it finishes uninlining the data and then writes directly to the OSD. The second task comes in before inline_version has been changed, sees the inode as still inlined, and does the uninlining again, overwriting the data written by the first task with stale data from the pagecache.

My guess is that it was done this way to try to minimize taking and dropping the i_ceph_lock, but I don't think it's safe.


Related issues: 1 (0 open, 1 closed)

Related to CephFS - Feature #41311: deprecate CephFS inline_data support (Resolved, Jeff Layton)
