Bug #43580
Status: closed
pg: fastinfo incorrect when last_update moves backward in time
% Done:
0%
Backport:
luminous,mimic,nautilus
Regression:
No
Severity:
3 - minor
Description
If, during peering, last_update moves backwards, we may rewrite the full info but leave a fastinfo record in place with a newer last_update.
From the mailing list:
In an EC deployment, suppose a peering process for a PG changes one shard's last_update from lu1 (e1'3) to lu2 (e1'2). lu1 was written as fastinfo, and lu2 was written as the full info. If the OSD is then restarted and the PGs are loaded again, reading the PG info from disk yields lu1 applied on top of lu2, which is incorrect; the true value should be lu2. That can cause subsequent peering to execute incorrectly and result in unfound objects. Two possible fixes were considered:
1. Delete fastinfo whenever the full info needs to change.
2. Add an extra sequence number to both the fastinfo and info structures so they are kept in the right order.
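The failure mode above can be modeled in a few lines. This is a minimal sketch, not actual Ceph code: the store layout, `load_pg_info`, and `rewrite_info` are hypothetical names used only to illustrate why a stale fastinfo delta overrides a rewound full info, and how option 1 (deleting fastinfo when the full info is rewritten) avoids it.

```python
# Hypothetical model of the bug (not Ceph code).
# "info" is the full PG info record; "fastinfo" is a small delta record
# that normally carries a newer last_update than the full info.

def load_pg_info(store):
    """Naive load: unconditionally apply fastinfo over the full info."""
    info = dict(store["info"])
    fastinfo = store.get("fastinfo")
    if fastinfo is not None:
        # Bug: fastinfo may predate a rewound full info, so this can
        # move last_update forward again incorrectly.
        info["last_update"] = fastinfo["last_update"]
    return info

def rewrite_info(store, new_info):
    """Option 1 from the ML thread: invalidate fastinfo on full rewrite."""
    store["info"] = new_info
    store.pop("fastinfo", None)  # stale delta can no longer be applied

# Peering rewinds one shard's last_update from lu1 (1'3) to lu2 (1'2),
# rewriting the full info but leaving the old fastinfo on disk.
store = {
    "info": {"last_update": "1'2"},      # full info rewritten with lu2
    "fastinfo": {"last_update": "1'3"},  # stale fastinfo still holds lu1
}
assert load_pg_info(store)["last_update"] == "1'3"  # wrong: should be 1'2

# With the fix, the rewrite also deletes the stale fastinfo.
rewrite_info(store, {"last_update": "1'2"})
assert load_pg_info(store)["last_update"] == "1'2"  # correct after restart
```

Option 2 (a sequence number on both records, applying fastinfo only if its sequence is newer than the full info's) would achieve the same result without deleting the delta.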
Updated by Kefu Chai over 4 years ago
- Status changed from New to Fix Under Review
- Assignee set to Sage Weil
- Pull request ID set to 32615
Updated by Kefu Chai over 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #43621: luminous: pg: fastinfo incorrect when last_update moves backward in time added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #43622: mimic: pg: fastinfo incorrect when last_update moves backward in time added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #43623: nautilus: pg: fastinfo incorrect when last_update moves backward in time added
Updated by Sage Weil over 4 years ago
- Has duplicate Bug #39398: osd: fast_info need update when pglog rewind added
Updated by Loïc Dachary over 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".