Bug #59670

closed

Bug #63334: Recovery starts while norecover flag is set when PG splitting occurs

Ceph status shows PG recovering when norecover flag is set

Added by Aishwarya Mathuria about 1 year ago. Updated 5 days ago.

Status:
Duplicate
Priority:
Normal
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
backport_processed
Backport:
quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On the Gibba cluster, we observed that ceph -s was showing one PG in the recovering state after the norecover flag was set:

[root@gibba001 ~]# ceph -s
  cluster:
    id:     7e775b16-ea73-11ed-ac35-3cecef3d8fb8
    health: HEALTH_WARN
            nobackfill,norecover,noscrub,nodeep-scrub flag(s) set
            Degraded data redundancy: 2/27732183 objects degraded (0.000%), 1 pg degraded, 1 pg undersized

  services:
    mon: 5 daemons, quorum gibba001,gibba002,gibba003,gibba006,gibba005 (age 78m)
    mgr: gibba006.oxzbun(active, since 71m), standbys: gibba008.fhfdkj
    osd: 62 osds: 62 up (since 74m), 62 in (since 74m); 1 remapped pgs
         flags nobackfill,norecover,noscrub,nodeep-scrub
    rgw: 6 daemons active (6 hosts, 1 zones)

  data:
    pools:   7 pools, 1217 pgs
    objects: 4.62M objects, 203 GiB
    usage:   446 GiB used, 10 TiB / 11 TiB avail
    pgs:     2/27732183 objects degraded (0.000%)
             1216 active+clean
             1    active+recovering+undersized+degraded+remapped
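For context, the flags shown in the status above are the cluster-wide OSD flags; a minimal sketch of how they are typically set and verified (standard Ceph CLI, not a transcript from this cluster):

# set the maintenance flags seen in the status output
ceph osd set norecover
ceph osd set nobackfill
ceph osd set noscrub
ceph osd set nodeep-scrub

# verify the flags are active
ceph osd dump | grep flags
ceph -s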

PG dump:


1.0            2                   0         2          0        0  1114656            0           0  1334      1334  active+recovering+undersized+degraded+remapped  2023-05-04T08:34:56.217391+0000  109'1334  123:1683            [30,15,0]          30               [15,0]              15         0'0  2023-05-04T08:34:39.648644+0000              0'0  2023-05-04T08:34:39.648644+0000              0                    0  periodic scrub scheduled @ 2023-05-05T16:09:53.741099+0000                 0                0
dumped all

From the cluster logs we can see the norecover flag being set, and when the OSDs come up we observe the following:

2023-05-04T12:16:48.219+0000 7f078e80e700  1 osd.29 82 state: booting -> active
2023-05-04T12:16:48.219+0000 7f078e80e700  1 osd.29 82 pausing recovery (NORECOVER flag set)

Some time later, the OSD logs show the state of PG 1.0:

2023-05-04T13:24:21.971+0000 7f07909cb700 30 osd.29 pg_epoch: 121 pg[1.0( v 106'1334 lc 80'163 (0'0,106'1334] local-lis/les=0/0 n=2 ec=74/74 lis/c=0/78 les/c/f=0/79/0 sis=84) [29,15,32]/[15,32] r=-1 lpr=84 pi=[78,84)/1 luod=0'0 lua=106'1324 crt=106'1334 mlcod 80'163 *active+remapped* m=2 mbc={}] lock
2023-05-04T13:25:31.984+0000 7f07909cb700 30 osd.29 pg_epoch: 121 pg[1.0( v 106'1334 lc 80'163 (0'0,106'1334] local-lis/les=0/0 n=2 ec=74/74 lis/c=0/78 les/c/f=0/79/0 sis=84) [29,15,32]/[15,32] r=-1 lpr=84 pi=[78,84)/1 luod=0'0 lua=106'1324 crt=106'1334 mlcod 80'163 *active+remapped* m=2 mbc={}] lock

However, ceph status and ceph pg dump still show that PG 1.0 is recovering.
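For reference, the per-PG state lines quoted above only show up at elevated OSD debug levels (the leading 30 in those lines is the debug level); a rough sketch of raising and restoring that verbosity on a running OSD, using osd.29 from the logs above (the exact settings used on Gibba are not recorded in this ticket):

# raise OSD log verbosity on a running daemon to capture per-PG state lines
ceph tell osd.29 config set debug_osd 30

# ... reproduce and collect logs ...

# restore the default level afterwards (assumed default of 1/5)
ceph tell osd.29 config set debug_osd 1/5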


Related issues 1 (1 open, 0 closed)

Copied to RADOS - Backport #66000: quincy: Ceph status shows PG recovering when norecover flag is set (New, Aishwarya Mathuria)
Actions #1

Updated by Radoslaw Zarzynski about 1 year ago

Has the PG ultimately gone into the proper state? Asking to exclude a race condition in just the reporting via ceph-mgr.
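One way to distinguish a stale ceph-mgr report from a PG that is genuinely recovering is to query the PG's primary OSD directly and compare that with the mgr-side dump; a minimal sketch using pg 1.0 from this ticket:

# state as aggregated and reported by ceph-mgr
ceph pg dump pgs_brief | grep '^1\.0'

# state as reported by the PG's primary OSD, bypassing mgr aggregation
ceph pg 1.0 query | grep '"state"'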

Actions #2

Updated by Wes Dillingham about 2 months ago

I think it's more than just a cosmetic issue of the PG showing recovering as its state. It does in fact "recover" objects when the "norecover" flag is set. As a Ceph operator I would expect "norecover" to prevent PGs from entering the "recovery" state, though perhaps not to prevent "backfill". If this isn't a bug, it's a confusing usage of the term "norecover" IMO.
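A quick way to check whether objects are actually being recovered (rather than just misreported) is to watch the degraded counters and the list of recovering PGs while norecover is set; a rough sketch, not taken from this cluster:

# list PGs currently reported in the recovering state
ceph pg ls recovering

# if the degraded object count keeps dropping while norecover is set,
# recovery is really making progress
watch -n 10 'ceph -s | grep degraded'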

Actions #3

Updated by Radoslaw Zarzynski about 1 month ago

Bump up. IIRC there was a very similar ticket that Aishwarya has looked into.

Actions #4

Updated by Aishwarya Mathuria about 1 month ago

We saw this issue again in another setup, and it has been fixed here: https://github.com/ceph/ceph/pull/54708.
The problem was that the autoscaler was enabled while the norecover flag was set and client I/O was going on in the cluster.
When there is a read/write to a missing/degraded object, recovery starts for that object even if the norecover flag is set. It was decided that this workflow makes sense, as stopping recovery in such cases would cause client I/O to hang indefinitely.
The fix made in the PR stops the autoscaler from starting if the user has set the norecover flag.

From my memory, in the Gibba cluster we had some read/write workloads going on and the noautoscale flag was not set, so it is probably the same issue. I'll try to see if I can confirm that, but it was a while back.
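Until that PR is available on a given release, a possible mitigation along the lines described above is to keep the autoscaler from splitting PGs while norecover is set; a rough sketch (the pool name is a placeholder, not from this cluster):

# check whether the autoscaler plans to change pg_num on any pool
ceph osd pool autoscale-status

# disable autoscaling on a pool before setting norecover
ceph osd pool set <pool-name> pg_autoscale_mode off

# on releases that support it, the global noautoscale flag can be used instead
ceph osd pool set noautoscale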

Actions #5

Updated by Radoslaw Zarzynski about 1 month ago

  • Status changed from New to Need More Info

The fix was merged on 5 Jan 2024, so this could fit. It has been backported only to Reef.

Wes Dillingham, do you see it on your cluster? If so, what's the version?

Actions #6

Updated by Wes Dillingham 11 days ago

Radoslaw Zarzynski wrote in #note-5:

The fix was merged on 5 Jan 2024, so this could fit. It has been backported only to Reef.

Wes Dillingham, do you see it on your cluster? If so, what's the version?

17.2.7

Actions #7

Updated by Laura Flores 5 days ago

So it looks like https://github.com/ceph/ceph/pull/54708 might need to be backported to Quincy.

@Aishwarya Mathuria mind creating a backport? We checked, and it is indeed absent from Quincy.

Actions #8

Updated by Laura Flores 5 days ago

  • Status changed from Need More Info to Pending Backport
  • Backport set to quincy
Actions #9

Updated by Backport Bot 5 days ago

  • Copied to Backport #66000: quincy: Ceph status shows PG recovering when norecover flag is set added
Actions #10

Updated by Backport Bot 5 days ago

  • Tags set to backport_processed
Actions #11

Updated by Aishwarya Mathuria 5 days ago

@Laura Flores sure, I'll raise a PR for quincy.

Actions #12

Updated by Aishwarya Mathuria 5 days ago

  • Status changed from Pending Backport to Duplicate
  • Parent task set to #63334