Bug #23439

Crashing OSDs after 'ceph pg repair'

Added by Jan Marquardt about 6 years ago. Updated almost 6 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Scrub/Repair
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Yesterday, Ceph reported scrub errors:

cluster [ERR] overall HEALTH_ERR 5 scrub errors; Possible data damage: 1 pg inconsistent
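
For reference, this is roughly how the inconsistent placement group (pg 0.103 below) can be located; this is only a sketch, and the pool name 'rbd' is an assumption on my part (we have a single pool and the PG ids start with 0.):

root@head1:~# ceph health detail                                       # lists the PGs behind the scrub errors
root@head1:~# rados list-inconsistent-pg rbd                           # pool name assumed; prints the inconsistent PG ids
root@head1:~# rados list-inconsistent-obj 0.103 --format=json-pretty   # per-object inconsistency details for that PG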

After further investigation we found that pg 0.103 is the placement group in question and tried to repair it:

ceph1:~# ceph pg repair 0.103
instructing pg 0.103 on osd.11 to repair

After a while, osd.11 crashed. Please find the corresponding log file at http://af.janno.io/ceph-osd.11.log.1.gz
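
For the record, the crash can be pulled out of the rotated, gzipped log roughly like this; the log path assumes a default installation, and the search patterns assume an assert- or signal-style crash (a sketch, not the exact commands we ran):

root@ceph2:~# zgrep -A 40 'FAILED assert' /var/log/ceph/ceph-osd.11.log.1.gz   # assert failure plus backtrace, if present
root@ceph2:~# zgrep -A 40 'Caught signal' /var/log/ceph/ceph-osd.11.log.1.gz   # fallback for signal-type crashes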

This morning, five more OSDs had crashed:

root@head1:~# ceph -s
  cluster:
    id:     c59e56df-2043-4c92-9492-25f05f268d9f
    health: HEALTH_ERR
            29756/10190445 objects misplaced (0.292%)
            5 scrub errors
            Possible data damage: 1 pg inconsistent
            Degraded data redundancy: 30563/10190445 objects degraded (0.300%), 10 pgs degraded, 10 pgs undersized

  services:
    mon: 3 daemons, quorum head1,head2,head3
    mgr: head3(active), standbys: head1, head2
    osd: 40 osds: 34 up, 34 in; 10 remapped pgs

  data:
    pools:   1 pools, 768 pgs
    objects: 3317k objects, 12158 GB
    usage:   37814 GB used, 89230 GB / 124 TB avail
    pgs:     30563/10190445 objects degraded (0.300%)
             29756/10190445 objects misplaced (0.292%)
             758 active+clean
             5   active+undersized+degraded+remapped+backfill_wait
             4   active+undersized+degraded+remapped+backfilling
             1   active+undersized+degraded+remapped+inconsistent+backfill_wait

  io:
    client:   46312 kB/s rd, 7226 kB/s wr, 378 op/s rd, 44 op/s wr
    recovery: 57960 kB/s, 14 objects/s


root@head1:~# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       145.93274 root default
 -2        29.08960     host ceph1
  0   hdd   3.63620         osd.0      up  1.00000 1.00000
  1   hdd   3.63620         osd.1    down        0 1.00000
  2   hdd   3.63620         osd.2      up  1.00000 1.00000
  3   hdd   3.63620         osd.3      up  1.00000 1.00000
  4   hdd   3.63620         osd.4      up  1.00000 1.00000
  5   hdd   3.63620         osd.5      up  1.00000 1.00000
  6   hdd   3.63620         osd.6      up  1.00000 1.00000
  7   hdd   3.63620         osd.7      up  1.00000 1.00000
 -3        29.13217     host ceph2
  8   hdd   3.63620         osd.8    down        0 1.00000
  9   hdd   3.63620         osd.9    down        0 1.00000
 10   hdd   3.63620         osd.10   down        0 1.00000
 11   hdd   3.65749         osd.11   down        0 1.00000
 12   hdd   3.63620         osd.12     up  1.00000 1.00000
 13   hdd   3.65749         osd.13     up  1.00000 1.00000
 14   hdd   3.63620         osd.14     up  1.00000 1.00000
 15   hdd   3.63620         osd.15     up  1.00000 1.00000
 -4        29.11258     host ceph3
 16   hdd   3.63620         osd.16     up  1.00000 1.00000
 18   hdd   3.63620         osd.18     up  1.00000 1.00000
 19   hdd   3.63620         osd.19     up  1.00000 1.00000
 20   hdd   3.65749         osd.20     up  1.00000 1.00000
 21   hdd   3.63620         osd.21     up  1.00000 1.00000
 22   hdd   3.63620         osd.22     up  1.00000 1.00000
 23   hdd   3.63620         osd.23     up  1.00000 1.00000
 24   hdd   3.63789         osd.24     up  1.00000 1.00000
 -9        29.29919     host ceph4
 17   hdd   3.66240         osd.17     up  1.00000 1.00000
 25   hdd   3.66240         osd.25     up  1.00000 1.00000
 26   hdd   3.66240         osd.26     up  1.00000 1.00000
 27   hdd   3.66240         osd.27     up  1.00000 1.00000
 28   hdd   3.66240         osd.28   down        0 1.00000
 29   hdd   3.66240         osd.29     up  1.00000 1.00000
 30   hdd   3.66240         osd.30     up  1.00000 1.00000
 31   hdd   3.66240         osd.31     up  1.00000 1.00000
-11        29.29919     host ceph5
 32   hdd   3.66240         osd.32     up  1.00000 1.00000
 33   hdd   3.66240         osd.33     up  1.00000 1.00000
 34   hdd   3.66240         osd.34     up  1.00000 1.00000
 35   hdd   3.66240         osd.35     up  1.00000 1.00000
 36   hdd   3.66240         osd.36     up  1.00000 1.00000
 37   hdd   3.66240         osd.37     up  1.00000 1.00000
 38   hdd   3.66240         osd.38     up  1.00000 1.00000
 39   hdd   3.66240         osd.39     up  1.00000 1.00000
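
For completeness, the crashed daemons can be inspected and restarted along these lines (a sketch only, assuming the standard systemd units of a default deployment; osd.11 lives on ceph2):

root@ceph2:~# systemctl status ceph-osd@11       # unit shows as failed after the crash
root@ceph2:~# journalctl -u ceph-osd@11 -n 200   # recent daemon output around the crash
root@ceph2:~# systemctl restart ceph-osd@11      # bring the OSD back up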

History

#1 Updated by Jan Marquardt about 6 years ago

We already had a similar issue with #23258, and I am wondering whether this is something you always have to expect with Ceph, whether it might be a hardware/setup issue, or whether it is something completely different.

#2 Updated by Jan Marquardt about 6 years ago

And the next three OSDs crashed:

root@head1:~# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       145.93274 root default
 -2        29.08960     host ceph1
  0   hdd   3.63620         osd.0      up  1.00000 1.00000
  1   hdd   3.63620         osd.1    down        0 1.00000
  2   hdd   3.63620         osd.2      up  1.00000 1.00000
  3   hdd   3.63620         osd.3      up  1.00000 1.00000
  4   hdd   3.63620         osd.4      up  1.00000 1.00000
  5   hdd   3.63620         osd.5    down        0 1.00000
  6   hdd   3.63620         osd.6      up  1.00000 1.00000
  7   hdd   3.63620         osd.7      up  1.00000 1.00000
 -3        29.13217     host ceph2
  8   hdd   3.63620         osd.8    down        0 1.00000
  9   hdd   3.63620         osd.9    down        0 1.00000
 10   hdd   3.63620         osd.10   down        0 1.00000
 11   hdd   3.65749         osd.11   down        0 1.00000
 12   hdd   3.63620         osd.12     up  1.00000 1.00000
 13   hdd   3.65749         osd.13     up  1.00000 1.00000
 14   hdd   3.63620         osd.14     up  1.00000 1.00000
 15   hdd   3.63620         osd.15     up  1.00000 1.00000
 -4        29.11258     host ceph3
 16   hdd   3.63620         osd.16     up  1.00000 1.00000
 18   hdd   3.63620         osd.18     up  1.00000 1.00000
 19   hdd   3.63620         osd.19     up  1.00000 1.00000
 20   hdd   3.65749         osd.20     up  1.00000 1.00000
 21   hdd   3.63620         osd.21     up  1.00000 1.00000
 22   hdd   3.63620         osd.22     up  1.00000 1.00000
 23   hdd   3.63620         osd.23     up  1.00000 1.00000
 24   hdd   3.63789         osd.24     up  1.00000 1.00000
 -9        29.29919     host ceph4
 17   hdd   3.66240         osd.17     up  1.00000 1.00000
 25   hdd   3.66240         osd.25     up  1.00000 1.00000
 26   hdd   3.66240         osd.26     up  1.00000 1.00000
 27   hdd   3.66240         osd.27     up  1.00000 1.00000
 28   hdd   3.66240         osd.28   down        0 1.00000
 29   hdd   3.66240         osd.29     up  1.00000 1.00000
 30   hdd   3.66240         osd.30     up  1.00000 1.00000
 31   hdd   3.66240         osd.31     up  1.00000 1.00000
-11        29.29919     host ceph5
 32   hdd   3.66240         osd.32   down        0 1.00000
 33   hdd   3.66240         osd.33     up  1.00000 1.00000
 34   hdd   3.66240         osd.34     up  1.00000 1.00000
 35   hdd   3.66240         osd.35     up  1.00000 1.00000
 36   hdd   3.66240         osd.36   down        0 1.00000
 37   hdd   3.66240         osd.37     up  1.00000 1.00000
 38   hdd   3.66240         osd.38     up  1.00000 1.00000
 39   hdd   3.66240         osd.39     up  1.00000 1.00000

#3 Updated by Jan Marquardt about 6 years ago

And the next three ...

root@head1:~# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       145.93274 root default
 -2        29.08960     host ceph1
  0   hdd   3.63620         osd.0      up  1.00000 1.00000
  1   hdd   3.63620         osd.1    down        0 1.00000
  2   hdd   3.63620         osd.2      up  1.00000 1.00000
  3   hdd   3.63620         osd.3      up  1.00000 1.00000
  4   hdd   3.63620         osd.4      up  1.00000 1.00000
  5   hdd   3.63620         osd.5    down        0 1.00000
  6   hdd   3.63620         osd.6      up  1.00000 1.00000
  7   hdd   3.63620         osd.7      up  1.00000 1.00000
 -3        29.13217     host ceph2
  8   hdd   3.63620         osd.8    down        0 1.00000
  9   hdd   3.63620         osd.9    down        0 1.00000
 10   hdd   3.63620         osd.10   down        0 1.00000
 11   hdd   3.65749         osd.11   down        0 1.00000
 12   hdd   3.63620         osd.12     up  1.00000 1.00000
 13   hdd   3.65749         osd.13   down        0 1.00000
 14   hdd   3.63620         osd.14     up  1.00000 1.00000
 15   hdd   3.63620         osd.15     up  1.00000 1.00000
 -4        29.11258     host ceph3
 16   hdd   3.63620         osd.16     up  1.00000 1.00000
 18   hdd   3.63620         osd.18     up  1.00000 1.00000
 19   hdd   3.63620         osd.19     up  1.00000 1.00000
 20   hdd   3.65749         osd.20   down        0 1.00000
 21   hdd   3.63620         osd.21     up  1.00000 1.00000
 22   hdd   3.63620         osd.22     up  1.00000 1.00000
 23   hdd   3.63620         osd.23     up  1.00000 1.00000
 24   hdd   3.63789         osd.24     up  1.00000 1.00000
 -9        29.29919     host ceph4
 17   hdd   3.66240         osd.17     up  1.00000 1.00000
 25   hdd   3.66240         osd.25     up  1.00000 1.00000
 26   hdd   3.66240         osd.26     up  1.00000 1.00000
 27   hdd   3.66240         osd.27     up  1.00000 1.00000
 28   hdd   3.66240         osd.28   down        0 1.00000
 29   hdd   3.66240         osd.29     up  1.00000 1.00000
 30   hdd   3.66240         osd.30     up  1.00000 1.00000
 31   hdd   3.66240         osd.31     up  1.00000 1.00000
-11        29.29919     host ceph5
 32   hdd   3.66240         osd.32   down        0 1.00000
 33   hdd   3.66240         osd.33   down  1.00000 1.00000
 34   hdd   3.66240         osd.34     up  1.00000 1.00000
 35   hdd   3.66240         osd.35     up  1.00000 1.00000
 36   hdd   3.66240         osd.36   down        0 1.00000
 37   hdd   3.66240         osd.37     up  1.00000 1.00000
 38   hdd   3.66240         osd.38     up  1.00000 1.00000
 39   hdd   3.66240         osd.39     up  1.00000 1.00000

#4 Updated by Greg Farnum almost 6 years ago

That URL denies access. You can use ceph-post-file instead to upload logs to a secure location.

It's not clear what happened, but scrub errors are usually rooted in some kind of hardware issue. They may be propagating just because everything's corrupted but only a small set of your OSDs can get scrub reservations at once, or there may be a piece of broken metadata that is getting moved around. Either way, it's not something likely to be solved in an urgent fashion on the tracker.

#5 Updated by Jan Marquardt almost 6 years ago

Hi Greg,

thanks for your response.

> That URL denies access. You can use ceph-post-file instead to upload logs to a secure location.

Yeah, damn, you're right, sorry. I didn't know about ceph-post-file until now. I just uploaded the file: d15bc45c-eb2a-4d92-8d57-94288f0b5490
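
For the record, the upload was done roughly like this (the exact description text is from memory, so treat it as illustrative):

root@ceph2:~# ceph-post-file -d 'osd.11 crash after ceph pg repair 0.103 (bug #23439)' /var/log/ceph/ceph-osd.11.log.1.gz

The tool prints an upload id to pass on, which is the one quoted above.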

> It's not clear what happened, but scrub errors are usually rooted in some kind of hardware issue. They may be propagating just because everything's corrupted but only a small set of your OSDs can get scrub reservations at once, or there may be a piece of broken metadata that is getting moved around. Either way, it's not something likely to be solved in an urgent fashion on the tracker.

So would the mailing list be a better place to get support for cases like this?
We also currently suspect that the hardware is the problem in some way, and will therefore probably replace it with new hardware.

#6 Updated by Greg Farnum almost 6 years ago

You'll definitely get more attention and advice there if somebody else has hit this issue before.
