Bug #21040 (closed)

bluestore: multiple objects (clones?) referencing same blocks (on all replicas)

Added by Charles Alva over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi Ceph,

I have been using Ceph Luminous since 12.0.x and it has been running well, but since Luminous 12.1.3 and 12.1.4 I keep getting 1 or 2 PGs in active+clean+inconsistent. Unfortunately, there is no official documentation on how to repair this on BlueStore. Some of the inconsistent PGs report read errors on all three replicas, while others show an empty/blank inconsistency list, as below. Out of frustration, and since this cluster is not in production, after deleting the objects one by one and still getting new inconsistencies, I decided to remove all the inconsistent objects at once.

What is the right way to fix this kind of issue without losing data? How does BlueStore handle HDD bad sectors; shouldn't it handle them automatically, like the old FileStore did? And does BlueStore periodically trim (discard) SSDs at runtime to keep write performance optimal?
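For example, would the standard per-PG deep-scrub and repair sequence (sketch below, using the PG ids from the health output that follows) be the recommended approach on BlueStore?

# ask the primary OSD to deep-scrub the PG, then repair it
ceph pg deep-scrub 1.6
ceph pg repair 1.6
ceph pg deep-scrub 1.26
ceph pg repair 1.26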

# ceph health detail
HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
    pg 1.6 is active+clean+inconsistent, acting [3,4,1]
    pg 1.26 is active+clean+inconsistent, acting [1,4,3]

root@ceph:~# rados list-inconsistent-obj 1.6 --format=json-pretty
{
    "epoch": 648,
    "inconsistents": []
}

root@ceph:~# rados list-inconsistent-obj 1.26
{"epoch":648,"inconsistents":[]}root@ceph:~# rados list-inconsistent-obj 1.26 --format=json-pretty
{
    "epoch": 648,
    "inconsistents": []
}

root@ceph:~# ceph pg ls inconsistent
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES      LOG  DISK_LOG STATE                     STATE_STAMP                VERSION    REPORTED    UP      UP_PRIMARY ACTING  ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP                LAST_DEEP_SCRUB DEEP_SCRUB_STAMP           
1.6        1042                  0        0         0       0 4270715526 1523     1523 active+clean+inconsistent 2017-08-18 13:16:02.304114 650'540008 650:1063869 [3,4,1]          3 [3,4,1]              3 650'539996 2017-08-18 13:16:02.304074      644'539753 2017-08-18 10:48:55.007583 
1.26       1082                  0        0         0       0 4424603286 1559     1559 active+clean+inconsistent 2017-08-18 13:15:59.676249 650'463621 650:1073114 [1,4,3]          1 [1,4,3]              1 650'463554 2017-08-18 13:15:59.676214      644'462902 2017-08-18 10:51:48.518118 

root@ceph:~# rados ls -p rbd | grep -i rbd_data.196f8574b0dc51.0000000000000a32
rbd_data.196f8574b0dc51.0000000000000a32

root@ceph:~# rados ls -p rbd | grep -i rbd_data.196f8574b0dc51.0000000000000d71
rbd_data.196f8574b0dc51.0000000000000d71
root@proxmox1:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-1/ --pgid 1.6 --op list rbd_data.196f8574b0dc51.0000000000000a32
["1.6",{"oid":"rbd_data.196f8574b0dc51.0000000000000a32","key":"","snapid":-2,"hash":2399151814,"max":0,"pool":1,"namespace":"","max":0}]
root@proxmox2:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-3/ --pgid 1.6 --op list rbd_data.196f8574b0dc51.0000000000000a32
["1.6",{"oid":"rbd_data.196f8574b0dc51.0000000000000a32","key":"","snapid":-2,"hash":2399151814,"max":0,"pool":1,"namespace":"","max":0}]
root@proxmox3:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-4/ --pgid 1.6 --op list rbd_data.196f8574b0dc51.0000000000000a32
["1.6",{"oid":"rbd_data.196f8574b0dc51.0000000000000a32","key":"","snapid":-2,"hash":2399151814,"max":0,"pool":1,"namespace":"","max":0}]

root@proxmox1:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-1/ --pgid 1.26 --op list rbd_data.196f8574b0dc51.0000000000000d71
["1.26",{"oid":"rbd_data.196f8574b0dc51.0000000000000d71","key":"","snapid":-2,"hash":3960199526,"max":0,"pool":1,"namespace":"","max":0}]
root@proxmox2:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-3/ --pgid 1.26 --op list rbd_data.196f8574b0dc51.0000000000000d71
["1.26",{"oid":"rbd_data.196f8574b0dc51.0000000000000d71","key":"","snapid":-2,"hash":3960199526,"max":0,"pool":1,"namespace":"","max":0}]
root@proxmox3:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-4/ --pgid 1.26 --op list rbd_data.196f8574b0dc51.0000000000000d71
["1.26",{"oid":"rbd_data.196f8574b0dc51.0000000000000d71","key":"","snapid":-2,"hash":3960199526,"max":0,"pool":1,"namespace":"","max":0}]

root@proxmox1:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-1/ --pgid 1.6 rbd_data.196f8574b0dc51.0000000000000a32 removeall
root@proxmox1:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-1/ --pgid 1.26 rbd_data.196f8574b0dc51.0000000000000d71 removeall
root@proxmox2:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-3/ --pgid 1.6 rbd_data.196f8574b0dc51.0000000000000a32 removeall
root@proxmox2:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-3/ --pgid 1.26 rbd_data.196f8574b0dc51.0000000000000d71 removeall
root@proxmox3:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-4/ --pgid 1.6 rbd_data.196f8574b0dc51.0000000000000a32 removeall
root@proxmox3:~# ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-4/ --pgid 1.26 rbd_data.196f8574b0dc51.0000000000000d71 removeall
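(Note: ceph-objectstore-tool requires the OSD to be stopped. In hindsight, a safer variant would have been to export each object before removing it; a sketch only, assuming systemd-managed OSDs and an illustrative backup path:)

# stop the OSD, export the object's data to a backup file, then remove it and restart the OSD
systemctl stop ceph-osd@1
ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-1/ --pgid 1.6 rbd_data.196f8574b0dc51.0000000000000a32 get-bytes > /root/rbd_data.196f8574b0dc51.0000000000000a32.bak
ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-1/ --pgid 1.6 rbd_data.196f8574b0dc51.0000000000000a32 removeall
systemctl start ceph-osd@1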

2017-08-18 07:10:07.277119 7f9b955df700  0 log_channel(cluster) log [DBG] : 1.6 repair starts
2017-08-18 07:11:32.119052 7f9b955df700 -1 log_channel(cluster) log [ERR] : 1.6 repair stat mismatch, got 1042/1043 objects, 16/16 clones, 1042/1043 dirty, 1/1 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts, 4266668678/4270862982 bytes, 0/0 hit_set_archive bytes.
2017-08-18 07:11:32.119138 7f9b955df700 -1 log_channel(cluster) log [ERR] : 1.6 repair 1 errors, 1 fixed

2017-08-18 07:11:32.321422 7ff8d90c9700  0 log_channel(cluster) log [DBG] : 1.26 repair starts
2017-08-18 07:13:01.834640 7ff8d90c9700 -1 log_channel(cluster) log [ERR] : 1.26 repair stat mismatch, got 1081/1082 objects, 24/24 clones, 1081/1082 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts, 4420490902/4424685206 bytes, 0/0 hit_set_archive bytes.
2017-08-18 07:13:01.834743 7ff8d90c9700 -1 log_channel(cluster) log [ERR] : 1.26 repair 1 errors, 1 fixed

root@ceph:~# ceph health detail
HEALTH_OK

Kind regards,

Charles Alva
