Bug #22464 (closed): Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)

Added by Martin Preuss over 6 years ago. Updated over 1 year ago.

Status: Won't Fix
Priority: Urgent
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-deploy
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I'm new to Ceph. I started a Ceph cluster from scratch on Debian 9,
consisting of 3 hosts, each with 3-4 OSDs (using 4 TB HDDs, currently
totalling 10 HDDs).

Right from the start I kept receiving random scrub errors telling me
that some checksums didn't match the expected value; these were fixable
with "ceph pg repair".

I looked at the ceph-osd logfiles on each of the hosts and compared them
with the corresponding syslogs. I never found any hardware error, so there
was no problem reading or writing a sector hardware-wise. There was also
never any other suspicious syslog entry around the time a checksum error
was reported.

When I looked at the checksum error entries I found that the reported
bad checksum was always "0x6706be76".
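
To double-check the "zero block" interpretation, here is a minimal sketch (not taken from the cluster). It assumes BlueStore's crc32c is seeded with 0xffffffff and applies no final inversion, and it simply guesses 4 KiB and 64 KiB as plausible checksum chunk sizes, printing the crc32c of all-zero buffers of those sizes for comparison against 0x6706be76:

#!/usr/bin/env python3
# Sketch: crc32c of all-zero blocks, for comparison against 0x6706be76.
# Assumptions: seed 0xffffffff and no final inversion (my reading of
# BlueStore's crc32c); 4 KiB / 64 KiB are guessed chunk sizes, not values
# taken from this cluster.

def crc32c(data: bytes, seed: int = 0xFFFFFFFF) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x82F63B78
            else:
                crc >>= 1
    return crc

if __name__ == "__main__":
    for size in (4096, 65536):  # guessed checksum chunk sizes
        print("%6d zero bytes -> 0x%08x" % (size, crc32c(bytes(size))))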

The cluster was created with version 12.2.1 (the errors already occurred with that version) and later updated to 12.2.2.
All 3 nodes run Debian 9 with packages from "http://eu.ceph.com/debian-luminous/".

Cluster status:
services:
mon: 3 daemons, quorum ceph1,ceph2,ceph3
mgr: ceph1(active), standbys: ceph2
mds: cephfs-1/1/1 up {0=ceph1=up:active}, 2 up:standby
osd: 10 osds: 10 up, 10 in

data:
pools: 5 pools, 256 pgs
objects: 8097k objects, 10671 GB
usage: 25403 GB used, 11856 GB / 37259 GB avail
pgs: 256 active+clean

Pools:
pool 1 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 1184 flags hashpspool stripe_width 0 application cephfs
pool 2 'cephfs_data' replicated size 2 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1184 lfor 0/772 flags hashpspool stripe_width 0 compression_algorithm zlib compression_mode force application cephfs
pool 3 'cephfs_home' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 1184 lfor 0/463 flags hashpspool stripe_width 0 compression_algorithm zlib compression_mode force application cephfs
pool 4 'cephfs_multimedia' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1184 lfor 0/705 flags hashpspool stripe_width 0 application cephfs
pool 5 'cephfs_vdr' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1184 lfor 0/632 flags hashpspool stripe_width 0 application cephfs

OSD tree:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 36.38596 root default
-3 10.91579 host ceph1
0 hdd 3.63860 osd.0 up 0.79999 1.00000
1 hdd 3.63860 osd.1 up 0.70000 1.00000
2 hdd 3.63860 osd.2 up 1.00000 1.00000
-5 14.55438 host ceph2
3 hdd 3.63860 osd.3 up 1.00000 1.00000
4 hdd 3.63860 osd.4 up 1.00000 1.00000
5 hdd 3.63860 osd.5 up 1.00000 1.00000
9 hdd 3.63860 osd.9 up 1.00000 1.00000
-7 10.91579 host ceph3
6 hdd 3.63860 osd.6 up 1.00000 1.00000
7 hdd 3.63860 osd.7 up 1.00000 1.00000
8 hdd 3.63860 osd.8 up 1.00000 1.00000


Files

ceph-errors (5.95 KB) - List of bad pgs per day - Martin Preuss, 01/19/2018 07:09 PM

Related issues: 2 (0 open, 2 closed)

Related to bluestore - Bug #22102: BlueStore crashed on rocksdb checksum mismatch (Won't Fix, 11/10/2017)

Related to bluestore - Bug #25006: bad csum during upgrade test (Can't reproduce, 07/19/2018)

