Bug #23871

closed

luminous->mimic: missing primary copy of xxx, will try copies on 3, then full-object read crc mismatch

Added by Sage Weil almost 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2018-04-25T04:21:10.459 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.459 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.460 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.459 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.460 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.459 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 missing primary copy of 2:292cf221:::200.00000000:head, will try copies on 3
2018-04-25T04:21:10.463 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.463 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.464 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.463 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.464 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.463 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 missing primary copy of 2:292cf221:::200.00000000:head, will try copies on 3
2018-04-25T04:21:10.467 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.467 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.468 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.467 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.468 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.467 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 missing primary copy of 2:292cf221:::200.00000000:head, will try copies on 3
2018-04-25T04:21:10.471 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.471 7f167005d700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.472 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.471 7f167005d700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.473 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.471 7f167005d700 -1 log_channel(cluster) log [ERR] : 2.0 missing primary copy of 2:292cf221:::200.00000000:head, will try copies on 3
2018-04-25T04:21:10.475 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.475 7f167005d700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.476 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.475 7f167005d700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.477 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.475 7f167005d700 -1 log_channel(cluster) log [ERR] : 2.0 missing primary copy of 2:292cf221:::200.00000000:head, will try copies on 3
2018-04-25T04:21:10.479 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.479 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.487 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.479 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.487 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.479 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 missing primary copy of 2:292cf221:::200.00000000:head, will try copies on 3
2018-04-25T04:21:10.487 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.483 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
2018-04-25T04:21:10.487 INFO:tasks.ceph.osd.0.smithi198.stderr:2018-04-25 04:21:10.483 7f166c055700 -1 log_channel(cluster) log [ERR] : 2.0 full-object read crc 0xfceaec2a != expected 0x9d8f0d8d on 2:292cf221:::200.00000000:head
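
The excerpt above repeats the same mismatch many times. A minimal sketch of a way to collapse such a log into one line per distinct object and CRC pair follows. The function name `summarize_crc_errors` is hypothetical and not part of Ceph. It assumes the log is fed on stdin and that the message format matches the lines shown here.

```shell
# Hypothetical helper (not part of Ceph): read an OSD log on stdin and
# count "full-object read crc" errors per (object, computed crc, expected crc).
summarize_crc_errors() {
    # Keep only the crc-mismatch lines and rewrite each one as
    # "<object> got=<computed> expected=<stored>", then count duplicates.
    sed -n 's/.*full-object read crc \(0x[0-9a-f]*\) != expected \(0x[0-9a-f]*\) on \(.*\)$/\3 got=\1 expected=\2/p' \
        | sort | uniq -c
}
```

It can be run as `summarize_crc_errors < teuthology.log`, where the file name is only an example. Lines that do not contain a crc mismatch, such as the "missing primary copy" messages, are ignored.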

/a/sage-2018-04-25_02:29:48-upgrade:luminous-x-wip-sage2-testing-2018-04-24-1612-distro-basic-smithi/2436600
/a/sage-2018-04-25_02:29:48-upgrade:luminous-x-wip-sage2-testing-2018-04-24-1612-distro-basic-smithi/2436594

Related issues 4 (1 open, 3 closed)

Related to RADOS - Bug #23788: luminous->mimic: EIO (crc mismatch) on copy-get from ec pool (Duplicate, 04/18/2018)
Related to RADOS - Feature #24917: Gracefully deal with upgrades when bluestore skipping of data_digest becomes active (New, Josh Durgin, 07/13/2018)
Has duplicate RADOS - Bug #23204: missing primary copy of object in mixed luminous<->master cluster with bluestore (Duplicate, 03/02/2018)
Copied to RADOS - Backport #24908: luminous: luminous->mimic: missing primary copy of xxx, will try copies on 3, then full-object read crc mismatch (Resolved, Sage Weil)
Actions #1

Updated by Sage Weil almost 6 years ago

  • Status changed from 12 to In Progress
  • Assignee set to Sage Weil
Actions #2

Updated by Sage Weil almost 6 years ago

  • Has duplicate Bug #23204: missing primary copy of object in mixed luminous<->master cluster with bluestore added
Actions #3

Updated by Sage Weil almost 6 years ago

  • Status changed from In Progress to Resolved
Actions #4

Updated by Sage Weil almost 6 years ago

  • Related to Bug #23788: luminous->mimic: EIO (crc mismatch) on copy-get from ec pool added
Actions #5

Updated by Sage Weil almost 6 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to luminous
Actions #6

Updated by Sage Weil almost 6 years ago

original fix is fe5038c7f9577327f82913b4565712c53903ee48

luminous backport https://github.com/ceph/ceph/pull/23028

Actions #7

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #24908: luminous: luminous->mimic: missing primary copy of xxx, will try copies on 3, then full-object read crc mismatch added
Actions #8

Updated by David Zafman almost 6 years ago

  • Related to Feature #24917: Gracefully deal with upgrades when bluestore skipping of data_digest becomes active added
Actions #9

Updated by Sage Weil almost 6 years ago

For the luminous regression, this will reproduce the issue:

export h=mira018

cd src/ceph-deploy

# clean up node
ssh $h sudo systemctl stop ceph.target
ssh $h "sudo vgs --noheadings -o vg_name | grep ceph | xargs -r sudo vgremove -y"
ssh $h sudo ceph-volume lvm zap /dev/sdb
ssh $h sudo ceph-volume lvm zap /dev/sdc
./ceph-deploy purge $h
./ceph-deploy purgedata $h
ssh $h sudo yum remove -y librados2

# start over
rm ceph.*
./ceph-deploy install --dev wip-12.2.5-luminous $h
./ceph-deploy new $h
./ceph-deploy mon create-initial
./ceph-deploy mgr create $h
./ceph-deploy admin $h

./ceph-deploy osd create $h --data /dev/sdb --bluestore
./ceph-deploy osd create $h --data /dev/sdc --bluestore

ssh $h sudo ceph osd pool create foo 2
ssh $h sudo ceph osd pool set foo size 2
ssh $h sudo ceph osd crush rule create-replicated foo default osd
ssh $h sudo ceph osd pool set foo crush_rule foo

ssh $h sudo rados -p foo put foo /bin/bash
ssh $h sudo rados -p foo put bar /bin/bash

# 12.2.6 has the bug
./ceph-deploy install --dev luminous --dev-commit 488df8a1076c4f5fc5b8d18a90463262c438740f $h
ssh $h sudo systemctl restart ceph-osd.target
sleep 10

ssh $h sudo rados -p foo put foo /bin/ls
ssh $h sudo rados -p foo get foo /tmp/a    # this will produce EIO
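
The last step of the reproducer relies on the read failing. A minimal sketch of how that check could be made explicit follows. The `check_read` wrapper is hypothetical and not part of the original script. It assumes the read command exits non-zero on error, and it treats any non-zero exit as a reproduction rather than testing for EIO specifically.

```shell
# Hypothetical helper (not part of the original reproducer): run a read
# command and report whether it failed, which is the symptom of this bug.
check_read() {
    if "$@" >/dev/null 2>&1; then
        echo "read OK: bug not reproduced"
    else
        # In the else branch, $? still holds the exit status of the command.
        echo "read failed (rc=$?): bug reproduced"
    fi
}

# Example invocation, mirroring the last step of the reproducer:
#   check_read ssh "$h" sudo rados -p foo get foo /tmp/a
```

Running the same wrapper before and after the upgrade to the affected build gives a quick before/after comparison.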

Actions #10

Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved