
Bug #36455

BlueStore: ENODATA not fully handled

Added by Lars Marowsky-Brée over 1 year ago. Updated 29 days ago.

Status:
Resolved
Priority:
Normal
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
luminous,mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

We have a drive model experiencing weak writes, which manifest as failed reads later: the drive notices that the low-level checksum does not validate and returns ENODATA.

This is 12.2.7. This was supposedly(?) fixed in http://tracker.ceph.com/issues/23333 and backported for 12.2.6 (http://tracker.ceph.com/issues/23672): is_expected_ioerr() is supposed to map this to EIO.

However, we still observe an assert:

2018-10-01T12:14:22.503855-05:00 cosd9 ceph-osd[10672]: /home/abuild/rpmbuild/BUILD/ceph-12.2.7-420-gc0ef85b854/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_do_read(BlueStore::Collection*, BlueStore::OnodeRef, uint64_t, size_t, ceph::bufferlist&, uint32_t)' thread 7fed7de36700 time 2018-10-01 12:14:22.503757
2018-10-01T12:14:22.504167-05:00 cosd9 ceph-osd[10672]: /home/abuild/rpmbuild/BUILD/ceph-12.2.7-420-gc0ef85b854/src/os/bluestore/BlueStore.cc: 6673: FAILED assert(r == 0)
2018-10-01T12:14:22.504305-05:00 cosd9 ceph-osd[10672]: 2018-10-01 12:14:22.503728 7fed7de36700 -1 bluestore(/var/lib/ceph/osd/ceph-117) _do_read bdev-read failed: (61) No data available
2018-10-01T12:14:22.506015-05:00 cosd9 kernel: [950406.153546] sd 0:0:3:0: [sdd] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2018-10-01T12:14:22.506024-05:00 cosd9 kernel: [950406.153552] sd 0:0:3:0: [sdd] tag#2 Sense Key : Medium Error [current] [descriptor] 
2018-10-01T12:14:22.506025-05:00 cosd9 kernel: [950406.153555] sd 0:0:3:0: [sdd] tag#2 Add. Sense: Unrecovered read error
2018-10-01T12:14:22.506028-05:00 cosd9 kernel: [950406.153560] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2018-10-01T12:14:22.506029-05:00 cosd9 kernel: [950406.153562] sd 0:0:3:0: [sdd] tag#2 CDB: Read(16) 88 00 00 00 00 02 de 46 b7 00 00 00 04 00 00 00
2018-10-01T12:14:22.506030-05:00 cosd9 kernel: [950406.153565] blk_update_request: critical medium error, dev sdd, sector 12319111672
2018-10-01T12:14:22.506031-05:00 cosd9 kernel: [950406.153582] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2018-10-01T12:14:22.506032-05:00 cosd9 kernel: [950406.153588] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2018-10-01T12:14:22.506033-05:00 cosd9 kernel: [950406.153593] mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
2018-10-01T12:14:22.508227-05:00 cosd9 ceph-osd[10672]:  ceph version 12.2.7-420-gc0ef85b854 (c0ef85b8544d85079fa7b1643ff944121eb51513) luminous (stable)
2018-10-01T12:14:22.508410-05:00 cosd9 ceph-osd[10672]:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x557e216fbabe]
2018-10-01T12:14:22.508541-05:00 cosd9 ceph-osd[10672]:  2: (BlueStore::_do_read(BlueStore::Collection*, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0x1023) [0x557e215cde23]
2018-10-01T12:14:22.508655-05:00 cosd9 ceph-osd[10672]:  3: (BlueStore::read(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0x47a) [0x557e215cfdfa]
2018-10-01T12:14:22.508766-05:00 cosd9 ceph-osd[10672]:  4: (ECBackend::be_deep_scrub(hobject_t const&, ScrubMap&, ScrubMapBuilder&, ScrubMap::object&)+0x248) [0x557e2147d978]
2018-10-01T12:14:22.508893-05:00 cosd9 ceph-osd[10672]:  5: (PGBackend::be_scan_list(ScrubMap&, ScrubMapBuilder&)+0x87b) [0x557e2138baab]
2018-10-01T12:14:22.508997-05:00 cosd9 ceph-osd[10672]:  6: (PG::build_scrub_map_chunk(ScrubMap&, ScrubMapBuilder&, hobject_t, hobject_t, bool, ThreadPool::TPHandle&)+0x88) [0x557e2123a518]
2018-10-01T12:14:22.509101-05:00 cosd9 ceph-osd[10672]:  7: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x5d3) [0x557e21266b53]
2018-10-01T12:14:22.509210-05:00 cosd9 ceph-osd[10672]:  8: (PG::scrub(unsigned int, ThreadPool::TPHandle&)+0x2e4) [0x557e21269034]
2018-10-01T12:14:22.509315-05:00 cosd9 ceph-osd[10672]:  9: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfd7) [0x557e211b2f17]
2018-10-01T12:14:22.509420-05:00 cosd9 ceph-osd[10672]:  10: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x829) [0x557e21701269]
2018-10-01T12:14:22.509524-05:00 cosd9 ceph-osd[10672]:  11: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x557e217031b0]
2018-10-01T12:14:22.509628-05:00 cosd9 ceph-osd[10672]:  12: (()+0x8724) [0x7fed9c98b724]
2018-10-01T12:14:22.509733-05:00 cosd9 ceph-osd[10672]:  13: (clone()+0x6d) [0x7fed9ba13e8d]

(During deep scrub.)
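For context, the remapping that is_expected_ioerr() is expected to perform can be sketched roughly as follows. This is an illustrative C++ sketch, not Ceph's actual code: the errno list is an assumed subset, and normalize_read_error() is a hypothetical helper name.

```cpp
#include <cerrno>

// Illustrative sketch only: treat "expected" media-level read errors, such
// as ENODATA from a failed drive-level checksum, as plain I/O errors (EIO)
// so upper layers see one uniform error code. The errno list here is an
// assumed subset; Ceph's real is_expected_ioerr() may differ.
static bool is_expected_ioerr(int r) {
  // Ceph's I/O stack passes errors as negative errno values.
  return r == -ENODATA || r == -EILSEQ || r == -ETIMEDOUT ||
         r == -ENOTRECOVERABLE || r == -EOPNOTSUPP;
}

// Hypothetical helper: collapse expected device errors to -EIO,
// pass everything else (including success) through unchanged.
static int normalize_read_error(int r) {
  return (r < 0 && is_expected_ioerr(r)) ? -EIO : r;
}
```

The log above shows _do_read() reporting the raw (61) ENODATA before asserting, suggesting the deep-scrub read path does not go through this remapping (or that the remapped error still trips assert(r == 0)).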


Related issues

Copied to bluestore - Backport #37824: mimic: BlueStore: ENODATA not fully handled Resolved
Copied to bluestore - Backport #37825: luminous: BlueStore: ENODATA not fully handled Resolved

History

#1 Updated by Joao Eduardo Luis over 1 year ago

  • Description updated (diff)

#2 Updated by Joao Eduardo Luis over 1 year ago

  • Affected Versions v12.2.7 added

#3 Updated by Lars Marowsky-Brée over 1 year ago

The code appears identical in master.

For this particular case, especially during scrub, we know our local copy is lost. It would make sense to flag the PG for rebuild: a hard drive will remember this fault (tracked as a "pending sector" in SMART) and relocate the affected sector on the next write.

The error should also be logged properly, so appropriate action can be taken if these occur too frequently (besides firmware bugs, they really can be caused by cosmic radiation and bad luck); a large install will occasionally experience these, and we should recover and restore redundancy.

(Re-writing the affected data is a common mitigation for traditional RAID.)

Is that feasible?
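The RAID-style mitigation proposed above can be sketched with a toy model. Everything here is hypothetical (MockDisk, read_with_repair(), the replica argument); it only illustrates the idea that rewriting a failed sector from a surviving copy both restores redundancy and lets the drive firmware remap the pending sector.

```cpp
#include <cerrno>
#include <cstdint>
#include <map>
#include <vector>

// Toy stand-in for a block device: a map of sectors, some of which fail
// their medium-level checksum and return ENODATA on read. A write clears
// the fault, modelling the drive remapping the pending sector.
struct MockDisk {
  std::map<uint64_t, std::vector<uint8_t>> sectors;
  std::map<uint64_t, bool> bad;  // sectors whose medium checksum fails

  int read(uint64_t lba, std::vector<uint8_t>& out) {
    if (bad[lba]) return -ENODATA;
    out = sectors[lba];
    return 0;
  }
  int write(uint64_t lba, const std::vector<uint8_t>& in) {
    sectors[lba] = in;
    bad[lba] = false;  // the rewrite "reallocates" the sector
    return 0;
  }
};

// Hypothetical repair-from-replica read: on ENODATA, fetch the data from
// another copy and rewrite it locally instead of aborting.
int read_with_repair(MockDisk& local, MockDisk& replica,
                     uint64_t lba, std::vector<uint8_t>& out) {
  int r = local.read(lba, out);
  if (r == -ENODATA) {
    if (replica.read(lba, out) != 0)
      return -EIO;             // no good copy left anywhere
    local.write(lba, out);     // rewrite heals the pending sector
    r = 0;
  }
  return r;
}
```

In Ceph terms the "replica" would be the other OSDs in the PG, and the rewrite would happen as part of a repair rather than inline in the read path; the sketch only shows why a rewrite is enough to clear the fault.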

#4 Updated by Sage Weil over 1 year ago

  • Status changed from New to 12
  • Assignee set to Radoslaw Zarzynski

#6 Updated by Radoslaw Zarzynski over 1 year ago

  • Status changed from 12 to Fix Under Review
  • Backport set to luminous

#7 Updated by Sage Weil about 1 year ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from luminous to luminous,mimic

#8 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #37824: mimic: BlueStore: ENODATA not fully handled added

#9 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #37825: luminous: BlueStore: ENODATA not fully handled added

#10 Updated by Nathan Cutler about 1 year ago

  • Status changed from Pending Backport to Resolved

#11 Updated by Paul Emmerich 29 days ago

I'm seeing this on 14.2.7 during deep-scrubbing. I'll investigate and maybe open a new issue.

2020-02-27 23:18:30 XXXXXXXX osd.258 03ea89e700 -1 bluestore(/var/lib/ceph/osd/ceph-258) _do_read bdev-read failed: (61) No data available
2020-02-27 23:18:30 XXXXXXXX osd.258 /tmp/release/Debian/WORKDIR/ceph-14.2.7/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_do_read(BlueStore::Collection*, BlueStore::OnodeRef, uint64_t, size_t, ceph::bufferlist&, uint32_t, uint64_t)' thread 7f03ea89e700 time 2020-02-27 23:18:30.647763
2020-02-27 23:18:30 XXXXXXXX osd.258 /tmp/release/Debian/WORKDIR/ceph-14.2.7/src/os/bluestore/BlueStore.cc: 9152: FAILED ceph_assert(r == 0)
2020-02-27 23:18:30 XXXXXXXX osd.258 03ea89e700 -1 bdev(0x55d366e0a000 /var/lib/ceph/osd/ceph-258/block) read stalled read 0x474fddb0000~80000 (direct) since 300036s, timeout is 5s
2020-02-27 23:18:30 XXXXXXXX osd.258 03ea89e700 -1 bluestore(/var/lib/ceph/osd/ceph-258) _do_read bdev-read failed: (61) No data available
2020-02-27 23:18:30 XXXXXXXX osd.258 ceph version 14.2.7 (fb8e34a687d76cd3bd45c2a0fb445432ab69b4ff) nautilus (stable)
2020-02-27 23:18:30 XXXXXXXX osd.258 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x55d35c06ba86]
2020-02-27 23:18:30 XXXXXXXX osd.258 2: (()+0x518c5e) [0x55d35c06bc5e]
2020-02-27 23:18:30 XXXXXXXX osd.258 3: (BlueStore::_do_read(BlueStore::Collection*, boost::intrusive_ptr, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int, unsigned long)+0x39e7) [0x55d35c5d67f7]
2020-02-27 23:18:30 XXXXXXXX osd.258 4: (BlueStore::read(boost::intrusive_ptr&, ghobject_t const&, unsigned long, unsigned long, ceph::buffer::v14_2_0::list&, unsigned int)+0x1b5) [0x55d35c5da595]
2020-02-27 23:18:30 XXXXXXXX osd.258 5: (ReplicatedBackend::be_deep_scrub(hobject_t const&, ScrubMap&, ScrubMapBuilder&, ScrubMap::object&)+0x2cb) [0x55d35c44e89b]
2020-02-27 23:18:30 XXXXXXXX osd.258 6: (PGBackend::be_scan_list(ScrubMap&, ScrubMapBuilder&)+0x6db) [0x55d35c369e3b]
2020-02-27 23:18:30 XXXXXXXX osd.258 7: (PG::build_scrub_map_chunk(ScrubMap&, ScrubMapBuilder&, hobject_t, hobject_t, bool, ThreadPool::TPHandle&)+0x83) [0x55d35c208c53]
2020-02-27 23:18:30 XXXXXXXX osd.258 8: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x17d3) [0x55d35c234ed3]
2020-02-27 23:18:30 XXXXXXXX osd.258 9: (PG::scrub(unsigned int, ThreadPool::TPHandle&)+0x4bb) [0x55d35c23610b]
2020-02-27 23:18:30 XXXXXXXX osd.258 10: (PGScrub::run(OSD*, OSDShard*, boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x1a) [0x55d35c3ee6ea]
2020-02-27 23:18:30 XXXXXXXX osd.258 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x7d7) [0x55d35c167d17]
2020-02-27 23:18:30 XXXXXXXX osd.258 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b4) [0x55d35c726864]
2020-02-27 23:18:30 XXXXXXXX osd.258 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55d35c729270]
2020-02-27 23:18:30 XXXXXXXX osd.258 14: (()+0x7fa3) [0x7f04075e4fa3]
2020-02-27 23:18:30 XXXXXXXX osd.258 15: (clone()+0x3f) [0x7f04071944cf]
2020-02-27 23:18:30 XXXXXXXX osd.258 *** Caught signal (Aborted) **
2020-02-27 23:18:30 XXXXXXXX osd.258 in thread 7f03ea89e700 thread_name:tp_osd_tp
