Bug #22796 (closed)

bluestore gets to ENOSPC with small devices

Added by David Turner about 6 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
bluestore
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have a 3 node cluster with mon, mds, mgr, and osds all running on each node. The steps I've recently performed on the cluster all went well until all 3 of my BlueStore SSD OSDs started crashing with the segfault named in the original subject.

1. Upgraded to 12.2.2 from 10.2.10.
2. Migrated my 9 HDD OSDs to BlueStore (without flash media for RocksDB or the WAL).
3. Configured my crush rules to specifically use class HDD.
4. Was unable to remove the previously required cache tier on top of an EC CephFS data pool, due to http://tracker.ceph.com/issues/22754
5. Created 3 new SSD OSDs with accompanying crush rules to use class SSD (see the sketch after this list).
6. Updated the pools cephfs_metadata and cephfs_cache to use the replicated-ssd crush rule.
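
For steps 5 and 6, a minimal sketch of the Luminous commands involved (the rule name replicated-ssd and the pool names come from the list above; the crush root "default" and failure domain "host" are assumptions):

    # replicated rule limited to device class ssd
    ceph osd crush rule create-replicated replicated-ssd default host ssd

    # point the pools at the new rule
    ceph osd pool set cephfs_metadata crush_rule replicated-ssd
    ceph osd pool set cephfs_cache crush_rule replicated-ssd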

Two days after making this change, the 3 SSD OSDs all segfaulted at the same time and refused to come back up. I generated a `debug bluestore = 20` log for each of these OSDs, but don't know how you would like me to provide them, since they're about 80 MB each.


Files

ceph-osd.9.log.debug5 (547 KB), David Turner, 01/25/2018 11:09 AM

Related issues 1 (0 open, 1 closed)

Related to bluestore - Bug #23040: bluestore: statfs available can go negative (Resolved, 02/19/2018)

Actions #1

Updated by Igor Fedotov about 6 years ago

Can you attach logs with lower debug level? E.g. debug bluestore = 5
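
(For completeness, a sketch of one way to get that level; osd.9 matches the attached log, and since the OSDs won't stay up, the ceph.conf route is probably the practical one:)

    # in ceph.conf on the OSD host, then restart the daemon
    [osd]
        debug bluestore = 5

    # or, for a daemon that is still running
    ceph tell osd.9 injectargs '--debug_bluestore 5/5'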

Actions #2

Updated by David Turner about 6 years ago

Here's a log with `debug bluestore = 5`.

Actions #3

Updated by David Turner about 6 years ago

David Turner wrote:

Here's a log with `debug bluestore = 5`.

Actions #4

Updated by Greg Farnum about 6 years ago

  • Project changed from Ceph to bluestore
  • Category deleted (OSD)
  • Priority changed from Normal to High

Please use ceph-post-file to upload the full logs.
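
(For reference, a rough example of the ceph-post-file invocation; the log path and description text are just placeholders:)

    # uploads the file to the Ceph developers and prints a tag to paste into the tracker
    ceph-post-file -d 'tracker 22796 osd.9 debug bluestore 20' /var/log/ceph/ceph-osd.9.log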

Actions #5

Updated by David Turner about 6 years ago

debug bluestore = 20 log for the same OSD as before.
ceph-post-file: 06b467b7-4a91-4263-85e0-c89268b694e3

Actions #6

Updated by David Turner about 6 years ago

This might be a red herring. I think Nick Fisk on the ML found the problem. Originally the output of `ceph osd df` showed the OSDs as 45% full, now it's showing as completely full.

Actions #7

Updated by David Turner about 6 years ago

I was able to resolve this issue by using ceph-objectstore-tool to remove copies of PGs so the OSDs could start. It would be helpful if the crash in this place reported full OSDs instead of an unknown error.
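
For anyone in the same spot, a hedged sketch of that workflow (the pgid 2.7 is made up for illustration, exporting before removing is prudent, and exact flags can vary between releases):

    # with the OSD stopped
    systemctl stop ceph-osd@9

    # see which PGs are present on this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 --op list-pgs

    # keep a copy of the PG before deleting it
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 --pgid 2.7 \
        --op export --file /root/pg-2.7.export

    # remove the local copy to free space so the OSD can start again
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 --pgid 2.7 \
        --op remove --force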

Actions #8

Updated by Brad Hubbard about 6 years ago

David Turner wrote:

I was able to resolve this issue by using ceph-objectstore-tool to remove copies of PGs so the OSDs could start. It would be helpful if the crash in this place reported full OSDs instead of an unknown error.

It does, "2018-01-25 06:05:56.325462 7f3803f9c700 -1 bluestore(/var/lib/ceph/osd/ceph-9) _txc_add_transaction error (28) No space left on device not handled on operation 10 (op 0, counting from 0)"

Actions #9

Updated by Sage Weil about 6 years ago

  • Subject changed from BlueStore.cc: 9363: FAILED assert(0 == "unexpected error") to BlueStore.cc: 9363: FAILED assert(0 == "unexpected error") (ENOSPC)
  • Status changed from New to Need More Info
-147> 2018-01-25 05:36:54.471301 7fd8eb27e700  5 osd.9 14828 heartbeat: osd_stat(22560 MB used, 16383 PB avail, 22312 MB total, peers [] op hist [])

OK, clearly 16 PB free isn't right. Is 22 GB total for the OSD correct, though?

Actions #10

Updated by Sage Weil about 6 years ago

  • Related to Bug #23040: bluestore: statfs available can go negative added
Actions #11

Updated by David Turner about 6 years ago

Yes, the 22 GB is correct; the 16 PB is not. I created a quick set of SSD OSDs, on the devices the OSDs had been using as filestore journals, to test the new crush rules. The cephfs_cache pool hadn't used even 5 GB at a time in the previous few months, but overnight it filled up completely when I put it on the small SSD crush rule.

I'm a little confused that the OSDs were able to fill up to 100%. I'm using default ratio settings.
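
(For context, the cluster-wide ratios live in the OSDMap and can be inspected and changed like this on Luminous; the values shown are the usual defaults, not read from this cluster:)

    # show the ratios currently in effect
    ceph osd dump | grep ratio
    # full_ratio 0.95
    # backfillfull_ratio 0.9
    # nearfull_ratio 0.85

    # adjust them if needed
    ceph osd set-nearfull-ratio 0.85
    ceph osd set-full-ratio 0.95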

Actions #12

Updated by Sage Weil about 6 years ago

  • Subject changed from BlueStore.cc: 9363: FAILED assert(0 == "unexpected error") (ENOSPC) to bluestore gets to ENOSPC with small devices
  • Status changed from Need More Info to 12

The full checks rely on a (slow) feedback loop. For small devices, it's easy to go faster than the "set the full flag" operation. This could be improved!
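
To put rough, assumed numbers on it: a 22 GB OSD has only about 1 GB between the default 95% full mark and 100%, and an SSD absorbing on the order of 200 MB/s of client writes crosses that gap in roughly 5 seconds, which can easily be less than the time it takes for the OSD to report its usage to the monitors, the full flag to be set, and the updated OSDMap to reach the clients.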

Actions #13

Updated by Sage Weil about 6 years ago

  • Priority changed from High to Normal
Actions #14

Updated by Sage Weil about 5 years ago

  • Status changed from 12 to Resolved

With Igor's recent changes we no longer rely on the slow feedback, so I think we can close this now.
