Bug #53899

open

bluefs _allocate allocation failed - BlueFS.cc: 2768: ceph_abort_msg("bluefs enospc")

Added by Pivert Dubuisson over 2 years ago. Updated about 1 month ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

All OSDs fail to start after the OSDs became nearly full. The cluster is down.
3-node cluster (pve1, pve2, pve3), BlueStore on a single LV (LVM) on NVMe.

Sequence of events:
- 00:30 The Ceph RBD storage was about 90% full on a 3-node cluster with about 140GB on each node.
- 00:47 pve2: Issued a VM/CT restore (in Proxmox VE) to overwrite a 20GB CT. From the pve logs:
  - Logical volume "vm-105-disk-0" successfully removed
  - restoring 'OneDriveSecure1:backup/vzdump-lxc-105-2022_01_15-23_14_19.tar.zst' now..
- 00:48:26 pve3 ceph-osd1681: 2022-01-16T00:48:26.215+0100 7f7bfa5a5700 1 bluefs _allocate allocation failed, needed 0x1687 -> ceph_abort_msg("bluefs enospc")
- 00:48:33 pve1 ceph-osd2157: 2022-01-16T00:48:33.904+0100 7f3b78139700 1 bluefs _allocate allocation failed, needed 0x6ff1 -> ceph_abort_msg("bluefs enospc")
- 01:20 pve2: The restore never finished. The server was hanging (iowaits), and a reboot did not complete.
- 01:38 pve2: I had to power off and restart the server.
- 01:39:12 pve2 ceph-crash687: INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s
- 01:40:00 pve2 ceph-osd1681: 2022-01-16T01:40:00.023+0100 7f628f9cbf00 -1 bluefs _allocate allocation failed, needed 0x74240 -> ceph_abort_msg("bluefs enospc")
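
For anyone going through the attached logs, the "needed" values in the messages above are hexadecimal. A minimal sketch for pulling them out and converting them to decimal follows. The function name failed_allocations and the sample list are only illustrative, and reading the value as a byte count is an assumption based on the message text, not something confirmed in this report.

```python
import re

# Matches the BlueFS message quoted in the sequence of events and captures
# the hexadecimal "needed" value.
PATTERN = re.compile(
    r"bluefs _allocate allocation failed, needed (0x[0-9a-fA-F]+)"
)


def failed_allocations(lines):
    """Yield the requested size (as an int) for each failed allocation line."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            yield int(match.group(1), 16)


# Sample lines shortened from the log excerpts in this report.
sample = [
    "7f7bfa5a5700 1 bluefs _allocate allocation failed, needed 0x1687",
    "7f3b78139700 1 bluefs _allocate allocation failed, needed 0x6ff1",
    "INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s",
    "7f628f9cbf00 -1 bluefs _allocate allocation failed, needed 0x74240",
]

for size in failed_allocations(sample):
    # Assumption: the value is a size in bytes.
    print(f"{size} (0x{size:x})")
```

The same function can be fed an open log file instead of the sample list, since it only iterates over lines.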

At this stage, all OSDs are down, stuck in an endless loop of failed restarts.

All standard logs from the 3 pve hosts are attached.
(grep -a ceph /var/log/syslog | xz -z > /root/cephcrash_$HOSTNAME.xz)


Files

cephcrash_pve1.xz (944 KB), Pivert Dubuisson, 01/16/2022 05:24 PM
cephcrash_pve2.xz (59.2 KB), Pivert Dubuisson, 01/16/2022 05:24 PM
cephcrash_pve3.xz (750 KB), Pivert Dubuisson, 01/16/2022 05:24 PM
ceph_crash_logs_pve1.xz (840 KB), Pivert Dubuisson, 01/16/2022 11:14 PM
ceph_crash_logs_pve2.xz (87 KB), Pivert Dubuisson, 01/16/2022 11:14 PM
ceph_crash_logs_pve3.xz (758 KB), Pivert Dubuisson, 01/16/2022 11:14 PM
ceph-osd_pve1.1.log.xz (482 KB), Pivert Dubuisson, 01/16/2022 11:36 PM
ceph-osd_pve2.0.log.xz (117 KB), Pivert Dubuisson, 01/16/2022 11:36 PM
ceph-osd_pve3.2.log.xz (506 KB), Pivert Dubuisson, 01/16/2022 11:36 PM
issue-53899-journalctl-log.txt (53.1 KB), Niklas Hambuechen, 03/29/2024 12:12 AM
issue-53899-osd-log.txt.gz (103 KB), Niklas Hambuechen, 03/29/2024 12:12 AM

Related issues: 2 (0 open, 2 closed)

Related to bluestore - Bug #53466: OSD is unable to allocate free space for BlueFS (Resolved, Igor Fedotov)
Has duplicate bluestore - Bug #53590: ceph abort at bluefs enospc (Duplicate)
