Bug #16028: File >100GB crash OSDs(?) - Ceph - Ceph

Bug #16028

closed

File >100GB crash OSDs(?)

Added by Georg Stergiou almost 8 years ago. Updated over 7 years ago.

Status: Won't Fix

Priority: High

Target version: v0.94.8

Source: other

Severity: 2 - major

Affected Versions: v0.94.7

ceph-qa-suite: rados

Description

Hi all,

This morning I encountered some strange behaviour that I think is a bug. I could not find anything similar here, so I am posting my experience.

I use Ceph to store my VM backups, but also mail server filesystems and so on. All servers run Debian (7.10); Ceph is v0.94.7. The VMs I store (rados -p VMs put ...) are usually about 40-60GB each. The VM backup pool is configured with size 2 (min_size 1). Ceph is running on 4 hosts (3 hosts have 4 OSDs each, 1 host has 8 OSDs). My general RBD pool is configured with size 3, min_size 2.

Ceph was installed the standard way using ceph-deploy, and the configuration is quite vanilla.

Last night I uploaded an image of about 104GB (the first file over 100GB). In the morning I noticed that Ceph was going crazy. One OSD was marked down, but all my RBDs were blocking, which I thought could not happen with a pool size of 3 (min_size 2). I assumed one disk had failed, so I waited for the rebalance.

While it was rebalancing (after maybe 10-15 minutes), I noticed that a second OSD was suddenly flagged as down, but the first OSD was available again. The logs showed that the OSDs reconnected by themselves because the disks were working perfectly ("log_channel(cluster) log [WRN] : map e8682 wrongly marked me down").

From then on, one of those two OSDs kept getting marked "down" while the other came back, in an infinite loop every 10-15 minutes. Between those cycles the Ceph cluster worked correctly for about a minute before an OSD was dropped again.

I set both OSDs to "down" (as I still thought broken disks were the reason) and everything synced fine. But there is one stuck PG which contains that >100GB file, and it is located on exactly those two OSDs that caused the trouble. So that's no coincidence in my eyes.

All OSDs are mounted as follows:

Host ceph01
/dev/xvdf1 on /var/lib/ceph/osd/ceph-13 type xfs (rw,noatime,attr2,delaylog,inode64,noquota)

Host ceph02
/dev/xvdc1 on /var/lib/ceph/osd/ceph-21 type xfs (rw,noatime,attr2,delaylog,inode64,noquota)

I attached the pg query; maybe this can help.

Could it be that an XFS file size limitation / xattr limitation causes the OSDs to be dropped mistakenly?

Thanks for feedback
Georg

Files

pg_query.txt (32.6 KB)

ceph pg query 10.31 (stuck active+remapped)

Georg Stergiou, 05/25/2016 04:57 PM


Updated by Abhishek Lekshmanan almost 8 years ago

Project changed from rgw to Ceph


Updated by Samuel Just over 7 years ago

Status changed from New to Won't Fix

Yeah, that's not gonna work. RADOS-level objects are supposed to be of bounded size (think 4 MB). You want to be using RGW's S3 interface to upload the objects (it breaks large images down into smaller pieces, 4 MB by default, behind the scenes).
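To illustrate the idea of breaking a large image into bounded-size pieces, here is a minimal Python sketch. It is not how RGW implements striping; the function names (`iter_chunks`, `chunk_count`) and the object naming scheme are hypothetical and only show the general technique. It does not talk to a cluster at all: each yielded piece would still have to be written as its own object by whatever client is used.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the bounded object size mentioned above


def iter_chunks(path, base_name, chunk_size=CHUNK_SIZE):
    """Yield (object_name, data) pairs covering the file in fixed-size pieces.

    The naming scheme "<base_name>.<16-digit hex index>" is an assumption
    made for this example, not a Ceph convention taken from this report.
    """
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break  # end of file reached
            yield "%s.%016x" % (base_name, index), data
            index += 1


def chunk_count(total_bytes, chunk_size=CHUNK_SIZE):
    """Number of pieces needed for total_bytes (ceiling division)."""
    return (total_bytes + chunk_size - 1) // chunk_size
```

Under these assumptions, a 104 GiB image would be split into `chunk_count(104 * 1024**3)` = 26624 pieces of 4 MiB each, instead of being stored as one huge object in a single PG.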
