Bug #12012

Updated by Loïc Dachary over 7 years ago

issue-12012-ceph-report-20150614.gz is available via sftp at cephdrop@ceph.com

Hi,

I recently upgraded a Giant (0.87.1) cluster to Hammer 0.94.2. The upgrade itself went without issue.
The cluster is mainly used for KVM virtual machines backed by RBD images (format 2).

We have some 'replicated' pools and some 'erasure coding' pools.

On virtual machines using 'erasure coding' pools, we have seen many I/O errors on their filesystems (ext4) since the upgrade.

Running 'fsck' on the filesystem doesn't solve the problem: fsck repairs the filesystem, but it gets corrupted again when we reuse it.

I can't reproduce the issue on a freshly created image on the 'erasure coding' pool (I am still reading and writing files to make sure).

Ceph doesn't report any issue in logs.

Cheers,

Additional info:

$ ceph osd erasure-code-profile get ec4p1profile
directory=/usr/lib/ceph/erasure-code
k=4
m=1
plugin=jerasure
ruleset-failure-domain=host
technique=reed_sol_van
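
For readers less familiar with erasure-code profiles, here is a minimal sketch of what k=4, m=1 implies. It is plain arithmetic on the two values above; the function name ec_profile_summary is hypothetical and not part of any Ceph API.

```python
# Illustrative arithmetic only: not a Ceph API.
# k = number of data chunks, m = number of coding chunks per object.
def ec_profile_summary(k: int, m: int) -> dict:
    return {
        # Each object is split into k + m chunks, one per OSD.
        "chunks_per_object": k + m,
        # Up to m chunks can be lost and the object is still recoverable.
        "tolerated_failures": m,
        # Raw space consumed per unit of usable data.
        "raw_to_usable_ratio": (k + m) / k,
    }

summary = ec_profile_summary(k=4, m=1)
print(summary)
```

With ruleset-failure-domain=host, each of the five chunks is expected to land on a different host, so this profile can only ride out the loss of a single host.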

Output of 'dmesg' inside a VM:

[8536286.192503] end_request: I/O error, dev vdf, sector 4637171712
[8536286.530987] end_request: I/O error, dev vdf, sector 4637179904
[8536287.225984] end_request: I/O error, dev vdf, sector 4637188096
[8536287.556361] end_request: I/O error, dev vdf, sector 4637196288
[8536288.758833] end_request: I/O error, dev vdf, sector 4637204480
[8536289.305179] end_request: I/O error, dev vdf, sector 4637212672
[8536289.745542] end_request: I/O error, dev vdf, sector 4637220864
[8536290.192655] end_request: I/O error, dev vdf, sector 4637229056
[8536290.924421] end_request: I/O error, dev vdf, sector 4637237248
[8536291.635483] end_request: I/O error, dev vdf, sector 6468124152
[8557542.208535] end_request: I/O error, dev vdc, sector 7616889296
[8557542.944758] end_request: I/O error, dev vdc, sector 41232
[8557542.945423] Buffer I/O error on device vdc, logical block 5154
[8557542.946000] Buffer I/O error on device vdc, logical block 5155
[8557542.946572] Buffer I/O error on device vdc, logical block 5156
[8557542.947156] Buffer I/O error on device vdc, logical block 5157
[8557542.947728] Buffer I/O error on device vdc, logical block 5158
[8557542.948305] Buffer I/O error on device vdc, logical block 5159
[8557542.948741] Buffer I/O error on device vdc, logical block 5160
[8557542.948741] Buffer I/O error on device vdc, logical block 5161
[8557542.948741] Buffer I/O error on device vdc, logical block 5162
[8557543.804862] EXT4-fs error (device vdc): ext4_readdir:172: inode #11: comm find: path /data/tdcpb/archive/4/lost+found: directory contains a hole at offset 0
[8557543.806331] Aborting journal on device vdc-8.
[8557543.806787] EXT4-fs (vdc): Remounting filesystem read-only
[8557543.813079] end_request: I/O error, dev vdb, sector 7138738712
[8557544.515328] end_request: I/O error, dev vdb, sector 41232
[8557545.979173] EXT4-fs error (device vdb): ext4_readdir:172: inode #11: comm find: path /data/tdcpb/archive/6/lost+found: directory contains a hole at offset 0
[8557546.340171] end_request: I/O error, dev vdf, sector 41232
[8557546.913031] EXT4-fs error (device vdf): ext4_readdir:172: inode #11: comm find: path /data/tdcpb/archive/8/lost+found: directory contains a hole at offset 0
[8557546.914464] Aborting journal on device vdf-8.
[8557548.249942] EXT4-fs (vdf): Remounting filesystem read-only
[8557566.732395] end_request: I/O error, dev vdd, sector 41224
[8557844.829111] end_request: I/O error, dev vdd, sector 4915757312
[8583680.992045] EXT4-fs (vdb): error count since last fsck: 2
[8583680.992057] EXT4-fs (vdb): initial error at time 1434120904: ext4_find_entry:932: inode 111542328
[8583680.992060] EXT4-fs (vdb): last error at time 1434181428: ext4_readdir:172: inode 11
[8586076.784775] UDP: bad checksum. From 46.183.220.250:3099 to 89.234.156.236:5060 ulen 237
[8644236.256068] EXT4-fs (vdc): error count since last fsck: 1
[8644236.256070] EXT4-fs (vdc): initial error at time 1434181426: ext4_readdir:172: inode 11
[8644236.256072] EXT4-fs (vdc): last error at time 1434181426: ext4_readdir:172: inode 11
[8644236.256075] EXT4-fs (vdf): error count since last fsck: 1
[8644236.256076] EXT4-fs (vdf): initial error at time 1434181429: ext4_readdir:172: inode 11
[8644236.256078] EXT4-fs (vdf): last error at time 1434181429: ext4_readdir:172: inode 11
[8652845.649692] end_request: I/O error, dev vdd, sector 8328
[8652845.651157] EXT4-fs error (device vdd): ext4_read_inode_bitmap:161: comm touch: Cannot read inode bitmap - block_group = 0, inode_bitmap = 1041
[8652846.052461] Aborting journal on device vdd-8.
[8652846.408963] EXT4-fs (vdd): Remounting filesystem read-only
[8652846.409564] EXT4-fs error (device vdd) in ext4_new_inode:945: IO failure
[8652846.446874] EXT4-fs error (device vdd) in ext4_create:1776: IO failure
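
One pattern in the log above may help triage: the consecutive vdf errors are 8192 sectors apart, which is 4 MiB with 512-byte sectors. That matches the default RBD object size, so each failed request may correspond to one RBD object. The sketch below groups the failing sectors by device and maps each to an object index. The 4 MiB object size is an assumption (the report does not state the image's object size), and the helper names are hypothetical.

```python
import re
from collections import defaultdict

SECTOR_BYTES = 512               # the kernel reports sectors in 512-byte units
ASSUMED_OBJECT_BYTES = 4 << 20   # assumption: default RBD object size (4 MiB)

IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")

def sectors_by_device(lines):
    """Collect the failing sector numbers per block device."""
    found = defaultdict(list)
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            found[match.group(1)].append(int(match.group(2)))
    return dict(found)

def object_index(sector, object_bytes=ASSUMED_OBJECT_BYTES):
    """Index of the object that would hold this sector, given object_bytes."""
    return sector * SECTOR_BYTES // object_bytes

# Three lines copied from the dmesg output above.
sample = [
    "[8536286.192503] end_request: I/O error, dev vdf, sector 4637171712",
    "[8536286.530987] end_request: I/O error, dev vdf, sector 4637179904",
    "[8557542.944758] end_request: I/O error, dev vdc, sector 41232",
]

errors = sectors_by_device(sample)
for dev, sectors in errors.items():
    for sector in sectors:
        print(dev, sector, object_index(sector))
```

If the image uses a different object size, pass it as object_bytes. The resulting indices are only a starting point for locating the affected objects in the pool.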
