Bug #41601

closed

oi(object_info_t).size does not match on disk size

Added by 侯 斌 over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
% Done: 50%
Source: Community (dev)
Tags:
Backport: mimic,luminous,nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: rest
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In our test environment (Ceph 14.2.1 (nautilus), replicated pool), we hit a scrub error similar to bug #23701. We used the rados_write interface in librados.h to issue a write with offset = 1024 and length = 0. After scrubbing the corresponding PG, scrub reported errors. The OSD log is below:

2019-09-01 20:41:21.393 7f767dc21700 0 log_channel(cluster) log [DBG] : 1.63 scrub starts
2019-09-01 20:41:21.402 7f767dc21700 -1 log_channel(cluster) log [ERR] : 1.63 shard 1 soid 1:c68704ec:::test_0001:head : candidate size 1024 info size 0 mismatch
2019-09-01 20:41:21.402 7f767dc21700 -1 log_channel(cluster) log [ERR] : 1.63 shard 0 soid 1:c68704ec:::test_0001:head : candidate size 1024 info size 0 mismatch
2019-09-01 20:41:21.402 7f767dc21700 -1 log_channel(cluster) log [ERR] : 1.63 shard 3 soid 1:c68704ec:::test_0001:head : candidate size 1024 info size 0 mismatch
2019-09-01 20:41:21.402 7f767dc21700 -1 log_channel(cluster) log [ERR] : 1.63 soid 1:c68704ec:::test_0001:head : failed to pick suitable object info
2019-09-01 20:41:21.402 7f767dc21700 -1 log_channel(cluster) log [ERR] : scrub 1.63 1:c68704ec:::test_0001:head : on disk size (1024) does not match object info size (0) adjusted for ondisk to (0)
2019-09-01 20:41:21.434 7f767dc21700 -1 log_channel(cluster) log [ERR] : 1.63 scrub 4 errors

After analyzing the code, we found that the CEPH_OSD_OP_WRITE operation truncates the object to offset when length is 0 and offset is greater than oi.size, which extends the on-disk object:

    if (op.extent.length == 0) {
      if (op.extent.offset > oi.size) {
        t->truncate(
          soid, op.extent.offset);
      } else {
        t->nop(soid);
      }
    } else {
      t->write(
        soid, op.extent.offset, op.extent.length, osd_op.indata, op.flags);
    }

However, oi.size is not updated afterwards, because the size update is skipped when length is 0:

  if (write_full ||
      (offset + length > oi.size && length)) {
    uint64_t new_size = offset + length;
    delta_stats.num_bytes -= oi.size;
    delta_stats.num_bytes += new_size;
    oi.size = new_size;
  }
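
The interaction of the two snippets above can be reproduced with a small standalone model. This is only a sketch: ModelObject and every function name below are hypothetical and are not Ceph code, write_full is left out, and update_oi_hypothetical shows one possible adjustment for illustration, not necessarily the fix that was merged.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the two sizes that scrub compares; not Ceph's real types.
struct ModelObject {
  uint64_t on_disk_size = 0;  // size held by the object store
  uint64_t oi_size = 0;       // size recorded in object_info_t::size
};

// Mirrors the transaction branch quoted above: a zero-length write past the
// recorded size becomes a truncate, which grows the on-disk object.
inline void apply_to_store(ModelObject& o, uint64_t offset, uint64_t length) {
  if (length == 0) {
    if (offset > o.oi_size) {
      o.on_disk_size = offset;           // t->truncate(soid, offset)
    }                                    // else: t->nop(soid)
  } else if (offset + length > o.on_disk_size) {
    o.on_disk_size = offset + length;    // t->write(...) extends the object
  }
}

// Mirrors the size bookkeeping quoted above (write_full omitted).
inline void update_oi_as_reported(ModelObject& o, uint64_t offset, uint64_t length) {
  if (offset + length > o.oi_size && length) {
    o.oi_size = offset + length;
  }
}

// Hypothetical adjustment: also update the size when length == 0.
inline void update_oi_hypothetical(ModelObject& o, uint64_t offset, uint64_t length) {
  if (offset + length > o.oi_size) {
    o.oi_size = offset + length;
  }
}

// Store update followed by the bookkeeping as described in this report.
inline ModelObject write_as_reported(ModelObject o, uint64_t offset, uint64_t length) {
  apply_to_store(o, offset, length);
  update_oi_as_reported(o, offset, length);
  return o;
}

// Store update followed by the hypothetical bookkeeping.
inline ModelObject write_hypothetical(ModelObject o, uint64_t offset, uint64_t length) {
  apply_to_store(o, offset, length);
  update_oi_hypothetical(o, offset, length);
  return o;
}
```

Calling write_as_reported(ModelObject{}, 1024, 0) leaves on_disk_size at 1024 and oi_size at 0, the same pair of values as in the scrub log above. With write_hypothetical both sizes end at 1024.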

Moreover, we think the same bug occurs in CephFS when an old write (e.g. offset = 1024, length = 4096) arrives after a trimtrunc (e.g. truncate_seq = 2, truncate_size = 0).


Related issues 3 (0 open, 3 closed)

Copied to RADOS - Backport #41702: luminous: oi(object_info_t).size does not match on disk size (Rejected)
Copied to RADOS - Backport #41703: nautilus: oi(object_info_t).size does not match on disk size (Resolved, Prashant D)
Copied to RADOS - Backport #41704: mimic: oi(object_info_t).size does not match on disk size (Resolved, Prashant D)