Project

General

Profile

Actions

Bug #65857

open

osd: user_version is inconsistent between object_info and log entries

Added by Samuel Just 12 days ago. Updated 10 days ago.

Status:
In Progress
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Steps to reproduce:
- create vstart cluster

    MDS=0 MGR=1 OSD=3 MON=1 ../src/vstart.sh --without-dashboard -X --redirect-output --debug -n --no-restart --crimson
    ./bin/ceph osd pool create rbd 8 8 replicated replicated_rule 2 2 2

- start ceph_test_rados
    ./bin/ceph_test_rados --max-ops 10000000000 --pool rbd --objects 20 --max-in-flight 10 --size 1000000 --no-omap --op write 1

- mark an osd down
    ./bin/ceph osd down 0

Result:

251:  finishing write tid 1 to 6f040cb71532303301-10
251:  finishing write tid 2 to 6f040cb71532303301-10
241:  finishing write tid 1 to 6f040cb71532303301-18
241:  finishing write tid 2 to 6f040cb71532303301-18
241:  finishing write tid 3 to 6f040cb71532303301-18
241:  finishing write tid 4 to 6f040cb71532303301-18
241:  oid 18 updating version 0 to 212
241:  oid 18 updating version 212 to 214
241:  oid 18 updating version 214 to 216
update_object_version oid 18 v 216 (ObjNum 240 snap 0 seq_num 240) dirty exists
Error: racing read on 18 returned version 215 rather than version 216
/home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/src/test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f4e01b99640 time 2024-05-07T02:50:12.974788+0000
/home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/src/test/osd/RadosModel.h: 1076: ceph_abort_msg("racing read got wrong version")
 ceph version 19.0.0-3547-gf3b5e31713f (f3b5e31713f21ed8eb561155964aa72f331b572b) squid (dev)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xc8) [0x7f4e0ae87d1c]
 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x5ca) [0x44c7a4]
 3: (TestOp::finish(TestOp::CallbackInfo*)+0xa) [0x45d6e6]
 4: (write_callback(void*, void*)+0x18) [0x45d719]
 5: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0x9031c) [0x7f4e0b94331c]
 6: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0x99a8a) [0x7f4e0b94ca8a]
 7: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0x8f46b) [0x7f4e0b94246b]
 8: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0x9b168) [0x7f4e0b94e168]
 9: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0x9ba18) [0x7f4e0b94ea18]
 10: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0xa7325) [0x7f4e0b95a325]
 11: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0xa7aeb) [0x7f4e0b95aaeb]
 12: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0xa9fcd) [0x7f4e0b95cfcd]
 13: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0xaa042) [0x7f4e0b95d042]
 14: /home/sam/git-checkouts/ceph-workspace/integration/crimson-obc-testing/build/lib/librados.so.2(+0xaa05b) [0x7f4e0b95d05b]
 15: /lib64/libstdc++.so.6(+0xdbad4) [0x7f4e09cdbad4]
 16: /lib64/libc.so.6(+0x89c02) [0x7f4e09889c02]
 17: /lib64/libc.so.6(+0x10ec40) [0x7f4e0990ec40]

osd_op_params->at_version is used to set the log entry versions incorrectly


Related issues 1 (1 open0 closed)

Related to crimson - Bug #65491: recover_missing: racing read got wrong versionIn ProgressMatan Breizman

Actions
Actions #1

Updated by Samuel Just 12 days ago

  • Description updated (diff)
Actions #2

Updated by Matan Breizman 11 days ago

  • Related to Bug #65491: recover_missing: racing read got wrong version added
Actions #3

Updated by Matan Breizman 11 days ago · Edited

I suspcet this is related to how we handle acting set changes which trigger `ClientRequest::Orderer::requeue`.
See: https://tracker.ceph.com/issues/65491#note-11

Attaching recent relevant at_version changes (issue persists with the merged PRs):
https://github.com/ceph/ceph/pull/57102
https://github.com/ceph/ceph/pull/57056

Actions #4

Updated by Samuel Just 11 days ago · Edited

I think it's a bug in how osd_op_params_t::user_at_version is used.

Actions #5

Updated by Samuel Just 11 days ago

  • Status changed from New to In Progress
  • Priority changed from Normal to Urgent
Actions #6

Updated by Samuel Just 11 days ago

First,

241:  oid 18 updating version 0 to 212
241:  oid 18 updating version 212 to 214
241:  oid 18 updating version 214 to 216

is weird. We're only seeing even versions here because we're double incrementing osd_op_params_t::at_version for each operation. This turns out to be unrelated to the failure, but it's worth fixing.

Second, we populate osd_op_params_t::user_at_version in flush_changes_n_do_osd_ops_effects from osd_op_params_t::at_version prior to calling prepare_transaction, which increments at_version.version. That incremented version eventually ends up as oi.version in PGBackend::mutate_object. user_at_version ends up as oi.user_version, which therefore does not match oi.version. That non-matching version ends up being used on the MOSDOpReply the client sees. This behavior so far is definitely wrong as user_version won't match version, but it won't actually trigger the above failure as long as it's consistently one below the correct version.

Where the above crash occurs is if the client resends the ops. OpsExecutor::prepare_transaction uses at_version rather than user_at_version to populate the user_version field of the log entry (note, this is right value). Upon replay of the op, the OSD finds the log entry and responds with the correct user_version. The above failure occurs when we eventually get to the end of the log, actually process a write or a read the OSD hasn't already seen, and return the wrong user_version. Since ceph_test_rados already saw the right version, it actually notices and crashes.

Actions #7

Updated by Samuel Just 11 days ago

I'll get a PR out tomorrow.

Actions #8

Updated by Samuel Just 11 days ago

Also, PGBackend::mutate_object probably shouldn't have the object_info updates -- it's really far from where we construct the log entries in OpsExecutor::prepare_transaction and seems like it should be the responsibility of OpsExecutor (which does construct maintain clone object_info and all of the other size fields) rather than PGBackend.

Actions #9

Updated by Samuel Just 10 days ago

  • Backport set to squid
Actions

Also available in: Atom PDF