Project

General

Profile

Actions

Bug #65672

open

Rados write requests user_version set to 0 when pg interval changes lead to duplicated client requests.

Added by Xuehan Xu 10 days ago. Updated 10 days ago.

Status:
New
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Client logs:

2024-04-25T12:15:59.395 INFO:tasks.rados.rados.0.scephqa02.stdout:714:  finishing write tid 4 to scephqa02.cpp.bjat.qianxin-inc.cn784452-714
2024-04-25T12:15:59.396 INFO:tasks.rados.rados.0.scephqa02.stdout:714:  finishing write tid 5 to scephqa02.cpp.bjat.qianxin-inc.cn784452-714
2024-04-25T12:15:59.396 INFO:tasks.rados.rados.0.scephqa02.stdout:714:  oid 714 version 0 is already newer than 0
2024-04-25T12:15:59.396 INFO:tasks.rados.rados.0.scephqa02.stdout:714:  oid 714 updating version 0 to 152
2024-04-25T12:15:59.397 INFO:tasks.rados.rados.0.scephqa02.stdout:714:  oid 714 updating version 152 to 153
2024-04-25T12:15:59.397 INFO:tasks.rados.rados.0.scephqa02.stdout:714:  oid 714 version 153 is already newer than 151
2024-04-25T12:15:59.397 INFO:tasks.rados.rados.0.scephqa02.stdout:update_object_version oid 714 v 153 (ObjNum 713 snap 0 seq_num 713) dirty exists
2024-04-25T12:15:59.399 DEBUG:teuthology.orchestra.run:got remote process result: None
2024-04-25T12:15:59.400 INFO:tasks.rados.rados.0.scephqa02.stderr:Error: racing read on 714 returned version 154 rather than version 153
2024-04-25T12:15:59.400 INFO:tasks.rados.rados.0.scephqa02.stderr:/da1/xxh/rpmbuild/BUILD/ceph-19.0.0-3235-g9dfc47aa2ed/src/test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 2af8ed001700 time 2024-04-25T12:15:57.888240+0000
2024-04-25T12:15:59.400 INFO:tasks.rados.rados.0.scephqa02.stderr:/da1/xxh/rpmbuild/BUILD/ceph-19.0.0-3235-g9dfc47aa2ed/src/test/osd/RadosModel.h: 1077: ceph_abort_msg("racing read got wrong version")
2024-04-25T12:15:59.401 INFO:tasks.rados.rados.0.scephqa02.stderr: ceph version 19.0.0-3235-g9dfc47aa2ed (9dfc47aa2ed674c314a654a6bb67267a31592af1) squid (dev)
2024-04-25T12:15:59.401 INFO:tasks.rados.rados.0.scephqa02.stderr: 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd8) [0x2af8d8d7b63d]
2024-04-25T12:15:59.401 INFO:tasks.rados.rados.0.scephqa02.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x8ae) [0x43cace]
2024-04-25T12:15:59.402 INFO:tasks.rados.rados.0.scephqa02.stderr: 3: (write_callback(void*, void*)+0x1d) [0x45135d]
2024-04-25T12:15:59.402 INFO:tasks.rados.rados.0.scephqa02.stderr: 4: /lib64/librados.so.2(+0xcf8c9) [0x2af8d81388c9]
2024-04-25T12:15:59.402 INFO:tasks.rados.rados.0.scephqa02.stderr: 5: /lib64/librados.so.2(+0xd0a85) [0x2af8d8139a85]
2024-04-25T12:15:59.403 INFO:tasks.rados.rados.0.scephqa02.stderr: 6: /lib64/librados.so.2(+0xd1633) [0x2af8d813a633]
2024-04-25T12:15:59.403 INFO:tasks.rados.rados.0.scephqa02.stderr: 7: /lib64/librados.so.2(+0xe5c12) [0x2af8d814ec12]
2024-04-25T12:15:59.403 INFO:tasks.rados.rados.0.scephqa02.stderr: 8: /lib64/librados.so.2(+0xd49f1) [0x2af8d813d9f1]
2024-04-25T12:15:59.403 INFO:tasks.rados.rados.0.scephqa02.stderr: 9: /lib64/librados.so.2(+0xe7f20) [0x2af8d8150f20]
2024-04-25T12:15:59.404 INFO:tasks.rados.rados.0.scephqa02.stderr: 10: /lib64/libstdc++.so.6(+0xc2ba3) [0x2af8d7d96ba3]
2024-04-25T12:15:59.404 INFO:tasks.rados.rados.0.scephqa02.stderr: 11: /lib64/libpthread.so.0(+0x81cf) [0x2af8d9e641cf]
2024-04-25T12:15:59.404 INFO:tasks.rados.rados.0.scephqa02.stderr: 12: clone()
2024-04-25T12:15:59.411 ERROR:teuthology:Uncaught exception (Hub)

OSD logs:

DEBUG 2024-04-25 12:15:57,871 [shard 0:main] osd -  pg_epoch 40 pg[3.b( v 32'154 (0'0,32'154] local-lis/les=39/40 n=43 ec=15/15 lis/c=39/31 les/c/f=40/32/0 sis=39) [3,0,2] r=0 lpr=39 pi=[31,39)/1 crt=32'154 lcod 0'0 mlcod 0'0 active  ClientRequest::process_op: client_request(id=97, detail=m=[osd_op(client.4232.0:3233 3.b 3:db5312dd:::scephqa02.cpp.bjat.qianxin-inc.cn784452-714:head {setxattr _header (82) in=89b, truncate 3983391} snapc 0={} RETRY=3 ondisk+retry+write+known_if_redirected+supports_pool_eio e39)]).0: checking already_complete
DEBUG 2024-04-25 12:15:57,871 [shard 0:main] osd -  pg_epoch 40 pg[3.b( v 32'154 (0'0,32'154] local-lis/les=39/40 n=43 ec=15/15 lis/c=39/31 les/c/f=40/32/0 sis=39) [3,0,2] r=0 lpr=39 pi=[31,39)/1 crt=32'154 lcod 0'0 mlcod 0'0 active  ClientRequest::process_op: client_request(id=97, detail=m=[osd_op(client.4232.0:3233 3.b 3:db5312dd:::scephqa02.cpp.bjat.qianxin-inc.cn784452-714:head {setxattr _header (82) in=89b, truncate 3983391} snapc 0={} RETRY=3 ondisk+retry+write+known_if_redirected+supports_pool_eio e39)]).0: already completed, sending reply

Actions #1

Updated by Xuehan Xu 10 days ago

  • Pull request ID set to 57102
Actions

Also available in: Atom PDF