Project

General

Profile

Bug #23827

osd sends op_reply out of order

Added by Patrick Donnelly almost 5 years ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
luminous, mimic
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2018-04-18T08:29:43.648 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr:/build/ceph-12.2.4-431-g64ffa81/src/osdc/ObjectCacher.cc: In function 'void ObjectCacher::bh_write_commit(int64_t, sobject_t, std::vector<std::pair<long int, long unsigned int> >&, ceph_tid_t, int)' thread 7fb3a0eaf700 time 2018-04-18 08:29:43.648412
2018-04-18T08:29:43.649 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr:/build/ceph-12.2.4-431-g64ffa81/src/osdc/ObjectCacher.cc: 1199: FAILED assert(ob->last_commit_tid < tid)
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: ceph version 12.2.4-431-g64ffa81 (64ffa817000d59d91379f7335439845930f58530) luminous (stable)
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55db3c3664f2]
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 2: (ObjectCacher::bh_write_commit(long, sobject_t, std::vector<std::pair<long, unsigned long>, std::allocator<std::pair<long, unsigned long> > >&, unsigned long, int)+0xfa4) [0x55db3c747414]
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 3: (ObjectCacher::C_WriteCommit::finish(int)+0x66) [0x55db3c7557f6]
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 4: (Context::complete(int)+0x9) [0x55db3c2fa429]
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 5: (C_Lock::finish(int)+0x29) [0x55db3c2fa819]
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 6: (Context::complete(int)+0x9) [0x55db3c2fa429]
2018-04-18T08:29:43.650 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 7: (Finisher::finisher_thread_entry()+0x460) [0x55db3c365630]
2018-04-18T08:29:43.651 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 8: (()+0x76ba) [0x7fb3a73686ba]
2018-04-18T08:29:43.651 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi068.stderr: 9: (clone()+0x6d) [0x7fb3a61d041d]

From: /a/yuriw-2018-04-17_21:26:13-powercycle-luminous-distro-basic-smithi/2409865


Related issues

Related to CephFS - Bug #22741: osdc: "FAILED assert(bh->last_write_tid > tid)" in powercycle-wip-yuri-master-1.19.18-distro-basic-smithi Resolved 01/20/2018
Related to Ceph - Bug #20742: "ObjectCacher.cc: 1196: FAILED assert(ob->last_commit_tid < tid)" in powercycle Can't reproduce
Copied to RADOS - Backport #24068: luminous: osd sends op_reply out of order Resolved
Copied to RADOS - Backport #25010: mimic: osd sends op_reply out of order Resolved

History

#1 Updated by Patrick Donnelly almost 5 years ago

  • Related to Bug #22741: osdc: "FAILED assert(bh->last_write_tid > tid)" in powercycle-wip-yuri-master-1.19.18-distro-basic-smithi added

#2 Updated by Zheng Yan almost 5 years ago

  • Project changed from CephFS to Ceph
  • Subject changed from osdc/ObjectCacher.cc: 1199: FAILED assert(ob->last_commit_tid < tid) to osd sends op_reply out of order
  • Category deleted (Correctness/Safety)
  • Assignee deleted (Zheng Yan)
   -41> 2018-04-18 08:29:43.643129 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49426 ==== osd_op_reply(102425 10000000004.00000004 [write 45056~4096,write 147456~8192] v45'51341 uv51341 ondisk = 0) v8 ==== 206+0+0 (1394293435 0 0) 0x55db6efe7040 con 0x55db46cc0000
   -40> 2018-04-18 08:29:43.647883 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49427 0x55db6efe7040 osd_op_reply(93534 10000000004.0000000a [write 2777088~4096,write 2793472~4096,write 2813952~4096] v52'54288 uv54288 ondisk = 0) v8
   -39> 2018-04-18 08:29:43.647898 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49427 ==== osd_op_reply(93534 10000000004.0000000a [write 2777088~4096,write 2793472~4096,write 2813952~4096] v52'54288 uv54288 ondisk = 0) v8 ==== 248+0+0 (4014598912 0 0) 0x55db6efe7040 con 0x55db46cc0000
   -38> 2018-04-18 08:29:43.648056 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49428 0x55db6ef48080 osd_op_reply(93741 10000000004.0000000a [write 909312~4096,write 1007616~4096,write 1019904~4096,write 1253376~4096] v52'54289 uv54289 ondisk = 0) v8
   -37> 2018-04-18 08:29:43.648069 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49428 ==== osd_op_reply(93741 10000000004.0000000a [write 909312~4096,write 1007616~4096,write 1019904~4096,write 1253376~4096] v52'54289 uv54289 ondisk = 0) v8 ==== 290+0+0 (826312375 0 0) 0x55db6ef48080 con 0x55db46cc0000
   -36> 2018-04-18 08:29:43.648194 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49429 0x55db6ef48080 osd_op_reply(93771 10000000004.0000000a [write 1323008~4096,write 1355776~4096,write 1646592~4096] v52'54290 uv54290 ondisk = 0) v8
   -35> 2018-04-18 08:29:43.648205 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49429 ==== osd_op_reply(93771 10000000004.0000000a [write 1323008~4096,write 1355776~4096,write 1646592~4096] v52'54290 uv54290 ondisk = 0) v8 ==== 248+0+0 (3235242034 0 0) 0x55db6ef48080 con 0x55db46cc0000
   -34> 2018-04-18 08:29:43.648358 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49430 0x55db6efe7040 osd_op_reply(93353 10000000004.00000004 [write 3481600~4096,write 3551232~4096,write 3698688~4096] v52'54291 uv54291 ondisk = 0) v8
   -33> 2018-04-18 08:29:43.648371 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49430 ==== osd_op_reply(93353 10000000004.00000004 [write 3481600~4096,write 3551232~4096,write 3698688~4096] v52'54291 uv54291 ondisk = 0) v8 ==== 248+0+0 (3731186171 0 0) 0x55db6efe7040 con 0x55db46cc0000

both osd_op 93534 and 102425 modified object 10000000004.00000004. but client got reply for osd_op 102425 first.

osd.1.log

2018-04-18 08:29:25.771761 7fecc7cda700  1 -- 172.21.15.100:6800/11997 <== client.74195 172.21.15.68:0/2637453904 46923 ==== osd_op(client.74195.0:93534 3.1 3.5d9f5e61 (undecoded) ondisk+write+known_if_redirected e40) v8 ==== 305+0+12288 (4263481419 0 0) 0x55f2226a9400 con 0x55f21f569800
2018-04-18 08:29:25.771769 7fecc7cda700 15 osd.1 39 enqueue_op 0x55f2221968c0 prio 63 cost 12288 latency 0.000025 epoch 40 osd_op(client.74195.0:93534 3.1 3.5d9f5e61 (undecoded) ondisk+write+known_if_redirected e40) v8

2018-04-18 08:29:31.173418 7fecc7cda700  1 -- 172.21.15.100:6800/11997 <== client.74195 172.21.15.68:0/2637453904 51342 ==== osd_op(client.74195.0:102425 3.1 3.f3986555 (undecoded) ondisk+write+known_if_redirected e45) v8 ==== 267+0+12288 (4024547642 0 0) 0x55f22fecc080 con 0x55f21f569800
2018-04-18 08:29:31.173428 7fecc7cda700 15 osd.1 43 enqueue_op 0x55f2209ac8c0 prio 63 cost 12288 latency 0.000064 epoch 45 osd_op(client.74195.0:102425 3.1 3.f3986555 (undecoded) ondisk+write+known_if_redirected e45) v8

osd.1 did receive osd_op 93534 first

#3 Updated by Patrick Donnelly almost 5 years ago

  • Project changed from Ceph to RADOS
  • Priority changed from High to Urgent
  • Severity changed from 3 - minor to 2 - major
  • Component(RADOS) OSD added

#4 Updated by Sage Weil almost 5 years ago

  • Status changed from New to In Progress
  • Assignee set to Sage Weil

#5 Updated by jianpeng ma almost 5 years ago

For this case: if slot->to_process is null. And Op1 enqueue_front. At the same time Op2 dispatch. Because two threads do those ops. So we don't make sure who first. I think for slot->to_process is null, we should move enquene_front to to_process to make sure it firstly called.

#6 Updated by jianpeng ma almost 5 years ago

Ignore my statement. Dispatch do put_back.So no race .

#7 Updated by jianpeng ma almost 5 years ago

same bug #20742

#8 Updated by Sage Weil almost 5 years ago

  • Related to Bug #20742: "ObjectCacher.cc: 1196: FAILED assert(ob->last_commit_tid < tid)" in powercycle added

#9 Updated by Sage Weil almost 5 years ago

  • Status changed from In Progress to 12
  • Assignee deleted (Sage Weil)

#10 Updated by Josh Durgin almost 5 years ago

  • Assignee set to Neha Ojha

#11 Updated by Neha Ojha almost 5 years ago

For object 10000000004.00000004 osd_op_reply for 102425 is received before 93353.

-43> 2018-04-18 08:29:43.643046 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49425 ==== osd_op_reply(102268 10000000004.00000004 [write 3944448~4096,write 4190208~4096] v45'51262 uv51262 ondisk = 0) v8 ==== 206+0+0 (1458295628 0 0) 0x55db6efe7040 con 0x55db46cc0000
   -42> 2018-04-18 08:29:43.643114 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49426 0x55db6efe7040 osd_op_reply(102425 10000000004.00000004 [write 45056~4096,write 147456~8192] v45'51341 uv51341 ondisk = 0) v8
   -41> 2018-04-18 08:29:43.643129 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49426 ==== osd_op_reply(102425 10000000004.00000004 [write 45056~4096,write 147456~8192] v45'51341 uv51341 ondisk = 0) v8 ==== 206+0+0 (1394293435 0 0) 0x55db6efe7040 con 0x55db46cc0000
   -34> 2018-04-18 08:29:43.648358 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49430 0x55db6efe7040 osd_op_reply(93353 10000000004.00000004 [write 3481600~4096,write 3551232~4096,write 3698688~4096] v52'54291 uv54291 ondisk = 0) v8
   -33> 2018-04-18 08:29:43.648371 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49430 ==== osd_op_reply(93353 10000000004.00000004 [write 3481600~4096,write 3551232~4096,write 3698688~4096] v52'54291 uv54291 ondisk = 0) v8 ==== 248+0+0 (3731186171 0 0) 0x55db6efe7040 con 0x55db46cc0000
   -32> 2018-04-18 08:29:43.648499 7fb3a26b2700  5 -- 172.21.15.68:0/2637453904 >> 172.21.15.100:6800/11997 conn(0x55db46cc0000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=12 cs=1 l=1). rx osd.1 seq 49431 0x55db6ef48080 osd_op_reply(93355 10000000004.00000004 [write 16384~4096,write 36864~4096] v52'54292 uv54292 ondisk = 0) v8
   -31> 2018-04-18 08:29:43.648511 7fb3a26b2700  1 -- 172.21.15.68:0/2637453904 <== osd.1 172.21.15.100:6800/11997 49431 ==== osd_op_reply(93355 10000000004.00000004 [write 16384~4096,write 36864~4096] v52'54292 uv54292 ondisk = 0) v8 ==== 206+0+0 (3933831713 0 0) 0x55db6ef48080 con 0x55db46cc0000

#12 Updated by Neha Ojha almost 5 years ago

  • Status changed from 12 to Fix Under Review

The cause for this issue is that we are not tracking enough dup ops for this test, which does multiple writes to the same object. This can be fixed by increasing osd_pg_log_dups_tracked.

https://github.com/ceph/ceph/pull/21909

#13 Updated by Josh Durgin almost 5 years ago

  • Status changed from Fix Under Review to Pending Backport

#14 Updated by Nathan Cutler almost 5 years ago

  • Copied to Backport #24068: luminous: osd sends op_reply out of order added

#16 Updated by Nathan Cutler over 4 years ago

  • Backport changed from luminous to luminous, mimic

#17 Updated by Nathan Cutler over 4 years ago

#18 Updated by Nathan Cutler over 4 years ago

  • Status changed from Pending Backport to Resolved
  • ceph-qa-suite ceph-deploy added

#19 Updated by Nathan Cutler over 4 years ago

  • ceph-qa-suite rados added
  • ceph-qa-suite deleted (ceph-deploy)

#20 Updated by Ivan Guan about 1 year ago

This bug occurred in my online environment(Nautilus 14.2.5) some days ago and my application exited because client’s assert fail. After analysising osd log and code i found there may be a risk in the logic processing the dup requests.

Now,the osd will trigger canceling and requeuing ops when detected "on change”(osd be marked failed or down) of pg_map. When the pg become active, osd will retry those ops has been inserted to op_wq. The dups map can record how many versions back to track in order to detect duplicate ops which can avoid processing those non-idempotent operation repeatedly.

In my case, there are many write ops for one object from same client. One osd of this pg is down due to other reason and the pg requeue the write ops to op_wd when it detected pg on_change. After several seconds, the pg become active again and op_1 has been trimmed from pg_log and dups map and op_n is still in the dups map. When pg start retrying the ops, the op_1 didn’t found, so redo the op_1. The op_n was processed before all the EC shards of op_1 reply to primary osd. Thus, op_n was founded in dups map and be considered to be a completed request, so returned to client directly. This sequence will lead my write op out of order. When the op_1 received all shards and do a osd reply, the client will assert fail because it received op_n first.

What’s worse, if the op_n be processed after op_1 was completed, the client will not be able to catch this error, but the data of object was old data which has been overwritten by op_1.

The default value of option osd_pg_log_dups_tracked is 3000. In my opinion, if we increase it to 10000 or bigger value only can reduce the chance of this bug because we don’t know how many ops will be requeued when one osd was down.

I don’t know if what understand is correct, can anyone give me some ideas? Thank you very much!

#21 Updated by Ivan Guan about 1 year ago

Here is the log information on my environment.

op1 is coming and send shards to peers osd

2021-12-01 18:27:25.012 7f2da0001700 1 -- [v2:10.61.152.2:6888/1514953,v1:10.61.152.2:6889/1514953] <== client.64988315 10.61.152.103:0/2619559371 1947463 ==== osd_op(client.64988315.0:86211702 26.3as0 26.5227e3ba (undecoded) ondisk+write+known_if_redirected e32669) v8 ==== 229+0+180 (crc 0 0 0) 0x7f2dbc0ed8c0 con 0x7f2db6b74400
...
2021-12-01 18:27:25.012 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1603772 (32669'1601762,32669'1603772] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] new_repop rep_tid 5268699 on osd_op(client.64988315.0:86211702 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 305430~180] snapc 1=[] ondisk+write+known_if_redirected e32669) v8
...
2021-12-01 18:27:25.013 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.1:6840/617749,v1:172.27.152.1:6841/617749] -- MOSDECSubOpWrite(26.3as4 32669/32650 ECSubWrite(tid=5268699, reqid=client.64988315.0:86211702, at_version=32669'1603773, trim_to=32669'1601762, roll_forward_to=32669'1603420)) v2 -- 0x7f2dc0b37800 con 0x7f2db6b74c00
2021-12-01 18:27:25.013 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dc61cb000 already has epoch 32669
2021-12-01 18:27:25.013 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.1:6852/1624095,v1:172.27.152.1:6854/1624095] -- MOSDECSubOpWrite(26.3as5 32669/32650 ECSubWrite(tid=5268699, reqid=client.64988315.0:86211702, at_version=32669'1603773, trim_to=32669'1601762, roll_forward_to=32669'1603420)) v2 -- 0x7f2dc2580800 con 0x7f2dc61cb000
2021-12-01 18:27:25.013 7f2d7f7c0700 20 osd.58 32669 share_map_peer 0x7f2dbcd6b000 already has epoch 32669
2021-12-01 18:27:25.013 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dbcd6b000 already has epoch 32669
2021-12-01 18:27:25.013 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.2:6800/1524185,v1:172.27.152.2:6801/1524185] -- MOSDECSubOpWrite(26.3as1 32669/32650 ECSubWrite(tid=5268699, reqid=client.64988315.0:86211702, at_version=32669'1603773, trim_to=32669'1601762, roll_forward_to=32669'1603420)) v2 -- 0x7f2dba0de800 con 0x7f2dbcd6b000
2021-12-01 18:27:25.013 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dbe120000 already has epoch 32669
2021-12-01 18:27:25.013 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.3:6884/1510716,v1:172.27.152.3:6885/1510716] -- MOSDECSubOpWrite(26.3as3 32669/32650 ECSubWrite(tid=5268699, reqid=client.64988315.0:86211702, at_version=32669'1603773, trim_to=32669'1601762, roll_forward_to=32669'1603420)) v2 -- 0x7f2dc7b1d800 con 0x7f2dbe120000
2021-12-01 18:27:25.013 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dbba43c00 already has epoch 32669
2021-12-01 18:27:25.013 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.3:6904/1502674,v1:172.27.152.3:6905/1502674] -- MOSDECSubOpWrite(26.3as2 32669/32650 ECSubWrite(tid=5268699, reqid=client.64988315.0:86211702, at_version=32669'1603773, trim_to=32669'1601762, roll_forward_to=32669'1603420)) v2 -- 0x7f2dbc120000 con 0x7f2dbba43c00

op1 write pg log

2021-12-01 18:27:25.013 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1603772 (32669'1601762,32669'1603772] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean]* append_log log((32669'1601762,32669'1603772], crt=32669'1603420)* [32669'1603773 (32669'1603772) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86211702 2021-12-01 18:26:37.076668 0]
2021-12-01 18:27:25.013 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1603773 (32669'1601762,32669'1603773] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 lua=32669'1603772 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] add_log_entry 32669'1603773 (32669'1603772) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86211702 2021-12-01 18:26:37.076668 0
2021-12-01 18:27:25.013 7f2d7e7be700 20 earliest_dup_version = 1600774
2021-12-01 18:27:25.013 7f2d7e7be700 20 trim 32669'1603773 (32669'1603772) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86211702 2021-12-01 18:26:37.076668 0
2021-12-01 18:27:25.013 7f2d7e7be700 20 trim dup log_dup(reqid=client.64988315.0:86059851 v=32669'1600773 uv=1600773 rc=0)
2021-12-01 18:27:25.013 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1603773 (32669'1601762,32669'1603773] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 lua=32669'1603772 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] append_log approx pg log length = 2011
2021-12-01 18:27:25.013 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1603773 (32669'1601762,32669'1603773] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 lua=32669'1603772 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] append_log transaction_applied = 1
2021-12-01 18:27:25.013 7f2d7e7be700 10 trim proposed trim_to = 32669’1601762

ECWrite reply callback except for osd.23

2021-12-01 18:27:25.014 7f2da0001700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.83 v2:172.27.152.3:6884/1510716 1727791 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5268699, last_complete=32669'1603773, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc76e7a00 con 0x7f2dbe120000
2021-12-01 18:27:25.014 7f2d9e7fe700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.15 v2:172.27.152.1:6840/617749 1648674 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5268699, last_complete=32669'1603773, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2db9d30c00 con 0x7f2db6b74c00
2021-12-01 18:27:25.014 7f2da0802700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.63 v2:172.27.152.2:6800/1524185 3613549 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5268699, last_complete=32669'1603773, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc8201c00 con 0x7f2dbcd6b000
2021-12-01 18:27:25.014 7f2da1003700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.85 v2:172.27.152.3:6904/1502674 1727805 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5268699, last_complete=32669'1603773, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc8196e00 con 0x7f2dbba43c00

op2 is coming

2021-12-01 18:27:26.770 7f2da0001700 1 -- [v2:10.61.152.2:6888/1514953,v1:10.61.152.2:6889/1514953] <== client.64988315 10.61.152.103:0/2619559371 1947855 ==== osd_op(client.64988315.0:86212603 26.3as0 26.5227e3ba (undecoded) ondisk+write+known_if_redirected e32669) v8 ==== 229+0+109 (crc 0 0 0) 0x7f2dbbfa0580 con 0x7f2db6b74400
2021-12-01 18:27:26.771 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1604165 (32669'1602162,32669'1604165] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] new_repop rep_tid 5269125 on osd_op(client.64988315.0:86212603 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 347139~109] snapc 1=[] ondisk+write+known_if_redirected e32669) v8

op2 come and send shards to peers osd

2021-12-01 18:27:26.771 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.1:6840/617749,v1:172.27.152.1:6841/617749] -- MOSDECSubOpWrite(26.3as4 32669/32650 ECSubWrite(tid=5269125, reqid=client.64988315.0:86212603, at_version=32669'1604166, trim_to=32669'1602162, roll_forward_to=32669'1603420)) v2 -- 0x7f2db6f03000 con 0x7f2db6b74c00
2021-12-01 18:27:26.771 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dc61cb000 already has epoch 32669
2021-12-01 18:27:26.771 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.1:6852/1624095,v1:172.27.152.1:6854/1624095] -- MOSDECSubOpWrite(26.3as5 32669/32650 ECSubWrite(tid=5269125, reqid=client.64988315.0:86212603, at_version=32669'1604166, trim_to=32669'1602162, roll_forward_to=32669'1603420)) v2 -- 0x7f2db6f03800 con 0x7f2dc61cb000
2021-12-01 18:27:26.771 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dbcd6b000 already has epoch 32669
2021-12-01 18:27:26.771 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.2:6800/1524185,v1:172.27.152.2:6801/1524185] -- MOSDECSubOpWrite(26.3as1 32669/32650 ECSubWrite(tid=5269125, reqid=client.64988315.0:86212603, at_version=32669'1604166, trim_to=32669'1602162, roll_forward_to=32669'1603420)) v2 -- 0x7f2db6f04000 con 0x7f2dbcd6b000
2021-12-01 18:27:26.772 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dbe120000 already has epoch 32669
2021-12-01 18:27:26.772 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.3:6884/1510716,v1:172.27.152.3:6885/1510716] -- MOSDECSubOpWrite(26.3as3 32669/32650 ECSubWrite(tid=5269125, reqid=client.64988315.0:86212603, at_version=32669'1604166, trim_to=32669'1602162, roll_forward_to=32669'1603420)) v2 -- 0x7f2db6f04800 con 0x7f2dbe120000
2021-12-01 18:27:26.772 7f2d7e7be700 20 osd.58 32669 share_map_peer 0x7f2dbba43c00 already has epoch 32669
2021-12-01 18:27:26.772 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.3:6904/1502674,v1:172.27.152.3:6905/1502674] -- MOSDECSubOpWrite(26.3as2 32669/32650 ECSubWrite(tid=5269125, reqid=client.64988315.0:86212603, at_version=32669'1604166, trim_to=32669'1602162, roll_forward_to=32669'1603420)) v2 -- 0x7f2db6f05000 con 0x7f2dbba43c00

write pg log

2021-12-01 18:27:26.772 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1604165 (32669'1602162,32669'1604165] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] append_log log((32669'1602162,32669'1604165], crt=32669'1603420) [32669'1604166 (32669'1604165) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86212603 2021-12-01 18:26:38.849471 0]
2021-12-01 18:27:26.772 7f2d7e7be700 10 osd.58 pg_epoch: 32669 pg[26.3as0( v 32669'1604166 (32669'1602162,32669'1604166] local-lis/les=32650/32651 n=29426 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32640/32650/32357) [58,63,85,83,15,23]p58(0) r=0 lpr=32650 luod=32669'1603420 lua=32669'1604165 crt=32669'1603420 lcod 32669'1603419 mlcod 32669'1603419 active+clean] add_log_entry 32669'1604166 (32669'1604165) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86212603 2021-12-01 18:26:38.849471 0
2021-12-01 18:27:26.772 7f2d7e7be700 20 earliest_dup_version = 1601167
2021-12-01 18:27:26.772 7f2d7e7be700 20 trim 32669'1604166 (32669'1604165) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86212603 2021-12-01 18:26:38.849471 0

ECWrite reply callback except for osd.23

2021-12-01 18:27:26.772 7f2da0802700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.63 v2:172.27.152.2:6800/1524185 3614264 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5269125, last_complete=32669'1604166, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc5b7bc00 con 0x7f2dbcd6b000
2021-12-01 18:27:26.775 7f2d9e7fe700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.15 v2:172.27.152.1:6840/617749 1649067 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5269125, last_complete=32669'1604166, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2de34aa000 con 0x7f2db6b74c00
2021-12-01 18:27:26.776 7f2da0001700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.83 v2:172.27.152.3:6884/1510716 1728184 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5269125, last_complete=32669'1604166, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dd100f000 con 0x7f2dbe120000
2021-12-01 18:27:26.786 7f2da1003700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.85 v2:172.27.152.3:6904/1502674 1728198 ==== MOSDECSubOpWriteReply(26.3as0 32669/32650 ECSubWriteReply(tid=5269125, last_complete=32669'1604166, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc5c20c00 con 0x7f2dbba43c00

osd found pg 26.3a on_change

2021-12-01 18:29:19.111 7f2d7e7be700 10 osd.58 pg_epoch: 32670 pg[26.3as0( v 32669'1606846 (32669'1603420,32669'1606846] local-lis/les=32650/32651 n=29427 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 crt=32669'1603420 lcod 32669'1603419 mlcod 0'0 unknown mbc={}] on_change

2021-12-01 18:29:19.120 7f2d7e7be700 10 osd.58 pg_epoch: 32670 pg[26.3as0( v 32669'1606846 (32669'1603420,32669'1606846] local-lis/les=32650/32651 n=29427 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 crt=32669'1603420 lcod 32669'1603419 mlcod 0'0 unknown mbc={}] canceling repop tid 5268699
2021-12-01 18:29:19.120 7f2d7e7be700 10 osd.58 pg_epoch: 32670 pg[26.3as0( v 32669'1606846 (32669'1603420,32669'1606846] local-lis/les=32650/32651 n=29427 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 crt=32669'1603420 lcod 32669'1603419 mlcod 0'0 unknown mbc={}] requeuing osd_op(client.64988315.0:86211702 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 305430~180] snapc 1=[] ondisk+write+known_if_redirected e32669) v8

2021-12-01 18:29:19.128 7f2d7e7be700 10 osd.58 pg_epoch: 32670 pg[26.3as0( v 32669'1606846 (32669'1603420,32669'1606846] local-lis/les=32650/32651 n=29427 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 crt=32669'1603420 lcod 32669'1603419 mlcod 0'0 unknown mbc={}] canceling repop tid 5269125
2021-12-01 18:29:19.128 7f2d7e7be700 10 osd.58 pg_epoch: 32670 pg[26.3as0( v 32669'1606846 (32669'1603420,32669'1606846] local-lis/les=32650/32651 n=29427 ec=30881/30881 lis/c 32650/32650 les/c/f 32651/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 crt=32669'1603420 lcod 32669'1603419 mlcod 0'0 unknown mbc={}] requeuing osd_op(client.64988315.0:86212603 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 347139~109] snapc 1=[] ondisk+write+known_if_redirected e32669) v8

pg 26.3a start trimming pg log

2021-12-01 18:29:22.157 7f2d7e7be700 10 trim proposed trim_to = 32669'1604846
2021-12-01 18:29:22.157 7f2d7e7be700 10 trim missing = 0
2021-12-01 18:29:22.157 7f2d7e7be700 10 trim log((32669'1603420,32673'1606847], crt=32669'1606846) to 32669'1604846
2021-12-01 18:29:22.157 7f2d7e7be700 20 earliest_dup_version = 1603848

2021-12-01 18:29:22.160 7f2d7e7be700 20 trim 32669'1603773 (32669'1603772) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86211702 2021-12-01 18:26:37.076668 0
2021-12-01 18:29:22.160 7f2d7e7be700 20 trim 32669'1603848 (32669'1603847) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86211890 2021-12-01 18:26:37.518857 0
2021-12-01 18:29:22.160 7f2d7e7be700 20 updating write_from_dups from 4294967295'18446744073709551615 to 32669'1603848
2021-12-01 18:29:22.162 7f2d7e7be700 20 trim 32669'1604166 (32669'1604165) modify 26:5dc7e44a:::1000006fbd0.0000017d:head by client.64988315.0:86212603 2021-12-01 18:26:38.849471 0
...
2021-12-01 18:29:22.169 7f2d7e7be700 20 trim dup log_dup(reqid=client.65009583.0:85732558 v=32669'1602464 uv=1602464 rc=0)
2021-12-01 18:29:22.169 7f2d7e7be700 20 trim dup log_dup(reqid=client.64988315.0:86137811 v=32669'1602465 uv=1602465 rc=0)
2021-12-01 18:29:22.169 7f2d7e7be700 20 trim dup log_dup(reqid=client.65009583.0:85732593 v=32669'1602466 uv=1602466 rc=0)

OSD start retrying requeued op

2021-12-01 18:29:22.720 7f2d7e7be700 20 osd.58 op_wq(3) _process OpQueueItem(26.3as0 PGOpItem(op=osd_op(client.64988315.0:86211702 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 305430~180] snapc 1=[] ondisk+write+known_if_redirected e32669) v8) prio 63 cost 180 e32673) queued
2021-12-01 18:29:22.721 7f2d7e7be700 10 osd.58 pg_epoch: 32673 pg[26.3as0( v 32673'1606887 (32669'1604846,32673'1606887] local-lis/les=32670/32673 n=29427 ec=30881/30881 lis/c 32670/32650 les/c/f 32673/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 luod=32673'1606847 crt=32669'1606846 lcod 32669'1606846 mlcod 32669'1606846 active+undersized+degraded mbc={}] new_repop rep_tid 5272338 on osd_op(client.64988315.0:86211702 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 305430~180] snapc 1=[] ondisk+write+known_if_redirected e32669) v8
2021-12-01 18:29:23.534 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.1:6840/617749,v1:172.27.152.1:6841/617749] — MOSDECSubOpWrite(26.3as4 32673/32670 ECSubWrite(tid=5272338, reqid=client.64988315.0:86211702, at_version=32673'1607122, trim_to=32669'1604846, roll_forward_to=32673'1606847)) v2 -- 0x7f2deb878800 con 0x7f2db6b74c00
2021-12-01 18:29:23.534 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.2:6800/1524185,v1:172.27.152.2:6801/1524185] -- MOSDECSubOpWrite(26.3as1 32673/32670 ECSubWrite(tid=5272338, reqid=client.64988315.0:86211702, at_version=32673'1607122, trim_to=32669'1604846, roll_forward_to=32673'1606847)) v2 -- 0x7f2dc5ea1000 con 0x7f2dbcd6b000
2021-12-01 18:29:23.534 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.3:6884/1510716,v1:172.27.152.3:6885/1510716] -- MOSDECSubOpWrite(26.3as3 32673/32670 ECSubWrite(tid=5272338, reqid=client.64988315.0:86211702, at_version=32673'1607122, trim_to=32669'1604846, roll_forward_to=32673'1606847)) v2 -- 0x7f2dc1531000 con 0x7f2dbe120000
2021-12-01 18:29:23.534 7f2d7e7be700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] --> [v2:172.27.152.3:6904/1502674,v1:172.27.152.3:6905/1502674] -- MOSDECSubOpWrite(26.3as2 32673/32670 ECSubWrite(tid=5272338, reqid=client.64988315.0:86211702, at_version=32673'1607122, trim_to=32669'1604846, roll_forward_to=32673'1606847)) v2 -- 0x7f2dbc723000 con 0x7f2dbba43c00

return three ECWrite reply

2021-12-01 18:29:23.542 7f2d9e7fe700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.15 v2:172.27.152.1:6840/617749 1652028 ==== MOSDECSubOpWriteReply(26.3as0 32673/32670 ECSubWriteReply(tid=5272338, last_complete=32673'1607122, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc883c400 con 0x7f2db6b74c00
2021-12-01 18:29:23.552 7f2da1003700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.85 v2:172.27.152.3:6904/1502674 1731194 ==== MOSDECSubOpWriteReply(26.3as0 32673/32670 ECSubWriteReply(tid=5272338, last_complete=32673'1607122, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dc1c61400 con 0x7f2dbba43c00
2021-12-01 18:29:24.269 7f2da0001700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.83 v2:172.27.152.3:6884/1510716 1731180 ==== MOSDECSubOpWriteReply(26.3as0 32673/32670 ECSubWriteReply(tid=5272338, last_complete=32673'1607122, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2db698b800 con 0x7f2dbe120000

star retrying op2

2021-12-01 18:29:24.768 7f2d7e7be700 20 osd.58 op_wq(3) _process OpQueueItem(26.3as0 PGOpItem(op=osd_op(client.64988315.0:86212603 26.3as0 26:5dc7e44a:::1000006fbd0.0000017d:head [write 347139~109] snapc 1=[] ondisk+write+known_if_redirected e32669) v8) prio 63 cost 109 e32673) queued

2021-12-01 18:29:24.768 7f2d7e7be700 3 osd.58 pg_epoch: 32673 pg[26.3as0( v 32673'1607171 (32669'1605164,32673'1607171] local-lis/les=32670/32673 n=29427 ec=30881/30881 lis/c 32670/32650 les/c/f 32673/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 luod=32673'1607121 crt=32673'1607089 lcod 32673'1606887 mlcod 32673'1606887 active+undersized+degraded mbc={}] do_op dup client.64988315.0:86212603 version 32669’1604166
2021-12-01 18:29:24.768 7f2d7e7be700 20 osd.58 pg_epoch: 32673 pg[26.3as0( v 32673'1607171 (32669'1605164,32673'1607171] local-lis/les=32670/32673 n=29427 ec=30881/30881 lis/c 32670/32650 les/c/f 32673/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 luod=32673'1607121 crt=32673'1607089 lcod 32673'1606887 mlcod 32673'1606887 active+undersized+degraded mbc={}] already_complete: repgather(0x7f2db85fca20 32673'1607122 rep_tid=5272338 committed?=0 r=0)
2021-12-01 18:29:24.768 7f2d7e7be700 20 osd.58 pg_epoch: 32673 pg[26.3as0( v 32673'1607171 (32669'1605164,32673'1607171] local-lis/les=32670/32673 n=29427 ec=30881/30881 lis/c 32670/32650 les/c/f 32673/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 luod=32673'1607121 crt=32673'1607089 lcod 32673'1606887 mlcod 32673'1606887 active+undersized+degraded mbc={}] already_complete: repgather(0x7f2db85fca20 32673'1607122 rep_tid=5272338 committed?=0 r=0) (i)->v past v
2021-12-01 18:29:24.768 7f2d7e7be700 20 osd.58 pg_epoch: 32673 pg[26.3as0( v 32673'1607171 (32669'1605164,32673'1607171] local-lis/les=32670/32673 n=29427 ec=30881/30881 lis/c 32670/32650 les/c/f 32673/32651/0 32670/32670/32357) [58,63,85,83,15,2147483647]p58(0) r=0 lpr=32670 pi=[32650,32670)/1 luod=32673'1607121 crt=32673'1607089 lcod 32673'1606887 mlcod 32673'1606887 active+undersized+degraded mbc={}] *already_complete: returning true

2021-12-01 18:29:24.768 7f2d7e7be700 1 -- [v2:10.61.152.2:6888/1514953,v1:10.61.152.2:6889/1514953] --> 10.61.152.103:0/2619559371 -- osd_op_reply(86212603 1000006fbd0.0000017d [write 347139~109] v32669'1604166 uv1604166 ondisk = 0) v8 -- 0x7f2dba165400 con 0x7f2db6b74400

the last reply come back op1 reply to client

2021-12-01 18:29:24.877 7f2da0802700 1 -- [v2:172.27.152.2:6888/1514953,v1:172.27.152.2:6889/1514953] <== osd.63 v2:172.27.152.2:6800/1524185 3620315 ==== MOSDECSubOpWriteReply(26.3as0 32673/32670 ECSubWriteReply(tid=5272338, last_complete=32673'1607122, committed=1, applied=1)) v2 ==== 95+0+0 (crc 0 0 0) 0x7f2dbddb5c00 con 0x7f2dbcd6b000
2021-12-01 18:29:24.878 7f2d7e7be700 1 -- [v2:10.61.152.2:6888/1514953,v1:10.61.152.2:6889/1514953] --> 10.61.152.103:0/2619559371 -- osd_op_reply(86211702 1000006fbd0.0000017d [write 305430~180] v32673'1607122 uv1607122 ondisk = 0) v8 -- 0x7f2de1b1e280 con 0x7f2db6b74400

Client received op2(tid=86212603) reply earlier than then op1(tid=86211702), so assert fail. After studying the code, i found that this problem was not only limited to
write the same object, but also with writes between multiple objects. So i think this risk is a very serious bug and we should pay more attention to it.
What do you think about it?

Also available in: Atom PDF