Project

General

Profile

Actions

Bug #47031

closed

out of order i/o

Added by Kefu Chai over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2020-08-19T03:54:58.106 INFO:tasks.rados.rados.0.smithi179.stdout:299:  writing smithi17920641-299 from 2950640 to 3008486 tid 3
2020-08-19T03:54:58.107 INFO:tasks.rados.rados.0.smithi179.stdout: waiting on 64
2020-08-19T03:54:58.107 INFO:tasks.rados.rados.0.smithi179.stdout:219:  finishing write tid 2 to smithi17920641-219
2020-08-19T03:54:58.107 INFO:tasks.rados.rados.0.smithi179.stderr:Error: finished tid 2 when last_acked_tid was 6
2020-08-19T03:54:58.107 INFO:tasks.rados.rados.0.smithi179.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/
gigantic/release/16.0.0-4462-gf2683dd/rpm/el8/BUILD/ceph-16.0.0-4462-gf2683dd/src/test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f6c98bdc700 time 2020-08
-19T03:54:58.105614+0000
2020-08-19T03:54:58.108 INFO:tasks.rados.rados.0.smithi179.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/
gigantic/release/16.0.0-4462-gf2683dd/rpm/el8/BUILD/ceph-16.0.0-4462-gf2683dd/src/test/osd/RadosModel.h: 907: ceph_abort_msg("abort() called")
2020-08-19T03:54:58.108 INFO:tasks.rados.rados.0.smithi179.stderr: ceph version 16.0.0-4462-gf2683dd (f2683ddba8c322fab6b600e61d2da10a3e9d7eb9) pacific (dev)
2020-08-19T03:54:58.108 INFO:tasks.rados.rados.0.smithi179.stderr: 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > con
st&)+0xe0) [0x7f6ca0534ce0]
2020-08-19T03:54:58.109 INFO:tasks.rados.rados.0.smithi179.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x67d) [0x563f8f322aed]
2020-08-19T03:54:58.109 INFO:tasks.rados.rados.0.smithi179.stderr: 3: (write_callback(void*, void*)+0x1e) [0x563f8f344bae]
2020-08-19T03:54:58.109 INFO:tasks.rados.rados.0.smithi179.stderr: 4: (()+0xb0d67) [0x7f6ca9b3ad67]
2020-08-19T03:54:58.110 INFO:tasks.rados.rados.0.smithi179.stderr: 5: (()+0xcf6b5) [0x7f6ca9b596b5]
2020-08-19T03:54:58.110 INFO:tasks.rados.rados.0.smithi179.stderr: 6: (()+0xcf022) [0x7f6ca9b59022]
2020-08-19T03:54:58.110 INFO:tasks.rados.rados.0.smithi179.stderr: 7: (()+0xd520a) [0x7f6ca9b5f20a]
2020-08-19T03:54:58.111 INFO:tasks.rados.rados.0.smithi179.stderr: 8: (()+0xc2b23) [0x7f6c9ee43b23]
2020-08-19T03:54:58.111 INFO:tasks.rados.rados.0.smithi179.stderr: 9: (()+0x82de) [0x7f6c9f98d2de]
2020-08-19T03:54:58.111 INFO:tasks.rados.rados.0.smithi179.stderr: 10: (clone()+0x43) [0x7f6c9e520133]
2020-08-19T03:54:58.283 DEBUG:teuthology.orchestra.run:got remote process result: None

it seems we also have this issue in classic OSD. for instance:

/a/kchai-2020-08-19_03:43:16-crimson-rados-wip-kefu2-testing-2020-08-19-1026-distro-basic-smithi/5358625/
/a/kchai-2020-08-18_08:24:46-crimson-rados-master-distro-basic-smithi/5356123/
/a/kchai-2020-08-16_14:37:34-crimson-rados-master-distro-basic-smithi/5349929

Actions #1

Updated by Kefu Chai over 3 years ago

/a/kchai-2020-08-22_03:33:19-crimson-rados:thrash-wip-crimson-more-ops-kefu-distro-basic-smithi/5365998

2020-08-22T03:49:33.852 INFO:tasks.rados.rados.0.smithi183.stdout:118:  writing smithi18327214-118 from 766419 to 1315847 tid 1
2020-08-22T03:49:33.858 INFO:tasks.rados.rados.0.smithi183.stdout:118:  writing smithi18327214-118 from 2013452 to 2588147 tid 2
2020-08-22T03:49:33.865 INFO:tasks.rados.rados.0.smithi183.stdout:118:  writing smithi18327214-118 from 3346843 to 3774492 tid 3

//// header = 5

//// read = 6

2020-08-22T03:49:34.613 INFO:tasks.rados.rados.0.smithi183.stdout:118:  finishing write tid 1 to smithi18327214-118
2020-08-22T03:49:34.619 INFO:tasks.rados.rados.0.smithi183.stdout:118:  finishing write tid 2 to smithi18327214-118
2020-08-22T03:49:34.623 INFO:tasks.rados.rados.0.smithi183.stdout:118:  finishing write tid 3 to smithi18327214-118
2020-08-22T03:49:34.624 INFO:tasks.rados.rados.0.smithi183.stdout:118:  finishing write tid 6 to smithi18327214-118
2020-08-22T03:49:35.820 INFO:tasks.rados.rados.0.smithi183.stdout:118:  finishing write tid 5 to smithi18327214-118
$ zgrep 'smithi18327214-118:head sending reply' remote/*/log/ceph-osd.*.log.gz

remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:33,953 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:528 3.9 3:9240fc16:::smithi18327214-118:head {write 766419~549428 in=549428b, stat out=16b} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:33,971 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:529 3.9 3:9240fc16:::smithi18327214-118:head {write 2013452~574695 in=574695b} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,014 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:530 3.9 3:9240fc16:::smithi18327214-118:head {write 3346843~427649 in=427649b} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,023 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:532 3.9 3:9240fc16:::smithi18327214-118:head {read 0~1 out=1b} snapc 0={} ondisk+read+rwordered+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
remote/smithi183/log/ceph-osd.1.log.gz:DEBUG 2020-08-22 03:49:35,064 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected e18) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
$ zgrep 'do_osd_ops' remote/smithi183/log/ceph-osd.*.log.gz | grep smithi18327214-118

remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:33,972 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:530 3.9 3:9240fc16:::smithi18327214-118:head {write 3346843~427649 in=427649b} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head all operations successful
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:33,972 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:530 3.9 3:9240fc16:::smithi18327214-118:head {write 3346843~427649 in=427649b} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head submitting txn
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,014 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:530 3.9 3:9240fc16:::smithi18327214-118:head {write 3346843~427649 in=427649b} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,014 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr _header (57) in=64b, truncate 3966419} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head - handling op setxattr
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,014 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head - handling op truncate
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,014 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head all operations successful
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,014 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} ondisk+write+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head submitting txn
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,023 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:532 3.9 3:9240fc16:::smithi18327214-118:head {read 0~1} snapc 0={} ondisk+read+rwordered+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head - handling op read
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,023 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:532 3.9 3:9240fc16:::smithi18327214-118:head {read 0~1 out=1b} snapc 0={} ondisk+read+rwordered+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head all operations successful
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,023 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:532 3.9 3:9240fc16:::smithi18327214-118:head {read 0~1 out=1b} snapc 0={} ondisk+read+rwordered+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head txn is empty, bypassing mutate
remote/smithi183/log/ceph-osd.0.log.gz:DEBUG 2020-08-22 03:49:34,023 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:532 3.9 3:9240fc16:::smithi18327214-118:head {read 0~1 out=1b} snapc 0={} ondisk+read+rwordered+known_if_redirected e16) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
remote/smithi183/log/ceph-osd.1.log.gz:DEBUG 2020-08-22 03:49:35,015 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr _header (57) in=64b, truncate 3966419} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected e18) v8 - object 3:9240fc16:::smithi18327214-118:head - handling op setxattr
remote/smithi183/log/ceph-osd.1.log.gz:DEBUG 2020-08-22 03:49:35,015 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected e18) v8 - object 3:9240fc16:::smithi18327214-118:head - handling op truncate
remote/smithi183/log/ceph-osd.1.log.gz:DEBUG 2020-08-22 03:49:35,015 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected e18) v8 - object 3:9240fc16:::smithi18327214-118:head all operations successful
remote/smithi183/log/ceph-osd.1.log.gz:DEBUG 2020-08-22 03:49:35,015 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected e18) v8 - object 3:9240fc16:::smithi18327214-118:head submitting txn
remote/smithi183/log/ceph-osd.1.log.gz:DEBUG 2020-08-22 03:49:35,064 [shard 0] osd - do_osd_ops: osd_op(client.4228.0:531 3.9 3:9240fc16:::smithi18327214-118:head {setxattr (57), truncate 3966419} snapc 0={} RETRY=1 ondisk+retry+write+known_if_redirected e18) v8 - object 3:9240fc16:::smithi18327214-118:head sending reply
Actions #2

Updated by Kefu Chai over 3 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Kefu Chai
  • Pull request ID set to 36787
Actions #3

Updated by Kefu Chai over 3 years ago

  • Status changed from Fix Under Review to Resolved
Actions

Also available in: Atom PDF