Project

General

Profile

Actions

Bug #50208

open

[ FAILED ] CephSQLiteTest.InsertBulk4096 [with slow ops during this test with addtional dispatch delays]

Added by Deepika Upadhyay about 3 years ago. Updated about 3 years ago.

Status:
New
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-04-06T15:02:12.602 INFO:tasks.ceph.osd.0.smithi016.stderr:2021-04-06T15:02:12.600+0000 7f6791369700 -1 osd.0 27 get_health_metrics reporting 29 slow ops, oldest is osd_op(client.4419.0:2172 3.17 3.aefb40d7 (undecoded) ondisk+retry+write+known_if_redirected e24)
2021-04-06T15:02:13.610 INFO:tasks.ceph.osd.0.smithi016.stderr:2021-04-06T15:02:13.608+0000 7f6791369700 -1 osd.0 27 get_health_metrics reporting 28 slow ops, oldest is osd_op(client.4419.0:1212 3.1f 3.c717541f (undecoded) ondisk+retry+write+known_if_redirected e27)
2021-04-06T15:02:14.191 INFO:teuthology.orchestra.run.smithi016.stdout:[/build/ceph-17.0.0-2790-g80aaef5b/src/test/libcephsqlite/main.cc:188] sqlite3 error: 10 `disk I/O error': disk I/O error
2021-04-06T15:02:14.192 INFO:teuthology.orchestra.run.smithi016.stdout:/build/ceph-17.0.0-2790-g80aaef5b/src/test/libcephsqlite/main.cc:199: Failure
2021-04-06T15:02:14.192 INFO:teuthology.orchestra.run.smithi016.stdout:Expected equality of these values:
2021-04-06T15:02:14.192 INFO:teuthology.orchestra.run.smithi016.stdout:  0
2021-04-06T15:02:14.193 INFO:teuthology.orchestra.run.smithi016.stdout:  rc
2021-04-06T15:02:14.193 INFO:teuthology.orchestra.run.smithi016.stdout:    Which is: 10
2021-04-06T15:02:14.195 INFO:teuthology.orchestra.run.smithi016.stdout:[  FAILED  ] CephSQLiteTest.InsertBulk4096 (315826 ms)
2021-04-06T15:02:14.195 INFO:teuthology.orchestra.run.smithi016.stdout:[ RUN      ] CephSQLiteTest.InsertBulk

/ceph/teuthology-archive/ideepika-2021-04-06_14:26:42-rados-wip-deepika-testing-2021-04-05-0643-distro-basic-smithi/6024579/teuthology.log
Actions #1

Updated by Neha Ojha about 3 years ago

  • Assignee set to Patrick Donnelly

Patrick, this test was using extra delays (https://github.com/ceph/ceph/pull/40593/files#diff-81a3b5f9694ef27ed306f04dcef9e8fc5fd0493ec139b8c23677d0c4ecbc135eR9-R10), should we expect an I/O error though?

Actions #2

Updated by Patrick Donnelly about 3 years ago

2021-04-06T14:57:02.588+0000 7f1da0ff9700 20 client.4419: SimpleRADOSStriper: lock_keeper_main: 74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db: tick
2021-04-06T14:57:02.588+0000 7f1da0ff9700 10 client.4419: SimpleRADOSStriper: lock_keeper_main: 74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db: renewing lock
...
2021-04-06T14:58:31.020+0000 7f1da0ff9700  1 -- 172.21.15.16:0/3563114407 --> v1:172.21.15.122:6807/17226 -- osd_op(unknown.0.0:1088 3.14 3:2e8bbbe6:::74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db.0000000000000000:head [call lock.lock in=106b] snapc 0=[] ondisk+write+known_if_redirected e24) v8 -- 0x7f1d84004800 con 0x5565961e96d0
...
2021-04-06T14:59:14.056+0000 7f1dc1186700  1 -- 172.21.15.16:0/3563114407 <== osd.4 v1:172.21.15.122:6807/17226 1025 ==== osd_op_reply(1088 74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db.0000000000000000 [call] v24'1035 uv1034 ondisk = -2 ((2) No such file or directory)) v8 ==== 200+0+0 (unknown 1333430253 0 0) 0x7f1dac06bc40 con 0x5565961e96d0
2021-04-06T14:59:14.056+0000 7f1da0ff9700 -1 client.4419: SimpleRADOSStriper: lock_keeper_main: 74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db: lock renewal failed: (2) No such file or directory
2021-04-06T14:59:14.056+0000 7f1dc873a440  1 -- 172.21.15.16:0/3563114407 --> v1:172.21.15.16:6808/17196 -- osd_op(unknown.0.0:2190 3.17 3:eb02df75:::74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db.0000000000000002:head [write 319488~4096 in=4096b] snapc 0=[] ondisk+write+known_if_redirected e24) v8 -- 0x556597814770 con 0x5565961dad60
2021-04-06T14:59:14.056+0000 7f1dc873a440 10 client.4419: SimpleRADOSStriper: write: 74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db:  dirty size: 8712192
2021-04-06T14:59:14.056+0000 7f1dc873a440  5 cephsqlite: Write: (client.4419) [cephsqlite:/74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db] = 4096
2021-04-06T14:59:14.056+0000 7f1dc873a440  5 cephsqlite: Write: (client.4419) [cephsqlite:/74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db] 8712192~4096
2021-04-06T14:59:14.056+0000 7f1dc873a440  5 client.4419: SimpleRADOSStriper: write: 74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db: 8712192~4096
2021-04-06T14:59:14.056+0000 7f1dc873a440  5 cephsqlite: Write: (client.4419) [cephsqlite:/74c80a7a-d7d1-45ec-adbe-12ec44c1a806.db] write failed: (108) Cannot send after transport endpoint shutdown

Probably cephsqlite_lock_renewal_timeout will need increased to ~300 seconds for this configuration.

Actions #3

Updated by Patrick Donnelly about 3 years ago

  • Assignee changed from Patrick Donnelly to Deepika Upadhyay
Actions

Also available in: Atom PDF