
Bug #49607

qa: slow metadata ops during scrubbing

Added by Patrick Donnelly about 2 months ago. Updated 19 days ago.

Status:
Resolved
Priority:
High
Category:
-
Target version:
% Done:
0%

Source:
Q/A
Tags:
Backport:
pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
qa-suite
Labels (FS):
qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-03-04T15:35:33.462 INFO:teuthology.orchestra.run.smithi023.stdout:{
2021-03-04T15:35:33.462 INFO:teuthology.orchestra.run.smithi023.stdout:    "status": "scrub active (91 inodes in the stack)",
2021-03-04T15:35:33.462 INFO:teuthology.orchestra.run.smithi023.stdout:    "scrubs": {
2021-03-04T15:35:33.463 INFO:teuthology.orchestra.run.smithi023.stdout:        "23cc1ac4-cbd1-462e-8c86-edb3d0e8ceb6": {
2021-03-04T15:35:33.463 INFO:teuthology.orchestra.run.smithi023.stdout:            "path": "/",
2021-03-04T15:35:33.463 INFO:teuthology.orchestra.run.smithi023.stdout:            "tag": "23cc1ac4-cbd1-462e-8c86-edb3d0e8ceb6",
2021-03-04T15:35:33.463 INFO:teuthology.orchestra.run.smithi023.stdout:            "options": "recursive,force" 
2021-03-04T15:35:33.464 INFO:teuthology.orchestra.run.smithi023.stdout:        }
2021-03-04T15:35:33.464 INFO:teuthology.orchestra.run.smithi023.stdout:    }
2021-03-04T15:35:33.464 INFO:teuthology.orchestra.run.smithi023.stdout:}
2021-03-04T15:35:33.465 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 -- 172.21.15.23:0/193355435 >> [v2:172.21.15.47:6800/37064,v1:172.21.15.47:6801/37064] conn(0x7fec440851c0 msgr2=0x7fec44087680 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2021-03-04T15:35:33.465 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 --2- 172.21.15.23:0/193355435 >> [v2:172.21.15.47:6800/37064,v1:172.21.15.47:6801/37064] conn(0x7fec440851c0 0x7fec44087680 secure :-1 s=READY pgs=167 cs=0 l=1 rev1=1 rx=0x7fec58000bc0 tx=0x7fec58007110).stop
2021-03-04T15:35:33.465 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 -- 172.21.15.23:0/193355435 >> [v2:172.21.15.71:3300/0,v1:172.21.15.71:6789/0] conn(0x7fec50004b40 msgr2=0x7fec64172d00 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2021-03-04T15:35:33.466 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 --2- 172.21.15.23:0/193355435 >> [v2:172.21.15.71:3300/0,v1:172.21.15.71:6789/0] conn(0x7fec50004b40 0x7fec64172d00 secure :-1 s=READY pgs=244 cs=0 l=1 rev1=1 rx=0x7fec5400c980 tx=0x7fec54015000).stop
2021-03-04T15:35:33.466 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 -- 172.21.15.23:0/193355435 shutdown_connections
2021-03-04T15:35:33.466 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 --2- 172.21.15.23:0/193355435 >> [v2:172.21.15.47:6800/37064,v1:172.21.15.47:6801/37064] conn(0x7fec440851c0 0x7fec44087680 unknown :-1 s=CLOSED pgs=167 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-04T15:35:33.466 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 --2- 172.21.15.23:0/193355435 >> [v2:172.21.15.47:3300/0,v1:172.21.15.47:6789/0] conn(0x7fec640811a0 0x7fec64081620 unknown :-1 s=CLOSED pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-04T15:35:33.467 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 --2- 172.21.15.23:0/193355435 >> [v2:172.21.15.71:3300/0,v1:172.21.15.71:6789/0] conn(0x7fec50004b40 0x7fec64172d00 unknown :-1 s=CLOSED pgs=244 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-04T15:35:33.467 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 --2- 172.21.15.23:0/193355435 >> [v2:172.21.15.23:3300/0,v1:172.21.15.23:6789/0] conn(0x7fec640aac70 0x7fec640a8880 unknown :-1 s=CLOSED pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-04T15:35:33.467 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 -- 172.21.15.23:0/193355435 >> 172.21.15.23:0/193355435 conn(0x7fec640cc2f0 msgr2=0x7fec640cc6d0 unknown :-1 s=STATE_NONE l=0).mark_down
2021-03-04T15:35:33.467 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 -- 172.21.15.23:0/193355435 shutdown_connections
2021-03-04T15:35:33.468 INFO:teuthology.orchestra.run.smithi023.stderr:2021-03-04T15:35:33.445+0000 7fec693ca700  1 -- 172.21.15.23:0/193355435 wait complete.
2021-03-04T15:35:33.475 INFO:tasks.cephfs.filesystem:scrub status for tag:23cc1ac4-cbd1-462e-8c86-edb3d0e8ceb6 - {'path': '/', 'tag': '23cc1ac4-cbd1-462e-8c86-edb3d0e8ceb6', 'options': 'recursive,force'}
2021-03-04T15:35:33.759 INFO:tasks.ceph.osd.2.smithi071.stderr:2021-03-04T15:35:33.731+0000 7f89a43e8700 -1 osd.2 31 get_health_metrics reporting 83 slow ops, oldest is osd_op(client.54731.0:32130 5.1s0 5:88882c78:::100000003b4.00000004:head [write 540672~4096 in=4096b,write 548864~4096 in=4096b,write 565248~4096 in=4096b,write 577536~20480 in=20480b,write 618496~8192 in=8192b,write 634880~4096 in=4096b,write 643072~4096 in=4096b,write 651264~16384 in=16384b,write 684032~8192 in=8192b,write 696320~4096 in=4096b,write 704512~8192 in=8192b,write 716800~4096 in=4096b,write 745472~4096 in=4096b,write 770048~4096 in=4096b,write 778240~4096 in=4096b,write 798720~4096 in=4096b] snapc 1=[] ondisk+write+known_if_redirected e31)
2021-03-04T15:35:33.805 INFO:tasks.ceph.osd.7.smithi047.stderr:2021-03-04T15:35:33.747+0000 7f88d3848700 -1 osd.7 31 get_health_metrics reporting 45 slow ops, oldest is osd_op(client.54731.0:30506 5.5s0 5:bd7af79a:::1000000039b.00000014:head [write 1908736~4096 in=4096b,write 1925120~4096 in=4096b,write 1937408~4096 in=4096b,write 1945600~8192 in=8192b,write 1961984~4096 in=4096b,write 2011136~20480 in=20480b,write 2080768~4096 in=4096b,write 2093056~4096 in=4096b,write 2109440~8192 in=8192b,write 2138112~4096 in=4096b,write 2154496~4096 in=4096b,write 2166784~4096 in=4096b,write 2203648~4096 in=4096b,write 2224128~4096 in=4096b,write 2232320~8192 in=8192b,write 2260992~4096 in=4096b] snapc 1=[] ondisk+write+known_if_redirected e31)
2021-03-04T15:35:34.204 INFO:tasks.ceph.osd.5.smithi071.stderr:2021-03-04T15:35:34.175+0000 7f3df2564700 -1 osd.5 31 get_health_metrics reporting 70 slow ops, oldest is osd_op(client.54731.0:29637 5.7s0 5:f086d436:::100000003bc.0000000d:head [write 2326528~4096 in=4096b,write 2334720~8192 in=8192b,write 2359296~4096 in=4096b,write 2367488~4096 in=4096b,write 2396160~4096 in=4096b,write 2420736~4096 in=4096b,write 2449408~4096 in=4096b,write 2469888~4096 in=4096b,write 2494464~4096 in=4096b,write 2514944~12288 in=12288b,write 2539520~4096 in=4096b,write 2551808~8192 in=8192b,write 2572288~4096 in=4096b,write 2580480~4096 in=4096b,write 2592768~4096 in=4096b,write 2617344~4096 in=4096b] snapc 1=[] ondisk+write+known_if_redirected e31)
2021-03-04T15:35:34.428 INFO:tasks.ceph.osd.8.smithi071.stderr:2021-03-04T15:35:34.387+0000 7f5c00196700 -1 osd.8 31 get_health_metrics reporting 33 slow ops, oldest is osd_op(client.54731.0:42972 5.2s0 5:4fa52d1d:::10000000366.00000003:head [write 3985408~4096 in=4096b,write 4005888~4096 in=4096b,write 4018176~4096 in=4096b,write 4026368~4096 in=4096b,write 4042752~4096 in=4096b,write 4050944~4096 in=4096b,write 4059136~4096 in=4096b,write 4071424~4096 in=4096b,write 4079616~4096 in=4096b,write 4104192~16384 in=16384b,write 4128768~4096 in=4096b,write 4136960~8192 in=8192b,write 4157440~4096 in=4096b,write 4165632~4096 in=4096b,write 4173824~4096 in=4096b,write 4186112~4096 in=4096b] snapc 1=[] ondisk+write+known_if_redirected e31)
2021-03-04T15:35:34.753 INFO:tasks.ceph.osd.2.smithi071.stderr:2021-03-04T15:35:34.691+0000 7f89a43e8700 -1 osd.2 31 get_health_metrics reporting 82 slow ops, oldest is osd_op(client.54731.0:32171 5.1s0 5:8057a0f2:::100000003b4.00000006:head [write 2510848~4096 in=4096b,write 2527232~8192 in=8192b,write 2539520~8192 in=8192b,write 2551808~8192 in=8192b,write 2564096~12288 in=12288b,write 2580480~28672 in=28672b,write 2613248~4096 in=4096b,write 2621440~12288 in=12288b,write 2637824~8192 in=8192b,write 2666496~4096 in=4096b,write 2678784~12288 in=12288b,write 2703360~4096 in=4096b,write 2719744~4096 in=4096b,write 2740224~24576 in=24576b,write 2772992~4096 in=4096b,write 2785280~28672 in=28672b] snapc 1=[] ondisk+write+known_if_redirected e31)
...
2021-03-04T15:40:19.132 INFO:teuthology.orchestra.run.smithi023.stdout:2021-03-04T15:36:00.796173+0000 mon.a (mon.0) 255 : cluster [WRN] Health check failed: 1 MDSs report slow metadata IOs (MDS_SLOW_METADATA_IO)

From: /ceph/teuthology-archive/pdonnell-2021-03-04_03:51:01-fs-wip-pdonnell-testing-20210303.195715-distro-basic-smithi/5932303/teuthology.log
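The qa suite polls `ceph tell mds.<id> scrub status` and waits for the scrub to drain, as seen in the `tasks.cephfs.filesystem` log line above. A minimal sketch of such a check, parsing the JSON emitted in the excerpt; the `scrub_is_idle` helper is hypothetical, and the assumption that a finished scrub leaves an empty `"scrubs"` map is noted in the comments:

```python
import json

# Scrub status JSON as printed by `ceph tell mds.<id> scrub status`
# (copied from the log excerpt above).
status_json = """
{
    "status": "scrub active (91 inodes in the stack)",
    "scrubs": {
        "23cc1ac4-cbd1-462e-8c86-edb3d0e8ceb6": {
            "path": "/",
            "tag": "23cc1ac4-cbd1-462e-8c86-edb3d0e8ceb6",
            "options": "recursive,force"
        }
    }
}
"""

def scrub_is_idle(status: dict) -> bool:
    # Hypothetical helper: assume the MDS reports an empty "scrubs"
    # map once all scrub stacks have drained.
    return not status.get("scrubs")

status = json.loads(status_json)
print(status["status"])       # "scrub active (91 inodes in the stack)"
print(scrub_is_idle(status))  # False: one tagged scrub is still running
```

A qa loop would repeat this query with a timeout; the failure here is that the scrub (and the client metadata ops behind it) stalled long enough to trip the MDS_SLOW_METADATA_IO health warning.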


Related issues

Copied to CephFS - Backport #49630: pacific: qa: slow metadata ops during scrubbing Resolved

History

#1 Updated by Patrick Donnelly about 2 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 39841

#2 Updated by Patrick Donnelly about 2 months ago

  • Status changed from Fix Under Review to Pending Backport

#3 Updated by Backport Bot about 2 months ago

  • Copied to Backport #49630: pacific: qa: slow metadata ops during scrubbing added

#4 Updated by Loïc Dachary 19 days ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
