Bug #36182

closed

osd: hung op "osd.3 22 get_health_metrics reporting 2 slow ops, oldest is osd_op(mds.0.6:55075 3.7s0 3:edaf1c25:::1000000129e.00000010:head [trimtrunc 850@0] snapc 1=[] ondisk+write+known_if_redirected+full_force e22)"

Added by Patrick Donnelly over 5 years ago. Updated over 5 years ago.

Status: Resolved
Priority: High
Assignee:
Category: EC Pools
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(RADOS): EC plugins, OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

From: http://pulpito.ceph.com/pdonnell-2018-09-25_01:23:37-fs-wip-pdonnell-testing-20180924.230702-distro-basic-smithi/3066511/

MDS stuck waiting for OSD reply on $subject op.

Neha pulled the OSD logs from the cluster.
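For anyone grepping the OSD logs for this op, the slow-op line in the subject can be split into its fields. This is a minimal sketch in Python, not a Ceph tool: the regex and the name `parse_slow_op` are hypothetical and fitted only to the one line in this ticket's subject. The field meanings in the comments (the log-prefix epoch, the `sN` shard suffix on the PG id) are assumptions about the log format.

```python
import re

# Subject line of this ticket, copied verbatim.
LINE = ('osd.3 22 get_health_metrics reporting 2 slow ops, oldest is '
        'osd_op(mds.0.6:55075 3.7s0 3:edaf1c25:::1000000129e.00000010:head '
        '[trimtrunc 850@0] snapc 1=[] '
        'ondisk+write+known_if_redirected+full_force e22)')

# Hypothetical pattern, fitted only to the line above; other op types
# may print differently.
SLOW_OP_RE = re.compile(
    r'osd\.(?P<osd>\d+) (?P<log_epoch>\d+) get_health_metrics reporting '
    r'(?P<count>\d+) slow ops, oldest is '
    r'osd_op\((?P<reqid>\S+) (?P<pgid>\S+) (?P<object>\S+) '
    r'\[(?P<ops>[^\]]*)\] snapc (?P<snapc>\S+) '
    r'(?P<flags>\S+) e(?P<op_epoch>\d+)\)'
)


def parse_slow_op(line):
    """Return the fields of a slow-op report as a dict, or None if no match."""
    m = SLOW_OP_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the purely numeric fields.
    for key in ('osd', 'log_epoch', 'count', 'op_epoch'):
        fields[key] = int(fields[key])
    # Flags are '+'-separated in the log line.
    fields['flags'] = fields['flags'].split('+')
    # Assumption: an 's<N>' suffix on the PG id marks an EC shard.
    fields['ec_shard'] = re.fullmatch(r'\d+\.[0-9a-f]+s\d+', fields['pgid']) is not None
    return fields


if __name__ == '__main__':
    for key, value in parse_slow_op(LINE).items():
        print(f'{key}: {value}')
```

Feeding each line of an OSD log through `parse_slow_op` and keeping the non-`None` results gives a quick list of the requesters and PGs that report slow ops.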

Actions #1

Updated by Patrick Donnelly over 5 years ago

  • Description updated

Actions #2

Updated by Neha Ojha over 5 years ago

Logs in /a/pdonnell-2018-09-25_01:23:37-fs-wip-pdonnell-testing-20180924.230702-distro-basic-smithi/3066511/remote/log

Actions #3

Updated by Patrick Donnelly over 5 years ago

Another with full logs (no cores):

/ceph/teuthology-archive/pdonnell-2018-10-01_03:14:44-fs-wip-pdonnell-testing-20181001.011252-distro-basic-smithi/3089807/teuthology.log

Actions #4

Updated by Patrick Donnelly over 5 years ago

Another set:

Timeout 3h running clone.client.0/qa/workunits/suites/fsx.sh
3 jobs: ['3105708', '3105843', '3106036']
suites intersection: ['clusters/fixed-2-ucephfs.yaml', 'conf/{client.yaml', 'fs/basic_workload/{begin.yaml', 'mds.yaml', 'mon.yaml', 'mount/fuse.yaml', 'osd.yaml}', 'overrides/{frag_enable.yaml', 'tasks/cfuse_workunit_suites_fsx.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']
suites union: ['clusters/fixed-2-ucephfs.yaml', 'conf/{client.yaml', 'fs/basic_workload/{begin.yaml', 'inline/no.yaml', 'inline/yes.yaml', 'mds.yaml', 'mon.yaml', 'mount/fuse.yaml', 'objectstore-ec/bluestore-comp-ec-root.yaml', 'objectstore-ec/bluestore-ec-root.yaml', 'omap_limit/10.yaml', 'omap_limit/10000.yaml', 'osd.yaml}', 'overrides/{frag_enable.yaml', 'supported-random-distros$/{rhel_latest.yaml}', 'supported-random-distros$/{ubuntu_16.04.yaml}', 'supported-random-distros$/{ubuntu_latest.yaml}', 'tasks/cfuse_workunit_suites_fsx.yaml}', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}']

From: pdonnell-2018-10-06_01:02:41-fs-wip-pdonnell-testing-20181005.225845-distro-basic-smithi

Actions #5

Updated by Neha Ojha over 5 years ago

  • Status changed from New to 12

This can be reproduced with the fs:basic_workload suite, using --filter 'cfuse_workunit_suites_fsx.yaml'.
It shows up mainly with bluestore-ec-root.yaml and bluestore-comp-ec-root.yaml.

Actions #6

Updated by Neha Ojha over 5 years ago

  • Assignee set to Neha Ojha

Actions #7

Updated by Neha Ojha over 5 years ago

  • Backport deleted (mimic,luminous)

Haven't been able to reproduce this on luminous and mimic, so clearing the Backport fields for now.

Actions #8

Updated by Neha Ojha over 5 years ago

  • Status changed from 12 to 7

Actions #9

Updated by Neha Ojha over 5 years ago

  • Status changed from 7 to Resolved