Project

General

Profile

Bug #55750

mon: slow request of very long time

Added by yite gu almost 2 years ago. Updated over 1 year ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph -c /etc/ceph/nc4srv-hdd.conf health detail
HEALTH_WARN 4 slow ops, oldest one blocked for 160935 sec, mon.nc4-hadoop-ceph-st0001 has slow ops
SLOW_OPS 4 slow ops, oldest one blocked for 160935 sec, mon.nc4-hadoop-ceph-st0001 has slow ops

Related issues

Related to RADOS - Bug #47380: mon: slow ops due to osd_failure Resolved

History

#1 Updated by yite gu almost 2 years ago

$ ceph -c /etc/ceph/nc4srv-hdd.conf daemon mon.nc4-hadoop-ceph-st0001 ops
{
    "ops": [
        {
            "description": "osd_failure(failed timeout osd.300 [v2:10.195.9.6:6800/64111,v1:10.195.9.6:6801/64111] for 25sec e465704 v465704)",
            "initiated_at": "2022-05-21 11:31:56.079427",
            "age": 168994.095883528,
            "duration": 168994.09590782601,
            "type_data": {
                "events": [
                    {
                        "time": "2022-05-21 11:31:56.079427",
                        "event": "initiated" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.079427",
                        "event": "header_read" 
                    },
                    {
                        "time": "0.000000",
                        "event": "throttled" 
                    },
                    {
                        "time": "0.000000",
                        "event": "all_read" 
                    },
                    {
                        "time": "0.000000",
                        "event": "dispatched" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082413",
                        "event": "mon:_ms_dispatch" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082413",
                        "event": "mon:dispatch_op" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082413",
                        "event": "psvc:dispatch" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082419",
                        "event": "osdmap:preprocess_query" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082420",
                        "event": "osdmap:preprocess_failure" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082427",
                        "event": "osdmap:prepare_update" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082428",
                        "event": "osdmap:prepare_failure" 
                    },
                    {
                        "time": "2022-05-21 11:31:56.082437",
                        "event": "no_reply: send routed request" 
                    }
                ],
                "info": {
                    "seq": 7084262,
                    "src_is_mon": false,
                    "source": "osd.6 v2:10.195.32.19:6834/717429",
                    "forwarded_to_leader": false
                }
            }
        },

#2 Updated by yite gu almost 2 years ago

It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded?

#3 Updated by Radoslaw Zarzynski almost 2 years ago

  • Status changed from New to Need More Info

Could you please provide an info on which version of Ceph this issue happened?

#4 Updated by Radoslaw Zarzynski almost 2 years ago

  • Related to Bug #47380: mon: slow ops due to osd_failure added

#5 Updated by yite gu almost 2 years ago

Radoslaw Zarzynski wrote:

Could you please provide an info on which version of Ceph this issue happened?

  1. ceph -v
    ceph version 14.2.21 (5ef401921d7a88aea18ec7558f7f9374ebd8f5a6) nautilus (stable)

#6 Updated by yite gu almost 2 years ago

https://github.com/ceph/ceph/pull/41516
https://github.com/ceph/ceph/commit/a124ee85b03e15f4ea371358008ecac65f9f4e50
There are two PR to resolve this problem, But I'm not sure it can really solve problem.
I think mon report slow request from vector Monitor::get_health_metrics. If quetion op have no erase from op_tracker.visit_ops_in_flight, mon will continue report slow request?

#7 Updated by Neha Ojha over 1 year ago

yite gu wrote:

It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded?

How did deduce that the mon request had completed?

#8 Updated by yite gu over 1 year ago

Neha Ojha wrote:

yite gu wrote:

It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded?

How did deduce that the mon request had completed?

I'm not sure, because I see mon can handle other request, so I guess slow request had completed.

#9 Updated by yite gu over 1 year ago

{
"description": "osd_failure(failed timeout osd.6 [v2:10.172.98.151:6800/39,v1:10.172.98.151:6802/39] for 20sec e141 v141)",
"initiated_at": "2022-11-14T10:43:48.698115+0000",
"age": 3083.261244717,
"duration": 3083.2613051510002,
"type_data": {
"events": [ {
"time": "2022-11-14T10:43:48.698115+0000",
"event": "initiated"
}, {
"time": "0.000000",
"event": "throttled"
}, {
"time": "2022-11-14T10:43:48.698115+0000",
"event": "header_read"
}, {
"time": "0.000000",
"event": "all_read"
}, {
"time": "0.000000",
"event": "dispatched"
}, {
"time": "2022-11-14T10:43:48.698238+0000",
"event": "mon:_ms_dispatch"
}, {
"time": "2022-11-14T10:43:48.698238+0000",
"event": "mon:dispatch_op"
}, {
"time": "2022-11-14T10:43:48.698239+0000",
"event": "psvc:dispatch"
}, {
"time": "2022-11-14T10:43:48.698244+0000",
"event": "osdmap:preprocess_query"
}, {
"time": "2022-11-14T10:43:48.698244+0000",
"event": "osdmap:preprocess_failure"
}, {
"time": "2022-11-14T10:43:48.698251+0000",
"event": "osdmap:prepare_update"
}, {
"time": "2022-11-14T10:43:48.698251+0000",
"event": "osdmap:prepare_failure"
}, {
"time": "2022-11-14T10:43:48.698259+0000",
"event": "no_reply: send routed request"
}
],
"info": {
"seq": 8066192,
"src_is_mon": false,
"source": "osd.7 v2:10.172.98.86:6803/39",
"forwarded_to_leader": false
}
}
},

Also available in: Atom PDF