Bug #55750
openmon: slow request of very long time
0%
Description
ceph -c /etc/ceph/nc4srv-hdd.conf health detail HEALTH_WARN 4 slow ops, oldest one blocked for 160935 sec, mon.nc4-hadoop-ceph-st0001 has slow ops SLOW_OPS 4 slow ops, oldest one blocked for 160935 sec, mon.nc4-hadoop-ceph-st0001 has slow ops
Updated by yite gu almost 2 years ago
$ ceph -c /etc/ceph/nc4srv-hdd.conf daemon mon.nc4-hadoop-ceph-st0001 ops { "ops": [ { "description": "osd_failure(failed timeout osd.300 [v2:10.195.9.6:6800/64111,v1:10.195.9.6:6801/64111] for 25sec e465704 v465704)", "initiated_at": "2022-05-21 11:31:56.079427", "age": 168994.095883528, "duration": 168994.09590782601, "type_data": { "events": [ { "time": "2022-05-21 11:31:56.079427", "event": "initiated" }, { "time": "2022-05-21 11:31:56.079427", "event": "header_read" }, { "time": "0.000000", "event": "throttled" }, { "time": "0.000000", "event": "all_read" }, { "time": "0.000000", "event": "dispatched" }, { "time": "2022-05-21 11:31:56.082413", "event": "mon:_ms_dispatch" }, { "time": "2022-05-21 11:31:56.082413", "event": "mon:dispatch_op" }, { "time": "2022-05-21 11:31:56.082413", "event": "psvc:dispatch" }, { "time": "2022-05-21 11:31:56.082419", "event": "osdmap:preprocess_query" }, { "time": "2022-05-21 11:31:56.082420", "event": "osdmap:preprocess_failure" }, { "time": "2022-05-21 11:31:56.082427", "event": "osdmap:prepare_update" }, { "time": "2022-05-21 11:31:56.082428", "event": "osdmap:prepare_failure" }, { "time": "2022-05-21 11:31:56.082437", "event": "no_reply: send routed request" } ], "info": { "seq": 7084262, "src_is_mon": false, "source": "osd.6 v2:10.195.32.19:6834/717429", "forwarded_to_leader": false } } },
Updated by yite gu almost 2 years ago
It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded?
Updated by Radoslaw Zarzynski almost 2 years ago
- Status changed from New to Need More Info
Could you please provide an info on which version of Ceph this issue happened?
Updated by Radoslaw Zarzynski almost 2 years ago
- Related to Bug #47380: mon: slow ops due to osd_failure added
Updated by yite gu almost 2 years ago
Radoslaw Zarzynski wrote:
Could you please provide an info on which version of Ceph this issue happened?
- ceph -v
ceph version 14.2.21 (5ef401921d7a88aea18ec7558f7f9374ebd8f5a6) nautilus (stable)
Updated by yite gu almost 2 years ago
https://github.com/ceph/ceph/pull/41516
https://github.com/ceph/ceph/commit/a124ee85b03e15f4ea371358008ecac65f9f4e50
There are two PR to resolve this problem, But I'm not sure it can really solve problem.
I think mon report slow request from vector Monitor::get_health_metrics. If quetion op have no erase from op_tracker.visit_ops_in_flight, mon will continue report slow request?
Updated by Neha Ojha almost 2 years ago
yite gu wrote:
It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded?
How did deduce that the mon request had completed?
Updated by yite gu almost 2 years ago
Neha Ojha wrote:
yite gu wrote:
It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded?
How did deduce that the mon request had completed?
I'm not sure, because I see mon can handle other request, so I guess slow request had completed.
Updated by yite gu over 1 year ago
{
"description": "osd_failure(failed timeout osd.6 [v2:10.172.98.151:6800/39,v1:10.172.98.151:6802/39] for 20sec e141 v141)",
"initiated_at": "2022-11-14T10:43:48.698115+0000",
"age": 3083.261244717,
"duration": 3083.2613051510002,
"type_data": {
"events": [
{
"time": "2022-11-14T10:43:48.698115+0000",
"event": "initiated"
},
{
"time": "0.000000",
"event": "throttled"
},
{
"time": "2022-11-14T10:43:48.698115+0000",
"event": "header_read"
},
{
"time": "0.000000",
"event": "all_read"
},
{
"time": "0.000000",
"event": "dispatched"
},
{
"time": "2022-11-14T10:43:48.698238+0000",
"event": "mon:_ms_dispatch"
},
{
"time": "2022-11-14T10:43:48.698238+0000",
"event": "mon:dispatch_op"
},
{
"time": "2022-11-14T10:43:48.698239+0000",
"event": "psvc:dispatch"
},
{
"time": "2022-11-14T10:43:48.698244+0000",
"event": "osdmap:preprocess_query"
},
{
"time": "2022-11-14T10:43:48.698244+0000",
"event": "osdmap:preprocess_failure"
},
{
"time": "2022-11-14T10:43:48.698251+0000",
"event": "osdmap:prepare_update"
},
{
"time": "2022-11-14T10:43:48.698251+0000",
"event": "osdmap:prepare_failure"
},
{
"time": "2022-11-14T10:43:48.698259+0000",
"event": "no_reply: send routed request"
}
],
"info": {
"seq": 8066192,
"src_is_mon": false,
"source": "osd.7 v2:10.172.98.86:6803/39",
"forwarded_to_leader": false
}
}
},