Bug #37568 (closed)
CephFS: removing snapshots results in slow ops
Description
Hello,
I have a Ceph Mimic cluster with CephFS.
I created a few snapshots (mkdir .snap/test etc...) in different directories. So far so good.
But when I delete the snapshots (rmdir .snap/test etc...), the cluster goes into a WARN state with:
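For concreteness, the snapshot operations were of this form (the mount point and directory names here are illustrative):
mkdir /mnt/cephfs/somedir/.snap/test    # create a snapshot of somedir
rmdir /mnt/cephfs/somedir/.snap/test    # remove that snapshot again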
- ceph -w
cluster:
id: 2fbbf089-a846-4c09-90bc-1dd9bd7af30f
health: HEALTH_WARN
3 slow ops, oldest one blocked for 11415 sec, mon.lpnceph01 has slow ops
...
2018-12-06 16:54:56.356518 mon.lpnceph-mon01 [WRN] Health check update: 3 slow ops, oldest one blocked for 11410 sec, mon.lpnceph01 has slow ops (SLOW_OPS)
2018-12-06 16:55:05.856294 mon.lpnceph-mon01 [WRN] Health check update: 3 slow ops, oldest one blocked for 11415 sec, mon.lpnceph01 has slow ops (SLOW_OPS)
2018-12-06 16:55:10.856657 mon.lpnceph-mon01 [WRN] Health check update: 3 slow ops, oldest one blocked for 11425 sec, mon.lpnceph01 has slow ops (SLOW_OPS)
- ceph daemon mon.lpnceph01 ops
{
"ops": [ {
"description": "remove_snaps({28=[3,4]} v0)",
"initiated_at": "2018-12-06 13:44:41.396039",
"age": 14549.148016,
"duration": 14549.148028,
"type_data": {
"events": [ {
"time": "2018-12-06 13:44:41.396039",
"event": "initiated"
}, {
"time": "2018-12-06 13:44:41.396039",
"event": "header_read"
}, {
"time": "2018-12-06 13:44:41.396042",
"event": "throttled"
}, {
"time": "2018-12-06 13:44:41.396089",
"event": "all_read"
}, {
"time": "2018-12-06 13:44:41.396186",
"event": "dispatched"
}, {
"time": "2018-12-06 13:44:41.396190",
"event": "mon:_ms_dispatch"
}, {
"time": "2018-12-06 13:44:41.396191",
"event": "mon:dispatch_op"
}, {
"time": "2018-12-06 13:44:41.396192",
"event": "psvc:dispatch"
}, {
"time": "2018-12-06 13:44:41.396205",
"event": "osdmap:preprocess_query"
}, {
"time": "2018-12-06 13:44:41.396214",
"event": "osdmap:preprocess_remove_snaps"
}, {
"time": "2018-12-06 13:44:41.396220",
"event": "forward_request_leader"
}, {
"time": "2018-12-06 13:44:41.396258",
"event": "forwarded"
}
],
"info": {
"seq": 250448,
"src_is_mon": false,
"source": "mds.0 xxx.xxx.xxx.xxx:6800/2790459226",
"forwarded_to_leader": true
}
}
},
...
I tried adding the following lines to ceph.conf:
[osd]
osd snap trim sleep = 0.6
as suggested in http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-November/031227.html
but it didn't solve the problem.
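For reference, I believe the same setting can also be applied at runtime, so the OSDs don't need a restart for the ceph.conf change to take effect, along these lines:
ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.6'    # apply to all running OSDs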
I had to restart the monitor service:
systemctl restart ceph-mon@lpnceph01.service
to get the cluster back to a healthy status.
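After the restart, the status can be checked again with:
ceph -s                            # should report HEALTH_OK again
ceph daemon mon.lpnceph01 ops      # the blocked remove_snaps op should no longer show up here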