Bug #40361: getattr on snap inode stuck - CephFS - Ceph

Actions

Copy link

Bug #40361

closed

getattr on snap inode stuck

Added by Zheng Yan almost 5 years ago. Updated over 4 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Zheng Yan

Category:

Target version:

Ceph - v15.0.0

% Done:

Source:

Community (user)

Tags:

Backport:

nautilus,mimic

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(FS):

MDS, cephfs.pyx

Labels (FS):

snapshots

Pull request ID:

28551

Crash signature (v1):

Crash signature (v2):

Description

from maillling list

On Wed, Jun 12, 2019 at 3:26 PM Hector Martin <hector@marcansoft.com> wrote:

Hi list,

I have a setup where two clients mount the same filesystem and
read/write from mostly non-overlapping subsets of files (Dovecot mail
storage/indices). There is a third client that takes backups by
snapshotting the top-level directory, then rsyncing the snapshot over to
another location.

Ever since I switched the backup process to using snapshots, the rsync
process has stalled at a certain point during the backup with a stuck
MDS op:

root@mon02:~# ceph daemon mds.mon02 dump_ops_in_flight {
"ops": [ {
"description": "client_request(client.146682828:199050
getattr pAsLsXsFs #0x10000000007//bak-20190612094501/<snip
path>/dovecot.index.log 2019-06-12 12:20:56.992049 caller_uid=5000,
caller_gid=5000{})",
"initiated_at": "2019-06-12 12:20:57.001534",
"age": 9563.847754,
"duration": 9563.847780,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.146682828:199050",
"op_type": "client_request",
"client_info": {
"client": "client.146682828",
"tid": 199050
},
"events": [ {
"time": "2019-06-12 12:20:57.001534",
"event": "initiated"
}, {
"time": "2019-06-12 12:20:57.001534",
"event": "header_read"
}, {
"time": "2019-06-12 12:20:57.001538",
"event": "throttled"
}, {
"time": "2019-06-12 12:20:57.001550",
"event": "all_read"
}, {
"time": "2019-06-12 12:20:57.001713",
"event": "dispatched"
}, {
"time": "2019-06-12 12:20:57.001997",
"event": "failed to rdlock, waiting"
}
]
}
}
],
"num_ops": 1
}

AIUI, when a snapshot is taken, all clients with dirty data are supposed
to get a message to flush it to the cluster in order to produce a
consistent snapshot. My guess is this isn't happening properly, so reads
of that file in the snapshot are blocked. Doing a 'echo 3 >
/proc/sys/vm/drop_caches' on both of the writing clients seems to clear
the stuck op, but doing it once isn't enough; usually I get the stuck up
and have to clear caches twice after making any given snapshot.

Everything is on Ubuntu. The cluster is running 13.2.4 (mimic), and the
clients are using the kernel client version 4.18.0-20-generic (writers)
and 4.18.0-21-generic (backup host).

I managed to reproduce it like this:

host1$ mkdir _test
host1$ cd _test/.snap

host2$ cd test
host2$ for i in $(seq 1 10000); do (sleep 0.1; echo $i; sleep 1) > b$i
& sleep 0.05; done

(while that is running)

host1$ mkdir s11
host1$ cd s11

(wait a few seconds)

host2$ ^C

host1$ ls -al
(hangs)

This yielded this stuck request:

{
"ops": [ {
"description": "client_request(client.146687505:13785
getattr pAsLsXsFs #0x1000017f41c//s11/b_42 2019-06-12 16:15:59.095025
caller_uid=0, caller_gid=0{})",
"initiated_at": "2019-06-12 16:15:59.095559",
"age": 30.846294,
"duration": 30.846318,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.146687505:13785",
"op_type": "client_request",
"client_info": {
"client": "client.146687505",
"tid": 13785
},
"events": [ {
"time": "2019-06-12 16:15:59.095559",
"event": "initiated"
}, {
"time": "2019-06-12 16:15:59.095559",
"event": "header_read"
}, {
"time": "2019-06-12 16:15:59.095562",
"event": "throttled"
}, {
"time": "2019-06-12 16:15:59.095573",
"event": "all_read"
}, {
"time": "2019-06-12 16:15:59.096201",
"event": "dispatched"
}, {
"time": "2019-06-12 16:15:59.096318",
"event": "failed to rdlock, waiting"
}, {
"time": "2019-06-12 16:15:59.268368",
"event": "failed to rdlock, waiting"
}
]
}
}
],
"num_ops": 1
}

My guess is somewhere along the line of this process there's a race
condition and the dirty client isn't properly flushing its data.

A 'sync' on host2 does not clear the stuck op. 'echo 1 >
/proc/sys/vm/drop_caches' does not either, while 'echo 2 >
/proc/sys/vm/drop_caches' does fix it. So I guess the problem is a
dentry/inode that is stuck dirty in the cache of host2?

--
Hector Martin (hector@marcansoft.com)
Public Key: https://mrcn.st/pub

Related issues 2 (0 open — 2 closed)

Actions

Copy link

Updated by Zheng Yan almost 5 years ago

Status changed from New to Fix Under Review
Backport set to nautilus,mimic
Pull request ID set to 28551

Actions

Copy link

Updated by Patrick Donnelly almost 5 years ago

Assignee set to Zheng Yan
Target version set to v15.0.0
Start date deleted (~~06/14/2019~~)
Source set to Community (user)
Component(FS) MDS added
Labels (FS) snapshots added

Actions

Copy link

Updated by Patrick Donnelly almost 5 years ago

Status changed from Fix Under Review to Pending Backport
Component(FS) cephfs.pyx added

Actions

Copy link

Updated by Nathan Cutler almost 5 years ago

Copied to Backport #40437: mimic: getattr on snap inode stuck added

Actions

Copy link

Updated by Nathan Cutler almost 5 years ago

Copied to Backport #40438: nautilus: getattr on snap inode stuck added

Actions

Copy link

Updated by Nathan Cutler over 4 years ago

Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Ceph » CephFS

Custom queries

Bug #40361

getattr on snap inode stuck

Updated by Zheng Yan almost 5 years ago

Updated by Patrick Donnelly almost 5 years ago

Updated by Patrick Donnelly almost 5 years ago

Updated by Nathan Cutler almost 5 years ago

Updated by Nathan Cutler almost 5 years ago

Updated by Nathan Cutler over 4 years ago