Bug #20467

Ceph FS kernel client not consistent

Added by Yunzhi Cheng over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
High
Assignee:
Zheng Yan
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I used 'ls' to list the files in one directory, and two clients got different results:


ruitian@rndcl9:/data/hft/data_v1/hftdata/WIND/market/stock/2016/10$ ls 21 | wc -l
5670
ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/10$ ls 21 | wc -l
3409

But if I access the missing file directly, I can read it on both clients:

ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/10$ ls 21 | grep 603999-SH-stock.book.csv.gz
"No Such File" 
ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/10$ ls ./21/603999-SH-stock.book.csv.gz
./21/603999-SH-stock.book.csv.gz

It seems something is wrong with directory listing in the ceph kernel client.
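
To pin down exactly which entries differ, the two listings can be captured and compared. A minimal sketch, assuming ssh access between the hosts (hostnames and path taken from the session above; the /tmp file names are arbitrary):

# Capture the directory listing as each client sees it.
ssh rndcl9 'ls /data/hft/data_v1/hftdata/WIND/market/stock/2016/10/21' | sort > /tmp/listing-rndcl9
ssh rndcl3 'ls /data/hft/data_v1/hftdata/WIND/market/stock/2016/10/21' | sort > /tmp/listing-rndcl3
# comm -23 prints lines present only in the first file, i.e. the
# entries that rndcl9 sees but rndcl3 does not.
comm -23 /tmp/listing-rndcl9 /tmp/listing-rndcl3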

current_mds_config.txt (43.5 KB) Yunzhi Cheng, 07/05/2017 08:25 AM

History

#1 Updated by Zheng Yan over 6 years ago

Kernel version?

#2 Updated by Yunzhi Cheng over 6 years ago

The kernel version is 4.4.0-46.

#3 Updated by Yunzhi Cheng over 6 years ago


ruitian@rndcl3:~$ uname -a
Linux rndcl3 4.4.0-46-generic #67~14.04.1-Ubuntu SMP Fri Oct 21 16:04:40 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

ruitian@rndcl9:~$ uname -a
Linux rndcl9 4.4.0-46-generic #67~14.04.1-Ubuntu SMP Fri Oct 21 16:04:40 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

#4 Updated by Zheng Yan over 6 years ago

I checked the Ubuntu kernel. It does not contain the following commit, which I suspect is the cause. Could you please try the newest upstream kernel?

commit af5e5eb574776cdf1b756a27cc437bff257e22fe
Author: Yan, Zheng <zyan@redhat.com>
Date:   Fri Feb 26 16:27:13 2016 +0800

    ceph: fix race during filling readdir cache

    Readdir cache uses page cache to save dentry pointers. When adding
    dentry pointers to middle of a page, we need to make sure the page
    already exists. Otherwise the beginning part of the page will be
    invalid pointers.

    Signed-off-by: Yan, Zheng <zyan@redhat.com>
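
Whether a given kernel tree already carries this fix can be checked against a local clone of the kernel source; a minimal sketch, assuming such a clone is available (the hash is the commit quoted above):

# Exit status 0 means the commit is an ancestor of the current HEAD,
# i.e. the checked-out tree already contains the fix.
git merge-base --is-ancestor af5e5eb574776cdf1b756a27cc437bff257e22fe HEAD \
    && echo "fix present" || echo "fix missing"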

#5 Updated by Yunzhi Cheng over 6 years ago

Zheng Yan wrote:

I checked the Ubuntu kernel. It does not contain the following commit, which I suspect is the cause. Could you please try the newest upstream kernel?

[...]

I am using the mainline kernel now and still have this issue.


ruitian@rndcl3:~$ uname -r
4.11.8-041108-generic


ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/07$ ls 06 | grep 002319
"No Such File" 
ruitian@rndcl9:/data/hft/data_v1/hftdata/WIND/market/stock/2016/07$ ls ./06/002319-SZ-stock.book.csv.gz
./06/002319-SZ-stock.book.csv.gz

And I find that if I create a tmp file in the affected directory, it becomes OK:


ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/07$ ls 06 | grep 002319
"No Such File" 
ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/07$ touch ./06/a.tmp

ruitian@rndcl3:/data/hft/data_v1/hftdata/WIND/market/stock/2016/07$ ls 06 | grep 002319
./06/002319-SZ-stock.book.csv.gz

This may be a big problem, since I cannot trust any result of 'ls'.
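
If the touch trick reliably refreshes the listing, it suggests a stopgap until the kernel is fixed; a sketch, assuming any change in the directory forces the client to refill its cached listing (the rm cleanup step is an assumption, not something tested above):

# Force a directory change so the client refreshes its cached listing,
# then remove the scratch file again (assumption: the refreshed
# listing survives the removal).
touch ./06/a.tmp && rm ./06/a.tmp
ls 06 | grep 002319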

#6 Updated by Zheng Yan over 6 years ago

What are the ceph-mds version and configuration? Are dirfrag or multimds enabled?

Please describe the workload you put on cephfs. How many clients? Do they modify the same directory at the same time?

How difficult is this to reproduce? If it's easy, please set debug_mds=10 and enable kernel debugging:

On the machine that runs ceph-mds, run "ceph daemon mds.x config set debug_mds 10"
On the cephfs client machine, run "echo module ceph +p > /sys/kernel/debug/ceph/*/mdsc"

Reproduce this bug.

Then disable debugging and upload the logs:
"ceph daemon mds.x config set debug_mds 0"
"echo module ceph -p > /sys/kernel/debug/ceph/*/mdsc"

thanks

#7 Updated by Zheng Yan over 6 years ago

  • Assignee set to Zheng Yan

#8 Updated by Yunzhi Cheng over 6 years ago

Zheng Yan wrote:

What are the ceph-mds version and configuration? Are dirfrag or multimds enabled?

Please describe the workload you put on cephfs. How many clients? Do they modify the same directory at the same time?

How difficult is this to reproduce? If it's easy, please set debug_mds=10 and enable kernel debugging:

On the machine that runs ceph-mds, run "ceph daemon mds.x config set debug_mds 10"
On the cephfs client machine, run "echo module ceph +p > /sys/kernel/debug/ceph/*/mdsc"

Reproduce this bug.

Then disable debugging and upload the logs:
"ceph daemon mds.x config set debug_mds 0"
"echo module ceph -p > /sys/kernel/debug/ceph/*/mdsc"

thanks

The ceph-mds version is 10.2.7, and I only changed mds_cache_size to 5000000; the other configs are the defaults.

I haven't enabled dirfrag or multimds.

I have 6 clients per host, and there are 70 hosts.

We have 3 ceph clusters, and on every host 3 clients point to A, 2 clients point to B, and the last one points to C.

I changed 2 nodes to kernel 4.11.8-041108-generic, and the others are on 4.4.0-46-generic.

As you can see above, rndcl3, where I reproduced this issue, is on kernel 4.11.8.

Here are the mount options:


10.0.0.2,10.0.0.12,10.0.0.28:/hft on /data/hft type ceph (name=admin,noshare,key=client.admin)
10.0.0.2,10.0.0.12,10.0.0.28:/rtt on /data/rtt type ceph (name=admin,noshare,key=client.admin)
10.0.0.26,10.0.0.38,10.0.0.62:/rnd on /data/rnd2 type ceph (name=admin,noshare,key=client.admin)
10.0.0.27,10.0.0.39,10.0.0.63:/rnd on /data/rnd type ceph (name=admin,noshare,key=client.admin)
10.0.0.2,10.0.0.12,10.0.0.28:/share on /data/share type ceph (name=admin,noshare,key=client.admin)
10.0.0.26,10.0.0.38,10.0.0.62:/ on /old_data type ceph (name=admin,noshare,key=client.admin)

The client on which I reproduced the issue is '10.0.0.2,10.0.0.12,10.0.0.28:/hft on /data/hft type ceph (name=admin,noshare,key=client.admin)'.

We only read data from the directory and do not modify it. Most of the time, there are 10 processes reading the same directory concurrently.

I will upload the debug log once I reproduce it.


root@rndcl3:/sys/kernel/debug/ceph/8126fa27-e3b1-49e6-8a59-9ad337074533.client3143441# echo 'module ceph +p' > mdsc
-bash: echo: write error: Invalid argument

Seems that's an error. Should I do echo module ceph +p > /sys/kernel/debug/dynamic_debug instead?

By the way, if I add the mount option noasyncreaddir, will the bug be avoided?

#9 Updated by Zheng Yan over 6 years ago

Yunzhi Cheng wrote:

Seems that's an error. Should I do echo module ceph +p > /sys/kernel/debug/dynamic_debug instead?

Yes, it should be "echo module ceph +p > /sys/kernel/debug/dynamic_debug/control".
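
Putting the corrected steps together, the whole capture sequence would look something like this (mds.x stands for the actual MDS daemon name):

# On the MDS host: raise MDS debug verbosity.
ceph daemon mds.x config set debug_mds 10
# On the client: enable dynamic debug output for the ceph module.
echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control

# ... reproduce the bug ...

# Turn both off again before collecting the logs.
ceph daemon mds.x config set debug_mds 0
echo 'module ceph -p' > /sys/kernel/debug/dynamic_debug/control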

By the way, if I add mount option noasyncreaddir, will the bug be avoid?

Yes, noasyncreaddir should avoid this problem.
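
For reference, remounting the affected filesystem with that option might look like this (monitors and options copied from the mount list above; the secretfile path is an assumption, since the credentials are not shown):

umount /data/hft
# noasyncreaddir makes readdir go to the MDS instead of being served
# from the client's local readdir cache, which is what this bug corrupts.
mount -t ceph 10.0.0.2,10.0.0.12,10.0.0.28:/hft /data/hft \
      -o name=admin,noshare,noasyncreaddir,secretfile=/etc/ceph/admin.secret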

#10 Updated by Zheng Yan over 6 years ago

Please try

https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace865a

#11 Updated by Yunzhi Cheng over 6 years ago

Zheng Yan wrote:

Please try

https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace865a

Sorry, I can't deploy a custom kernel to our cluster right now; I can only try it once it has been merged into the mainline kernel :(
It will be a long time before I can test a custom kernel in a test environment.

I uploaded the logs from when I reproduced the bug; it happened between Jul 6 16:58:00 and 17:00:30.
https://www.dropbox.com/s/brtd46l7o9kklgo/ceph-data-mds.rndcl9.log.gz?dl=0
https://www.dropbox.com/s/vfbgxfmgvn7qo8s/kern.log.gz?dl=0
And I reproduced it with:


ls /data/hft/data_v1/hftdata/WIND/market/stock/2016/01/18 | grep 601808-SH-stock.trade.csv.gz

Maybe you can find something in the logs, but they're really a mess.
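
To narrow them down to that window, something like this might help (assuming the standard ceph log timestamp format and the 2017 date of this report):

# Lines logged on 2017-07-06 between 16:58 and 17:00 in the MDS log.
zgrep -E '2017-07-06 1(6:5[89]|7:00)' ceph-data-mds.rndcl9.log.gz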

I am testing the noasyncreaddir mount option now.

#12 Updated by Zheng Yan over 6 years ago

Yunzhi Cheng wrote:

Zheng Yan wrote:

Please try

https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace865a

Sorry, I can't deploy a custom kernel to our cluster right now; I can only try it once it has been merged into the mainline kernel :(
It will be a long time before I can test a custom kernel in a test environment.

I uploaded the logs from when I reproduced the bug; it happened between Jul 6 16:58:00 and 17:00:30.
https://www.dropbox.com/s/brtd46l7o9kklgo/ceph-data-mds.rndcl9.log.gz?dl=0
https://www.dropbox.com/s/vfbgxfmgvn7qo8s/kern.log.gz?dl=0
And I reproduced it with:
[...]
Maybe you can find something in the logs, but they're really a mess.

I am testing the noasyncreaddir mount option now.

I have reproduced this issue locally. The patch works.

Thanks

#13 Updated by Zheng Yan over 6 years ago

  • Status changed from 7 to Resolved
