Bug #47854
some clients may return failure in the scenario where multiple clients create directories at the same time
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
The issue can be reproduced by the following steps:
(1) Ceph version 14.2.10 with multiple active MDS daemons (multimds); multiple clients mount the same directory through ceph-fuse.
(2) Use the vdbench tool to test performance. The test script is:
> hd=default,vdbench=/root/vdbench50406,user=root,shell=ssh
> hd=hd1,system=192.168.8.101,clients=5
> hd=hd2,system=192.168.8.104,clients=5
> hd=hd3,system=192.168.8.103,clients=5
> fsd=fsd1,anchor=/mnt/dir2,depth=2,width=100,files=10,sizes=4k,shared=yes
> fwd=format,threads=32,xfersize=4k
> fwd=default,rdpct=60,xfersize=4k,fileio=sequential,fileselect=sequential,threads=32
> fwd=fwd1,fsd=fsd*, host=hd1
> fwd=fwd2,fsd=fsd*, host=hd2
> fwd=fwd3,fsd=fsd*, host=hd3
> rd=rd1,fwd=fwd*,fwdrate=max,format=restart,elapsed=300,interval=1
(3) vdbench aborts because a mkdir call fails:
> 08:53:27.586 All slaves are now connected
> 08:53:29.001 Starting RD=format_for_rd1
> Oct 14, 2020 .Interval. .ReqstdOps... ...cpu%... read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ...mkdir.... ...rmdir.... ...create... ....open.... ...close.... ...delete...
> rate resp total sys pct rate resp rate resp read write total size rate resp rate resp rate resp rate resp rate resp rate resp
> 08:53:30.123 1 0.0 0.000 17.3 4.58 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 473.0 33.676 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:31.045 2 0.0 0.000 42.1 25.5 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 700.0 17.441 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:32.062 3 0.0 0.000 38.8 25.7 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 750.0 16.095 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:33.057 4 0.0 0.000 37.6 24.2 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 691.0 17.338 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:34.056 5 0.0 0.000 30.3 20.8 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 616.0 17.556 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:35.056 6 0.0 0.000 29.3 19.0 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 681.0 20.819 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:36.054 7 0.0 0.000 34.7 24.9 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 1232 11.082 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:37.057 8 0.0 0.000 36.2 25.9 0.0 0.0 0.000 0.0 0.000 0.00 0.00 0.00 0 1326 8.678 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000 0.0 0.000
> 08:53:37.956 *
> 08:53:37.956 ********************************************************************************************************
> 08:53:37.956 * Slave hd1_cl0-0 aborting: Unable to create directory: /mnt/dir2/vdb.1_1.dir/vdb.2_17.dir/ *
> 08:53:37.956 ********************************************************************************************************
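For anyone without vdbench at hand, the access pattern in step (2) can be approximated with a short Python sketch. This is only an assumption about what the format phase does (many workers racing to create the same directories under a shared anchor), not vdbench's actual logic. The names `make_dir` and `race_mkdir` are hypothetical, and to exercise the reported behaviour the script would have to be run from several ceph-fuse clients against the same mounted directory at once; on a local filesystem it should report no inconsistencies.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_dir(path):
    """Create one directory, tolerating a concurrent creator."""
    try:
        os.mkdir(path)
        return "created"
    except FileExistsError:
        # Someone else created it first, so it should now be visible.
        # The tracker log shows the opposite: EEXIST, then lookup = ENOENT.
        return "exists" if os.path.isdir(path) else "EEXIST-but-missing"


def race_mkdir(anchor, width=100, workers=32):
    """Have `workers` threads race to create the same `width` directories."""
    paths = [os.path.join(anchor, f"vdb.2_{i}.dir") for i in range(1, width + 1)]
    # Submit each path `workers` times back to back so the threads collide.
    jobs = [p for p in paths for _ in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(make_dir, jobs))


if __name__ == "__main__":
    # Hypothetical usage: replace the temporary directory with the shared
    # CephFS directory and run on every client at the same time.
    with tempfile.TemporaryDirectory() as anchor:
        results = race_mkdir(anchor)
    print({r: results.count(r) for r in set(results)})
```

Any "EEXIST-but-missing" result would correspond to the failure vdbench reports as "Unable to create directory".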
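The following is a partial log of the client hd1. Each mkdir request is sent to mds.0 and then re-sent to mds.1, and returns -17 (EEXIST), while the lookups of the same name that follow return -2 (ENOENT). See the attachment for the complete log.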
2020-10-13 14:34:52.429 7fc8d9ffb700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir -> -2 (0)
2020-10-13 14:34:52.429 7fc97b7fe700 3 client.621319 ll_mkdir 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.429 7fc97b7fe700 8 client.621319 _mkdir(0x1000002130f vdb.2_17.dir, 0777, uid 0, gid 0)
2020-10-13 14:34:52.429 7fc97b7fe700 20 client.621319 get_or_create 0x1000002130f.head(faked_ino=0 ref=4 ll_ref=247 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:31.430429 ctime=2020-10-13 14:37:31.430429 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) name vdb.2_17.dir
2020-10-13 14:34:52.429 7fc97b7fe700 10 client.621319 send_request client_request(unknown.0:2663 mkdir #0x1000002130f/vdb.2_17.dir 2020-10-13 14:34:52.430764 caller_uid=0, caller_gid=0{0,}) v4 to mds.0
2020-10-13 14:34:52.599 7fc97b7fe700 10 client.621319 send_request client_request(unknown.0:2663 mkdir #0x1000002130f/vdb.2_17.dir 2020-10-13 14:34:52.430764 caller_uid=0, caller_gid=0{0,}) v4 to mds.1
2020-10-13 14:34:52.599 7fc97b7fe700 8 client.621319 _mkdir(#0x1000002130f/vdb.2_17.dir, 040755) = -17
2020-10-13 14:34:52.599 7fc97b7fe700 3 client.621319 ll_mkdir 0x1000002130f.head vdb.2_17.dir = -17 (0)
2020-10-13 14:34:52.600 7fc95a7fc700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.600 7fc95a7fc700 10 client.621319 _lookup 0x1000002130f.head(faked_ino=0 ref=3 ll_ref=248 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:36.563897 ctime=2020-10-13 14:37:36.563897 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) vdb.2_17.dir = -2
2020-10-13 14:34:52.600 7fc95a7fc700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir -> -2 (0)
2020-10-13 14:34:52.600 7fc89a7fc700 3 client.621319 ll_mkdir 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.600 7fc89a7fc700 8 client.621319 _mkdir(0x1000002130f vdb.2_17.dir, 0777, uid 0, gid 0)
2020-10-13 14:34:52.600 7fc89a7fc700 20 client.621319 get_or_create 0x1000002130f.head(faked_ino=0 ref=4 ll_ref=248 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:36.563897 ctime=2020-10-13 14:37:36.563897 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) name vdb.2_17.dir
2020-10-13 14:34:52.600 7fc89a7fc700 10 client.621319 send_request client_request(unknown.0:2718 mkdir #0x1000002130f/vdb.2_17.dir 2020-10-13 14:34:52.601221 caller_uid=0, caller_gid=0{0,}) v4 to mds.0
2020-10-13 14:34:52.784 7fc89a7fc700 10 client.621319 send_request client_request(unknown.0:2718 mkdir #0x1000002130f/vdb.2_17.dir 2020-10-13 14:34:52.601221 caller_uid=0, caller_gid=0{0,}) v4 to mds.1
2020-10-13 14:34:52.784 7fc89a7fc700 8 client.621319 _mkdir(#0x1000002130f/vdb.2_17.dir, 040755) = -17
2020-10-13 14:34:52.784 7fc89a7fc700 3 client.621319 ll_mkdir 0x1000002130f.head vdb.2_17.dir = -17 (0)
2020-10-13 14:34:52.784 7fc8da7fc700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.784 7fc8da7fc700 10 client.621319 _lookup 0x1000002130f.head(faked_ino=0 ref=3 ll_ref=249 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:36.563897 ctime=2020-10-13 14:37:36.563897 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) vdb.2_17.dir = -2
2020-10-13 14:34:52.784 7fc8da7fc700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir -> -2 (0)
2020-10-13 14:34:52.784 7fc8b97fa700 3 client.621319 ll_mkdir 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.784 7fc8b97fa700 8 client.621319 _mkdir(0x1000002130f vdb.2_17.dir, 0777, uid 0, gid 0)
2020-10-13 14:34:52.784 7fc8b97fa700 20 client.621319 get_or_create 0x1000002130f.head(faked_ino=0 ref=4 ll_ref=249 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:36.563897 ctime=2020-10-13 14:37:36.563897 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) name vdb.2_17.dir
2020-10-13 14:34:52.784 7fc8b97fa700 10 client.621319 send_request client_request(unknown.0:2780 mkdir #0x1000002130f/vdb.2_17.dir 2020-10-13 14:34:52.785694 caller_uid=0, caller_gid=0{0,}) v4 to mds.0
2020-10-13 14:34:52.966 7fc8b97fa700 10 client.621319 send_request client_request(unknown.0:2780 mkdir #0x1000002130f/vdb.2_17.dir 2020-10-13 14:34:52.785694 caller_uid=0, caller_gid=0{0,}) v4 to mds.1
2020-10-13 14:34:52.966 7fc8b97fa700 8 client.621319 _mkdir(#0x1000002130f/vdb.2_17.dir, 040755) = -17
2020-10-13 14:34:52.966 7fc8b97fa700 3 client.621319 ll_mkdir 0x1000002130f.head vdb.2_17.dir = -17 (0)
2020-10-13 14:34:52.967 7fc8ba7fc700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.967 7fc8ba7fc700 10 client.621319 _lookup 0x1000002130f.head(faked_ino=0 ref=3 ll_ref=250 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:37.323976 ctime=2020-10-13 14:37:37.323976 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) vdb.2_17.dir = -2
2020-10-13 14:34:52.967 7fc8ba7fc700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir -> -2 (0)
2020-10-13 14:34:52.967 7fc8b9ffb700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir
2020-10-13 14:34:52.967 7fc8b9ffb700 10 client.621319 _lookup 0x1000002130f.head(faked_ino=0 ref=3 ll_ref=250 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=2020-10-13 14:37:37.323976 ctime=2020-10-13 14:37:37.323976 caps=pAsLsXs(0=pAsLsXs) parents=0x1000000d8b3.head["vdb.1_1.dir"] 0x7fc9b801b1a0) vdb.2_17.dir = -2
2020-10-13 14:34:52.967 7fc8b9ffb700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir -> -2 (0)
2020-10-13 14:34:52.967 7fc95bfff700 3 client.621319 ll_lookup 0x1000002130f.head vdb.2_17.dir
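The pattern in the log excerpt (ll_mkdir returning -17, i.e. EEXIST, followed by ll_lookup of the same entry returning -2, i.e. ENOENT) can be picked out of the complete client log mechanically. The following is a minimal sketch, assuming one log entry per line and the ll_mkdir / ll_lookup result format seen in the excerpt; the regular expression and the function name `find_contradictions` are illustrative and not part of any Ceph tooling.

```python
import re

# Matches the *result* line of a low-level client call, for example:
#   ll_mkdir 0x1000002130f.head vdb.2_17.dir = -17 (0)
#   ll_lookup 0x1000002130f.head vdb.2_17.dir -> -2 (0)
# Call-entry lines (no return code) do not match.
RESULT = re.compile(r"ll_(mkdir|lookup) (\S+) (\S+) (?:=|->) (-?\d+) \(")

EEXIST, ENOENT = -17, -2  # Linux errno values, negated as in the log


def find_contradictions(lines):
    """Return (parent, name) pairs where a mkdir that returned EEXIST is
    followed by a lookup of the same entry that returned ENOENT."""
    last_mkdir = {}
    hits = []
    for line in lines:
        m = RESULT.search(line)
        if not m:
            continue
        op, parent, name, rc = m.group(1), m.group(2), m.group(3), int(m.group(4))
        key = (parent, name)
        if op == "mkdir":
            last_mkdir[key] = rc
        else:
            # Only the first lookup after a mkdir is compared against it.
            prev = last_mkdir.pop(key, None)
            if prev == EEXIST and rc == ENOENT:
                hits.append(key)
    return hits
```

Feeding it the lines of the attached log (for example `find_contradictions(open(path))`, with `path` pointing at the attachment) would list every directory entry that shows the inconsistent result.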
Related issues
History
#1 Updated by Patrick Donnelly almost 3 years ago
- Description updated (diff)
- ceph-qa-suite deleted (fs)
#2 Updated by Patrick Donnelly almost 3 years ago
- Status changed from New to Fix Under Review
- Assignee set to wei qiaomiao
- Target version set to v16.0.0
- Source set to Community (dev)
- Backport set to octopus,nautilus
- Pull request ID set to 37664
#3 Updated by Patrick Donnelly almost 3 years ago
- Status changed from Fix Under Review to Pending Backport
#4 Updated by Nathan Cutler almost 3 years ago
- Copied to Backport #48129: octopus: some clients may return failure in the scenario where multiple clients create directories at the same time added
#5 Updated by Nathan Cutler almost 3 years ago
- Copied to Backport #48130: nautilus: some clients may return failure in the scenario where multiple clients create directories at the same time added
#6 Updated by Nathan Cutler over 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".