Bug #50216: qa: "ls: cannot access 'lost+found': No such file or directory" - CephFS - Ceph

Bug #50216

closed

qa: "ls: cannot access 'lost+found': No such file or directory"

Added by Patrick Donnelly about 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: High
Assignee: Xiubo Li
Category:
Target version: Ceph - v17.0.0
% Done:
Source: Q/A
Tags:
Backport: pacific,octopus,nautilus
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): qa-failure
Pull request ID: 40903
Crash signature (v1):
Crash signature (v2):

Description

2021-04-07T03:51:31.570 DEBUG:teuthology.orchestra.run.smithi044:> sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /bin/mount -t ceph :/ /home/ubuntu/cephtest/mnt.0 -v -o norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.716 INFO:teuthology.orchestra.run.smithi044.stdout:parsing options: rw,norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.717 INFO:teuthology.orchestra.run.smithi044.stdout:mount.ceph: options "norequire_active_mds,name=0,norbytes" will pass to kernel.
2021-04-07T03:51:31.759 INFO:teuthology.orchestra.run.smithi044.stdout:parsing options: rw,norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.759 INFO:teuthology.orchestra.run.smithi044.stdout:mount.ceph: options "norequire_active_mds,name=0,norbytes" will pass to kernel.
2021-04-07T03:51:31.771 INFO:tasks.cephfs.kernel_mount:mount command passed
2021-04-07T03:51:31.772 INFO:teuthology.orchestra.run:Running command with timeout 300
2021-04-07T03:51:31.773 DEBUG:teuthology.orchestra.run.smithi044:> sudo chmod 1777 /home/ubuntu/cephtest/mnt.0
2021-04-07T03:51:31.876 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-04-07T03:51:31.877 DEBUG:teuthology.orchestra.run.smithi044:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo ls)
2021-04-07T03:51:32.277 INFO:teuthology.orchestra.run.smithi044.stdout:lost+found
2021-04-07T03:51:32.282 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-04-07T03:51:32.282 DEBUG:teuthology.orchestra.run.smithi044:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo ls lost+found)
2021-04-07T03:51:32.954 INFO:teuthology.orchestra.run.smithi044.stderr:ls: cannot access 'lost+found': No such file or directory

From: /ceph/teuthology-archive/pdonnell-2021-04-07_02:12:41-fs-wip-pdonnell-testing-20210406.213012-distro-basic-smithi/6025589/teuthology.log

Rather bizarre considering the client just returned that lost+found exists via ls.
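
The mismatch described above (a name returned by a directory listing that then fails a lookup) can be checked mechanically. The following is a minimal sketch, not part of the qa suite: `find_unresolvable_entries` is a hypothetical helper name, and it assumes the mount point is an ordinary directory path readable by the caller. On a mount showing this bug it would be expected to report `lost+found`.

```python
import os


def find_unresolvable_entries(directory):
    """Return names that os.listdir() reports but os.lstat() cannot resolve."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            # lstat forces a lookup of the name, similar to what `ls NAME` needs.
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            broken.append(name)
    return broken
```

On a healthy directory the function returns an empty list; any name it returns was visible to readdir but not to lookup.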

Related issues 3 (0 open — 3 closed)

Updated by Xiubo Li about 3 years ago

Status changed from New to In Progress

Updated by Xiubo Li about 3 years ago

After https://github.com/ceph/ceph/pull/40868, I can reproduce it locally:

2021-04-15 17:07:33,805.805 INFO:__main__:----------------------------------------------------------------------
2021-04-15 17:07:33,805.805 INFO:__main__:Ran 1 test in 57.004s
2021-04-15 17:07:33,805.805 INFO:__main__:
2021-04-15 17:07:33,805.805 INFO:__main__:FAILED (errors=1)
2021-04-15 17:07:33,806.806 INFO:__main__:
2021-04-15 17:07:33,806.806 INFO:__main__:
2021-04-15 17:07:33,806.806 INFO:__main__:======================================================================
2021-04-15 17:07:33,806.806 INFO:__main__:ERROR: test_rebuild_backtraceless (tasks.cephfs.test_data_scan.TestDataScan)
2021-04-15 17:07:33,806.806 INFO:__main__:----------------------------------------------------------------------
2021-04-15 17:07:33,806.806 INFO:__main__:Traceback (most recent call last):
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 401, in test_rebuild_backtraceless
2021-04-15 17:07:33,806.806 INFO:__main__:    self._rebuild_metadata(BacktracelessFile(self.fs, self.mount_a))
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 384, in _rebuild_metadata
2021-04-15 17:07:33,806.806 INFO:__main__:    errors = workload.validate()
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 128, in validate
2021-04-15 17:07:33,806.806 INFO:__main__:    self.assert_equal(self._mount.ls("lost+found"), [ino_name])
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/mount.py", line 1251, in ls
2021-04-15 17:07:33,807.807 INFO:__main__:    ls_text = self.run_shell(cmd).stdout.getvalue().strip()
2021-04-15 17:07:33,807.807 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/mount.py", line 704, in run_shell
2021-04-15 17:07:33,807.807 INFO:__main__:    return self.client_remote.run(args=args, cwd=cwd, timeout=timeout, stdout=stdout, stderr=stderr, **kwargs)
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 413, in run
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 479, in _do_run
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 216, in wait
2021-04-15 17:07:33,807.807 INFO:__main__:teuthology.exceptions.CommandFailedError: Command failed with status 2: ['ls', 'lost+found']
2021-04-15 17:07:33,807.807 INFO:__main__:
2021-04-15 17:07:33,832.832 INFO:__main__:
2021-04-15 17:07:33,833.833 INFO:__main__:

Updated by Xiubo Li about 3 years ago

In our previous Private-Inodes fixes, we missed the "lost+found" dir, whose ino is 0x4.head.

2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 0: 'lost+found'
2021-04-19T09:53:31.601+0800 147cbd747700 10 client.5478 update_inode_file_time 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=- 0x147c94009f40) - ctime 0.000000 mtime 0.000000
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478  dir hash is 2
2021-04-19T09:53:31.601+0800 147cbd747700 12 client.5478 add_update_inode adding 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=- 0x147c94009f40) caps pAsLsXsFs
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 get_snap_realm 0x1 0x147c94007480 2 -> 3
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 add_update_cap first one, opened snaprealm 0x147c94007480
2021-04-19T09:53:31.601+0800 147cbd747700 10 client.5478 add_update_cap issued - -> pAsLsXsFs from mds.0 on 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=pAsLsXsFs(0=pAsLsXsFs) 0x147c94009f40)
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 link dir 0x147c94009770 'lost+found' to inode 0x147c94009f40 dn 0x147c9400a4d0 (new dn)
2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on 0x147c94009f40 0x4.head now 1
2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on 0x147c94009f40 0x4.head now 2
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 put_inode on 0x4.head(faked_ino=0 ref=2 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=pAsLsXsFs(0=pAsLsXsFs) parents=0x1.head["lost+found"] 0x147c94009f40) n = 1
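
To illustrate the kind of oversight described here, the following is a rough sketch of a private-inode check that forgets to exempt lost+found, next to one that does. It is not the code from pull request 40903: the function names and `HYPOTHETICAL_RESERVED_LIMIT` are assumptions made up for illustration, and only the two inode numbers come from the log above.

```python
# Inode numbers seen in the log above.
ROOT_INO = 0x1            # parents=0x1.head
LOST_AND_FOUND_INO = 0x4  # 0x4.head, the 'lost+found' directory

# ASSUMPTION: illustrative upper bound of the private range, not a Ceph constant.
HYPOTHETICAL_RESERVED_LIMIT = 0x1000


def is_reserved_before_fix(ino):
    """Hypothetical check that treats every low inode except root as private."""
    return ino < HYPOTHETICAL_RESERVED_LIMIT and ino != ROOT_INO


def is_reserved_after_fix(ino):
    """Same check, but lost+found is also exempted."""
    if ino in (ROOT_INO, LOST_AND_FOUND_INO):
        return False
    return ino < HYPOTHETICAL_RESERVED_LIMIT
```

Under these assumptions, the first check rejects inode 0x4 even though readdir has already exposed its dentry, which would match the "listed but not accessible" symptom; the second lets it through while still rejecting other low inode numbers.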

Updated by Xiubo Li about 3 years ago

Pull request ID set to 40903

Updated by Xiubo Li about 3 years ago

Status changed from In Progress to Fix Under Review

Updated by Xiubo Li about 3 years ago

Fixed it in the kernel client too: https://patchwork.kernel.org/project/ceph-devel/list/?series=469331

Maybe Jeff could fold it into https://patchwork.kernel.org/project/ceph-devel/list/?series=460827.

Updated by Jeff Layton about 3 years ago

How do you reproduce this on the kclient? The MDS would have to give this inode number as the result of a lookup, I'd think. What dentry maps to the lost+found inode, given that I don't see such a directory on cephfs?

Updated by Xiubo Li almost 3 years ago

Jeff Layton wrote:

How do you reproduce this on the kclient? The MDS would have to give this inode number as the result of a lookup, I'd think. What dentry maps to the lost+found inode, given that I don't see such a directory on cephfs?

Please see "qa/tasks/cephfs/test_data_scan.py", which does:

self.assert_equal(self._mount.ls(), ["lost+found"])
self.assert_equal(self._mount.ls("lost+found"), [ino_name])

It should be reproducible by running:

# python3 ../qa/tasks/vstart_runner.py tasks.cephfs.test_data_scan.TestDataScan.test_rebuild_backtraceless --kclient

Updated by Patrick Donnelly almost 3 years ago

Status changed from Fix Under Review to Pending Backport

Updated by Backport Bot almost 3 years ago

Copied to Backport #50623: octopus: qa: "ls: cannot access 'lost+found': No such file or directory" added

Updated by Backport Bot almost 3 years ago

Copied to Backport #50624: pacific: qa: "ls: cannot access 'lost+found': No such file or directory" added

Updated by Backport Bot almost 3 years ago

Copied to Backport #50625: nautilus: qa: "ls: cannot access 'lost+found': No such file or directory" added

Updated by Loïc Dachary almost 3 years ago

Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
