Bug #50216
qa: "ls: cannot access 'lost+found': No such file or directory"
Status: Closed
% Done: 0%
Description
2021-04-07T03:51:31.570 DEBUG:teuthology.orchestra.run.smithi044:> sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /bin/mount -t ceph :/ /home/ubuntu/cephtest/mnt.0 -v -o norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.716 INFO:teuthology.orchestra.run.smithi044.stdout:parsing options: rw,norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.717 INFO:teuthology.orchestra.run.smithi044.stdout:mount.ceph: options "norequire_active_mds,name=0,norbytes" will pass to kernel.
2021-04-07T03:51:31.759 INFO:teuthology.orchestra.run.smithi044.stdout:parsing options: rw,norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.759 INFO:teuthology.orchestra.run.smithi044.stdout:mount.ceph: options "norequire_active_mds,name=0,norbytes" will pass to kernel.
2021-04-07T03:51:31.771 INFO:tasks.cephfs.kernel_mount:mount command passed
2021-04-07T03:51:31.772 INFO:teuthology.orchestra.run:Running command with timeout 300
2021-04-07T03:51:31.773 DEBUG:teuthology.orchestra.run.smithi044:> sudo chmod 1777 /home/ubuntu/cephtest/mnt.0
2021-04-07T03:51:31.876 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-04-07T03:51:31.877 DEBUG:teuthology.orchestra.run.smithi044:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo ls)
2021-04-07T03:51:32.277 INFO:teuthology.orchestra.run.smithi044.stdout:lost+found
2021-04-07T03:51:32.282 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-04-07T03:51:32.282 DEBUG:teuthology.orchestra.run.smithi044:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo ls lost+found)
2021-04-07T03:51:32.954 INFO:teuthology.orchestra.run.smithi044.stderr:ls: cannot access 'lost+found': No such file or directory
From: /ceph/teuthology-archive/pdonnell-2021-04-07_02:12:41-fs-wip-pdonnell-testing-20210406.213012-distro-basic-smithi/6025589/teuthology.log
Rather bizarre considering the client just returned that lost+found exists via ls.
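The apparent contradiction can be modeled in a few lines: readdir answers from the parent directory's dentries (names only), while actually accessing an entry requires instantiating its inode, and a client-side "private inode" filter was rejecting lost+found's inode. A toy Python sketch of the buggy behavior (all names and constants here are illustrative, not Ceph source):

```python
# Toy model (NOT Ceph code) of why `ls` can show an entry that
# `ls lost+found` then fails to access. Constants are illustrative.
CEPH_INO_ROOT = 0x1
CEPH_INO_LOST_AND_FOUND = 0x4
PRIVATE_INO_LIMIT = 0x100  # low inode numbers are MDS-internal

# One directory whose only entry is lost+found.
dirents = {"/": {"lost+found": CEPH_INO_LOST_AND_FOUND}}

def readdir(path):
    """List entry names straight from the parent; no inode check."""
    return sorted(dirents[path])

def lookup(path, name):
    """Resolve a name to an inode; the buggy filter rejects 0x4."""
    ino = dirents[path][name]
    if ino < PRIVATE_INO_LIMIT and ino != CEPH_INO_ROOT:
        raise FileNotFoundError(name)  # -> "No such file or directory"
    return ino
```

So `readdir("/")` happily reports "lost+found", but `lookup("/", "lost+found")` fails, which is exactly the `ls` vs `ls lost+found` split seen in the teuthology log.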
Updated by Xiubo Li about 3 years ago
After https://github.com/ceph/ceph/pull/40868, I can reproduce it locally:
2021-04-15 17:07:33,805.805 INFO:__main__:----------------------------------------------------------------------
2021-04-15 17:07:33,805.805 INFO:__main__:Ran 1 test in 57.004s
2021-04-15 17:07:33,805.805 INFO:__main__:
2021-04-15 17:07:33,805.805 INFO:__main__:FAILED (errors=1)
2021-04-15 17:07:33,806.806 INFO:__main__:
2021-04-15 17:07:33,806.806 INFO:__main__:
2021-04-15 17:07:33,806.806 INFO:__main__:======================================================================
2021-04-15 17:07:33,806.806 INFO:__main__:ERROR: test_rebuild_backtraceless (tasks.cephfs.test_data_scan.TestDataScan)
2021-04-15 17:07:33,806.806 INFO:__main__:----------------------------------------------------------------------
2021-04-15 17:07:33,806.806 INFO:__main__:Traceback (most recent call last):
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 401, in test_rebuild_backtraceless
2021-04-15 17:07:33,806.806 INFO:__main__:    self._rebuild_metadata(BacktracelessFile(self.fs, self.mount_a))
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 384, in _rebuild_metadata
2021-04-15 17:07:33,806.806 INFO:__main__:    errors = workload.validate()
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 128, in validate
2021-04-15 17:07:33,806.806 INFO:__main__:    self.assert_equal(self._mount.ls("lost+found"), [ino_name])
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/mount.py", line 1251, in ls
2021-04-15 17:07:33,807.807 INFO:__main__:    ls_text = self.run_shell(cmd).stdout.getvalue().strip()
2021-04-15 17:07:33,807.807 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/mount.py", line 704, in run_shell
2021-04-15 17:07:33,807.807 INFO:__main__:    return self.client_remote.run(args=args, cwd=cwd, timeout=timeout, stdout=stdout, stderr=stderr, **kwargs)
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 413, in run
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 479, in _do_run
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 216, in wait
2021-04-15 17:07:33,807.807 INFO:__main__:teuthology.exceptions.CommandFailedError: Command failed with status 2: ['ls', 'lost+found']
2021-04-15 17:07:33,807.807 INFO:__main__:
2021-04-15 17:07:33,832.832 INFO:__main__:
2021-04-15 17:07:33,833.833 INFO:__main__:
Updated by Xiubo Li about 3 years ago
In our previous private-inodes fixes we missed the "lost+found" dir, whose ino is 0x4.head:
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 0: 'lost+found'
2021-04-19T09:53:31.601+0800 147cbd747700 10 client.5478 update_inode_file_time 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=- 0x147c94009f40) - ctime 0.000000 mtime 0.000000
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 dir hash is 2
2021-04-19T09:53:31.601+0800 147cbd747700 12 client.5478 add_update_inode adding 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=- 0x147c94009f40) caps pAsLsXsFs
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 get_snap_realm 0x1 0x147c94007480 2 -> 3
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 add_update_cap first one, opened snaprealm 0x147c94007480
2021-04-19T09:53:31.601+0800 147cbd747700 10 client.5478 add_update_cap issued - -> pAsLsXsFs from mds.0 on 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=pAsLsXsFs(0=pAsLsXsFs) 0x147c94009f40)
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 link dir 0x147c94009770 'lost+found' to inode 0x147c94009f40 dn 0x147c9400a4d0 (new dn)
2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on 0x147c94009f40 0x4.head now 1
2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on 0x147c94009f40 0x4.head now 2
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 put_inode on 0x4.head(faked_ino=0 ref=2 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=pAsLsXsFs(0=pAsLsXsFs) parents=0x1.head["lost+found"] 0x147c94009f40) n = 1
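The fix amounts to carving lost+found (ino 0x4) out of the private-inode filter so the client instantiates it like any other directory. A rough before/after sketch of that predicate in Python (function and constant names are illustrative, not the actual Ceph or kernel source):

```python
# Illustrative sketch (NOT Ceph source) of the reserved-inode check.
CEPH_INO_ROOT = 0x1
CEPH_INO_LOST_AND_FOUND = 0x4
CEPH_INO_SYSTEM_BASE = 0x100  # inodes below this are MDS-internal

def is_reserved_buggy(ino):
    # Old check: everything below SYSTEM_BASE except the root was
    # treated as MDS-private, so 0x4 (lost+found) was rejected.
    return ino < CEPH_INO_SYSTEM_BASE and ino != CEPH_INO_ROOT

def is_reserved_fixed(ino):
    # Fixed check: lost+found is a real, user-visible directory and
    # must be exempted along with the root.
    return (ino < CEPH_INO_SYSTEM_BASE
            and ino not in (CEPH_INO_ROOT, CEPH_INO_LOST_AND_FOUND))
```

With the old predicate, `is_reserved_buggy(0x4)` is true and the lookup of lost+found errors out with ENOENT; with the fixed one it is allowed through while other low inode numbers stay blocked.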
Updated by Xiubo Li about 3 years ago
- Status changed from In Progress to Fix Under Review
Updated by Xiubo Li about 3 years ago
Fixed it in the kernel client too: https://patchwork.kernel.org/project/ceph-devel/list/?series=469331
Maybe Jeff could fold it into https://patchwork.kernel.org/project/ceph-devel/list/?series=460827.
Updated by Jeff Layton about 3 years ago
How do you reproduce this on the kclient? The MDS would have to give this inode number as the result of a lookup, I'd think. What dentry maps to the lost+found inode, given that I don't see such a directory on cephfs?
Updated by Xiubo Li about 3 years ago
Jeff Layton wrote:
How do you reproduce this on the kclient? The MDS would have to give this inode number as the result of a lookup, I'd think. What dentry maps to the lost+found inode, given that I don't see such a directory on cephfs?
Please see "qa/tasks/cephfs/test_data_scan.py"; it does:
self.assert_equal(self._mount.ls(), ["lost+found"])
self.assert_equal(self._mount.ls("lost+found"), [ino_name])
It should be reproducible by running:
# python3 ../qa/tasks/vstart_runner.py tasks.cephfs.test_data_scan.TestDataScan.test_rebuild_backtraceless --kclient
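For context, the `ls()` helper in qa/tasks/cephfs/mount.py is roughly "run `ls` inside the mount and split its stdout into entry names", so a nonzero exit from `ls` bubbles up as CommandFailedError. A simplified standalone sketch (hypothetical, not the actual helper):

```python
import subprocess

def ls(mount_path, path="."):
    # Run `ls <path>` with the mount root as cwd and split stdout into
    # entry names. check=True raises on a nonzero exit, mirroring how
    # the qa harness surfaces CommandFailedError for `ls lost+found`.
    out = subprocess.run(["ls", path], cwd=mount_path,
                         capture_output=True, text=True, check=True)
    text = out.stdout.strip()
    return text.split("\n") if text else []
```

If the kernel refuses to instantiate the lost+found inode, `ls lost+found` exits with status 2 and this helper raises instead of returning the expected `[ino_name]` list.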
Updated by Patrick Donnelly almost 3 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot almost 3 years ago
- Copied to Backport #50623: octopus: qa: "ls: cannot access 'lost+found': No such file or directory" added
Updated by Backport Bot almost 3 years ago
- Copied to Backport #50624: pacific: qa: "ls: cannot access 'lost+found': No such file or directory" added
Updated by Backport Bot almost 3 years ago
- Copied to Backport #50625: nautilus: qa: "ls: cannot access 'lost+found': No such file or directory" added
Updated by Loïc Dachary almost 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".