Bug #50216

qa: "ls: cannot access 'lost+found': No such file or directory"

Added by Patrick Donnelly 14 days ago. Updated 1 day ago.

Status: Fix Under Review
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: pacific,octopus,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-04-07T03:51:31.570 DEBUG:teuthology.orchestra.run.smithi044:> sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /bin/mount -t ceph :/ /home/ubuntu/cephtest/mnt.0 -v -o norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.716 INFO:teuthology.orchestra.run.smithi044.stdout:parsing options: rw,norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.717 INFO:teuthology.orchestra.run.smithi044.stdout:mount.ceph: options "norequire_active_mds,name=0,norbytes" will pass to kernel.
2021-04-07T03:51:31.759 INFO:teuthology.orchestra.run.smithi044.stdout:parsing options: rw,norequire_active_mds,name=0,conf=/etc/ceph/ceph.conf,norbytes
2021-04-07T03:51:31.759 INFO:teuthology.orchestra.run.smithi044.stdout:mount.ceph: options "norequire_active_mds,name=0,norbytes" will pass to kernel.
2021-04-07T03:51:31.771 INFO:tasks.cephfs.kernel_mount:mount command passed
2021-04-07T03:51:31.772 INFO:teuthology.orchestra.run:Running command with timeout 300
2021-04-07T03:51:31.773 DEBUG:teuthology.orchestra.run.smithi044:> sudo chmod 1777 /home/ubuntu/cephtest/mnt.0
2021-04-07T03:51:31.876 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-04-07T03:51:31.877 DEBUG:teuthology.orchestra.run.smithi044:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo ls)
2021-04-07T03:51:32.277 INFO:teuthology.orchestra.run.smithi044.stdout:lost+found
2021-04-07T03:51:32.282 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-04-07T03:51:32.282 DEBUG:teuthology.orchestra.run.smithi044:> (cd /home/ubuntu/cephtest/mnt.0 && exec sudo ls lost+found)
2021-04-07T03:51:32.954 INFO:teuthology.orchestra.run.smithi044.stderr:ls: cannot access 'lost+found': No such file or directory

From: /ceph/teuthology-archive/pdonnell-2021-04-07_02:12:41-fs-wip-pdonnell-testing-20210406.213012-distro-basic-smithi/6025589/teuthology.log

Rather bizarre considering the client just returned that lost+found exists via ls.
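
The inconsistency can be checked for directly: a name returned by readdir should also resolve on lookup. Below is a minimal sketch of such a check, not part of the qa suite. The function name `readdir_lookup_mismatches` and its injectable `listdir`/`lstat` parameters are made up for illustration; the defaults are the standard `os.listdir` and `os.lstat`.

```python
import os
from typing import Callable, Iterable, List


def readdir_lookup_mismatches(
    root: str,
    listdir: Callable[[str], Iterable[str]] = os.listdir,
    lstat: Callable[[str], object] = os.lstat,
) -> List[str]:
    """Return names that a directory listing reports under root
    but that fail with ENOENT when looked up by path."""
    missing = []
    for name in sorted(listdir(root)):
        try:
            # A lookup on a name we were just handed by readdir.
            lstat(os.path.join(root, name))
        except FileNotFoundError:
            missing.append(name)
    return missing
```

Run against the mount point from the log (/home/ubuntu/cephtest/mnt.0), a non-empty result containing "lost+found" would correspond to the failure shown above.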

History

#1 Updated by Xiubo Li 8 days ago

  • Status changed from New to In Progress

#2 Updated by Xiubo Li 6 days ago

After https://github.com/ceph/ceph/pull/40868, I can reproduce it locally:

2021-04-15 17:07:33,805.805 INFO:__main__:----------------------------------------------------------------------
2021-04-15 17:07:33,805.805 INFO:__main__:Ran 1 test in 57.004s
2021-04-15 17:07:33,805.805 INFO:__main__:
2021-04-15 17:07:33,805.805 INFO:__main__:FAILED (errors=1)
2021-04-15 17:07:33,806.806 INFO:__main__:
2021-04-15 17:07:33,806.806 INFO:__main__:
2021-04-15 17:07:33,806.806 INFO:__main__:======================================================================
2021-04-15 17:07:33,806.806 INFO:__main__:ERROR: test_rebuild_backtraceless (tasks.cephfs.test_data_scan.TestDataScan)
2021-04-15 17:07:33,806.806 INFO:__main__:----------------------------------------------------------------------
2021-04-15 17:07:33,806.806 INFO:__main__:Traceback (most recent call last):
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 401, in test_rebuild_backtraceless
2021-04-15 17:07:33,806.806 INFO:__main__:    self._rebuild_metadata(BacktracelessFile(self.fs, self.mount_a))
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 384, in _rebuild_metadata
2021-04-15 17:07:33,806.806 INFO:__main__:    errors = workload.validate()
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/test_data_scan.py", line 128, in validate
2021-04-15 17:07:33,806.806 INFO:__main__:    self.assert_equal(self._mount.ls("lost+found"), [ino_name])
2021-04-15 17:07:33,806.806 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/mount.py", line 1251, in ls
2021-04-15 17:07:33,807.807 INFO:__main__:    ls_text = self.run_shell(cmd).stdout.getvalue().strip()
2021-04-15 17:07:33,807.807 INFO:__main__:  File "/data/ceph/qa/tasks/cephfs/mount.py", line 704, in run_shell
2021-04-15 17:07:33,807.807 INFO:__main__:    return self.client_remote.run(args=args, cwd=cwd, timeout=timeout, stdout=stdout, stderr=stderr, **kwargs)
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 413, in run
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 479, in _do_run
2021-04-15 17:07:33,807.807 INFO:__main__:  File "../qa/tasks/vstart_runner.py", line 216, in wait
2021-04-15 17:07:33,807.807 INFO:__main__:teuthology.exceptions.CommandFailedError: Command failed with status 2: ['ls', 'lost+found']
2021-04-15 17:07:33,807.807 INFO:__main__:
2021-04-15 17:07:33,832.832 INFO:__main__:
2021-04-15 17:07:33,833.833 INFO:__main__:

#3 Updated by Xiubo Li 3 days ago

In our previous private-inodes fixes, we missed the "lost+found" dir, whose ino is 0x4.head.

2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 0: 'lost+found'
2021-04-19T09:53:31.601+0800 147cbd747700 10 client.5478 update_inode_file_time 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=- 0x147c94009f40) - ctime 0.000000 mtime 0.000000
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478  dir hash is 2
2021-04-19T09:53:31.601+0800 147cbd747700 12 client.5478 add_update_inode adding 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=- 0x147c94009f40) caps pAsLsXsFs
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 get_snap_realm 0x1 0x147c94007480 2 -> 3
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 add_update_cap first one, opened snaprealm 0x147c94007480
2021-04-19T09:53:31.601+0800 147cbd747700 10 client.5478 add_update_cap issued - -> pAsLsXsFs from mds.0 on 0x4.head(faked_ino=0 ref=0 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=pAsLsXsFs(0=pAsLsXsFs) 0x147c94009f40)
2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 link dir 0x147c94009770 'lost+found' to inode 0x147c94009f40 dn 0x147c9400a4d0 (new dn)
2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on 0x147c94009f40 0x4.head now 1
2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on 0x147c94009f40 0x4.head now 2
2021-04-19T09:53:31.601+0800 147cbd747700 20 client.5478 put_inode on 0x4.head(faked_ino=0 ref=2 ll_ref=0 cap_refs={} open={} mode=40755 size=0/0 nlink=1 btime=0.000000 mtime=0.000000 ctime=0.000000 caps=pAsLsXsFs(0=pAsLsXsFs) parents=0x1.head["lost+found"] 0x147c94009f40) n = 1
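
To pull the dentry-to-ino mapping out of a client log like this one, something along these lines may help. It is only a sketch: the two regular expressions are inferred from the "link dir" and "inode.get" lines quoted in this comment and may not match other log formats or debug levels, and `dentry_to_ino` is a made-up helper name, not an existing tool.

```python
import re

# Patterns inferred from the excerpt in this comment (an assumption,
# not a documented log format).
LINK_RE = re.compile(r"link dir (0x[0-9a-f]+) '([^']+)' to inode (0x[0-9a-f]+)")
GET_RE = re.compile(r"inode\.get on (0x[0-9a-f]+) (0x[0-9a-f]+)\.head")


def dentry_to_ino(lines):
    """Map dentry name -> inode number by joining 'link dir' lines
    (name -> in-memory inode pointer) with 'inode.get' lines
    (in-memory inode pointer -> ino)."""
    ptr_to_name = {}
    name_to_ino = {}
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            ptr_to_name[m.group(3)] = m.group(2)
            continue
        m = GET_RE.search(line)
        if m and m.group(1) in ptr_to_name:
            name_to_ino[ptr_to_name[m.group(1)]] = int(m.group(2), 16)
    return name_to_ino


# Two lines taken from the excerpt in this comment.
SAMPLE = [
    "2021-04-19T09:53:31.601+0800 147cbd747700 15 client.5478 link dir "
    "0x147c94009770 'lost+found' to inode 0x147c94009f40 dn 0x147c9400a4d0 (new dn)",
    "2021-04-19T09:53:31.601+0800 147cbd747700 15 inode.get on "
    "0x147c94009f40 0x4.head now 1",
]

if __name__ == "__main__":
    print(dentry_to_ino(SAMPLE))
```

For the two sample lines, the mapping ties 'lost+found' to ino 0x4, matching the 0x4.head entries in the log.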

#4 Updated by Xiubo Li 3 days ago

  • Pull request ID set to 40903

#5 Updated by Xiubo Li 3 days ago

  • Status changed from In Progress to Fix Under Review

#7 Updated by Jeff Layton 2 days ago

How do you reproduce this on the kclient? The MDS would have to give this inode number as the result of a lookup, I'd think. What dentry maps to the lost+found inode, given that I don't see such a directory on cephfs?

#8 Updated by Xiubo Li 1 day ago

Jeff Layton wrote:

How do you reproduce this on the kclient? The MDS would have to give this inode number as the result of a lookup, I'd think. What dentry maps to the lost+found inode, given that I don't see such a directory on cephfs?

Please see "qa/tasks/cephfs/test_data_scan.py", which does:

self.assert_equal(self._mount.ls(), ["lost+found"])
self.assert_equal(self._mount.ls("lost+found"), [ino_name])

It should be reproducible by running:

# python3 ../qa/tasks/vstart_runner.py tasks.cephfs.test_data_scan.TestDataScan.test_rebuild_backtraceless --kclient
