Bug #55824

Updated by Xiubo Li almost 2 years ago

https://pulpito.ceph.com/vshankar-2022-05-31_02:47:51-fs-wip-vshankar-fscrypt-20220530-091336-testing-default-smithi/6853737/ 


 Mounting via *_ceph-fuse_* failed: 

 <pre> 
 2022-05-31T03:13:43.772 DEBUG:teuthology.orchestra.run:got remote process result: 32 
 2022-05-31T03:13:43.773 INFO:teuthology.orchestra.run:Running command with timeout 300 
 2022-05-31T03:13:43.773 DEBUG:teuthology.orchestra.run.smithi045:> ls /sys/fs/fuse/connections 
 2022-05-31T03:13:43.806 INFO:tasks.cephfs.fuse_mount.ceph-fuse.admin.smithi045.stderr:2022-05-31T03:13:43.806+0000 7f7ece3c53c0 -1 init, newargv = 0x55c8f14132f0 newargc=15 
 2022-05-31T03:13:43.806 INFO:tasks.cephfs.fuse_mount.ceph-fuse.admin.smithi045.stderr:ceph-fuse[88614]: starting ceph client 
 2022-05-31T03:13:43.811 INFO:tasks.cephfs.fuse_mount.ceph-fuse.admin.smithi045.stderr:ceph-fuse[88614]: probably no MDS server is up? 
 2022-05-31T03:13:43.812 INFO:tasks.cephfs.fuse_mount.ceph-fuse.admin.smithi045.stderr:ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536 
 2022-05-31T03:13:43.975 INFO:tasks.cephfs.fuse_mount.ceph-fuse.admin.smithi045.stderr:daemon-helper: command failed with exit status 1 
 2022-05-31T03:13:44.931 DEBUG:teuthology.orchestra.run.smithi045:> sudo modprobe fuse 
 2022-05-31T03:13:45.023 INFO:teuthology.orchestra.run:Running command with timeout 30 
 2022-05-31T03:13:45.023 DEBUG:teuthology.orchestra.run.smithi045:> sudo mount -t fusectl /sys/fs/fuse/connections /sys/fs/fuse/connections 
 2022-05-31T03:13:45.106 INFO:teuthology.orchestra.run.smithi045.stderr:mount: /sys/fs/fuse/connections: /sys/fs/fuse/connections already mounted or mount point busy. 
 2022-05-31T03:13:45.110 DEBUG:teuthology.orchestra.run:got remote process result: 32 
 2022-05-31T03:13:45.111 INFO:teuthology.orchestra.run:Running command with timeout 300 
 2022-05-31T03:13:45.111 DEBUG:teuthology.orchestra.run.smithi045:> ls /sys/fs/fuse/connections 
 2022-05-31T03:13:45.165 DEBUG:teuthology.orchestra.run:got remote process result: 1 
 2022-05-31T03:13:45.165 INFO:tasks.cephfs.fuse_mount:mount command failed. 
 2022-05-31T03:13:45.165 ERROR:teuthology.contextutil:Saw exception from nested tasks 
 Traceback (most recent call last): 
   File "/home/teuthworker/src/git.ceph.com_git_teuthology_1b30281f276b97a8186594b7f92fe1f728418ada/teuthology/contextutil.py", line 31, in nested 
     vars.append(enter()) 
   File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__ 
     return next(self.gen) 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/ceph.py", line 447, in cephfs_setup 
     fs = Filesystem(ctx, fs_config=temp, name=name, create=True) 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/filesystem.py", line 512, in __init__ 
     self.create() 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/filesystem.py", line 680, in create 
     self.run_client_payload(f"setfattr -n ceph.dir.layout.pool -v {ec_data_pool_name} . && getfattr -n ceph.dir.layout .") 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/filesystem.py", line 718, in run_client_payload 
     m.mount_wait() 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/mount.py", line 453, in mount_wait 
     self.mount(**kwargs) 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/fuse_mount.py", line 52, in mount 
     return self._mount(mntopts, check_status) 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/fuse_mount.py", line 68, in _mount 
     retval = self._run_mount_cmd(mntopts, check_status) 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/fuse_mount.py", line 96, in _run_mount_cmd 
     check_status, pre_mount_conns, mountcmd_stdout, mountcmd_stderr) 
   File "/home/teuthworker/src/git.ceph.com_ceph-c_9e1d1677a609de7d04588e347b7aaa17ce9b38a2/qa/tasks/cephfs/fuse_mount.py", line 171, in _wait_and_record_our_fuse_conn 
     self.fuse_daemon.wait() 
   File "/home/teuthworker/src/git.ceph.com_git_teuthology_1b30281f276b97a8186594b7f92fe1f728418ada/teuthology/orchestra/run.py", line 161, in wait 
     self._raise_for_status() 
   File "/home/teuthworker/src/git.ceph.com_git_teuthology_1b30281f276b97a8186594b7f92fe1f728418ada/teuthology/orchestra/run.py", line 183, in _raise_for_status 
     node=self.hostname, label=self.label 
 teuthology.exceptions.CommandFailedError: Command failed on smithi045 with status 1: "(cd /home/ubuntu/cephtest && exec sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.admin sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-fuse -f --admin-socket '/var/run/ceph/$cluster-$name.$pid.asok' /home/ubuntu/cephtest/mnt.admin --id admin --client_fs=cephfs)" 
 2022-05-31T03:13:45.168 INFO:teuthology.misc:Shutting down mds daemons... 
 2022-05-31T03:13:45.168 DEBUG:tasks.ceph.mds.a:waiting for process to exit 
 2022-05-31T03:13:45.168 INFO:teuthology.orchestra.run:waiting for 300 
 </pre> 
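 The *ceph mount failed with (65536) Unknown error 65536* line stands out: 65536 is 0x10000, which lies outside the range of standard errno values, so strerror() has no name for it. It is presumably a ceph-fuse-internal status code accompanying the "probably no MDS server is up?" message rather than a kernel errno. A quick check (plain Python, illustrative only, not Ceph code):

```python
import errno

code = 65536  # the "(65536)" from the ceph-fuse mount failure above

# 65536 == 0x10000 sits just above the 16-bit range and is not a
# standard errno value, which is why it renders as "Unknown error".
print(hex(code))                 # 0x10000
print(code in errno.errorcode)   # False
```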

 From the *_mds.0_* logs: 

 <pre> 
 2022-05-31T03:13:43.482+0000 7f21b5089700 10 mds.0.cache cache not ready for trimming 
 2022-05-31T03:13:43.482+0000 7f21b5089700 20 mds.0.cache upkeep thread waiting interval 1.000000000s 
 2022-05-31T03:13:43.493+0000 7f21b708d700    5 mds.beacon.d Sending beacon up:active seq 5 
 2022-05-31T03:13:43.493+0000 7f21b708d700    1 -- [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] --> [v2:172.21.15.136:3300/0,v1:172.21.15.136:6789/0] -- mdsbeacon(4270/d up:active seq=5 v5) v8 -- 0x5639ce331340 con 0x5639ce159400 
 2022-05-31T03:13:43.493+0000 7f21b708d700 20 mds.beacon.d sender thread waiting interval 4s 
 2022-05-31T03:13:43.957+0000 7f21b9091700    1 -- [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] <== mon.1 v2:172.21.15.136:3300/0 49 ==== mdsmap(e 6) v2 ==== 1094+0+0 (secure 0 0 0) 0x5639ce1bb380 con 0x5639ce159400 
 2022-05-31T03:13:43.957+0000 7f21b9091700    1 mds.d Updating MDS map to version 6 from mon.1 
 2022-05-31T03:13:43.957+0000 7f21b9091700 10 mds.d        my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2} 
 2022-05-31T03:13:43.957+0000 7f21b9091700 10 mds.d    mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2} 
 2022-05-31T03:13:43.957+0000 7f21b9091700 10 mds.d my gid is 4270 
 2022-05-31T03:13:43.957+0000 7f21b9091700 10 mds.d map says I am mds.0.4 state up:active 
 2022-05-31T03:13:43.957+0000 7f21b9091700 10 mds.d msgr says I am [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] 
 2022-05-31T03:13:43.957+0000 7f21b9091700 10 mds.d handle_mds_map: handling map as rank 0 
 2022-05-31T03:13:43.959+0000 7f21bc097700    1 -- [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] <== mon.1 v2:172.21.15.136:3300/0 50 ==== mdsbeacon(4270/d up:active seq=5 v6) v8 ==== 130+0+0 (secure 0 0 0) 0x5639ce331080 con 0x5639ce159400 
 2022-05-31T03:13:43.959+0000 7f21bc097700    5 mds.beacon.d received beacon reply up:active seq 5 rtt 0.46599 
 2022-05-31T03:13:44.051+0000 7f21b688c700    1 -- [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] --> [v2:172.21.15.136:6832/87765,v1:172.21.15.136:6833/87765] -- mgrreport(unknown.d +0-0 packed 1534 task_status=0) v9 -- 0x5639cd42d180 con 0x5639ce159800 
 </pre> 


 The client logs: 

 <pre> 
 2022-05-31T03:13:43.811+0000 7f7eaa7fc700 20 client.4441 tick 
 2022-05-31T03:13:43.811+0000 7f7eaa7fc700 20 client.4441 collect_and_send_metrics 
 2022-05-31T03:13:43.811+0000 7f7eaa7fc700 20 client.4441 collect_and_send_global_metrics 
 2022-05-31T03:13:43.811+0000 7f7eaa7fc700    5 client.4441 collect_and_send_global_metrics: no session with rank=0 -- not sending metric 
 2022-05-31T03:13:43.811+0000 7f7eaa7fc700 20 client.4441 trim_cache size 0 max 16384 
 2022-05-31T03:13:43.811+0000 7f7eaa7fc700 20 client.4441 upkeep thread waiting interval 1.000000000s 
 2022-05-31T03:13:43.812+0000 7f7eabfff700    1 client.4441 handle_mds_map epoch 5 
 2022-05-31T03:13:43.812+0000 7f7ece3c53c0 10 client.4441 mds cluster unavailable: epoch=5 
 2022-05-31T03:13:43.812+0000 7f7ece3c53c0    1 client.4441 shutdown 
 2022-05-31T03:13:43.812+0000 7f7ece3c53c0 10 client.4441 shutdown stopping cache invalidator finisher 
 2022-05-31T03:13:43.812+0000 7f7ece3c53c0 10 client.4441 shutdown stopping dentry invalidator finisher 
 2022-05-31T03:13:43.812+0000 7f7ece3c53c0 10 client.4441 shutdown stopping interrupt finisher 
 2022-05-31T03:13:43.812+0000 7f7ece3c53c0 10 client.4441 shutdown stopping remount finisher 
 </pre> 
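 One detail worth noting across the two logs: the client declared the MDS cluster unavailable based on mdsmap epoch 5 at 03:13:43.812 and shut down immediately, while the epoch-6 map showing *_mds.d_* up:active was only delivered (to the MDS side) at 03:13:43.957. A quick check of the gap, computed from the timestamps quoted above (plain Python, not Ceph code):

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f%z"

# Timestamps copied from the logs above.
client_gave_up   = datetime.strptime("2022-05-31T03:13:43.812+0000", FMT)  # "mds cluster unavailable: epoch=5"
epoch6_delivered = datetime.strptime("2022-05-31T03:13:43.957+0000", FMT)  # "mdsmap(e 6)" received by mds.d

gap_ms = round((epoch6_delivered - client_gave_up).total_seconds() * 1000)
print(f"client shut down {gap_ms} ms before the epoch-6 mdsmap arrived")
```

So the client gave up roughly 145 ms before the map that would have shown an active MDS was published, which may point at a race between mount-time availability checking and mdsmap propagation.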


 It is still not clear why *_ceph-fuse_* failed to detect any active MDS.
