Bug #50019

qa: mount failure with cephadm "probably no MDS server is up?"

Added by Patrick Donnelly 26 days ago.

Status: Need More Info
Priority: Normal
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS): qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-03-25T06:06:23.260 DEBUG:teuthology.orchestra.run.smithi144:> (cd /home/ubuntu/cephtest && exec sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.1 sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-fuse -f --admin-socket '/var/run/ceph/$cluster-$name.$pid.asok' /home/ubuntu/cephtest/mnt.1 --id 1 --client_fs=bar)
2021-03-25T06:06:23.304 DEBUG:teuthology.orchestra.run.smithi144:> sudo modprobe fuse
2021-03-25T06:06:23.328 INFO:teuthology.orchestra.run:Running command with timeout 30
2021-03-25T06:06:23.329 DEBUG:teuthology.orchestra.run.smithi144:> sudo mount -t fusectl /sys/fs/fuse/connections /sys/fs/fuse/connections
2021-03-25T06:06:23.393 INFO:teuthology.orchestra.run.smithi144.stderr:mount: /sys/fs/fuse/connections: /sys/fs/fuse/connections already mounted or mount point busy.
2021-03-25T06:06:23.395 DEBUG:teuthology.orchestra.run:got remote process result: 32
2021-03-25T06:06:23.395 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-03-25T06:06:23.396 DEBUG:teuthology.orchestra.run.smithi144:> ls /sys/fs/fuse/connections
2021-03-25T06:06:23.424 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi144.stderr:2021-03-25T06:06:23.421+0000 7f10f0bf1200 -1 init, newargv = 0x558fbd648540 newargc=15
2021-03-25T06:06:23.424 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi144.stderr:ceph-fuse[49463]: starting ceph client
2021-03-25T06:06:23.426 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi144.stderr:ceph-fuse[49463]: probably no MDS server is up?
2021-03-25T06:06:23.426 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi144.stderr:ceph-fuse[49463]: ceph mount failed with (65536) Unknown error 65536
2021-03-25T06:06:23.598 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi144.stderr:daemon-helper: command failed with exit status 1
2021-03-25T06:06:23.973 INFO:journalctl@ceph.mon.smithi019.smithi019.stdout:Mar 25 06:06:23 smithi019 conmon[26072]: cluster 2021-03-25T06:06:22.655578+0000 mon.smithi019 (
2021-03-25T06:06:23.973 INFO:journalctl@ceph.mon.smithi019.smithi019.stdout:Mar 25 06:06:23 smithi019 conmon[26072]: mon.0) 642 : cluster [DBG] mds.? [v2:172.21.15.144:6834/3277615949,v1:172.21.15.144:6835/3277615949] up:rejoin
2021-03-25T06:06:23.974 INFO:journalctl@ceph.mon.smithi019.smithi019.stdout:Mar 25 06:06:23 smithi019 conmon[26072]: cluster 2021-03-25T06:06:22.655660+0000 mon.smithi019 (mon.0) 643 : cluster [DBG] fsmap foo:1 bar:1/1 {bar:0=bar.smithi144.fpuqaw=up:rejoin,foo:0=foo.smithi019.drrlmw=up:active} 1 up:standby
2021-03-25T06:06:23.974 INFO:journalctl@ceph.mon.smithi019.smithi019.stdout:Mar 25 06:06:23 smithi019 conmon[26072]: cluster 2021-03-25T06:06:22.656029+0000 mon.smithi019 (mon.0) 644 : cluster [INF] daemon mds.bar.smithi144.fpuqaw is now active in filesystem bar as rank 0
2021-03-25T06:06:23.974 INFO:journalctl@ceph.mon.smithi019.smithi019.stdout:Mar 25 06:06:23 smithi019 conmon[26072]: cluster 2021-03-25T06:06:22.692448+0000 mon.smithi019 (mon.0) 645 : cluster [DBG] osdmap e64: 8 total, 8 up, 8 in
2021-03-25T06:06:23.974 INFO:journalctl@ceph.mon.smithi019.smithi019.stdout:Mar 25 06:06:23 smithi019 conmon[26072]: cluster 2021-03-25T06:06:23.429580+0000 mgr.smithi019.pfbjuu (mgr.14168) 260 : cluster [DBG] pgmap v234: 129 pgs: 129 active+clean; 4.9 KiB data, 46 MiB used, 715 GiB / 715 GiB avail; 853 B/s rd, 170 B/s wr, 1 op/s
2021-03-25T06:06:23.975 INFO:journalctl@ceph.mon.smithi144.smithi144.stdout:Mar 25 06:06:23 smithi144 conmon[27520]: cluster 2021-03-25T06:06:22.655578+0000 mon.smithi019 (
2021-03-25T06:06:23.975 INFO:journalctl@ceph.mon.smithi144.smithi144.stdout:Mar 25 06:06:23 smithi144 conmon[27520]: mon.0) 642 : cluster [DBG] mds.? [v2:172.21.15.144:6834/3277615949,v1:172.21.15.144:6835/3277615949] up:rejoin
2021-03-25T06:06:23.975 INFO:journalctl@ceph.mon.smithi144.smithi144.stdout:Mar 25 06:06:23 smithi144 conmon[27520]: cluster 2021-03-25T06:06:22.655660+0000 mon.smithi019 (mon.0) 643 : cluster [DBG] fsmap foo:1 bar:1/1 {bar:0=bar.smithi144.fpuqaw=up:rejoin,foo:0=foo.smithi019.drrlmw=up:active} 1 up:standby
2021-03-25T06:06:23.975 INFO:journalctl@ceph.mon.smithi144.smithi144.stdout:Mar 25 06:06:23 smithi144 conmon[27520]: cluster 2021-03-25T06:06:22.656029+0000 mon.smithi019 (mon.0) 644 : cluster [INF] daemon mds.bar.smithi144.fpuqaw is now active in filesystem bar as rank 0
2021-03-25T06:06:23.975 INFO:journalctl@ceph.mon.smithi144.smithi144.stdout:Mar 25 06:06:23 smithi144 conmon[27520]: cluster 2021-03-25T06:06:22.692448+0000 mon.smithi019 (mon.0) 645 : cluster [DBG] osdmap e64: 8 total, 8 up, 8 in
2021-03-25T06:06:23.976 INFO:journalctl@ceph.mon.smithi144.smithi144.stdout:Mar 25 06:06:23 smithi144 conmon[27520]: cluster 2021-03-25T06:06:23.429580+0000 mgr.smithi019.pfbjuu (mgr.14168) 260 : cluster [DBG] pgmap v234: 129 pgs: 129 active+clean; 4.9 KiB data, 46 MiB used, 715 GiB / 715 GiB avail; 853 B/s rd, 170 B/s wr, 1 op/s
2021-03-25T06:06:24.539 DEBUG:teuthology.orchestra.run.smithi144:> sudo modprobe fuse
2021-03-25T06:06:24.566 INFO:teuthology.orchestra.run:Running command with timeout 30
2021-03-25T06:06:24.567 DEBUG:teuthology.orchestra.run.smithi144:> sudo mount -t fusectl /sys/fs/fuse/connections /sys/fs/fuse/connections
2021-03-25T06:06:24.631 INFO:teuthology.orchestra.run.smithi144.stderr:mount: /sys/fs/fuse/connections: /sys/fs/fuse/connections already mounted or mount point busy.
2021-03-25T06:06:24.632 DEBUG:teuthology.orchestra.run:got remote process result: 32
2021-03-25T06:06:24.633 INFO:teuthology.orchestra.run:Running command with timeout 900
2021-03-25T06:06:24.633 DEBUG:teuthology.orchestra.run.smithi144:> ls /sys/fs/fuse/connections
2021-03-25T06:06:24.688 DEBUG:teuthology.orchestra.run:got remote process result: 1
2021-03-25T06:06:24.688 INFO:tasks.cephfs.fuse_mount:mount command failed.
2021-03-25T06:06:24.688 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_6b3150e9e0aa7ca432e26f31d87920ebd77f3708/teuthology/run_tasks.py", line 94, in run_tasks
    manager.__enter__()
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_0691d6bed3e3aaf89688b125297e25f6f6c3fae2/qa/tasks/ceph_fuse.py", line 161, in task
    mount_x.mount()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_0691d6bed3e3aaf89688b125297e25f6f6c3fae2/qa/tasks/cephfs/fuse_mount.py", line 51, in mount
    return self._mount(mntopts, check_status)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_0691d6bed3e3aaf89688b125297e25f6f6c3fae2/qa/tasks/cephfs/fuse_mount.py", line 67, in _mount
    retval = self._run_mount_cmd(mntopts, check_status)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_0691d6bed3e3aaf89688b125297e25f6f6c3fae2/qa/tasks/cephfs/fuse_mount.py", line 95, in _run_mount_cmd
    check_status, pre_mount_conns, mountcmd_stdout, mountcmd_stderr)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_0691d6bed3e3aaf89688b125297e25f6f6c3fae2/qa/tasks/cephfs/fuse_mount.py", line 170, in _wait_and_record_our_fuse_conn
    self.fuse_daemon.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_6b3150e9e0aa7ca432e26f31d87920ebd77f3708/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_6b3150e9e0aa7ca432e26f31d87920ebd77f3708/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on smithi144 with status 1: "(cd /home/ubuntu/cephtest && exec sudo nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.1 sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-fuse -f --admin-socket '/var/run/ceph/$cluster-$name.$pid.asok' /home/ubuntu/cephtest/mnt.1 --id 1 --client_fs=bar)" 
2021-03-25T06:06:24.791 ERROR:teuthology.run_tasks: Sentry event: https://sentry.ceph.com/organizations/ceph/?query=e2c53b6ed56d464fb5d5bf944cd11501

From: /ceph/teuthology-archive/pdonnell-2021-03-24_23:26:35-fs-wip-pdonnell-testing-20210324.190252-distro-basic-smithi/5995909/teuthology.log

The fs.ready check passed correctly. The cause of the mount failure can't be determined because debug logging was not enabled on the client. (Note that the mon log above shows mds.bar.smithi144.fpuqaw still in up:rejoin at the time of the mount attempt, becoming active in filesystem bar moments later.)
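For illustration, a minimal sketch of the kind of readiness check involved (this is not the actual teuthology `fs.ready` implementation; the `fs_ready` helper and the simplified fsmap dict below are hypothetical, loosely modeled on the JSON shape of `ceph fs dump` and on the daemon states visible in the log above):

```python
# Hypothetical sketch: decide whether a filesystem is safe to mount by
# requiring every expected rank to report state up:active. A rank still in
# up:rejoin (as mds.bar.smithi144.fpuqaw was here) would make this False.

def fs_ready(fsmap, fs_name):
    """Return True only when fs_name has an active daemon for every rank."""
    for fs in fsmap.get("filesystems", []):
        mdsmap = fs["mdsmap"]
        if mdsmap["fs_name"] != fs_name:
            continue
        infos = mdsmap.get("info", {}).values()
        active = [i for i in infos if i["state"] == "up:active"]
        # every expected rank must have an active daemon
        return len(active) >= mdsmap["max_mds"]
    return False  # unknown filesystem

# Mirroring the fsmap from the mon log: bar's rank 0 is still rejoining,
# foo's rank 0 is already active.
fsmap = {"filesystems": [
    {"mdsmap": {"fs_name": "bar", "max_mds": 1,
                "info": {"gid_1": {"name": "bar.smithi144.fpuqaw",
                                   "state": "up:rejoin"}}}},
    {"mdsmap": {"fs_name": "foo", "max_mds": 1,
                "info": {"gid_2": {"name": "foo.smithi019.drrlmw",
                                   "state": "up:active"}}}},
]}

print(fs_ready(fsmap, "bar"))  # False: rank 0 is still in up:rejoin
print(fs_ready(fsmap, "foo"))  # True
```

Under this model the client mounted bar in the narrow window before rank 0 finished rejoin, which is consistent with the "probably no MDS server is up?" error followed immediately by the "now active in filesystem bar" mon message.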
