Project

General

Profile

Bug #46883

kclient: ghost kernel mount

Added by Patrick Donnelly 3 months ago. Updated 10 days ago.

Status:
Fix Under Review
Priority:
High
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
kceph, qa-suite
Labels (FS):
qa, qa-failure
Pull request ID:
Crash signature:

Description

Relevant snippets of teuthology log:

2020-08-08T16:09:17.162 INFO:teuthology.orchestra.run.smithi114:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 900 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json session evict 6737
...
2020-08-08T16:09:18.433 DEBUG:tasks.cephfs.kernel_mount:Unmounting client client.0...
2020-08-08T16:09:18.433 INFO:teuthology.orchestra.run:Running command with timeout 900
2020-08-08T16:09:18.434 INFO:teuthology.orchestra.run.smithi072:> sudo umount /home/ubuntu/cephtest/mnt.0 -f
2020-08-08T16:09:18.479 INFO:teuthology.orchestra.run.smithi072.stderr:umount: /home/ubuntu/cephtest/mnt.0: target is busy.
...
2020-08-08T16:09:20.022 INFO:teuthology.orchestra.run.smithi072:> sudo umount -f -l /home/ubuntu/cephtest/mnt.0
2020-08-08T16:09:20.270 INFO:teuthology.orchestra.run:Running command with timeout 300
2020-08-08T16:09:20.271 INFO:teuthology.orchestra.run.smithi072:> (cd /home/ubuntu/cephtest && exec rmdir -- /home/ubuntu/cephtest/mnt.0)
...
020-08-08T16:09:22.489 INFO:tasks.cephfs.mount:Ready to start KernelMount...
2020-08-08T16:09:22.489 INFO:tasks.cephfs.mount:Setting the 'ceph-ns--home-ubuntu-cephtest-mnt.0' netns for '/home/ubuntu/cephtest/mnt.0'
...
2020-08-08T16:09:22.841 INFO:teuthology.orchestra.run.smithi072:> mkdir -p /home/ubuntu/cephtest/mnt.0
2020-08-08T16:09:22.895 INFO:teuthology.orchestra.run:Running command with timeout 1800
2020-08-08T16:09:22.896 INFO:teuthology.orchestra.run.smithi072:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage nsenter --net=/var/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 /bin/mount -t ceph :/ /home/ubuntu/cephtest/mnt.0 -v -o name=0,norequire_active_mds,conf=/etc/ceph/ceph.conf
2020-08-08T16:09:22.992 INFO:teuthology.orchestra.run.smithi072.stdout:parsing options: rw,name=0,norequire_active_mds,conf=/etc/ceph/ceph.conf
2020-08-08T16:09:22.993 INFO:teuthology.orchestra.run.smithi072.stdout:mount.ceph: options "name=0,norequire_active_mds" will pass to kernel.
2020-08-08T16:09:22.995 INFO:teuthology.orchestra.run.smithi072.stdout:parsing options: rw,name=0,norequire_active_mds,conf=/etc/ceph/ceph.conf
2020-08-08T16:09:22.996 INFO:teuthology.orchestra.run.smithi072.stdout:mount.ceph: options "name=0,norequire_active_mds" will pass to kernel.
2020-08-08T16:09:22.998 INFO:teuthology.orchestra.run:Running command with timeout 300
2020-08-08T16:09:22.998 INFO:teuthology.orchestra.run.smithi072:> sudo chmod 1777 /home/ubuntu/cephtest/mnt.0
2020-08-08T16:09:25.966 INFO:teuthology.orchestra.run.smithi072.stderr:chmod: cannot access '/home/ubuntu/cephtest/mnt.0': Permission denied
2020-08-08T16:09:25.968 DEBUG:teuthology.orchestra.run:got remote process result: 1

From: /ceph/teuthology-archive/pdonnell-2020-08-08_02:19:19-kcephfs-wip-pdonnell-testing-20200808.001303-distro-basic-smithi/5319130/teuthology.log

Then from the kernel syslog:

2020-08-08T16:09:05.425102+00:00 smithi072 kernel: libceph: mon2 (1)172.21.15.197:6790 session established
2020-08-08T16:09:05.425176+00:00 smithi072 kernel: libceph: client6737 fsid 2a4fe1bc-6c9e-4855-8e2e-c3febc31e26b
2020-08-08T16:09:18.271102+00:00 smithi072 kernel: libceph: mds0 (1)172.21.15.114:6835 socket closed (con state OPEN)
2020-08-08T16:09:18.430096+00:00 smithi072 kernel: ceph-brx: port 1(brx.0) entered disabled state
2020-08-08T16:09:18.478710+00:00 smithi072 kernel: ceph:  dropping file locks for 000000007fa0e095 1099511627776
2020-08-08T16:09:18.478762+00:00 smithi072 kernel: ceph:  dropping file locks for 000000007dd06430 1099511627777
2020-08-08T16:09:22.843915+00:00 smithi072 kernel: IPv6: ADDRCONF(NETDEV_UP): brx.0: link is not ready
2020-08-08T16:09:22.843998+00:00 smithi072 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): brx.0: link becomes ready
2020-08-08T16:09:22.844030+00:00 smithi072 kernel: ceph-brx: port 1(brx.0) entered blocking state 
2020-08-08T16:09:22.855959+00:00 smithi072 kernel: ceph-brx: port 1(brx.0) entered forwarding state 
2020-08-08T16:09:25.578999+00:00 smithi072 kernel: libceph: mds0 (1)172.21.15.114:6835 connection reset
2020-08-08T16:09:25.579048+00:00 smithi072 kernel: libceph: reset on mds0
2020-08-08T16:09:25.579072+00:00 smithi072 kernel: ceph: mds0 closed our session
2020-08-08T16:09:25.583476+00:00 smithi072 kernel: ceph: mds0 reconnect start
2020-08-08T16:09:25.597111+00:00 smithi072 kernel: libceph: mds0 (1)172.21.15.114:6835 socket closed (con state NEGOTIATING)
2020-08-08T16:09:25.965071+00:00 smithi072 kernel: ceph: mds0 rejected session

From: /ceph/teuthology-archive/pdonnell-2020-08-08_02:19:19-kcephfs-wip-pdonnell-testing-20200808.001303-distro-basic-smithi/5319130/remote/smithi072/syslog/kern.log.gz

So the client started at 16:09:05 was killed by the test suite 16:09:17. The test kills the mount and cleans up:

https://github.com/ceph/ceph/blob/9b2d0a90fc890029ff559bcc5cc88b01bebb4dea/qa/tasks/cephfs/test_client_recovery.py#L433-L434

and then does a new mount but gets "Permission denied" when running chmod.

What's odd is I don't see the usual evidence of a new kernel mount, i.e. something like

libceph: mon2 (1)172.21.15.197:6790 session established

in the log at 16:09:22. It appears the old mount persists and that is what the MDS is showing in some of its traffic:

2020-08-08T16:09:25.586+0000 7f7509fad700  1 --1- [v2:172.21.15.114:6834/980265380,v1:172.21.15.114:6835/980265380] >>  conn(0x563d8dc88800 0x563d8dbf9000 :6835 s=ACCEPTING pgs=0 cs=0 l=0).send_server_banner sd=38 legacy v1:172.21.15.114:6835/980265380 socket_addr v1:172.21.15.114:6835/980265380 target_addr v1:172.21.15.72:47482/0
2020-08-08T16:09:25.586+0000 7f7509fad700 10 mds.a parse_caps: parsing auth_cap_str='allow'
2020-08-08T16:09:25.586+0000 7f75077a8700 10 mds.a  new session 0x563d8dbc4880 for client.6737 v1:172.21.15.72:0/1351493818 con 0x563d8dc88800
2020-08-08T16:09:25.586+0000 7f75077a8700 10 mds.a parse_caps: parsing auth_cap_str='allow'
2020-08-08T16:09:25.586+0000 7f75077a8700 10 mds.a ms_handle_accept v1:172.21.15.72:0/1351493818 con 0x563d8dc88800 session 0x563d8dbc4880
2020-08-08T16:09:25.586+0000 7f75077a8700  1 -- [v2:172.21.15.114:6834/980265380,v1:172.21.15.114:6835/980265380] <== client.6737 v1:172.21.15.72:0/1351493818 1 ==== client_reconnect(0 caps 0 realms ) v3 ==== 0+0+4 (unknown 0 0 0) 0x563d8db04ea0 con 0x563d8dc88800
2020-08-08T16:09:25.586+0000 7f75077a8700  7 mds.0.server handle_client_reconnect client.6737
2020-08-08T16:09:25.586+0000 7f75077a8700 20 mds.0.82 get_session have 0x563d8dbc4880 client.6737 v1:172.21.15.72:0/1351493818 state closed
2020-08-08T16:09:25.586+0000 7f75077a8700  0 mds.0.server  ignoring msg from not-open sessionclient_reconnect(0 caps 0 realms ) v3

From: /ceph/teuthology-archive/pdonnell-2020-08-08_02:19:19-kcephfs-wip-pdonnell-testing-20200808.001303-distro-basic-smithi/5319130/remote/smithi114/log/ceph-mds.a.log.gz

History

#1 Updated by Jeff Layton 3 months ago

I think the problem here is that teuthology is using umount -l. That just detaches the mount from the tree, but defers cleanup. The superblock can be found again, and if the options, etc. match, then it'll just reuse that instead of creating a new client, etc.

For this test, it may be simplest to create and mount a subdir and work under there, or maybe use a throwaway mds_namespace= mount option. That should ensure that the mount doesn't match the superblock with the blacklisted client.

#2 Updated by Patrick Donnelly 2 months ago

So there are two issues here:

  • umount should not use -l so we aren't papering over bugs. Use -f to umount. If -f fails/hangs, collect debug information and reboot the machine.
  • Use separate auth credentials for each mount so the superblocks are always different.

#3 Updated by Jeff Layton 2 months ago

Patrick Donnelly wrote:

So there are two issues here:

  • umount should not use -l so we aren't papering over bugs. Use -f to umount. If -f fails/hangs, collect debug information and reboot the machine.
  • Use separate auth credentials for each mount so the superblocks are always different.

Bear in mind too that with a forced umount, the cancellation may still not be immediate. So you will need to wait a bit -- possibly even a few minutes.

If we do need to reboot at that point, we should forcibly crash the box (echo c > /proc/sysrq-trigger) and collect a core dump via kdump. With that, we could analyze the core to determine the cause.

#4 Updated by Patrick Donnelly 15 days ago

  • Status changed from New to Triaged
  • Assignee set to Xiubo Li
  • Labels (FS) qa, qa-failure added

#5 Updated by Xiubo Li 13 days ago

Will work on it.

#6 Updated by Xiubo Li 13 days ago

Patrick Donnelly wrote:

So there are two issues here:

[...]

  • Use separate auth credentials for each mount so the superblocks are always different.

For making it not reuse the existing sb, how about add one `noshare` option instead ?

#7 Updated by Patrick Donnelly 11 days ago

  • Status changed from Triaged to Fix Under Review
  • Backport set to octopus,nautilus
  • Pull request ID set to 37652

#8 Updated by Jeff Layton 10 days ago

I'm not a fan of this noshare option. That seems like a hacky workaround for a problem that I'm not sure any of us fully understand. Also, why do we need special options to umount at all? That should just work.

I could understand if we were (e.g.) dealing with a test where the MDS has gone unresponsive. You might need a '-f' in that case, or something, but ordinarily it shouldn't be needed.

#9 Updated by Xiubo Li 10 days ago

From: /ceph/teuthology-archive/pdonnell-2020-08-08_02:19:19-kcephfs-wip-pdonnell-testing-20200808.001303-distro-basic-smithi/5319130/teuthology.log


2020-08-08T16:09:18.434 INFO:teuthology.orchestra.run.smithi072:> sudo umount /home/ubuntu/cephtest/mnt.0 -f
2020-08-08T16:09:18.479 INFO:teuthology.orchestra.run.smithi072.stderr:umount: /home/ubuntu/cephtest/mnt.0: target is busy.
2020-08-08T16:09:18.481 DEBUG:teuthology.orchestra.run:got remote process result: 32
2020-08-08T16:09:18.482 INFO:teuthology.orchestra.run:Running command with timeout 900
2020-08-08T16:09:18.483 INFO:teuthology.orchestra.run.smithi072:> sudo PATH=/usr/sbin:$PATH lsof ; ps auxf
2020-08-08T16:09:18.523 INFO:teuthology.orchestra.run.smithi072.stderr:lsof: WARNING: can't stat() ceph file system /home/ubuntu/cephtest/mnt.0
2020-08-08T16:09:18.524 INFO:teuthology.orchestra.run.smithi072.stderr:      Output information may be incomplete.
2020-08-08T16:09:18.848 INFO:teuthology.orchestra.run.smithi072.stdout:COMMAND     PID   TID TASKCMD             USER   FD      TYPE             DEVICE SIZE/OFF       NODE NAME
2020-08-08T16:09:18.849 INFO:teuthology.orchestra.run.smithi072.stdout:systemd       1                           root  cwd       DIR                8,1     4096          2 /

......

2020-08-08T16:09:19.944 INFO:teuthology.orchestra.run.smithi072.stdout:python3   33373                           root    3w  unknown        /home/ubuntu/cephtest/mnt.0/background_file-1 (stat: Input/output error)
2020-08-08T16:09:19.944 INFO:teuthology.orchestra.run.smithi072.stdout:python3   33373                           root    4w  unknown        /home/ubuntu/cephtest/mnt.0/background_file-2 (stat: Input/output error)

......

2020-08-08T16:09:20.016 INFO:teuthology.orchestra.run.smithi072.stdout:root       33355  0.0  0.0 144436  7676 ?        Ss   16:09   0:00          \_ sudo adjust-ulimits daemon-helper kill python3 -c  import time import fcntl import struct f1 = open("/home/ubuntu/cephtest/mnt.0/background_file-1", 'w') fcntl.flock(f1, fcntl.LOCK_EX | fcntl.LOCK_NB) f2 = open("/home/ubuntu/cephtest/mnt.0/background_file-2", 'w') lockdata = struct.pack('hhllhh', fcntl.F_WRLCK, 0, 0, 0, 0, 0) fcntl.fcntl(f2, fcntl.F_SETLK, lockdata) while True:     time.sleep(1)
2020-08-08T16:09:20.017 INFO:teuthology.orchestra.run.smithi072.stdout:root       33371  0.3  0.0  40060 10848 ?        S    16:09   0:00          |   \_ /usr/bin/python3 /bin/daemon-helper kill python3 -c  import time import fcntl import struct f1 = open("/home/ubuntu/cephtest/mnt.0/background_file-1", 'w') fcntl.flock(f1, fcntl.LOCK_EX | fcntl.LOCK_NB) f2 = open("/home/ubuntu/cephtest/mnt.0/background_file-2", 'w') lockdata = struct.pack('hhllhh', fcntl.F_WRLCK, 0, 0, 0, 0, 0) fcntl.fcntl(f2, fcntl.F_SETLK, lockdata) while True:     time.sleep(1)
2020-08-08T16:09:20.017 INFO:teuthology.orchestra.run.smithi072.stdout:root       33373  0.1  0.0  40308  8744 ?        Ss   16:09   0:00          |       \_ python3 -c  import time import fcntl import struct f1 = open("/home/ubuntu/cephtest/mnt.0/background_file-1", 'w') fcntl.flock(f1, fcntl.LOCK_EX | fcntl.LOCK_NB) f2 = open("/home/ubuntu/cephtest/mnt.0/background_file-2", 'w') lockdata = struct.pack('hhllhh', fcntl.F_WRLCK, 0, 0, 0, 0, 0) fcntl.fcntl(f2, fcntl.F_SETLK, lockdata) while True:     time.sleep(1)
2020-08-08T16:09:20.017 INFO:teuthology.orchestra.run.smithi072.stdout:ubuntu     33450  0.0  0.0  12696  3076 ?        Ss   16:09   0:00          \_ bash -c sudo PATH=/usr/sbin:$PATH lsof ; ps auxf

Checked the log again, there has serveral python process is still running and openning the mountpoint.

Also available in: Atom PDF