Bug #40102
closed
qa: probable kernel deadlock/oops during umount on testing branch
Description
2019-05-31T03:05:07.400 INFO:teuthology.orchestra.run.smithi109:Running:
2019-05-31T03:05:07.400 INFO:teuthology.orchestra.run.smithi109:> sudo adjust-ulimits daemon-helper kill python -c '
2019-05-31T03:05:07.400 INFO:teuthology.orchestra.run.smithi109:> import os
2019-05-31T03:05:07.400 INFO:teuthology.orchestra.run.smithi109:> import stat
2019-05-31T03:05:07.400 INFO:teuthology.orchestra.run.smithi109:> import json
2019-05-31T03:05:07.400 INFO:teuthology.orchestra.run.smithi109:> import sys
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:>
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:> try:
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:>     s = os.stat("/home/ubuntu/cephtest/mnt.0/datafile")
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:> except OSError as e:
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:>     sys.exit(e.errno)
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:>
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:> attrs = ["st_mode", "st_ino", "st_dev", "st_nlink", "st_uid", "st_gid", "st_size", "st_atime", "st_mtime", "st_ctime"]
2019-05-31T03:05:07.401 INFO:teuthology.orchestra.run.smithi109:> print json.dumps(
2019-05-31T03:05:07.402 INFO:teuthology.orchestra.run.smithi109:>     dict([(a, getattr(s, a)) for a in attrs]),
2019-05-31T03:05:07.402 INFO:teuthology.orchestra.run.smithi109:>     indent=2)
2019-05-31T03:05:07.402 INFO:teuthology.orchestra.run.smithi109:> '
2019-05-31T03:05:07.466 INFO:teuthology.orchestra.run.smithi109.stdout:{
2019-05-31T03:05:07.466 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_ctime": 1559271907.390514,
2019-05-31T03:05:07.466 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_mtime": 1559271907.390514,
2019-05-31T03:05:07.466 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_nlink": 1,
2019-05-31T03:05:07.466 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_gid": 0,
2019-05-31T03:05:07.466 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_dev": 43,
2019-05-31T03:05:07.467 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_size": 33554432,
2019-05-31T03:05:07.467 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_ino": 1099511627776,
2019-05-31T03:05:07.467 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_uid": 0,
2019-05-31T03:05:07.467 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_mode": 33188,
2019-05-31T03:05:07.467 INFO:teuthology.orchestra.run.smithi109.stdout:  "st_atime": 1559271906.966523
2019-05-31T03:05:07.467 INFO:teuthology.orchestra.run.smithi109.stdout:}
2019-05-31T03:05:07.661 DEBUG:tasks.cephfs.kernel_mount:Unmounting client client.0...
2019-05-31T03:05:07.661 INFO:teuthology.orchestra.run:Running command with timeout 900
2019-05-31T03:05:07.661 INFO:teuthology.orchestra.run.smithi109:Running:
2019-05-31T03:05:07.661 INFO:teuthology.orchestra.run.smithi109:> sudo umount /home/ubuntu/cephtest/mnt.0
2019-05-31T03:05:29.665 INFO:teuthology.orchestra.run.smithi002:Running:
2019-05-31T03:05:29.665 INFO:teuthology.orchestra.run.smithi002:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2019-05-31T03:05:29.669 INFO:teuthology.orchestra.run.smithi036:Running:
2019-05-31T03:05:29.669 INFO:teuthology.orchestra.run.smithi036:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2019-05-31T03:05:29.673 INFO:teuthology.orchestra.run.smithi079:Running:
2019-05-31T03:05:29.674 INFO:teuthology.orchestra.run.smithi079:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2019-05-31T03:05:29.677 INFO:teuthology.orchestra.run.smithi109:Running:
2019-05-31T03:05:29.678 INFO:teuthology.orchestra.run.smithi109:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2019-05-31T03:20:07.668 ERROR:teuthology:Uncaught exception (Hub)
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 536, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 307, in copy_file_to
    copy_to_log(src, logger, capture=stream)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 276, in copy_to_log
    for line in f:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 102, in next
    line = self.readline()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 277, in readline
    new_data = self._read(n)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 1305, in _read
    return self.channel.recv_stderr(size)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 715, in recv_stderr
    raise socket.timeout()
timeout
2019-05-31T03:20:07.674 ERROR:teuthology:Uncaught exception (Hub)
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 536, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 307, in copy_file_to
    copy_to_log(src, logger, capture=stream)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 276, in copy_to_log
    for line in f:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 102, in next
    line = self.readline()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 277, in readline
    new_data = self._read(n)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 1293, in _read
    return self.channel.recv(size)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 667, in recv
    raise socket.timeout()
timeout
2019-05-31T03:21:27.992 ERROR:paramiko.transport:Socket exception: No route to host (113)
2019-05-31T03:21:27.993 DEBUG:teuthology.orchestra.run:got remote process result: None
2019-05-31T03:21:27.993 INFO:teuthology.orchestra.remote:Trying to reconnect to host
2019-05-31T03:21:27.994 DEBUG:teuthology.orchestra.connection:{'username': 'ubuntu', 'hostname': 'smithi109.front.sepia.ceph.com', 'timeout': 60}
2019-05-31T03:21:28.001 DEBUG:tasks.ceph:Missed logrotate, host unreachable
2019-05-31T03:21:31.064 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on 172.21.15.109
2019-05-31T03:21:31.065 INFO:tasks.cephfs_test_runner:test_rebuild_nondefault_layout (tasks.cephfs.test_data_scan.TestDataScan) ... ERROR
From: /ceph/teuthology-archive/yuriw-2019-05-30_20:50:30-kcephfs-mimic_v13.2.6_QE-testing-basic-smithi/3989164/teuthology.log
This was with mimic and the testing branch of the kclient. Probably has nothing to do with mimic.
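For triaging similar runs, here is a rough sketch of how to measure the stall from the log itself. It takes the timestamp of the `sudo umount` line and the first ERROR line after it. In the log above the gap is almost exactly the 900 second command timeout, which suggests umount never returned. The helper names (`parse_ts`, `umount_stall_seconds`) are made up for illustration and are not part of teuthology; the sample lines are copied from the log above.

```python
from datetime import datetime

# Timestamp layout used at the start of each teuthology log line.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.strptime(token, TS_FORMAT)
    except ValueError:
        # Traceback lines and other continuation lines carry no timestamp.
        return None


def umount_stall_seconds(lines):
    """Seconds between the umount command and the first later ERROR line.

    Hypothetical helper: returns None if either marker is missing.
    """
    started = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if started is None and "> sudo umount " in line:
            started = ts
        elif started is not None and " ERROR:" in line:
            return (ts - started).total_seconds()
    return None


sample = [
    "2019-05-31T03:05:07.661 INFO:teuthology.orchestra.run:Running command with timeout 900",
    "2019-05-31T03:05:07.661 INFO:teuthology.orchestra.run.smithi109:> sudo umount /home/ubuntu/cephtest/mnt.0",
    "2019-05-31T03:05:29.665 INFO:teuthology.orchestra.run.smithi002:> sudo logrotate /etc/logrotate.d/ceph-test.conf",
    "2019-05-31T03:20:07.668 ERROR:teuthology:Uncaught exception (Hub)",
    "Traceback (most recent call last):",
]

stall = umount_stall_seconds(sample)
print(stall)
```

A stall close to the configured timeout, followed by the host becoming unreachable, is consistent with the kernel client hanging or crashing during umount rather than a slow but successful unmount.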
Updated by Patrick Donnelly almost 5 years ago
Another: /ceph/teuthology-archive/yuriw-2019-05-30_20:50:30-kcephfs-mimic_v13.2.6_QE-testing-basic-smithi/3989039/teuthology.log
Updated by Patrick Donnelly almost 5 years ago
Another: /ceph/teuthology-archive/yuriw-2019-05-30_20:50:30-kcephfs-mimic_v13.2.6_QE-testing-basic-smithi/3989013/teuthology.log
Updated by Zheng Yan almost 5 years ago
it's kernel BUG at fs/ceph/mds_client.c:1500!
BUG_ON(session->s_nr_caps > 0);
No idea how this can happen.
Updated by Zheng Yan almost 5 years ago
- Status changed from New to 7
It's a longstanding bug. Fixed by "ceph: use ceph_evict_inode to cleanup inode's resource" in https://github.com/ceph/ceph-client/tree/testing
Updated by Ilya Dryomov over 4 years ago
- Status changed from 7 to Resolved
Updated by Ilya Dryomov over 4 years ago
The backport to 4.19 was incorrect, 4.19.76 is busted. Fixed in 4.19.77.
Updated by Ilya Dryomov over 4 years ago
Ilya Dryomov wrote:
The backport to 4.19 was incorrect, 4.19.76 is busted. Fixed in 4.19.77.
This goes for Ubuntu Disco 5.0.0-32 kernel as well: https://marc.info/?l=ceph-users&m=157167769117987&w=2
Updated by Nathan Fish over 4 years ago
5.0.0-32 introduced the bad backport, -33 reverted it:
http://changelogs.ubuntu.com/changelogs/pool/main/l/linux/linux_5.0.0-33.35/changelog
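For anyone checking clients against the releases named in this thread, a minimal sketch follows. It assumes the `uname -r` string begins with the upstream or Ubuntu version exactly as written above (4.19.76 and 5.0.0-32). Other distributions encode versions differently, so a non-match does not prove a kernel is unaffected. `KNOWN_BAD_PREFIXES` and `has_bad_backport` are illustrative names, not an existing tool.

```python
import platform

# Releases named in this thread as carrying the incorrect backport.
# 4.19.77 and 5.0.0-33 are reported above as fixed/reverted.
KNOWN_BAD_PREFIXES = ("4.19.76", "5.0.0-32")


def has_bad_backport(release):
    """True if `release` (as printed by `uname -r`) matches a release above.

    Simplistic prefix match; an assumption, not an authoritative check.
    """
    for bad in KNOWN_BAD_PREFIXES:
        # Accept an exact match or a suffix such as "-generic" or ".1".
        if release == bad or release.startswith((bad + "-", bad + ".")):
            return True
    return False


if __name__ == "__main__":
    release = platform.release()
    print(release, "affected" if has_bad_backport(release) else "not in the known-bad list")
```

For example, `has_bad_backport("5.0.0-32-generic")` is true while `has_bad_backport("5.0.0-33-generic")` is false.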