Bug #62848
qa: fail_fs upgrade scenario hanging
Status: Duplicate
Priority: Urgent
Assignee:
Category: Testing
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): qa-suite
Labels (FS): qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2023-09-12T17:30:00.275 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : Health detail: HEALTH_ERR 1 filesystem is degraded; 1 filesystem has a failed mds daemon; 1 filesystem is offline
2023-09-12T17:30:00.276 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : [WRN] FS_DEGRADED: 1 filesystem is degraded
2023-09-12T17:30:00.276 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : fs cephfs is degraded
2023-09-12T17:30:00.276 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : [WRN] FS_WITH_FAILED_MDS: 1 filesystem has a failed mds daemon
2023-09-12T17:30:00.276 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : fs cephfs has 2 failed mdss
2023-09-12T17:30:00.277 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : [ERR] MDS_ALL_DOWN: 1 filesystem is offline
2023-09-12T17:30:00.277 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-4c41840e-518f-11ee-9ab7-7b867c8bd7da-mon-smithi173[187305]: 2023-09-12T17:29:59.999+0000 7fec1cd83700 -1 log_channel(cluster) log [ERR] : fs cephfs is offline because no MDS is active for it.
2023-09-12T17:30:00.775 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: Health detail: HEALTH_ERR 1 filesystem is degraded; 1 filesystem has a failed mds daemon; 1 filesystem is offline
2023-09-12T17:30:00.776 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: [WRN] FS_DEGRADED: 1 filesystem is degraded
2023-09-12T17:30:00.776 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: fs cephfs is degraded
2023-09-12T17:30:00.776 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: [WRN] FS_WITH_FAILED_MDS: 1 filesystem has a failed mds daemon
2023-09-12T17:30:00.776 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: fs cephfs has 2 failed mdss
2023-09-12T17:30:00.777 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: [ERR] MDS_ALL_DOWN: 1 filesystem is offline
2023-09-12T17:30:00.777 INFO:journalctl@ceph.mon.smithi173.smithi173.stdout:Sep 12 17:30:00 smithi173 ceph-mon[187328]: fs cephfs is offline because no MDS is active for it.
2023-09-12T17:30:00.883 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: Health detail: HEALTH_ERR 1 filesystem is degraded; 1 filesystem has a failed mds daemon; 1 filesystem is offline
2023-09-12T17:30:00.884 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: [WRN] FS_DEGRADED: 1 filesystem is degraded
2023-09-12T17:30:00.884 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: fs cephfs is degraded
2023-09-12T17:30:00.884 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: [WRN] FS_WITH_FAILED_MDS: 1 filesystem has a failed mds daemon
2023-09-12T17:30:00.884 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: fs cephfs has 2 failed mdss
2023-09-12T17:30:00.884 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: [ERR] MDS_ALL_DOWN: 1 filesystem is offline
2023-09-12T17:30:00.885 INFO:journalctl@ceph.mon.smithi204.smithi204.stdout:Sep 12 17:30:00 smithi204 ceph-mon[159017]: fs cephfs is offline because no MDS is active for it.
...
2023-09-12T17:32:29.424 ERROR:teuthology:Uncaught exception (Hub)
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/channel.py", line 747, in recv_stderr
    out = self.in_stderr_buffer.read(nbytes, self.timeout)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/buffered_pipe.py", line 164, in read
    raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 906, in gevent._gevent_cgreenlet.Greenlet.run
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/teuthology/orchestra/run.py", line 323, in copy_file_to
    copy_to_log(src, logger, capture=stream, quiet=quiet)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/teuthology/orchestra/run.py", line 276, in copy_to_log
    for line in f:
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/file.py", line 125, in __next__
    line = self.readline()
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/file.py", line 291, in readline
    new_data = self._read(n)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/channel.py", line 1376, in _read
    return self.channel.recv_stderr(size)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/channel.py", line 749, in recv_stderr
    raise socket.timeout()
socket.timeout

2023-09-12T17:32:29.434 ERROR:teuthology:Uncaught exception (Hub)
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/channel.py", line 699, in recv
    out = self.in_buffer.read(nbytes, self.timeout)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/buffered_pipe.py", line 164, in read
    raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 906, in gevent._gevent_cgreenlet.Greenlet.run
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/teuthology/orchestra/run.py", line 323, in copy_file_to
    copy_to_log(src, logger, capture=stream, quiet=quiet)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/teuthology/orchestra/run.py", line 276, in copy_to_log
    for line in f:
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/file.py", line 125, in __next__
    line = self.readline()
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/file.py", line 291, in readline
    new_data = self._read(n)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/channel.py", line 1361, in _read
    return self.channel.recv(size)
  File "/home/teuthworker/src/git.ceph.com_teuthology_54e62bcbac4e53d9685e08328b790d3b20d71cae/virtualenv/lib/python3.8/site-packages/paramiko/channel.py", line 701, in recv
    raise socket.timeout()
socket.timeout
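The tracebacks show paramiko converting an internal PipeTimeout into socket.timeout while teuthology streams the remote command's output: the harness was blocked on output that never arrived because the scenario hung. The timeout semantics can be illustrated with a minimal, self-contained sketch (a plain socketpair standing in for the SSH channel; this is illustrative only, not paramiko or teuthology code):

```python
# Simulate a read from a peer that has gone silent, as happens when
# the fail_fs upgrade hangs and stops producing output: a blocking
# read with a timeout raises socket.timeout instead of waiting forever,
# mirroring what paramiko's recv()/recv_stderr() did in the log above.
import socket

a, b = socket.socketpair()
a.settimeout(0.1)  # give up after 100 ms of silence
try:
    a.recv(1024)   # nothing is ever sent on `b`, so this times out
    timed_out = False
except socket.timeout:
    timed_out = True

print(timed_out)  # True: the read timed out rather than hanging
```

In teuthology's case the socket.timeout surfaced as an uncaught exception in the gevent hub, which is why the job shows up as hung/failed rather than reporting the underlying filesystem state.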
From: /teuthology/pdonnell-2023-09-12_14:07:50-fs-wip-batrick-testing-20230912.122437-distro-default-smithi/7395159/teuthology.log
and others. Probably fallout from commit 2b839838f70e9bcd31568013106aa7b5d2313bbe.
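For context, the fail_fs upgrade scenario follows the standard CephFS procedure of failing the filesystem, upgrading the MDS daemons, and then letting them rejoin. A rough sketch of that sequence (based on the documented procedure, not an excerpt from the test code):

```
ceph fs fail cephfs               # take all MDS down; FS_DEGRADED / MDS_ALL_DOWN are expected here
# ... upgrade the MDS daemons ...
ceph fs set cephfs joinable true  # allow MDS daemons to rejoin the filesystem
ceph fs status cephfs             # should eventually report an active MDS again
```

The MDS_ALL_DOWN errors in the log are therefore expected during the failed phase; the problem is that the cluster never progresses past it after `joinable` is set, which matches the mdsmap-broadcast issue tracked in Bug #62682.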
Updated by Venky Shankar 8 months ago
- Related to Bug #62682: mon: no mdsmap broadcast after "fs set joinable" is set to true added
Updated by Venky Shankar 8 months ago
- Status changed from New to Triaged
- Assignee set to Patrick Donnelly
Updated by Patrick Donnelly 7 months ago
- Related to deleted (Bug #62682: mon: no mdsmap broadcast after "fs set joinable" is set to true)
Updated by Patrick Donnelly 7 months ago
- Is duplicate of Bug #62682: mon: no mdsmap broadcast after "fs set joinable" is set to true added
Updated by Patrick Donnelly 7 months ago
- Status changed from Triaged to Duplicate