Bug #52430

open

mds: fast async create client mount breaks racy test

Added by Patrick Donnelly over 2 years ago. Updated almost 2 years ago.

Status: New
Priority: High
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-08-27T06:15:27.831 INFO:tasks.cephfs_test_runner:Starting test: test_rapid_creation (tasks.cephfs.test_fragment.TestFragmentation)
...
2021-08-27T06:16:09.526 DEBUG:teuthology.orchestra.run.smithi094:> sudo umount /home/ubuntu/cephtest/mnt.0

From: /ceph/teuthology-archive/pdonnell-2021-08-27_05:49:20-fs-wip-pdonnell-testing-20210827.024746-distro-basic-smithi/6361619/teuthology.log

umount hangs because:

[  911.040357] ceph: ceph: async create failure path=(1)splitdir/file_152 result=-28!
[  911.048080] ceph: ceph_async_create_cb: no req->r_target_inode for 0x10000000099

The kernel client doesn't expect ENOSPC for an async create (and ENOSPC shouldn't happen for this test).

MDS says the reason was:

2021-08-27T06:15:57.265+0000 7f0bb7bb3700 20 Session check_access path /splitdir
2021-08-27T06:15:57.265+0000 7f0bb7bb3700 10 mds.0.server fragment [dir 0x10000000000 /splitdir/ [2,head] auth pv=305 v=1 cv=1/0 ap=156+306 state=1610616881|complete|freezingdir|fragmenting|committing f()->f(v0 m2021-08-27T06:15:57.204719+0000 152=152+0) n()->n(v0 rc2021-08-27T06:15:57.204719+0000 152=152+0) hs=0+153,ss=0+0 | child=1 dirty=1 authpin=1 0x562d08515600] size exceeds 152 (CEPHFS_ENOSPC)
2021-08-27T06:15:57.265+0000 7f0bb7bb3700  7 mds.0.server reply_client_request -28 ((28) No space left on device) client_request(client.5986:157 create #0x10000000000/file_152 2021-08-27T06:15:57.205719+0000 ASYNC caller_uid=1000, caller_gid=1254{6,36,1000,1254,}) v4
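For reference, the result=-28 in the kernel log and the -28 in the MDS reply are the same negated errno value. A minimal sketch for decoding such codes, assuming Linux errno numbering (decode_result is a hypothetical helper, not part of Ceph or teuthology):

```python
import errno
import os


def decode_result(result):
    """Map a negative kernel-style result code to (symbolic name, message).

    Hypothetical helper for reading logs; assumes the value is a negated errno.
    """
    code = -result
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# Decode the value seen in "async create failure ... result=-28"
name, message = decode_result(-28)
print(name, "-", message)
```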

I think the proper fix here is to throw the client_request on a wait queue until the cdir is fragmented again.
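A toy model of that idea, as a sketch only: DirFrag, its fields, and finish_fragmenting are hypothetical names for illustration and do not correspond to the MDS code. It assumes a full directory that is mid-fragmentation parks the create instead of replying -ENOSPC, and retries it once the split completes. The new_max_size argument is a simplification standing in for the extra room a split creates.

```python
import errno
from collections import deque


class DirFrag:
    """Toy stand-in for a directory fragment (hypothetical, not Ceph's CDir)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = set()
        self.fragmenting = False
        self.waiters = deque()  # parked create requests, in arrival order

    def create(self, name):
        """Return 0 on success, -ENOSPC if full, or None if the request was parked."""
        if len(self.entries) >= self.max_size:
            if self.fragmenting:
                # Proposed behaviour: wait for the split instead of failing.
                self.waiters.append(name)
                return None
            return -errno.ENOSPC
        self.entries.add(name)
        return 0

    def finish_fragmenting(self, new_max_size):
        """Mark the split as done and retry parked requests in order."""
        self.fragmenting = False
        self.max_size = new_max_size  # simplification: models room freed by the split
        results = {}
        while self.waiters:
            name = self.waiters.popleft()
            results[name] = self.create(name)
        return results
```

With this model, a create that arrives while the fragment is full and fragmenting returns None (parked) and succeeds after finish_fragmenting, whereas a create against a full fragment that is not fragmenting still gets -ENOSPC.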


Related issues 1 (0 open, 1 closed)

Related to Linux kernel client - Bug #52431: cephfs client must clean up bogus caps after failed async create (Duplicate, Jeff Layton)

Actions #1

Updated by Patrick Donnelly over 2 years ago

  • Related to Bug #52431: cephfs client must clean up bogus caps after failed async create added
Actions #2

Updated by Patrick Donnelly over 2 years ago

This one seems pretty reproducible: /ceph/teuthology-archive/pdonnell-2021-08-27_16:46:16-fs-wip-pdonnell-testing-20210827.024746-distro-basic-smithi/6362978/teuthology.log

Actions #3

Updated by Patrick Donnelly almost 2 years ago

  • Target version deleted (v17.0.0)
Actions #4

Updated by Rishabh Dave almost 2 years ago

Copying the tracebacks here for convenience (I recently saw the same test fail for a different reason):

Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/channel.py", line 747, in recv_stderr
    out = self.in_stderr_buffer.read(nbytes, self.timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/buffered_pipe.py", line 164, in read
    raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 906, in gevent._gevent_cgreenlet.Greenlet.run
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/teuthology/orchestra/run.py", line 323, in copy_file_to
    copy_to_log(src, logger, capture=stream, quiet=quiet)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/teuthology/orchestra/run.py", line 276, in copy_to_log
    for line in f:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/file.py", line 125, in __next__
    line = self.readline()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/file.py", line 291, in readline
    new_data = self._read(n)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/channel.py", line 1376, in _read
    return self.channel.recv_stderr(size)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/channel.py", line 749, in recv_stderr
    raise socket.timeout()
socket.timeout
2021-08-27T17:28:17.069 ERROR:teuthology:Uncaught exception (Hub)
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/channel.py", line 699, in recv
    out = self.in_buffer.read(nbytes, self.timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/buffered_pipe.py", line 164, in read
    raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/gevent/greenlet.py", line 906, in gevent._gevent_cgreenlet.Greenlet.run
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/teuthology/orchestra/run.py", line 323, in copy_file_to
    copy_to_log(src, logger, capture=stream, quiet=quiet)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/teuthology/orchestra/run.py", line 276, in copy_to_log
    for line in f:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/file.py", line 125, in __next__
    line = self.readline()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/file.py", line 291, in readline
    new_data = self._read(n)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/channel.py", line 1361, in _read
    return self.channel.recv(size)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_d0ccb5e7543966c9868cca0e1d0b1e1f5b5df280/virtualenv/lib/python3.6/site-packages/paramiko/channel.py", line 701, in recv
    raise socket.timeout()
socket.timeout
