Bug #22488

open

mds: unlink blocks on large file when metadata pool is full

Added by Patrick Donnelly over 6 years ago. Updated about 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Correctness/Safety
Target version:
-
% Done:
0%

Source:
Development
Tags:
qa,full
Backport:
luminous,mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With both of these PRs on mimic-dev1:

https://github.com/ceph/ceph/pull/19588
https://github.com/ceph/ceph/pull/19602

and a workaround patch for #22487:

diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index a0ef0ea..99e81cc 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -1765,6 +1765,7 @@ void Server::dispatch_client_request(MDRequestRef& mdr)
         req->get_op() == CEPH_MDS_OP_SETXATTR ||
         req->get_op() == CEPH_MDS_OP_CREATE ||
         req->get_op() == CEPH_MDS_OP_SYMLINK ||
+        req->get_op() == CEPH_MDS_OP_SETATTR ||
         req->get_op() == CEPH_MDS_OP_MKSNAP ||
        ((req->get_op() == CEPH_MDS_OP_LINK ||
          req->get_op() == CEPH_MDS_OP_RENAME) &&

the following test times out:

$ python2 ../qa/tasks/vstart_runner.py --interactive tasks.cephfs.test_full.TestClusterFull.test_full_different_file
...
2017-12-19 16:00:16,935.935 INFO:__main__:run args=['rm', '-f', 'large_file_a']
2017-12-19 16:00:16,935.935 INFO:__main__:Running ['rm', '-f', 'large_file_a']
2017-12-19 16:00:16,953.953 INFO:__main__:run args=['rm', '-f', 'large_file_b']
2017-12-19 16:00:16,953.953 INFO:__main__:Running ['rm', '-f', 'large_file_b']
2017-12-19 16:00:16,970.970 INFO:__main__:run args=['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:16,970.970 INFO:__main__:Running ['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:22,312.312 INFO:__main__:run args=['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:22,313.313 INFO:__main__:Running ['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:27,664.664 INFO:__main__:run args=['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:27,664.664 INFO:__main__:Running ['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:33,011.011 INFO:__main__:run args=['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:33,012.012 INFO:__main__:Running ['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:38,356.356 INFO:__main__:run args=['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:38,356.356 INFO:__main__:Running ['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:43,701.701 INFO:__main__:run args=['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:43,702.702 INFO:__main__:Running ['./bin/ceph', 'osd', 'dump', '--format=json-pretty']
2017-12-19 16:00:44,044.044 INFO:__main__:test_full_different_file (tasks.cephfs.test_full.TestClusterFull) ... ERROR
2017-12-19 16:00:44,044.044 ERROR:__main__:Traceback (most recent call last):
  File "/home/pdonnell/ceph/qa/tasks/cephfs/test_full.py", line 202, in test_full_different_file
    self._test_full(True)
  File "/home/pdonnell/ceph/qa/tasks/cephfs/test_full.py", line 187, in _test_full
    timeout=osd_mon_report_interval_max * 5)
  File "/home/pdonnell/ceph/qa/tasks/ceph_test_case.py", line 144, in wait_until_true
    raise RuntimeError("Timed out after {0}s".format(elapsed))
RuntimeError: Timed out after 25s
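
For readers unfamiliar with the helper in the traceback, the sketch below shows the general poll-until-timeout pattern that produces a "Timed out after 25s" error. It is a simplified stand-in, not the code in qa/tasks/ceph_test_case.py: the 5-second period and the injectable `sleep` parameter are assumptions made for illustration. With a 25s timeout and a 5s period the condition is checked six times, which is consistent with the six `ceph osd dump` calls in the log above.

```python
import time


def wait_until_true(condition, timeout, period=5, sleep=time.sleep):
    """Poll condition() every `period` seconds until it returns True.

    Hypothetical sketch of a polling helper. `period` and `sleep` are
    assumptions for illustration, not part of the real test framework.
    Returns the number of seconds waited, or raises RuntimeError once
    `timeout` seconds have elapsed without the condition holding.
    """
    elapsed = 0
    while True:
        if condition():
            return elapsed
        if elapsed >= timeout:
            raise RuntimeError("Timed out after {0}s".format(elapsed))
        sleep(period)
        elapsed += period


# Usage: a condition that never becomes true, with sleeping stubbed out
# so the example finishes immediately.
calls = []


def never_clears():
    calls.append(1)
    return False


try:
    wait_until_true(never_clears, timeout=25, sleep=lambda seconds: None)
    message = None
except RuntimeError as exc:
    message = str(exc)

print(message, "after", len(calls), "checks")
```

In the failing run the condition presumably checks whether the cluster has left the full state, which cannot happen while the unlink is still blocked on the MDS.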

The MDS reports the unlink as a blocked op:

pdonnell@senta02 ~/ceph/build$ bin/ceph --admin-daemon /tmp/ceph-asok.pu3EX7/mds.c.asok dump_blocked_ops
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
{
    "ops": [
        {
            "description": "client_request(client.4237:12 unlink #0x1/large_file_a 2017-12-19 16:00:16.945210 caller_uid=1163, caller_gid=1163{27,119,1163,})",
            "initiated_at": "2017-12-19 16:00:16.945507",
            "age": 2052.146000,
            "duration": 2052.146028,
            "type_data": {
                "flag_point": "submit entry: journal_and_reply",
                "reqid": "client.4237:12",
                "op_type": "client_request",
                "client_info": {
                    "client": "client.4237",
                    "tid": 12
                },  
                "events": [
                    {
                        "time": "2017-12-19 16:00:16.945507",
                        "event": "initiated" 
                    },  
                    {
                        "time": "2017-12-19 16:00:16.946887",
                        "event": "acquired locks" 
                    },  
                    {
                        "time": "2017-12-19 16:00:16.947719",
                        "event": "early_replied" 
                    },  
                    {
                        "time": "2017-12-19 16:00:16.947722",
                        "event": "submit entry: journal_and_reply" 
                    }   
                ]   
            }   
        },  
...

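It also concerns me that rm -f returns for each file before the unlink operation completes on the MDS: the large_file_a op shows early_replied at 16:00:16.947719 but remains blocked at journal_and_reply.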
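
When triaging a dump like the one above, it can help to reduce each blocked op to its flag point, age, and whether it was already early-replied. The following is a minimal sketch using only the Python standard library. The `summarize_blocked_ops` function name is hypothetical, and `SAMPLE` is a trimmed, single-op excerpt of the output above with the closing brackets added so that it parses (the original dump is elided after the first op).

```python
import json

# Trimmed excerpt of the dump_blocked_ops output from this report.
# Closing brackets were added for illustration; the original is truncated.
SAMPLE = """
{
    "ops": [
        {
            "description": "client_request(client.4237:12 unlink #0x1/large_file_a 2017-12-19 16:00:16.945210 caller_uid=1163, caller_gid=1163{27,119,1163,})",
            "age": 2052.146000,
            "type_data": {
                "flag_point": "submit entry: journal_and_reply",
                "events": [
                    {"time": "2017-12-19 16:00:16.945507", "event": "initiated"},
                    {"time": "2017-12-19 16:00:16.946887", "event": "acquired locks"},
                    {"time": "2017-12-19 16:00:16.947719", "event": "early_replied"},
                    {"time": "2017-12-19 16:00:16.947722", "event": "submit entry: journal_and_reply"}
                ]
            }
        }
    ]
}
"""


def summarize_blocked_ops(dump_text):
    """Return one summary dict per op in a dump_blocked_ops JSON document."""
    summary = []
    for op in json.loads(dump_text).get("ops", []):
        type_data = op.get("type_data", {})
        events = [entry["event"] for entry in type_data.get("events", [])]
        summary.append({
            "description": op.get("description", ""),
            "age": op.get("age"),
            "flag_point": type_data.get("flag_point"),
            # True if the client was answered before the op finished.
            "early_replied": "early_replied" in events,
            "last_event": events[-1] if events else None,
        })
    return summary


summary = summarize_blocked_ops(SAMPLE)
for row in summary:
    print("{age:.0f}s blocked at '{flag_point}' (early_replied={early_replied})".format(**row))
```

An op that is both early-replied and still blocked at journal_and_reply matches the concern above: the client has already seen the unlink return while the MDS has not yet finished journaling it.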


Related issues 1 (0 open, 1 closed)

Related to CephFS - Bug #22487: mds: setattr blocked when metadata pool is full (Rejected, 12/19/2017)

Actions #1

Updated by Patrick Donnelly over 6 years ago

  • Related to Bug #22487: mds: setattr blocked when metadata pool is full added
Actions #2

Updated by Patrick Donnelly about 6 years ago

  • Category set to Correctness/Safety
  • Priority changed from Urgent to Normal
  • Target version set to v14.0.0
  • Tags set to qa,full
  • Backport changed from luminous to luminous,mimic
Actions #3

Updated by Patrick Donnelly about 5 years ago

  • Target version changed from v14.0.0 to v15.0.0
Actions #4

Updated by Patrick Donnelly about 5 years ago

  • Target version deleted (v15.0.0)