Bug #21252

mds: asok command error merged with partial Formatter output

Added by Patrick Donnelly over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Category:
Administration/Usability
Target version:
% Done:
0%

Source:
Q/A
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2017-08-29T21:06:14.093 INFO:teuthology.orchestra.run.smithi141:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 0 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.b.asok dump tree /ceph-qa-suite/suites'
2017-08-29T21:06:14.711 INFO:teuthology.orchestra.run.smithi141.stdout:Failed to dump tree: (1) Operation not permitted[
2017-08-29T21:06:14.746 INFO:teuthology.orchestra.run.smithi141.stdout:    {
2017-08-29T21:06:14.833 INFO:teuthology.orchestra.run.smithi141.stdout:        "ino": 1099511629559,
2017-08-29T21:06:14.846 INFO:teuthology.orchestra.run.smithi141.stdout:        "rdev": 0,
2017-08-29T21:06:15.250 INFO:teuthology.orchestra.run.smithi141.stdout:        "ctime": "2017-08-29 20:55:16.552499",
2017-08-29T21:06:15.364 INFO:teuthology.orchestra.run.smithi141.stdout:        "btime": "2017-08-29 20:55:16.552499",
2017-08-29T21:06:15.405 INFO:teuthology.orchestra.run.smithi141.stdout:        "mode": 33188,
2017-08-29T21:06:15.411 INFO:teuthology.orchestra.run.smithi141.stdout:        "uid": 0,
...

From: /ceph/teuthology-archive/pdonnell-2017-08-29_20:29:11-fs-wip-pdonnell-testing-20170829-distro-basic-smithi/1578245/teuthology.log


Related issues

Copied to CephFS - Backport #21321: luminous: mds: asok command error merged with partial Formatter output Resolved

History

#1 Updated by Patrick Donnelly over 6 years ago

  • Status changed from New to Fix Under Review

#2 Updated by Patrick Donnelly over 6 years ago

I should note: the error itself is very concerning because the only way for dump_cache to fail is if it's operating on a file, which it isn't in this case.

This could be memory corruption on the stack, but I don't immediately see why. I've seen this in two jobs from the last test run. I'm checking whether it still reproduces in the next run.

#3 Updated by Zheng Yan over 6 years ago

Sorry, the bug was introduced by my commit:

commit 7b9eae62c8c654ff82684451c222257d2c93be64 (HEAD)
Author: Yan, Zheng <zyan@redhat.com>
Date:   Wed Aug 2 17:26:56 2017 +0800

    mds: track snap inodes through sorted map

    Current mds track both head inodes and snap inodes through unsorted
    map. The unsorted map makes finding snap inode that follows a given
    snapid difficult. Current MDCache::pick_inode_snap() use snap set to
    guess snap inode's last. The method isn't reliable because snap set
    may change after creating the snap inode. For example:

    MDS cows inode[2,head] with snap set[5,6], which results inode[2,6]
    and inode[7,head].

    Later mds wants to find snap inode that follows snapid 2. But the
    snap set become [5], mds can't find snap inode [2,5].

    Signed-off-by: "Yan, Zheng" <zyan@redhat.com>

The incremental patch below fixes it:

diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
index 2fbccccd6e..0bb1cda9b9 100644
--- a/src/mds/MDCache.cc
+++ b/src/mds/MDCache.cc
@@ -11979,6 +11979,7 @@ int MDCache::dump_cache(const char *fn, Formatter *f,
     if (r < 0)
       goto out;
   }
+  r = 0;

  out:
   if (f) {

https://github.com/ceph/ceph/pull/16778/commits/f519fca9dd958121a289676edf5175fb8be9894f
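To illustrate why the one-line `r = 0;` matters, here is a minimal, self-contained sketch of the pattern. It is not the MDCache code: `dump_item`, `dump_all_buggy` and `dump_all_fixed` are hypothetical names. It also assumes the per-inode helper returns a positive value on success. The "(1)" in the logged error suggests this, but the report does not confirm it. Because the loop only bails out on `r < 0`, a positive value left in `r` reaches the caller, which presumably treats any nonzero result as an error.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a per-item dump helper (an assumption, not the
// real MDCache helper): negative on failure, positive (1) on success.
static int dump_item(int item) {
  return item < 0 ? -1 : 1;
}

// Buggy shape: r keeps the helper's last return value, so a fully
// successful run still returns a nonzero (positive) result.
static int dump_all_buggy(const std::vector<int>& items) {
  int r = 0;
  for (int item : items) {
    r = dump_item(item);
    if (r < 0)
      goto out;
  }
 out:
  return r;
}

// Fixed shape: reset r once the loop has completed without an error,
// mirroring the "r = 0;" line added by the patch.
static int dump_all_fixed(const std::vector<int>& items) {
  int r = 0;
  for (int item : items) {
    r = dump_item(item);
    if (r < 0)
      goto out;
  }
  r = 0;
 out:
  return r;
}
```

With this sketch, `dump_all_buggy({1, 2, 3})` returns a nonzero value although nothing failed, while `dump_all_fixed({1, 2, 3})` returns 0. Both variants still propagate a genuine negative error.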

#4 Updated by Patrick Donnelly over 6 years ago

  • Status changed from Fix Under Review to Pending Backport

#5 Updated by Nathan Cutler over 6 years ago

  • Copied to Backport #21321: luminous: mds: asok command error merged with partial Formatter output added

#6 Updated by Nathan Cutler over 6 years ago

  • Status changed from Pending Backport to Resolved
