Bug #24403 (closed)

mon failed to return metadata for mds

Added by Thomas De Maet almost 6 years ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Category:
Administration/Usability
Target version:
% Done: 100%

Source:
Community (user)
Tags:
backport_processed
Backport:
reef,quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-ansible
Component(FS):
MDS, MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello,

Re-raising an error previously reported on the ceph-users mailing list: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-April/026241.html
From that thread, it seems to be related to an MDS-mgr communication issue?

I have the same issue on a small cluster: the active mgr spams log error messages like those in the mailing list thread, and:

telegeo02:~ # ceph --cluster geoceph mds metadata sen2agriprod
Error ENOENT: 
telegeo02:~ # ceph --cluster geoceph mds metadata 
[
    {
        "name": "sen2agriprod" 
    },
    {
        "name": "geo09" 
    },
    {
        "name": "telegeo02",
        "addr": "10.36.2.2:6800/737495544",
        "arch": "x86_64",
        "ceph_version": "ceph version 12.2.5-407-g5e7ea8cf03 (5e7ea8cf03603e1dc8937665b599f6a8fcb0213e) luminous (stable)",
        "cpu": "Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz",
        "distro": "opensuse",
        "distro_description": "openSUSE Leap 42.3",
        "distro_version": "42.3",
        "hostname": "telegeo02",
        "kernel_description": "#1 SMP Sat Apr 7 05:22:50 UTC 2018 (f24992c)",
        "kernel_version": "4.4.126-48-default",
        "mem_swap_kb": "2104316",
        "mem_total_kb": "131933332",
        "os": "Linux" 
    }
]
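
For reference, one quick cross-check is whether the daemons themselves are reachable while the mon has no metadata for them. This is a minimal sketch only, assuming shell access to the hosts running the affected MDS daemons and reusing the daemon names from the output above:

# On the host running the affected MDS, query it over its local admin socket.
# If this answers while "ceph mds metadata" still returns ENOENT, the metadata
# is only missing on the mon side.
ceph --cluster geoceph daemon mds.sen2agriprod version
ceph --cluster geoceph daemon mds.sen2agriprod status

# Restarting the MDS makes it send a fresh boot beacon (which carries the
# metadata) to the mons and will fail over to the standby; this may re-register
# the metadata, but it is only a possible workaround, not a fix.
systemctl restart ceph-mds@sen2agriprod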

I'm using 2 active MDS servers and 1 standby:

  cluster:
    id:     c27607d1-9852-4aa2-b953-b5e3fa3845ea
    health: HEALTH_WARN
            12410/2950041 objects misplaced (0.421%)
            Degraded data redundancy: 4353/2950041 objects degraded (0.148%), 33 pgs degraded, 33 pgs undersized

  services:
    mon: 3 daemons, quorum telegeo02,geo09,sen2agriprod
    mgr: sen2agriprod(active), standbys: geo09, telegeo02
    mds: cephfs-2/2/2 up  {0=geo09=up:active,1=sen2agriprod=up:active}, 1 up:standby
    osd: 80 osds: 77 up, 77 in; 95 remapped pgs

  data:
    pools:   2 pools, 384 pgs
    objects: 134k objects, 1973 GB
    usage:   4733 GB used, 253 TB / 258 TB avail
    pgs:     4353/2950041 objects degraded (0.148%)
             12410/2950041 objects misplaced (0.421%)
             256 active+clean
             95  active+clean+remapped
             33  active+undersized+degraded

  io:
    client:   10836 kB/s wr, 0 op/s rd, 17 op/s wr

I also have some issues with the kernel client (I/O errors with no discernible pattern and no log messages) and wonder if they could be related.
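
If not already checked, the client-side kernel log around the time of an I/O error may show whether the kernel CephFS client reports anything. A minimal sketch, to be run on the client host (not on the cluster nodes):

# Messages from the kernel CephFS client and messenger, with readable timestamps.
dmesg -T | grep -iE 'ceph|libceph'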

Thanks!


Files

log_mgr_telegeo02.txt (7.59 KB), Thomas De Maet, 06/05/2018 10:41 AM

Related issues (5 total, 0 open, 5 closed)

Related to CephFS - Bug #59318: mon/MDSMonitor: daemon booting may get failed if mon handles up:boot beacon twice (Resolved, Patrick Donnelly)
Related to CephFS - Bug #63166: mon/MDSMonitor: metadata not loaded from PAXOS on update (Resolved, Min Shi)
Copied to CephFS - Backport #61691: quincy: mon failed to return metadata for mds (Resolved, Patrick Donnelly)
Copied to CephFS - Backport #61692: pacific: mon failed to return metadata for mds (Resolved, Patrick Donnelly)
Copied to CephFS - Backport #61693: reef: mon failed to return metadata for mds (Resolved, Patrick Donnelly)
