Bug #53179

closed

Crash when unlink in corrupted cephfs

Added by Daniel Poelzleithner over 2 years ago. Updated over 1 year ago.

Status: Duplicate
Priority: Normal
Assignee:
Category: fsck/damage handling
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We have a corrupted CephFS that breaks again after every repair as soon as files are removed.

ceph crash post
malformed crash metadata: Expecting value: line 1 column 1 (char 0)
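
The error text above matches what Python's `json` decoder reports when it is handed empty input, which would be consistent with `ceph crash post` having received no metadata. The following is a minimal sketch of that behaviour, not the Ceph implementation; `parse_crash_metadata` and `error_for` are hypothetical names introduced only for illustration.

```python
import json


def parse_crash_metadata(raw: str) -> dict:
    """Hypothetical helper: parse crash metadata as a JSON document."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # Wrap the decoder error in the same wording seen in the report.
        raise ValueError(f"malformed crash metadata: {exc}") from exc


def error_for(raw: str) -> str:
    """Return the error message for the given input, or "" if it parses."""
    try:
        parse_crash_metadata(raw)
    except ValueError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    print(error_for(""))                      # empty input, as if nothing was supplied
    print(repr(error_for('{"crash_id": "x"}')))  # well-formed input parses cleanly
```

If that is the cause here, supplying the crash's metadata file as input (the crash module documentation describes `ceph crash post -i <metafile>`) may avoid the error.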

Therefore, here is the JSON info:

> # ceph crash info 2021-11-06T00:19:20.049679Z_634c82a6-b234-4420-84ec-687a63fd4b0a
{
    "assert_condition": "in->first <= straydn->first",
    "assert_file": "/build/ceph-ShSgWi/ceph-15.2.14/src/mds/Server.cc",
    "assert_func": "void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)",
    "assert_line": 7271,
    "assert_msg": "/build/ceph-ShSgWi/ceph-15.2.14/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7f419a12e700 time 2021-11-06T01:19:20.046817+0100\n/build/ceph-ShSgWi/ceph-15.2.14/src/mds/Server.cc: 7271: FAILED ceph_assert(in->first <= straydn->first)\n",
    "assert_thread_name": "mds_rank_progr",
    "backtrace": [
        "(()+0x12730) [0x7f41a3618730]",
        "(gsignal()+0x10b) [0x7f41a2edd7bb]",
        "(abort()+0x121) [0x7f41a2ec8535]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a5) [0x7f41a3ff80f5]",
        "(()+0x28127c) [0x7f41a3ff827c]",
        "(Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x1038) [0x55d27df5edb8]",
        "(Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x381) [0x55d27df63691]",
        "(Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xcc3) [0x55d27df78833]",
        "(Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x30b) [0x55d27df78d8b]",
        "(Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x133) [0x55d27df85323]",
        "(MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0xb6c) [0x55d27def5c9c]",
        "(MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7ab) [0x55d27def81cb]",
        "(MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x55d27def87e2]",
        "(MDSContext::complete(int)+0x52) [0x55d27e1a3d72]",
        "(MDSRank::_advance_queues()+0x7c) [0x55d27def72dc]",
        "(MDSRank::ProgressThread::entry()+0xc5) [0x55d27def7975]",
        "(()+0x7fa3) [0x7f41a360dfa3]",
        "(clone()+0x3f) [0x7f41a2f9f4cf]" 
    ],
    "ceph_version": "15.2.14",
    "crash_id": "2021-11-06T00:19:20.049679Z_634c82a6-b234-4420-84ec-687a63fd4b0a",
    "entity_name": "mds.server5",
    "os_id": "10",
    "os_name": "Debian GNU/Linux 10 (buster)",
    "os_version": "10 (buster)",
    "os_version_id": "10",
    "process_name": "ceph-mds",
    "stack_sig": "909b713c842abd296d2bc812c6fb77c7a3eab63d1cb961676ae59a86eab1c30c",
    "timestamp": "2021-11-06T00:19:20.049679Z",
    "utsname_hostname": "server5",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.143-1-pve",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PVE 5.4.143-1 (Tue, 28 Sep 2021 09:10:37 +0200)" 
}
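
For triage, a report like the one above can be reduced to the failed assertion and the first frames below the assert helper. This is only a sketch: `frame_name` and `summarize_crash` are hypothetical helpers, the frame format is assumed from the backtrace shown in this report, and `sample` is a trimmed copy of that JSON.

```python
import re

# Assumed frame layout, taken from this report: "(symbol+0xOFFSET) [0xADDRESS]"
FRAME_RE = re.compile(r"^\((?P<sym>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def frame_name(frame: str) -> str:
    """Return the function name of a backtrace frame, or "" if it has none."""
    match = FRAME_RE.match(frame.strip())
    if not match:
        return ""
    # Drop the argument list, keeping e.g. "Server::_unlink_local".
    return match.group("sym").split("(", 1)[0]


def summarize_crash(info: dict, max_frames: int = 3) -> dict:
    """Condense a crash-info dict to the fields most useful for triage."""
    names = [n for n in map(frame_name, info.get("backtrace", [])) if n]
    # Skip the signal/abort/assert plumbing so the first frame is the asserting function.
    for i, name in enumerate(names):
        if name.endswith("__ceph_assert_fail"):
            names = names[i + 1:]
            break
    return {
        "daemon": info.get("entity_name"),
        "version": info.get("ceph_version"),
        "assert": info.get("assert_condition"),
        "line": info.get("assert_line"),
        "frames": names[:max_frames],
    }


# Trimmed copy of the crash info from this report.
sample = {
    "assert_condition": "in->first <= straydn->first",
    "assert_line": 7271,
    "backtrace": [
        "(()+0x12730) [0x7f41a3618730]",
        "(gsignal()+0x10b) [0x7f41a2edd7bb]",
        "(abort()+0x121) [0x7f41a2ec8535]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a5) [0x7f41a3ff80f5]",
        "(()+0x28127c) [0x7f41a3ff827c]",
        "(Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x1038) [0x55d27df5edb8]",
        "(Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x381) [0x55d27df63691]",
        "(Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xcc3) [0x55d27df78833]",
    ],
    "ceph_version": "15.2.14",
    "entity_name": "mds.server5",
}

if __name__ == "__main__":
    for key, value in summarize_crash(sample).items():
        print(f"{key}: {value}")
```

The summary makes it easier to compare this crash against other reports that fail the same assertion in the unlink path.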


Related issues 1 (1 open, 0 closed)

Is duplicate of CephFS - Bug #38452: mds: assert crash loop while unlinking file (Need More Info)

Actions #1

Updated by Venky Shankar over 2 years ago

  • Status changed from New to Triaged
  • Assignee set to Venky Shankar
  • Target version set to v15.2.15

Most likely the same as: https://tracker.ceph.com/issues/41147.

I guess this should have been fixed a while back. Will check and update.

Actions #2

Updated by Loïc Dachary over 2 years ago

  • Target version deleted (v15.2.15)

Actions #3

Updated by Patrick Donnelly over 1 year ago

  • Is duplicate of Bug #38452: mds: assert crash loop while unlinking file added

Actions #4

Updated by Patrick Donnelly over 1 year ago

  • Status changed from Triaged to Duplicate