Bug #51178

open

MDS became read-only while using rsync to copy files

Added by Jérôme Poulin almost 3 years ago. Updated almost 3 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Crash info on rank.1 before the FS became read-only.

{
    "os_version_id": "18.04", 
    "utsname_release": "4.15.0-142-generic", 
    "os_name": "Ubuntu", 
    "entity_name": "mds.sg1vosrv46", 
    "timestamp": "2021-06-11 14:34:14.603569Z", 
    "process_name": "ceph-mds", 
    "utsname_machine": "x86_64", 
    "utsname_sysname": "Linux", 
    "os_version": "18.04.5 LTS (Bionic Beaver)", 
    "os_id": "ubuntu", 
    "utsname_version": "#146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021", 
    "backtrace": [
        "(()+0x12980) [0x7fb52da1d980]", 
        "(gsignal()+0xc7) [0x7fb52cb15fb7]", 
        "(abort()+0x141) [0x7fb52cb17921]", 
        "(()+0x8c957) [0x7fb52d50a957]", 
        "(()+0x92ae6) [0x7fb52d510ae6]", 
        "(()+0x92b21) [0x7fb52d510b21]", 
        "(()+0x92d54) [0x7fb52d510d54]", 
        "(Capability::Import::decode(ceph::buffer::v14_2_0::list::iterator_impl<true>&)+0x2d5) [0x55708d27dc75]", 
        "(Server::_commit_slave_rename(boost::intrusive_ptr<MDRequestImpl>&, int, CDentry*, CDentry*, CDentry*)+0x7c3) [0x55708d013163]", 
        "(MDSContext::complete(int)+0x73) [0x55708d235993]", 
        "(MDCache::request_finish(boost::intrusive_ptr<MDRequestImpl>&)+0x1c3) [0x55708d085a13]", 
        "(Server::dispatch_slave_request(boost::intrusive_ptr<MDRequestImpl>&)+0xd5) [0x55708d022d85]", 
        "(Server::handle_slave_request(boost::intrusive_ptr<MMDSSlaveRequest const> const&)+0x9f0) [0x55708d025bb0]", 
        "(Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x82) [0x55708d026492]", 
        "(MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x72c) [0x55708cf8c93c]", 
        "(MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x4b3) [0x55708cf8efe3]", 
        "(MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0xb0) [0x55708cf8f870]", 
        "(MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0xfc) [0x55708cf7bc9c]", 
        "(DispatchQueue::entry()+0x1219) [0x7fb52e32e1d9]", 
        "(DispatchQueue::DispatchThread::entry()+0xd) [0x7fb52e3df08d]", 
        "(()+0x76db) [0x7fb52da126db]", 
        "(clone()+0x3f) [0x7fb52cbf871f]" 
    ], 
    "utsname_hostname": "sg1vosrv46", 
    "crash_id": "2021-06-11_14:34:14.603569Z_9c0aad25-7231-43da-97f7-ab7f3aff54c6", 
    "ceph_version": "14.2.21" 
}
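
When comparing this crash with other reports, the symbolized frames are the useful part of the backtrace. The following is a minimal sketch, not a Ceph tool, that loads a crash report shaped like the JSON above and keeps only the frames that carry a function name. The `CRASH_JSON` sample is an abbreviated copy of the report (most frames omitted), and `named_frames` is a hypothetical helper name.

```python
import json
import re

# Abbreviated copy of the crash report above (only a few backtrace frames kept).
CRASH_JSON = """
{
    "entity_name": "mds.sg1vosrv46",
    "ceph_version": "14.2.21",
    "backtrace": [
        "(()+0x12980) [0x7fb52da1d980]",
        "(gsignal()+0xc7) [0x7fb52cb15fb7]",
        "(abort()+0x141) [0x7fb52cb17921]",
        "(()+0x8c957) [0x7fb52d50a957]",
        "(Capability::Import::decode(ceph::buffer::v14_2_0::list::iterator_impl<true>&)+0x2d5) [0x55708d27dc75]",
        "(Server::_commit_slave_rename(boost::intrusive_ptr<MDRequestImpl>&, int, CDentry*, CDentry*, CDentry*)+0x7c3) [0x55708d013163]",
        "(MDSContext::complete(int)+0x73) [0x55708d235993]"
    ]
}
"""

# A frame looks like "(<symbol>+0x<offset>) [0x<address>]".
FRAME_RE = re.compile(r"^\((?P<symbol>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def named_frames(report):
    """Return the function names of symbolized frames, in stack order."""
    names = []
    for frame in report["backtrace"]:
        match = FRAME_RE.match(frame)
        if not match:
            continue
        symbol = match.group("symbol")
        if symbol == "()":  # unsymbolized frame such as "(()+0x12980)"
            continue
        # Drop the argument list: keep everything before the first "(".
        names.append(symbol.split("(", 1)[0])
    return names


report = json.loads(CRASH_JSON)
frames = named_frames(report)
print(report["entity_name"], report["ceph_version"])
for name in frames:
    print("  ", name)
```

On the sample, the first Ceph frames after `abort` are `Capability::Import::decode` and `Server::_commit_slave_rename`, which are the names worth searching for in the tracker.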

I had to run recover_dentries and reset the journal, sessions and FS before it came back online.

root@sg1vosrv43:~# cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
2021-06-11 11:54:03.240 7f9e10e07c40  1 recover_dentries: frag 1000001bcb4.00000000 is corrupt, overwriting
2021-06-11 11:55:45.188 7f9e10e07c40  1 recover_dentries: frag 1000001bcb6.00000000 is corrupt, overwriting
2021-06-11 12:01:34.220 7f9e10e07c40  1 recover_dentries: frag 1000001f566.00000000 is corrupt, overwriting
Events by type:
  COMMITTED: 412
  EXPORT: 29
  FRAGMENT: 3
  IMPORTFINISH: 35
  IMPORTSTART: 35
  OPEN: 33
  SESSION: 3
  SESSIONS: 2
  SLAVEUPDATE: 782
  SUBTREEMAP: 129
  UPDATE: 116151
Errors: 0

root@sg1vosrv43:~# cephfs-journal-tool --rank=cephfs:1 event recover_dentries summary
Events by type:
  COMMITTED: 393
  EXPORT: 66
  IMPORTFINISH: 70
  IMPORTSTART: 70
  OPEN: 15
  SESSION: 12
  SESSIONS: 3
  SLAVEUPDATE: 826
  SUBTREEMAP: 40
  UPDATE: 31545
Errors: 0

MDS log file with ino errors attached.


Files

ceph-mds.sg1vosrv43.log.zst (8.15 KB), Jérôme Poulin, 06/11/2021 04:35 PM
ceph-mds.sg1vosrv43.log (287 KB), Jérôme Poulin, 06/14/2021 02:02 PM
ceph-mds.sg1vosrv43-second-crash.log.zip (467 KB), Jérôme Poulin, 06/14/2021 03:34 PM
Actions #1

Updated by Jérôme Poulin almost 3 years ago

Looking at the stack trace, I have a feeling this is related to the active-active MDS setup. Would that be right? We've scaled the FS down to a single MDS in the meantime.

Actions #2

Updated by Patrick Donnelly almost 3 years ago

  • Status changed from New to Need More Info

Do you have any logs from the MDS?

Actions #3

Updated by Jérôme Poulin almost 3 years ago

Yes, the log file for the day is attached as a zstd-compressed file. I have also re-attached it as plain text, since it is quite small.

Actions #4

Updated by Jérôme Poulin almost 3 years ago

The FS just went read-only again. Here is the new log; there was no crash this time, and the log is bigger.

Actions #5

Updated by Jérôme Poulin almost 3 years ago

The second MDS tried to take the active role and crashed with:
2021-06-14 11:41:27.795 7f21d26d9700 1 mds.0.340 rejoin_done
2021-06-14 11:41:28.259 7f21d8ee6700 1 mds.sg1vosrv46 Updating MDS map to version 343 from mon.1
2021-06-14 11:41:28.259 7f21d8ee6700 1 mds.0.340 handle_mds_map i am now mds.0.340
2021-06-14 11:41:28.259 7f21d8ee6700 1 mds.0.340 handle_mds_map state change up:rejoin --> up:active
2021-06-14 11:41:28.259 7f21d8ee6700 1 mds.0.340 recovery_done -- successful recovery!
2021-06-14 11:41:28.299 7f21d8ee6700 1 mds.0.340 active_start
2021-06-14 11:41:28.307 7f21d8ee6700 1 mds.0.340 cluster recovered.
2021-06-14 11:41:28.307 7f21d26d9700 0 mds.0 RecoveryQueue::_recovered recovery error! -1
2021-06-14 11:41:28.307 7f21d26d9700 -1 log_channel(cluster) log [ERR] : OSD read error while recovering size for inode 0x20000000f52
2021-06-14 11:41:28.335 7f21d26d9700 1 mds.sg1vosrv46 respawn!

Actions #6

Updated by Jérôme Poulin almost 3 years ago

It seems to now be corrupted to the point where no MDS can start it, not even read-only.

-1017> 2021-06-14 11:50:27.142 7f89be8b4700  1 mds.0.368 active_start
-1016> 2021-06-14 11:50:27.146 7f89be8b4700 1 mds.0.368 cluster recovered.
-1015> 2021-06-14 11:50:27.146 7f89be8b4700 4 mds.0.368 set_osd_epoch_barrier: epoch=119227
-1014> 2021-06-14 11:50:27.146 7f89be8b4700 5 mds.sg1vosrv43 ms_handle_remote_reset on 10.10.4.230:0/1130228476
-1013> 2021-06-14 11:50:27.146 7f89be8b4700 3 mds.sg1vosrv43 ms_handle_remote_reset closing connection for session client.54605292 10.10.4.230:0/1130228476
-1012> 2021-06-14 11:50:27.146 7f89be8b4700 5 mds.sg1vosrv43 ms_handle_reset on 10.10.4.230:0/1130228476
-1011> 2021-06-14 11:50:27.146 7f89b80a7700 0 mds.0 RecoveryQueue::_recovered recovery error! -1
-1010> 2021-06-14 11:50:27.146 7f89b80a7700 -1 log_channel(cluster) log [ERR] : OSD read error while recovering size for inode 0x20000000f24
-1009> 2021-06-14 11:50:27.146 7f89b80a7700 5 mds.beacon.sg1vosrv43 set_want_state: up:active -> down:damaged
-1008> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client log_queue is 6814 last_log 6814 sent 0 num 6814 unsent 6814 sending 1000
-1007> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client will send 2021-06-14 11:50:26.097992 mds.sg1vosrv43 (mds.0) 1 : cluster [ERR] bad backtrace on directory inode 0x1000001e402
-1006> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client will send 2021-06-14 11:50:26.099927 mds.sg1vosrv43 (mds.0) 2 : cluster [ERR] bad backtrace on directory inode 0x1000001f54f
-1005> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client will send 2021-06-14 11:50:26.102859 mds.sg1vosrv43 (mds.0) 3 : cluster [ERR] bad backtrace on directory inode 0x1000001f552
-1004> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client will send 2021-06-14 11:50:26.145849 mds.sg1vosrv43 (mds.0) 4 : cluster [ERR] loaded dup inode 0x20000000de2 [2,head] v22185 at /pg_xlog_archives/10/massalert/.000000040000001200000000.zst.D2RndG, but inode 0x20000000de2.head v22208 already exists at /pg_xlog_archives/10/massalert/000000040000001200000000.zst
-1003> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client will send 2021-06-14 11:50:26.145873 mds.sg1vosrv43 (mds.0) 5 : cluster [ERR] loaded dup inode 0x20000000de1 [2,head] v22171 at /pg_xlog_archives/10/massalert/.0000000400000011000000FF.zst.15dX49, but inode 0x20000000de1.head v22206 already exists at /pg_xlog_archives/10/massalert/0000000400000011000000FF.zst
-1002> 2021-06-14 11:50:27.146 7f89b80a7700 10 log_client will send 2021-06-14 11:50:26.145892 mds.sg1vosrv43 (mds.0) 6 : cluster [ERR] loaded dup inode 0x20000000de0 [2,head] v22152 at /pg_xlog_archives/10/massalert/.0000000400000011000000FE.zst.ngCjqF, but inode 0x20000000de0.head v22204 already exists at /pg_xlog_archives/10/massalert/0000000400000011000000FE.zst
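
The excerpt above mixes three kinds of `[ERR]` messages, each naming an inode. As a rough sketch for triaging a larger MDS log, the snippet below groups the inode numbers by message type. The sample lines are message portions copied from the excerpt; `PATTERNS` and `collect_inodes` are hypothetical names, and the regular expressions only cover the three message texts seen in this report.

```python
import re
from collections import defaultdict

# Message portions copied from the log excerpt above.
LOG_LINES = [
    "log_channel(cluster) log [ERR] : OSD read error while recovering size for inode 0x20000000f24",
    "cluster [ERR] bad backtrace on directory inode 0x1000001e402",
    "cluster [ERR] bad backtrace on directory inode 0x1000001f54f",
    "cluster [ERR] bad backtrace on directory inode 0x1000001f552",
    "cluster [ERR] loaded dup inode 0x20000000de2 [2,head] v22185 at "
    "/pg_xlog_archives/10/massalert/.000000040000001200000000.zst.D2RndG, "
    "but inode 0x20000000de2.head v22208 already exists at "
    "/pg_xlog_archives/10/massalert/000000040000001200000000.zst",
]

# One pattern per message type; each captures the inode number.
PATTERNS = {
    "bad_backtrace": re.compile(r"bad backtrace on directory inode (0x[0-9a-f]+)"),
    "dup_inode": re.compile(r"loaded dup inode (0x[0-9a-f]+)"),
    "osd_read_error": re.compile(
        r"OSD read error while recovering size for inode (0x[0-9a-f]+)"
    ),
}


def collect_inodes(lines):
    """Map each error type to the distinct inode numbers it mentions, in order."""
    found = defaultdict(list)
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and match.group(1) not in found[kind]:
                found[kind].append(match.group(1))
    return dict(found)


result = collect_inodes(LOG_LINES)
for kind, inodes in sorted(result.items()):
    print(kind, inodes)
```

To run it against a real log, replace `LOG_LINES` with the lines of the attached file.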
Actions #7

Updated by Jérôme Poulin almost 3 years ago

Here are the hex dumps of the "bad backtrace on directory" inodes.

│11:58:31│0│jerome@p4:~
├ rados --cluster prod2 -p cephfsmeta listomapvals 1000001f54f.00000000
main_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 50  |........I......P|
00000010  f5 01 00 00 01 00 00 00  00 00 00 04 5a c3 60 d4  |............Z.`.|
00000020  7a c4 31 ed 41 00 00 00  00 00 00 00 00 00 00 01  |z.1.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 c6 9b 21 60 f2 0c  |............!`..|
00000080  bf 1c 04 5a c3 60 46 1c  a4 31 02 00 00 00 00 00  |...Z.`F..1......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  04 5a c3 60 23 d3 81 31  01 00 00 00 00 00 00 00  |.Z.`#..1........|
000000b0  00 00 00 00 00 00 00 00  03 00 00 00 00 00 00 00  |................|
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 71 11  |..8...........q.|
000000d0  81 00 00 00 00 00 01 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 04 5a  c3 60 d4 7a c4 31 03 02  |.......Z.`.z.1..|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 71 11 81 00  |8...........q...|
00000110  00 00 00 00 01 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 04 5a c3 60  d4 7a c4 31 15 00 00 00  |.....Z.`.z.1....|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 5a  |...............Z|
000001a0  c3 60 e5 f7 a0 22 04 00  00 00 00 00 00 00 ff ff  |.`..."..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

│11:58:51│0│jerome@p4:~
├ rados --cluster prod2 -p cephfsmeta listomapvals 1000001f552.00000000
camapgears_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 53  |........I......S|
00000010  f5 01 00 00 01 00 00 00  00 00 00 52 5a c3 60 e6  |...........RZ.`.|
00000020  7f 98 2c ed 41 00 00 00  00 00 00 00 00 00 00 01  |..,.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 5b 52 c2 60 51 52  |..........[R.`QR|
00000080  bb 36 52 5a c3 60 49 79  43 2c 01 00 00 00 00 00  |.6RZ.`IyC,......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 52 5a  c3 60 e6 7f 98 2c 03 02  |......RZ.`...,..|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 52 5a c3 60  e6 7f 98 2c c4 05 00 00  |....RZ.`...,....|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 52 5a  |..............RZ|
000001a0  c3 60 9d e3 e6 1e 02 00  00 00 00 00 00 00 ff ff  |.`..............|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

citammapgears_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 54  |........I......T|
00000010  f5 01 00 00 01 00 00 00  00 00 00 fe 5a c3 60 ed  |............Z.`.|
00000020  8f 0c 37 ed 41 00 00 00  00 00 00 00 00 00 00 01  |..7.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 7c 7d c2 60 7b b8  |..........|}.`{.|
00000080  18 32 fe 5a c3 60 d5 08  c0 36 02 00 00 00 00 00  |.2.Z.`...6......|
00000090  00 00 03 02 28 00 00 00  7b 02 00 00 00 00 00 00  |....(...{.......|
000000a0  fe 5a c3 60 43 82 f3 35  4b 01 00 00 00 00 00 00  |.Z.`C..5K.......|
000000b0  00 00 00 00 00 00 00 00  e1 03 00 00 00 00 00 00  |................|
000000c0  03 02 38 00 00 00 12 00  00 00 00 00 00 00 ab b1  |..8.............|
000000d0  e8 7c 00 00 00 00 4b 01  00 00 00 00 00 00 01 00  |.|....K.........|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 fe 5a  c3 60 ed 8f 0c 37 03 02  |.......Z.`...7..|
00000100  38 00 00 00 12 00 00 00  00 00 00 00 ab b1 e8 7c  |8..............||
00000110  00 00 00 00 4b 01 00 00  00 00 00 00 01 00 00 00  |....K...........|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 fe 5a c3 60  ed 8f 0c 37 c0 05 00 00  |.....Z.`...7....|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 52 5a  |..............RZ|
000001a0  c3 60 11 e0 4f 1f e2 03  00 00 00 00 00 00 ff ff  |.`..O...........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

│11:59:06│0│jerome@p4:~
├ rados --cluster prod2 -p cephfsmeta listomapvals 1000001e402.00000000
c911p_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 03  |........I.......|
00000010  e4 01 00 00 01 00 00 00  00 00 00 4c 85 c2 60 f9  |...........L..`.|
00000020  18 c1 12 ed 41 00 00 00  00 00 00 00 00 00 00 01  |....A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 2d 78 c2 60 6c 3a  |..........-x.`l:|
00000080  5f 1e 4c 85 c2 60 d2 e4  4d 12 02 00 00 00 00 00  |_.L..`..M.......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  4c 85 c2 60 5d 5e 9c 11  6a 10 00 00 00 00 00 00  |L..`]^..j.......|
000000b0  00 00 00 00 00 00 00 00  3e 31 00 00 00 00 00 00  |........>1......|
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 a9 fd  |..8.............|
000000d0  5c e2 07 00 00 00 6a 10  00 00 00 00 00 00 01 00  |\.....j.........|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 4c 85  c2 60 f9 18 c1 12 03 02  |......L..`......|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 a9 fd 5c e2  |8.............\.|
00000110  07 00 00 00 6a 10 00 00  00 00 00 00 01 00 00 00  |....j...........|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 4c 85 c2 60  f9 18 c1 12 d1 e8 00 00  |....L..`........|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 10 7a  |...............z|
000001a0  c2 60 33 71 49 28 3f 31  00 00 00 00 00 00 ff ff  |.`3qI(?1........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

citam311_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 04  |........I.......|
00000010  e4 01 00 00 01 00 00 00  00 00 00 e3 85 c2 60 6a  |..............`j|
00000020  ab a3 30 ed 41 00 00 00  00 00 00 00 00 00 00 01  |..0.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 47 6b c2 60 59 3d  |..........Gk.`Y=|
00000080  99 21 e3 85 c2 60 ff 30  8c 30 02 00 00 00 00 00  |.!...`.0.0......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  e3 85 c2 60 38 66 40 30  e0 00 00 00 00 00 00 00  |...`8f@0........|
000000b0  00 00 00 00 00 00 00 00  a0 02 00 00 00 00 00 00  |................|
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 01 c6  |..8.............|
000000d0  e8 6b 00 00 00 00 e0 00  00 00 00 00 00 00 01 00  |.k..............|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 e3 85  c2 60 6a ab a3 30 03 02  |.........`j..0..|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 01 c6 e8 6b  |8..............k|
00000110  00 00 00 00 e0 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 e3 85 c2 60  6a ab a3 30 0d f4 00 00  |.......`j..0....|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 10 7a  |...............z|
000001a0  c2 60 53 cb a0 28 a1 02  00 00 00 00 00 00 ff ff  |.`S..(..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce
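
The omap values above are easier to compare once the leading fields are decoded. The sketch below rebuilds the bytes from the first rows of the `main_head` dump and unpacks them. The field layout is inferred from the dumps in this report (every value starts the same way and the decoded numbers are plausible), not taken from Ceph documentation, so treat the offsets as an assumption. `hexdump_to_bytes` and `decode_dentry_prefix` are hypothetical helper names.

```python
import struct
from datetime import datetime, timezone

# First three rows of the `main_head` value from object 1000001f54f.00000000.
DUMP = """\
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 50  |........I......P|
00000010  f5 01 00 00 01 00 00 00  00 00 00 04 5a c3 60 d4  |............Z.`.|
00000020  7a c4 31 ed 41 00 00 00  00 00 00 00 00 00 00 01  |z.1.A...........|
"""


def hexdump_to_bytes(text):
    """Rebuild raw bytes from `hexdump -C` style rows ('*' elision is not handled)."""
    data = bytearray()
    for row in text.splitlines():
        fields = row.split("|", 1)[0].split()  # drop the ASCII column
        if len(fields) < 2:
            continue  # offset-only or empty row
        data += bytes(int(byte, 16) for byte in fields[1:])
    return bytes(data)


def decode_dentry_prefix(raw):
    """Decode the leading fields, assuming this little-endian layout:
    u64 first snapid, 1-byte type marker, u8 version, u8 compat version,
    u32 payload length, u64 inode number, u32 rdev, u32+u32 ctime, u32 mode."""
    snap_first, marker = struct.unpack_from("<Qc", raw, 0)
    struct_v, compat_v, length = struct.unpack_from("<BBI", raw, 9)
    ino, rdev, ctime_sec, ctime_nsec, mode = struct.unpack_from("<QIIII", raw, 15)
    return {
        "snap_first": snap_first,
        "marker": marker.decode("ascii"),
        "struct_v": struct_v,
        "compat_v": compat_v,
        "length": length,
        "ino": ino,
        "rdev": rdev,
        "ctime": datetime.fromtimestamp(ctime_sec, tz=timezone.utc),
        "ctime_nsec": ctime_nsec,
        "mode": mode,
    }


fields = decode_dentry_prefix(hexdump_to_bytes(DUMP))
print(hex(fields["ino"]), oct(fields["mode"]), fields["ctime"].isoformat())
```

Under that assumed layout, `main_head` decodes to inode 0x1000001f550 with mode 040755 (a directory) and a ctime on 2021-06-11, the day of the first crash.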
Actions #8

Updated by Jérôme Poulin almost 3 years ago

We now have a pool in which no objects can be listed. From the log, I understand that the MDS is trying to read inode 0x20000000f24 and failing; I cannot fetch the corresponding object myself either.

-1010> 2021-06-14 11:50:27.146 7f89b80a7700 -1 log_channel(cluster) log [ERR] :  OSD read error while recovering size for inode 0x20000000f24
├ rados --cluster prod2 -p cephfsarchive get 20000000f24.00000000  asd
error getting cephfsarchive/20000000f24.00000000: (2) No such file or directory
├ rados --cluster prod2 -p cephfsdata get 20000000f24.00000000  asd
error getting cephfsdata/20000000f24.00000000: (2) No such file or directory
│12:40:27│1│jerome@p4:~
├ rados --cluster prod2 -p cephfsarchive ls
(no return)
│12:40:58│0│jerome@p4:~
├ rados --cluster prod2 -p cephfsarchive df
POOL_NAME         USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS   RD  WR_OPS       WR  USED COMPR  UNDER COMPR
cephfsarchive  477 GiB    84067       0  168134                   0        0         0       0  0 B   84075  238 GiB         0 B          0 B
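
The `rados get` attempts above rely on how CephFS names data objects: the inode number in hex, a dot, then the object index within the file as eight hex digits. The small sketch below builds such names so other inodes from the log can be checked the same way; `data_object_name` is a hypothetical helper, not part of any Ceph client library.

```python
def data_object_name(ino, index=0):
    """Name of the index-th data object of a file: '<ino hex>.<index as 8 hex digits>'."""
    return f"{ino:x}.{index:08x}"


# First data object of the inode named in the "OSD read error" message.
first_object = data_object_name(0x20000000f24)
print(first_object)
```

The result can be passed to `rados -p <pool> get` or `rados -p <pool> stat`, as in the session above.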
Actions #9

Updated by Jérôme Poulin almost 3 years ago

One more piece of information: our Ceph FUSE clients are running Ceph 15, in case that matters.

Actions #10

Updated by Jérôme Poulin almost 3 years ago

To restart the CephFS, I had to find another broken inode in the log.

So I found that this entry crashes the MDS.

2021-06-14 13:34:55.480 7f4eb1257700 -1 log_channel(cluster) log [ERR] :  OSD read error while recovering size for inode 0x20000000f36

I found that this entry is in the scarto directory.

2021-06-14 09:03:59.968 7fb048bbb700  0 mds.0.cache.dir(0x1000001f566.100*) _fetched  badness: got (but i already had) [inode 0x20000000f36 [2,head] /pg_xlog_archives/9.6/scarto/.00000017000001B7000000CE.zst.APjjxP auth v122428 s=4138809 n(v0 rc2021-06-11 10:12:52.881593 b4138809 1=1+0) (iversion lock) cr={54092403=0-8388608@1} 0x560aa54cf100] mode 33188 mtime 2021-04-22 00:22:06.000000
2021-06-14 09:03:59.968 7fb048bbb700 -1 log_channel(cluster) log [ERR] : loaded dup inode 0x20000000f36 [2,head] v122472 at /pg_xlog_archives/9.6/scarto/00000017000001B7000000CE.zst, but inode 0x20000000f36.head v122428 already exists at /pg_xlog_archives/9.6/scarto/.00000017000001B7000000CE.zst.APjjxP
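
In each "loaded dup inode" message in this report, one path is a dot-prefixed name with a six-character suffix and the other is the plain file name. That matches rsync's default behaviour of writing to a temporary file and renaming it into place. The sketch below parses such a message and checks that relationship. `DUP_RE` and `is_rsync_temp_of` are hypothetical names, and the temp-name rule (leading dot, final name, dot, six characters) is an assumption about the rsync configuration used here.

```python
import posixpath
import re

# Message portion copied from the log line above.
LINE = (
    "loaded dup inode 0x20000000f36 [2,head] v122472 at "
    "/pg_xlog_archives/9.6/scarto/00000017000001B7000000CE.zst, "
    "but inode 0x20000000f36.head v122428 already exists at "
    "/pg_xlog_archives/9.6/scarto/.00000017000001B7000000CE.zst.APjjxP"
)

DUP_RE = re.compile(
    r"loaded dup inode (?P<ino>0x[0-9a-f]+) \[[^\]]*\] v(?P<v_loaded>\d+) "
    r"at (?P<loaded>\S+), "
    r"but inode 0x[0-9a-f]+\.head v(?P<v_existing>\d+) "
    r"already exists at (?P<existing>\S+)"
)


def is_rsync_temp_of(temp_path, final_path):
    """True if temp_path looks like rsync's temporary name for final_path.

    Assumes the default naming: '.' + final name + '.' + 6 random characters,
    created in the same directory as the final file.
    """
    temp_dir, temp_name = posixpath.split(temp_path)
    final_dir, final_name = posixpath.split(final_path)
    prefix = "." + final_name + "."
    return (
        temp_dir == final_dir
        and temp_name.startswith(prefix)
        and len(temp_name) == len(prefix) + 6
    )


match = DUP_RE.search(LINE)
loaded, existing = match.group("loaded"), match.group("existing")
# The temporary name can appear on either side of the message.
rename_pair = is_rsync_temp_of(existing, loaded) or is_rsync_temp_of(loaded, existing)
print(match.group("ino"), match.group("v_loaded"), match.group("v_existing"), rename_pair)
```

Every duplicate pair quoted in this report has this temp-name and final-name shape, which points at the rename step of the rsync copy.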

I did a reverse search for "bad backtrace" in the log and found this one.

2021-06-14 09:03:43.592 7fb048bbb700 -1 log_channel(cluster) log [ERR] : bad backtrace on directory inode 0x1000001f563

I listed the omap keys of this entry and found scarto_head.

│13:52:08│0│jerome@p4:~
├ rados --cluster prod2 -p cephfsmeta listomapvals 1000001f563.00000000
c311p_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 64  |........I......d|
00000010  f5 01 00 00 01 00 00 00  00 00 00 40 5d c3 60 37  |...........@].`7|
00000020  31 de 00 ed 41 00 00 00  00 00 00 00 00 00 00 01  |1...A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 bc c1 5f 5f d7 22  |............__."|
00000080  18 29 40 5d c3 60 1d 30  76 00 01 00 00 00 00 00  |.)@].`.0v.......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 40 5d  c3 60 37 31 de 00 03 02  |......@].`71....|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 40 5d c3 60  37 31 de 00 59 15 02 00  |....@].`71..Y...|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 7b 6b 3d 3a 02 00  00 00 00 00 00 00 ff ff  |.`{k=:..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

citam311_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 65  |........I......e|
00000010  f5 01 00 00 01 00 00 00  00 00 00 40 5d c3 60 64  |...........@].`d|
00000020  89 00 01 ed 41 00 00 00  00 00 00 00 00 00 00 01  |....A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 44 b2 4f 60 47 9a  |..........D.O`G.|
00000080  f1 30 40 5d c3 60 79 39  b3 00 01 00 00 00 00 00  |.0@].`y9........|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 40 5d  c3 60 64 89 00 01 03 02  |......@].`d.....|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 40 5d c3 60  64 89 00 01 58 15 02 00  |....@].`d...X...|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 b5 fe 92 3a 02 00  00 00 00 00 00 00 ff ff  |.`...:..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

scarto_head
value (470 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 66  |........I......f|
00000010  f5 01 00 00 01 00 00 00  00 00 00 65 74 c3 60 5d  |...........et.`]|
00000020  6f e2 02 ed 41 00 00 00  00 00 00 00 00 00 00 01  |o...A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 65 74 c3 60 5d 6f  |..........et.`]o|
00000080  e2 02 4e 5d c3 60 f7 cf  5c 12 01 00 00 00 00 00  |..N].`..\.......|
00000090  00 00 03 02 28 00 00 00  93 09 00 00 00 00 00 00  |....(...........|
000000a0  65 74 c3 60 5d 6f e2 02  e5 2d 00 00 00 00 00 00  |et.`]o...-......|
000000b0  00 00 00 00 00 00 00 00  5b 75 00 00 00 00 00 00  |........[u......|
000000c0  03 02 38 00 00 00 8d 00  00 00 00 00 00 00 05 68  |..8............h|
000000d0  4c f4 0c 00 00 00 e5 2d  00 00 00 00 00 00 01 00  |L......-........|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 65 74  c3 60 5d 6f e2 02 03 02  |......et.`]o....|
00000100  38 00 00 00 8d 00 00 00  00 00 00 00 05 68 4c f4  |8............hL.|
00000110  0c 00 00 00 e5 2d 00 00  00 00 00 00 01 00 00 00  |.....-..........|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 65 74 c3 60  5d 6f e2 02 66 15 02 00  |....et.`]o..f...|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 06 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 6a 0a ae 3a 5b 75  00 00 00 00 00 00 ff ff  |.`j..:[u........|
000001b0  ff ff 01 00 00 00 00 00  00 00 03 00 00 00 00 00  |................|
000001c0  00 00 00 00 00 00 00 00  00 00 fe ff ff ff ff ff  |................|
000001d0  ff ff 00 00 00 00                                 |......|
000001d6

smobile_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 67  |........I......g|
00000010  f5 01 00 00 01 00 00 00  00 00 00 3d 5d c3 60 32  |...........=].`2|
00000020  5d c7 3a c0 41 00 00 00  00 00 00 00 00 00 00 01  |].:.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 3d 5d c3 60 32 5d  |..........=].`2]|
00000080  c7 3a 3d 5d c3 60 32 5d  c7 3a 00 00 00 00 00 00  |.:=].`2].:......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 02  |................|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 08 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 08 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 32 5d c7 3a 00 00  00 00 00 00 00 00 ff ff  |.`2].:..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

smonitor_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 68  |........I......h|
00000010  f5 01 00 00 01 00 00 00  00 00 00 3d 5d c3 60 0c  |...........=].`.|
00000020  44 df 3a c0 41 00 00 00  00 00 00 00 00 00 00 01  |D.:.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 3d 5d c3 60 0c 44  |..........=].`.D|
00000080  df 3a 3d 5d c3 60 0c 44  df 3a 00 00 00 00 00 00  |.:=].`.D.:......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 02  |................|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 0a 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 0a 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 0c 44 df 3a 00 00  00 00 00 00 00 00 ff ff  |.`.D.:..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

srao_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 69  |........I......i|
00000010  f5 01 00 00 01 00 00 00  00 00 00 3d 5d c3 60 7b  |...........=].`{|
00000020  d0 f6 3a c0 41 00 00 00  00 00 00 00 00 00 00 01  |..:.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 3d 5d c3 60 7b d0  |..........=].`{.|
00000080  f6 3a 3d 5d c3 60 7b d0  f6 3a 00 00 00 00 00 00  |.:=].`{..:......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 02  |................|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 0c 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 0c 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 7b d0 f6 3a 00 00  00 00 00 00 00 00 ff ff  |.`{..:..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

vnomad_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 6a  |........I......j|
00000010  f5 01 00 00 01 00 00 00  00 00 00 3d 5d c3 60 49  |...........=].`I|
00000020  1a 0e 3b c0 41 00 00 00  00 00 00 00 00 00 00 01  |..;.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 3d 5d c3 60 49 1a  |..........=].`I.|
00000080  0e 3b 3d 5d c3 60 49 1a  0e 3b 00 00 00 00 00 00  |.;=].`I..;......|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 02  |................|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 0e 00 00 00  |................|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 0e 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  00 00 00 00 00 00 3d 5d  |..............=]|
000001a0  c3 60 49 1a 0e 3b 00 00  00 00 00 00 00 00 ff ff  |.`I..;..........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce
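
For anyone who wants to inspect these values programmatically, here is a minimal sketch that turns a dump like the ones above back into raw bytes. It assumes the `hexdump -C` style layout seen here (offset, hex columns, ASCII column between `|`, a lone `*` standing for repeats of the previous row, and a final offset-only line). The function name `parse_hexdump` and the sample text are made up for illustration; the sample is synthetic, not taken from this cluster.

```python
def parse_hexdump(text):
    """Rebuild bytes from hexdump -C style text, expanding '*' repeats."""
    out = bytearray()
    prev_row = b""
    squeezed = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "*":
            # One or more rows identical to the previous row were elided.
            squeezed = True
            continue
        offset_str, _, rest = line.partition(" ")
        offset = int(offset_str, 16)
        if squeezed:
            # Refill the elided rows until we reach the next printed offset.
            while len(out) < offset:
                out += prev_row
            squeezed = False
        # Keep only the hex columns, dropping the |ASCII| column.
        hex_part = rest.split("|", 1)[0]
        row = bytes.fromhex("".join(hex_part.split()))
        if row:
            out += row
            prev_row = row
    return bytes(out)


# Synthetic sample (hypothetical data, same layout as the dumps above).
sample = """\
00000000  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000030  fe ff ff ff 00 00                                 |......|
00000036
"""

data = parse_hexdump(sample)
print(len(data))  # 54, i.e. 0x36, matching the final offset line
```

Comparing `len(data)` against the "value (N bytes)" header is a quick sanity check that no row was lost when copying a dump.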

Then I removed the problematic folder's entry from the omap keys.

│13:52:59│0│jerome@p4:~
├ rados --cluster prod2 -p cephfsmeta rmomapkey 1000001f563.00000000 scarto_head 
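
As a hedged sketch, the removal above could be preceded by a backup of the key's value, in case the entry needs to be examined or restored later. The helper below only builds the two command lines and does not run anything. The helper name and the backup file name `scarto_head.bin` are hypothetical, and it assumes `rados getomapval <obj> <key> [file]` writes the raw value to the given file.

```python
import shlex


def omap_backup_then_remove(cluster, pool, obj, key, backup_file):
    """Return argv lists: first save the omap value, then remove the key."""
    base = ["rados", "--cluster", cluster, "-p", pool]
    return [
        # Assumption: the optional last argument is an output file.
        base + ["getomapval", obj, key, backup_file],
        base + ["rmomapkey", obj, key],
    ]


cmds = omap_backup_then_remove(
    "prod2", "cephfsmeta", "1000001f563.00000000",
    "scarto_head", "scarto_head.bin",  # backup file name is hypothetical
)
for argv in cmds:
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the printed commands have been reviewed.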

Now the FS is mounted read-write again. We'll wait for further comments before touching this specific folder again.

Actions #11

Updated by Jérôme Poulin almost 3 years ago

The PG autoscaler is disabled, and we missed the fact that the pool was created with only 8 PGs. Could that affect the ability to list the pool (too many objects per PG)?

Actions #12

Updated by Jérôme Poulin almost 3 years ago

Sorry about that last comment: the pool wasn't empty. Everything was in another namespace for security and, in the panic, I forgot to add --all to rados ls. The objects are still present.

Actions #13

Updated by Jérôme Poulin almost 3 years ago

Well, the filesystem has since recovered. A scrub revealed that one of the MDS daemons didn't have permission to write to the cephfsarchive pool. This has been corrected, objects without xattrs on them have been deleted, and everything is back to normal.

My main concern here is how unclear the error messages were in pointing to that problem when there are 2 active MDS daemons. We found out the hard way, with a broken filesystem and "backtrace" errors.
