Bug #62925

Updated by Prashant D 8 months ago

The cephfs-journal-tool should be used only by experts who know CephFS internals. Although the documentation at https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects clearly warns against using cephfs-journal-tool to reset the journal without the CephFS team's advice, some users still try the tool without much thought, which can result in an MDS crash as observed in https://tracker.ceph.com/issues/58878.

 <pre> 
 sh-4.4$ cephfs-journal-tool --rank ocs-storagecluster-cephfilesystem:0 event recover_dentries summary 
 Events by type: 
   RESETJOURNAL: 1 
 Errors: 0 
 sh-4.4$ cephfs-journal-tool --rank ocs-storagecluster-cephfilesystem:0 journal reset 
 old journal was 8388608~48 
 new journal start will be 12582912 (4194256 bytes past old end) 
 writing journal head 
 writing EResetJournal entry 
 done 
 </pre> 

 We should print a warning message with a prompt to continue or not when this tool is run to reset the journal. Also, cephfs-journal-tool should not be run while CephFS is online, or we should show a clear warning when a user attempts to run it against a live CephFS, especially for the "event recover_dentries summary" command, which writes any inodes/dentries recoverable from the journal to the RADOS store.
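
To illustrate the requested behaviour, here is a minimal Python sketch of a warn-and-confirm gate. It is not how cephfs-journal-tool is implemented (the tool is written in C++), and the function name `confirm_destructive`, the `fs_online` parameter, and the warning wording are all hypothetical. How the tool would detect that the file system is online is left open here; the caller is assumed to supply that as a boolean.

```python
import sys
from typing import Callable


def confirm_destructive(
    action: str,
    fs_online: bool,
    ask: Callable[[str], str] = input,
) -> bool:
    """Warn about a destructive journal operation and ask for confirmation.

    Hypothetical helper: returns True only when the operator types 'yes'.
    `fs_online` is supplied by the caller; detecting it is out of scope.
    """
    print(
        f"WARNING: '{action}' modifies CephFS metadata and may cause "
        "MDS crashes or data loss.",
        file=sys.stderr,
    )
    if fs_online:
        # Extra warning for the live-file-system case raised in this report.
        print(
            "WARNING: the file system appears to be online; "
            "take it offline before continuing.",
            file=sys.stderr,
        )
    answer = ask("Type 'yes' to continue: ")
    # Anything other than an explicit 'yes' is treated as a refusal.
    return answer.strip().lower() == "yes"
```

Defaulting to refusal on any answer other than an explicit "yes" means an accidental Enter keypress does not reset the journal. The `ask` parameter exists only so the prompt can be replaced in tests or scripted runs.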
