Bug #8255
closed
mds: directory with missing object cannot be removed
Added by Dmitry Smirnov about 10 years ago.
Updated almost 8 years ago.
Category:
fsck/damage handling
Description
The MDS writes the following line to its log over 14000 times per minute:
2014-04-30 15:05:50.996261 7fe8b4237700 0 mds.0.cache open_remote_dentry_finish bad remote dentry [dentry #1/home/user/.config/epiphany/session_state.xml~ [2,head] auth REMOTE(reg) (dversion lock) pv=0 v=1036 inode=0 0x7fe8ec0c8190]
Also the following error was logged once:
2014-04-30 14:42:36.148296 mds.0 [ERR] unmatched rstat rbytes on single dirfrag 1000010bd69, inode has n(v19 rc2014-04-30 14:42:36.134246 b200307 35=29+6), dirfrag has n(v19 rc2014-04-30 14:42:36.134246 b197383 33=28+5)
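To make the size of the mismatch in that error easier to see, here is a minimal Python sketch that pulls the two n(...) blocks out of the message and subtracts them. It assumes the trailing fields mean b<recursive bytes> and <total>=<files>+<subdirs>; that reading, and the names parse_rstats and rstat_diff, are illustrative assumptions and not taken from the Ceph sources.

```python
import re

# The error message from the ticket, without the timestamp prefix.
LOG_LINE = (
    "unmatched rstat rbytes on single dirfrag 1000010bd69, "
    "inode has n(v19 rc2014-04-30 14:42:36.134246 b200307 35=29+6), "
    "dirfrag has n(v19 rc2014-04-30 14:42:36.134246 b197383 33=28+5)"
)

# Assumed field layout at the end of each n(...) block:
#   b<rbytes> <total>=<files>+<subdirs>)
RSTAT = re.compile(r"b(\d+) (\d+)=(\d+)\+(\d+)\)")


def parse_rstats(line):
    """Return one dict per n(...) block found in the line, in order."""
    keys = ("rbytes", "total", "files", "subdirs")
    return [dict(zip(keys, map(int, match))) for match in RSTAT.findall(line)]


def rstat_diff(line):
    """Subtract the dirfrag counters from the inode counters."""
    inode, dirfrag = parse_rstats(line)
    return {key: inode[key] - dirfrag[key] for key in inode}


print(rstat_diff(LOG_LINE))
```

Under that reading, the inode accounts for 2924 bytes, one file and one subdirectory more than the dirfrag does.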
I can't remove /home/user/.config/epiphany:
# sudo rm -rv /mnt/ceph/home/user/.config/epiphany
rm: cannot remove `/mnt/ceph/home/user/.config/epiphany': Directory not empty
Please advise.
- Status changed from New to Need More Info
More log output is needed to diagnose this:
truncate the mds log
execute "rm -rv /mnt/ceph/home/user/.config/epiphany"
update the mds log
Besides, I'm curious when the FS was created (and with which version).
The FS was created on 0.72.2, then upgraded to 0.78 and 0.79, followed by 0.80~rc1.
Somehow the journal was corrupted during cluster recovery; the MDS was crashing on journal replay.
Unfortunately I lost the crash dump because of the log flood mentioned above.
I had to resort to "--reset-journal" to get access to files.
Some files are corrupted (that's not a problem) but now I'm getting errors like
2014-05-01 03:37:22.059908 mds.0 [ERR] dir 10000421bec object missing on disk; some files may be lost
2014-05-01 03:53:58.440638 mds.0 [ERR] dir 10000421646 object missing on disk; some files may be lost
on MDS start.
I moved the "epiphany" directory out of the way and rebooted the client(s) that were accessing it.
I still can't remove it, but now there is nothing in the MDS log whatsoever. Could it be that one directory entry is stuck but the MDS does not show it because of the above error?
I wonder how I can wipe out the affected directories (together with their directory fragments)?
Get the inode number of the 'epiphany' directory, then modify Server::_dir_is_nonempty_unlocked() and Server::_dir_is_nonempty() in src/mds/Server.cc by adding the line
"if (in->ino() == <inode number>) return false;" at the beginning of both functions.
Thanks, I might try that or make a new file system from scratch.
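The workaround suggested above needs the directory's inode number. A minimal Python sketch for reading it from a client mount follows. It assumes that st_ino as seen through the CephFS mount matches the MDS inode number, and it prints the value in hexadecimal because the MDS messages in this ticket show inode numbers in that form (for example 10000421bec). The helper name inode_hex is hypothetical.

```python
import os


def inode_hex(path):
    """Return the inode number of path as a lowercase hex string.

    Assumption: on a CephFS mount, st_ino is the same number the MDS
    uses for the inode; the MDS log prints it in hexadecimal.
    """
    return format(os.stat(path).st_ino, "x")


# Demo on the current directory; on the affected client one would pass
# the path of the damaged directory on the CephFS mount instead.
print(inode_hex("."))
```

The resulting value would still have to be written as a valid C++ literal (for example with a 0x prefix) before being pasted into the comparison.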
More than one issue is mentioned in this ticket, but as a TODO I think we should rate-limit logging to prevent floods of similar messages. Perhaps the logger can remember the last message and just increment a counter if the same message is repeated. It could then print something like "previous message repeated NNN times." every 10 or 30 seconds.
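The idea can be sketched in a few lines of Python. This is only an illustration of the proposed behaviour, not how the Ceph logger works: the class name RepeatSuppressor, its parameters and the injectable clock are all hypothetical.

```python
import time


class RepeatSuppressor:
    """Collapse runs of identical log messages into a periodic summary."""

    def __init__(self, emit=print, interval=10.0, clock=time.monotonic):
        self.emit = emit          # function that actually writes a line
        self.interval = interval  # seconds between "repeated" summaries
        self.clock = clock        # injectable so the logic can be tested
        self.last = None          # last distinct message seen
        self.repeats = 0          # suppressed copies since the last summary
        self.last_report = 0.0    # time of the last emitted line or summary

    def log(self, message):
        now = self.clock()
        if message == self.last:
            # Same message again: count it, summarise at most once per interval.
            self.repeats += 1
            if now - self.last_report >= self.interval:
                self._report(now)
            return
        # A different message: report pending repeats, then emit it.
        self._report(now)
        self.emit(message)
        self.last = message

    def flush(self):
        """Emit any pending summary, e.g. at shutdown."""
        self._report(self.clock())

    def _report(self, now):
        if self.repeats:
            self.emit(f"previous message repeated {self.repeats} times.")
            self.repeats = 0
        self.last_report = now
```

With a 10-second interval, a message repeated thousands of times per minute would produce one original line plus roughly one summary line every 10 seconds, and a different message always gets through immediately.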
- Subject changed from 0.80~rc1: MDS log pollution, unable to remove directory (unmatched rstat rbytes on single dirfrag) to mds: directory with missing object cannot be removed
- Status changed from Need More Info to 12
- Priority changed from High to Normal
- Source changed from other to Community (user)
I think the remaining step is to eventually incorporate the ability to remove the last trace of the damaged directory.
- Status changed from 12 to Fix Under Review
- Status changed from Fix Under Review to New
- Category changed from 47 to fsck/damage handling
- Component(FS) MDS added
John, much of this is handled now with the metadata damaged flags. What's left?
- Status changed from New to Resolved
As of Jewel, this kind of issue should be handled cleanly: the MDS will raise a 'damaged' health alert, with specifics available in "damage ls".