Bug #19291 (closed)
mds: log rotation doesn't work if mds has respawned
Description
If an MDS respawns, its "comm" name becomes "exe", which confuses logrotate because it relies on killall. What ends up happening is that logrotate renames the current log to e.g. "ceph-mds.li1015-93.log.1", sends SIGHUP to all processes named "ceph-mds" (there are none!), gzips ceph-mds.li1015-93.log.1 to ceph-mds.li1015-93.log.1.gz, and then unlinks the original log ceph-mds.li1015-93.log.1. Because ceph-mds never gets the SIGHUP, it keeps writing to the unlinked log inode until the disk space is consumed or ceph-mds dies.
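To make the failure mode concrete, here is a minimal sketch (not Ceph code) of a process that keeps its log descriptor open across a rename and unlink, as happens when the SIGHUP never arrives. The function name `links_after_rotation` and the paths passed to it are hypothetical; the gzip step is omitted because it does not affect the open descriptor.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Simulates the rotation sequence against a writer that never reopens
 * its log. Returns the link count of the inode the writer still holds,
 * or -1 on error. */
int links_after_rotation(const char *log_path, const char *rotated_path)
{
    /* The "daemon" opens its log once at startup. */
    int fd = open(log_path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    /* logrotate: rename the live log, then unlink the renamed copy. */
    if (rename(log_path, rotated_path) != 0 || unlink(rotated_path) != 0) {
        close(fd);
        return -1;
    }

    /* No SIGHUP was delivered, so the old descriptor is still in use.
     * The write succeeds even though no path names this inode anymore. */
    const char msg[] = "still logging\n";
    if (write(fd, msg, sizeof msg - 1) != (ssize_t)(sizeof msg - 1)) {
        close(fd);
        return -1;
    }

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    return rc == 0 ? (int)st.st_nlink : -1;
}
```

On a typical Linux filesystem the returned link count is 0: the data still occupies disk space, but it is only released once the descriptor is closed.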
This problem was introduced in 66a122025f6cf023cf7b2f3d8fbe4964fb7568a7 and discussed on ceph-devel here: https://www.spinics.net/lists/ceph-devel/msg33780.html
I'm going to suggest the simple (but very slightly racy) solution of calling prctl on startup to unconditionally change the name of the ceph-mds daemon to "ceph-mds". Thoughts?
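A rough sketch of what the prctl call could look like, assuming Linux and `PR_SET_NAME`; this is an illustration, not the patch that was actually merged. The helper names `set_comm_name` and `get_comm_name` are hypothetical. The race is presumably the short window between the exec and the prctl call, during which the name is still "exe".

```c
#include <sys/prctl.h>

/* Sets the calling thread's comm name. When called from the main thread,
 * this is the process name that appears in /proc/<pid>/comm.
 * Names longer than 15 bytes are silently truncated by the kernel.
 * Returns 0 on success, -1 on failure. */
int set_comm_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}

/* Reads the current comm name into buf, which must hold at least
 * 16 bytes (15 characters plus the terminating NUL). */
int get_comm_name(char buf[16])
{
    return prctl(PR_GET_NAME, (unsigned long)buf, 0, 0, 0);
}
```

Calling `set_comm_name("ceph-mds")` early in startup would restore the name that the logrotate script signals by, regardless of how the process was exec'd.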
Updated by Patrick Donnelly about 7 years ago
- Assignee set to Patrick Donnelly
- Priority changed from Normal to Urgent
- Source set to Development
- Component(FS) MDS added
Updated by Patrick Donnelly about 7 years ago
Sorry, hit submit by accident before finishing writing this up. Stand by!
Updated by Patrick Donnelly about 7 years ago
- Status changed from New to Fix Under Review
Updated by John Spray about 7 years ago
- Status changed from Fix Under Review to Resolved
Updated by Patrick Donnelly about 7 years ago
- Status changed from Resolved to Pending Backport
The bug is also in jewel 10.2.6, due to 6efad699249ba7c6928193dba111dbb23b606beb.
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #19466: jewel: mds: log rotation doesn't work if mds has respawned added
Updated by Nathan Cutler over 6 years ago
- Status changed from Pending Backport to Resolved