Bug #3554 (closed): Logs broken after a rotation

Added by David Zafman over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is a ceph-deploy installation of ceph version 0.54 (commit:60b84b095b1009a305d4d6a5b16f88571cbd3150) on Ubuntu 12.04.

ubuntu@ceph1:/var/log/ceph$ uptime
17:28:54 up 2 days, 49 min, 1 user, load average: 3.90, 3.31, 3.37
ubuntu@ceph1:/var/log/ceph$ ps -ef | grep ceph- | grep -v grep
root 4335 1 1 Nov26 ? 00:45:52 /usr/bin/ceph-osd --cluster=ceph -i 0 -f
root 4632 1 1 Nov26 ? 00:48:23 /usr/bin/ceph-osd --cluster=ceph -i 1 -f
root 15693 1 0 Nov27 ? 00:04:07 /usr/bin/ceph-mon --cluster=ceph -i ceph1 -f
ubuntu@ceph1:/var/log/ceph$ ls -l
total 7288
-rw------- 1 root root 6705359 Nov 28 17:26 ceph.log
-rw-r--r-- 1 root root 0 Nov 28 06:39 ceph-mon.ceph1.log
-rw-r--r-- 1 root root 0 Nov 28 06:39 ceph-osd.0.log
-rw-r--r-- 1 root root 0 Nov 28 06:39 ceph-osd.1.log
-rw-r--r-- 1 root root 647295 Nov 27 23:37 ceph-mon.ceph1.log.1.gz
-rw------- 1 root root 58496 Nov 27 23:16 ceph.log.1.gz
-rw-r--r-- 1 root root 14885 Nov 27 20:09 ceph-osd.1.log.1.gz
-rw-r--r-- 1 root root 17024 Nov 27 20:09 ceph-osd.0.log.1.gz
ubuntu@ceph1:/var/log/ceph$ sudo ls -l /proc/4335/fd | grep /var/log
l-wx------ 1 root root 64 Nov 28 17:15 6 -> /var/log/ceph/ceph-osd.0.log.1 (deleted)
ubuntu@ceph1:/var/log/ceph$ sudo ls -l /proc/4632/fd | grep /var/log
l-wx------ 1 root root 64 Nov 28 17:15 6 -> /var/log/ceph/ceph-osd.1.log.1 (deleted)
ubuntu@ceph1:/var/log/ceph$ sudo ls -l /proc/15693/fd | grep /var/log
l-wx------ 1 root root 64 Nov 28 17:16 5 -> /var/log/ceph/ceph-mon.ceph1.log.1 (deleted)

We can see that all of the daemons are still logging to their deleted log files, most likely because the original files were rotated (gzipped and removed) out from under them without the daemons being told to reopen their logs.
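
A minimal recovery sketch, assuming the daemons reopen their log files on SIGHUP (which is what the reload path is supposed to trigger). The PIDs are the ones from the session above; substitute your own:

for pid in 4335 4632 15693; do
    # confirm the daemon is still writing to a deleted log file
    sudo ls -l /proc/$pid/fd | grep '(deleted)'
    # ask it to reopen its logs
    sudo kill -HUP $pid
done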

Actions #1

Updated by Sage Weil over 11 years ago

  • Description updated (diff)

It sounds like /etc/logrotate.d/ceph isn't telling upstart to reload the logs, or that bit of the upstart configs is broken... probably the latter; I haven't tested this myself.
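
For anyone reproducing this, two quick checks help narrow down which side is broken (purely illustrative commands; nothing beyond the packaged file names is assumed):

# is the cluster actually managed by upstart jobs?
sudo initctl list | grep ceph
# what does the rotation hook really run?
grep -A 10 postrotate /etc/logrotate.d/ceph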

Actions #2

Updated by Dan Mick over 11 years ago

  • Description updated (diff)
Actions #3

Updated by Anonymous over 11 years ago

Hey Dan,

I just ran into this issue on the 3-node VM setup on my desktop...

I had noticed that mon.a was running but mon.b and mon.c were not; when I tried to start them and they failed, that is when I checked disk space...

Any other data you want? I will leave my systems in this state (I just did a "service ceph stop" so I would not cause further damage)...

ceph version 0.48.2argonaut (commit:3e02b2fad88c2a95d9c0c86878f10d1beb780bfe)

CEPH1
=====================================
root@ceph1:/var/log/ceph# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 2074896 2074896 0 100% /
udev 247712 12 247700 1% /dev
tmpfs 100696 236 100460 1% /run
none 5120 0 5120 0% /run/lock
none 251736 0 251736 0% /run/shm
/dev/sdc1 10484736 1131984 8280008 13% /var/lib/ceph/osd/ceph-0
/dev/sdd1 10484736 1287824 8128448 14% /var/lib/ceph/osd/ceph-1
/dev/sde 10485760 1029468 7334828 13% /var/lib/ceph/osd/ceph-6
=====================================
root@ceph1:/var/log/ceph# ls -al
total 1082264
drwxr-xr-x 2 root root 4096 Nov 28 10:23 .
drwxr-xr-x 9 root root 4096 Nov 28 10:15 ..
-rw------- 1 root root 286907 Nov 28 17:38 ceph.log
-rw------- 1 root root 48160 Nov 14 22:21 ceph.log.1.gz
-rw------- 1 root root 43084 Nov 13 22:35 ceph.log.2.gz
-rw-r--r-- 1 root root 3497984 Nov 28 21:36 ceph-mds.a.log
-rw-r--r-- 1 root root 6898 Nov 14 22:25 ceph-mds.a.log.1.gz
-rw-r--r-- 1 root root 209318 Nov 13 22:37 ceph-mds.a.log.2.gz
-rw-r--r-- 1 root root 49668096 Nov 28 21:36 ceph-mon.a.log
-rw-r--r-- 1 root root 15420 Nov 14 22:21 ceph-mon.a.log.1.gz
-rw-r--r-- 1 root root 420872 Nov 13 22:35 ceph-mon.a.log.2.gz
-rw-r--r-- 1 root root 349429760 Nov 28 21:36 ceph-osd.0.log
-rw-r--r-- 1 root root 5570 Nov 14 17:00 ceph-osd.0.log.1.gz
-rw-r--r-- 1 root root 545507 Nov 13 21:18 ceph-osd.0.log.2.gz
-rw-r--r-- 1 root root 370728960 Nov 28 21:36 ceph-osd.1.log
-rw-r--r-- 1 root root 7190 Nov 14 18:23 ceph-osd.1.log.1.gz
-rw-r--r-- 1 root root 554520 Nov 13 21:38 ceph-osd.1.log.2.gz
-rw-r--r-- 1 root root 332541952 Nov 28 21:36 ceph-osd.6.log
-rw-r--r-- 1 root root 145556 Nov 28 10:16 radosgw.log
=====================================
root@ceph1:/var/log/ceph# du -sh *
288K ceph.log
48K ceph.log.1.gz
44K ceph.log.2.gz
3.4M ceph-mds.a.log
8.0K ceph-mds.a.log.1.gz
208K ceph-mds.a.log.2.gz
48M ceph-mon.a.log
16K ceph-mon.a.log.1.gz
412K ceph-mon.a.log.2.gz
334M ceph-osd.0.log
8.0K ceph-osd.0.log.1.gz
536K ceph-osd.0.log.2.gz
354M ceph-osd.1.log
8.0K ceph-osd.1.log.1.gz
544K ceph-osd.1.log.2.gz
318M ceph-osd.6.log
148K radosgw.log

=====================================

CEPH2
=====================================
dbarba@ceph2:/var/log/ceph$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 2.0G 2.0G 0 100% /
udev 242M 12K 242M 1% /dev
tmpfs 99M 228K 99M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 246M 0 246M 0% /run/shm
/dev/sdc1 10G 1.2G 7.8G 14% /var/lib/ceph/osd/ceph-2
/dev/sdd1 10G 1.3G 7.8G 15% /var/lib/ceph/osd/ceph-3
overflow 1.0M 0 1.0M 0% /tmp
=====================================
dbarba@ceph2:~$ ls -al /var/log/ceph/
total 776404
drwxr-xr-x 3 root root 4096 Nov 15 12:30 .
drwxr-xr-x 8 root root 4096 Nov 28 17:21 ..
-rw------- 1 root root 326388 Nov 28 15:57 ceph.log
-rw------- 1 root root 74896 Nov 14 22:51 ceph.log.1.gz
-rw-r--r-- 1 root root 51142656 Nov 28 21:23 ceph-mon.b.log
-rw-r--r-- 1 root root 1269961 Nov 14 22:52 ceph-mon.b.log.1.gz
-rw-r--r-- 1 root root 360701952 Nov 28 21:28 ceph-osd.2.log
-rw-r--r-- 1 root root 9528 Nov 14 17:00 ceph-osd.2.log.1.gz
-rw-r--r-- 1 root root 381444096 Nov 28 21:28 ceph-osd.3.log
-rw-r--r-- 1 root root 9822 Nov 14 21:18 ceph-osd.3.log.1.gz
drwxr-xr-x 2 root root 4096 Nov 13 11:26 old
-rw-r--r-- 1 root root 0 Nov 15 12:30 radosgw.log
=====================================
dbarba@ceph2:/var/log/ceph$ du -sh *
324K ceph.log
76K ceph.log.1.gz
49M ceph-mon.b.log
1.3M ceph-mon.b.log.1.gz
345M ceph-osd.2.log
12K ceph-osd.2.log.1.gz
364M ceph-osd.3.log
12K ceph-osd.3.log.1.gz
3.5M old
0 radosgw.log

CEPH3
=====================================
root@ceph3:/var/log/ceph# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 2.0G 2.0G 0 100% /
udev 242M 12K 242M 1% /dev
tmpfs 99M 236K 99M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 246M 0 246M 0% /run/shm
/dev/sdc1 10G 1.3G 7.8G 14% /var/lib/ceph/osd/ceph-4
/dev/sdd1 10G 1.3G 7.8G 14% /var/lib/ceph/osd/ceph-5
overflow 1.0M 0 1.0M 0% /tmp
=====================================
root@ceph3:/var/log/ceph# ls -al
total 784296
drwxr-xr-x 2 root root 4096 Nov 28 10:38 .
drwxr-xr-x 8 root root 4096 Nov 28 17:28 ..
-rw------- 1 root root 360381 Nov 28 17:32 ceph.log
-rw------- 1 root root 74882 Nov 14 22:36 ceph.log.1.gz
-rw-r--r-- 1 root root 83869696 Nov 28 21:23 ceph-mon.c.log
-rw-r--r-- 1 root root 1253170 Nov 14 22:42 ceph-mon.c.log.1.gz
-rw-r--r-- 1 root root 2545 Nov 28 10:38 ceph-osd.2.log
-rw-r--r-- 1 root root 626696192 Nov 28 21:29 ceph-osd.4.log
-rw-r--r-- 1 root root 16300 Nov 14 22:36 ceph-osd.4.log.1.gz
-rw-r--r-- 1 root root 90787840 Nov 28 21:29 ceph-osd.5.log
-rw-r--r-- 1 root root 10706 Nov 14 17:00 ceph-osd.5.log.1.gz
-rw-r--r-- 1 root root 0 Nov 15 09:24 radosgw.log
=====================================
root@ceph3:/var/log/ceph# du -sh *
356K ceph.log
76K ceph.log.1.gz
80M ceph-mon.c.log
1.2M ceph-mon.c.log.1.gz
4.0K ceph-osd.2.log
598M ceph-osd.4.log
16K ceph-osd.4.log.1.gz
87M ceph-osd.5.log
12K ceph-osd.5.log.1.gz
0 radosgw.log
=====================================
Actions #4

Updated by Sage Weil over 11 years ago

Is this upstart (ceph-deploy) or mkcephfs (initrd)?

Actions #5

Updated by David Zafman over 11 years ago

/etc/logrotate.d/ceph does an "invoke-rc.d ceph reload", which runs /etc/init.d/ceph, which in turn runs /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -l {mon,osd,mds} once for each possible service type. On a ceph-deploy installed cluster ceph.conf is very minimal, so that script doesn't find any daemon sections there and the reload never reaches the running daemons. This could be fixed either in the way ceph-deploy sets up ceph.conf, or by having /etc/logrotate.d/ceph notice that initctl is present and use that mechanism instead; a sketch of the latter follows below.
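
As a rough illustration of that second option, the postrotate hook could check for initctl and take a different path on upstart systems. This is only a sketch of the idea, not the change that was merged for this ticket, and the upstart branch swaps in a direct SIGHUP via killall (assuming the daemons reopen their logs on SIGHUP) rather than enumerating upstart job instances:

postrotate
    # Sketch only -- not the merged fix for this ticket.
    if which initctl > /dev/null 2>&1; then
        # upstart-managed cluster: SIGHUP the daemons directly so they
        # reopen their log files
        killall -q -1 ceph-mon ceph-osd ceph-mds || true
    else
        # sysvinit cluster: keep the existing reload path
        invoke-rc.d ceph reload > /dev/null || true
    fi
endscript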

Actions #6

Updated by David Zafman over 11 years ago

  • Status changed from New to Fix Under Review
Actions #7

Updated by Dan Mick over 11 years ago

  • Status changed from Fix Under Review to Resolved

commit:dc93132d8cc1edf83b7e0a603b366cde6750f4d6
