Bug #4815
closed
mon: leveldb grows quickly and without bound
Added by Mike Dawson about 11 years ago.
Updated about 11 years ago.
Description
My mon.a process went away without a core dump or any indication in the ceph-mon log or syslog of what happened. mon.b and mon.c quickly held an election and continued operating properly. mon.a was down for several hours. Upon starting the mon.a process, it would not achieve quorum. This is ceph version 0.60-641-gc7a0477 (c7a0477bad6bfbec4ef325295ca0489ec1977926) from gitbuilder next.
Log with debug mon = 20, debug paxos = 20, and debug ms = 20 is attached.
Actually, I meant dmesg instead of syslog above. Looking at the syslog, ceph-mon was killed by oom-killer:
Killed process 6956 (ceph-mon) total-vm:65374500kB, anon-rss:46681848kB, file-rss:0kB
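For scale, the kB figures in that oom-killer line convert to roughly 62 GB of virtual address space and 44 GB resident. A quick sketch of the conversion (the quoted line is copied verbatim from the kill message above):

```shell
# Convert the kB figures from the oom-killer line above into GB.
line='Killed process 6956 (ceph-mon) total-vm:65374500kB, anon-rss:46681848kB, file-rss:0kB'
total_vm_kb=$(echo "$line" | grep -o 'total-vm:[0-9]*' | cut -d: -f2)
anon_rss_kb=$(echo "$line" | grep -o 'anon-rss:[0-9]*' | cut -d: -f2)
echo "total-vm: $((total_vm_kb / 1024 / 1024)) GB"   # ~62 GB
echo "anon-rss: $((anon_rss_kb / 1024 / 1024)) GB"   # ~44 GB
```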
Looks like imjustmatthew on irc is seeing high memory and cpu with ceph-mon processes from gitbuilder next as well. Could that be related to ceph-mon.a's inability to rejoin quorum? For reference top shows:
mon.a: 19286 root 20 0 10.2g 160m 10m S 0 0.3 0:01.09 ceph-mon
mon.b: 13506 root 20 0 36.6g 20g 17m S 12 42.7 26:27.57 ceph-mon
mon.c: 12255 root 20 0 17.2g 211m 53m S 0 0.4 32:34.56 ceph-mon
while mon.a is stuck out of quorum and mon.b + mon.c are in the quorum.
- Status changed from New to Need More Info
It's not entirely clear what's going on here with just the messenger logging. If you can get monitor logging (and from the other monitors) maybe we can tell.
What I can see is that for about 3 seconds after startup, the monitor is successfully syncing. But then the pipes it's receiving sync data on start faulting and it never makes any more progress; it just keeps receiving auth messages and then faulting on the pipe.
- Assignee set to Sage Weil
- Priority changed from High to Urgent
Can you reproduce with the latest next, capture the mon.a log, and also attach to the process after it stops making progress (when the lines with ==== stop appearing) and run 'thread apply all bt' so we can see what it is blocked on?
thanks!
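For anyone following along, the ask above amounts to something like the following (a hedged sketch; 12345 is a placeholder for the real ceph-mon pid, e.g. from pidof ceph-mon):

```shell
# Attach to the stuck monitor once the ==== lines stop appearing and dump
# every thread's backtrace non-interactively.
MON_PID=12345   # placeholder: substitute the actual ceph-mon pid
GDB_CMD="gdb -p $MON_PID -batch -ex 'thread apply all bt'"
echo "$GDB_CMD"
```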
New logs have been uploaded to cephdrop as "mikedawson/ceph-mon.*.log". They show the three monitors starting up. mon.a didn't sync, mon.b was the leader, and mon.c was a peon. Then I started the OSDs; at that point, mon.b went to probing. This is completely reproducible with the current state of my cluster, and the same thing has happened with the last three or four "next" builds I have tried.
After those logs were uploaded, I stopped all daemons, then started the mons only. The backtrace from mon.a is attached.
mon.a is getting stuck in leveldb:
Thread 7 (Thread 0x7f3a62b35700 (LWP 12640)):
#0 0x00007f3ac4555ca4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f3ac35c373d in leveldb::port::CondVar::Wait() () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#2 0x00007f3ac359d77c in leveldb::DBImpl::MakeRoomForWrite(bool) () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#3 0x00007f3ac35a0b65 in leveldb::DBImpl::Write(leveldb::WriteOptions const&, leveldb::WriteBatch*) () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#4 0x0000000000588a29 in LevelDBStore::submit_transaction_sync(std::tr1::shared_ptr<KeyValueDB::TransactionImpl>) ()
#5 0x00000000004a679a in MonitorDBStore::apply_transaction(MonitorDBStore::Transaction&) ()
#6 0x00000000004b8909 in Monitor::handle_sync_chunk(MMonSync*) ()
#7 0x00000000004caebb in Monitor::handle_sync(MMonSync*) ()
#8 0x00000000004d4b6b in Monitor::_ms_dispatch(Message*) ()
#9 0x00000000004eec82 in Monitor::ms_dispatch(Message*) ()
#10 0x00000000006ab9c3 in DispatchQueue::entry() ()
#11 0x000000000063ff8d in DispatchQueue::DispatchThread::entry() ()
#12 0x00007f3ac4551f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007f3ac2a9be1d in clone () from /lib/x86_64-linux-gnu/libc.so.6
It's not using CPU, but the store is huge: 19 GB. (MakeRoomForWrite is the point where leveldb stalls incoming writes while it waits for background compaction to catch up, which fits a store growing faster than it can compact.)
(06:01:31 PM) mikedawson: 6017 root 20 0 9842m 129m 10m S 0 0.3 0:00.94 ceph-mon
(06:01:33 PM) mikedawson: 6054 root 20 0 36772 7188 2356 S 0 0.0 0:00.02 ceph-create-key
(06:03:24 PM) mikedawson: sagelap: not sure how big leveldb should get, but...
(06:03:26 PM) mikedawson: 19G /var/lib/ceph/mon/ceph-a/store.db
(06:03:51 PM) sagelap: oof
The other mons are at 36 GB, so mon.a isn't done syncing yet, but it's stuck.
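For readers hitting this later: the fix branch referenced below added store compaction to the monitor. In releases that carry it, compaction at startup can be requested with a ceph.conf option along these lines (verify the exact option name against your version's documentation):

```ini
[mon]
# Compact the monitor's leveldb store when the daemon starts
# (available in releases that include the monitor-compaction fix).
mon compact on start = true
```

Releases with the fix also accept a runtime `ceph tell mon.<id> compact` command to trigger a compaction on a running monitor.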
- Subject changed from Monitor will Not Re-join Quorum to mon: leveldb grows quickly and without bound
- Status changed from Need More Info to Fix Under Review
- Assignee changed from Sage Weil to Samuel Just
wip-mon-compact looks good
- Status changed from Fix Under Review to Resolved