Bug #1613

closed

mon crash

Added by Hong Cho over 12 years ago. Updated over 12 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%


Description

Hi,

I am currently running ceph 0.36 with kernel 3.0.0 on two machines, each with four 2 TB HDDs. All of those HDDs are being used as OSDs, and both machines are running a MON and an MDS. This evening one machine showed a kernel OOPS and died. After a reboot I am no longer able to mount the file system; it appears that the monitor has crashed, so I am attaching the log from the monitor.

Is there something I can try to get the filesystem back, or does this indicate that I will need to start a new one?

2011-10-11 20:26:00.445263 7f37928a2720 ceph version 0.36 (877cacbbdbb78d120b431cb762f57332d62ef0a6), process ceph-mon, pid 4020
2011-10-11 20:26:00.445462 7f37928a2720 store(/mnt/mon0) mount
2011-10-11 20:26:00.446112 7f37928a2720 -- 192.168.1.10:6789/0 accepter.bind ms_addr is 192.168.1.10:6789/0 need_addr=0
2011-10-11 20:26:00.448053 7f37928a2720 -- 192.168.1.10:6789/0 messenger.start
2011-10-11 20:26:00.448118 7f37928a2720 -- 192.168.1.10:6789/0 accepter.start
2011-10-11 20:26:00.448336 7f37928a2720 mon.0@0(starting) e1 init fsid 509a0e09-3cbf-33f3-fb42-7bafa0bd4401
2011-10-11 20:26:00.449375 7f37928a2720 log [INF] : mon.0 calling new monitor election
2011-10-11 20:26:00.449487 7f37928a2720 -- 192.168.1.10:6789/0 --> mon0 192.168.1.10:6789/0 -- log(1 entries) v1 -- ?+0 0x1b0d000
2011-10-11 20:26:00.449642 7f378ebc7700 -- 192.168.1.10:6789/0 <== mon0 192.168.1.10:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x1b0d000 con 0x1b17000
2011-10-11 20:26:00.501154 7f37928a2720 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- election(propose 1) v1 -- ?+0 0x1b0d400
2011-10-11 20:26:00.501494 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd4 192.168.1.11:6801/3535 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (1205707267 0 0) 0x1b0d600 con 0x1b17c80
2011-10-11 20:26:00.502604 7f378ebc7700 -- 192.168.1.10:6789/0 <== mds? 192.168.1.11:6800/3495 1 ==== auth(proto 0 29 bytes) v1 ==== 55+0+0 (1850031845 0 0) 0x1b0de00 con 0x1b253c0

2011-10-11 20:26:00.609344 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd1 192.168.1.10:6804/1748 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (1831646840 0 0) 0x1b27000 con 0x1b25140
2011-10-11 20:26:00.783236 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd6 192.168.1.11:6807/3638 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (2857138274 0 0) 0x1b0d400 con 0x1b25a00
2011-10-11 20:26:00.837565 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd0 192.168.1.10:6801/1654 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (2568118320 0 0) 0x1b27800 con 0x1b25780
2011-10-11 20:26:01.229308 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd5 192.168.1.11:6804/3587 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (3018028107 0 0) 0x1b27a00 con 0x1b243c0
2011-10-11 20:26:01.334853 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd3 192.168.1.10:6810/1991 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (2159897625 0 0) 0x1b2b000 con 0x1b24140
2011-10-11 20:26:01.356515 7f378ebc7700 -- 192.168.1.10:6789/0 <== mon1 192.168.1.11:6789/0 1 ==== election(propose 23) v1 ==== 340+0+0 (3872700071 0 0) 0x1b2b200 con 0x1b17a00
2011-10-11 20:26:01.840179 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- election(propose 23) v1 -- ?+0 0x1b2ba00
2011-10-11 20:26:01.840293 7f378ebc7700 -- 192.168.1.10:6789/0 <== mds? 192.168.1.10:6800/1544 1 ==== auth(proto 0 29 bytes) v1 ==== 55+0+0 (2011837132 0 0) 0x1b2be00 con 0x1b24dc0
2011-10-11 20:26:01.840357 7f378ebc7700 -- 192.168.1.10:6789/0 <== osd2 192.168.1.10:6807/1890 1 ==== auth(proto 0 26 bytes) v1 ==== 52+0+0 (1954789969 0 0) 0x1b2d000 con 0x1b24b40
2011-10-11 20:26:01.841289 7f378ebc7700 -- 192.168.1.10:6789/0 <== mon1 192.168.1.11:6789/0 2 ==== election(ack 23) v1 ==== 340+0+0 (176343387 0 0) 0x1b2ba00 con 0x1b17a00
2011-10-11 20:26:01.857175 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- election(victory 24) v1 -- ?+0 0x1b2b600
2011-10-11 20:26:01.857260 7f378ebc7700 log [INF] : mon.0@0 won leader election with quorum 0,1
2011-10-11 20:26:01.857326 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon0 192.168.1.10:6789/0 -- log(1 entries) v1 -- ?+0 0x1b2b400
2011-10-11 20:26:01.870530 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(pgmap collect lc 14655 fc 14655 pn 23900 opn 0) v1 -- ?+0 0x1b28240
2011-10-11 20:26:01.885007 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(mdsmap collect lc 7082 fc 0 pn 23900 opn 0) v1 -- ?+0 0x1b286c0
2011-10-11 20:26:01.906375 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(osdmap collect lc 0 fc 0 pn 23900 opn 0) v1 -- ?+0 0x1b28480
2011-10-11 20:26:01.918880 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(logm collect lc 0 fc 14655 pn 23900 opn 0) v1 -- ?+0 0x1b28d80
2011-10-11 20:26:01.930989 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(monmap collect lc 1 fc 0 pn 23900 opn 0) v1 -- ?+0 0x1b28b40
2011-10-11 20:26:01.941242 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(auth collect lc 228 fc 0 pn 1600 opn 0) v1 -- ?+0 0x1b28900
2011-10-11 20:26:01.941382 7f378ebc7700 -- 192.168.1.10:6789/0 <== mon0 192.168.1.10:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x1b2b400 con 0x1b17000
2011-10-11 20:26:02.075082 7f378ebc7700 -- 192.168.1.10:6789/0 <== mon1 192.168.1.11:6789/0 3 ==== paxos(pgmap last lc 14655 fc 14655 pn 23900 opn 0) v1 ==== 84+0+0 (1586122263 0 0) 0x1b28900 con 0x1b17a00
2011-10-11 20:26:02.075145 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(pgmap lease lc 14655 fc 14655 pn 0 opn 0) v1 -- ?+0 0x1b28240
2011-10-11 20:26:02.103460 7f378ebc7700 -- 192.168.1.10:6789/0 <== mon1 192.168.1.11:6789/0 4 ==== paxos(mdsmap last lc 232 fc 0 pn 23900 opn 0) v1 ==== 84+0+0 (2474334558 0 0) 0x1b28240 con 0x1b17a00
2011-10-11 20:26:02.131162 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(mdsmap commit lc 7082 fc 0 pn 0 opn 0) v1 -- ?+0 0x1b28900
2011-10-11 20:26:02.131261 7f378ebc7700 -- 192.168.1.10:6789/0 --> mon1 192.168.1.11:6789/0 -- paxos(mdsmap lease lc 7082 fc 0 pn 0 opn 0) v1 -- ?+0 0x1b286c0
    *** Caught signal (Aborted) **
    in thread 0x7f378ebc7700
    ceph version 0.36 (877cacbbdbb78d120b431cb762f57332d62ef0a6)
    1: /usr/bin/ceph-mon() [0x5b0ca4]
    2: (()+0xf020) [0x7f3792484020]
    3: (gsignal()+0x35) [0x7f3790e163d5]
    4: (abort()+0x180) [0x7f3790e19650]
    5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f37916bde8d]
    6: (()+0xbd036) [0x7f37916bc036]
    7: (()+0xbd063) [0x7f37916bc063]
    8: (()+0xbd15e) [0x7f37916bc15e]
    9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x127) [0x585327]
    10: (MDSMap::decode(ceph::buffer::list::iterator&)+0x2c) [0x4c624c]
    11: (MDSMonitor::update_from_paxos()+0x230) [0x4bd430]
    12: (PaxosService::_active()+0x39) [0x48df09]
    13: (Context::complete(int)+0xa) [0x479a0a]
    14: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xca) [0x48a86a]
    15: (Paxos::handle_last(MMonPaxos*)+0x59b) [0x4891fb]
    16: (Paxos::dispatch(PaxosServiceMessage*)+0x24b) [0x4897fb]
    17: (Monitor::_ms_dispatch(Message*)+0x82e) [0x4765ae]
    18: (Monitor::ms_dispatch(Message*)+0x69) [0x480839]
    19: (SimpleMessenger::dispatch_entry()+0x7bb) [0x57359b]
    20: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x46700c]
    21: (()+0x6b40) [0x7f379247bb40]
    22: (clone()+0x6d) [0x7f3790eb928d]

Regards,
Hong

Actions #1

Updated by Greg Farnum over 12 years ago

Did you record the OOPS somewhere? It looks as though the monitor is pulling bad data off disk. You should be able to recover by deleting that monitor's data directory and rsyncing the other monitor's over to it.

Actions #2

Updated by Hong Cho over 12 years ago

Unfortunately I didn't get a chance to record the OOPS; I'll try to capture it next time. In the syslog I found this around the time it hung, but it seems to have happened about 7 minutes before the actual OOPS, and syslog doesn't seem to have the OOPS itself.

Where do I find the monitor's data directory?

Oct 11 18:58:41 MDS1 kernel: [424907.967293] ceph: ceph_add_cap: couldn't find snap realm 100
Oct 11 18:58:41 MDS1 kernel: [424907.971118] ------------[ cut here ]------------
Oct 11 18:58:41 MDS1 kernel: [424907.971153] WARNING: at /build/buildd-linux-2.6_3.0.0-3-amd64-9ClimQ/linux-2.6-3.0.0/debian/build/source_amd64_none/fs/ceph/caps.c:591 ceph_add_cap+0x369/0x47d [ceph]()
Oct 11 18:58:41 MDS1 kernel: [424907.971165] Hardware name: D525TUD
Oct 11 18:58:41 MDS1 kernel: [424907.971170] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph libceph fuse bonding it87 hwmon_vid coretemp loop snd_hda_codec_realtek snd_hda_intel snd_hda_codec i915 snd_hwdep drm_kms_helper snd_pcm drm snd_timer snd pcspkr evdev i2c_i801 serio_raw soundcore processor parport_pc parport i2c_algo_bit i2c_core video button snd_page_alloc thermal_sys ext2 mbcache btrfs zlib_deflate crc32c libcrc32c sg sd_mod crc_t10dif usb_storage ata_generic uas pata_jmicron uhci_hcd ahci libahci libata ehci_hcd r8169 scsi_mod mii usbcore [last unloaded: scsi_wait_scan]
Oct 11 18:58:41 MDS1 kernel: [424907.971297] Pid: 8685, comm: kworker/1:1 Not tainted 3.0.0-1-amd64 #1
Oct 11 18:58:41 MDS1 kernel: [424907.971304] Call Trace:
Oct 11 18:58:41 MDS1 kernel: [424907.971321] [<ffffffff81046521>] ? warn_slowpath_common+0x78/0x8c
Oct 11 18:58:41 MDS1 kernel: [424907.971347] [<ffffffffa038de33>] ? ceph_add_cap+0x369/0x47d [ceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971360] [<ffffffff8103840a>] ? should_resched+0x5/0x24
Oct 11 18:58:41 MDS1 kernel: [424907.971382] [<ffffffffa03862b5>] ? fill_inode.clone.12+0x4ff/0x5a5 [ceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971396] [<ffffffff8110b417>] ? __d_instantiate_unique+0xa7/0xb9
Oct 11 18:58:41 MDS1 kernel: [424907.971419] [<ffffffffa038661b>] ? ceph_readdir_prepopulate+0x2c0/0x359 [ceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971433] [<ffffffff8103840a>] ? should_resched+0x5/0x24
Oct 11 18:58:41 MDS1 kernel: [424907.971460] [<ffffffffa0396cbe>] ? dispatch+0x9b9/0xe43 [ceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971507] [<ffffffffa03613cd>] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971530] [<ffffffffa0362993>] ? con_work+0xf2b/0x1d65 [libceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971542] [<ffffffff81008032>] ? load_TLS+0x7/0xa
Oct 11 18:58:41 MDS1 kernel: [424907.971552] [<ffffffff81008626>] ? __switch_to+0xc9/0x1fb
Oct 11 18:58:41 MDS1 kernel: [424907.971565] [<ffffffff810383fc>] ? need_resched+0x1a/0x23
Oct 11 18:58:41 MDS1 kernel: [424907.971577] [<ffffffff813356f9>] ? schedule+0x5e5/0x5fc
Oct 11 18:58:41 MDS1 kernel: [424907.971598] [<ffffffffa0361a68>] ? read_partial_message_section.clone.9+0x74/0x74 [libceph]
Oct 11 18:58:41 MDS1 kernel: [424907.971613] [<ffffffff8105b943>] ? process_one_work+0x193/0x28f
Oct 11 18:58:41 MDS1 kernel: [424907.971624] [<ffffffff8105cacf>] ? worker_thread+0xef/0x172
Oct 11 18:58:41 MDS1 kernel: [424907.971635] [<ffffffff8105c9e0>] ? manage_workers.clone.17+0x15b/0x15b
Oct 11 18:58:41 MDS1 kernel: [424907.971646] [<ffffffff8105c9e0>] ? manage_workers.clone.17+0x15b/0x15b
Oct 11 18:58:41 MDS1 kernel: [424907.971658] [<ffffffff8105fc0b>] ? kthread+0x7a/0x82
Oct 11 18:58:41 MDS1 kernel: [424907.971671] [<ffffffff8133ce24>] ? kernel_thread_helper+0x4/0x10
Oct 11 18:58:41 MDS1 kernel: [424907.971684] [<ffffffff8105fb91>] ? kthread_worker_fn+0x149/0x149
Oct 11 18:58:41 MDS1 kernel: [424907.971695] [<ffffffff8133ce20>] ? gs_change+0x13/0x13

Actions #3

Updated by Greg Farnum over 12 years ago

The mon data dir is specified in your ceph.conf.
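
For reference, the monitor data directory is set by the `mon data` option. A minimal sketch of the relevant ceph.conf sections -- the `/mnt/mon0` path matches the log above, but the hostnames and the mon.1 path are assumptions to be matched to your own setup:

```ini
[mon]
    mon data = /mnt/mon$id

[mon.0]
    host = MDS1
    mon addr = 192.168.1.10:6789

[mon.1]
    host = MDS2
    mon addr = 192.168.1.11:6789
```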

This backtrace, though, makes it look like you're running the kernel client on your OSD hosts. You don't want to do that, as you can run into serious memory issues -- if the client needs to be on the same machine, use the fuse client. :)

Actions #4

Updated by Hong Cho over 12 years ago

Excellent. I got it running again by using the second monitor's data on the first monitor.

Yes, I am running the kernel client on an OSD host. From what I've read so far, it seemed like the memory issue only happens when the OS runs out of memory to use. So far I didn't think I was using very much memory yet... but I could be wrong. =) I'll look into using the fuse client. Thanks, Greg.

Actions #5

Updated by Greg Farnum over 12 years ago

  • Status changed from New to Resolved