Project

General

Profile

Bug #2462 ยป osd.9.log

Eric Dold, 05/23/2012 01:50 AM

 
2012-05-23 06:11:09.399242 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:09.399229) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:10.499489 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:10.499477) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:13.999732 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:13.999713) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:15.700010 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:15.699997) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:19.801189 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:19.801176) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:23.301438 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:23.301427) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:23.536077 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:03.536076)
2012-05-23 06:11:24.401876 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:24.401864) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:24.401929 7f18f0f07700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:04.401929)
2012-05-23 06:11:24.536220 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:04.536219)
2012-05-23 06:11:25.536348 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:05.536347)
2012-05-23 06:11:26.536468 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:06.536467)
2012-05-23 06:11:27.536603 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:07.536602)
2012-05-23 06:11:27.902171 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:27.902161) v2 because of no pipe on con 0xf452c0
2012-05-23 06:11:27.902220 7f18f0f07700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:07.902219)
2012-05-23 06:11:28.536746 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:08.536745)
2012-05-23 06:16:35.131920 7f18f6813700 0 log [INF] : 0.b restarting backfill on osd.0 from (0'0,0'0] MAX to 475'4216
2012-05-23 06:16:35.132785 7f18f6813700 0 log [INF] : 0.7 restarting backfill on osd.1 from (0'0,0'0] MAX to 474'4141
2012-05-23 06:16:37.080317 7f18f6012700 -1 osd/PG.cc: In function 'void PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)' thread 7f18f6012700 time 2012-05-23 06:16:36.911866
osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)

ceph version 0.47.1 (commit:f5a9404445e2ed5ec2ee828aa53d73d4a002f7a5)
1: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1987) [0x614437]
2: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x14d) [0x61d9ad]
3: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x11e) [0x64f91e]
4: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x64450b]
5: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x644631]
6: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x19e) [0x61cdce]
7: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x5e7) [0x5dafa7]
8: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x220) [0x5ddda0]
9: (OSD::_dispatch(Message*)+0x191) [0x5e40b1]
10: (OSD::ms_dispatch(Message*)+0x153) [0x5e4a03]
11: (SimpleMessenger::dispatch_entry()+0x85b) [0x783c4b]
12: (SimpleMessenger::DispatchThread::entry()+0xd) [0x74b52d]
13: (()+0x8ec6) [0x7f190367fec6]
14: (clone()+0x6d) [0x7f190232c51d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-109> 2012-05-22 23:20:44.824911 7f19043de780 0 ceph version 0.47.1 (commit:f5a9404445e2ed5ec2ee828aa53d73d4a002f7a5), process ceph-osd, pid 11133
-108> 2012-05-22 23:20:44.904544 7f19043de780 0 filestore(/data/osd.9) mount FIEMAP ioctl is supported and appears to work
-107> 2012-05-22 23:20:44.904700 7f19043de780 0 filestore(/data/osd.9) mount detected btrfs
-106> 2012-05-22 23:20:44.904714 7f19043de780 0 filestore(/data/osd.9) mount btrfs CLONE_RANGE ioctl is supported
-105> 2012-05-22 23:20:44.972278 7f19043de780 0 filestore(/data/osd.9) mount btrfs SNAP_CREATE is supported
-104> 2012-05-22 23:20:44.972634 7f19043de780 0 filestore(/data/osd.9) mount btrfs SNAP_DESTROY is supported
-103> 2012-05-22 23:20:44.972901 7f19043de780 0 filestore(/data/osd.9) mount btrfs START_SYNC is supported (transid 722)
-102> 2012-05-22 23:20:45.006537 7f19043de780 0 filestore(/data/osd.9) mount btrfs WAIT_SYNC is supported
-101> 2012-05-22 23:20:45.007184 7f19043de780 0 filestore(/data/osd.9) mount btrfs SNAP_CREATE_V2 is supported
-100> 2012-05-22 23:20:45.020084 7f19043de780 0 filestore(/data/osd.9) mount found snaps <134893,134924>
-99> 2012-05-22 23:20:45.481129 7f19043de780 0 filestore(/data/osd.9) mount: enabling PARALLEL journal mode: btrfs, SNAP_CREATE_V2 detected and 'filestore btrfs snap' mode is enabled
-98> 2012-05-22 23:20:45.517170 7f19043de780 0 journal kernel version is 3.4.0
-97> 2012-05-22 23:20:45.517244 7f19043de780 1 journal _open /dev/sda6 fd 23: 1019899904 bytes, block size 4096 bytes, directio = 1, aio = 0
-96> 2012-05-22 23:20:47.619642 7f19043de780 0 journal kernel version is 3.4.0
-95> 2012-05-22 23:20:47.619708 7f19043de780 1 journal _open /dev/sda6 fd 23: 1019899904 bytes, block size 4096 bytes, directio = 1, aio = 0
-94> 2012-05-22 23:20:47.620069 7f19043de780 1 journal close /dev/sda6
-93> 2012-05-22 23:20:47.697027 7f19043de780 0 filestore(/data/osd.9) mount FIEMAP ioctl is supported and appears to work
-92> 2012-05-22 23:20:47.697225 7f19043de780 0 filestore(/data/osd.9) mount detected btrfs
-91> 2012-05-22 23:20:47.697234 7f19043de780 0 filestore(/data/osd.9) mount btrfs CLONE_RANGE ioctl is supported
-90> 2012-05-22 23:20:47.764749 7f19043de780 0 filestore(/data/osd.9) mount btrfs SNAP_CREATE is supported
-89> 2012-05-22 23:20:47.771363 7f19043de780 0 filestore(/data/osd.9) mount btrfs SNAP_DESTROY is supported
-88> 2012-05-22 23:20:47.771605 7f19043de780 0 filestore(/data/osd.9) mount btrfs START_SYNC is supported (transid 726)
-87> 2012-05-22 23:20:47.808258 7f19043de780 0 filestore(/data/osd.9) mount btrfs WAIT_SYNC is supported
-86> 2012-05-22 23:20:47.809293 7f19043de780 0 filestore(/data/osd.9) mount btrfs SNAP_CREATE_V2 is supported
-85> 2012-05-22 23:20:47.812052 7f19043de780 0 filestore(/data/osd.9) mount found snaps <134893,134924>
-84> 2012-05-22 23:20:48.079189 7f19043de780 0 filestore(/data/osd.9) mount: enabling PARALLEL journal mode: btrfs, SNAP_CREATE_V2 detected and 'filestore btrfs snap' mode is enabled
-83> 2012-05-22 23:20:48.092758 7f19043de780 0 journal kernel version is 3.4.0
-82> 2012-05-22 23:20:48.092820 7f19043de780 1 journal _open /dev/sda6 fd 31: 1019899904 bytes, block size 4096 bytes, directio = 1, aio = 0
-81> 2012-05-22 23:20:48.499529 7f19043de780 0 journal kernel version is 3.4.0
-80> 2012-05-22 23:20:48.499583 7f19043de780 1 journal _open /dev/sda6 fd 31: 1019899904 bytes, block size 4096 bytes, directio = 1, aio = 0
-79> 2012-05-22 23:37:03.007574 7f18ef4f4700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6806/16586 pipe(0x4954500 sd=38 pgs=36 cs=1 l=0).fault with nothing to send, going to standby
-78> 2012-05-22 23:38:07.970995 7f18ef8f8700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6812/16757 pipe(0x4954000 sd=40 pgs=37 cs=1 l=0).fault with nothing to send, going to standby
-77> 2012-05-22 23:38:09.992611 7f18efafa700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0xf43900 sd=34 pgs=97 cs=1 l=0).fault with nothing to send, going to standby
-76> 2012-05-22 23:39:56.151665 7f18ef9f9700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6820/12577 pipe(0xf43680 sd=35 pgs=117 cs=1 l=0).fault with nothing to send, going to standby
-75> 2012-05-22 23:40:21.152906 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0x4954280 sd=39 pgs=57 cs=1 l=0).fault with nothing to send, going to standby
-74> 2012-05-22 23:40:21.336010 7f18eefef700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x4954780 sd=37 pgs=69 cs=1 l=0).fault with nothing to send, going to standby
-73> 2012-05-22 23:40:21.439092 7f18ef6f6700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6814/12375 pipe(0xf43400 sd=33 pgs=112 cs=1 l=0).fault with nothing to send, going to standby
-72> 2012-05-23 00:07:26.611769 7f18ed0d8700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6812/16757 pipe(0x4954000 sd=40 pgs=40 cs=3 l=0).fault with nothing to send, going to standby
-71> 2012-05-23 00:20:05.842245 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0x4954280 sd=29 pgs=62 cs=3 l=0).fault with nothing to send, going to standby
-70> 2012-05-23 00:20:05.879432 7f18ef4f4700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6806/16586 pipe(0x4954500 sd=27 pgs=40 cs=3 l=0).fault with nothing to send, going to standby
-69> 2012-05-23 00:20:05.949550 7f18ef6f6700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6814/12375 pipe(0xf43400 sd=33 pgs=117 cs=3 l=0).fault with nothing to send, going to standby
-68> 2012-05-23 00:23:44.888152 7f18ecfd7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6812/16757 pipe(0x48ab180 sd=39 pgs=0 cs=0 l=0).accept connect_seq 4 vs existing 3 state 3
-67> 2012-05-23 00:23:44.914753 7f18effff700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0x4bd5b80 sd=55 pgs=0 cs=0 l=0).accept connect_seq 2 vs existing 1 state 3
-66> 2012-05-23 00:23:44.914792 7f18eced6700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6814/12375 pipe(0x4954000 sd=57 pgs=0 cs=0 l=0).accept connect_seq 4 vs existing 3 state 3
-65> 2012-05-23 00:23:44.918360 7f18f0605700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6806/16586 pipe(0xf43900 sd=34 pgs=0 cs=0 l=0).accept connect_seq 4 vs existing 3 state 3
-64> 2012-05-23 00:23:44.918427 7f18efafa700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0x4d34f00 sd=33 pgs=0 cs=0 l=0).accept connect_seq 4 vs existing 3 state 3
-63> 2012-05-23 00:38:45.015247 7f18effff700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0x4bd5b80 sd=55 pgs=106 cs=3 l=0).fault with nothing to send, going to standby
-62> 2012-05-23 00:43:13.231555 7f18eefef700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x4954780 sd=37 pgs=71 cs=3 l=0).fault with nothing to send, going to standby
-61> 2012-05-23 00:46:00.902502 7f18f6813700 0 log [INF] : 0.153 restarting backfill on osd.7 from (0'0,0'0] MAX to 49'4209
-60> 2012-05-23 00:50:27.487085 7f18efafa700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0x4d34f00 sd=33 pgs=66 cs=5 l=0).fault with nothing to send, going to standby
-59> 2012-05-23 00:58:41.632336 7f18ecdd5700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0xf43400 sd=62 pgs=0 cs=0 l=0).accept connect_seq 6 vs existing 5 state 3
-58> 2012-05-23 01:05:16.954192 7f18eefef700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0x4bd5b80 sd=55 pgs=111 cs=5 l=0).fault with nothing to send, going to standby
-57> 2012-05-23 01:12:24.652619 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0xfbef00 sd=29 pgs=0 cs=0 l=0).accept connect_seq 6 vs existing 5 state 3
-56> 2012-05-23 01:25:19.098978 7f18eced6700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6814/12375 pipe(0x4954000 sd=57 pgs=118 cs=5 l=0).fault with nothing to send, going to standby
-55> 2012-05-23 01:27:24.753981 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0xfbef00 sd=29 pgs=115 cs=7 l=0).fault with nothing to send, going to standby
-54> 2012-05-23 01:27:26.885989 7f18ecdd5700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0xf43400 sd=62 pgs=72 cs=7 l=0).fault with nothing to send, going to standby
-53> 2012-05-23 01:28:39.459861 7f18ecfd7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6812/16757 pipe(0x48ab180 sd=39 pgs=45 cs=5 l=0).fault with nothing to send, going to standby
-52> 2012-05-23 01:28:39.579302 7f18ef5f5700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6823/12665 pipe(0xf43b80 sd=36 pgs=122 cs=1 l=0).fault with nothing to send, going to standby
-51> 2012-05-23 01:28:40.104245 7f18ef9f9700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6820/12577 pipe(0xf43680 sd=26 pgs=122 cs=3 l=0).fault with nothing to send, going to standby
-50> 2012-05-23 01:28:40.111639 7f18f0605700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6806/16586 pipe(0xf43900 sd=34 pgs=42 cs=5 l=0).fault with nothing to send, going to standby
-49> 2012-05-23 01:31:40.604446 7f18f0706700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6823/12665 pipe(0x48b8000 sd=40 pgs=0 cs=0 l=0).accept connect_seq 2 vs existing 1 state 3
-48> 2012-05-23 01:31:41.582070 7f18f0403700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6809/16671 pipe(0x4954280 sd=36 pgs=0 cs=0 l=0).accept connect_seq 8 vs existing 7 state 3
-47> 2012-05-23 01:32:22.603023 7f18efafa700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6806/16586 pipe(0x4e04a00 sd=55 pgs=0 cs=0 l=0).accept connect_seq 6 vs existing 5 state 3
-46> 2012-05-23 01:57:09.636996 7f18ecfd7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6812/16757 pipe(0x48ab180 sd=34 pgs=55 cs=7 l=0).fault with nothing to send, going to standby
-45> 2012-05-23 01:58:08.612326 7f18ef9f9700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6820/12577 pipe(0xf43680 sd=26 pgs=140 cs=5 l=0).fault with nothing to send, going to standby
-44> 2012-05-23 01:58:08.622765 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0xfbef00 sd=29 pgs=117 cs=9 l=0).fault with nothing to send, going to standby
-43> 2012-05-23 01:59:09.304944 7f18ef4f4700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6820/12577 pipe(0x7fd1400 sd=64 pgs=0 cs=0 l=0).accept connect_seq 6 vs existing 5 state 3
-42> 2012-05-23 01:59:09.328292 7f18f0504700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6812/16757 pipe(0x6972c80 sd=26 pgs=0 cs=0 l=0).accept connect_seq 8 vs existing 7 state 3
-41> 2012-05-23 02:46:52.631206 7f18eced6700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6814/12375 pipe(0x4954000 sd=57 pgs=130 cs=7 l=0).fault with nothing to send, going to standby
-40> 2012-05-23 02:46:52.631251 7f18effff700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x4954780 sd=37 pgs=82 cs=5 l=0).fault with nothing to send, going to standby
-39> 2012-05-23 03:02:08.365313 7f18effff700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x4954780 sd=37 pgs=103 cs=7 l=0).fault with nothing to send, going to standby
-38> 2012-05-23 03:05:18.395331 7f18ed0d8700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x6972000 sd=30 pgs=0 cs=0 l=0).accept connect_seq 8 vs existing 7 state 3
-37> 2012-05-23 03:51:50.144320 7f18eced6700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6814/12375 pipe(0x4954000 sd=54 pgs=142 cs=9 l=0).fault with nothing to send, going to standby
-36> 2012-05-23 04:03:45.529568 7f18ed0d8700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x6972000 sd=30 pgs=105 cs=9 l=0).fault with nothing to send, going to standby
-35> 2012-05-23 04:03:45.550572 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0xfbef00 sd=29 pgs=130 cs=11 l=0).fault with nothing to send, going to standby
-34> 2012-05-23 04:12:53.950119 7f18f0302700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x6973680 sd=37 pgs=0 cs=0 l=0).accept connect_seq 10 vs existing 9 state 3
-33> 2012-05-23 04:12:55.076692 7f18effff700 0 -- 192.168.1.103:6818/11133 >> 192.168.1.11:0/1680573113 pipe(0x4954780 sd=30 pgs=0 cs=0 l=0).accept peer addr is really 192.168.1.11:0/1680573113 (socket is 192.168.1.11:45615/0)
-32> 2012-05-23 04:13:26.932868 7f18fe022700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 30.926133 secs
-31> 2012-05-23 04:13:26.932879 7f18fe022700 0 log [WRN] : slow request 30.926133 seconds old, received at 2012-05-23 04:12:56.006661: osd_op(client.4332.1:1704797 10000003085.00000125 [write 0~4194304] 0.d2534753 RETRY snapc 1=[]) currently waiting for sub ops
-30> 2012-05-23 04:13:56.734478 7f18fe022700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 60.727762 secs
-29> 2012-05-23 04:13:56.734493 7f18fe022700 0 log [WRN] : slow request 60.727762 seconds old, received at 2012-05-23 04:12:56.006661: osd_op(client.4332.1:1704797 10000003085.00000125 [write 0~4194304] 0.d2534753 RETRY snapc 1=[]) currently waiting for sub ops
-28> 2012-05-23 04:13:57.828526 7f18effff700 0 -- 192.168.1.103:6818/11133 >> 192.168.1.11:0/1680573113 pipe(0x6474c80 sd=30 pgs=0 cs=0 l=0).accept peer addr is really 192.168.1.11:0/1680573113 (socket is 192.168.1.11:45636/0)
-27> 2012-05-23 04:14:56.610250 7f18fe022700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 120.603551 secs
-26> 2012-05-23 04:14:56.610263 7f18fe022700 0 log [WRN] : slow request 120.603551 seconds old, received at 2012-05-23 04:12:56.006661: osd_op(client.4332.1:1704797 10000003085.00000125 [write 0~4194304] 0.d2534753 RETRY snapc 1=[]) currently waiting for sub ops
-25> 2012-05-23 04:16:56.037370 7f18fe022700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 240.030673 secs
-24> 2012-05-23 04:16:56.037382 7f18fe022700 0 log [WRN] : slow request 240.030673 seconds old, received at 2012-05-23 04:12:56.006661: osd_op(client.4332.1:1704797 10000003085.00000125 [write 0~4194304] 0.d2534753 RETRY snapc 1=[]) currently waiting for sub ops
-23> 2012-05-23 04:18:08.869980 7f18f370c700 0 -- 192.168.1.103:6818/11133 send_message dropped message osd_op_reply(1704797 10000003085.00000125 [write 0~4194304] ondisk = 0) v4 because of no pipe on con 0xf45040
-22> 2012-05-23 04:28:11.636499 7f18f0302700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6802/16500 pipe(0x6973680 sd=37 pgs=113 cs=11 l=0).fault with nothing to send, going to standby
-21> 2012-05-23 04:29:09.507392 7f18efafa700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.102:6806/16586 pipe(0x4e04a00 sd=55 pgs=51 cs=7 l=0).fault with nothing to send, going to standby
-20> 2012-05-23 04:44:45.976483 7f18ef7f7700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6817/12460 pipe(0xfbef00 sd=29 pgs=141 cs=13 l=0).fault with nothing to send, going to standby
-19> 2012-05-23 06:11:08.308671 7f18ef4f4700 0 -- 192.168.1.103:6819/11133 >> 192.168.1.101:6820/12577 pipe(0x7fd1400 sd=64 pgs=146 cs=7 l=0).fault with nothing to send, going to standby
-18> 2012-05-23 06:11:09.399242 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:09.399229) v2 because of no pipe on con 0xf452c0
-17> 2012-05-23 06:11:10.499489 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:10.499477) v2 because of no pipe on con 0xf452c0
-16> 2012-05-23 06:11:13.999732 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:13.999713) v2 because of no pipe on con 0xf452c0
-15> 2012-05-23 06:11:15.700010 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:15.699997) v2 because of no pipe on con 0xf452c0
-14> 2012-05-23 06:11:19.801189 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:19.801176) v2 because of no pipe on con 0xf452c0
-13> 2012-05-23 06:11:23.301438 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:23.301427) v2 because of no pipe on con 0xf452c0
-12> 2012-05-23 06:11:23.536077 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:03.536076)
-11> 2012-05-23 06:11:24.401876 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:24.401864) v2 because of no pipe on con 0xf452c0
-10> 2012-05-23 06:11:24.401929 7f18f0f07700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:04.401929)
-9> 2012-05-23 06:11:24.536220 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:04.536219)
-8> 2012-05-23 06:11:25.536348 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:05.536347)
-7> 2012-05-23 06:11:26.536468 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:06.536467)
-6> 2012-05-23 06:11:27.536603 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:07.536602)
-5> 2012-05-23 06:11:27.902171 7f18f0f07700 0 -- 192.168.1.103:0/11133 send_message dropped message osd_ping(ping e551 stamp 2012-05-23 06:11:27.902161) v2 because of no pipe on con 0xf452c0
-4> 2012-05-23 06:11:27.902220 7f18f0f07700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:07.902219)
-3> 2012-05-23 06:11:28.536746 7f18fe022700 -1 osd.9 551 heartbeat_check: no reply from osd.2 since 2012-05-23 06:11:03.499021 (cutoff 2012-05-23 06:11:08.536745)
-2> 2012-05-23 06:16:35.131920 7f18f6813700 0 log [INF] : 0.b restarting backfill on osd.0 from (0'0,0'0] MAX to 475'4216
-1> 2012-05-23 06:16:35.132785 7f18f6813700 0 log [INF] : 0.7 restarting backfill on osd.1 from (0'0,0'0] MAX to 474'4141
0> 2012-05-23 06:16:37.080317 7f18f6012700 -1 osd/PG.cc: In function 'void PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)' thread 7f18f6012700 time 2012-05-23 06:16:36.911866
osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)

ceph version 0.47.1 (commit:f5a9404445e2ed5ec2ee828aa53d73d4a002f7a5)
1: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1987) [0x614437]
2: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x14d) [0x61d9ad]
3: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x11e) [0x64f91e]
4: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x64450b]
5: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x644631]
6: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x19e) [0x61cdce]
7: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x5e7) [0x5dafa7]
8: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x220) [0x5ddda0]
9: (OSD::_dispatch(Message*)+0x191) [0x5e40b1]
10: (OSD::ms_dispatch(Message*)+0x153) [0x5e4a03]
11: (SimpleMessenger::dispatch_entry()+0x85b) [0x783c4b]
12: (SimpleMessenger::DispatchThread::entry()+0xd) [0x74b52d]
13: (()+0x8ec6) [0x7f190367fec6]
14: (clone()+0x6d) [0x7f190232c51d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- end dump of recent events ---
2012-05-23 06:16:37.433469 7f18f6012700 -1 *** Caught signal (Aborted) **
in thread 7f18f6012700

ceph version 0.47.1 (commit:f5a9404445e2ed5ec2ee828aa53d73d4a002f7a5)
1: /usr/bin/ceph-osd() [0x6f2847]
2: (()+0x10bf0) [0x7f1903687bf0]
3: (gsignal()+0x35) [0x7f190227ba95]
4: (abort()+0x17b) [0x7f190227cf0b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f1902ba8bed]
6: (()+0xb9da6) [0x7f1902ba6da6]
7: (()+0xb9dd3) [0x7f1902ba6dd3]
8: (()+0xb9ece) [0x7f1902ba6ece]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x26f) [0x7a406f]
10: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1987) [0x614437]
11: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x14d) [0x61d9ad]
12: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x11e) [0x64f91e]
13: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x64450b]
14: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x644631]
15: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x19e) [0x61cdce]
16: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x5e7) [0x5dafa7]
17: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x220) [0x5ddda0]
18: (OSD::_dispatch(Message*)+0x191) [0x5e40b1]
19: (OSD::ms_dispatch(Message*)+0x153) [0x5e4a03]
20: (SimpleMessenger::dispatch_entry()+0x85b) [0x783c4b]
21: (SimpleMessenger::DispatchThread::entry()+0xd) [0x74b52d]
22: (()+0x8ec6) [0x7f190367fec6]
23: (clone()+0x6d) [0x7f190232c51d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
0> 2012-05-23 06:16:37.433469 7f18f6012700 -1 *** Caught signal (Aborted) **
in thread 7f18f6012700

ceph version 0.47.1 (commit:f5a9404445e2ed5ec2ee828aa53d73d4a002f7a5)
1: /usr/bin/ceph-osd() [0x6f2847]
2: (()+0x10bf0) [0x7f1903687bf0]
3: (gsignal()+0x35) [0x7f190227ba95]
4: (abort()+0x17b) [0x7f190227cf0b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f1902ba8bed]
6: (()+0xb9da6) [0x7f1902ba6da6]
7: (()+0xb9dd3) [0x7f1902ba6dd3]
8: (()+0xb9ece) [0x7f1902ba6ece]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x26f) [0x7a406f]
10: (PG::merge_log(ObjectStore::Transaction&, pg_info_t&, pg_log_t&, int)+0x1987) [0x614437]
11: (PG::RecoveryState::Stray::react(PG::RecoveryState::MLogRec const&)+0x14d) [0x61d9ad]
12: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x11e) [0x64f91e]
13: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x64450b]
14: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x644631]
15: (PG::RecoveryState::handle_log(int, MOSDPGLog*, PG::RecoveryCtx*)+0x19e) [0x61cdce]
16: (OSD::handle_pg_log(std::tr1::shared_ptr<OpRequest>)+0x5e7) [0x5dafa7]
17: (OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>)+0x220) [0x5ddda0]
18: (OSD::_dispatch(Message*)+0x191) [0x5e40b1]
19: (OSD::ms_dispatch(Message*)+0x153) [0x5e4a03]
20: (SimpleMessenger::dispatch_entry()+0x85b) [0x783c4b]
21: (SimpleMessenger::DispatchThread::entry()+0xd) [0x74b52d]
22: (()+0x8ec6) [0x7f190367fec6]
23: (clone()+0x6d) [0x7f190232c51d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- end dump of recent events ---
    (1-1/1)