Bug #1000

closed

osd: PG::replay_queued_ops

Added by Wido den Hollander about 13 years ago. Updated about 13 years ago.

Status: Closed
Priority: Normal
Assignee: (none)
Category: OSD
Target version: -
% Done: 0%


Description

On my cluster (40 OSDs) I'm seeing multiple OSDs going down with this backtrace:

(gdb) bt
#0  0x00007fd932c197bb in raise () from /lib/libpthread.so.0
#1  0x000000000061a2f3 in reraise_fatal (signum=15599) at common/signal.cc:63
#2  0x000000000061b01b in handle_fatal_signal (signum=6) at common/signal.cc:110
#3  <signal handler called>
#4  0x00007fd9317e9a75 in raise () from /lib/libc.so.6
#5  0x00007fd9317ed5c0 in abort () from /lib/libc.so.6
#6  0x00007fd93209f8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#7  0x00007fd93209dd16 in ?? () from /usr/lib/libstdc++.so.6
#8  0x00007fd93209dd43 in std::terminate() () from /usr/lib/libstdc++.so.6
#9  0x00007fd93209de3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#10 0x00000000006001aa in ceph::__ceph_assert_fail (assertion=<value optimized out>, file=<value optimized out>, line=<value optimized out>, 
    func=0x643970 "void PG::replay_queued_ops()") at common/assert.cc:86
#11 0x0000000000556cb6 in PG::replay_queued_ops (this=0x29f3000) at osd/PG.cc:1984
#12 0x00000000004e2a1f in OSD::activate_pg (this=0x2690000, pgid=DWARF-2 expression error: DW_OP_reg operations must be used either alone or in conjuction with DW_OP_piece.
) at osd/OSD.cc:4903
#13 0x00000000004e4648 in OSD::check_replay_queue (this=0x2690000) at osd/OSD.cc:4885
#14 0x000000000051b14e in OSD::tick (this=0x2690000) at osd/OSD.cc:1754
#15 0x00000000005fba2b in SafeTimer::timer_thread (this=0x2690048) at common/Timer.cc:102
#16 0x00000000005fe21d in SafeTimerThread::entry (this=<value optimized out>) at common/Timer.cc:38
#17 0x00007fd932c109ca in start_thread () from /lib/libpthread.so.0
#18 0x00007fd93189c70d in clone () from /lib/libc.so.6
#19 0x0000000000000000 in ?? ()
(gdb)

I restarted an OSD with debug osd = 20, which gave me:

Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.657( v 1311'609 (1205'604,1311'609] n=589 ec=2 les=4908 4857/4857/4857) [32,26,2] r=2 luod=0'0 lcod 0'0 active] sending log(1311'609,1311'609] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.712( v 1215'598 (1215'598,1215'598] n=557 ec=2 les=4908 4857/4857/4857) [32,2,13] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4832'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.712( v 1215'598 (1215'598,1215'598] n=557 ec=2 les=4908 4857/4857/4857) [32,2,13] r=1 luod=0'0 lcod 0'0 active] sending log(1215'598,1215'598] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.854( v 1215'300 (1215'300,1215'300] n=280 ec=2 les=4924 4857/4857/4857) [32,2,11] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4722'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.854( v 1215'300 (1215'300,1215'300] n=280 ec=2 les=4924 4857/4857/4857) [32,2,11] r=1 luod=0'0 lcod 0'0 active] sending log(1215'300,1215'300] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.26b( v 531'5 (26'3,531'5] n=4 ec=2 les=4908 4857/4857/4857) [32,13,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4851'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.26b( v 531'5 (26'3,531'5] n=4 ec=2 les=4908 4857/4857/4857) [32,13,2] r=2 luod=0'0 lcod 0'0 active] sending log(531'5,531'5] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.3cc( v 26'5 (26'3,26'5] n=5 ec=2 les=4907 4857/4857/4857) [32,2,21] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4826'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.3cc( v 26'5 (26'3,26'5] n=5 ec=2 les=4907 4857/4857/4857) [32,2,21] r=1 luod=0'0 lcod 0'0 active] sending log(26'5,26'5] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.656( v 471'5 (26'3,471'5] n=4 ec=2 les=4908 4857/4857/4857) [32,26,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4830'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.656( v 471'5 (26'3,471'5] n=4 ec=2 les=4908 4857/4857/4857) [32,26,2] r=2 luod=0'0 lcod 0'0 active] sending log(471'5,471'5] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.711( v 506'8 (471'6,506'8] n=5 ec=2 les=4908 4857/4857/4857) [32,2,13] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4832'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.711( v 506'8 (471'6,506'8] n=5 ec=2 les=4908 4857/4857/4857) [32,2,13] r=1 luod=0'0 lcod 0'0 active] sending log(506'8,506'8] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.26a( v 24'27 (24'25,24'27] n=27 ec=2 les=4908 4857/4857/4857) [32,13,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4852'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.26a( v 24'27 (24'25,24'27] n=27 ec=2 les=4908 4857/4857/4857) [32,13,2] r=2 luod=0'0 lcod 0'0 active] sending log(24'27,24'27] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.3cb( v 24'21 (24'19,24'21] n=21 ec=2 les=4908 4857/4857/4857) [32,2,21] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4826'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.3cb( v 24'21 (24'19,24'21] n=21 ec=2 les=4908 4857/4857/4857) [32,2,21] r=1 luod=0'0 lcod 0'0 active] sending log(24'21,24'21] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.655( v 24'22 (24'20,24'22] n=22 ec=2 les=4908 4857/4857/4857) [32,26,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4831'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.655( v 24'22 (24'20,24'22] n=22 ec=2 les=4908 4857/4857/4857) [32,26,2] r=2 luod=0'0 lcod 0'0 active] sending log(24'22,24'22] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.710( v 1586'30 (1586'30,1586'30] n=30 ec=2 les=4908 4857/4857/4857) [32,2,13] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4830'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.710( v 1586'30 (1586'30,1586'30] n=30 ec=2 les=4908 4857/4857/4857) [32,2,13] r=1 luod=0'0 lcod 0'0 active] sending log(1586'30,1586'30] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.852( v 24'9 (24'7,24'9] n=9 ec=2 les=4924 4857/4857/4857) [32,2,11] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4722'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.852( v 24'9 (24'7,24'9] n=9 ec=2 les=4924 4857/4857/4857) [32,2,11] r=1 luod=0'0 lcod 0'0 active] sending log(24'9,24'9] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.269( v 25'68 (25'66,25'68] n=14 ec=2 les=4908 4857/4857/4857) [32,13,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4851'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.269( v 25'68 (25'66,25'68] n=14 ec=2 les=4908 4857/4857/4857) [32,13,2] r=2 luod=0'0 lcod 0'0 active] sending log(25'68,25'68] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.3ca( v 25'72 (25'70,25'72] n=17 ec=2 les=4908 4857/4857/4857) [32,2,21] r=1 luod=0'0 lcod 0'0 active] sending info+missing+log since 4826'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.3ca( v 25'72 (25'70,25'72] n=17 ec=2 les=4908 4857/4857/4857) [32,2,21] r=1 luod=0'0 lcod 0'0 active] sending log(25'72,25'72] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.967( v 471'4 (25'2,471'4] n=3 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean] my log = log(25'2,471'4]
Apr 12 14:50:06 atom0 osd.2[15599]: 5'3 (0'0) m 10000001086.00000000/head by mds0.1:98401 2011-03-31 20:23:39.061913 indexed
Apr 12 14:50:06 atom0 osd.2[15599]: 71'4 (25'1) m 1000000021b.00000000/head by mds0.1:190907 2011-04-04 01:26:48.246041 indexed
Apr 12 14:50:06 atom0 osd.2[15599]: 
Apr 12 14:50:06 atom0 osd.2[15599]: g[1.967( v 471'4 (25'2,471'4] n=3 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean] osd32 log = log(471'4,471'4]
Apr 12 14:50:06 atom0 osd.2[15599]: 
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 _dispatch 0xfcba800 pg_log(2.966 e4944) v1
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 handle_pg_log pg_log(2.966 e4944) v1 from osd32
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 require_same_or_newer_map 4944 (i am 4944) 0xfcba800
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.966( v 24'7 (24'5,24'7] n=7 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean]: _process_pg_info info: 2.966( v 24'7 (24'7,24'7]+backlog n=7 ec=2 les=4857 4857/4857/4718), log: log(24'7,24'7], missing: missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.966( v 24'7 (24'5,24'7] n=7 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean] my log = log(24'5,24'7]
Apr 12 14:50:06 atom0 osd.2[15599]: 4'6 (0'0) m gateway_15373_object16613/head by client4205.0:16614 2011-03-31 12:24:42.168247 indexed
Apr 12 14:50:06 atom0 osd.2[15599]: 4'7 (0'0) m gateway_15373_object20264/head by client4205.0:20265 2011-03-31 12:27:31.310326 indexed
Apr 12 14:50:06 atom0 osd.2[15599]: 
Apr 12 14:50:06 atom0 osd.2[15599]: g[2.966( v 24'7 (24'5,24'7] n=7 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean] osd32 log = log(24'7,24'7]
Apr 12 14:50:06 atom0 osd.2[15599]: 
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 _dispatch 0xfcba000 pg_log(3.965 e4944) v1
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 handle_pg_log pg_log(3.965 e4944) v1 from osd32
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 require_same_or_newer_map 4944 (i am 4944) 0xfcba000
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.965( v 25'81 (25'79,25'81] n=16 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean]: _process_pg_info info: 3.965( v 25'81 (25'81,25'81]+backlog n=16 ec=2 les=4857 4857/4857/4718), log: log(25'81,25'81], missing: missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.965( v 25'81 (25'79,25'81] n=16 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean] my log = log(25'79,25'81]
Apr 12 14:50:06 atom0 osd.2[15599]: 5'80 (25'79) m rb.0.1.000000004039/head by client4196.0:179670 2011-03-31 15:19:49.547091 indexed
Apr 12 14:50:06 atom0 osd.2[15599]: 5'81 (25'80) m rb.0.1.000000004039/head by client4196.0:179674 2011-03-31 15:19:50.187755 indexed
Apr 12 14:50:06 atom0 osd.2[15599]: 
Apr 12 14:50:06 atom0 osd.2[15599]: g[3.965( v 25'81 (25'79,25'81] n=16 ec=2 les=4857 4857/4857/4718) [2,39,32] r=0 lcod 0'0 mlcod 0'0 active+clean] osd32 log = log(25'81,25'81]
Apr 12 14:50:06 atom0 osd.2[15599]: 
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0x5ebd000
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd4 got stat stat(2011-04-12 14:50:05.045349 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd4 [2a00:f10:113:1:225:90ff:fe32:cf64]:6802/9348 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd4 stat(2011-04-12 14:50:05.045349 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0x3d4e000
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd10 got stat stat(2011-04-12 14:50:19.148974 oprate=0.191361 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd10 [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/18749 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd10 stat(2011-04-12 14:50:19.148974 oprate=0.191361 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0x5b338c0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd9 got stat stat(2011-04-12 14:50:19.154303 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd9 [2a00:f10:113:1:225:90ff:fe33:49f2]:6811/19540 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd9 stat(2011-04-12 14:50:19.154303 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0xfed2380
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd21 got stat stat(2011-04-12 14:50:04.446883 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd21 [2a00:f10:113:1:225:90ff:fe33:497c]:6805/1199 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd21 stat(2011-04-12 14:50:04.446883 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 _dispatch 0x10815c40 PGq v1
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 handle_pg_query from osd33 epoch 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 require_same_or_newer_map 4944 (i am 4944) 0x10815c40
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.3f5( v 1288'628 (1205'625,1288'628] n=581 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4723'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[0.3f5( v 1288'628 (1205'625,1288'628] n=581 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending log(1288'628,1288'628] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.3f4( v 471'9 (26'8,471'9] n=8 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4723'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[1.3f4( v 471'9 (26'8,471'9] n=8 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending log(471'9,471'9] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.3f3( v 24'19 (24'17,24'19]+backlog n=19 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4723'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[2.3f3( v 24'19 (24'17,24'19]+backlog n=19 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending log(24'19,24'19] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.3f2( v 25'104 (25'102,25'104]+backlog n=19 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending info+missing+log since 4723'0
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd927589700 osd2 4944 pg[3.3f2( v 25'104 (25'102,25'104]+backlog n=19 ec=2 les=4881 4825/4825/4825) [33,28,2] r=2 luod=0'0 lcod 0'0 active] sending log(25'104,25'104] missing(0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0xfe9ce00
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd16 got stat stat(2011-04-12 14:50:11.558873 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd16 [2a00:f10:113:1:225:90ff:fe33:49cc]:6809/5541 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd16 stat(2011-04-12 14:50:11.558873 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0x5679700
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd24 got stat stat(2011-04-12 14:50:12.397722 oprate=0.0382867 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd24 [2a00:f10:113:1:225:90ff:fe33:49ca]:6802/9083 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd24 stat(2011-04-12 14:50:12.397722 oprate=0.0382867 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd92e597700 osd2 4944 tick
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd92e597700 osd2 4944 scrub_should_schedule loadavg 7.42 >= max 0.5 = no, load too high
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd92e597700 osd2 4944 activate_pg
Apr 12 14:50:06 atom0 osd.2[15599]: osd/PG.cc: In function 'void PG::replay_queued_ops()', in thread '0x7fd92e597700'#012osd/PG.cc: 1984: FAILED assert(is_replay() && is_active() && !is_crashed())
Apr 12 14:50:06 atom0 osd.2[15599]:  ceph version 0.25-627-gc02b56e (commit:c02b56e62221a35a4edff7b0afedb5b24d0e0de3)#012 1: (PG::replay_queued_ops()+0x3b6) [0x556cb6]#012 2: (OSD::activate_pg(pg_t, utime_t)+0x1ef) [0x4e2a1f]#012 3: (OSD::check_replay_queue()+0x188) [0x4e4648]#012 4: (OSD::tick()+0x24e) [0x51b14e]#012 5: (SafeTimer::timer_thread()+0x36b) [0x5fba2b]#012 6: (SafeTimerThread::entry()+0xd) [0x5fe21d]#012 7: (()+0x69ca) [0x7fd932c109ca]#012 8: (clone()+0x6d) [0x7fd93189c70d]
Apr 12 14:50:06 atom0 osd.2[15599]:  ceph version 0.25-627-gc02b56e (commit:c02b56e62221a35a4edff7b0afedb5b24d0e0de3)#012 1: (PG::replay_queued_ops()+0x3b6) [0x556cb6]#012 2: (OSD::activate_pg(pg_t, utime_t)+0x1ef) [0x4e2a1f]#012 3: (OSD::check_replay_queue()+0x188) [0x4e4648]#012 4: (OSD::tick()+0x24e) [0x51b14e]#012 5: (SafeTimer::timer_thread()+0x36b) [0x5fba2b]#012 6: (SafeTimerThread::entry()+0xd) [0x5fe21d]#012 7: (()+0x69ca) [0x7fd932c109ca]#012 8: (clone()+0x6d) [0x7fd93189c70d]
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 heartbeat_dispatch 0x3d4e380
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 handle_osd_ping from osd8 got stat stat(2011-04-12 14:50:19.221868 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 _share_map_incoming osd8 [2a00:f10:113:1:225:90ff:fe33:49f2]:6808/6040 4944
Apr 12 14:50:06 atom0 osd.2[15599]: 7fd926d88700 osd2 4944 take_peer_stat peer osd8 stat(2011-04-12 14:50:19.221868 oprate=0 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
Apr 12 14:50:06 atom0 osd.2[15599]: *** Caught signal (Aborted) **#012 in thread 0x7fd92e597700
Apr 12 14:50:06 atom0 osd.2[15599]:  ceph version 0.25-627-gc02b56e (commit:c02b56e62221a35a4edff7b0afedb5b24d0e0de3)#012 1: /usr/bin/cosd() [0x61adfe]#012 2: (()+0xf8f0) [0x7fd932c198f0]#012 3: (gsignal()+0x35) [0x7fd9317e9a75]#012 4: (abort()+0x180) [0x7fd9317ed5c0]#012 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7fd93209f8e5]#012 6: (()+0xcad16) [0x7fd93209dd16]#012 7: (()+0xcad43) [0x7fd93209dd43]#012 8: (()+0xcae3e) [0x7fd93209de3e]#012 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x36a) [0x6001aa]#012 10: (PG::replay_queued_ops()+0x3b6) [0x556cb6]#012 11: (OSD::activate_pg(pg_t, utime_t)+0x1ef) [0x4e2a1f]#012 12: (OSD::check_replay_queue()+0x188) [0x4e4648]#012 13: (OSD::tick()+0x24e) [0x51b14e]#012 14: (SafeTimer::timer_thread()+0x36b) [0x5fba2b]#012 15: (SafeTimerThread::entry()+0xd) [0x5fe21d]#012 16: (()+0x69ca) [0x7fd932c109ca]#012 17: (clone()+0x6d) [0x7fd93189c70d]
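The `#012` sequences in the assert and backtrace lines above are newlines that syslog escaped as `#` plus three octal digits (012 octal is a line feed). A minimal sketch for unfolding such a line, assuming that escaping scheme; the function name `unescape_syslog` is made up for this example and only control characters are decoded, so text like "Bug #1000" is left alone:

```python
import re

# "#" followed by exactly three octal digits, e.g. "#012" for a line feed.
_ESCAPE = re.compile(r"#([0-7]{3})")


def unescape_syslog(line: str) -> str:
    """Turn #ooo escapes for control characters back into the real character."""
    def repl(match: re.Match) -> str:
        code = int(match.group(1), 8)
        # Only decode control characters; leave anything else untouched.
        if code < 0x20 or code == 0x7F:
            return chr(code)
        return match.group(0)

    return _ESCAPE.sub(repl, line)


# The assert line from the log above, without the syslog prefix.
sample = (
    "osd/PG.cc: In function 'void PG::replay_queued_ops()', "
    "in thread '0x7fd92e597700'#012"
    "osd/PG.cc: 1984: FAILED assert(is_replay() && is_active() && !is_crashed())"
)

if __name__ == "__main__":
    print(unescape_syslog(sample))
```

Run against the backtrace lines, this puts each numbered frame on its own line, which is easier to compare with the gdb output.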
Actions #1

Updated by Samuel Just about 13 years ago

Likely caused by handle_pg_notify calling do_peer on an active pg.
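For reference, the failed assert requires three things at once: is_replay(), is_active() and !is_crashed(). The sketch below is not Ceph code. It is a toy check that takes a PG state string as printed in the debug log (for example "active+clean") and reports which part of that condition would not hold. It assumes that the replay and crashed states would show up in that string as "replay" and "crashed"; the helper names are made up for this example.

```python
def replay_precondition(state: str) -> dict:
    """Evaluate each conjunct of the assert against a '+'-joined state string."""
    flags = set(state.split("+"))
    return {
        "is_replay": "replay" in flags,          # assumed flag name
        "is_active": "active" in flags,
        "not is_crashed": "crashed" not in flags,  # assumed flag name
    }


def failed_conjuncts(state: str) -> list:
    """Return the names of the conjuncts that are false for this state."""
    return [name for name, ok in replay_precondition(state).items() if not ok]


if __name__ == "__main__":
    # States seen in the osd.2 log above.
    for state in ("active", "active+clean"):
        print(state, "->", failed_conjuncts(state))
```

Both states that appear in the osd.2 log fail only the is_replay part under this model. The log does not say which PG activate_pg was called for, so this only narrows down what to look for; it does not confirm the cause.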

Actions #2

Updated by Wido den Hollander about 13 years ago

  • Status changed from New to Closed

Uh, my bad, this is a duplicate of #990.
