Project

General

Profile

Actions

Bug #9082

closed

Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before tier agent can activate

Added by karan singh over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello

Ceph version : 0.80.5
Centos 6.5
Features in use : erasure coding and cache tiering

Few hours back my cluster was working absolutely fine , all OSD's were up and running.

As soon as i started benchmarking my erasure-coded pool which is has a tired cache pool , my osds are started getting DOWN and OUT . Immediately 6 OSDS went down and all of them are throwing the same error. All of these osds which went down are from cache pool crush ruleset.

2014-08-12 17:43:21.586831 7f4c5e33a700 0 log [WRN] : pg 24.104 has invalid (post-split) stats; must scrub before tier agent can activate
2014-08-12 17:43:21.593174 7f4c5d939700 -1 ** Caught signal (Floating point exception) *

When i am manually trying to start these ODS , they are not coming up and throwing the same error message (mentioned above)

I have been reading Firefly release notes ( http://ceph.com/releases/v0-80-firefly-released/ ) , looks like it was fixed earlier
osd: warn if agent cannot enable due to invalid (post-split) stats (Sage Weil)

But in version 0.80.5 , i am still facing this. Is this something you already know ??

Below are the OSD detailed LOGS:

2014-08-12 17:43:21.584841 7f4c5d939700  0 log [WRN] : pg 24.30f has invalid (post-split) stats; must scrub before tier agent can activate
  -175> 2014-08-12 17:43:21.584995 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.6 192.168.1.107:6809/53141 13 ==== pg_info(1 pgs e194300:22.795) v4 ==== 714+0+0 (1092772418 0 0) 0x62ba300 con 0x40cc360
  -174> 2014-08-12 17:43:21.585030 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.584855, event: header_read, request: pg_info(1 pgs e194300:22.795) v4
  -173> 2014-08-12 17:43:21.585044 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.584859, event: throttled, request: pg_info(1 pgs e194300:22.795) v4
  -172> 2014-08-12 17:43:21.585055 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.584962, event: all_read, request: pg_info(1 pgs e194300:22.795) v4
  -171> 2014-08-12 17:43:21.585064 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.585024, event: dispatched, request: pg_info(1 pgs e194300:22.795) v4
  -170> 2014-08-12 17:43:21.585074 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.585074, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:22.795) v4
  -169> 2014-08-12 17:43:21.585087 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.585087, event: started, request: pg_info(1 pgs e194300:22.795) v4
  -168> 2014-08-12 17:43:21.585116 7f4c65145700  5 -- op tracker -- , seq: 647, time: 2014-08-12 17:43:21.585116, event: done, request: pg_info(1 pgs e194300:22.795) v4
  -167> 2014-08-12 17:43:21.585173 7f4c5e33a700  0 log [WRN] : pg 24.2a9 has invalid (post-split) stats; must scrub before tier agent can activate
  -166> 2014-08-12 17:43:21.585232 7f4c5d939700  0 log [WRN] : pg 24.254 has invalid (post-split) stats; must scrub before tier agent can activate
  -165> 2014-08-12 17:43:21.585226 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.2a9( v 194216'12832 (192834'10938,194216'12832] local-les=194300 n=4 ec=192688 les/c 194300/194235 194298/194299/194299) [306,24,61] r=0 lpr=194299 pi=194232-194298/5 crt=192888'12436 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Activating 0.366973 12 0.000638
  -164> 2014-08-12 17:43:21.585259 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.2a9( v 194216'12832 (192834'10938,194216'12832] local-les=194300 n=4 ec=192688 les/c 194300/194235 194298/194299/194299) [306,24,61] r=0 lpr=194299 pi=194232-194298/5 crt=192888'12436 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Recovered
  -163> 2014-08-12 17:43:21.585270 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.30f( v 194217'14634 (192718'11887,194217'14634] local-les=194300 n=5 ec=192688 les/c 194300/194235 194298/194299/194299) [306,308,62] r=0 lpr=194299 pi=194232-194298/5 crt=192888'14229 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Activating 0.368068 12 0.000658
  -162> 2014-08-12 17:43:21.585288 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.2a9( v 194216'12832 (192834'10938,194216'12832] local-les=194300 n=4 ec=192688 les/c 194300/194235 194298/194299/194299) [306,24,61] r=0 lpr=194299 pi=194232-194298/5 crt=192888'12436 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Recovered 0.000028 0 0.000000
  -161> 2014-08-12 17:43:21.585309 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.30f( v 194217'14634 (192718'11887,194217'14634] local-les=194300 n=5 ec=192688 les/c 194300/194235 194298/194299/194299) [306,308,62] r=0 lpr=194299 pi=194232-194298/5 crt=192888'14229 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Recovered
  -160> 2014-08-12 17:43:21.585311 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.2a9( v 194216'12832 (192834'10938,194216'12832] local-les=194300 n=4 ec=192688 les/c 194300/194235 194298/194299/194299) [306,24,61] r=0 lpr=194299 pi=194232-194298/5 crt=192888'12436 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Clean
  -159> 2014-08-12 17:43:21.585339 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.30f( v 194217'14634 (192718'11887,194217'14634] local-les=194300 n=5 ec=192688 les/c 194300/194235 194298/194299/194299) [306,308,62] r=0 lpr=194299 pi=194232-194298/5 crt=192888'14229 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Recovered 0.000031 0 0.000000
  -158> 2014-08-12 17:43:21.585363 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.30f( v 194217'14634 (192718'11887,194217'14634] local-les=194300 n=5 ec=192688 les/c 194300/194235 194298/194299/194299) [306,308,62] r=0 lpr=194299 pi=194232-194298/5 crt=192888'14229 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Clean
  -157> 2014-08-12 17:43:21.585368 7f4c5e33a700  1 -- 192.168.1.101:6800/51417 --> osd.24 192.168.1.107:6800/52669 -- pg_info(1 pgs e194300:24.2a9) v4 -- ?+0 0x6038c40
  -156> 2014-08-12 17:43:21.585370 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.6 192.168.1.107:6809/53141 14 ==== MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2 ==== 26+0+0 (3761680802 0 0) 0x62b8540 con 0x40cc360
  -155> 2014-08-12 17:43:21.585406 7f4c65145700  5 -- op tracker -- , seq: 648, time: 2014-08-12 17:43:21.585265, event: header_read, request: MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2
  -154> 2014-08-12 17:43:21.585419 7f4c65145700  5 -- op tracker -- , seq: 648, time: 2014-08-12 17:43:21.585268, event: throttled, request: MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2
  -153> 2014-08-12 17:43:21.585430 7f4c65145700  5 -- op tracker -- , seq: 648, time: 2014-08-12 17:43:21.585345, event: all_read, request: MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2
  -152> 2014-08-12 17:43:21.585421 7f4c5d939700  1 -- 192.168.1.101:6800/51417 --> osd.62 192.168.1.112:6824/5505 -- pg_info(1 pgs e194300:24.30f) v4 -- ?+0 0x62baf40
  -151> 2014-08-12 17:43:21.585439 7f4c65145700  5 -- op tracker -- , seq: 648, time: 2014-08-12 17:43:21.585402, event: dispatched, request: MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2
  -150> 2014-08-12 17:43:21.585446 7f4c65145700  5 -- op tracker -- , seq: 648, time: 2014-08-12 17:43:21.585446, event: waiting_for_osdmap, request: MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2
  -149> 2014-08-12 17:43:21.585467 7f4c65145700  5 -- op tracker -- , seq: 648, time: 2014-08-12 17:43:21.585467, event: done, request: MRecoveryReserve REQUEST  pgid: 22.795, query_epoch: 194300 v2
  -148> 2014-08-12 17:43:21.585465 7f4c5d939700  1 -- 192.168.1.101:6800/51417 --> osd.308 192.168.1.113:6802/14403 -- pg_info(1 pgs e194300:24.30f) v4 -- ?+0 0x62babc0
  -147> 2014-08-12 17:43:21.585650 7f4c5e33a700  1 -- 192.168.1.101:6800/51417 --> osd.61 192.168.1.111:6840/16566 -- pg_info(1 pgs e194300:24.2a9) v4 -- ?+0 0x603b2c0
  -146> 2014-08-12 17:43:21.585680 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.254( v 194215'17471 (192803'14472,194215'17471] local-les=194300 n=-9 ec=192688 les/c 194300/194235 194298/194299/194299) [306,289,6] r=0 lpr=194299 pi=193810-194298/7 crt=192888'17092 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Activating 0.368052 12 0.000598
  -145> 2014-08-12 17:43:21.585737 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.254( v 194215'17471 (192803'14472,194215'17471] local-les=194300 n=-9 ec=192688 les/c 194300/194235 194298/194299/194299) [306,289,6] r=0 lpr=194299 pi=193810-194298/7 crt=192888'17092 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Recovered
  -144> 2014-08-12 17:43:21.585766 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.254( v 194215'17471 (192803'14472,194215'17471] local-les=194300 n=-9 ec=192688 les/c 194300/194235 194298/194299/194299) [306,289,6] r=0 lpr=194299 pi=193810-194298/7 crt=192888'17092 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Recovered 0.000029 0 0.000000
  -143> 2014-08-12 17:43:21.585786 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.254( v 194215'17471 (192803'14472,194215'17471] local-les=194300 n=-9 ec=192688 les/c 194300/194235 194298/194299/194299) [306,289,6] r=0 lpr=194299 pi=193810-194298/7 crt=192888'17092 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Clean
  -142> 2014-08-12 17:43:21.585798 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.49 192.168.1.111:6824/11112 -- pg_info(1 pgs e194300:24.104) v4 -- ?+0 0x62bb9c0
  -141> 2014-08-12 17:43:21.585845 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.59 192.168.1.112:6822/5265 -- osd pg remove(epoch 194300; pg24.30f; ) v2 -- ?+0 0x6d840c0
  -140> 2014-08-12 17:43:21.585892 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.301 192.168.1.113:6814/16727 -- osd pg remove(epoch 194300; pg24.30f; ) v2 -- ?+0 0x6d83f00
  -139> 2014-08-12 17:43:21.585922 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.292 192.168.1.113:6812/16056 -- pg_info(1 pgs e194300:24.104) v4 -- ?+0 0x603f380
  -138> 2014-08-12 17:43:21.586096 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.379 192.168.1.108:6830/4861 -- osd pg remove(epoch 194300; pg24.30f; ) v2 -- ?+0 0x6d83d40
  -137> 2014-08-12 17:43:21.586156 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[22.795( v 194215'502 lc 0'0 (0'0,194215'502] local-les=194299 n=26 ec=183096 les/c 194299/194273 194298/194298/192967) [6,306,374] r=1 lpr=194299 pi=194270-194297/3 luod=0'0 crt=194215'502 active m=26] exit Started/ReplicaActive/RepNotRecovering 0.801323 7 0.000751
  -136> 2014-08-12 17:43:21.586175 7f4c5d939700  1 -- 192.168.1.101:6800/51417 --> osd.6 192.168.1.107:6809/53141 -- pg_info(1 pgs e194300:24.254) v4 -- ?+0 0x62bb800
  -135> 2014-08-12 17:43:21.586208 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[22.795( v 194215'502 lc 0'0 (0'0,194215'502] local-les=194299 n=26 ec=183096 les/c 194299/194273 194298/194298/192967) [6,306,374] r=1 lpr=194299 pi=194270-194297/3 luod=0'0 crt=194215'502 active m=26] enter Started/ReplicaActive/RepWaitRecoveryReserved
  -134> 2014-08-12 17:43:21.586229 7f4c5d939700  1 -- 192.168.1.101:6800/51417 --> osd.289 192.168.1.113:6808/15284 -- pg_info(1 pgs e194300:24.254) v4 -- ?+0 0x62ba140
  -133> 2014-08-12 17:43:21.586394 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.6 192.168.1.107:6809/53141 -- osd pg remove(epoch 194300; pg24.2a9; ) v2 -- ?+0 0x6d81880
  -132> 2014-08-12 17:43:21.586444 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.29 192.168.1.115:6806/59028 -- osd pg remove(epoch 194300; pg24.2a9; ) v2 -- ?+0 0x6d81c00
  -131> 2014-08-12 17:43:21.586474 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.47 192.168.1.111:6828/12102 -- osd pg remove(epoch 194300; pg24.2a9; ) v2 -- ?+0 0x62bdcc0
  -130> 2014-08-12 17:43:21.586625 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.18 192.168.1.107:6805/52903 -- osd pg remove(epoch 194300; pg24.254; ) v2 -- ?+0 0x62bbf00
  -129> 2014-08-12 17:43:21.586676 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.292 192.168.1.113:6812/16056 -- osd pg remove(epoch 194300; pg24.254; ) v2 -- ?+0 0x62bc980
  -128> 2014-08-12 17:43:21.586701 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.344 192.168.1.106:6812/6757 -- osd pg remove(epoch 194300; pg24.254; ) v2 -- ?+0 0x62b9880
  -127> 2014-08-12 17:43:21.586831 7f4c5e33a700  0 log [WRN] : pg 24.104 has invalid (post-split) stats; must scrub before tier agent can activate
  -126> 2014-08-12 17:43:21.586875 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.104( v 194215'15298 (192718'12373,194215'15298] local-les=194300 n=12 ec=192688 les/c 194300/194235 194298/194299/194299) [306,49,292] r=0 lpr=194299 pi=194232-194298/5 crt=192834'14593 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Activating 0.368029 12 0.000767
  -125> 2014-08-12 17:43:21.586909 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.104( v 194215'15298 (192718'12373,194215'15298] local-les=194300 n=12 ec=192688 les/c 194300/194235 194298/194299/194299) [306,49,292] r=0 lpr=194299 pi=194232-194298/5 crt=192834'14593 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Recovered
  -124> 2014-08-12 17:43:21.586935 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.104( v 194215'15298 (192718'12373,194215'15298] local-les=194300 n=12 ec=192688 les/c 194300/194235 194298/194299/194299) [306,49,292] r=0 lpr=194299 pi=194232-194298/5 crt=192834'14593 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Recovered 0.000026 0 0.000000
  -123> 2014-08-12 17:43:21.586958 7f4c5e33a700  5 osd.306 pg_epoch: 194300 pg[24.104( v 194215'15298 (192718'12373,194215'15298] local-les=194300 n=12 ec=192688 les/c 194300/194235 194298/194299/194299) [306,49,292] r=0 lpr=194299 pi=194232-194298/5 crt=192834'14593 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Clean
  -122> 2014-08-12 17:43:21.587005 7f4c5e33a700  1 -- 192.168.1.101:6800/51417 --> osd.49 192.168.1.111:6824/11112 -- pg_info(1 pgs e194300:24.104) v4 -- ?+0 0x6d863c0
  -121> 2014-08-12 17:43:21.587043 7f4c5e33a700  1 -- 192.168.1.101:6800/51417 --> osd.292 192.168.1.113:6812/16056 -- pg_info(1 pgs e194300:24.104) v4 -- ?+0 0x6d86200
  -120> 2014-08-12 17:43:21.587364 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.344 192.168.1.106:6812/6757 -- pg_info(1 pgs e194300:24.58) v4 -- ?+0 0x603a680
  -119> 2014-08-12 17:43:21.587407 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.47 192.168.1.111:6828/12102 -- osd pg remove(epoch 194300; pg24.104; ) v2 -- ?+0 0x62bbb80
  -118> 2014-08-12 17:43:21.587444 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.352 192.168.1.108:6826/4273 -- pg_info(1 pgs e194300:24.58) v4 -- ?+0 0x603aa00
  -117> 2014-08-12 17:43:21.587478 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.295 192.168.1.113:6806/14945 -- osd pg remove(epoch 194300; pg24.104; ) v2 -- ?+0 0x62baa00
  -116> 2014-08-12 17:43:21.587510 7f4c69567700  1 -- 192.168.1.101:6800/51417 --> osd.397 192.168.1.105:6808/41085 -- osd pg remove(epoch 194300; pg24.104; ) v2 -- ?+0 0x62ba4c0
  -115> 2014-08-12 17:43:21.587533 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.58( v 194217'14171 (192718'11167,194217'14171] local-les=194300 n=-1 ec=192688 les/c 194300/193814 194298/194299/194299) [306,352,344]/[306] r=0 lpr=194299 pi=193813-194298/13 bft=344,352 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded+remapped] on activate: bft=344,352 from 0//0//-1
  -114> 2014-08-12 17:43:21.587566 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.58( v 194217'14171 (192718'11167,194217'14171] local-les=194300 n=-1 ec=192688 les/c 194300/193814 194298/194299/194299) [306,352,344]/[306] r=0 lpr=194299 pi=193813-194298/13 bft=344,352 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded+remapped] target shard 344 from 0//0//-1
  -113> 2014-08-12 17:43:21.587584 7f4c5d939700  5 osd.306 pg_epoch: 194300 pg[24.58( v 194217'14171 (192718'11167,194217'14171] local-les=194300 n=-1 ec=192688 les/c 194300/193814 194298/194299/194299) [306,352,344]/[306] r=0 lpr=194299 pi=193813-194298/13 bft=344,352 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded+remapped] target shard 352 from 0//0//-1
  -112> 2014-08-12 17:43:21.587939 7f4c5d939700  0 log [WRN] : pg 24.58 has invalid (post-split) stats; must scrub before tier agent can activate
  -111> 2014-08-12 17:43:21.588814 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.302 192.168.1.113:6810/15731 -- pg_info(1 pgs e194300:24.3f1) v4 -- ?+0 0x603b640
  -110> 2014-08-12 17:43:21.588955 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.290 192.168.1.109:6808/31607 -- pg_info(1 pgs e194300:24.119) v4 -- ?+0 0x6039a40
  -109> 2014-08-12 17:43:21.589098 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.18 192.168.1.107:6805/52903 -- pg_info(1 pgs e194300:24.1c4) v4 -- ?+0 0x603db00
  -108> 2014-08-12 17:43:21.589383 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.313 192.168.1.109:6820/35445 -- pg_info(1 pgs e194300:24.13b) v4 -- ?+0 0x6038540
  -107> 2014-08-12 17:43:21.589673 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.342 192.168.1.110:6802/41438 -- pg_info(1 pgs e194300:24.36d) v4 -- ?+0 0x603d240
  -106> 2014-08-12 17:43:21.589818 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.18 192.168.1.107:6805/52903 13 ==== pg_info(1 pgs e194300:24.1c4) v4 ==== 748+0+0 (2739292217 0 0) 0x603c280 con 0x40cdac0
  -105> 2014-08-12 17:43:21.589857 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589678, event: header_read, request: pg_info(1 pgs e194300:24.1c4) v4
  -104> 2014-08-12 17:43:21.589872 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589680, event: throttled, request: pg_info(1 pgs e194300:24.1c4) v4
  -103> 2014-08-12 17:43:21.589881 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589760, event: all_read, request: pg_info(1 pgs e194300:24.1c4) v4
  -102> 2014-08-12 17:43:21.589888 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589852, event: dispatched, request: pg_info(1 pgs e194300:24.1c4) v4
  -101> 2014-08-12 17:43:21.589898 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589898, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.1c4) v4
  -100> 2014-08-12 17:43:21.589912 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589911, event: started, request: pg_info(1 pgs e194300:24.1c4) v4
   -99> 2014-08-12 17:43:21.589972 7f4c65145700  5 -- op tracker -- , seq: 649, time: 2014-08-12 17:43:21.589972, event: done, request: pg_info(1 pgs e194300:24.1c4) v4
   -98> 2014-08-12 17:43:21.589998 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.302 192.168.1.113:6810/15731 12 ==== pg_info(1 pgs e194300:24.3f1) v4 ==== 744+0+0 (1504649001 0 0) 0x603b640 con 0x48baec0
   -97> 2014-08-12 17:43:21.590017 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.589838, event: header_read, request: pg_info(1 pgs e194300:24.3f1) v4
   -96> 2014-08-12 17:43:21.590028 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.589842, event: throttled, request: pg_info(1 pgs e194300:24.3f1) v4
   -95> 2014-08-12 17:43:21.590037 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.589945, event: all_read, request: pg_info(1 pgs e194300:24.3f1) v4
   -94> 2014-08-12 17:43:21.590046 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.590015, event: dispatched, request: pg_info(1 pgs e194300:24.3f1) v4
   -93> 2014-08-12 17:43:21.590056 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.590055, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.3f1) v4
   -92> 2014-08-12 17:43:21.590067 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.590066, event: started, request: pg_info(1 pgs e194300:24.3f1) v4
   -91> 2014-08-12 17:43:21.590095 7f4c65145700  5 -- op tracker -- , seq: 650, time: 2014-08-12 17:43:21.590094, event: done, request: pg_info(1 pgs e194300:24.3f1) v4
   -90> 2014-08-12 17:43:21.590113 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.290 192.168.1.109:6808/31607 16 ==== pg_info(1 pgs e194300:24.119) v4 ==== 748+0+0 (847792910 0 0) 0x6039a40 con 0x40c9760
   -89> 2014-08-12 17:43:21.590110 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.395 192.168.1.105:6810/41287 -- pg_info(1 pgs e194300:24.354) v4 -- ?+0 0x6039880
   -88> 2014-08-12 17:43:21.590131 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.589950, event: header_read, request: pg_info(1 pgs e194300:24.119) v4
   -87> 2014-08-12 17:43:21.590141 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.589953, event: throttled, request: pg_info(1 pgs e194300:24.119) v4
   -86> 2014-08-12 17:43:21.590148 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.590035, event: all_read, request: pg_info(1 pgs e194300:24.119) v4
   -85> 2014-08-12 17:43:21.590154 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.590129, event: dispatched, request: pg_info(1 pgs e194300:24.119) v4
   -84> 2014-08-12 17:43:21.590160 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.590159, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.119) v4
   -83> 2014-08-12 17:43:21.590167 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.590167, event: started, request: pg_info(1 pgs e194300:24.119) v4
   -82> 2014-08-12 17:43:21.590186 7f4c65145700  5 -- op tracker -- , seq: 651, time: 2014-08-12 17:43:21.590186, event: done, request: pg_info(1 pgs e194300:24.119) v4
   -81> 2014-08-12 17:43:21.590199 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.18 192.168.1.107:6805/52903 14 ==== pg_info(1 pgs e194300:24.1c4) v4 ==== 748+0+0 (660960379 0 0) 0x603db00 con 0x40cdac0
   -80> 2014-08-12 17:43:21.590213 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.589972, event: header_read, request: pg_info(1 pgs e194300:24.1c4) v4
   -79> 2014-08-12 17:43:21.590219 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.589974, event: throttled, request: pg_info(1 pgs e194300:24.1c4) v4
   -78> 2014-08-12 17:43:21.590225 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.590060, event: all_read, request: pg_info(1 pgs e194300:24.1c4) v4
   -77> 2014-08-12 17:43:21.590230 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.590211, event: dispatched, request: pg_info(1 pgs e194300:24.1c4) v4
   -76> 2014-08-12 17:43:21.590240 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.590240, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.1c4) v4
   -75> 2014-08-12 17:43:21.590253 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.590252, event: started, request: pg_info(1 pgs e194300:24.1c4) v4
   -74> 2014-08-12 17:43:21.590266 7f4c69f68700  1 -- 192.168.1.101:6800/51417 --> osd.308 192.168.1.113:6802/14403 -- pg_info(1 pgs e194300:24.2dc) v4 -- ?+0 0x603c440
   -73> 2014-08-12 17:43:21.590274 7f4c65145700  5 -- op tracker -- , seq: 652, time: 2014-08-12 17:43:21.590274, event: done, request: pg_info(1 pgs e194300:24.1c4) v4
   -72> 2014-08-12 17:43:21.590293 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.313 192.168.1.109:6820/35445 11 ==== pg_info(1 pgs e194300:24.13b) v4 ==== 748+0+0 (1395045469 0 0) 0x6038540 con 0x40ca680
   -71> 2014-08-12 17:43:21.590308 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590201, event: header_read, request: pg_info(1 pgs e194300:24.13b) v4
   -70> 2014-08-12 17:43:21.590315 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590203, event: throttled, request: pg_info(1 pgs e194300:24.13b) v4
   -69> 2014-08-12 17:43:21.590320 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590275, event: all_read, request: pg_info(1 pgs e194300:24.13b) v4
   -68> 2014-08-12 17:43:21.590326 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590306, event: dispatched, request: pg_info(1 pgs e194300:24.13b) v4
   -67> 2014-08-12 17:43:21.590332 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590331, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.13b) v4
   -66> 2014-08-12 17:43:21.590343 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590343, event: started, request: pg_info(1 pgs e194300:24.13b) v4
   -65> 2014-08-12 17:43:21.590366 7f4c65145700  5 -- op tracker -- , seq: 653, time: 2014-08-12 17:43:21.590366, event: done, request: pg_info(1 pgs e194300:24.13b) v4
   -64> 2014-08-12 17:43:21.590795 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.342 192.168.1.110:6802/41438 7 ==== pg_info(1 pgs e194300:24.36d) v4 ==== 748+0+0 (330703328 0 0) 0x2aec0c0 con 0x41d3860
   -63> 2014-08-12 17:43:21.590820 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590687, event: header_read, request: pg_info(1 pgs e194300:24.36d) v4
   -62> 2014-08-12 17:43:21.590832 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590689, event: throttled, request: pg_info(1 pgs e194300:24.36d) v4
   -61> 2014-08-12 17:43:21.590841 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590773, event: all_read, request: pg_info(1 pgs e194300:24.36d) v4
   -60> 2014-08-12 17:43:21.590850 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590817, event: dispatched, request: pg_info(1 pgs e194300:24.36d) v4
   -59> 2014-08-12 17:43:21.590859 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590859, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.36d) v4
   -58> 2014-08-12 17:43:21.590871 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590871, event: started, request: pg_info(1 pgs e194300:24.36d) v4
   -57> 2014-08-12 17:43:21.590900 7f4c65145700  5 -- op tracker -- , seq: 654, time: 2014-08-12 17:43:21.590900, event: done, request: pg_info(1 pgs e194300:24.36d) v4
   -56> 2014-08-12 17:43:21.591093 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.313 192.168.1.109:6820/35445 12 ==== pg_info(1 pgs e194300:24.13b) v4 ==== 748+0+0 (2101955393 0 0) 0x6d82840 con 0x40ca680
   -55> 2014-08-12 17:43:21.591136 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.590938, event: header_read, request: pg_info(1 pgs e194300:24.13b) v4
   -54> 2014-08-12 17:43:21.591152 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.590940, event: throttled, request: pg_info(1 pgs e194300:24.13b) v4
   -53> 2014-08-12 17:43:21.591162 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.591008, event: all_read, request: pg_info(1 pgs e194300:24.13b) v4
   -52> 2014-08-12 17:43:21.591171 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.591130, event: dispatched, request: pg_info(1 pgs e194300:24.13b) v4
   -51> 2014-08-12 17:43:21.591187 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.591187, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.13b) v4
   -50> 2014-08-12 17:43:21.591200 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.591200, event: started, request: pg_info(1 pgs e194300:24.13b) v4
   -49> 2014-08-12 17:43:21.591232 7f4c65145700  5 -- op tracker -- , seq: 655, time: 2014-08-12 17:43:21.591231, event: done, request: pg_info(1 pgs e194300:24.13b) v4
   -48> 2014-08-12 17:43:21.591256 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.395 192.168.1.105:6810/41287 7 ==== pg_info(1 pgs e194300:24.354) v4 ==== 748+0+0 (4289877745 0 0) 0x6039880 con 0x40cf0c0
   -47> 2014-08-12 17:43:21.591273 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.590969, event: header_read, request: pg_info(1 pgs e194300:24.354) v4
   -46> 2014-08-12 17:43:21.591280 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.590972, event: throttled, request: pg_info(1 pgs e194300:24.354) v4
   -45> 2014-08-12 17:43:21.591285 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.591052, event: all_read, request: pg_info(1 pgs e194300:24.354) v4
   -44> 2014-08-12 17:43:21.591291 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.591270, event: dispatched, request: pg_info(1 pgs e194300:24.354) v4
   -43> 2014-08-12 17:43:21.591297 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.591296, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.354) v4
   -42> 2014-08-12 17:43:21.591304 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.591304, event: started, request: pg_info(1 pgs e194300:24.354) v4
   -41> 2014-08-12 17:43:21.591324 7f4c65145700  5 -- op tracker -- , seq: 656, time: 2014-08-12 17:43:21.591324, event: done, request: pg_info(1 pgs e194300:24.354) v4
   -40> 2014-08-12 17:43:21.591344 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.308 192.168.1.113:6802/14403 11 ==== pg_info(1 pgs e194300:24.2dc) v4 ==== 748+0+0 (1765976911 0 0) 0x603d240 con 0x41d0f20
   -39> 2014-08-12 17:43:21.591365 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591016, event: header_read, request: pg_info(1 pgs e194300:24.2dc) v4
   -38> 2014-08-12 17:43:21.591373 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591019, event: throttled, request: pg_info(1 pgs e194300:24.2dc) v4
   -37> 2014-08-12 17:43:21.591378 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591104, event: all_read, request: pg_info(1 pgs e194300:24.2dc) v4
   -36> 2014-08-12 17:43:21.591384 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591363, event: dispatched, request: pg_info(1 pgs e194300:24.2dc) v4
   -35> 2014-08-12 17:43:21.591390 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591390, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.2dc) v4
   -34> 2014-08-12 17:43:21.591397 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591397, event: started, request: pg_info(1 pgs e194300:24.2dc) v4
   -33> 2014-08-12 17:43:21.591416 7f4c65145700  5 -- op tracker -- , seq: 657, time: 2014-08-12 17:43:21.591416, event: done, request: pg_info(1 pgs e194300:24.2dc) v4
   -32> 2014-08-12 17:43:21.591430 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.290 192.168.1.109:6808/31607 17 ==== pg_info(1 pgs e194300:24.119) v4 ==== 748+0+0 (3829287638 0 0) 0x629dcc0 con 0x40c9760
   -31> 2014-08-12 17:43:21.591445 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591217, event: header_read, request: pg_info(1 pgs e194300:24.119) v4
   -30> 2014-08-12 17:43:21.591451 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591220, event: throttled, request: pg_info(1 pgs e194300:24.119) v4
   -29> 2014-08-12 17:43:21.591457 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591314, event: all_read, request: pg_info(1 pgs e194300:24.119) v4
   -28> 2014-08-12 17:43:21.591462 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591443, event: dispatched, request: pg_info(1 pgs e194300:24.119) v4
   -27> 2014-08-12 17:43:21.591468 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591468, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.119) v4
   -26> 2014-08-12 17:43:21.591475 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591475, event: started, request: pg_info(1 pgs e194300:24.119) v4
   -25> 2014-08-12 17:43:21.591494 7f4c65145700  5 -- op tracker -- , seq: 658, time: 2014-08-12 17:43:21.591494, event: done, request: pg_info(1 pgs e194300:24.119) v4
   -24> 2014-08-12 17:43:21.591509 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.395 192.168.1.105:6810/41287 8 ==== pg_info(1 pgs e194300:24.354) v4 ==== 748+0+0 (4226165281 0 0) 0x6039500 con 0x40cf0c0
   -23> 2014-08-12 17:43:21.591528 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591413, event: header_read, request: pg_info(1 pgs e194300:24.354) v4
   -22> 2014-08-12 17:43:21.591538 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591416, event: throttled, request: pg_info(1 pgs e194300:24.354) v4
   -21> 2014-08-12 17:43:21.591548 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591482, event: all_read, request: pg_info(1 pgs e194300:24.354) v4
   -20> 2014-08-12 17:43:21.591557 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591526, event: dispatched, request: pg_info(1 pgs e194300:24.354) v4
   -19> 2014-08-12 17:43:21.591566 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591566, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.354) v4
   -18> 2014-08-12 17:43:21.591577 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591577, event: started, request: pg_info(1 pgs e194300:24.354) v4
   -17> 2014-08-12 17:43:21.591604 7f4c65145700  5 -- op tracker -- , seq: 659, time: 2014-08-12 17:43:21.591604, event: done, request: pg_info(1 pgs e194300:24.354) v4
   -16> 2014-08-12 17:43:21.592053 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.308 192.168.1.113:6802/14403 12 ==== pg_info(1 pgs e194300:24.2dc) v4 ==== 748+0+0 (2691013303 0 0) 0x6039340 con 0x41d0f20
   -15> 2014-08-12 17:43:21.592096 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.591898, event: header_read, request: pg_info(1 pgs e194300:24.2dc) v4
   -14> 2014-08-12 17:43:21.592112 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.591901, event: throttled, request: pg_info(1 pgs e194300:24.2dc) v4
   -13> 2014-08-12 17:43:21.592122 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.591987, event: all_read, request: pg_info(1 pgs e194300:24.2dc) v4
   -12> 2014-08-12 17:43:21.592131 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.592090, event: dispatched, request: pg_info(1 pgs e194300:24.2dc) v4
   -11> 2014-08-12 17:43:21.592142 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.592142, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.2dc) v4
   -10> 2014-08-12 17:43:21.592155 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.592155, event: started, request: pg_info(1 pgs e194300:24.2dc) v4
    -9> 2014-08-12 17:43:21.592191 7f4c65145700  5 -- op tracker -- , seq: 660, time: 2014-08-12 17:43:21.592191, event: done, request: pg_info(1 pgs e194300:24.2dc) v4
    -8> 2014-08-12 17:43:21.592226 7f4c65145700  1 -- 192.168.1.101:6800/51417 <== osd.342 192.168.1.110:6802/41438 8 ==== pg_info(1 pgs e194300:24.36d) v4 ==== 748+0+0 (2521894262 0 0) 0x629bb80 con 0x41d3860
    -7> 2014-08-12 17:43:21.592247 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.591904, event: header_read, request: pg_info(1 pgs e194300:24.36d) v4
    -6> 2014-08-12 17:43:21.592258 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.591906, event: throttled, request: pg_info(1 pgs e194300:24.36d) v4
    -5> 2014-08-12 17:43:21.592267 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.591989, event: all_read, request: pg_info(1 pgs e194300:24.36d) v4
    -4> 2014-08-12 17:43:21.592275 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.592245, event: dispatched, request: pg_info(1 pgs e194300:24.36d) v4
    -3> 2014-08-12 17:43:21.592285 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.592285, event: waiting_for_osdmap, request: pg_info(1 pgs e194300:24.36d) v4
    -2> 2014-08-12 17:43:21.592296 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.592296, event: started, request: pg_info(1 pgs e194300:24.36d) v4
    -1> 2014-08-12 17:43:21.592322 7f4c65145700  5 -- op tracker -- , seq: 661, time: 2014-08-12 17:43:21.592322, event: done, request: pg_info(1 pgs e194300:24.36d) v4
     0> 2014-08-12 17:43:21.593174 7f4c5d939700 -1 *** Caught signal (Floating point exception) **
 in thread 7f4c5d939700

 ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
 1: /usr/bin/ceph-osd() [0x9ae061]
 2: (()+0xf500) [0x7f4c77a51500]
 3: (ReplicatedPG::agent_choose_mode(bool)+0xa6e) [0x84f57e]
 4: (ReplicatedPG::agent_setup()+0x1d2) [0x866d72]
 5: (ReplicatedPG::on_activate()+0x457) [0x8677a7]
 6: (PG::RecoveryState::Active::react(PG::AllReplicasActivated const&)+0x73) [0x7dd173]
 7: (boost::statechart::detail::reaction_result boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, PG::RecoveryState::Activating, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list<boost::statechart::custom_reaction<PG::QueryState>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::MInfoRec>, boost::statechart::custom_reaction<PG::MNotifyRec>, boost::statechart::custom_reaction<PG::MLogRec>, boost::statechart::custom_reaction<PG::Backfilled>, boost::statechart::custom_reaction<PG::AllReplicasActivated>, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, PG::RecoveryState::Activating, (boost::statechart::history_mode)0> >(boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, PG::RecoveryState::Activating, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0x36c) [0x844fec]
 8: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, PG::RecoveryState::Activating, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x26) [0x845066]
 9: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x4b) [0x83b4db]
 10: (PG::handle_peering_event(std::tr1::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x32f) [0x7f756f]
 11: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x2f2) [0x6475b2]
 12: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x16) [0x69e686]
 13: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0xab8561]
 14: (ThreadPool::WorkThread::entry()+0x10) [0xabb5a0]
 15: (()+0x7851) [0x7f4c77a49851]
 16: (clone()+0x6d) [0x7f4c769da94d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.306.log
--- end dump of recent events ---


Files

ceph-osd.4.log (42.1 MB) ceph-osd.4.log karan singh, 08/13/2014 08:24 AM
ceph-osd.325.log.zip (6.43 MB) ceph-osd.325.log.zip karan singh, 08/13/2014 08:24 AM
Actions #1

Updated by karan singh over 9 years ago

I have sent one email to ceph mailing list today , which is related to problem with Ceph pool .

Hello Developers 

I have encountered some wired output of ceph df command , suddenly

When i was writing some data on cache-pool , and checked its used % , i found some used as 8E ( don’t know what is this ) and the used % for cache-pool was 0

# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    1301T     1284T     17518G       1.31
POOLS:
    NAME                   ID     USED       %USED     OBJECTS
    data                   0      801M       0         2
    metadata               1      801M       0         22
    rbd                    2      0          0         0
    .rgw                   3      3524       0         26
    .rgw.root              4      778        0         3
    .rgw.control           5      0          0         8
    .rgw.buckets           6      8201M      0         2298
    .rgw.buckets.index     7      0          0         13
    .users.swift           8      7          0         1
    volumes                9      1106G      0.08      283387
    images                 10     40960k     0         8
    backups                11     0          0         0
    .rgw.gc                12     0          0         32
    .users.uid             13     848        0         5
    .users                 14     16         0         2
    .log                   15     153k       0         37
                           16     0          0         0
    hpsl4540               21     110G       0         28152
    hpdl380                22     245G       0.02      62688
    EC-2-2                 23     6338G      0.48      4859
    cache-pool             24     8E         0         5849                   ## What is the meaning of E here , also please note used % for cache-pool is 0 here
    ssd                    25     25196M     0         5464

After some time when cache-poo used value changed to 7E and used % as 644301.19  . While there were no objects in the cache-pool

# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    1301T     1284T     17508G       1.31
POOLS:
    NAME                   ID     USED       %USED         OBJECTS
    data                   0      801M       0             2
    metadata               1      801M       0             22
    rbd                    2      0          0             0
    .rgw                   3      3524       0             26
    .rgw.root              4      778        0             3
    .rgw.control           5      0          0             8
    .rgw.buckets           6      8201M      0             2298
    .rgw.buckets.index     7      0          0             13
    .users.swift           8      7          0             1
    volumes                9      1106G      0.08          283387
    images                 10     40960k     0             8
    backups                11     0          0             0
    .rgw.gc                12     0          0             32
    .users.uid             13     848        0             5
    .users                 14     16         0             2
    .log                   15     153k       0             37
                           16     0          0             0
    hpsl4540               21     110G       0             28152
    hpdl380                22     245G       0.02          62688
    EC-2-2                 23     6338G      0.48          4843
    cache-pool             24     7E         644301.19     1056                   ## The used % for cache-pool has become 644301.19 
    ssd                    25     25196M     0             5464
#

# rados -p cache-pool ls                         
#

# ceph -s
    cluster 009d3518-e60d-4f74-a26d-c08c1976263c
     health HEALTH_WARN 'cache-pool' at/near target max
     monmap e3: 3 mons at 
     mdsmap e14: 1/1/1 up {0=storage0101-ib=up:active}
     osdmap e194215: 402 osds: 402 up, 402 in
      pgmap v743051: 31168 pgs, 22 pools, 8 EB data, 378 kobjects                  ## What is EB here ? 
            17508 GB used, 1284 TB / 1301 TB avail
               31168 active+clean

I think both of these problems are related.

Actions #2

Updated by Sage Weil over 9 years ago

  • Tags deleted (must scrub before tier agent can activate)
Actions #3

Updated by Sage Weil over 9 years ago

  • Status changed from New to 12
  • Assignee set to Sage Weil

can you reproduce this with debug osd = 20 and attach teh log? thanks!

Actions #4

Updated by karan singh over 9 years ago

I have injected debus osd 20 to one OSD , and and then tried to initiate rados bench on EC pool which is tired with cache pool

Sorry for the long output ,i don't have pastbin pro account

# tail -f ceph-osd.306.log
2014-08-12 20:52:56.932752 7f104f812700 10 osd.306 pg_epoch: 194393 pg[22.7b2( v 194215'472 lc 0'0 (0'0,194215'472] local-les=194365 n=28 ec=183096 les/c 194365/194303 194381/194382/192970) [16,325,346] r=-1 lpr=194384 pi=194253-194381/17 crt=194215'472 inactive NOTIFY m=28] handle_peering_event: epoch_sent: 194393 epoch_requested: 194393 FlushedEvt
2014-08-12 20:52:56.932776 7f104f812700 15 osd.306 pg_epoch: 194393 pg[22.7b2( v 194215'472 lc 0'0 (0'0,194215'472] local-les=194365 n=28 ec=183096 les/c 194365/194303 194381/194382/192970) [16,325,346] r=-1 lpr=194384 pi=194253-194381/17 crt=194215'472 inactive NOTIFY m=28]  requeue_ops
2014-08-12 20:52:56.932784 7f104f812700 10 log is not dirty
2014-08-12 20:52:56.932797 7f106a9997a0  0 osd.306 194393 done with init, starting boot process
2014-08-12 20:52:56.932804 7f106a9997a0 10 osd.306 194393 start_boot - have maps 185139..194393
2014-08-12 20:52:56.932832 7f103e967700 10 osd.306 194393 OSD::ms_get_authorizer type=osd
2014-08-12 20:52:56.933769 7f1050c14700 10 osd.306 194393 _maybe_boot mon has osdmaps 185139..194393
2014-08-12 20:52:56.933806 7f1050c14700 10 osd.306 194393 _send_boot
2014-08-12 20:52:56.933817 7f1050c14700 10 osd.306 194393  client_addr 192.168.100.101:6801/6812, cluster_addr 192.168.1.101:6806/6812, hb_back_addr 192.168.1.101:6807/6812, hb_front_addr 192.168.100.101:6805/6812
2014-08-12 20:52:56.934168 7f1050c14700 10 osd.306 194393 _collect_metadata {arch=x86_64,back_addr=192.168.1.101:6806/6812,ceph_version=ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6),cpu=Intel(R) Xeon(R) CPU E5-2450 0 @ 2.10GHz,front_addr=192.168.100.101:6801/6812,hb_back_addr=192.168.1.101:6807/6812,hb_front_addr=192.168.100.101:6805/6812,hostname=storage0101-ib.pocstor,kernel_description=#1 SMP Wed Oct 16 18:37:12 UTC 2013,kernel_version=2.6.32-358.23.2.el6.x86_64,mem_swap_kb=23446856,mem_total_kb=24568036,os=Linux,osd_data=/var/lib/ceph/osd/ceph-306,osd_journal=/var/lib/ceph/osd/ceph-306/journal}
2014-08-12 20:52:57.476438 7f1058420700  1 osd.306 194393 ms_handle_reset con 0x47e4200 session 0
2014-08-12 20:52:57.476857 7f1058420700  1 osd.306 194393 ms_handle_reset con 0x47e4360 session 0
2014-08-12 20:52:57.677447 7f1058420700  1 osd.306 194393 ms_handle_reset con 0x47e4780 session 0
2014-08-12 20:52:58.773621 7f1058420700 10 osd.306 194393 do_waiters -- start
2014-08-12 20:52:58.773627 7f1058420700 10 osd.306 194393 do_waiters -- finish
2014-08-12 20:52:58.773631 7f1058420700 20 osd.306 194393 _dispatch 0x2eb1b00 osd_map(194394..194395 src has 185139..194395) v3
2014-08-12 20:52:58.773747 7f1058420700  3 osd.306 194393 handle_osd_map epochs [194394,194395], i have 194393, src has [185139,194395]
2014-08-12 20:52:58.773759 7f1058420700 10 osd.306 194393 handle_osd_map  got inc map for epoch 194394
2014-08-12 20:52:58.779318 7f1058420700 10 osd.306 194393 handle_osd_map  got inc map for epoch 194395
2014-08-12 20:52:58.784072 7f103d65d700 10 osd.306 194393  new session 0x3078e00 con=0x4ef02c0 addr=192.168.1.110:6802/41438
2014-08-12 20:52:58.784229 7f103d65d700 10 osd.306 194393  session 0x3078e00 osd.342 has caps osdcap[grant(*)] 'allow *'
2014-08-12 20:52:58.785210 7f1058420700 10 osd.306 194393  advance to epoch 194394 (<= newest 194395)
2014-08-12 20:52:58.785313 7f1058420700  7 osd.306 194394 advance_map epoch 194394  78 pgs
2014-08-12 20:52:58.785342 7f1058420700 10 osd.306 194394  advance to epoch 194395 (<= newest 194395)
2014-08-12 20:52:58.785420 7f1058420700  7 osd.306 194395 advance_map epoch 194395  78 pgs
2014-08-12 20:52:58.785430 7f1058420700 10 osd.306 194395 up_epoch is 194395
2014-08-12 20:52:58.785433 7f1058420700 10 osd.306 194395 boot_epoch is 194395
2014-08-12 20:52:58.785459 7f1058420700  1 osd.306 194395 state: booting -> active
2014-08-12 20:52:58.785468 7f1058420700 10 osd.306 194395 write_superblock sb(009d3518-e60d-4f74-a26d-c08c1976263c osd.306 ca1d18c4-a29f-4678-9111-25fcf2fc8840 e194395 [185139,194395] lci=[194395,194395])
2014-08-12 20:52:58.785579 7f1058420700  0 osd.306 194395 crush map has features 2578087936000, adjusting msgr requires for mons
2014-08-12 20:52:58.785589 7f1058420700  7 osd.306 194395 consume_map version 194395
2014-08-12 20:52:58.785656 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.785715 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] null
2014-08-12 20:52:58.785783 7f1050213700 10 osd.306 pg_epoch: 194393 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] handle_advance_map [32,356]/[32,356] -- 32/32
2014-08-12 20:52:58.785832 7f1050213700 20 PGPool::update cached_removed_snaps [] newly_removed_snaps [] snapc 0=[] (no change)
2014-08-12 20:52:58.785812 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.30f( v 194383'14664 lc 194370'14635 (192718'11887,194383'14664] local-les=194392 n=20 ec=192688 les/c 194385/194380 194391/194391/194391) [306,308,62] r=0 lpr=194392 pi=194365-194390/8 crt=194383'14664 lcod 0'0 mlcod 0'0 peering m=15] null
2014-08-12 20:52:58.785847 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Started>: Started advmap
2014-08-12 20:52:58.785848 7f104f812700 10 osd.306 pg_epoch: 194393 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] handle_advance_map [342,20]/[342,20] -- 342/342
2014-08-12 20:52:58.785868 7f1050213700 20 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] acting_up_affected newup [32,356] newacting [32,356]
2014-08-12 20:52:58.785894 7f104f812700 20 PGPool::update cached_removed_snaps [] newly_removed_snaps [] snapc 0=[] (no change)
2014-08-12 20:52:58.785884 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Started>: up or acting affected, transitioning to Reset
2014-08-12 20:52:58.785906 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] state<Reset>: Reset advmap
2014-08-12 20:52:58.785911 7f1050213700  5 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] exit Started/Stray 1.885356 3 0.000195
2014-08-12 20:52:58.785926 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] _calc_past_interval_range: already have past intervals back to 194379
2014-08-12 20:52:58.785919 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.2ea( v 194215'468 (0'0,194215'468] local-les=194392 n=23 ec=183096 les/c 194392/194392 194391/194391/192951) [324,60,306] r=2 lpr=194392 pi=192975-194390/23 crt=192888'430 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.785938 7f1050213700  5 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] exit Started 1.885446 0 0.000000
2014-08-12 20:52:58.785944 7f104f812700 20 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] acting_up_affected newup [342,20] newacting [342,20]
2014-08-12 20:52:58.785952 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.604( v 194215'480 (0'0,194215'480] local-les=194392 n=38 ec=183096 les/c 194392/194392 194391/194391/192969) [29,306,329] r=1 lpr=194392 pi=192978-194390/23 crt=192888'446 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.785960 7f1050213700  5 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] enter Reset
2014-08-12 20:52:58.785963 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] state<Reset>: up or acting affected, calling start_peering_interval again
2014-08-12 20:52:58.785977 7f1050213700 20 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194392 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] set_last_peering_reset 194394
2014-08-12 20:52:58.785976 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.4ac( v 194215'540 lc 0'0 (0'0,194215'540] local-les=194393 n=33 ec=183096 les/c 194389/194347 194391/194392/194392) [306,290,376] r=0 lpr=194393 pi=194346-194391/11 crt=194215'540 mlcod 0'0 inactive m=33] null
2014-08-12 20:52:58.785988 7f104f812700 20 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,306,20] r=1 lpr=194393 pi=193132-194390/27 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] set_last_peering_reset 194394
2014-08-12 20:52:58.785998 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194394 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Reset>: Reset advmap
2014-08-12 20:52:58.786005 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.119( v 194393'15329 lc 194370'15288 (192834'13554,194393'15329] local-les=194392 n=22 ec=192688 les/c 194392/194379 194391/194391/192969) [290,306,64] r=1 lpr=194393 pi=193125-194390/27 crt=194383'15323 lcod 0'0 inactive NOTIFY m=18] null
2014-08-12 20:52:58.786016 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194394 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] _calc_past_interval_range: already have past intervals back to 194392
2014-08-12 20:52:58.786028 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.31( v 194215'511 (0'0,194215'511] local-les=194392 n=34 ec=183096 les/c 194392/194392 194391/194391/194391) [306,22,341] r=0 lpr=194392 crt=0'0 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786032 7f1050213700 20 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194394 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] acting_up_affected newup [32,356] newacting [32,356]
2014-08-12 20:52:58.786037 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] start_peering_interval: check_new_interval output: generate_past_intervals interval(194391-194393 up [342,306,20](342) acting [342,306,20](342)): not rw, up_thru 194391 up_from 192964 last_epoch_clean 194379
generate_past_intervals interval(194391-194393 up [342,306,20](342) acting [342,306,20](342) maybe_went_rw) : primary up 192964-194391 includes interval

2014-08-12 20:52:58.786048 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194394 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Reset>: up or acting affected, calling start_peering_interval again
2014-08-12 20:52:58.786049 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.fd( v 194215'516 (0'0,194215'516] local-les=194392 n=19 ec=183096 les/c 194392/194392 194391/194391/192967) [46,12,306] r=2 lpr=194392 pi=192975-194390/23 crt=192888'470 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786065 7f1050213700 20 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,306,356] r=1 lpr=194394 pi=189940-194390/35 crt=191727'432 lcod 0'0 inactive NOTIFY] set_last_peering_reset 194394
2014-08-12 20:52:58.786063 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  noting past interval(194391-194393 up [342,306,20](342) acting [342,306,20](342) maybe_went_rw)
2014-08-12 20:52:58.786073 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.f8( v 194215'491 (0'0,194215'491] local-les=194392 n=35 ec=183096 les/c 194392/194392 194391/194391/194391) [306,55,64] r=0 lpr=194392 crt=191648'362 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786086 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  up [342,306,20] -> [342,20], acting [342,306,20] -> [342,20], acting_primary 342 -> 342, up_primary 342 -> 342, role 1 -> -1
2014-08-12 20:52:58.786096 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.414( v 194215'515 lc 0'0 (0'0,194215'515] local-les=194392 n=33 ec=183096 les/c 194389/194347 194391/194391/194391) [306,22,359] r=0 lpr=194392 pi=192971-194390/22 crt=194215'515 mlcod 0'0 peering m=33] null
2014-08-12 20:52:58.786103 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] start_peering_interval: check_new_interval output: generate_past_intervals interval(194391-194393 up [32,306,356](32) acting [32,306,356](32)): not rw, up_thru 194391 up_from 192973 last_epoch_clean 194392
generate_past_intervals interval(194391-194393 up [32,306,356](32) acting [32,306,356](32) maybe_went_rw) : primary up 192973-194391 includes interval

2014-08-12 20:52:58.786112 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_change
2014-08-12 20:52:58.786120 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.2bc( v 194215'505 (0'0,194215'505] local-les=194392 n=38 ec=183096 les/c 194392/194392 194391/194391/194391) [306,336,64] r=0 lpr=194392 crt=191727'451 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786124 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194391/194391/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY]  noting past interval(194391-194393 up [32,306,356](32) acting [32,306,356](32) maybe_went_rw)
2014-08-12 20:52:58.786138 7f104f812700 15 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] publish_stats_to_osd -- not primary
2014-08-12 20:52:58.786140 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.3f1( v 194377'11439 lc 194370'11404 (192834'9669,194377'11439] local-les=194392 n=22 ec=192688 les/c 194392/194338 194391/194391/193177) [302,306,51]/[302,306] r=1 lpr=194392 pi=193177-194390/31 crt=194377'11439 lcod 0'0 inactive NOTIFY m=18] null
2014-08-12 20:52:58.786142 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY]  up [32,306,356] -> [32,356], acting [32,306,356] -> [32,356], acting_primary 32 -> 32, up_primary 32 -> 32, role 1 -> -1
2014-08-12 20:52:58.786157 7f104f812700 15 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  requeue_ops
2014-08-12 20:52:58.786163 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] on_change
2014-08-12 20:52:58.786168 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.331( v 194215'508 (0'0,194215'508] local-les=194392 n=31 ec=183096 les/c 194392/194392 194391/194391/192972) [344,21,306] r=2 lpr=194392 pi=192976-194390/23 crt=192888'478 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786180 7f1050213700 15 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] publish_stats_to_osd -- not primary
2014-08-12 20:52:58.786187 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] cancel_copy_ops
2014-08-12 20:52:58.786190 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.176( v 194383'17925 lc 194370'17886 (192834'15843,194383'17925] local-les=194392 n=36 ec=192688 les/c 194385/194379 194391/194391/194391) [306,358,356] r=0 lpr=194392 pi=194365-194390/7 crt=194383'17925 lcod 0'0 mlcod 0'0 peering m=20] null
2014-08-12 20:52:58.786198 7f1050213700 15 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY]  requeue_ops
2014-08-12 20:52:58.786210 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] cancel_flush_ops
2014-08-12 20:52:58.786217 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] cancel_copy_ops
2014-08-12 20:52:58.786216 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.254( v 194383'17519 lc 194370'17472 (192803'14472,194383'17519] local-les=194392 n=15 ec=192688 les/c 194385/194380 194391/194391/194391) [306,289,6] r=0 lpr=194392 pi=194365-194390/8 crt=194383'17519 lcod 0'0 mlcod 0'0 peering m=24] null
2014-08-12 20:52:58.786233 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] cancel_flush_ops
2014-08-12 20:52:58.786241 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.4dd( v 194215'479 (0'0,194215'479] local-les=194392 n=32 ec=183096 les/c 194392/194392 194391/194391/192973) [55,336,306] r=2 lpr=194392 pi=192974-194390/23 crt=192888'445 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786236 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_change_cleanup
2014-08-12 20:52:58.786250 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] on_change_cleanup
2014-08-12 20:52:58.786263 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_change
2014-08-12 20:52:58.786263 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.2a9( v 194383'12890 lc 194370'12833 (192834'10938,194383'12890] local-les=194392 n=33 ec=192688 les/c 194385/194379 194391/194391/194391) [306,24,61] r=0 lpr=194392 pi=194365-194390/7 crt=194383'12890 lcod 0'0 mlcod 0'0 peering m=29] null
2014-08-12 20:52:58.786267 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] on_change
2014-08-12 20:52:58.786289 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.104( v 194215'495 (0'0,194215'495] local-les=194392 n=38 ec=183096 les/c 194392/194392 194391/194391/194391) [306,52,367] r=0 lpr=194392 crt=191648'337 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786307 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.1c4( v 194393'19217 lc 194370'19162 (192834'17432,194393'19217] local-les=194392 n=30 ec=192688 les/c 194392/194379 194391/194391/192966) [18,306,351] r=1 lpr=194393 pi=193125-194390/27 crt=194383'19209 lcod 0'0 inactive NOTIFY m=24] null
2014-08-12 20:52:58.786470 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.6f9( v 194215'484 (0'0,194215'484] local-les=194392 n=34 ec=183096 les/c 194392/194392 194391/194391/194391) [306,53,21] r=0 lpr=194392 crt=191727'436 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786490 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.71d( v 194215'485 (0'0,194215'485] local-les=194392 n=30 ec=183096 les/c 194392/194392 194391/194391/192975) [297,306,379] r=1 lpr=194392 pi=192975-194390/23 crt=192888'465 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786515 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.104( v 194393'15336 lc 194370'15299 (192718'12373,194393'15336] local-les=194392 n=30 ec=192688 les/c 194392/194379 194391/194391/194391) [306,49,292] r=0 lpr=194393 pi=194365-194390/7 crt=194383'15332 lcod 0'0 mlcod 0'0 inactive m=17] null
2014-08-12 20:52:58.786536 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.58( v 194393'46 lc 194370'1 (0'0,194393'46] local-les=194392 n=23 ec=192688 les/c 194392/194310 194391/194391/194391) [306,352,344]/[306,333,334] r=0 lpr=194393 pi=193813-194390/43 crt=194383'42 lcod 0'0 mlcod 0'0 inactive m=21] null
2014-08-12 20:52:58.786556 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.b4( v 194215'534 (0'0,194215'534] local-les=194392 n=37 ec=183096 les/c 194392/194392 194391/194391/194391) [306,371,44] r=0 lpr=194392 crt=191648'375 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786573 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.5b( v 194393'15470 lc 194370'15435 (192718'12626,194393'15470] local-les=194392 n=19 ec=192688 les/c 194392/194328 194387/194391/194388) [15,300,44]/[24,306] r=1 lpr=194393 pi=193125-194390/27 crt=194377'15468 lcod 0'0 inactive NOTIFY m=17] null
2014-08-12 20:52:58.786592 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.5ae( v 194215'506 (0'0,194215'506] local-les=194392 n=22 ec=183096 les/c 194392/194392 194391/194391/192964) [369,306,378] r=1 lpr=194392 pi=192971-194390/23 crt=192888'466 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786610 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.3e9( v 194215'501 (0'0,194215'501] local-les=194392 n=35 ec=183096 les/c 194392/194392 194391/194391/194391) [306,290,295] r=0 lpr=194392 crt=191727'417 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786627 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.13b( v 194383'12391 lc 194370'12354 (192718'9400,194383'12391] local-les=194385 n=23 ec=192688 les/c 194385/194328 194391/194391/192981) [313,306,325] r=1 lpr=194391 pi=193125-194390/39 crt=194383'12391 lcod 0'0 inactive NOTIFY m=19] null
2014-08-12 20:52:58.786646 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.5e( v 194393'16962 (192718'14372,194393'16962] lb 0//0//-1 local-les=0 n=0 ec=192688 les/c 194392/194372 194391/194391/192968) [289,306,364]/[289,285] r=-1 lpr=194393 pi=194290-194390/19 crt=194393'16962 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786667 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.216( v 194215'532 (0'0,194215'532] local-les=194392 n=29 ec=183096 les/c 194392/194392 194391/194391/192964) [316,306,352] r=1 lpr=194392 pi=192973-194390/29 crt=192888'502 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786692 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.763( v 194215'484 (0'0,194215'484] local-les=194392 n=28 ec=183096 les/c 194392/194392 194391/194391/192965) [59,306,365] r=1 lpr=194392 pi=189945-194390/31 crt=191727'436 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786711 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.2dc( v 194393'16205 lc 194370'16164 (192834'14143,194393'16205] local-les=194392 n=26 ec=192688 les/c 194392/194379 194391/194391/193194) [308,17,306] r=2 lpr=194393 pi=193194-194390/27 crt=194383'16201 lcod 0'0 inactive NOTIFY m=19] null
2014-08-12 20:52:58.786730 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.4f7( v 194215'514 (0'0,194215'514] local-les=194392 n=34 ec=183096 les/c 194392/194392 194391/194391/192966) [62,364,306] r=2 lpr=194392 pi=192975-194390/23 crt=192888'466 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786747 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.62d( v 194215'491 (0'0,194215'491] local-les=194386 n=30 ec=183096 les/c 194386/194368 194391/194391/194382) [325,376,306] r=2 lpr=194391 pi=192971-194390/40 crt=192888'465 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786765 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.432( v 194215'481 (0'0,194215'481] local-les=194392 n=33 ec=183096 les/c 194392/194392 194391/194391/192966) [327,306,63] r=1 lpr=194392 pi=192972-194390/23 crt=192888'451 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786783 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.a5( v 194215'501 (0'0,194215'501] local-les=194392 n=30 ec=183096 les/c 194392/194392 194391/194391/194391) [306,380,297] r=0 lpr=194392 crt=191648'340 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786801 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.3ba( v 194215'519 (0'0,194215'519] local-les=194392 n=29 ec=183096 les/c 194392/194392 194391/194391/194391) [306,2,349] r=0 lpr=194392 crt=191727'445 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786818 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.45( v 194215'489 (0'0,194215'489] local-les=194392 n=29 ec=183096 les/c 194389/194235 194391/194391/194391) [306,292,63] r=0 lpr=194392 pi=193306-194390/38 crt=191648'354 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.786836 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.354( v 194380'17474 lc 194370'17431 (192803'14476,194380'17474] local-les=194392 n=1 ec=192688 les/c 194392/194379 194391/194391/193175) [395,290,306] r=2 lpr=194392 pi=193175-194390/27 crt=194380'17474 lcod 0'0 inactive NOTIFY m=22] null
2014-08-12 20:52:58.786855 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.1e5( v 194215'484 (0'0,194215'484] local-les=194392 n=35 ec=183096 les/c 194392/194392 194391/194391/192976) [360,372,306] r=2 lpr=194392 pi=192976-194390/23 crt=192888'460 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786879 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.127( v 194215'500 (0'0,194215'500] local-les=194392 n=34 ec=183096 les/c 194392/194392 194391/194391/192943) [337,306,64] r=1 lpr=194392 pi=192964-194390/23 crt=192888'464 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786896 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.795( v 194215'502 lc 0'0 (0'0,194215'502] local-les=194393 n=26 ec=183096 les/c 194393/194368 194391/194392/192967) [6,306,374] r=1 lpr=194393 pi=194270-194391/26 crt=194215'502 inactive NOTIFY m=26] null
2014-08-12 20:52:58.786914 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.5dc( v 194215'562 (0'0,194215'562] local-les=194392 n=35 ec=183096 les/c 194392/194392 194391/194391/192964) [347,306,7] r=1 lpr=194392 pi=189951-194390/34 crt=191727'496 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.786932 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.4f( v 194393'20596 lc 194370'20549 (192834'18884,194393'20596] local-les=194392 n=25 ec=192688 les/c 194392/194379 194391/194391/192977) [293,21,306] r=2 lpr=194393 pi=192688-194390/34 crt=194383'20588 lcod 0'0 inactive NOTIFY m=20] null
2014-08-12 20:52:58.786994 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.6bf( v 194215'506 (0'0,194215'506] local-les=194392 n=23 ec=183096 les/c 194392/194392 194391/194391/194391) [306,0,14] r=0 lpr=194392 crt=191727'436 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787014 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.3ad( v 194393'12465 lc 194370'12428 (192718'9872,194393'12465] local-les=194392 n=23 ec=192688 les/c 194392/194380 194391/194391/194391) [306,295,31] r=0 lpr=194393 pi=194321-194390/13 crt=194383'12461 lcod 0'0 mlcod 0'0 inactive m=17] null
2014-08-12 20:52:58.787033 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.44( v 194383'19322 lc 194370'19275 (192834'17118,194383'19322] local-les=194392 n=28 ec=192688 les/c 194392/194379 194391/194391/192970) [60,318,306] r=2 lpr=194392 pi=193125-194390/27 crt=194383'19322 lcod 0'0 inactive NOTIFY m=24] null
2014-08-12 20:52:58.787052 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.a8( v 194215'462 (0'0,194215'462] local-les=194392 n=26 ec=183096 les/c 194392/194392 194391/194391/192965) [380,306,337] r=1 lpr=194392 pi=192968-194390/23 crt=192888'444 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787070 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.567( v 194215'499 (0'0,194215'499] local-les=194392 n=37 ec=183096 les/c 194392/194392 194391/194391/192976) [284,375,306] r=2 lpr=194392 pi=191649-194390/33 crt=191727'435 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787087 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.12d( v 194215'506 (0'0,194215'506] local-les=194392 n=35 ec=183096 les/c 194392/194392 194391/194391/192979) [363,306,15] r=1 lpr=194392 pi=192984-194390/23 crt=192888'488 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787113 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.125( v 194393'11425 lc 194370'11362 (192718'8676,194393'11425] local-les=194392 n=34 ec=192688 les/c 194392/194379 194391/194391/194391) [306,339,397] r=0 lpr=194393 pi=194365-194390/7 crt=194383'11415 lcod 0'0 mlcod 0'0 inactive m=27] null
2014-08-12 20:52:58.787134 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.187( v 194383'19768 lc 194370'19729 (192718'16727,194383'19768] local-les=194392 n=30 ec=192688 les/c 194392/194379 194391/194391/192966) [26,12,306] r=2 lpr=194392 pi=193125-194390/27 crt=194383'19768 lcod 0'0 inactive NOTIFY m=20] null
2014-08-12 20:52:58.787153 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.481( v 194215'519 lc 0'0 (0'0,194215'519] local-les=194393 n=32 ec=183096 les/c 194393/194304 194391/194392/194388) [303,306,381] r=1 lpr=194393 pi=194237-194391/25 crt=194215'519 inactive NOTIFY m=32] null
2014-08-12 20:52:58.787170 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.2c7( v 194377'19986 lc 194370'19951 (192718'16949,194377'19986] local-les=194392 n=25 ec=192688 les/c 194385/194379 194391/194391/194391) [306,315,316] r=0 lpr=194392 pi=194365-194390/7 crt=194377'19986 lcod 0'0 mlcod 0'0 peering m=18] null
2014-08-12 20:52:58.787190 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.296( v 194215'502 (0'0,194215'502] local-les=194392 n=29 ec=183096 les/c 194392/194392 194391/194391/192975) [10,306,367] r=1 lpr=194392 pi=192975-194390/23 crt=192888'472 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787207 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.2fa( v 194377'17104 lc 194370'17073 (192718'14071,194377'17104] local-les=194392 n=30 ec=192688 les/c 194385/194379 194391/194391/194391) [306,8,364] r=0 lpr=194392 pi=194365-194390/7 crt=194377'17104 lcod 0'0 mlcod 0'0 peering m=16] null
2014-08-12 20:52:58.787226 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.4eb( v 194215'514 (0'0,194215'514] local-les=194392 n=37 ec=183096 les/c 194392/194392 194391/194391/192963) [315,306,18] r=1 lpr=194392 pi=190472-194390/35 crt=191727'464 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787243 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.56c( v 194215'492 (0'0,194215'492] local-les=194392 n=25 ec=183096 les/c 194392/194392 194391/194391/194391) [306,374,54] r=0 lpr=194392 crt=191727'436 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787260 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.6af( v 194215'508 (0'0,194215'508] local-les=194392 n=32 ec=183096 les/c 194392/194392 194391/194391/194391) [306,313,358] r=0 lpr=194392 crt=191727'440 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787277 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.7e( v 194215'524 (0'0,194215'524] local-les=194392 n=39 ec=183096 les/c 194392/194392 194391/194391/192971) [361,2,306] r=2 lpr=194392 pi=192971-194390/23 crt=192888'488 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787295 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.2f0( v 194215'524 (0'0,194215'524] local-les=194392 n=26 ec=183096 les/c 194392/194392 194391/194391/192970) [379,372,306] r=2 lpr=194392 pi=192974-194390/19 crt=192888'490 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787312 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.6aa( v 194215'477 lc 0'0 (0'0,194215'477] local-les=194393 n=25 ec=183096 les/c 194393/194353 194391/194392/192967) [13,308,306] r=2 lpr=194393 pi=194248-194391/20 crt=194215'477 inactive NOTIFY m=25] null
2014-08-12 20:52:58.787331 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.456( v 194215'515 (0'0,194215'515] local-les=194392 n=33 ec=183096 les/c 194392/194392 194391/194391/192972) [344,306,299] r=1 lpr=194392 pi=183108-194390/44 crt=191727'441 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787348 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.d9( v 194215'495 (0'0,194215'495] local-les=194392 n=24 ec=183096 les/c 194392/194392 194391/194391/194391) [306,360,307] r=0 lpr=194392 crt=191648'365 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787365 7f1058420700 10 osd.306 pg_epoch: 194393 pg[2.f( v 194215'872 (0'0,194215'872] local-les=194392 n=0 ec=1 les/c 194392/194392 194391/194391/194391) [306,19] r=0 lpr=194392 crt=0'0 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787383 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.77( v 194393'15398 lc 194370'15345 (192718'12444,194393'15398] local-les=194392 n=20 ec=192688 les/c 194392/194380 194391/194391/194391) [306,380,287] r=0 lpr=194393 pi=194365-194390/8 crt=194383'15394 lcod 0'0 mlcod 0'0 inactive m=25] null
2014-08-12 20:52:58.787401 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.310( v 194215'468 (0'0,194215'468] local-les=194392 n=24 ec=183096 les/c 194392/194392 194391/194391/192979) [363,313,306] r=2 lpr=194392 pi=192979-194390/23 crt=192888'436 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787419 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.1b6( v 194383'15400 lc 194370'15363 (192834'12371,194383'15400] local-les=194392 n=20 ec=192688 les/c 194392/194379 194391/194391/192928) [49,308,306] r=2 lpr=194392 pi=193125-194390/27 crt=194383'15400 lcod 0'0 inactive NOTIFY m=19] null
2014-08-12 20:52:58.787437 7f1058420700 10 osd.306 pg_epoch: 194393 pg[1.10( v 194215'1650 (0'0,194215'1650] local-les=194392 n=2 ec=1 les/c 194392/194392 194391/194391/194391) [306,19] r=0 lpr=194392 crt=111'2 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787454 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.1a9( v 194383'12966 lc 194370'12915 (192718'10360,194383'12966] local-les=194392 n=28 ec=192688 les/c 194392/194379 194391/194391/192964) [316,306,366] r=1 lpr=194392 pi=193125-194390/27 crt=194383'12966 lcod 0'0 inactive NOTIFY m=26] null
2014-08-12 20:52:58.787473 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.2f9( v 194215'498 (0'0,194215'498] local-les=194392 n=33 ec=183096 les/c 194392/194392 194391/194391/194391) [306,3,58] r=0 lpr=194392 crt=191727'436 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787490 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.533( v 194215'462 (0'0,194215'462] local-les=194392 n=23 ec=183096 les/c 194392/194392 194391/194391/192967) [13,306,324] r=1 lpr=194392 pi=192975-194390/23 crt=192888'432 lcod 0'0 inactive NOTIFY] null
2014-08-12 20:52:58.787511 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.cf( v 194383'20646 lc 194370'20599 (192834'18925,194383'20646] local-les=194392 n=37 ec=192688 les/c 194385/194379 194391/194391/194391) [306,17,379] r=0 lpr=194392 pi=194365-194390/7 crt=194383'20646 lcod 0'0 mlcod 0'0 peering m=24] null
2014-08-12 20:52:58.787532 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.7b2( v 194215'472 lc 0'0 (0'0,194215'472] local-les=194365 n=28 ec=183096 les/c 194365/194303 194381/194382/192970) [16,325,346] r=-1 lpr=194384 pi=194253-194381/17 crt=194215'472 inactive NOTIFY m=28] null
2014-08-12 20:52:58.787550 7f1058420700 10 osd.306 pg_epoch: 194393 pg[24.c1( v 194380'13431 lc 194370'13392 (192718'10766,194380'13431] local-les=194392 n=16 ec=192688 les/c 194385/194379 194391/194391/194391) [306,50,65] r=0 lpr=194392 pi=194321-194390/12 crt=194380'13431 lcod 0'0 mlcod 0'0 peering m=20] null
2014-08-12 20:52:58.787569 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.61e( v 194215'469 (0'0,194215'469] local-les=194392 n=33 ec=183096 les/c 194392/194392 194391/194391/194391) [306,312,45] r=0 lpr=194392 crt=191727'413 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787586 7f1058420700 10 osd.306 pg_epoch: 194393 pg[22.59b( v 194215'511 (0'0,194215'511] local-les=194392 n=39 ec=183096 les/c 194392/194392 194391/194391/194391) [306,45,9] r=0 lpr=194392 crt=191727'437 lcod 0'0 mlcod 0'0 peering] null
2014-08-12 20:52:58.787965 7f1050213700 20 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] exit NotTrimming
2014-08-12 20:52:58.788507 7f104f812700 20 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] exit NotTrimming
2014-08-12 20:52:58.788560 7f1050213700 20 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] enter NotTrimming
2014-08-12 20:52:58.788600 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] on_role_change
2014-08-12 20:52:58.788618 7f1050213700 15 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY]  requeue_ops
2014-08-12 20:52:58.788632 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] cancel_recovery
2014-08-12 20:52:58.788646 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] clear_recovery_state
2014-08-12 20:52:58.788669 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] check_recovery_sources no source osds () went down
2014-08-12 20:52:58.788725 7f1050213700 10 osd.306 pg_epoch: 194394 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] handle_advance_map [32,306,356]/[32,306,356] -- 32/32
2014-08-12 20:52:58.788747 7f1050213700 20 PGPool::update cached_removed_snaps [] newly_removed_snaps [] snapc 0=[] (no change)
2014-08-12 20:52:58.788753 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Reset>: Reset advmap
2014-08-12 20:52:58.788768 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] _calc_past_interval_range: already have past intervals back to 194392
2014-08-12 20:52:58.788782 7f1050213700 20 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] acting_up_affected newup [32,306,356] newacting [32,306,356]
2014-08-12 20:52:58.788796 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Reset>: up or acting affected, calling start_peering_interval again
2014-08-12 20:52:58.788813 7f1050213700 20 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,356] r=-1 lpr=194394 pi=189940-194393/36 crt=191727'432 lcod 0'0 inactive NOTIFY] set_last_peering_reset 194395
2014-08-12 20:52:58.788847 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] start_peering_interval: check_new_interval output: generate_past_intervals interval(194394-194394 up [32,356](32) acting [32,356](32)): not rw, up_thru 194391 up_from 192973 last_epoch_clean 194392
generate_past_intervals interval(194394-194394 up [32,356](32) acting [32,356](32)) : primary up 192973-194391 does not include interval

2014-08-12 20:52:58.788864 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194394/194394/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY]  noting past interval(194394-194394 up [32,356](32) acting [32,356](32))
2014-08-12 20:52:58.788881 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY]  up [32,356] -> [32,306,356], acting [32,356] -> [32,306,356], acting_primary 32 -> 32, up_primary 32 -> 32, role -1 -> 1
2014-08-12 20:52:58.788933 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] on_change
2014-08-12 20:52:58.788949 7f1050213700 15 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] publish_stats_to_osd -- not primary
2014-08-12 20:52:58.788249 7f1058420700 10 osd.306 194395 maybe_update_heartbeat_peers updating
2014-08-12 20:52:58.788962 7f1050213700 15 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY]  requeue_ops
2014-08-12 20:52:58.788972 7f1058420700 20 osd.306 194395 22.599 heartbeat_peers
2014-08-12 20:52:58.788981 7f1058420700 20 osd.306 194395 24.36d heartbeat_peers
2014-08-12 20:52:58.788984 7f1058420700 20 osd.306 194395 24.30f heartbeat_peers 62,306,308
2014-08-12 20:52:58.788978 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] cancel_copy_ops
2014-08-12 20:52:58.788992 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] cancel_flush_ops
2014-08-12 20:52:58.789006 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] on_change_cleanup
2014-08-12 20:52:58.789020 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] on_change
2014-08-12 20:52:58.789035 7f1050213700 20 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] exit NotTrimming
2014-08-12 20:52:58.789049 7f1050213700 20 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] enter NotTrimming
2014-08-12 20:52:58.789063 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] on_role_change
2014-08-12 20:52:58.789076 7f1050213700 15 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY]  requeue_ops
2014-08-12 20:52:58.789089 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] cancel_recovery
2014-08-12 20:52:58.789102 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] clear_recovery_state
2014-08-12 20:52:58.789116 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] check_recovery_sources no source osds () went down
2014-08-12 20:52:58.789133 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] handle_activate_map
2014-08-12 20:52:58.789128 7f104f812700 20 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] enter NotTrimming
2014-08-12 20:52:58.789163 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_role_change
2014-08-12 20:52:58.789179 7f104f812700 15 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  requeue_ops
2014-08-12 20:52:58.789194 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] cancel_recovery
2014-08-12 20:52:58.789208 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] clear_recovery_state
2014-08-12 20:52:58.789211 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] update_heartbeat_peers  unchanged
2014-08-12 20:52:58.789229 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] take_waiters
2014-08-12 20:52:58.789227 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] check_recovery_sources no source osds () went down
2014-08-12 20:52:58.789247 7f1050213700  5 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] exit Reset 0.003286 2 0.003228
2014-08-12 20:52:58.789267 7f1050213700  5 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] enter Started
2014-08-12 20:52:58.789283 7f1050213700  5 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] enter Start
2014-08-12 20:52:58.789288 7f104f812700 10 osd.306 pg_epoch: 194394 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] handle_advance_map [342,306,20]/[342,306,20] -- 342/342
2014-08-12 20:52:58.789297 7f1050213700  1 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-08-12 20:52:58.789319 7f104f812700 20 PGPool::update cached_removed_snaps [] newly_removed_snaps [] snapc 0=[] (no change)
2014-08-12 20:52:58.789318 7f1050213700  5 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] exit Start 0.000036 0 0.000000
2014-08-12 20:52:58.789337 7f1050213700  5 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] enter Started/Stray
2014-08-12 20:52:58.789326 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] state<Reset>: Reset advmap
2014-08-12 20:52:58.789358 7f1050213700 20 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 194392 while current is 194395
2014-08-12 20:52:58.789361 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] _calc_past_interval_range: already have past intervals back to 194379
2014-08-12 20:52:58.789375 7f1050213700 10 osd.306 pg_epoch: 194395 pg[22.599( v 194215'492 (0'0,194215'492] local-les=194392 n=17 ec=183096 les/c 194392/194392 194395/194395/192973) [32,306,356] r=1 lpr=194395 pi=189940-194394/37 crt=191727'432 lcod 0'0 inactive NOTIFY] handle_peering_event: epoch_sent: 194395 epoch_requested: 194395 NullEvt
2014-08-12 20:52:58.789382 7f104f812700 20 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] acting_up_affected newup [342,306,20] newacting [342,306,20]
2014-08-12 20:52:58.789398 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] state<Reset>: up or acting affected, calling start_peering_interval again
2014-08-12 20:52:58.789418 7f104f812700 20 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,20] r=-1 lpr=194394 pi=193132-194393/28 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] set_last_peering_reset 194395
2014-08-12 20:52:58.789449 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] start_peering_interval: check_new_interval output: generate_past_intervals interval(194394-194394 up [342,20](342) acting [342,20](342)): not rw, up_thru 194391 up_from 192964 last_epoch_clean 194379
generate_past_intervals interval(194394-194394 up [342,20](342) acting [342,20](342)) : primary up 192964-194391 does not include interval

2014-08-12 20:52:58.789501 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194394/194394/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  noting past interval(194394-194394 up [342,20](342) acting [342,20](342))
2014-08-12 20:52:58.789532 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  up [342,20] -> [342,306,20], acting [342,20] -> [342,306,20], acting_primary 342 -> 342, up_primary 342 -> 342, role -1 -> 1
2014-08-12 20:52:58.789567 7f1050213700 10 log is not dirty
2014-08-12 20:52:58.789551 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_change
2014-08-12 20:52:58.789573 7f104f812700 15 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] publish_stats_to_osd -- not primary
2014-08-12 20:52:58.789592 7f104f812700 15 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  requeue_ops
2014-08-12 20:52:58.789613 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] cancel_copy_ops
2014-08-12 20:52:58.789683 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] cancel_flush_ops
2014-08-12 20:52:58.789735 7f1058420700 10 osd.306 194395 _add_heartbeat_peer: new peer osd.62 192.168.1.112:6825/5505 192.168.100.112:6810/5505
2014-08-12 20:52:58.789729 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_change_cleanup
2014-08-12 20:52:58.789760 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_change
2014-08-12 20:52:58.789780 7f104f812700 20 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] exit NotTrimming
2014-08-12 20:52:58.789795 7f104f812700 20 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] enter NotTrimming
2014-08-12 20:52:58.789810 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] on_role_change
2014-08-12 20:52:58.789879 7f104f812700 15 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19]  requeue_ops
2014-08-12 20:52:58.789954 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] cancel_recovery
2014-08-12 20:52:58.790007 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] clear_recovery_state
2014-08-12 20:52:58.790025 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] check_recovery_sources no source osds () went down
2014-08-12 20:52:58.790052 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] handle_activate_map
2014-08-12 20:52:58.790172 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-194394/29 crt=194383'15755 lcod 0'0 inactive NOTIFY m=19] update_heartbeat_peers  unchanged
2014-08-12 20:52:58.790227 7f104f812700 10 osd.306 pg_epoch: 194395 pg[24.36d( v 194393'15759 lc 194370'15718 (192834'14118,194393'15759] local-les=194392 n=25 ec=192688 les/c 194392/194379 194395/194395/193132) [342,306,20] r=1 lpr=194395 pi=193132-19439
Actions #5

Updated by Sage Weil over 9 years ago

  • Status changed from 12 to Need More Info

Updated by karan singh over 9 years ago

Hello Sage

As i have found some PG / OSD that make agent_choose_mode() unhappy. I am attaching logs of 2 different OSDS (osd.4 and osd.325)

I am trying to manually restarting both of these OSDs but they are getting down and out every time. out of 402 osds , 3 ods's are showing this type of problem.

The PG that are creating problem are from pool number 24 , which is cache-pool .


# ceph -s
    cluster 009d3518-e60d-4f74-a26d-c08c1976263c
     health HEALTH_WARN 15 pgs backfill; 4 pgs backfilling; 30 pgs degraded; 1 pgs incomplete; 13 pgs recovering; 154 pgs recovery_wait; 1 pgs stuck inactive; 198 pgs stuck unclean; recovery 15228/1045806 objects degraded (1.456%); 'cache-pool' at/near target max; norecover flag(s) set
     monmap e3: 3 mons at 
     mdsmap e14: 1/1/1 up {0=   =up:active}
     osdmap e194503: 402 osds: 399 up, 399 in
            flags norecover
      pgmap v757836: 31168 pgs, 22 pools, 8 EB data, 433 kobjects
            18166 GB used, 1276 TB / 1293 TB avail
            15228/1045806 objects degraded (1.456%)
                   2 active
               30970 active+clean
                   8 active+degraded+remapped
                   4 active+degraded+remapped+backfilling
                 138 active+recovery_wait
                   3 active+recovery_wait+degraded+remapped
                   1 active+remapped
                   1 incomplete
                  13 active+recovery_wait+remapped
                  15 active+degraded+remapped+wait_backfill
                  13 active+recovering
Actions #7

Updated by karan singh over 9 years ago

Hello Sage

Thanks for your time checking this bug. As required i have found some PG’s and 3 OSDs which are making agent_choose_mode() unhappy. Logs have been uploaded in tracker.ceph.com. [In previous update]

Out of 402 , 3 OSDS are DOWN and not coming up at all.
All the PG are from cache-pool , so there is something wrong with cache tier.
when i try to put some data to Erasure coded pool , which is backed by cache-pool , after sometime radios bench its not writing any thing.

Please see cur MB/s field , after 97th second speed suddenly becomes 0 , that mean has stopped writing to cache-pool ( rados bench is instructed for 120 seconds , but at 97th seconds its not doing any thing )

FYI , EC-2-2 is my erasure coded pool and cache-pool is the cache tier for it.

#  rados bench -p EC-2-2  120 write --no-cleanup
 Maintaining 16 concurrent writes of 4194304 bytes for up to 120 seconds or 0 objects
 Object prefix: benchmark_data_storage0111-ib.pocstor_45649
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
     0       0         0         0         0         0         -         0
     1      16       190       174   695.857       696  0.067083 0.0884709
     2      15       397       382   763.871       832  0.073667 0.0804624
     3      16       642       626   834.536       976  0.051549 0.0730512
     4      16       863       847   846.881       884  0.098459 0.0714683
     5      16      1125      1109   887.083      1048  0.055077 0.0683977
     6      16      1389      1373   915.218      1056  0.054186 0.0661972
     7      16      1658      1642   938.172      1076  0.047244 0.0643397
     8      16      1924      1908   953.887      1064  0.056443 0.0633581
     9      15      2187      2172   965.221      1056  0.049508 0.0623299
    10      15      2450      2435    973.89      1052  0.044576 0.0610902
    11      16      2716      2700   981.708      1060   0.05082 0.0600529
    12      15      2979      2964   987.891      1056  0.047838  0.059104
    13      16      3245      3229    993.43      1060  0.040635 0.0580248
    14      16      3502      3486   995.892      1028  0.054317 0.0569024
    15      15      3723      3708   988.693       888  0.049351 0.0562001
    16      16      3982      3966   991.394      1032  0.039706 0.0550734
    17      16      4231      4215   991.659       996  0.049223 0.0539998
    18      16      4493      4477   994.782      1048  0.026297 0.0532175
    19      16      4724      4708   991.052       924  0.026083 0.0525153
2014-08-13 18:24:34.142903min lat: 0.021505 max lat: 0.690929 avg lat: 0.0516814
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    20      16      4975      4959   991.694      1004  0.032724 0.0516814
    21      16      5234      5218   993.798      1036  0.035081 0.0508343
    22      16      5492      5476    995.53      1032  0.030321 0.0500818
    23      16      5730      5714   993.633       952  0.034105 0.0495723
    24      16      5974      5958   992.895       976  0.027677 0.0489262
    25      16      6182      6166   986.455       832  0.026697  0.048393
    26      16      6368      6352   977.126       744  0.032077 0.0479137
    27      16      6542      6526   966.711       696  0.034528 0.0475623
    28      16      6732      6716   959.325       760  0.031908 0.0471095
    29      16      6914      6898   951.346       728   0.03235  0.046739
    30      16      7087      7071   942.699       692  0.031762 0.0464441
    31      16      7278      7262   936.932       764  0.037495 0.0460474
    32      16      7466      7450   931.151       752  0.030711 0.0456857
    33      16      7630      7614    922.81       656  0.033268 0.0454944
    34      16      7805      7789   916.255       700  0.025204 0.0452396
    35      16      7983      7967   910.418       712  0.034867 0.0449635
    36      16      8157      8141    904.46       696  0.033268 0.0447589
    37      16      8344      8328   900.229       748  0.027925 0.0444739
    38      16      8515      8499   894.537       684  0.035933 0.0442857
    39      16      8697      8681   890.265       728  0.034758 0.0440484
2014-08-13 18:24:54.144995min lat: 0.019437 max lat: 0.690929 avg lat: 0.0438109
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    40      16      8864      8848   884.707       668  0.029835 0.0438109
    41      16      9020      9004   878.345       624  0.025962 0.0436106
    42      16      9182      9166   872.858       648  0.035187 0.0433818
    43      16      9341      9325   867.349       636  0.044437 0.0431808
    44      16      9506      9490   862.635       660  0.027373 0.0429564
    45      16      9660      9644   857.152       616   0.02804 0.0427688
    46      16      9829      9813   853.213       676  0.032177 0.0425597
    47      16      9990      9974    848.76       644   0.02674 0.0423772
    48      15     10161     10146    845.41       688  0.024874 0.0421511
    49      16     10321     10305   841.134       636  0.035208 0.0419834
    50      16     10467     10451    835.99       584  0.030727 0.0418296
    51      16     10625     10609   831.989       632  0.027092 0.0417251
    52      16     10782     10766   828.065       628  0.033714 0.0415791
    53      16     10932     10916    823.76       600  0.026319 0.0414116
    54      16     11063     11047   818.208       524  0.049069 0.0414033
    55      16     11216     11200   814.457       612  0.027741  0.041306
    56      16     11379     11363   811.555       652  0.031221  0.041156
    57      16     11527     11511   807.702       592  0.028816 0.0410259
    58      16     11666     11650   803.361       556  0.025575 0.0408803
    59      16     11800     11784   798.829       536  0.026138 0.0407547
2014-08-13 18:25:14.147303min lat: 0.019437 max lat: 1.09496 avg lat: 0.0406389
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    60      16     11932     11916   794.314       528  0.030501 0.0406389
    61      16     12068     12052   790.208       544  0.025198 0.0405135
    62      16     12199     12183   785.913       524  0.032456 0.0403954
    63      16     12336     12320   782.136       548  0.035351 0.0402791
    64      16     12473     12457   778.476       548  0.029517 0.0401582
    65      15     12606     12591   774.744       536  0.027151 0.0400501
    66      16     12737     12721   770.883       520  0.030675 0.0399549
    67      16     12874     12858   767.556       548  0.025515 0.0398405
    68      16     12989     12973   763.032       460  0.029258 0.0397442
    69      16     13089     13073    757.77       400  0.030403 0.0396706
    70      16     13191     13175   752.772       408  0.027487 0.0395906
    71      16     13291     13275   747.803       400  0.024642 0.0395181
    72      16     13394     13378   743.138       412  0.028956 0.0394375
    73      16     13495     13479   738.492       404  0.034251 0.0393652
    74      16     13601     13585   734.242       424  0.026095 0.0392797
    75      16     13708     13692   730.158       428  0.025401 0.0391887
    76      16     13759     13743   723.234       204  0.027427 0.0391483
    77      16     13794     13778    715.66       140  0.030907  0.039121
    78      16     13829     13813   708.279       140  0.026868 0.0390938
    79      16     13865     13849   701.136       144   0.02843 0.0390645
2014-08-13 18:25:34.149847min lat: 0.019437 max lat: 1.09496 avg lat: 0.0390355
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    80      16     13901     13885   694.171       144  0.027766 0.0390355
    81      16     13936     13920   687.329       140  0.028827  0.039008
    82      16     13971     13955   680.654       140  0.043892 0.0389829
    83      16     14005     13989   674.092       136  0.030925   0.03896
    84      16     14040     14024   667.733       140  0.029571 0.0389337
    85      16     14076     14060   661.571       144   0.02884 0.0389039
    86      16     14112     14096   655.553       144  0.026494 0.0388759
    87      16     14148     14132   649.673       144  0.026507 0.0388487
    88      16     14185     14169   643.971       148  0.030229 0.0388181
    89      16     14220     14204   638.309       140  0.026673 0.0387929
    90      16     14256     14240   632.816       144  0.029613 0.0387642
    91      16     14291     14275     627.4       140  0.028177 0.0387391
    92      16     14327     14311   622.146       144  0.027472 0.0387122
    93      16     14363     14347   617.004       144  0.026809 0.0386837
    94      16     14399     14383   611.972       144  0.028466 0.0386565
    95      16     14436     14420   607.088       148  0.029694 0.0386281
    96      16     14458     14442   601.681        88  0.028512 0.0386108
    97      16     14458     14442   595.478         0         - 0.0386108
    98      16     14458     14442   589.401         0         - 0.0386108
    99      16     14458     14442   583.448         0         - 0.0386108
2014-08-13 18:25:54.152351min lat: 0.019437 max lat: 1.09496 avg lat: 0.0386108
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   100      16     14458     14442   577.613         0         - 0.0386108
   101      16     14458     14442   571.894         0         - 0.0386108
   102      16     14458     14442   566.287         0         - 0.0386108
   103      16     14458     14442   560.789         0         - 0.0386108
   104      16     14458     14442   555.396         0         - 0.0386108
   105      16     14458     14442   550.107         0         - 0.0386108
   106      16     14458     14442   544.917         0         - 0.0386108
   107      16     14458     14442   539.824         0         - 0.0386108
   108      16     14458     14442   534.826         0         - 0.0386108
   109      16     14458     14442   529.919         0         - 0.0386108
   110      16     14458     14442   525.102         0         - 0.0386108
   111      16     14458     14442   520.371         0         - 0.0386108
   112      16     14458     14442   515.725         0         - 0.0386108
   113      16     14458     14442   511.161         0         - 0.0386108
   114      16     14458     14442   506.677         0         - 0.0386108
   115      16     14458     14442   502.271         0         - 0.0386108
   116      16     14458     14442   497.942         0         - 0.0386108
   117      16     14458     14442   493.686         0         - 0.0386108
   118      16     14458     14442   489.502         0         - 0.0386108
   119      16     14458     14442   485.388         0         - 0.0386108
2014-08-13 18:26:14.154900min lat: 0.019437 max lat: 1.09496 avg lat: 0.0386108
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   120      16     14458     14442   481.343         0         - 0.0386108
   121      16     14458     14442   477.365         0         - 0.0386108
   122      16     14458     14442   473.452         0         - 0.0386108
   123      16     14458     14442   469.603         0         - 0.0386108
   124      16     14458     14442   465.816         0         - 0.0386108
   125      16     14458     14442   462.089         0         - 0.0386108
   126      16     14458     14442   458.422         0         - 0.0386108
   127      16     14458     14442   454.812         0         - 0.0386108
   128      16     14458     14442   451.259         0         - 0.0386108
   129      16     14458     14442    447.76         0         - 0.0386108
   130      16     14458     14442   444.316         0         - 0.0386108
   131      16     14458     14442   440.924         0         - 0.0386108
   132      16     14458     14442   437.584         0         - 0.0386108
   133      16     14458     14442   434.294         0         - 0.0386108
   134      16     14458     14442   431.053         0         - 0.0386108
   135      16     14458     14442    427.86         0         - 0.0386108
   136      16     14458     14442   424.714         0         - 0.0386108
   137      16     14458     14442   421.614         0         - 0.0386108
   138      16     14458     14442   418.558         0         - 0.0386108
   139      16     14458     14442   415.547         0         - 0.0386108
2014-08-13 18:26:34.157601min lat: 0.019437 max lat: 1.09496 avg lat: 0.0386108
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   140      16     14458     14442   412.579         0         - 0.0386108
   141      16     14458     14442   409.653         0         - 0.0386108
   142      16     14458     14442   406.768         0         - 0.0386108
   143      16     14458     14442   403.923         0         - 0.0386108
   144      16     14458     14442   401.118         0         - 0.0386108
   145      16     14458     14442   398.352         0         - 0.0386108
   146      16     14458     14442   395.624         0         - 0.0386108
   147      16     14458     14442   392.932         0         - 0.0386108
   148      16     14458     14442   390.277         0         - 0.0386108
   149      16     14458     14442   387.658         0         - 0.0386108
   150      16     14458     14442   385.074         0         - 0.0386108
   151      16     14458     14442   382.524         0         - 0.0386108
   152      16     14458     14442   380.007         0         - 0.0386108
   153      16     14458     14442   377.523         0         - 0.0386108
   154      16     14458     14442   375.072         0         - 0.0386108
   155      16     14458     14442   372.652         0         - 0.0386108
   156      16     14458     14442   370.263         0         - 0.0386108
   157      16     14458     14442   367.905         0         - 0.0386108
   158      16     14458     14442   365.576         0         - 0.0386108
   159      16     14458     14442   363.277         0         - 0.0386108
2014-08-13 18:26:54.160099min lat: 0.019437 max lat: 1.09496 avg lat: 0.0386108
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   160      16     14458     14442   361.006         0         - 0.0386108
   161      16     14458     14442   358.764         0         - 0.0386108
   162      16     14458     14442   356.549         0         - 0.0386108
   163      16     14458     14442   354.362         0         - 0.0386108
   164      16     14458     14442   352.201         0         - 0.0386108
   165      16     14458     14442   350.067         0         - 0.0386108
   166      16     14458     14442   347.958         0         - 0.0386108
   167      16     14458     14442   345.874         0         - 0.0386108
   168      16     14458     14442   343.816         0         - 0.0386108
   169      16     14458     14442   341.781         0         - 0.0386108
   170      16     14458     14442   339.771         0         - 0.0386108
   171      16     14458     14442   337.784         0         - 0.0386108
   172      16     14458     14442    335.82         0         - 0.0386108
   173      16     14458     14442   333.879         0         - 0.0386108
   174      16     14458     14442    331.96         0         - 0.0386108
   175      16     14458     14442   330.063         0         - 0.0386108
   176      16     14458     14442   328.187         0         - 0.0386108
   177      16     14458     14442   326.333         0         - 0.0386108
   178      16     14458     14442     324.5         0         - 0.0386108
   179      16     14458     14442   322.687         0         - 0.0386108
2014-08-13 18:27:14.162670min lat: 0.019437 max lat: 1.09496 avg lat: 0.0386108
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   180      16     14458     14442   320.894         0         - 0.0386108
   181      16     14458     14442   319.121         0         - 0.0386108
   182      16     14458     14442   317.368         0         - 0.0386108
   183      16     14458     14442   315.634         0         - 0.0386108
   184      16     14458     14442   313.918         0         - 0.0386108
   185      16     14458     14442   312.222         0         - 0.0386108
   186      16     14458     14442   310.543         0         - 0.0386108
   187      16     14458     14442   308.882         0         - 0.0386108
   188      16     14458     14442   307.239         0         - 0.0386108
   189      16     14458     14442   305.614         0         - 0.0386108
   190      16     14458     14442   304.005         0         - 0.0386108
   191      16     14458     14442   302.413         0         - 0.0386108
   192      16     14458     14442   300.838         0         - 0.0386108
^C
[root@storage0111-ib ~]# ^C
Actions #8

Updated by Sage Weil over 9 years ago

i've pushed wip-9082-firefly... can you please try this and see if it avoids the crash? i was looking for a divide by zero but I don't think it's possible; my guess is the negative int64_t is throwing the floating point handling for a loop?

Actions #9

Updated by Sage Weil over 9 years ago

  • Status changed from Need More Info to Resolved
Actions #10

Updated by karan singh over 9 years ago

Thanks Sage , the issue has been resolved, cluster is Healthy now.

Actions

Also available in: Atom PDF