Project

General

Profile

Bug #8387

osd: skipping missing maps broken

Added by karan singh almost 10 years ago. Updated almost 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Problem :: 50+ OSDs are being marked out of the cluster and are down. The cluster is degraded. On checking the logs of the failed OSDs, we see weird entries that are continuously being generated.

Logs from an OSD (debug logging enabled on the OSDs):

2014-05-19 17:09:49.185921 7fcbae674700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.104:6826/777047378 pipe(0x447aa80 sd=129 :0 s=1 pgs=0 cs=0 l=0 c=0x7c165c0).fault with nothing to send, going to standby
2014-05-19 17:09:49.187704 7fcbb0a98700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.102:6803/1081057589 pipe(0x7ddd280 sd=131 :48417 s=1 pgs=0 cs=0 l=0 c=0x912b180).connect claims to be 192.168.1.102:6803/2072057589 not 192.168.1.102:6803/1081057589 - wrong node!
2014-05-19 17:09:49.187817 7fcbb0a98700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.102:6803/1081057589 pipe(0x7ddd280 sd=131 :48417 s=1 pgs=0 cs=0 l=0 c=0x912b180).fault with nothing to send, going to standby
2014-05-19 17:09:49.193429 7fcbb8fba700 1 osd.159 pg_epoch: 226656 pg[3.5c( v 76227'43 (0'0,76227'43] local-les=221836 n=1 ec=71845 les/c 221836/221836 226656/226656/226656) [78,12] r=-1 lpr=226656 pi=71845-226655/6176 crt=0'0 lcod 0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:49.200057 7fcbb99bb700 1 osd.159 pg_epoch: 226656 pg[2.2ef( empty local-les=221905 n=0 ec=1 les/c 221905/221905 226656/226656/225867) [77,20] r=-1 lpr=226656 pi=221879-226655/156 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:49.211625 7fcbb99bb700 1 osd.159 pg_epoch: 226656 pg[0.7a1( empty local-les=221905 n=0 ec=1 les/c 221905/221905 226622/226622/226216) [66,88,121] r=-1 lpr=226622 pi=629-226621/7373 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:49.213734 7fcbb8fba700 1 osd.159 pg_epoch: 226656 pg[2.497( empty local-les=221807 n=0 ec=1 les/c 221807/221807 226656/226656/226436) [2,151] r=-1 lpr=226656 pi=490-226655/9134 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:49.237009 7fcbb99bb700 1 osd.159 pg_epoch: 226656 pg[0.525( empty local-les=221906 n=0 ec=1 les/c 221906/221906 226637/226637/226561) [21,117,73] r=-1 lpr=226637 pi=338-226636/14696 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:52.355092 7fcbb8fba700 1 osd.159 pg_epoch: 226656 pg[0.2db( empty local-les=221894 n=0 ec=1 les/c 221894/221894 226620/226620/226615) [129,106,158] r=-1 lpr=226620 pi=629-226619/16410 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:52.374646 7fcbb109e700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.107:6828/1519061057 pipe(0x7dda580 sd=549 :48520 s=1 pgs=0 cs=0 l=0 c=0x7c14d00).connect claims to be 192.168.1.107:6828/1486062762 not 192.168.1.107:6828/1519061057 - wrong node!
2014-05-19 17:09:52.374728 7fcbb109e700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.107:6828/1519061057 pipe(0x7dda580 sd=549 :48520 s=1 pgs=0 cs=0 l=0 c=0x7c14d00).fault with nothing to send, going to standby
2014-05-19 17:09:54.871888 7fcbb8fba700 1 osd.159 pg_epoch: 226656 pg[2.510( empty local-les=221820 n=0 ec=1 les/c 221820/221820 226597/226597/226597) [68,25,95] r=-1 lpr=226597 pi=625-226596/7959 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:54.872696 7fcbb99bb700 1 osd.159 pg_epoch: 226656 pg[0.290( empty local-les=221906 n=0 ec=1 les/c 221906/221906 226656/226656/226637) [117,127] r=-1 lpr=226656 pi=108400-226655/8477 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
2014-05-19 17:09:54.883318 7fcbae876700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.105:6851/3429031235 pipe(0x7ddf800 sd=174 :0 s=1 pgs=0 cs=0 l=0 c=0x912d960).fault with nothing to send, going to standby

More logs: http://pastebin.com/0DRuiGRS

# ceph -v
ceph version 0.80-469-g991f7f1 (991f7f15a6e107b33a24bbef1169f21eb7fcce2c)

# ceph osd stat
osdmap e357073: 165 osds: 112 up, 165 in
flags noout

I have tried the following:

1. Restarting the problematic OSDs, but no luck.
2. One host has 4 affected OSDs, so I restarted the entire host, but still no luck; the OSDs stay down and keep logging the same message:

2014-05-19 17:16:00.140440 7fcbb008e700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.111:6815/727016887 pipe(0x447ee00 sd=262 :40743 s=1 pgs=0 cs=0 l=0 c=0x7c127e0).fault with nothing to send, going to standby
2014-05-19 17:16:00.140815 7fcbaf987700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.102:6802/982001016 pipe(0xa4f5000 sd=267 :43043 s=1 pgs=0 cs=0 l=0 c=0x77fdee0).connect claims to be 192.168.1.102:6802/1848051066 not 192.168.1.102:6802/982001016 - wrong node!
2014-05-19 17:16:00.140846 7fcbaf987700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.102:6802/982001016 pipe(0xa4f5000 sd=267 :43043 s=1 pgs=0 cs=0 l=0 c=0x77fdee0).fault with nothing to send, going to standby
2014-05-19 17:16:00.141851 7fcbb8fba700 1 osd.159 pg_epoch: 236857 pg[2.40c( empty local-les=221836 n=0 ec=1 les/c 221836/221836 236795/236795/236795) [61,46,78] r=-1 lpr=236795 pi=221835-236794/489 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray

3. The disks do not have errors; there are no messages in dmesg or /var/log/messages.

4. There was a similar bug in the past (http://tracker.ceph.com/issues/4006); I don't know whether it has come back in Firefly.

5. No activity was performed on the cluster recently, except creating some pools and keys for Cinder/Glance integration.

6. The nodes have enough free resources for the OSDs.

Please suggest a solution to this bug.

History

#2 Updated by Sage Weil almost 10 years ago

Can you find one line like

2014-05-19 17:09:52.374646 7fcbb109e700 0 -- 192.168.1.110:6819/58592 >> 192.168.1.107:6828/1519061057 pipe(0x7dda580 sd=549 :48520 s=1 pgs=0 cs=0 l=0 c=0x7c14d00).connect claims to be 192.168.1.107:6828/1486062762 not 192.168.1.107:6828/1519061057 - wrong node!

and then grep for both of the addresses in the ceph osd dump output? You can run 'ceph osd set nodown' temporarily to stop the flapping so that it is not a moving target (these messages will presumably continue, and it won't fix the actual problem).
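The procedure above can be sketched as a small shell helper that pulls the two addresses out of a "wrong node" line and looks each one up in a saved `ceph osd dump` (a sketch: the sample line is copied from the logs in this report, and `dump.txt` is an assumed filename for the saved dump output):

```shell
# A "wrong node" line from the OSD log (copied from this report)
line='connect claims to be 192.168.1.107:6828/1486062762 not 192.168.1.107:6828/1519061057 - wrong node!'

# "claims to be X not Y": X is what the listener says it is,
# Y is what the connecting OSD expected from its (possibly stale) map.
claimed=$(echo "$line" | sed -n 's/.*claims to be \([^ ]*\) not .*/\1/p')
expected=$(echo "$line" | sed -n 's/.* not \([^ ]*\) - wrong node!.*/\1/p')

echo "claimed:  $claimed"
echo "expected: $expected"

# Then look both up in a saved dump, e.g.:
#   ceph osd dump > dump.txt
#   grep -F "$claimed" dump.txt; grep -F "$expected" dump.txt
```

Each grep should tell you which osd.N (if any) currently owns each address, which is exactly the comparison done by hand in comment #5 below.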

#3 Updated by Ian Colle almost 10 years ago

  • Status changed from New to Need More Info

#4 Updated by Sage Weil almost 10 years ago

  • Tags deleted (multiple osd getting down)

#5 Updated by karan singh almost 10 years ago

Hi Sage

Here is the complete output from another OSD. This problem is occurring with multiple OSDs (74 of them) across the cluster. The remaining 91 OSDs are working fine.

OSD Complete logs : http://pastebin.com/agTKh6zB

Ceph osd dump output : http://pastebin.com/M4D5A6JJ

As you advised, I checked multiple "wrong node" lines and compared them with the ceph osd dump output; here is what I found.

=================================================== FROM OSD.158 LOGS ============================================================================================
## 1 ##

2014-05-20 10:19:03.635180 7f230002b700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.106:6805/2449059892 pipe(0x56ac600 sd=146 :59669 s=1 pgs=0 cs=0 l=0 c=0x83e7900).connect claims to be 192.168.1.106:6805/16811 not 192.168.1.106:6805/2449059892 - wrong node!

## 2 ##

2014-05-20 10:19:03.700093 7f22ff621700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6802/910005982 pipe(0x8698500 sd=35 :33500 s=1 pgs=0 cs=0 l=0 c=0x83018c0).connect claims to be 192.168.1.109:6802/63896 not 192.168.1.109:6802/910005982 - wrong node!

## 3 ##

2014-05-20 10:19:09.703901 7f22fd80d700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.113:6802/13870 pipe(0x56adf00 sd=137 :42889 s=1 pgs=0 cs=0 l=0 c=0x8302aa0).connect claims to be 192.168.1.113:6802/24612 not 192.168.1.113:6802/13870 - wrong node!

=================================================== FROM Ceph osd dump ============================================================================================
## 1 ##

osd.158 down in weight 1 up_from 357777 up_thru 357779 down_at 357798 last_clean_interval [72933,357532) 192.168.100.112:6803/3807 192.168.1.112:6802/3807 192.168.1.112:6803/3807 192.168.100.112:6804/3807 exists 1b815863-f961-447d-adea-3ec79996c544

osd.114 down in weight 1 up_from 357259 up_thru 357506 down_at 357546 last_clean_interval [72924,357256) 192.168.100.106:6810/59892 192.168.1.106:6805/2449059892 192.168.1.106:6808/2449059892 192.168.100.106:6807/2449059892 exists 6bfe95e1-d45f-4771-8074-c23389b67232

osd.164 up in weight 1 up_from 357622 up_thru 357798 down_at 357543 last_clean_interval [72938,357532) 192.168.100.106:6806/16811 192.168.1.106:6804/16811 192.168.1.106:6805/16811 192.168.100.106:6807/16811 exists,up 1266e091-05f0-4168-b3bd-63cde8d0b92b

## 2 ##

osd.158 down in weight 1 up_from 357777 up_thru 357779 down_at 357798 last_clean_interval [72933,357532) 192.168.100.112:6803/3807 192.168.1.112:6802/3807 192.168.1.112:6803/3807 192.168.100.112:6804/3807 exists 1b815863-f961-447d-adea-3ec79996c544

osd.111 down in weight 1 up_from 350958 up_thru 357522 down_at 357543 last_clean_interval [72928,350955) 192.168.100.109:6818/5982 192.168.1.109:6802/910005982 192.168.1.109:6808/910005982 192.168.100.109:6802/910005982 exists 2cde1608-9133-4346-bb90-074f845fa1d4

Not able to find 192.168.1.109:6802/63896 in the dump.

## 3 ##

osd.158 down in weight 1 up_from 357777 up_thru 357779 down_at 357798 last_clean_interval [72933,357532) 192.168.100.112:6803/3807 192.168.1.112:6802/3807 192.168.1.112:6803/3807 192.168.100.112:6804/3807 exists 1b815863-f961-447d-adea-3ec79996c544

Not able to find 192.168.1.113:6802/13870 in the dump.

osd.128 up in weight 1 up_from 357708 up_thru 357798 down_at 357546 last_clean_interval [72929,357532) 192.168.100.113:6804/24612 192.168.1.113:6802/24612 192.168.1.113:6803/24612 192.168.100.113:6807/24612 exists,up 874d3f2d-c57a-43e2-8491-01baf1e5e3fa

===================================================================================================================================================================

Can you help me understand what is happening here? I tried bringing up each OSD manually, but they do not come up and keep throwing the same "wrong node" error.
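For what it's worth, the number after the ip:port in these entity addresses is the per-process nonce, so "wrong node!" generally means a different daemon instance is now bound to that ip:port than the one the connecting OSD's (stale) map names. Match 1 above shows exactly that: the stale map attributes 192.168.1.106:6805 to down osd.114 (nonce 2449059892), while the dump shows up osd.164 now listening there (nonce 16811). A minimal sketch of that comparison, with the two addresses copied from match 1:

```shell
expected='192.168.1.106:6805/2449059892'  # address the connecting OSD dialed (stale map; osd.114 in the dump)
actual='192.168.1.106:6805/16811'         # address the listener claims to be (osd.164 in the dump)

# Split each "ip:port/nonce" address into its socket and nonce parts
exp_sock=${expected%/*}; exp_nonce=${expected##*/}
act_sock=${actual%/*};   act_nonce=${actual##*/}

if [ "$exp_sock" = "$act_sock" ] && [ "$exp_nonce" != "$act_nonce" ]; then
  echo "same ip:port, different nonce -> a different daemon instance now owns this address"
fi
```

That the stale side of each mismatch belongs to a down OSD in the dump is consistent with the connecting OSDs working from old maps, which ties in with the map-catch-up problem in the later comments.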

#6 Updated by Sage Weil almost 10 years ago

  • Assignee set to Sage Weil

#7 Updated by Sage Weil almost 10 years ago

  • Priority changed from Immediate to Urgent

#8 Updated by karan singh almost 10 years ago

Hello Sage

As per our discussion on the mailing list and IRC, I tried your procedure for this problem. It has been over 72 hours and the problematic OSDs are still loading and decoding OSD maps, trying to reach the correct epoch. All 32 OSDs have the same problem: the OSD service is started, but the OSD is down in the cluster and is still loading and decoding maps.

To troubleshoot this issue I have set debug osd = 20, and so far these OSDs have generated 7 TB of logs (log space is not a problem, as the logs are kept on a 60 TB shared NFS mount).

After comparing the logs, I found that the loading and decoding operation keeps repeating itself; it never reaches the final correct OSD map epoch. Here are some OSD logs for your reference.

http://pastebin.com/aSBY4XYm

[root@storage0106-ib ceph]# tail ceph-osd.58.log
2014-05-25 00:18:16.756357 7fbfa35b7700 20 osd.58 350946 get_map 289768 - loading and decoding 0x10f16200
2014-05-25 00:18:16.756464 7fbfa2bb6700 20 osd.58 350946 get_map 289569 - loading and decoding 0x10f15e00
2014-05-25 00:18:16.756457 7fbfa35b7700 10 osd.58 pg_epoch: 289568 pg[2.429( empty local-les=289539 n=0 ec=1 les/c 289539/289539 289567/289567/289538) [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 00:18:16.756518 7fbfa35b7700 10 osd.58 pg_epoch: 289568 pg[2.429( empty local-les=289539 n=0 ec=1 les/c 289539/289539 289567/289567/289538) [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 00:18:16.756536 7fbfa35b7700 20 osd.58 pg_epoch: 289568 pg[2.429( empty local-les=289539 n=0 ec=1 les/c 289539/289539 289567/289567/289538) [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 00:18:16.756556 7fbfa35b7700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.429( empty local-les=289539 n=0 ec=1 les/c 289539/289539 289567/289567/289538) [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 inactive NOTIFY]
2014-05-25 00:18:16.756580 7fbfa35b7700 10 osd.58 pg_epoch: 289568 pg[2.429( empty local-les=289539 n=0 ec=1 les/c 289539/289539 289567/289567/289538) [52,65] r=-1 lpr=289568 pi=288989-289566/12 crt=0'0 inactive NOTIFY] null
2014-05-25 00:18:16.756599 7fbfa35b7700 10 log is not dirty
2014-05-25 00:18:16.756600 7fbfa2bb6700 20 osd.58 350946 get_map 289570 - loading and decoding 0x10f15e00
2014-05-25 00:18:16.756695 7fbfa2bb6700 20 osd.58 350946 get_map 289571 - loading and decoding 0x10f15e00

[root@storage0102-ib ceph]# tail ceph-osd.7.log
2014-05-25 00:07:07.547447 7f7d1649b700 20 osd.7 350946 get_map 299068 - loading and decoding 0x35ec200
2014-05-25 00:07:07.547542 7f7d1649b700 20 osd.7 350946 get_map 299069 - loading and decoding 0x35ec200
2014-05-25 00:07:07.547636 7f7d1649b700 20 osd.7 350946 get_map 299070 - loading and decoding 0x35ec200
2014-05-25 00:07:07.547730 7f7d1649b700 20 osd.7 350946 get_map 299071 - loading and decoding 0x35ec200
2014-05-25 00:07:07.547825 7f7d1649b700 20 osd.7 350946 get_map 299072 - loading and decoding 0x35ec200
2014-05-25 00:07:07.547920 7f7d1649b700 20 osd.7 350946 get_map 299073 - loading and decoding 0x35ec200
2014-05-25 00:07:07.548014 7f7d1649b700 20 osd.7 350946 get_map 299074 - loading and decoding 0x35ec200
2014-05-25 00:07:07.548116 7f7d1649b700 20 osd.7 350946 get_map 299075 - loading and decoding 0x35ec200
2014-05-25 00:07:07.548211 7f7d1649b700 20 osd.7 350946 get_map 299076 - loading and decoding 0x35ec200
2014-05-25 00:07:07.548305 7f7d1649b700 20 osd.7 350946 get_map 299077 - loading and decoding 0x35ec200
[root@storage0102-ib ceph]# tail ceph-osd.7.log
2014-05-25 00:07:08.764574 7f7d1649b700 20 osd.7 350946 get_map 299070 - loading and decoding 0x35ec200
2014-05-25 00:07:08.764565 7f7d15a9a700 10 osd.7 pg_epoch: 298887 pg[0.64a( empty local-les=298850 n=0 ec=1 les/c 298850/298850 298887/298887/298887) [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 00:07:08.764609 7f7d15a9a700 10 osd.7 pg_epoch: 298887 pg[0.64a( empty local-les=298850 n=0 ec=1 les/c 298850/298850 298887/298887/298887) [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 00:07:08.764630 7f7d15a9a700 20 osd.7 pg_epoch: 298887 pg[0.64a( empty local-les=298850 n=0 ec=1 les/c 298850/298850 298887/298887/298887) [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 298887 while current is 298887
2014-05-25 00:07:08.764643 7f7d15a9a700 10 osd.7 350946 advance_pg advanced by max 200 past min epoch 298887 ... will requeue pg[0.64a( empty local-les=298850 n=0 ec=1 les/c 298850/298850 298887/298887/298887) [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 inactive NOTIFY]
2014-05-25 00:07:08.764654 7f7d15a9a700 10 osd.7 pg_epoch: 298887 pg[0.64a( empty local-les=298850 n=0 ec=1 les/c 298850/298850 298887/298887/298887) [113,86] r=-1 lpr=298887 pi=298849-298886/1 crt=0'0 inactive NOTIFY] null
2014-05-25 00:07:08.764675 7f7d15a9a700 10 log is not dirty
2014-05-25 00:07:08.764712 7f7d1649b700 20 osd.7 350946 get_map 299071 - loading and decoding 0x35ec200
2014-05-25 00:07:08.764814 7f7d1649b700 20 osd.7 350946 get_map 299072 - loading and decoding 0x35ec200
2014-05-25 00:07:08.764915 7f7d1649b700 20 osd.7 350946 get_map 299073 - loading and decoding 0x35ec200

2014-05-25 00:16:44.291343 7f38c9551700 20 osd.93 350946 get_map 118210 - loading and decoding 0x35e2c00
2014-05-25 00:16:44.294767 7f38c8d50700 10 osd.93 pg_epoch: 118065 pg[2.752( empty local-les=118029 n=0 ec=1 les/c 118029/118029 118065/118065/117839) [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 00:16:44.294829 7f38c8d50700 20 osd.93 pg_epoch: 118065 pg[2.752( empty local-les=118029 n=0 ec=1 les/c 118029/118029 118065/118065/117839) [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 118065 while current is 118065
2014-05-25 00:16:44.294920 7f38c8d50700 10 osd.93 350946 advance_pg advanced by max 200 past min epoch 118065 ... will requeue pg[2.752( empty local-les=118029 n=0 ec=1 les/c 118029/118029 118065/118065/117839) [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 inactive NOTIFY]
2014-05-25 00:16:44.294929 7f38c8d50700 10 osd.93 pg_epoch: 118065 pg[2.752( empty local-les=118029 n=0 ec=1 les/c 118029/118029 118065/118065/117839) [14,126] r=-1 lpr=118065 pi=529-118064/6419 crt=0'0 inactive NOTIFY] null
2014-05-25 00:16:44.294954 7f38c8d50700 10 log is not dirty
2014-05-25 00:16:44.295162 7f38c9551700 20 osd.93 350946 get_map 118211 - loading and decoding 0x35e2c00
2014-05-25 00:16:44.295205 7f38c9551700 20 osd.93 350946 get_map 118212 - loading and decoding 0x35e2c00
2014-05-25 00:16:44.295242 7f38c9551700 20 osd.93 350946 get_map 118213 - loading and decoding 0x35e2c00
2014-05-25 00:16:44.295277 7f38c9551700 20 osd.93 350946 get_map 118214 - loading and decoding 0x35e2c00
2014-05-25 00:16:44.295312 7f38c9551700 20 osd.93 350946 get_map 118215 - loading and decoding 0x35e2c00

Can you suggest what can be done next for these 32 OSDs? Your support in making the cluster healthy again will be highly appreciated.
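The loop can be confirmed mechanically by extracting the epoch from the "get_map N - loading and decoding" lines in successive `tail` snapshots and comparing the highest epoch seen: if a later snapshot's maximum is not higher than an earlier one's, the OSD is replaying rather than progressing. A sketch, with sample lines abbreviated from the osd.7 tails above:

```shell
# Report the highest get_map epoch in a chunk of OSD log fed on stdin.
max_epoch() {
  sed -n 's/.*get_map \([0-9]*\) - loading.*/\1/p' | sort -n | tail -1
}

first=$(max_epoch <<'EOF'
... get_map 299076 - loading and decoding 0x35ec200
... get_map 299077 - loading and decoding 0x35ec200
EOF
)
second=$(max_epoch <<'EOF'
... get_map 299072 - loading and decoding 0x35ec200
... get_map 299073 - loading and decoding 0x35ec200
EOF
)
echo "first tail max epoch:  $first"
echo "second tail max epoch: $second"  # later snapshot at a LOWER epoch => replaying, not progressing
```

As a rough sanity check on the expected catch-up time: the advance_pg lines say each PG advances by at most 200 epochs per pass, so going from epoch ~289568 to the latest 350946 would take at least (350946 - 289568) / 200 ≈ 307 passes per PG even when progressing normally; here the epochs are not moving forward at all.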

#9 Updated by karan singh almost 10 years ago

Hi

This problem occurred somewhere between 16 and 18 May. Before that, the cluster was healthy.

  1. As far as I remember correctly, the last thing I performed was a cluster upgrade to 0.80-rc1, followed a day later by creating several Cinder volumes.
  2. The cluster is not too big: it consists of 165 OSDs with a total capacity of 450 TB, and used capacity is very low.
  3. I tried osd map cache size = 75000 for a single OSD (osd.58).

Memory utilisation on this node spiked up.
After monitoring for a couple of hours, I found that the loading and decoding operation is not able to go beyond map epoch 289XXX, while the latest OSD map epoch is 350946. osd.58 keeps trying but cannot cross epoch 289XXX.
The rest of the 32 OSDs have the same problem: they cannot get past a certain epoch and keep repeating the same loop.
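For reference, the cache experiment in item 3 above corresponds to a ceph.conf fragment like the following (a sketch; 75000 is the value tried, scoped to osd.58 only so the other daemons on the node are unaffected):

```
[osd.58]
osd map cache size = 75000
debug osd = 20
```

A larger map cache only reduces re-reads of maps already fetched; it would not by itself explain or fix an OSD that never advances past a given epoch, which matches the observation that only memory usage changed.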

Here is some output from osd.58 for the specific epoch number where it is getting stuck.

[root@storage0106-ib ceph]# grep 289568 ceph-osd.58.log > /tmp/osd.58

[root@storage0106-ib tmp]# tail -100 osd.58
2014-05-25 15:10:41.222579 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.38c( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.222628 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.38c( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.222650 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.38c( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.222665 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.38c( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.222676 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.38c( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/289473) [72,2] r=-1 lpr=289568 pi=289471-289566/4 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.243297 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.1f4( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.255742 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.1f4( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.255773 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[0.1f4( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.255788 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.1f4( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.255799 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.1f4( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289501) [37,110] r=-1 lpr=289568 pi=571-289566/9373 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.263083 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.71e( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.277054 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.71e( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.277094 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[2.71e( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.277109 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.71e( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.277121 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.71e( empty local-les=289536 n=0 ec=1 les/c 289536/289536 289567/289567/289535) [78,124] r=-1 lpr=289568 pi=625-289566/11022 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.282697 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.282753 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.282774 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.282789 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.282800 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.302274 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.f9( empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.302328 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.f9( empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.302350 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[2.f9( empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.302365 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.f9( empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.302376 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.f9( empty local-les=289561 n=0 ec=1 les/c 289561/289561 289567/289567/289567) [79,21] r=-1 lpr=289568 pi=289560-289566/1 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.340166 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.626( empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.343261 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.6c( empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.343304 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.6c( empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.343325 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.6c( empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.343339 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.6c( empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.343351 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.6c( empty local-les=289454 n=0 ec=1 les/c 289454/289454 289567/289567/289567) [38,55] r=-1 lpr=289568 pi=289453-289566/1 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.347767 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.626( empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.347800 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.626( empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.347815 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.626( empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.347828 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.626( empty local-les=289415 n=0 ec=1 les/c 289415/289415 289567/289567/289414) [66,41] r=-1 lpr=289568 pi=124449-289566/5588 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.363524 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.3b5( empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.363556 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.3b5( empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.365664 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.3b5( empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.365680 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.3b5( empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.365691 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.3b5( empty local-les=289538 n=0 ec=1 les/c 289538/289541 289567/289567/289567) [89,129] r=-1 lpr=289568 pi=289535-289566/1 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.396342 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.604( empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.402783 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.604( empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.402816 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.604( empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.402831 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.604( empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.402842 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.604( empty local-les=289514 n=0 ec=1 les/c 289514/289514 289567/289567/289513) [3,109] r=-1 lpr=289568 pi=161020-289566/4775 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.405597 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.21d( empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.405733 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.21d( empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.405750 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.21d( empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.405764 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.21d( empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.405775 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.21d( empty local-les=289521 n=0 ec=1 les/c 289521/289521 289567/289567/286195) [77,137] r=-1 lpr=289568 pi=286192-289566/77 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.426717 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.782( empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.426757 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.782( empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.426772 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.782( empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.426786 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.782( empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.426797 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.782( empty local-les=289344 n=0 ec=1 les/c 289344/289344 289567/289567/289343) [97,38] r=-1 lpr=289568 pi=289340-289566/2 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.449454 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.7b2( empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.449509 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.7b2( empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.449526 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.7b2( empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.449547 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.7b2( empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.449559 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.7b2( empty local-les=289491 n=0 ec=1 les/c 289491/289491 289567/289567/289567) [32,60] r=-1 lpr=289568 pi=289490-289566/1 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.468461 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d0( empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.468511 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d0( empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.468533 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.1d0( empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.468560 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.1d0( empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.468573 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d0( empty local-les=289490 n=0 ec=1 les/c 289490/289490 289567/289567/289481) [107,101] r=-1 lpr=289568 pi=289479-289566/4 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.495508 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.495594 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.495618 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.495633 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.495645 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[2.71d( empty local-les=289504 n=0 ec=1 les/c 289504/289504 289567/289567/288958) [38,2] r=-1 lpr=289568 pi=288954-289566/17 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.508936 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.512( empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.508975 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.512( empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.508991 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.512( empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.509005 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.512( empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.509016 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.512( empty local-les=289554 n=0 ec=1 les/c 289554/289554 289567/289567/289567) [61,70] r=-1 lpr=289568 pi=289553-289566/1 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.541973 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1ee( empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.542031 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1ee( empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.542049 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.1ee( empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.542064 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.1ee( empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.542075 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1ee( empty local-les=289331 n=0 ec=1 les/c 289331/289331 289567/289567/289567) [146,108] r=-1 lpr=289568 pi=289330-289566/1 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.553850 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[3.7( v 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.553892 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[3.7( v 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.553915 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[3.7( v 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.553930 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[3.7( v 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY]
2014-05-25 15:10:41.553949 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[3.7( v 289523'6912 (193506'3911,289523'6912] local-les=289417 n=5 ec=71845 les/c 289417/289417 289567/289567/289567) [152,85] r=-1 lpr=289568 pi=289416-289566/1 crt=286267'6814 lcod 0'0 inactive NOTIFY] null
2014-05-25 15:10:41.578314 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d8( empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.589983 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d8( empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.590025 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[0.1d8( empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.590042 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.1d8( empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.590055 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[0.1d8( empty local-les=289535 n=0 ec=1 les/c 289535/289535 289567/289567/289534) [46,47] r=-1 lpr=289568 pi=629-289566/8210 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.594791 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.185( empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.607898 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.185( empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.607931 7fbc0b54c700 20 osd.58 pg_epoch: 289568 pg[0.185( empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.607946 7fbc0b54c700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[0.185( empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.607959 7fbc0b54c700 10 osd.58 pg_epoch: 289568 pg[0.185( empty local-les=289526 n=0 ec=1 les/c 289526/289526 289567/289567/289502) [121,89] r=-1 lpr=289568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] null
2014-05-25 15:10:41.614492 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.35d( empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] handle_activate_map
2014-05-25 15:10:41.614528 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.35d( empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] take_waiters
2014-05-25 15:10:41.614548 7fbc0ab4b700 20 osd.58 pg_epoch: 289568 pg[2.35d( empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] handle_activate_map: Not dirtying info: last_persisted is 289568 while current is 289568
2014-05-25 15:10:41.614574 7fbc0ab4b700 10 osd.58 350946 advance_pg advanced by max 200 past min epoch 289568 ... will requeue pg[2.35d( empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY]
2014-05-25 15:10:41.614586 7fbc0ab4b700 10 osd.58 pg_epoch: 289568 pg[2.35d( empty local-les=289560 n=0 ec=1 les/c 289560/289560 289567/289567/289559) [86,89] r=-1 lpr=289568 pi=289555-289566/2 crt=0'0 inactive NOTIFY] null
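The log excerpt above shows the same `advance_pg advanced by max 200 past min epoch ... will requeue pg[...]` line repeating for PG after PG at the same epoch, which is the signature of the stuck map-advance loop. A quick way to confirm an OSD is caught in it is to count how often each (epoch, PG) pair is requeued. The sketch below is not part of the ticket; the regex is derived from the log lines shown here, and the log format may differ across Ceph versions:

```python
import re
from collections import Counter

# Matches the repeating "advance_pg ... will requeue pg[<pgid>(" lines
# seen in the paste above. Derived from this ticket's logs; other Ceph
# versions may format these lines differently.
REQUEUE_RE = re.compile(
    r"advance_pg advanced by max \d+ past min epoch (\d+)"
    r" .* will requeue pg\[([0-9a-f]+\.[0-9a-f]+)\("
)

def count_requeues(lines):
    """Return a Counter keyed by (epoch, pgid) for requeue log lines."""
    counts = Counter()
    for line in lines:
        m = REQUEUE_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts

if __name__ == "__main__":
    # Feed it an OSD log; a PG requeued many times at one epoch
    # suggests the OSD is spinning in the advance loop.
    sample = [
        "2014-05-25 15:10:41.402831 7fbc0b54c700 10 osd.58 350946 "
        "advance_pg advanced by max 200 past min epoch 289568 ... "
        "will requeue pg[2.604( empty local-les=289514 ...)",
    ]
    print(count_requeues(sample))
```

In a healthy recovery each PG should be requeued only a handful of times as the OSD catches up on maps; the same epoch dominating the counts across many PGs matches the behavior reported in this bug.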

#10 Updated by Sage Weil almost 10 years ago

  • Subject changed from Firefly OSD problem , multiple osds are getting out , down and not coming up: fault with nothing to send, going to standby to osd: skipping missing maps broken
  • Status changed from Need More Info to Fix Under Review

see wip-map-advance

#11 Updated by Sage Weil almost 10 years ago

  • Status changed from Fix Under Review to Resolved

#12 Updated by karan singh almost 10 years ago

Unfortunately I was running a development release of Ceph. Sage helped me with two possible fixes.

1) Apply the wip-map-advance patch from gitbuilder.
2) Downgrade the cluster to the latest stable Firefly release, 0.80.1.

I tried both of these fixes and both worked for me.

Thanks, Sage, for your support during this problem.