Project

General

Profile

notes.txt

Colin McCabe, 11/18/2010 09:50 PM

Download (20.4 KB)

 
1
### OSD0: OSD::handle_pg_create for PG 0.5 on osd0
2
### Here, we can see that PG 0.5 is originally created with OSD0 as the primary and OSD1 as a replica.
3
2010-11-18 15:03:05.123870 7f7d89ea6710 osd0 2 mkpg 0.5 e1
4
2010-11-18 15:03:05.123885 7f7d89ea6710 osd0 2 project_pg_history 0.5 from 1 to 2, start ec=0 les=0 0/0/0
5
2010-11-18 15:03:05.123905 7f7d89ea6710 osd0 2 project_pg_history 0.5 changed in 2 from [] -> [0]
6
2010-11-18 15:03:05.123912 7f7d89ea6710 osd0 2 project_pg_history 0.5 changed in 2 from [] -> [0]
7
2010-11-18 15:03:05.123919 7f7d89ea6710 osd0 2 project_pg_history 0.5 primary changed in 2
8
2010-11-18 15:03:05.123925 7f7d89ea6710 osd0 2 project_pg_history end ec=0 les=0 2/2/2
9
2010-11-18 15:03:05.123935 7f7d89ea6710 osd0 2 calc_priors_during 0.5 [1,2)
10
2010-11-18 15:03:05.123953 7f7d89ea6710 osd0 2   0.5 in epoch 1 was []
11
2010-11-18 15:03:05.123960 7f7d89ea6710 osd0 2 calc_priors_during 0.5 [1,2) =
12
2010-11-18 15:03:05.123967 7f7d89ea6710 osd0 2 mkpg 0.5 e1 h ec=0 les=0 2/2/2 : querying priors
13
2010-11-18 15:03:05.123976 7f7d89ea6710 osd0 2 can_create_pg 0.5 - can create now
14
2010-11-18 15:03:05.123986 7f7d89ea6710 osd0 2 _create_lock_new_pg pgid 0.5 -> [0]
15
2010-11-18 15:03:05.123993 7f7d89ea6710 osd0 2 _open_lock_pg 0.5
16
2010-11-18 15:03:05.123999 7f7d89ea6710 osd0 2 _get_pool 0 5 -> 6
17
2010-11-18 15:03:05.124052 7f7d89ea6710 filestore(dev/osd0) collection_stat dev/osd0/current/0.5_head = -2
18
2010-11-18 15:03:05.124082 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 inactive] write_info info 309
19
2010-11-18 15:03:05.124099 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 inactive] write_info bigbl 8
20
2010-11-18 15:03:05.124115 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 inactive] write_log
21
2010-11-18 15:03:05.124132 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 inactive] write_log to 0~0
22
2010-11-18 15:03:05.124143 7f7d89ea6710 osd0 2 _create_lock_new_pg pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 inactive]
23
2010-11-18 15:03:05.124154 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 inactive] peer up [0], acting [0]
24
2010-11-18 15:03:05.124164 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] clear_prior
25
2010-11-18 15:03:05.124175 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] build_prior  have started since joining this pg
26
2010-11-18 15:03:05.124186 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] build_prior =  down =  ...
27
2010-11-18 15:03:05.124197 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] peer prior_set is
28
2010-11-18 15:03:05.124207 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] num_missing = 0, num_unfound = 0
29
2010-11-18 15:03:05.124218 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] up_thru 0 < same_since 2, must notify monitor
30
2010-11-18 15:03:05.124228 7f7d89ea6710 osd0 2 queue_want_up_thru want 2 <= queued 2, currently 0
31
2010-11-18 15:03:05.124238 7f7d89ea6710 osd0 2 pg[0.5( empty n=0 ec=2 les=2 2/2/2) [0] r=0 mlcod 0'0 peering] update_stats 2'1
32

    
33
### osd1: OSD::project_pg_history for PG 0.5
34
2010-11-18 15:04:59.732991 7fe83b066710 osd1 6 project_pg_history 0.5 from 3 to 6, start ec=2 les=2 3/3/2
35
2010-11-18 15:04:59.733018 7fe83b066710 osd1 6 project_pg_history end ec=2 les=2 3/3/2
36
2010-11-18 15:04:59.733026 7fe83b066710 osd1 6 _create_lock_pg 0.5
37
2010-11-18 15:04:59.733032 7fe83b066710 osd1 6 _open_lock_pg 0.5
38
2010-11-18 15:04:59.733038 7fe83b066710 osd1 6 _get_pool 0 5 -> 6
39
2010-11-18 15:04:59.733098 7fe83b066710 filestore(dev/osd1) collection_stat dev/osd1/current/0.5_head = -2
40
2010-11-18 15:04:59.733125 7fe83b066710 osd1 6 pg[0.5( DNE empty n=0 ec=0 les=0 0/0/0) [] r=0 mlcod 0'0 inactive] write_info info 309
41
2010-11-18 15:04:59.733143 7fe83b066710 osd1 6 pg[0.5( DNE empty n=0 ec=0 les=0 0/0/0) [] r=0 mlcod 0'0 inactive] write_info bigbl 8
42
2010-11-18 15:04:59.733157 7fe83b066710 osd1 6 pg[0.5( DNE empty n=0 ec=0 les=0 0/0/0) [] r=0 mlcod 0'0 inactive] write_log
43
2010-11-18 15:04:59.733179 7fe83b066710 osd1 6 pg[0.5( DNE empty n=0 ec=0 les=0 0/0/0) [] r=0 mlcod 0'0 inactive] write_log to 0~0
44
2010-11-18 15:04:59.733208 7fe83b066710 osd1 6 pg[0.5( empty n=0 ec=2 les=2 3/3/2) [0,1] r=1 inactive] write_info info 309
45
2010-11-18 15:04:59.733226 7fe83b066710 osd1 6 pg[0.5( empty n=0 ec=2 les=2 3/3/2) [0,1] r=1 inactive] write_info bigbl 8
46
2010-11-18 15:04:59.733241 7fe83b066710 osd1 6 pg[0.5( empty n=0 ec=2 les=2 3/3/2) [0,1] r=1 inactive] write_log
47
2010-11-18 15:04:59.733258 7fe83b066710 osd1 6 pg[0.5( empty n=0 ec=2 les=2 3/3/2) [0,1] r=1 inactive] write_log to 0~0
48
2010-11-18 15:04:59.733287 7fe83b066710 filestore(dev/osd1) queue_transactions (writeahead) 35 0x1716000
49
2010-11-18 15:04:59.733296 7fe83b066710 journal journal_transactions 35
50
2010-11-18 15:04:59.733312 7fe83b066710 journal submit_entry seq 35 len 1216 (0x16f2aa0)
51
2010-11-18 15:04:59.733332 7fe83b066710 osd1 6 pg[0.5( empty n=0 ec=2 les=2 3/3/2) [0,1] r=1 inactive] dne (before), but i am role 1
52
2010-11-18 15:04:59.733353 7fe83b066710 osd1 6 pg[0.5( empty n=0 ec=2 les=2 3/3/2) [0,1] r=1 inactive] sending info
53
### Hmm. We are sending info from osd1 to osd0 here. This happens inside OSD::handle_pg_query
54

    
55
### OSD1: OSD::do_notifies
56
### The comment for this function says:
57
### Send an MOSDPGNotify to a primary, with a list of PGs that I have content for, and they are primary for.
58
### So osd1 is notifying osd0 about the PGs here! osd0 must be the primary for 0.5.
59
2010-11-18 15:04:59.739228 7fe83b066710 osd1 6 do_notify osd0 on 14 PGs
60
2010-11-18 15:04:59.739239 7fe83b066710 -- 10.3.14.10:6802/29157 --> osd0 10.3.14.10:6800/29113 -- PGnot v1 -- ?+0 0x17b1540
61

    
62
### OSD0 is marked as 'out'
63
+ ./ceph osd out 0
64
2010-11-18 15:05:00.809554 mon <- [osd,out,0]
65
2010-11-18 15:05:01.606338 mon0 -> 'marked out osd0' (0)
66

    
67
### Entering epoch 14, the final epoch.
68
### Epoch 14 stats:
69
created 2010-11-18 15:02:44.930075
70
modifed 2010-11-18 15:05:28.319869
71
max_osd 3
72
osd0 up    outup_from 2 up_thru 3 down_at 0 last_clean_interval 0-0 10.3.14.10:6800/29113 10.3.14.10:6801/29113
73
osd1 up    in  weight 1up_from 10 up_thru 11 down_at 8 last_clean_interval 3-9 10.3.14.10:6802/29157 10.3.14.10:6803/29157
74
osd2 up    in  weight 1up_from 4 up_thru 11 down_at 0 last_clean_interval 0-0 10.3.14.10:6804/29550 10.3.14.10:6805/29550
75

    
76
### OSD2 sends "some info" to osd1 in response to its MOSDPGQuery
77
### This info does *not* include PG 0.5, because OSD2 doesn't know about that PG.
78
2010-11-18 15:05:01.931574 7fd48dc51710 osd2 6 handle_pg_query from osd1 epoch 6
79
...
80
2010-11-18 15:05:01.931950 7fd48dc51710 osd2 6 pg[0.7( empty n=0 ec=2 les=2 4/4/3) [1,2] r=1 inactive] sending info
81
2010-11-18 15:05:01.932323 7fd48dc51710 osd2 6 pg[1.6( empty n=0 ec=2 les=2 4/4/3) [1,2] r=1 inactive] sending info
82
2010-11-18 15:05:01.933848 7fd48dc51710 osd2 6 pg[2.5( empty n=0 ec=2 les=2 4/4/3) [1,2] r=1 inactive] sending info
83
2010-11-18 15:05:01.934229 7fd48dc51710 osd2 6 pg[3.4( empty n=0 ec=2 les=2 4/4/3) [1,2] r=1 inactive] sending info
84
2010-11-18 15:05:01.934860 7fd48dc51710 osd2 6 pg[0.0( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
85
2010-11-18 15:05:01.935237 7fd48dc51710 osd2 6 pg[0.1( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
86
2010-11-18 15:05:01.935613 7fd48dc51710 osd2 6 pg[0.a( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
87
2010-11-18 15:05:01.935965 7fd48dc51710 osd2 6 pg[0.b( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
88
2010-11-18 15:05:01.936346 7fd48dc51710 osd2 6 pg[0.0p0( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
89
2010-11-18 15:05:01.937602 7fd48dc51710 osd2 6 pg[0.1p0( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
90
2010-11-18 15:05:01.937973 7fd48dc51710 osd2 6 pg[1.0( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
91
2010-11-18 15:05:01.938332 7fd48dc51710 osd2 6 pg[1.9( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
92
2010-11-18 15:05:01.938687 7fd48dc51710 osd2 6 pg[1.a( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
93
2010-11-18 15:05:01.939052 7fd48dc51710 osd2 6 pg[1.b( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
94
2010-11-18 15:05:01.939417 7fd48dc51710 osd2 6 pg[1.0p0( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
95
2010-11-18 15:05:01.939768 7fd48dc51710 osd2 6 pg[2.8( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
96
2010-11-18 15:05:01.940120 7fd48dc51710 osd2 6 pg[2.9( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
97
2010-11-18 15:05:01.940471 7fd48dc51710 osd2 6 pg[2.a( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
98
2010-11-18 15:05:01.940831 7fd48dc51710 osd2 6 pg[3.7( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
99
2010-11-18 15:05:01.941188 7fd48dc51710 osd2 6 pg[3.8( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
100
2010-11-18 15:05:01.941542 7fd48dc51710 osd2 6 pg[3.9( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
101
2010-11-18 15:05:01.941916 7fd48dc51710 osd2 6 pg[3.1p0( empty n=0 ec=2 les=2 4/4/2) [0,2] r=1 inactive] sending info
102

    
103
### OSD2: OSD::do_mon_report
104
### OSD2: OSD::send_pg_stats
105
### Things to notice here:
106
### 1) There is no record of PG 0.5 here.
107
### We don't know about it since, as we've seen, OSD 1 sent the PG information to OSD0 instead.
108
2010-11-18 15:05:15.361342 7fd49245a710 osd2 9 send_pg_stats
109
2010-11-18 15:05:15.361378 7fd49245a710 osd2 9 send_pg_stats - 42 pgs updated
110
2010-11-18 15:05:15.361397 7fd49245a710 osd2 9  sending 3.1p2 4'4
111
2010-11-18 15:05:15.361412 7fd49245a710 osd2 9  sending 2.0p2 4'4
112
2010-11-18 15:05:15.361426 7fd49245a710 osd2 9  sending 3.0p2 4'4
113
2010-11-18 15:05:15.361440 7fd49245a710 osd2 9  sending 2.1p2 4'4
114
2010-11-18 15:05:15.361457 7fd49245a710 osd2 9  sending 1.1p2 4'4
115
2010-11-18 15:05:15.361471 7fd49245a710 osd2 9  sending 0.0p2 4'4
116
2010-11-18 15:05:15.361485 7fd49245a710 osd2 9  sending 1.0p2 4'4
117
2010-11-18 15:05:15.361499 7fd49245a710 osd2 9  sending 0.1p2 4'4
118
2010-11-18 15:05:15.361512 7fd49245a710 osd2 9  sending 1.8 4'4
119
2010-11-18 15:05:15.361529 7fd49245a710 osd2 9  sending 0.9 4'4
120
2010-11-18 15:05:15.361542 7fd49245a710 osd2 9  sending 0.8 4'4
121
2010-11-18 15:05:15.361556 7fd49245a710 osd2 9  sending 3.5 4'4
122
2010-11-18 15:05:15.361571 7fd49245a710 osd2 9  sending 2.4 4'4
123
2010-11-18 15:05:15.361585 7fd49245a710 osd2 9  sending 1.7 4'4
124
2010-11-18 15:05:15.361598 7fd49245a710 osd2 9  sending 0.6 4'4
125
2010-11-18 15:05:15.361612 7fd49245a710 osd2 9  sending 3.6 4'4
126
2010-11-18 15:05:15.361626 7fd49245a710 osd2 9  sending 2.7 4'4
127
2010-11-18 15:05:15.361639 7fd49245a710 osd2 9  sending 2.6 4'4
128
2010-11-18 15:05:15.361653 7fd49245a710 osd2 9  sending 1.5 4'4
129
2010-11-18 15:05:15.361666 7fd49245a710 osd2 9  sending 3.3 4'4
130
2010-11-18 15:05:15.361681 7fd49245a710 osd2 9  sending 0.0p0 8'1
131
2010-11-18 15:05:15.361696 7fd49245a710 osd2 9  sending 1.0p0 8'1
132
2010-11-18 15:05:15.361710 7fd49245a710 osd2 9  sending 0.1p0 8'1
133
2010-11-18 15:05:15.361724 7fd49245a710 osd2 9  sending 3.1p0 8'1
134
2010-11-18 15:05:15.361738 7fd49245a710 osd2 9  sending 3.8 8'1
135
2010-11-18 15:05:15.361752 7fd49245a710 osd2 9  sending 2.9 8'1
136
2010-11-18 15:05:15.361766 7fd49245a710 osd2 9  sending 1.a 8'1
137
2010-11-18 15:05:15.361782 7fd49245a710 osd2 9  sending 0.b 8'1
138
2010-11-18 15:05:15.361796 7fd49245a710 osd2 9  sending 3.9 8'1
139
2010-11-18 15:05:15.361811 7fd49245a710 osd2 9  sending 2.8 8'1
140
2010-11-18 15:05:15.361824 7fd49245a710 osd2 9  sending 1.b 8'1
141
2010-11-18 15:05:15.361839 7fd49245a710 osd2 9  sending 0.a 8'1
142
2010-11-18 15:05:15.361852 7fd49245a710 osd2 9  sending 2.a 8'1
143
2010-11-18 15:05:15.361866 7fd49245a710 osd2 9  sending 1.9 8'1
144
2010-11-18 15:05:15.361881 7fd49245a710 osd2 9  sending 3.4 8'1
145
2010-11-18 15:05:15.361895 7fd49245a710 osd2 9  sending 2.5 8'1
146
2010-11-18 15:05:15.361909 7fd49245a710 osd2 9  sending 1.6 8'1
147
2010-11-18 15:05:15.361923 7fd49245a710 osd2 9  sending 0.7 8'1
148
2010-11-18 15:05:15.361937 7fd49245a710 osd2 9  sending 3.7 8'1
149
2010-11-18 15:05:15.361953 7fd49245a710 osd2 9  sending 1.0 8'1
150
2010-11-18 15:05:15.361967 7fd49245a710 osd2 9  sending 0.1 8'1
151
2010-11-18 15:05:15.361991 7fd49245a710 osd2 9  sending 0.0 8'1
152
2010-11-18 15:05:15.362001 7fd49245a710 monclient: _send_mon_message to mon.b at mon1 10.3.14.10:6790/0
153
2010-11-18 15:05:15.362010 7fd49245a710 -- 10.3.14.10:6804/29550 --> mon1 10.3.14.10:6790/0 -- pg_stats(42 pgs v 9) v1 -- ?+0 0x2ea1000
154
2010-11-18 15:05:15.362213 7fd48dc51710 osd2 9 do_waiters -- start
155
2010-11-18 15:05:15.362238 7fd48dc51710 osd2 9 _dispatch 0x29f0b40 osd_op(mds0.1:5 1.00000000.inode [writefull 0~425] 1.e14 RETRY) v1
156
2010-11-18 15:05:15.362260 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x29f0b40
157
2010-11-18 15:05:15.362274 7fd48dc51710 osd2 9 _share_map_incoming mds0 10.3.14.10:6806/30589 8
158
2010-11-18 15:05:15.362298 7fd48dc51710 osd2 9 hit non-existent pg 1.4, waiting
159
2010-11-18 15:05:15.362544 7fd48dc51710 osd2 9 hit non-existent pg 1.4, waiting
160
2010-11-18 15:05:15.362553 7fd48dc51710 osd2 9 _dispatch 0x29f06c0 osd_op(mds0.1:6 600.00000000 [tmapput 0~205,setxattr path (11),setxattr parent (26)]  1.41b0 RETRY) v1
161
2010-11-18 15:05:15.362572 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x29f06c0
162
2010-11-18 15:05:15.362583 7fd48dc51710 osd2 9 _share_map_incoming mds0 10.3.14.10:6806/30589 8
163
2010-11-18 15:05:15.362594 7fd48dc51710 osd2 9 request for pool=1 (metadata) owner=0 perm=7 may_read=0 may_write=1 may_exec=0 require_exec_caps=0
164
2010-11-18 15:05:15.362604 7fd48dc51710 osd2 9 pg[1.0( empty n=0 ec=2 les=2 8/8/8) [2] r=0 mlcod 0'0 !hml degraded+peering] not active (yet)
165
2010-11-18 15:05:15.362619 7fd48dc51710 osd2 9 _dispatch 0x2a146c0 osd_op(mds0.1:7 100.00000000 [tmapput 0~903,setxattr path (5)] 1.5ab3 RETRY) v1
166
2010-11-18 15:05:15.362634 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x2a146c0
167
2010-11-18 15:05:15.362644 7fd48dc51710 osd2 9 _share_map_incoming mds0 10.3.14.10:6806/30589 8
168
2010-11-18 15:05:15.362653 7fd48dc51710 osd2 9 hit non-existent pg 1.3, waiting
169
2010-11-18 15:05:15.362662 7fd48dc51710 osd2 9 _dispatch 0x29f4900 osd_op(mds1.1:3 601.00000000 [tmapput 0~205,setxattr path (11),setxattr parent (26)]  1.290 RETRY) v1
170
2010-11-18 15:05:15.362678 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x29f4900
171
2010-11-18 15:05:15.362689 7fd48dc51710 osd2 9 _share_map_incoming mds1 10.3.14.10:6808/31526 8
172
2010-11-18 15:05:15.362700 7fd48dc51710 osd2 9 request for pool=1 (metadata) owner=0 perm=7 may_read=0 may_write=1 may_exec=0 require_exec_caps=0
173
2010-11-18 15:05:15.362718 7fd48dc51710 osd2 9 pg[1.0( empty n=0 ec=2 les=2 8/8/8) [2] r=0 mlcod 0'0 !hml degraded+peering] not active (yet)
174
2010-11-18 15:05:15.362733 7fd48dc51710 osd2 9 _dispatch 0x2a14d80 osd_op(mds1.1:6 mds1_inotable [writefull 0~29] 1.75db RETRY) v1
175
2010-11-18 15:05:15.362745 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x2a14d80
176
2010-11-18 15:05:15.362755 7fd48dc51710 osd2 9 _share_map_incoming mds1 10.3.14.10:6808/31526 8
177
2010-11-18 15:05:15.362767 7fd48dc51710 osd2 9 request for pool=1 (metadata) owner=0 perm=7 may_read=0 may_write=1 may_exec=0 require_exec_caps=0
178
2010-11-18 15:05:15.362775 7fd48dc51710 osd2 9 pg[1.b( empty n=0 ec=2 les=2 8/8/8) [2] r=0 mlcod 0'0 !hml degraded+peering] not active (yet)
179
2010-11-18 15:05:15.362789 7fd48dc51710 osd2 9 _dispatch 0x2d246c0 osd_op(mds1.1:4 101.00000000 [tmapput 0~903,setxattr path (5)] 1.4883 RETRY) v1
180
2010-11-18 15:05:15.362803 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x2d246c0
181
2010-11-18 15:05:15.362813 7fd48dc51710 osd2 9 _share_map_incoming mds1 10.3.14.10:6808/31526 8
182
2010-11-18 15:05:15.362822 7fd48dc51710 osd2 9 hit non-existent pg 1.3, waiting
183
2010-11-18 15:05:15.362831 7fd48dc51710 osd2 9 _dispatch 0x2d24480 osd_op(mds1.1:2 201.00000001 [write 0~107] 1.a2db RETRY) v1
184
2010-11-18 15:05:15.362843 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x2d24480
185
2010-11-18 15:05:15.362853 7fd48dc51710 osd2 9 _share_map_incoming mds1 10.3.14.10:6808/31526 8
186
2010-11-18 15:05:15.362864 7fd48dc51710 osd2 9 request for pool=1 (metadata) owner=0 perm=7 may_read=0 may_write=1 may_exec=0 require_exec_caps=0
187
2010-11-18 15:05:15.362872 7fd48dc51710 osd2 9 pg[1.b( empty n=0 ec=2 les=2 8/8/8) [2] r=0 mlcod 0'0 !hml degraded+peering] not active (yet)
188
2010-11-18 15:05:15.362887 7fd48dc51710 osd2 9 _dispatch 0x2d24240 osd_op(mds1.1:7 mds1_sessionmap [writefull 0~17] 1.fe43 RETRY) v1
189
2010-11-18 15:05:15.362898 7fd48dc51710 osd2 9 require_same_or_newer_map 8 (i am 9) 0x2d24240
190
2010-11-18 15:05:15.362909 7fd48dc51710 osd2 9 _share_map_incoming mds1 10.3.14.10:6808/31526 8
191
2010-11-18 15:05:15.362918 7fd48dc51710 osd2 9 hit non-existent pg 1.3, waiting
192
2010-11-18 15:05:15.362926 7fd48dc51710 osd2 9 _dispatch 0x2d24d80 osd_op(client4113.0:1 obj01 [writefull 0~1000000] 0.92bd) v1
193
2010-11-18 15:05:15.362938 7fd48dc51710 osd2 9 require_same_or_newer_map 9 (i am 9) 0x2d24d80
194
2010-11-18 15:05:15.362949 7fd48dc51710 osd2 9 _share_map_incoming client4113 10.3.14.10:0/1691 9
195
2010-11-18 15:05:15.362959 7fd48dc51710 osd2 9 hit non-existent pg 0.5, waiting
196
2010-11-18 15:05:15.362968 7fd48dc51710 osd2 9 do_waiters -- finish
197

    
198
### OSD1 : do_peer for PG 0.5
199
2010-11-18 15:05:17.751121 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering] got osd0 info 0.5( empty n=0 ec=2 les=2 10/10/10)
200
2010-11-18 15:05:17.751147 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering] osd0 has stray content: 0.5( empty n=0 ec=2 les=2 10/10/10)
201
2010-11-18 15:05:17.751161 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering] peer up [1,2], acting [1,2]
202
2010-11-18 15:05:17.751172 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering] peer prior_set is 0,2
203
2010-11-18 15:05:17.751183 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering] recover_master_log
204
2010-11-18 15:05:17.751193 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering]  have info from osd0: 0.5( empty n=0 ec=2 les=2 10/10/10)
205
### OSD1: This is from PG::recover_master_log.
206
### OSD1: We're waiting for a response from osd2. This response NEVER COMES.
207
### Ok. time to find out if OSD2 ever received this query.
208
2010-11-18 15:05:17.751206 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering]  waiting for osd2
209
2010-11-18 15:05:17.751219 7fe83b066710 osd1 10 pg[0.5( empty n=0 ec=2 les=2 10/10/10) [1,2] r=0 mlcod 0'0 !hml peering] update_stats 10'2
210
2010-11-18 15:05:17.751244 7fe83b066710 filestore(dev/osd1) queue_transactions (writeahead) 165 0x13f41c0
211
2010-11-18 15:05:17.751252 7fe83b066710 journal journal_transactions 165
212
2010-11-18 15:05:17.751261 7fe83b066710 journal submit_entry seq 165 len 33 (0x1361960)
213

    
214
### OSD2: radostool blocks waiting for PG 0.5
215
### OSD2: Note: this is the *first* reference to PG 0.5 *ever* in the log for osd2
216
2010-11-18 15:05:33.161056 7fd48dc51710 osd2 14 _dispatch 0x2d24d80 osd_op(client4113.0:1 obj01 [writefull 0~1000000] 0.92bd) v1
217
2010-11-18 15:05:33.161069 7fd48dc51710 osd2 14 require_same_or_newer_map 9 (i am 14) 0x2d24d80
218
2010-11-18 15:05:33.161079 7fd48dc51710 osd2 14 _share_map_incoming client4113 10.3.14.10:0/1691 9
219
2010-11-18 15:05:33.161088 7fd48dc51710 osd2 14 client4113 has old map 9 < 14
220
2010-11-18 15:05:33.161094 7fd48dc51710 osd2 14 send_incremental_map 9 - > 14 to client4113 10.3.14.10:0/1691
221
2010-11-18 15:05:33.161131 7fd48dc51710 filestore(dev/osd2) read dev/osd2/current/meta/inc_osdmap.14_0 0~116 = 116
222
2010-11-18 15:05:33.161152 7fd48dc51710 filestore(dev/osd2) read dev/osd2/current/meta/inc_osdmap.13_0 0~116 = 116
223
2010-11-18 15:05:33.161170 7fd48dc51710 filestore(dev/osd2) read dev/osd2/current/meta/inc_osdmap.12_0 0~116 = 116
224
2010-11-18 15:05:33.161187 7fd48dc51710 filestore(dev/osd2) read dev/osd2/current/meta/inc_osdmap.11_0 0~116 = 116
225
2010-11-18 15:05:33.161214 7fd48dc51710 filestore(dev/osd2) read dev/osd2/current/meta/inc_osdmap.10_0 0~424 = 424
226
2010-11-18 15:05:33.161224 7fd48dc51710 -- 10.3.14.10:6804/29550 --> client4113 10.3.14.10:0/1691 -- osd_map(10,14) v1 -- ?+0 0x2e94000
227
### OSD2: This is in OSD::handle_op
228
### OSD2: But handle_op can hardly be expected to do anything else. The real bug must be earlier on. Why can't we ever get out of peering?
229
2010-11-18 15:05:33.161260 7fd48dc51710 osd2 14 hit non-existent pg 0.5, waiting
230

    
231
### Final PG map
232
cmccabe@flab:~/src/ceph2/src$ ./ceph pg map 0.5
233
2010-11-18 21:19:46.532101 mon <- [pg,map,0.5]
234
2010-11-18 21:19:46.532658 mon1 -> 'osdmap e14 pg 0.5 -> up [1,2] acting [1,2]' (0)