Ceph : Issues
https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2024-03-29T03:12:41Z
Ceph
Redmine
CephFS - Bug #65217 (Fix Under Review): cephfs: add fscrypt protection support from non-fscrypt c...
https://tracker.ceph.com/issues/65217
2024-03-29T03:12:41Z
Xiubo Li
xiubli@redhat.com
<p>Clients that do not support fscrypt can execute operations that may cause unrecoverable data loss. Add protection on the MDS so that it prevents these clients from executing some operations.</p>
<p>Note, however, that clients will still be able to corrupt encrypted files by appending data to them, and they will still be able to read the encrypted data from those files.</p>
<p>Clients without fscrypt support will be allowed to read encrypted files and directories, but not to change their contents. For directories, we will not allow such clients to create new subdirectories or files under an encrypted directory; otherwise the kclient will fail to decrypt the dentry names:</p>
<pre>
<7>[201192.339126] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] __prepare_send_request: 0000000071b24ca5 tid 30 readdir (attempt 1)
<7>[201192.339144] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] set_request_path_attr: inode 0000000039d46bc2 10000007491.fffffffffffffffe
<7>[201192.339345] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] ceph_encode_inode_release: 0000000039d46bc2 10000007491.fffffffffffffffe mds0 used|dirty p drop Fx unless -
<7>[201192.339366] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] ceph_encode_inode_release: 0000000039d46bc2 10000007491.fffffffffffffffe cap 00000000b0491451 pAsLsXsFs (force)
<7>[201192.339386] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] __prepare_send_request: r_parent = 0000000000000000
<7>[201192.339448] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] ceph_mdsc_wait_request: do_request waiting
<7>[201192.342097] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] handle_reply: handle_reply 0000000071b24ca5
<7>[201192.342118] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] __unregister_request: 0000000071b24ca5 tid 30
<7>[201192.342134] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] handle_reply: tid 30 result 0
<7>[201192.342214] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] parse_reply_info_readdir: parsed dir dname 'fscrypt_crash_file'
<3>[201192.342232] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426]: unable to decode ~Ç+Ê<9b>pt_crash_file, got -5
<3>[201192.342245] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426]: problem parsing dir contents -5
<3>[201192.342256] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426]: mds parse_reply err -5
<7>[201192.342269] header: 00000000: 71 00 00 00 00 00 00 00 1e 00 00 00 00 00 00 00 q...............
<7>[201192.342281] header: 00000010: 1a 00 7f 00 01 00 65 03 00 00 00 00 00 00 00 00 ......e.........
<7>[201192.342292] header: 00000020: 00 00 00 00 02 00 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.342299] header: 00000030: 00 00 00 00 00 .....
<7>[201192.342309] front: 00000000: 05 03 00 00 00 00 00 00 f2 00 00 00 01 00 01 7a ...............z
<7>[201192.342320] front: 00000010: 01 00 00 07 01 74 01 00 00 91 74 00 00 00 01 00 .....t....t.....
<7>[201192.342327] front: 00000020: 00 fe ff ff ff ff ff ff ff 00 00 00 00 0a 0a 0a ................
<7>[201192.342338] front: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 55 01 00 .............U..
<7>[201192.342348] front: 00000040: 00 00 00 00 00 01 00 00 00 00 00 00 00 08 00 00 ................
<7>[201192.342359] front: 00000050: 00 00 00 00 00 01 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.342370] front: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342391] front: 00000070: 00 00 00 00 00 00 00 00 00 00 2f f1 03 66 4f 6f ........../..fOo
<7>[201192.342402] front: 00000080: c9 2a 2f f1 03 66 4f 6f c9 2a 1c f1 03 66 b7 5c .*/..fOo.*...f.\
<7>[201192.342412] front: 00000090: ec 2e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342423] front: 000000a0: 00 00 00 00 00 00 ff ff ff ff ff ff ff ff 01 00 ................
<7>[201192.342433] front: 000000b0: 00 00 ed 41 00 00 e8 03 00 00 e8 03 00 00 01 00 ...A............
<7>[201192.342444] front: 000000c0: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342454] front: 000000d0: 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ................
<7>[201192.342465] front: 000000e0: 00 00 01 00 00 00 00 00 00 00 2f f1 03 66 54 18 ........../..fT.
<7>[201192.342475] front: 000000f0: c0 2b 00 00 00 00 00 00 00 00 02 00 00 00 00 00 .+..............
<7>[201192.342486] front: 00000100: 00 00 00 00 00 00 ff ff ff ff ff ff ff ff 00 00 ................
<7>[201192.342497] front: 00000110: 00 00 01 01 10 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342507] front: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 1c f1 03 66 ...............f
<7>[201192.342518] front: 00000130: b7 5c ec 2e 01 00 00 00 00 00 00 00 ff ff ff ff .\..............
<7>[201192.342529] front: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342539] front: 00000150: 00 00 00 00 01 30 00 00 00 01 00 00 00 28 00 00 .....0.......(..
<7>[201192.342550] front: 00000160: 00 02 01 04 00 00 00 00 00 81 66 70 7d 58 e2 3b ..........fp}X.;
<7>[201192.342560] front: 00000170: 91 3b bc 4d 82 30 5b 68 a2 fd 80 c0 16 ac cb f5 .;.M.0[h........
<7>[201192.342570] front: 00000180: 38 bd ea de e4 f3 c4 e3 57 00 00 00 00 90 01 00 8.......W.......
<7>[201192.342581] front: 00000190: 00 01 01 0c 00 00 00 00 00 00 00 ff ff ff ff 00 ................
<7>[201192.342591] front: 000001a0: 00 00 00 01 00 00 00 01 07 12 00 00 00 7e c7 2b .............~.+
<7>[201192.342602] front: 000001b0: ca 9b 70 74 5f 63 72 61 73 68 5f 66 69 6c 65 02 ..pt_crash_file.
<7>[201192.342611] front: 000001c0: 01 0e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342618] front: 000001d0: 00 00 00 07 01 48 01 00 00 df 7a 00 00 00 01 00 .....H....z.....
<7>[201192.342624] front: 000001e0: 00 fe ff ff ff ff ff ff ff 00 00 00 00 08 00 00 ................
<7>[201192.342632] front: 000001f0: 00 00 00 00 00 01 00 00 00 00 00 00 00 55 0d 00 .............U..
<7>[201192.342640] front: 00000200: 00 00 00 00 00 d2 18 00 00 00 00 00 00 03 00 00 ................
<7>[201192.342646] front: 00000210: 00 00 00 00 00 01 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.342651] front: 00000220: 40 00 01 00 00 00 00 00 40 00 00 00 00 00 00 00 @.......@.......
<7>[201192.342657] front: 00000230: 00 00 00 00 00 00 03 00 00 00 2f f1 03 66 54 18 ........../..fT.
<7>[201192.342663] front: 00000240: c0 2b 2f f1 03 66 d7 26 73 2b 2f f1 03 66 d7 26 .+/..f.&s+/..f.&
<7>[201192.342668] front: 00000250: 73 2b 02 00 00 00 00 00 00 00 00 00 00 00 00 00 s+..............
<7>[201192.342674] front: 00000260: 00 00 00 00 00 00 ff ff ff ff ff ff ff ff 01 00 ................
<7>[201192.342680] front: 00000270: 00 00 a4 81 00 00 e8 03 00 00 e8 03 00 00 01 00 ................
<7>[201192.342685] front: 00000280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342691] front: 00000290: 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ................
<7>[201192.342696] front: 000002a0: 00 00 00 00 00 00 00 00 00 00 2f f1 03 66 54 18 ........../..fT.
<7>[201192.342702] front: 000002b0: c0 2b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .+..............
<7>[201192.342707] front: 000002c0: 00 00 04 00 00 00 00 00 00 00 ff ff ff ff ff ff ................
<7>[201192.342713] front: 000002d0: ff ff 00 00 00 00 01 01 10 00 00 00 00 00 00 00 ................
<7>[201192.342719] front: 000002e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342724] front: 000002f0: 2f f1 03 66 4f 6f c9 2a 01 00 00 00 00 00 00 00 /..fOo.*........
<7>[201192.342730] front: 00000300: ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342736] front: 00000310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342741] front: 00000320: 00 40 00 00 00 01 00 00 00 00 00 00 00 03 00 00 .@..............
<7>[201192.342747] front: 00000330: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.342752] front: 00000340: 00 00 00 00 00 03 00 00 00 00 00 00 00 02 00 00 ................
<7>[201192.342758] front: 00000350: 00 00 00 00 00 03 00 00 00 00 00 00 00 02 00 00 ................
<7>[201192.342763] front: 00000360: 00 00 00 00 00 .....
<7>[201192.342769] footer: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342775] footer: 00000010: 00 00 00 00 00 .....
<3>[201192.342780] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426]: got corrupt reply mds0(tid:30)
<7>[201192.342788] header: 00000000: 71 00 00 00 00 00 00 00 1e 00 00 00 00 00 00 00 q...............
<7>[201192.342793] header: 00000010: 1a 00 7f 00 01 00 65 03 00 00 00 00 00 00 00 00 ......e.........
<7>[201192.342799] header: 00000020: 00 00 00 00 02 00 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.342804] header: 00000030: 00 00 00 00 00 .....
<7>[201192.342810] front: 00000000: 05 03 00 00 00 00 00 00 f2 00 00 00 01 00 01 7a ...............z
<7>[201192.342815] front: 00000010: 01 00 00 07 01 74 01 00 00 91 74 00 00 00 01 00 .....t....t.....
<7>[201192.342821] front: 00000020: 00 fe ff ff ff ff ff ff ff 00 00 00 00 0a 0a 0a ................
<7>[201192.342827] front: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 55 01 00 .............U..
<7>[201192.342832] front: 00000040: 00 00 00 00 00 01 00 00 00 00 00 00 00 08 00 00 ................
<7>[201192.342838] front: 00000050: 00 00 00 00 00 01 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.342843] front: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342849] front: 00000070: 00 00 00 00 00 00 00 00 00 00 2f f1 03 66 4f 6f ........../..fOo
<7>[201192.342855] front: 00000080: c9 2a 2f f1 03 66 4f 6f c9 2a 1c f1 03 66 b7 5c .*/..fOo.*...f.\
<7>[201192.342860] front: 00000090: ec 2e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342866] front: 000000a0: 00 00 00 00 00 00 ff ff ff ff ff ff ff ff 01 00 ................
<7>[201192.342871] front: 000000b0: 00 00 ed 41 00 00 e8 03 00 00 e8 03 00 00 01 00 ...A............
<7>[201192.342877] front: 000000c0: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342883] front: 000000d0: 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ................
<7>[201192.342888] front: 000000e0: 00 00 01 00 00 00 00 00 00 00 2f f1 03 66 54 18 ........../..fT.
<7>[201192.342894] front: 000000f0: c0 2b 00 00 00 00 00 00 00 00 02 00 00 00 00 00 .+..............
<7>[201192.342900] front: 00000100: 00 00 00 00 00 00 ff ff ff ff ff ff ff ff 00 00 ................
<7>[201192.342905] front: 00000110: 00 00 01 01 10 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342911] front: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 1c f1 03 66 ...............f
<7>[201192.342916] front: 00000130: b7 5c ec 2e 01 00 00 00 00 00 00 00 ff ff ff ff .\..............
<7>[201192.342922] front: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342928] front: 00000150: 00 00 00 00 01 30 00 00 00 01 00 00 00 28 00 00 .....0.......(..
<7>[201192.342933] front: 00000160: 00 02 01 04 00 00 00 00 00 81 66 70 7d 58 e2 3b ..........fp}X.;
<7>[201192.342939] front: 00000170: 91 3b bc 4d 82 30 5b 68 a2 fd 80 c0 16 ac cb f5 .;.M.0[h........
<7>[201192.342945] front: 00000180: 38 bd ea de e4 f3 c4 e3 57 00 00 00 00 90 01 00 8.......W.......
<7>[201192.342950] front: 00000190: 00 01 01 0c 00 00 00 00 00 00 00 ff ff ff ff 00 ................
<7>[201192.342956] front: 000001a0: 00 00 00 01 00 00 00 01 07 12 00 00 00 7e c7 2b .............~.+
<7>[201192.342962] front: 000001b0: ca 9b 70 74 5f 63 72 61 73 68 5f 66 69 6c 65 02 ..pt_crash_file.
<7>[201192.342967] front: 000001c0: 01 0e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.342973] front: 000001d0: 00 00 00 07 01 48 01 00 00 df 7a 00 00 00 01 00 .....H....z.....
<7>[201192.342978] front: 000001e0: 00 fe ff ff ff ff ff ff ff 00 00 00 00 08 00 00 ................
<7>[201192.342984] front: 000001f0: 00 00 00 00 00 01 00 00 00 00 00 00 00 55 0d 00 .............U..
<7>[201192.342990] front: 00000200: 00 00 00 00 00 d2 18 00 00 00 00 00 00 03 00 00 ................
<7>[201192.342996] front: 00000210: 00 00 00 00 00 01 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.343001] front: 00000220: 40 00 01 00 00 00 00 00 40 00 00 00 00 00 00 00 @.......@.......
<7>[201192.343007] front: 00000230: 00 00 00 00 00 00 03 00 00 00 2f f1 03 66 54 18 ........../..fT.
<7>[201192.343012] front: 00000240: c0 2b 2f f1 03 66 d7 26 73 2b 2f f1 03 66 d7 26 .+/..f.&s+/..f.&
<7>[201192.343018] front: 00000250: 73 2b 02 00 00 00 00 00 00 00 00 00 00 00 00 00 s+..............
<7>[201192.343024] front: 00000260: 00 00 00 00 00 00 ff ff ff ff ff ff ff ff 01 00 ................
<7>[201192.343029] front: 00000270: 00 00 a4 81 00 00 e8 03 00 00 e8 03 00 00 01 00 ................
<7>[201192.343035] front: 00000280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.343041] front: 00000290: 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ................
<7>[201192.343047] front: 000002a0: 00 00 00 00 00 00 00 00 00 00 2f f1 03 66 54 18 ........../..fT.
<7>[201192.343053] front: 000002b0: c0 2b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .+..............
<7>[201192.343058] front: 000002c0: 00 00 04 00 00 00 00 00 00 00 ff ff ff ff ff ff ................
<7>[201192.343064] front: 000002d0: ff ff 00 00 00 00 01 01 10 00 00 00 00 00 00 00 ................
<7>[201192.343069] front: 000002e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.343075] front: 000002f0: 2f f1 03 66 4f 6f c9 2a 01 00 00 00 00 00 00 00 /..fOo.*........
<7>[201192.343083] front: 00000300: ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.343088] front: 00000310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.343094] front: 00000320: 00 40 00 00 00 01 00 00 00 00 00 00 00 03 00 00 .@..............
<7>[201192.343100] front: 00000330: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 ................
<7>[201192.343105] front: 00000340: 00 00 00 00 00 03 00 00 00 00 00 00 00 02 00 00 ................
<7>[201192.343111] front: 00000350: 00 00 00 00 00 03 00 00 00 00 00 00 00 02 00 00 ................
<7>[201192.343116] front: 00000360: 00 00 00 00 00 .....
<7>[201192.343122] footer: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
<7>[201192.343128] footer: 00000010: 00 00 00 00 00 .....
<7>[201192.343195] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] ceph_mdsc_wait_request: do_request waited, got 0
<7>[201192.343206] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] ceph_mdsc_do_request: do_request 0000000071b24ca5 done, result -5
<7>[201192.343289] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] __ceph_put_cap_refs: 0000000039d46bc2 10000007491.fffffffffffffffe had p
<7>[201192.343302] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] ceph_unreserve_caps: ctx=000000008741ca3c count=20
<7>[201192.343311] ceph: [9a8fd138-5876-4325-af3b-ba7f972e5776 9426] __ceph_unreserve_caps: caps 25 = 5 used + 0 resv + 20 avail
</pre>
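The gate described above could be modeled as follows. This is only an illustrative sketch in Python with hypothetical names (the actual check would live in the MDS C++ code): a client that does not advertise fscrypt support keeps read access to encrypted inodes but is denied operations that would modify them or create entries under them.

```python
# Hypothetical model of the proposed MDS-side policy (illustration only):
# non-fscrypt clients may read encrypted files/directories but may not
# modify them or create new entries under them.
READ_OPS = {"lookup", "readdir", "getattr", "open_ro", "read"}
MODIFY_OPS = {"create", "mkdir", "symlink", "rename", "unlink", "setattr", "write"}

def allow_op(op: str, inode_encrypted: bool, client_has_fscrypt: bool) -> bool:
    """Return True if the MDS should let this client perform `op` on the inode."""
    if not inode_encrypted or client_has_fscrypt:
        return True          # nothing to protect, or the client understands fscrypt
    return op in READ_OPS    # non-fscrypt client: read-only access
```

For example, a non-fscrypt client doing `readdir` on an encrypted directory is allowed, while its `mkdir` under that directory is rejected, which avoids the undecryptable dentry names shown in the log above.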
rgw - Bug #65216 (New): rgw: only accept valid ipv4 from host header
https://tracker.ceph.com/issues/65216
2024-03-29T00:30:04Z
Seena Fallah
<p>Right now, IPv4 validation of the Host header is based only on the number of periods, which leads to invalid IPs being accepted.</p>
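A period-count check accepts strings such as "999.1.2.3" or truncated addresses like "1.2.3". A stricter approach, sketched here in Python rather than RGW's actual C++ code, is to hand the string to the system parser and accept only a well-formed dotted quad:

```python
import socket

def is_valid_ipv4(host: str) -> bool:
    """Strictly validate `host` as a dotted-quad IPv4 address.

    inet_pton(AF_INET, ...) requires exactly four decimal octets in the
    range 0-255, so it rejects "1.2.3", "999.1.2.3", and hostnames.
    """
    try:
        socket.inet_pton(socket.AF_INET, host)
    except OSError:
        return False
    return True
```

A plain domain-style Host value such as "bucket.example.com" fails this check, so it would fall through to normal hostname handling instead of being misclassified as an IP.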
rgw - Bug #65212 (Fix Under Review): pubsub: validate Name in CreateTopic api
https://tracker.ceph.com/issues/65212
2024-03-28T16:24:24Z
Casey Bodley
cbodley@redhat.com
<p>Prevent topic names that would confuse things like ARN parsing and RADOS object namespacing.</p>
<p>from <a class="external" href="https://docs.aws.amazon.com/sns/latest/api/API_CreateTopic.html#API_CreateTopic_RequestParameters">https://docs.aws.amazon.com/sns/latest/api/API_CreateTopic.html#API_CreateTopic_RequestParameters</a></p>
<pre>
Name
The name of the topic you want to create.
Constraints: Topic names must be made up of only uppercase and lowercase ASCII letters, numbers, underscores, and hyphens, and must be between 1 and 256 characters long.
For a FIFO (first-in-first-out) topic, the name must end with the .fifo suffix.
Type: String
Required: Yes
</pre>
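The constraints quoted above (ASCII letters, digits, underscores, hyphens; 1 to 256 characters; FIFO names ending in ".fifo") map naturally onto a regex check. The following Python sketch illustrates one way to express them; it is not the RGW implementation:

```python
import re

# Letters, digits, underscore, hyphen only; 1-256 characters total.
_TOPIC_RE = re.compile(r'^[A-Za-z0-9_-]{1,256}$')

def valid_topic_name(name: str, fifo: bool = False) -> bool:
    """Check a topic name against the SNS CreateTopic constraints."""
    if fifo:
        # FIFO names must end with ".fifo"; the base name before the
        # suffix must itself satisfy the character constraints, and the
        # full name must still fit in 256 characters.
        if not name.endswith('.fifo') or len(name) > 256:
            return False
        return bool(_TOPIC_RE.match(name[:-len('.fifo')]))
    return bool(_TOPIC_RE.match(name))
```

Names containing ":" or "/" fail the check, which sidesteps ambiguity in ARN parsing and in RADOS object namespacing.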
crimson - Bug #65203 (New): ReplicatedRecoveryBackend::recalc_subsets(ObjectRecoveryInfo&, crimso...
https://tracker.ceph.com/issues/65203
2024-03-28T15:00:23Z
Matan Breizman
<p>osd.3: <a class="external" href="https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626294">https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626294</a></p>
<p>After adding OSD restarts to the thrash tests: <a class="external" href="https://github.com/ceph/ceph/pull/56511">https://github.com/ceph/ceph/pull/56511</a></p>
<pre><code class="text syntaxhl"><span class="CodeRay">DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): starting start_pg_operation
DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): start_pg_operation in await_active stage
DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): start_pg_operation active, entering await_map
DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): start_pg_operation await_map stage
DEBUG 2024-03-27 13:26:06,806 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): got map 26, entering get_pg_mapping
DEBUG 2024-03-27 13:26:06,806 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): can_create=false, target-core=2
DEBUG 2024-03-27 13:26:06,806 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): send 37 to the remote pg core 2
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): entering create_or_wait_pg
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): have_pg
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - 0x603000429b00 RecoverySubRequest::with_pg: RecoverySubRequest::with_pg: background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))}))
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - handle_pull_response: MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))}) v4
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - handle_pull_response ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false) ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false) data.size() is 1048576 data_included: [(655473, 716476), (2099033, 332100)]
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::with_head_obc: object 3:bd1211d5:::smithi05531420-40:head
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::get_or_load_obc: cache hit on 3:bd1211d5:::smithi05531420-40:head
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - resolve_oid oid.snap=1,head snapset.seq=1
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::get_or_load_obc: cache miss on 3:bd1211d5:::smithi05531420-40:1
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - load_metadata: object 3:bd1211d5:::smithi05531420-40:1 doesn't exist, returning empty metadata
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::load_obc: loaded obs 3:bd1211d5:::smithi05531420-40:1(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) for 3:bd1211d5:::smithi05531420-40:1
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::load_obc: returning obc 3:bd1211d5:::smithi05531420-40:1(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) for 3:bd1211d5:::smithi05531420-40:1
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-2476-g56e21662/rpm/el9/BUILD/ceph-19.0.0-2476-g56e21662/src/crimson/osd/replicated_recovery_backend.cc:886: void ReplicatedRecoveryBackend::recalc_subsets(ObjectRecoveryInfo&, crimson::osd::SnapSetContextRef): Assertion `ssc' failed.
Aborting on shard 2.
Backtrace:
0# 0x00007F182BAA154C in /lib64/libc.so.6
1# raise in /lib64/libc.so.6
2# abort in /lib64/libc.so.6
3# 0x00007F182BA2871B in /lib64/libc.so.6
4# 0x00007F182BA4DCA6 in /lib64/libc.so.6
5# ReplicatedRecoveryBackend::recalc_subsets(ObjectRecoveryInfo&, boost::intrusive_ptr<crimson::osd::SnapSetContext>) in ceph-osd
</span></code></pre>
crimson - Bug #65201 (New): ReplicatedRecoveryBackend::prep_push_to_replica(const hobject_t&, eve...
https://tracker.ceph.com/issues/65201
2024-03-28T14:55:47Z
Matan Breizman
<p>osd.3: <a class="external" href="https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626293">https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626293</a></p>
<p>After adding OSD restarts to the thrash tests: <a class="external" href="https://github.com/ceph/ceph/pull/56511">https://github.com/ceph/ceph/pull/56511</a></p>
<pre><code class="text syntaxhl"><span class="CodeRay">DEBUG 2024-03-27 13:27:01,678 [shard 0:main] osd - pg_epoch 45 pg[3.0( v 37'19 (0'0,37'19] local-lis/les=44/45 n=6 ec=14/14 lis/c=44/14 les/c/f=45/15/0 sis=44) [3,2,1] r=0 lpr=44 pi=[14,44)/1 crt=37'19 lcod 0'0 mlcod 0'0 active+recovering+degraded ObjectContextLoader::load_obc: returning obc 3:0254ed2b:::smithi01231316-5:8(37'18 client.4225.0:19 s 2067228 uv 3 alloc_hint [0 0 0]) for 3:0254ed2b:::smithi01231316-5:8
DEBUG 2024-03-27 13:27:01,678 [shard 0:main] osd - recover_object: loaded obc: 3:0254ed2b:::smithi01231316-5:8
DEBUG 2024-03-27 13:27:01,678 [shard 0:main] osd - prep_push_to_replica: 3:0254ed2b:::smithi01231316-5:8, 37'18
ERROR 2024-03-27 13:27:01,678 [shard 0:main] none - /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-2476-g56e21662/rpm/el9/BUILD/ceph-19.0.0-2476-g56e21662/src/crimson/osd/replicated_recovery_backend.cc:347 : In function 'RecoveryBackend::interruptible_future<PushOp> ReplicatedRecoveryBackend::prep_push_to_replica(const hobject_t&, eversion_t, pg_shard_t)', ceph_assert(%s)
ssc
Aborting on shard 0.
Backtrace:
0# 0x00007F96396A154C in /lib64/libc.so.6
1# raise in /lib64/libc.so.6
2# abort in /lib64/libc.so.6
3# ceph::__ceph_assert_fail(ceph::assert_data const&) in ceph-osd
4# ReplicatedRecoveryBackend::prep_push_to_replica(hobject_t const&, eversion_t, pg_shard_t) in ceph-osd
</span></code></pre>
crimson - Bug #65200 (New): PeeringState::get_peer_info(pg_shard_t) const: Assertion `it != peer_...
https://tracker.ceph.com/issues/65200
2024-03-28T14:54:17Z
Matan Breizman
<p>osd.1: <a class="external" href="https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626293">https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626293</a></p>
<p>After adding OSD restarts to the thrash tests: <a class="external" href="https://github.com/ceph/ceph/pull/56511">https://github.com/ceph/ceph/pull/56511</a></p>
<pre><code class="text syntaxhl"><span class="CodeRay">INFO 2024-03-27 13:27:01,801 [shard 0:main] osd - start_primary_recovery_ops recovering 0 in pg pg_epoch 45 pg[3.2( v 40'55 lc 36'54 (0'0,40'55] local-lis/les=44/45 n=0 ec=14/14 lis/c=44/14 les/c/f=45/15/0 sis=44) [1,0,3] r=0 lpr=44 pi=[14,44)/2 crt=40'55 mlcod 0'0 active+recovering , missing missing(1 may_include_deletes = 1)
INFO 2024-03-27 13:27:01,801 [shard 0:main] osd - start_primary_recovery_ops 3:48a442ac:::smithi01231316-12:head item.need 40'55 (missing) (missing head)
INFO 2024-03-27 13:27:01,801 [shard 0:main] osd - recover_missing 3:48a442ac:::smithi01231316-12:head v 40'55
INFO 2024-03-27 13:27:01,801 [shard 0:main] osd - recover_missing 3:48a442ac:::smithi01231316-12:head v 40'55, new recovery
DEBUG 2024-03-27 13:27:01,801 [shard 0:main] osd - recover_object: 3:48a442ac:::smithi01231316-12:head, 40'55
DEBUG 2024-03-27 13:27:01,801 [shard 0:main] osd - maybe_pull_missing_obj: 3:48a442ac:::smithi01231316-12:head, 40'55
DEBUG 2024-03-27 13:27:01,802 [shard 0:main] osd - pg_epoch 45 pg[3.2( v 40'55 lc 36'54 (0'0,40'55] local-lis/les=44/45 n=0 ec=14/14 lis/c=44/14 les/c/f=45/15/0 sis=44) [1,0,3] r=0 lpr=44 pi=[14,44)/2 crt=40'55 mlcod 0'0 active+recovering ObjectContextLoader::with_head_obc: object 3:48a442ac:::smithi01231316-12:head
INFO 2024-03-27 13:27:01,802 [shard 0:main] osd - start_primary_recovery_ops started 1 skipped 1
DEBUG 2024-03-27 13:27:01,802 [shard 0:main] osd - pg_epoch 45 pg[3.2( v 40'55 lc 36'54 (0'0,40'55] local-lis/les=44/45 n=0 ec=14/14 lis/c=44/14 les/c/f=45/15/0 sis=44) [1,0,3] r=0 lpr=44 pi=[14,44)/2 crt=40'55 mlcod 0'0 active+recovering ObjectContextLoader::get_or_load_obc: cache hit on 3:48a442ac:::smithi01231316-12:head
DEBUG 2024-03-27 13:27:01,802 [shard 0:main] osd - prepare_pull: 3:48a442ac:::smithi01231316-12:head, 40'55
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-2476-g56e21662/rpm/el9/BUILD/ceph-19.0.0-2476-g56e21662/src/osd/PeeringState.h:2349: const pg_info_t& PeeringState::get_peer_info(pg_shard_t) const: Assertion `it != peer_info.end()' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 159 ms on shard 0. Backtrace: 0x6bddd 0xb99f089 0xb871b50 0xb8730cc 0xb8732e2 0xb873438 0xb873901 0x54daf 0x118a06 0x118829 0x6efa70b 0x6efca58 0x6efd993 0x6efde84 0x6efe723 0x6efedaf 0x6efef6b 0x6ef9294 0x6ef9685 0x6ef994c 0x54daf 0xa154b 0x54d05 0x287f2 0x2871a 0x4dca5 0x3f96f7c 0x4f4503f 0x4f5601c 0x4f5771b 0x4f578fb 0x4f57a79 0x3f5e56c 0x460952c 0x4609732 0x46098e6 0x461333a 0x4613515 0x4613839 0x4613b44 0x4613caf 0x4613d38 0x46175d6 0x461789a 0x461792e 0x46179e6 0x462b282 0x462b4fa 0x462b76d 0x4688845 0xb8847d5 0xb89ea6f 0xb93fa6d 0xb9410bb 0xb61d823 0xb61e19f 0x368057a 0x3feaf 0x3ff5f 0x346c434
kernel callstack: 0xffffffffffffff80 0xffffffff8e781dc1 0xffffffff8e782126 0xffffffff8e505d94 0xffffffff8e505f31 0xffffffff8e50733f 0xffffffff8e50801b 0xffffffff8e5084d0 0xffffffff8f07e45c 0xffffffff8f2000ea
Reactor stalled for 303 ms on shard 0. Backtrace: 0x6bddd 0xb99f089 0xb871b50 0xb8730cc 0xb8732e2 0xb873438 0xb873901 0x54daf 0x195b59 0x6efa069 0x6efc6cb 0x6efd993 0x6efde84 0x6efe723 0x6efedaf 0x6efef6b 0x6ef9294 0x6ef9685 0x6ef994c 0x54daf 0xa154b 0x54d05 0x287f2 0x2871a 0x4dca5 0x3f96f7c 0x4f4503f 0x4f5601c 0x4f5771b 0x4f578fb 0x4f57a79 0x3f5e56c 0x460952c 0x4609732 0x46098e6 0x461333a 0x4613515 0x4613839 0x4613b44 0x4613caf 0x4613d38 0x46175d6 0x461789a 0x461792e 0x46179e6 0x462b282 0x462b4fa 0x462b76d 0x4688845 0xb8847d5 0xb89ea6f 0xb93fa6d 0xb9410bb 0xb61d823 0xb61e19f 0x368057a 0x3feaf 0x3ff5f 0x346c434
kernel callstack:
Reactor stalled for 539 ms on shard 0. Backtrace: 0x6bddd 0xb99f089 0xb871b50 0xb8730cc 0xb8732e2 0xb873438 0xb873901 0x54daf 0x195b53 0x6efa069 0x6efe1dd 0x6efe723 0x6efedaf 0x6efef6b 0x6ef9294 0x6ef9685 0x6ef994c 0x54daf 0xa154b 0x54d05 0x287f2 0x2871a 0x4dca5 0x3f96f7c 0x4f4503f 0x4f5601c 0x4f5771b 0x4f578fb 0x4f57a79 0x3f5e56c 0x460952c 0x4609732 0x46098e6 0x461333a 0x4613515 0x4613839 0x4613b44 0x4613caf 0x4613d38 0x46175d6 0x461789a 0x461792e 0x46179e6 0x462b282 0x462b4fa 0x462b76d 0x4688845 0xb8847d5 0xb89ea6f 0xb93fa6d 0xb9410bb 0xb61d823 0xb61e19f 0x368057a 0x3feaf 0x3ff5f 0x346c434
kernel callstack:
Reactor stalled for 975 ms on shard 0. Backtrace: 0x6bddd 0xb99f089 0xb871b50 0xb8730cc 0xb8732e2 0xb873438 0xb873901 0x54daf 0x195bc1 0x6efa069 0x6efc6cb 0x6efd006 0x6efd5f7 0x6efd7b2 0x6efdcdf 0x6efe723 0x6efedaf 0x6efef6b 0x6ef9294 0x6ef9685 0x6ef994c 0x54daf 0xa154b 0x54d05 0x287f2 0x2871a 0x4dca5 0x3f96f7c 0x4f4503f 0x4f5601c 0x4f5771b 0x4f578fb 0x4f57a79 0x3f5e56c 0x460952c 0x4609732 0x46098e6 0x461333a 0x4613515 0x4613839 0x4613b44 0x4613caf 0x4613d38 0x46175d6 0x461789a 0x461792e 0x46179e6 0x462b282 0x462b4fa 0x462b76d 0x4688845 0xb8847d5 0xb89ea6f 0xb93fa6d 0xb9410bb 0xb61d823 0xb61e19f 0x368057a 0x3feaf 0x3ff5f 0x346c434
kernel callstack:
0# 0x00007F0AE5EA154C in /lib64/libc.so.6
1# raise in /lib64/libc.so.6
2# abort in /lib64/libc.so.6
3# 0x00007F0AE5E2871B in /lib64/libc.so.6
4# 0x00007F0AE5E4DCA6 in /lib64/libc.so.6
5# PeeringState::get_peer_info(pg_shard_t) const in ceph-osd
6# ReplicatedRecoveryBackend::prepare_pull(boost::intrusive_ptr<crimson::osd::ObjectContext> const&, PullOp&, RecoveryBackend::pull_info_t&, hobject_t const&, eversion_t) in ceph-
</span></code></pre>
Ceph - Bug #65199 (New): autoscaler: Scale PGs based on number of objects
https://tracker.ceph.com/issues/65199
2024-03-28T12:42:52Z
Niklas Hambuechen
<p>Ceph's autoscaler scales PGs based on Bytes stored. It seemingly ignores number of objects. This creates problems for pools with many small files.</p>
<p>It creates even more problems for pools with an apparent byte size of 0, but millions of objects; such pools get created when following CephFS-on-EC best practices in the docs.</p>
<p>Red Hat docs describe:</p>
<p><a class="external" href="https://access.redhat.com/documentation/de-de/red_hat_ceph_storage/4/html/storage_strategies_guide/placement_groups_pgs#viewing-placement-group-scaling-recommendations">https://access.redhat.com/documentation/de-de/red_hat_ceph_storage/4/html/storage_strategies_guide/placement_groups_pgs#viewing-placement-group-scaling-recommendations</a></p>
<blockquote>
<p><strong>BIAS</strong>, is a pool property that is used by the PG autoscaler to scale some pools faster than others, in terms of number of PGs. It is essentially a multiplier used to give more PG to a pool than the default number of PGs. This property is <strong>particularly used for metadata pools which might be small in size but have large number of objects</strong>, so scaling them faster is important for better performance.</p>
</blockquote>
<p>(Note these docs are better than the upstream Ceph docs on BIAS, which are much shorter: <a class="external" href="https://docs.ceph.com/en/reef/rados/operations/placement-groups/">https://docs.ceph.com/en/reef/rados/operations/placement-groups/</a>)</p>
<p>So this confirms that BIAS (pg_autoscale_bias) can be used to partially address the "many small objects" problem, using a constant factor.</p>
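<p>For reference, the bias is set per pool. A minimal sketch (the pool name is hypothetical; the command itself is the standard one):</p>

```shell
# Give a metadata-heavy pool a 4x PG multiplier so the autoscaler
# scales it faster than byte-based estimates alone would suggest.
# "cephfs.meta" is a hypothetical pool name.
ceph osd pool set cephfs.meta pg_autoscale_bias 4
```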
<p>But the constant factor stops working when the objects are 0-sized.</p>
<p>This happens when following CephFS best practices: <a class="external" href="https://docs.ceph.com/en/reef/cephfs/createfs/#creating-pools">https://docs.ceph.com/en/reef/cephfs/createfs/#creating-pools</a></p>
<blockquote>
<p>The data pool used to create the file system is the “default” data pool and the location for storing all inode backtrace information, which is used for hard link management and disaster recovery. For this reason, all CephFS inodes have at least one object in the default data pool.<br /><strong>If erasure-coded pools are planned for file system data, it is best to configure the default as a replicated pool</strong> to improve small-object write and read performance when updating backtraces. Separately, another erasure-coded data pool can be added (see also Erasure code) that can be used on an entire hierarchy of directories and files (see also File layouts).</p>
</blockquote>
<p>If you do what is described here ("default" pool on replicated, directory File Layout on EC), you end up with pools like this in `ceph df`:</p>
<pre>
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 203 MiB 26 609 MiB 90.00 5 GiB
data 2 32 0 B 112.23M 0 B 0 61 TiB
data_ec 3 168 124 TiB 115.30M 186 TiB 50.53 121 TiB
metadata 4 128 63 GiB 32.87k 189 GiB 90.00 5 GiB
</pre>
<p>Note how <strong>the `data` pool that stores the inodes has 112 M objects but 0 Bytes stored</strong>. Apparently the inode backtrace objects carry no byte payload, so the pool reports 0 B stored.</p>
<p>Because the data size is low (0), the autoscaler assigns no more than 32 PGs.</p>
<p>This means that there are ~4 M objects per PG. If the objects are on HDD that can do 100 seeks per second, running scrubbing, recovery, or balancing (which needs to seek all objects in a PG) will <strong>take at least 11 hours</strong>. And this does not even take EC overhead factors into account.</p>
<p>If there were 1 B objects, handling a single PG would take > 100 hours.</p>
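<p>The seek arithmetic above can be checked with the exact numbers from the `ceph df` output (the text rounds up to ~4 M objects/PG; the 100 seeks/s HDD rate is the report's own assumption):</p>

```python
# Back-of-envelope scrub/recovery time per PG for the `data` pool above.
objects = 112_230_000     # OBJECTS column for the `data` pool
pgs = 32                  # PGs assigned by the autoscaler
seeks_per_sec = 100       # assumed HDD random-read rate

objects_per_pg = objects / pgs
hours = objects_per_pg / seeks_per_sec / 3600
print(f"{objects_per_pg / 1e6:.1f} M objects/PG -> {hours:.1f} h per PG")
```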
<p>There seems to be nothing in Ceph that scales PGs based on the number of objects. This issue requests that such scaling be added.</p>
<p>This would:</p>
<ul>
<li>Ensure that following the CephFS EC recommendations actually makes sense and does not produce operational problems.</li>
<li>Improve Ceph's default behaviour for many small files/objects, without the user manually having to set BIAS.</li>
</ul>
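<p>One possible shape for the requested behaviour: compute the PG target from both bytes and object count and take the larger. The thresholds and function name below are illustrative assumptions, not the autoscaler's actual code:</p>

```python
# Hypothetical object-aware PG target: take the max of a byte-based and an
# object-based estimate. Thresholds are illustrative, not Ceph defaults.
def pg_target(stored_bytes: int, objects: int,
              bytes_per_pg: int = 100 * 2**30,      # assumed ~100 GiB per PG
              objects_per_pg: int = 500_000) -> int:  # assumed objects per PG
    by_bytes = stored_bytes / bytes_per_pg
    by_objects = objects / objects_per_pg
    return max(1, round(max(by_bytes, by_objects)))

# A 0-byte pool with 112 M objects would no longer be stuck at 32 PGs.
print(pg_target(0, 112_230_000))
```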
rgw - Bug #65188 (Fix Under Review): rgwlc: Executing radosgw-admin lc process --bucket <bkt-name...
https://tracker.ceph.com/issues/65188
2024-03-27T22:30:11Z
Matt Benjamin
mbenjamin@redhat.com
<p>Description of problem:<br />[LC-Process]: Executing radosgw-admin lc process --bucket <bkt-name> without setting lc rule results in Segmentation fault</p>
<p>Version-Release number of selected component (if applicable):<br />ceph version 18.2.1-73.el9cp</p>
<p>How reproducible:<br />3/3</p>
<p>Steps to Reproduce:<br />1. Deploy cluster with: ceph version 18.2.1-73.el9cp<br />2. Create a bucket: <bkt_name><br />3. Upload object to the bucket<br />4. Perform: radosgw-admin lc process --bucket <bkt_name></p>
<p>Actual results:<br />Throwing error:</p>
<ul>
<li>*** Caught signal (Segmentation fault) ***<br /> in thread 7f74eed29800 thread_name:radosgw-admin<br /> ceph version 18.2.1-73.el9cp (16d1bc4bed21ede5993c301b4626fa21cbe97cff) reef (stable)<br /> 1: /lib64/libc.so.6(+0x54db0) [0x7f74ef254db0]<br /> 2: (RGWLC::process_bucket(int, int, RGWLC::LCWorker*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x2b6) [0x556e82e69626]<br /> 3: (RGWLC::process(RGWLC::LCWorker*, std::unique_ptr<rgw::sal::Bucket, std::default_delete<rgw::sal::Bucket> > const&, bool)+0xb7) [0x556e82e6d1a7]<br /> 4: (RGWRados::process_lc(std::unique_ptr<rgw::sal::Bucket, std::default_delete<rgw::sal::Bucket> > const&)+0xdd) [0x556e831d409d]<br /> 5: main()<br /> 6: /lib64/libc.so.6(+0x3feb0) [0x7f74ef23feb0]<br /> 7: __libc_start_main()<br /> 8: _start()<br />2024-03-20T02:17:03.968-0400 7f74eed29800 -1 *** Caught signal (Segmentation fault) ***<br /> in thread 7f74eed29800 thread_name:radosgw-admin</li>
</ul>
Orchestrator - Bug #65187 (New): upgrade/quincy-x/stress-split: upgrade test fails to install qui...
https://tracker.ceph.com/issues/65187
2024-03-27T22:18:47Z
Laura Flores
<pre><code class="text syntaxhl"><span class="CodeRay">2024-03-22T06:52:10.566 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=ubuntu%2F22.04%2Fx86_64&ref=quincy
2024-03-22T06:52:10.571 INFO:teuthology.orchestra.run.smithi031.stdout:uid [ unknown] Ceph automated package build (Ceph automated package build) <sage@newdream.net>
2024-03-22T06:52:10.571 INFO:teuthology.orchestra.run.smithi031.stdout:uid [ unknown] Ceph.com (release key) <security@ceph.com>
2024-03-22T06:52:10.572 INFO:teuthology.task.install.deb:Installing packages: ceph, cephadm, ceph-mds, ceph-mgr, ceph-common, ceph-fuse, ceph-test, radosgw, python3-rados, python3-rgw, python3-cephfs, python3-rbd, libcephfs2, libcephfs-dev, librados2, librbd1, rbd-fuse on remote deb x86_64
2024-03-22T06:52:10.572 WARNING:teuthology.packaging:More than one of ref, tag, branch, or sha1 supplied; using branch
2024-03-22T06:52:10.572 INFO:teuthology.packaging:ref: None
2024-03-22T06:52:10.572 INFO:teuthology.packaging:tag: None
2024-03-22T06:52:10.572 INFO:teuthology.packaging:branch: quincy
2024-03-22T06:52:10.572 INFO:teuthology.packaging:sha1: db0330b1e4e2470d52b750e251e55a522b4f7d69
2024-03-22T06:52:10.572 DEBUG:teuthology.packaging:Querying https://shaman.ceph.com/api/search?status=ready&project=ceph&flavor=default&distros=ubuntu%2F22.04%2Fx86_64&ref=quincy
2024-03-22T06:52:10.709 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/contextutil.py", line 30, in nested
vars.append(enter())
File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/task/install/__init__.py", line 218, in install
install_packages(ctx, package_list, config)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/task/install/__init__.py", line 81, in install_packages
p.spawn(
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/parallel.py", line 84, in __exit__
for result in self:
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/parallel.py", line 98, in __next__
resurrect_traceback(result)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/parallel.py", line 30, in resurrect_traceback
raise exc.exc_info[1]
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/parallel.py", line 23, in capture_traceback
return func(*args, **kwargs)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/task/install/deb.py", line 79, in _update_package_list_and_install
log.info('Pulling from %s', builder.base_url)
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/packaging.py", line 554, in base_url
return self._get_base_url()
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/packaging.py", line 856, in _get_base_url
self.assert_result()
File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/packaging.py", line 937, in assert_result
raise VersionNotFoundError(self._result.url)
teuthology.exceptions.VersionNotFoundError: Failed to fetch package version from https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=default&distros=ubuntu%2F22.04%2Fx86_64&ref=quincy
2024-03-22T06:52:10.711 ERROR:teuthology.run_tasks:Saw exception from tasks.
</span></code></pre>
RADOS - Bug #65186 (New): OSDs unreachable in upgrade test
https://tracker.ceph.com/issues/65186
2024-03-27T20:28:19Z
Laura Flores
<p>/a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616011/remote/smithi087/log/a8e8c570-e819-11ee-95cd-87774f69a715<br /><pre><code class="text syntaxhl"><span class="CodeRay">2024-03-22T07:19:18.215315+0000 mon.a (mon.0) 10 : cluster 0 Standby manager daemon x restarted
2024-03-22T07:19:18.215450+0000 mon.a (mon.0) 11 : cluster 0 Standby manager daemon x started
2024-03-22T07:19:18.215315+0000 mon.a (mon.0) 10 : cluster 0 Standby manager daemon x restarted
2024-03-22T07:19:18.215450+0000 mon.a (mon.0) 11 : cluster 0 Standby manager daemon x started
2024-03-22T07:19:18.277027+0000 mon.a (mon.0) 12 : cluster 0 mgrmap e33: y(active, since 63s), standbys: x
2024-03-22T07:19:18.414028+0000 mon.a (mon.0) 13 : cluster 1 Active manager daemon y restarted
2024-03-22T07:19:18.414630+0000 mon.a (mon.0) 14 : cluster 4 Health check failed: 8 osds(s) are not reachable (OSD_UNREACHABLE)
2024-03-22T07:19:18.414953+0000 mon.a (mon.0) 15 : cluster 1 Activating manager daemon y
2024-03-22T07:19:18.427127+0000 mon.a (mon.0) 16 : cluster 0 osdmap e81: 8 total, 8 up, 8 in
2024-03-22T07:19:18.277027+0000 mon.a (mon.0) 12 : cluster 0 mgrmap e33: y(active, since 63s), standbys: x
2024-03-22T07:19:18.427673+0000 mon.a (mon.0) 17 : cluster 0 mgrmap e34: y(active, starting, since 0.0129348s), standbys: x
2024-03-22T07:19:18.414028+0000 mon.a (mon.0) 13 : cluster 1 Active manager daemon y restarted
2024-03-22T07:19:18.433869+0000 osd.4 (osd.4) 3 : cluster 3 failed to encode map e81 with expected crc
2024-03-22T07:19:18.435418+0000 osd.2 (osd.2) 3 : cluster 3 failed to encode map e81 with expected crc
2024-03-22T07:19:18.414630+0000 mon.a (mon.0) 14 : cluster 4 Health check failed: 8 osds(s) are not reachable (OSD_UNREACHABLE)
2024-03-22T07:19:18.443967+0000 osd.4 (osd.4) 4 : cluster 3 failed to encode map e81 with expected crc
</span></code></pre></p>
<p>Likely connected to <a class="external" href="https://tracker.ceph.com/issues/63389">https://tracker.ceph.com/issues/63389</a>.</p>
RADOS - Bug #65185 (New): OSD_SCRUB_ERROR, inconsistent pg in upgrade tests
https://tracker.ceph.com/issues/65185
2024-03-27T20:21:29Z
Laura Flores
<p>/a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616025/remote/smithi098/log/b1f19696-e81a-11ee-95cd-87774f69a715/ceph.log.gz<br /><pre><code class="text syntaxhl"><span class="CodeRay">2024-03-22T09:20:00.000187+0000 mon.a (mon.0) 7863 : cluster 4 [ERR] OSD_SCRUB_ERRORS: 1 scrub errors
2024-03-22T09:20:00.000194+0000 mon.a (mon.0) 7864 : cluster 4 [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
2024-03-22T09:19:59.897409+0000 mon.a (mon.0) 7860 : cluster 0 osdmap e3595: 8 total, 8 up, 8 in
2024-03-22T09:20:00.000202+0000 mon.a (mon.0) 7865 : cluster 4 pg 103.14 is active+clean+inconsistent, acting [5,1,2]
2024-03-22T09:20:00.000151+0000 mon.a (mon.0) 7861 : cluster 4 Health detail: HEALTH_ERR noscrub flag(s) set; 1 scrub errors; Possible data damage: 1 pg inconsistent
</span></code></pre></p>
<p>More in this run: <a class="external" href="https://pulpito.ceph.com/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/">https://pulpito.ceph.com/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/</a></p>
RADOS - Bug #65183 (Fix Under Review): Overriding an EC pool needs the "--yes-i-really-mean-it" f...
https://tracker.ceph.com/issues/65183
2024-03-27T16:23:12Z
Laura Flores
<p>/a/yuriw-2024-03-26_14:32:05-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7623454<br /><pre><code class="text syntaxhl"><span class="CodeRay">2024-03-26T20:13:29.028 INFO:tasks.workunit.client.0.smithi110.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:35: expect_false: set -x
2024-03-26T20:13:29.028 INFO:tasks.workunit.client.0.smithi110.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:36: expect_false: ceph osd erasure-code-profile set fooprofile a=b c=d e=f
2024-03-26T20:13:29.301 INFO:tasks.workunit.client.0.smithi110.stderr:Error EPERM: will not override erasure code profile fooprofile because the existing profile {a=b,c=d,crush-device-class=,crush-failure-domain=osd,crush-num-failure-domains=0,crush-osds-per-failure-domain=0,crush-root=default,jerasure-per-chunk-alignment=false,k=2,m=1,plugin=jerasure,technique=reed_sol_van,w=8} is different from the proposed profile {a=b,c=d,crush-device-class=,crush-failure-domain=osd,crush-num-failure-domains=0,crush-osds-per-failure-domain=0,crush-root=default,e=f,jerasure-per-chunk-alignment=false,k=2,m=1,plugin=jerasure,technique=reed_sol_van,w=8}
2024-03-26T20:13:29.304 INFO:tasks.workunit.client.0.smithi110.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:36: expect_false: return 0
2024-03-26T20:13:29.304 INFO:tasks.workunit.client.0.smithi110.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2493: test_mon_osd_erasure_code: ceph osd erasure-code-profile set fooprofile a=b c=d e=f --force
2024-03-26T20:13:29.581 INFO:tasks.workunit.client.0.smithi110.stderr:Error EPERM: overriding erasure code profile can be DANGEROUS; add --yes-i-really-mean-it to do it anyway
2024-03-26T20:13:29.585 INFO:tasks.workunit.client.0.smithi110.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:1: test_mon_osd_erasure_code: rm -fr /tmp/cephtool.wZw
2024-03-26T20:13:29.586 DEBUG:teuthology.orchestra.run:got remote process result: 1
2024-03-26T20:13:29.587 INFO:tasks.workunit:Stopping ['cephtool'] on client.0...
</span></code></pre></p>
<p>Here's the test that fails:<br />qa/workunits/cephtool/test.sh<br /><pre><code class="text syntaxhl"><span class="CodeRay">function test_mon_osd_erasure_code()
{
ceph osd erasure-code-profile set fooprofile a=b c=d
ceph osd erasure-code-profile set fooprofile a=b c=d
expect_false ceph osd erasure-code-profile set fooprofile a=b c=d e=f
ceph osd erasure-code-profile set fooprofile a=b c=d e=f --force ---------------------> this one
ceph osd erasure-code-profile set fooprofile a=b c=d e=f
expect_false ceph osd erasure-code-profile set fooprofile a=b c=d e=f g=h
# make sure rule-foo doesn't work anymore
expect_false ceph osd erasure-code-profile set barprofile ruleset-failure-domain=host
ceph osd erasure-code-profile set barprofile crush-failure-domain=host
# clean up
ceph osd erasure-code-profile rm fooprofile
ceph osd erasure-code-profile rm barprofile
# try weird k and m values
expect_false ceph osd erasure-code-profile set badk k=1 m=1
expect_false ceph osd erasure-code-profile set badk k=1 m=2
expect_false ceph osd erasure-code-profile set badk k=0 m=2
expect_false ceph osd erasure-code-profile set badk k=-1 m=2
expect_false ceph osd erasure-code-profile set badm k=2 m=0
expect_false ceph osd erasure-code-profile set badm k=2 m=-1
ceph osd erasure-code-profile set good k=2 m=1
ceph osd erasure-code-profile rm good
}
</span></code></pre></p>
teuthology - Bug #65181 (New): Scrape log not properly collected
https://tracker.ceph.com/issues/65181
2024-03-27T15:04:49Z
Laura Flores
<p>For the following run, the scrape log was almost empty despite the run having many failures.</p>
<p>/a/yuriw-2024-03-26_14:32:05-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/scrape.log<br /><pre><code class="text syntaxhl"><span class="CodeRay">Found 304 jobs
Missing teuthology log /home/teuthworker/archive/yuriw-2024-03-26_14:32:05-rados-wip-yuri8-testing-2024-03-25-1419-distro-default-smithi/7623511/teuthology.log
</span></code></pre></p>
rgw - Bug #65179 (New): rgw incorrectly uses `Range` header in `X-Amz-Cache`
https://tracker.ceph.com/issues/65179
2024-03-27T14:25:25Z
Taha Jahangir
<p>As noted in RGW Data caching and CDN (<a class="external" href="https://docs.ceph.com/en/latest/radosgw/rgw-cache/">https://docs.ceph.com/en/latest/radosgw/rgw-cache/</a>, committed in <a class="external" href="https://github.com/ceph/ceph/pull/33646">https://github.com/ceph/ceph/pull/33646</a>), the `X-Amz-Cache` header is supposed to let RGW verify the original headers/signature while generating the response for a new `Range` header. In practice this does not work: the response is returned for the original `Range` header instead. A sample request/response is:</p>
<pre><code class="text syntaxhl"><span class="CodeRay">GET /temp/testfile HTTP/1.1
Host: myrgw.domain.com
x-amz-date: 20240327T134301Z
Authorization: AWS4-HMAC-SHA256 Credential=R612WE7A53PNXNZB4SUW/20240327/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-cache;x-amz-co....
Connection: close
x-amz-cache: .HOST.myrgw.domain.com.RANGE.bytes=10-20.X-AMZ-CONTENT-SHA256.e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.X-AMZ-DATE.20240327....
Range: bytes=0-5242879
X-Request-ID: ba471c53e0256f05585a26ba988968fc
X-Real-IP: 10.76.74.227
X-Forwarded-For: 10.76.74.227
X-Forwarded-Host: myrgw.domain.com
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Scheme: https
X-Scheme: https
Accept-Encoding: identity
User-Agent: Boto3/1.34.23 md/Botocore#1.34.23 ua/2.0 os/linux#6.6.22-1-lts md/arch#x86_64 lang/python#3.9.16 md/pyimpl#CPython Botocore/1.34.23 Resource
X-Amz-Content-SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
amz-sdk-invocation-id: d0b8677c-bcd2-4e65-8bf5-7499e7706396
amz-sdk-request: attempt=1
HTTP/1.1 206 Partial Content
Content-Length: 11
Content-Range: bytes 10-20/1371344
Accept-Ranges: bytes
Last-Modified: Tue, 11 Apr 2023 11:31:05 GMT
x-rgw-object-type: Normal
ETag: "444340706c6ec4d192b59d3f9a453525"
x-amz-meta-mtime: 1663901421.014806364
x-amz-request-id: tx00000a2d3b3ace625e991-0066042265-6b3e37f-myrgw
Content-Type: application/octet-stream
Date: Wed, 27 Mar 2024 13:43:01 GMT
Connection: close
........>..
</span></code></pre>
<p>Tested with Ceph v16.2.15, but the bug should exist in all versions.</p>
<p>RGW Log (with level=0) and the req/res is attached.</p>
rgw - Bug #65177 (New): reef: Syscall param write(buf) points to uninitialised byte(s)
https://tracker.ceph.com/issues/65177
2024-03-27T13:42:52Z
Casey Bodley
cbodley@redhat.com
<p>saw on several jobs in <a class="external" href="https://pulpito.ceph.com/cbodley-2024-03-26_12:30:03-rgw-wip-63856-reef-distro-default-smithi/">https://pulpito.ceph.com/cbodley-2024-03-26_12:30:03-rgw-wip-63856-reef-distro-default-smithi/</a></p>
<p><a class="external" href="https://qa-proxy.ceph.com/teuthology/cbodley-2024-03-26_12:30:03-rgw-wip-63856-reef-distro-default-smithi/7623215/teuthology.log">https://qa-proxy.ceph.com/teuthology/cbodley-2024-03-26_12:30:03-rgw-wip-63856-reef-distro-default-smithi/7623215/teuthology.log</a></p>
<p><a class="external" href="https://qa-proxy.ceph.com/teuthology/cbodley-2024-03-26_12:30:03-rgw-wip-63856-reef-distro-default-smithi/7623215/remote/smithi060/log/valgrind/ceph.client.0.log.gz">https://qa-proxy.ceph.com/teuthology/cbodley-2024-03-26_12:30:03-rgw-wip-63856-reef-distro-default-smithi/7623215/remote/smithi060/log/valgrind/ceph.client.0.log.gz</a></p>
<pre><code class="xml syntaxhl"><span class="CodeRay"><span class="tag"><error></span>
<span class="tag"><unique></span>0x0<span class="tag"></unique></span>
<span class="tag"><tid></span>1<span class="tag"></tid></span>
<span class="tag"><kind></span>SyscallParam<span class="tag"></kind></span>
<span class="tag"><what></span>Syscall param write(buf) points to uninitialised byte(s)<span class="tag"></what></span>
<span class="tag"><stack></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x78D9E5D<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libc.so.6<span class="tag"></obj></span>
<span class="tag"><fn></span>syscall<span class="tag"></fn></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x9962941<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libunwind.so.8.0.1<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x9962A57<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libunwind.so.8.0.1<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x9967179<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libunwind.so.8.0.1<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x99681A1<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libunwind.so.8.0.1<span class="tag"></obj></span>
<span class="tag"><fn></span>_ULx86_64_step<span class="tag"></fn></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x6F5871A<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libtcmalloc.so.4.5.9<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x6F57C6F<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libtcmalloc.so.4.5.9<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x6F3E371<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libtcmalloc.so.4.5.9<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x6F3D9E6<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/libtcmalloc.so.4.5.9<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x400A1AD<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/ld-linux-x86-64.so.2<span class="tag"></obj></span>
<span class="tag"><fn></span>call_init<span class="tag"></fn></span>
<span class="tag"><dir></span>/usr/src/debug/glibc-2.34-82.el9.x86_64/elf<span class="tag"></dir></span>
<span class="tag"><file></span>dl-init.c<span class="tag"></file></span>
<span class="tag"><line></span>70<span class="tag"></line></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x400A1AD<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/ld-linux-x86-64.so.2<span class="tag"></obj></span>
<span class="tag"><fn></span>call_init<span class="tag"></fn></span>
<span class="tag"><dir></span>/usr/src/debug/glibc-2.34-82.el9.x86_64/elf<span class="tag"></dir></span>
<span class="tag"><file></span>dl-init.c<span class="tag"></file></span>
<span class="tag"><line></span>26<span class="tag"></line></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x400A29B<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/ld-linux-x86-64.so.2<span class="tag"></obj></span>
<span class="tag"><fn></span>_dl_init<span class="tag"></fn></span>
<span class="tag"><dir></span>/usr/src/debug/glibc-2.34-82.el9.x86_64/elf<span class="tag"></dir></span>
<span class="tag"><file></span>dl-init.c<span class="tag"></file></span>
<span class="tag"><line></span>117<span class="tag"></line></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x4020E79<span class="tag"></ip></span>
<span class="tag"><obj></span>/usr/lib64/ld-linux-x86-64.so.2<span class="tag"></obj></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0xD<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A16<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A1E<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A2E<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A7B<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A7E<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A87<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A91<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A96<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000A99<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000AB9<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000AC4<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000AE8<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000B02<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"><frame></span>
<span class="tag"><ip></span>0x1FFF000B36<span class="tag"></ip></span>
<span class="tag"></frame></span>
<span class="tag"></stack></span>
<span class="tag"><auxwhat></span>Address 0x1fff000000 is on thread 1's stack<span class="tag"></auxwhat></span>
<span class="tag"></error></span>
</span></code></pre>