Bug #374
Closed
mon: osd with null addr added to map
% Done: 0%
Description
On Wido's cluster, saw
10.08.23_14:15:20.398952 7f7f5a5ca710 log [INF] : osd1 [2001:16f8:10:2::c3c3:a24f]:6800/19228 boot
10.08.23_14:15:20.399163 7f7f5a5ca710 log [INF] : osd3 [2001:16f8:10:2::c3c3:2e3a]:6800/23848 boot
10.08.23_14:15:20.399255 7f7f5a5ca710 log [INF] : osd2 [2001:16f8:10:2::c3c3:4a8c]:6800/7865 boot
and shortly after
10.08.23_14:15:32.438157 7f7f5a5ca710 log [INF] : osd1 :/0 boot
10.08.23_14:15:32.438356 7f7f5a5ca710 log [INF] : osd3 :/0 boot
10.08.23_14:15:32.438500 7f7f5a5ca710 log [INF] : osd2 :/0 boot
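To spot this pattern in a longer monitor log, something like the following could be used. It is only a sketch: the regular expression is inferred from the handful of log entries quoted in this report, and the names `BOOT_RE` and `blank_addr_boots` are made up for illustration, not part of Ceph.

```python
import re

# Pattern inferred from the log entries quoted in this report (an assumption,
# not an official log format): "<stamp> <thread> log [INF] : osdN <addr> boot"
BOOT_RE = re.compile(r"(\S+) \S+ log \[INF\] : (osd\d+) (\S+) boot")


def blank_addr_boots(log_text):
    """Return (timestamp, osd) pairs for boot entries whose address is blank."""
    hits = []
    for stamp, osd, addr in BOOT_RE.findall(log_text):
        # A populated address looks like "[ipv6]:port/nonce";
        # the bad entries in this report print as ":/0".
        if addr.startswith(":/"):
            hits.append((stamp, osd))
    return hits


# Entries copied from this report; works whether or not they are on one line.
SAMPLE = (
    "10.08.23_14:15:20.398952 7f7f5a5ca710 log [INF] : "
    "osd1 [2001:16f8:10:2::c3c3:a24f]:6800/19228 boot "
    "10.08.23_14:15:32.438157 7f7f5a5ca710 log [INF] : osd1 :/0 boot "
    "10.08.23_14:15:32.438356 7f7f5a5ca710 log [INF] : osd3 :/0 boot"
)

for stamp, osd in blank_addr_boots(SAMPLE):
    print(stamp, osd, "booted with a blank address")
```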
Map sequence was:
10209:
osd1 in weight 1 up (up_from 10207 up_thru 10196 down_at 10204 last_clean 6116-10203) [2001:16f8:10:2::c3c3:a24f]:6800/19228 [2001:16f8:10:2::c3c3:a24f]:6801/19228
osd2 in weight 1 up (up_from 10207 up_thru 9970 down_at 10205 last_clean 6120-10204) [2001:16f8:10:2::c3c3:4a8c]:6800/7865 [2001:16f8:10:2::c3c3:4a8c]:6801/7865
osd3 in weight 1 up (up_from 10207 up_thru 10202 down_at 10205 last_clean 6124-10204) [2001:16f8:10:2::c3c3:2e3a]:6800/23848 [2001:16f8:10:2::c3c3:2e3a]:6801/23848
10210:
osd1 in weight 1 down (up_from 10207 up_thru 10209 down_at 10210 last_clean 6116-10203)
osd2 in weight 1 down (up_from 10207 up_thru 10207 down_at 10210 last_clean 6120-10204)
osd3 in weight 1 down (up_from 10207 up_thru 10209 down_at 10210 last_clean 6124-10204)
10211:
osd1 in weight 1 up (up_from 10211 up_thru 10209 down_at 10210 last_clean 6116-10203) :/0 [2001:16f8:10:2::c3c3:a24f]:6801/19228
osd2 in weight 1 up (up_from 10211 up_thru 10207 down_at 10210 last_clean 6120-10204) :/0 [2001:16f8:10:2::c3c3:4a8c]:6801/7865
osd3 in weight 1 up (up_from 10211 up_thru 10209 down_at 10210 last_clean 6124-10204) :/0 [2001:16f8:10:2::c3c3:2e3a]:6801/23848
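The same check can be run against the map entries themselves. The sketch below parses the per-OSD entries of one epoch, flags OSDs that are up with a blank address, and notes whether `up_from` is exactly one epoch after `down_at` (marked down in one epoch, up in the next). The entry layout is assumed from the dump quoted in this report, and `OSD_RE`, `parse_osds` and `up_with_blank_addr` are hypothetical helper names, not Ceph tooling.

```python
import re

# Entry layout assumed from the map dump quoted in this report:
# "osdN in weight W up|down (up_from A up_thru B down_at C last_clean X-Y) [addr ...]"
OSD_RE = re.compile(
    r"(osd\d+) in weight \S+ (up|down) "
    r"\(up_from (\d+) up_thru \d+ down_at (\d+) last_clean \S+\)"
    r"((?: \S*:\d*/\d+)*)"
)


def parse_osds(dump):
    """Parse one epoch's OSD entries into a dict keyed by OSD name."""
    osds = {}
    for name, state, up_from, down_at, addrs in OSD_RE.findall(dump):
        osds[name] = {
            "state": state,
            "addrs": addrs.split(),
            # True when the OSD came up in the epoch right after it went down.
            "bounced": int(up_from) == int(down_at) + 1,
        }
    return osds


def up_with_blank_addr(osds):
    """Names of OSDs that are up but list a blank (":/0"-style) address."""
    return sorted(
        name
        for name, osd in osds.items()
        if osd["state"] == "up" and any(a.startswith(":/") for a in osd["addrs"])
    )


# Two entries from epoch 10211, copied from this report.
EPOCH_10211 = (
    "osd1 in weight 1 up (up_from 10211 up_thru 10209 down_at 10210 "
    "last_clean 6116-10203) :/0 [2001:16f8:10:2::c3c3:a24f]:6801/19228 "
    "osd2 in weight 1 up (up_from 10211 up_thru 10207 down_at 10210 "
    "last_clean 6120-10204) :/0 [2001:16f8:10:2::c3c3:4a8c]:6801/7865"
)

osds = parse_osds(EPOCH_10211)
print(up_with_blank_addr(osds))
print({name: osd["bounced"] for name, osd in osds.items()})
```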
There is no intervening failure report in the mon log, so this was a mark down + up done by the boot handler code.
Maybe the OSDs sent duplicate boot messages, and the current vs. pending checks in the monitor are off?
Updated by Sage Weil over 13 years ago
- Target version changed from v0.21.2 to v0.21.3
Updated by Sage Weil over 13 years ago
- Target version changed from v0.21.3 to v0.21.4
Updated by Sage Weil over 13 years ago
- Target version changed from v0.21.4 to v0.23
Updated by Sage Weil over 13 years ago
- Status changed from New to Can't reproduce
Couldn't find anything with code inspection, and haven't been able to reproduce. Hopefully if/when this pops up again we'll have full monitor logs.