Bug #6226
after editing crushmap and adding new hosts, injecting it, several existing OSDs crashed
Status: Closed
Description
I edited the crushmap on our 70-OSD, 10-server Ceph 0.61.5 cluster (see attached file) and injected it.
I had trouble with the new OSDs (36, 65, 66, 67, 68, 69) ending up under the wrong hosts, so I repositioned them with:
ceph osd crush set 68 0.5 root=ssd host=h4ssd
This added them to the correct spot in the hierarchy.
However, around 10 of my existing OSD processes crashed with the attached logs (search for "assert" to find the crash).
I'm not sure whether the two events are related.
Updated by Samuel Just over 10 years ago
- Assignee changed from Samuel Just to David Zafman
Updated by David Zafman over 10 years ago
The bug description says the cluster is running v0.61.5, but the attached log says v0.61.7. Could there be a mix of versions across nodes?
I haven't yet been able to reproduce with all machines running v0.61.7.
Updated by Jens-Christian Fischer over 10 years ago
I was wrong; we are indeed on 0.61.7:
root@ineri ~$ ndo all_nodes ceph --version
h0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
h1 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
h2 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
h3 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
h4 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
h5 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
s0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
s1 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
s2 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
s4 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
This bug has not resurfaced (luckily), and I'm not particularly keen on trying to reproduce it :)
Not sure what the best way forward is at this point.
/jc
Updated by David Zafman over 10 years ago
There is a race, already fixed in a later release by 01d3e094, which could allow start_recovery_ops() to be called with a negative value for the "max" argument. The way to hit this assert is for recovery to be required on one or more replicas but not on the primary. In that case no replica operations would be started, and the code would attempt to transition to the Recovered state.
I think we should add a call to needs_recovery() before assuming that recovery must be done. There are other possible obscure error paths that could also lead to this assert. That change could be backported too.
Updated by David Zafman over 10 years ago
- Status changed from New to Fix Under Review
Updated by David Zafman over 10 years ago
- Status changed from Fix Under Review to Resolved
This was already fixed by backport commit 1ea6b561 in the v0.61.8 release. See the previous comment.