Calamari Install hangs forever when ceph is not there.
Install Calamari all the way up to the ceph-deploy calamari connect operation.
The web page shows an ADD button for Hosts Requesting to be managed by Calamari.
Click ADD button.
The Accept Request Sent window seems to wait forever for the cluster to join.
NOTE that Ceph has not yet been installed. This worked when Ceph was previously installed.
Also NOTE that this happens on Precise. I ran this on Precise because Bug 8558 is preventing
yum installations from getting this far.
#2 Updated by Yan-Fa Li over 7 years ago
I'm confused, Warren. Can you manually verify using
/api/v2/cluster that a cluster is actually recognized by Calamari?
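One way to do the manual check is to save the response body from /api/v2/cluster and look at what it lists. The sketch below only parses such a body; it assumes the endpoint returns a JSON list of cluster objects with "id" (the FSID) and "name" fields, which is an assumption here and not something confirmed in this thread. The sample bodies are made up for illustration.

```python
import json


def recognized_clusters(body):
    """Return (id, name) pairs from a JSON list of cluster objects.

    Assumption: the body is a JSON list and each entry carries
    "id" and "name" keys. Missing keys come back as None.
    """
    clusters = json.loads(body)
    if not isinstance(clusters, list):
        raise ValueError("expected a JSON list of clusters")
    return [(c.get("id"), c.get("name")) for c in clusters]


# Hypothetical sample bodies, not captured from a real server.
empty_body = "[]"
one_cluster_body = (
    '[{"id": "00000000-0000-0000-0000-000000000000", "name": "ceph"}]'
)

print(recognized_clusters(empty_body))        # empty list: nothing recognized
print(recognized_clusters(one_cluster_body))  # one (id, name) pair
```

An empty list would match the situation described in this report: hosts are connected, but no cluster is known to Calamari.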
I don't understand your methodology. Is this from a completely fresh cluster and calamari install or are you re-installing on top of an old one?
I'd love to know the exact steps you are taking to get into this state. BTW, there is no way for Calamari to run without a valid FSID. There are 2 pre-requisites for Calamari to even function in a reasonable way:
1. salt minions on each cluster node
2. a working ceph installation on those nodes in a valid and functioning cluster
So my questions are:
1. what steps have you completed?
2. what state is the cluster in?
3. has salt been successfully deployed to all members of the cluster?
Additionally, I would love it if you could take screen caps at all steps of the UI so I can see what you are seeing. Describing it in text is not very easy to follow. I recommend Evernote Web Clipper. It has a built-in annotation and screen capture facility and works in any Chrome or Firefox browser. Accounts are free.
#4 Updated by John Spray over 7 years ago
Dan Mick wrote:
It doesn't make any sense to "ceph-deploy calamari connect" a host that isn't part of a Ceph cluster...
Perhaps for the moment, but when Calamari gets its provisioning powers this will be the normal order: first add your servers to calamari, then set up Ceph. The only part that is opinionated about the order at the moment is the UI with this "detecting clusters" spinner. It should never be stalling forever though: worst case it should drop back to the "Ceph servers are connected to Calamari, but no Ceph cluster has been created yet" prompt mentioned in #8566.
Yan - is there a timeout mechanism in place here? How long should Warren have to wait for the UI to give up waiting for a cluster and fall back to something else?
#5 Updated by Yan-Fa Li over 7 years ago
@John I can certainly add a timeout. If Calamari doesn't register a new cluster within, say, 3 minutes, I'll pop up a message. This, however, is only a proxy for progress indicators.
How does this sound:
1. try for 3 minutes to wait for a cluster to be registered
2. if > 3 minutes display a message explaining no cluster was discovered
and asking user if they have installed ceph yet?
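The proposal above can be sketched as a bounded polling loop. This is a minimal illustration in Python, not the Calamari UI code; `wait_for_cluster`, the `cluster_registered` callback, and the 5 second polling interval are all hypothetical names and values chosen for the example. Only the 3 minute limit comes from the proposal.

```python
import time


def wait_for_cluster(cluster_registered, timeout=180.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until a cluster is registered or the timeout expires.

    cluster_registered: hypothetical callable returning True once a
    cluster shows up. clock and sleep are injectable so the loop can
    be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if cluster_registered():
            return "registered"
        remaining = deadline - clock()
        if remaining <= 0:
            # Step 2 of the proposal: give up and let the caller show
            # the "no cluster was discovered" message.
            return "timed_out"
        sleep(min(interval, remaining))


# A cluster that is already there is reported on the first poll.
print(wait_for_cluster(lambda: True))
```

On "timed_out" the caller would display the message asking whether Ceph has been installed yet.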
#6 Updated by Warren Usui over 7 years ago
Here's what I did.
1. Reimaged two vms
2. Established an ssh connection between the two vms.
3. Ran ice_setup.py on machine A
4. Ran calamari-ctl initialize on machine A.
5. Started a browser and went to machine A.
6. Ran ceph-deploy calamari connect
7. Clicked the ADD button on the browser to add the host to be recognized by Calamari.
8. Went out to get some coffee.
9. Came back and ^C'ed out of the screen that said Accept Request Sent (the green button in the lower right of the screen did not work).
The bug being reported here was that the Accept Request sent window took several minutes and still did not indicate completion.
Bug 8566 is what I did next.
#7 Updated by Yan-Fa Li over 7 years ago
- File 192_168_100_59_8000_manage___first.png View added
- Status changed from New to Resolved
- Assignee set to Yan-Fa Li
- Target version set to v1.2-dev11
- Source changed from other to Q/A
Fixed on master, commit f4d61aa6c4ae28102d290279c31d3020bd5c9015.
- add a 3 minute timeout to polling loop
- add a warning message that the cluster was not initialized and
alternative steps are needed
- add a count down timer to modal to display remaining time to wait
- add a skip button to UI to bypass the waiting period
See screenshot for new message. This probably needs a tech writer to wordsmith it.
There are 3 use cases this should now handle:
1. cluster never gets discovered by Calamari. Goes to this message after 3 minutes.
2. user clicks Skip. Goes to this message.
3. Cluster is discovered as normal.
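The three use cases, plus the countdown shown in the modal, can be summarized as a small decision function. This is a sketch in Python for illustration only; it is not the code from the commit above, and the function names and return strings are hypothetical.

```python
def resolve_outcome(cluster_found, skipped, elapsed, timeout=180):
    """Map the modal's state to one of the outcomes listed above.

    elapsed and timeout are in seconds; 180 matches the 3 minute limit.
    """
    if cluster_found:
        return "cluster_discovered"          # use case 3
    if skipped or elapsed >= timeout:
        return "show_no_cluster_message"     # use cases 1 and 2
    return "keep_waiting"


def format_remaining(elapsed, timeout=180):
    """Format the time left to wait as M:SS for a countdown display."""
    remaining = max(0, int(timeout - elapsed))
    minutes, seconds = divmod(remaining, 60)
    return f"{minutes}:{seconds:02d}"


print(resolve_outcome(False, False, 180))  # timeout reached
print(resolve_outcome(False, True, 10))    # user clicked Skip
print(resolve_outcome(True, False, 10))    # cluster discovered
print(format_remaining(95))
```

Timeout and Skip deliberately map to the same result, since both lead to the same warning message.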