Bug #8565

Calamari Install hangs forever when ceph is not there.

Added by Warren Usui over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: UI
Target version:
% Done: 100%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Steps:

Install Calamari all the way up to the ceph-deploy calamari connect operation.

The web page shows an ADD button for hosts requesting to be managed by Calamari.

Click the ADD button.

The "Accept Request Sent" window seems to wait forever for the cluster to join.

NOTE that Ceph has not yet been installed. This worked when Ceph was previously installed.

Also NOTE that this happens on Precise. I ran this on Precise because Bug 8558 is preventing
yum installations from getting this far.

192_168_100_59_8000_manage___first.png View (41.1 KB) Yan-Fa Li, 06/10/2014 11:26 PM


Subtasks

Bug #8566: Calamari Installation -- Asks for ceph installation when ceph is already there. (Closed, John Spray)

History

#1 Updated by John Spray over 7 years ago

  • Category set to UI

#2 Updated by Yan-Fa Li over 7 years ago

I'm confused, Warren. Can you manually verify using /api/v2/cluster that a cluster is actually recognized by Calamari?
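
A minimal sketch of that manual check, for illustration only. The /api/v2/cluster path comes from the comment above; the assumption that it returns a JSON array with one object per cluster, the function names, and the unauthenticated request are all hypothetical and may not match a real Calamari deployment.

```python
import json
import urllib.request


def has_registered_cluster(body: str) -> bool:
    """Return True if the response body lists at least one cluster."""
    clusters = json.loads(body)
    # Assumption: the endpoint returns a JSON array, one object per cluster.
    return isinstance(clusters, list) and len(clusters) > 0


def fetch_cluster_list(base_url: str, timeout: float = 10.0) -> str:
    """Fetch the raw body of /api/v2/cluster.

    Assumption: no login is needed; a real server may require a session.
    """
    url = base_url.rstrip("/") + "/api/v2/cluster"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

An empty array here would mean Calamari has not recognized any cluster, which is the state this bug describes.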

I don't understand your methodology. Is this from a completely fresh cluster and calamari install or are you re-installing on top of an old one?

I'd love to know the exact steps you are taking to get into this state. BTW, there is no way for Calamari to run without a valid FSID. There are 2 prerequisites for Calamari to even function in a reasonable way:

1. salt minions on each cluster node
2. a working ceph installation on those nodes in a valid and functioning cluster

So my questions are:

1. what steps have you completed?
2. what state is the cluster in?
3. has salt been successfully deployed to all members of the cluster?

Additionally, I would love it if you could take screen caps at all steps of the UI so I can see what you are seeing. It is hard to follow from a text description alone. I recommend Evernote Web Clipper. It has a built-in annotation and screen capture facility and works in Chrome and Firefox. Accounts are free.

Thanks!

#3 Updated by Dan Mick over 7 years ago

It doesn't make any sense to "ceph-deploy calamari connect" a host that isn't part of a Ceph cluster...

#4 Updated by John Spray over 7 years ago

Dan Mick wrote:

It doesn't make any sense to "ceph-deploy calamari connect" a host that isn't part of a Ceph cluster...

Perhaps for the moment, but when Calamari gets its provisioning powers this will be the normal order: first add your servers to calamari, then set up Ceph. The only part that is opinionated about the order at the moment is the UI with this "detecting clusters" spinner. It should never be stalling forever though: worst case it should drop back to the "Ceph servers are connected to Calamari, but no Ceph cluster has been created yet" prompt mentioned in #8566.

Yan - is there a timeout mechanism in place here? How long should Warren have to wait for the UI to give up waiting for a cluster and fall back to something else?

#5 Updated by Yan-Fa Li over 7 years ago

@John I can certainly add a timeout. If Calamari doesn't register a new cluster within, say, 3 minutes, I'll pop up a message. This, however, is a proxy for progress indicators.

How does this sound:

1. try for 3 minutes to wait for a cluster to be registered
2. if more than 3 minutes pass, display a message explaining that no cluster was discovered and asking the user whether they have installed Ceph yet
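
The proposal above could be sketched roughly as follows. This is not the Calamari UI code: the function name, the 5 second poll interval, and the injectable clock and sleep parameters are assumptions made for illustration. Only the 3 minute limit comes from the comment.

```python
import time
from typing import Callable


def wait_for_cluster(
    cluster_registered: Callable[[], bool],
    timeout_s: float = 180.0,       # 3 minutes, as proposed above
    poll_interval_s: float = 5.0,   # assumption: interval is not stated in the ticket
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until a cluster is registered or the timeout expires.

    Returns True if a cluster was seen, False if the caller should show
    the "no cluster was discovered" message instead.
    """
    deadline = clock() + timeout_s
    while True:
        if cluster_registered():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

Passing the clock and sleep functions in keeps the loop testable without actually waiting three minutes.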

#6 Updated by Warren Usui over 7 years ago

Here's what I did.

1. Reimaged two vms

2. Established an ssh connection between the two vms.

3. Ran ice_setup.py on machine A

4. Ran calamari-ctl initialize on machine A.

5. Started a browser and went to machine A.

6. Ran ceph-deploy calamari connect

7. Clicked the ADD button on the browser to add the host to be recognized by Calamari.

8. Went out to get some coffee.

9. Came back and ^C'ed out of the screen that said Accept Request sent (the green button in the lower right of the screen did not work).

The bug being reported here was that the Accept Request sent window took several minutes and still did not indicate completion.
Bug 8566 is what I did next.

#7 Updated by Yan-Fa Li over 7 years ago

Fixed on master, commit f4d61aa6c4ae28102d290279c31d3020bd5c9015.

- add a 3 minute timeout to polling loop
- add a warning message that the cluster was not initialized and alternative steps are needed
- add a countdown timer to modal to display remaining time to wait
- add a skip button to UI to bypass the waiting period

See screenshot for new message. This probably needs a tech writer to wordsmith it.

There are 3 use cases this should now handle:

1. Cluster never gets discovered by Calamari. Goes to this message after 3 minutes.
2. User clicks Skip. Goes to this message.
3. Cluster is discovered as normal.
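
The three use cases can be modeled as a small decision function. The sketch below is hypothetical and is not taken from the referenced commit: the names Outcome, resolve, and countdown_label are invented here, and letting a discovered cluster take priority over Skip is an assumption.

```python
from enum import Enum

TIMEOUT_S = 180  # 3 minute limit from the fix description


class Outcome(Enum):
    WAITING = "waiting"        # keep showing the modal and countdown
    DISCOVERED = "discovered"  # use case 3: proceed as normal
    NOT_FOUND = "not_found"    # use cases 1 and 2: show the warning message


def resolve(elapsed_s: float, cluster_found: bool, skip_clicked: bool) -> Outcome:
    """Decide what the modal should do at this moment."""
    if cluster_found:
        # Assumption: a discovered cluster wins over Skip and the timeout.
        return Outcome.DISCOVERED
    if skip_clicked or elapsed_s >= TIMEOUT_S:
        return Outcome.NOT_FOUND
    return Outcome.WAITING


def countdown_label(elapsed_s: float) -> str:
    """Format the remaining wait time as M:SS for the countdown timer."""
    remaining = max(0, TIMEOUT_S - int(elapsed_s))
    minutes, seconds = divmod(remaining, 60)
    return f"{minutes}:{seconds:02d}"
```

For example, `resolve(10, False, True)` yields `Outcome.NOT_FOUND` (Skip was clicked), and `countdown_label(75)` yields `"1:45"`.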

#8 Updated by Yan-Fa Li over 7 years ago

Some minor fixes on commit c8795c992692cb1107ab70ef08c9282ecf832ef0

- disable timeouts once cluster is connected
- disable skip button once cluster is up
- hide timeout message once cluster is up

#9 Updated by Warren Usui over 7 years ago

  • Status changed from Resolved to Closed

This appears to be fixed.
