Bug #52116

closed

kubeadm task fails with error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

Added by Neha Ojha over 2 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
mgr/rook
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-08-06T17:08:58.760 INFO:teuthology.orchestra.run.smithi187.stdout:[kubelet-check] It seems like the kubelet isn't running or healthy.
2021-08-06T17:08:58.761 INFO:teuthology.orchestra.run.smithi187.stdout:[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
2021-08-06T17:08:58.761 INFO:teuthology.orchestra.run.smithi187.stdout:
2021-08-06T17:08:58.762 INFO:teuthology.orchestra.run.smithi187.stdout: Unfortunately, an error has occurred:
2021-08-06T17:08:58.762 INFO:teuthology.orchestra.run.smithi187.stdout:         timed out waiting for the condition
2021-08-06T17:08:58.762 INFO:teuthology.orchestra.run.smithi187.stdout:
2021-08-06T17:08:58.763 INFO:teuthology.orchestra.run.smithi187.stdout: This error is likely caused by:
2021-08-06T17:08:58.763 INFO:teuthology.orchestra.run.smithi187.stdout:         - The kubelet is not running
2021-08-06T17:08:58.763 INFO:teuthology.orchestra.run.smithi187.stdout:         - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
2021-08-06T17:08:58.763 INFO:teuthology.orchestra.run.smithi187.stdout:
2021-08-06T17:08:58.764 INFO:teuthology.orchestra.run.smithi187.stdout: If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
2021-08-06T17:08:58.764 INFO:teuthology.orchestra.run.smithi187.stdout:         - 'systemctl status kubelet'
2021-08-06T17:08:58.765 INFO:teuthology.orchestra.run.smithi187.stdout:         - 'journalctl -xeu kubelet'
2021-08-06T17:08:58.765 INFO:teuthology.orchestra.run.smithi187.stdout:
2021-08-06T17:08:58.765 INFO:teuthology.orchestra.run.smithi187.stdout: Additionally, a control plane component may have crashed or exited when started by the container runtime.
2021-08-06T17:08:58.765 INFO:teuthology.orchestra.run.smithi187.stdout: To troubleshoot, list all containers using your preferred container runtimes CLI.
2021-08-06T17:08:58.765 INFO:teuthology.orchestra.run.smithi187.stdout:
2021-08-06T17:08:58.766 INFO:teuthology.orchestra.run.smithi187.stdout: Here is one example how you may list all Kubernetes containers running in docker:
2021-08-06T17:08:58.766 INFO:teuthology.orchestra.run.smithi187.stdout:         - 'docker ps -a | grep kube | grep -v pause'
2021-08-06T17:08:58.766 INFO:teuthology.orchestra.run.smithi187.stdout:         Once you have found the failing container, you can inspect its logs with:
2021-08-06T17:08:58.766 INFO:teuthology.orchestra.run.smithi187.stdout:         - 'docker logs CONTAINERID'
2021-08-06T17:08:58.767 INFO:teuthology.orchestra.run.smithi187.stdout:
2021-08-06T17:08:58.767 DEBUG:teuthology.orchestra.run:got remote process result: 1
2021-08-06T17:08:58.768 INFO:teuthology.orchestra.run.smithi187.stderr:error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
2021-08-06T17:08:58.769 INFO:teuthology.orchestra.run.smithi187.stderr:To see the stack trace of this error execute with --v=5 or higher
2021-08-06T17:08:58.769 ERROR:tasks.kubeadm:Command failed on smithi187 with status 1: 'sudo kubeadm init --node-name smithi187 --token abcdef.8trb50ra0rrc1b01 --pod-network-cidr 10.253.208.0/21'
Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/kubeadm.py", line 252, in kubeadm_init_join
    bootstrap_remote.run(args=cmd)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/orchestra/remote.py", line 509, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on smithi187 with status 1: 'sudo kubeadm init --node-name smithi187 --token abcdef.8trb50ra0rrc1b01 --pod-network-cidr 10.253.208.0/21'

/a/yuriw-2021-08-06_16:31:19-rados-wip-yuri-master-8.6.21-distro-basic-smithi/6324419
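
The kubelet-check lines in the log above report that the HTTP call to http://localhost:10248/healthz was refused. As a rough illustration only (not part of the qa/tasks/kubeadm.py task), the same probe could be sketched in Python as follows. The function name `kubelet_healthy` and the 2-second timeout are assumptions made for this example; the URL is the one quoted in the log.

```python
import urllib.error
import urllib.request


def kubelet_healthy(url="http://localhost:10248/healthz", timeout=2.0):
    """Return True if the endpoint answers HTTP 200, False otherwise.

    Hypothetical helper for illustration; the default URL is the kubelet
    healthz address quoted in the kubeadm output.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers "connection refused", timeouts, and HTTP error responses.
        return False
```

Calling `kubelet_healthy()` on the affected node would be expected to return False while the kubelet is down, which corresponds to the "connection refused" message in the log. It does not explain why the kubelet failed; 'systemctl status kubelet' and 'journalctl -xeu kubelet', as suggested in the kubeadm output, remain the places to look for the cause.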


Related issues 1 (1 open, 0 closed)

Related to Orchestrator - Bug #58258: rook: kubelet fails from connection refused (New)

Actions #1

Updated by Neha Ojha over 2 years ago

  • Backport set to pacific

/a/sseshasa-2021-08-06_04:49:51-rados-wip-sseshasa2-testing-2021-08-04-1847-pacific-distro-basic-smithi/6323276

Actions #2

Updated by Sebastian Wagner over 2 years ago

  • Category set to mgr/rook
Actions #3

Updated by Joseph Sawaya over 2 years ago

  • Status changed from New to Pending Backport
Actions #4

Updated by Neha Ojha over 2 years ago

  • Pull request ID set to 42709
Actions #5

Updated by Deepika Upadhyay over 2 years ago

  • Priority changed from Normal to High
Actions #6

Updated by Deepika Upadhyay over 2 years ago

ceph/teuthology-archive/yuriw-2021-10-02_15:03:31-rados-wip-yuri2-testing-2021-10-01-0902-pacific-distro-basic-smithi/6417863/teuthology.log

Actions #7

Updated by Laura Flores over 2 years ago

teuthology/yuriw-2021-11-08_15:10:38-rados-wip-yuri8-testing-2021-11-02-1009-pacific-distro-basic-smithi/6491072/teuthology.log

Updating this Tracker to keep a history of the issue. This occurred in a recent Pacific Teuthology run.

Actions #9

Updated by Sebastian Wagner over 2 years ago

  • Status changed from Pending Backport to Resolved
Actions #10

Updated by Laura Flores over 1 year ago

  • Related to Bug #58258: rook: kubelet fails from connection refused added