Bug #57973

rook: rook module fails to connect to the k8s API server because the server uses a self-signed certificate

Added by Ben Gao 3 months ago. Updated 3 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Steps to reproduce:

1. Deploy Ceph on a k8s cluster with Rook.
2. Run the following to enable Rook as the orchestrator:
   ceph mgr module enable rook
   ceph orch set backend rook
3. The dashboard hangs, and the active mgr log shows the following:

debug 2022-10-21T09:07:21.760+0000 7f018e586700 -1 Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
self._validate_conn(conn)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 839, in _validate_conn
conn.connect()
File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 358, in connect
ssl_context=context)
File "/usr/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 367, in ssl_wrap_socket
return context.wrap_socket(sock)
File "/usr/lib64/python3.6/ssl.py", line 365, in wrap_socket
_context=self, _session=session)
File "/usr/lib64/python3.6/ssl.py", line 776, in __init__
self.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
self._sslobj.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/share/ceph/mgr/rook/module.py", line 195, in serve
self._k8s_CoreV1_api.list_namespaced_pod(self._rook_env.namespace)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api/core_v1_api.py", line 12803, in list_namespaced_pod
(data) = self.list_namespaced_pod_with_http_info(namespace, **kwargs) # noqa: E501
File "/usr/lib/python3.6/site-packages/kubernetes/client/api/core_v1_api.py", line 12905, in list_namespaced_pod_with_http_info
collection_formats=collection_formats)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 345, in call_api
_preload_content, _request_timeout)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 176, in __call_api
_request_timeout=_request_timeout)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 366, in request
headers=headers)
File "/usr/lib/python3.6/site-packages/kubernetes/client/rest.py", line 241, in GET
query_params=query_params)
File "/usr/lib/python3.6/site-packages/kubernetes/client/rest.py", line 214, in request
headers=headers)
File "/usr/lib/python3.6/site-packages/urllib3/request.py", line 68, in request
**urlopen_kw)
File "/usr/lib/python3.6/site-packages/urllib3/request.py", line 89, in request_encode_url
return self.urlopen(method, url, **extra_kw)
File "/usr/lib/python3.6/site-packages/urllib3/poolmanager.py", line 324, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen
**response_kw)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen
**response_kw)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen
**response_kw)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='169.169.0.1', port=443): Max retries exceeded with url: /api/v1/namespaces/rook-ceph/pods (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),))
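
To narrow down whether the failure comes from the trust chain rather than from the rook module, the same handshake can be attempted outside the mgr. The following is a minimal sketch using only the Python standard library. The helper name `check_tls` is hypothetical and not part of Ceph or Rook. The CA path is the usual service account mount inside a pod and is an assumption about this deployment. The host and port in the usage note are taken from the log above.

```python
import socket
import ssl

# Usual in-pod location of the cluster CA bundle (service account mount).
# Assumption: the mgr pod uses the default service account mount path.
IN_CLUSTER_CA = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def check_tls(host, port=443, cafile=None, timeout=5.0):
    """Attempt a verified TLS handshake and return (ok, message).

    cafile=None uses the system default trust store; pass a CA bundle
    path to verify against that bundle instead.
    """
    try:
        # Loads the CA bundle; fails here if the file is missing or invalid.
        context = ssl.create_default_context(cafile=cafile)
    except OSError as exc:
        return False, "cannot load CA bundle: {}".format(exc)

    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Certificate and hostname verification happen during wrap_socket.
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, "verified, negotiated {}".format(tls.version())
    except (ssl.SSLError, ssl.CertificateError) as exc:
        # CERTIFICATE_VERIFY_FAILED from the log would be reported here.
        return False, "TLS handshake failed: {}".format(exc)
    except OSError as exc:
        # Refused connections, timeouts, unreachable hosts.
        return False, "connection failed: {}".format(exc)
```

From a shell in the active mgr pod, calling `check_tls("169.169.0.1", 443, cafile=IN_CLUSTER_CA)` should indicate whether the API server certificate chains to the mounted CA. If it reports a handshake failure there as well, the problem is likely in the cluster's certificates rather than in Ceph.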

Expected result:
Rook works as the orchestrator and can manage NFS clusters and exports.

History

#1 Updated by Ben Gao 3 months ago

I am working on it.

#2 Updated by Ben Gao 3 months ago

This seems to be caused by a bad k8s certificate trust chain. Ceph is fine. This bug can be closed.
