Bug #54030

mgr/dashboard: cephadm e2e failing because of rgw commands getting stuck

Added by Nizamudeen A about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Testing & QA
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The log excerpt below shows a radosgw-admin command hanging because of the health state of the cluster: the PG map reports "1 pgs: 1 unknown", and `radosgw-admin realm list` times out after 10 seconds.
As discussed in https://github.com/ceph/ceph/pull/44384#issuecomment-1019969906, the tests need to be adapted accordingly.

Jan 21 15:19:21 ceph-node-00 ceph-mgr[11741]: [cephadm INFO cephadm.serve] Checking dashboard <-> RGW credentials
Jan 21 15:19:21 ceph-node-00 ceph-mgr[11741]: log_channel(cephadm) log [INF] : Checking dashboard <-> RGW credentials
Jan 21 15:19:21 ceph-node-00 ceph-mgr[11741]: [dashboard INFO rgw_client] Configuring dashboard RGW credentials
Jan 21 15:19:21 ceph-node-00 ceph-mgr[11741]: [dashboard INFO orchestrator] is orchestrator available: True, 
Jan 21 15:19:21 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.015s] [admin] [385.0B] /api/service
Jan 21 15:19:22 ceph-node-00 ceph-mgr[11741]: log_channel(cluster) log [DBG] : pgmap v74: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Jan 21 15:19:23 ceph-node-00 ceph-mgr[11741]: [progress INFO root] Writing back 2 completed events
Jan 21 15:19:23 ceph-node-00 ceph-mgr[11741]: [progress INFO root] Processing OSDMap change 5..5
Jan 21 15:19:24 ceph-node-00 ceph-mgr[11741]: log_channel(cluster) log [DBG] : pgmap v75: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Jan 21 15:19:26 ceph-node-00 ceph-mgr[11741]: log_channel(cluster) log [DBG] : pgmap v76: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Jan 21 15:19:26 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.005s] [admin] [22.0B] /api/prometheus/notifications
Jan 21 15:19:26 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.011s] [admin] [877.0B] /api/summary
Jan 21 15:19:26 ceph-node-00 ceph-mgr[11741]: [dashboard INFO orchestrator] is orchestrator available: True, 
Jan 21 15:19:26 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.011s] [admin] [385.0B] /api/service
Jan 21 15:19:28 ceph-node-00 ceph-mgr[11741]: log_channel(cluster) log [DBG] : pgmap v77: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Jan 21 15:19:28 ceph-node-00 ceph-mgr[11741]: [progress INFO root] Processing OSDMap change 5..5
Jan 21 15:19:30 ceph-node-00 ceph-mgr[11741]: log_channel(cluster) log [DBG] : pgmap v78: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.007s] [admin] [22.0B] /api/prometheus/notifications
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.012s] [admin] [877.0B] /api/summary
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [dashboard ERROR root] Timeout (10s) executing radosgw-admin ['radosgw-admin', '-c', '/etc/ceph/ceph.conf', '-k', '/var/lib/ceph/mgr/ceph-ceph-node-00.njzmln/keyring', '-n', 'mgr.ceph-node-00.njzmln', 'realm', 'list']
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [dashboard ERROR rgw_client] Command '['radosgw-admin', '-c', '/etc/ceph/ceph.conf', '-k', '/var/lib/ceph/mgr/ceph-ceph-node-00.njzmln/keyring', '-n', 'mgr.ceph-node-00.njzmln', 'realm', 'list']' timed out after 10 seconds
                                              Traceback (most recent call last):
                                                File "/lib64/python3.6/subprocess.py", line 425, in run
                                                  stdout, stderr = process.communicate(input, timeout=timeout)
                                                File "/lib64/python3.6/subprocess.py", line 863, in communicate
                                                  stdout, stderr = self._communicate(input, endtime, timeout)
                                                File "/lib64/python3.6/subprocess.py", line 1535, in _communicate
                                                  self._check_timeout(endtime, orig_timeout)
                                                File "/lib64/python3.6/subprocess.py", line 891, in _check_timeout
                                                  raise TimeoutExpired(self.args, orig_timeout)
                                              subprocess.TimeoutExpired: Command '['radosgw-admin', '-c', '/etc/ceph/ceph.conf', '-k', '/var/lib/ceph/mgr/ceph-ceph-node-00.njzmln/keyring', '-n', 'mgr.ceph-node-00.njzmln', 'realm', 'list']' timed out after 10 seconds

                                              During handling of the above exception, another exception occurred:

                                              Traceback (most recent call last):
                                                File "/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 243, in configure_rgw_credentials
                                                  _, out, err = mgr.send_rgwadmin_command(['realm', 'list'])
                                                File "/usr/share/ceph/mgr/mgr_module.py", line 2235, in send_rgwadmin_command
                                                  timeout=10,
                                                File "/lib64/python3.6/subprocess.py", line 430, in run
                                                  stderr=stderr)
                                              subprocess.TimeoutExpired: Command '['radosgw-admin', '-c', '/etc/ceph/ceph.conf', '-k', '/var/lib/ceph/mgr/ceph-ceph-node-00.njzmln/keyring', '-n', 'mgr.ceph-node-00.njzmln', 'realm', 'list']' timed out after 10 seconds
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [progress WARNING root] complete: ev a17c58b8-dd16-4393-8694-6c4c78723bd8 does not exist
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [progress WARNING root] complete: ev a092e721-a7cb-47d5-a218-4ef2698477b6 does not exist
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [progress WARNING root] complete: ev 609ee482-a1d1-49d6-8a6b-9e5076d0b978 does not exist
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [progress WARNING root] complete: ev 25dd2cd7-80c6-4450-bd53-d9d8384616ca does not exist
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [cephadm INFO cephadm.serve] Purge service mds.test
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: log_channel(cephadm) log [INF] : Purge service mds.test
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [dashboard INFO orchestrator] is orchestrator available: True, 
Jan 21 15:19:31 ceph-node-00 ceph-mgr[11741]: [dashboard INFO request] [::ffff:192.168.100.1:40492] [GET] [200] [0.014s] [admin] [336.0B] /api/service
Jan 21 15:19:32 ceph-node-00 ceph-mgr[11741]: log_channel(cluster) log [DBG] : pgmap v79: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Jan 21 15:19:33 ceph-node-00 ceph-mgr[11741]: [progress INFO root] Processing OSDMap change 5..5
Jan 21 15:19:34 ceph-node-00 ceph-mgr[11741]: [cephadm INFO cephadm.serve] Checking dashboard <-> RGW credentials
Jan 21 15:19:34 ceph-node-00 ceph-mgr[11741]: log_channel(cephadm) log [INF] : Checking dashboard <-> RGW credentials
Jan 21 15:19:34 ceph-node-00 ceph-mgr[11741]: [dashboard INFO rgw_client] Configuring dashboard RGW credentials
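
The traceback above shows `subprocess.run` being called with `timeout=10` and raising `subprocess.TimeoutExpired`, which then surfaces as a second exception in the caller. As an illustration only, a caller could turn the timeout into an ordinary error tuple instead of letting it propagate. The sketch below is not the actual `send_rgwadmin_command` code: the function name `run_with_timeout` and the choice of `-errno.ETIMEDOUT` as return code are assumptions made for this example.

```python
import errno
import subprocess
import sys
from typing import List, Tuple


def run_with_timeout(cmd: List[str], timeout: float = 10.0) -> Tuple[int, str, str]:
    """Run cmd and return (returncode, stdout, stderr).

    Hypothetical helper: on timeout it returns -errno.ETIMEDOUT and an
    explanatory message instead of raising subprocess.TimeoutExpired.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
            universal_newlines=True,  # decode output as text (works on Python 3.6)
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child process at this point.
        message = "command {} timed out after {}s".format(cmd, timeout)
        return -errno.ETIMEDOUT, "", message
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in for a command that hangs, e.g. while the cluster is unhealthy.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, timeout=0.5))
```

With a helper like this, a test or caller can check the return code and retry once the cluster is healthy, rather than handling a nested traceback.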

Related issues

Copied to Dashboard - Backport #54178: pacific: mgr/dashboard: cephadm e2e failing because of rgw commands getting stuck Resolved

History

#1 Updated by Ernesto Puerta about 2 years ago

  • Status changed from New to Pending Backport
  • Assignee set to Nizamudeen A
  • Pull request ID set to 44825

#2 Updated by Backport Bot about 2 years ago

  • Copied to Backport #54178: pacific: mgr/dashboard: cephadm e2e failing because of rgw commands getting stuck added

#3 Updated by Ernesto Puerta about 2 years ago

  • Status changed from Pending Backport to Resolved
