Bug #48924

cephadm: upgrade process failed to pull target image: not enough values to unpack (expected 2, got 1)

Added by Gunther Heinrich about 1 month ago. Updated 23 days ago.

Status:
Fix Under Review
Priority:
High
Assignee:
-
Category:
orchestrator
Target version:
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

When trying to upgrade a cluster from version 15.2.7 to 15.2.8, the process fails after seemingly handing the upgrade over to a newly deployed mgr daemon:

2021-01-19T08:44:59.723616+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 315 : cephadm [INF] Upgrade: First pull of docker.io/ceph/ceph:v15.2.8
2021-01-19T08:45:06.056900+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 319 : cephadm [INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.8 with id 5553b0cb212ca2aa220d33ba39d9c602c8412ce6c5febc57ef9cdc9c5844b185
2021-01-19T08:45:06.061479+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 320 : cephadm [INF] Upgrade: Checking mgr daemons...
2021-01-19T08:45:07.284208+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 322 : cephadm [INF] It is presumed safe to stop ['mgr.iz-ceph-v1-mon-01.elswai']
2021-01-19T08:45:07.284314+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 323 : cephadm [INF] Upgrade: It is presumed safe to stop ['mgr.iz-ceph-v1-mon-01.elswai']
2021-01-19T08:45:07.284388+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 324 : cephadm [INF] Upgrade: Redeploying mgr.iz-ceph-v1-mon-01.elswai
2021-01-19T08:45:07.310192+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 325 : cephadm [INF] Deploying daemon mgr.iz-ceph-v1-mon-01.elswai on iz-ceph-v1-mon-01
2021-01-19T08:45:27.812855+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 336 : cephadm [INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.8 with id 5553b0cb212ca2aa220d33ba39d9c602c8412ce6c5febc57ef9cdc9c5844b185
2021-01-19T08:45:27.815176+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 337 : cephadm [INF] Upgrade: Checking mgr daemons...
2021-01-19T08:45:31.563173+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 340 : cephadm [INF] It is presumed safe to stop ['mgr.iz-ceph-v1-mon-02.foqmfa']
2021-01-19T08:45:31.563262+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 341 : cephadm [INF] Upgrade: It is presumed safe to stop ['mgr.iz-ceph-v1-mon-02.foqmfa']
2021-01-19T08:45:31.563330+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 342 : cephadm [INF] Upgrade: Redeploying mgr.iz-ceph-v1-mon-02.foqmfa
2021-01-19T08:45:31.645213+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 343 : cephadm [INF] Deploying daemon mgr.iz-ceph-v1-mon-02.foqmfa on iz-ceph-v1-mon-02
2021-01-19T08:45:46.375987+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 351 : cephadm [INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.8 with id 5553b0cb212ca2aa220d33ba39d9c602c8412ce6c5febc57ef9cdc9c5844b185
2021-01-19T08:45:46.390768+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 352 : cephadm [INF] Upgrade: Checking mgr daemons...
2021-01-19T08:45:46.391212+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 353 : cephadm [INF] Upgrade: Need to upgrade myself (mgr.iz-ceph-v1-mon-03.ncnoal)
2021-01-19T08:45:46.393970+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.20125479) 354 : cephadm [INF] Upgrade: there are 2 other already-upgraded standby mgrs, failing over
2021-01-19T08:46:04.945765+0000 mgr.iz-ceph-v1-mon-02.foqmfa (mgr.20156181) 5 : cephadm [INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.8 with id 5553b0cb212ca2aa220d33ba39d9c602c8412ce6c5febc57ef9cdc9c5844b185
2021-01-19T08:46:04.948477+0000 mgr.iz-ceph-v1-mon-02.foqmfa (mgr.20156181) 6 : cephadm [INF] Upgrade: Checking mgr daemons...
2021-01-19T08:46:05.435559+0000 mgr.iz-ceph-v1-mon-02.foqmfa (mgr.20156181) 7 : cephadm [INF] Upgrade: Pulling docker.io/ceph/ceph:v15.2.8 on iz-ceph-v1-mon-03
2021-01-19T08:47:08.755726+0000 mgr.iz-ceph-v1-mon-02.foqmfa (mgr.20156181) 40 : cephadm [ERR] Upgrade: Paused due to UPGRADE_FAILED_PULL: Upgrade: failed to pull target image

When trying to pull the image manually, cephadm logs the following exception:

Using recent ceph image docker.io/ceph/ceph:v15.2.8
Pulling container image docker.io/ceph/ceph:v15.2.8...
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 6111, in <module>
    r = args.func()
  File "/usr/sbin/cephadm", line 1381, in _infer_image
    return func()
  File "/usr/sbin/cephadm", line 2676, in command_pull
    return command_inspect_image()
  File "/usr/sbin/cephadm", line 1381, in _infer_image
    return func()
  File "/usr/sbin/cephadm", line 2716, in command_inspect_image
    info_from = get_image_info_from_inspect(out.strip(), args.image)
  File "/usr/sbin/cephadm", line 2727, in get_image_info_from_inspect
    image_id, digests = out.split(',', 1)
ValueError: not enough values to unpack (expected 2, got 1)
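The failure is easy to reproduce in isolation: get_image_info_from_inspect expects the output of podman inspect --format '{{.ID}},{{json .RepoDigests}}' to be "<image-id>,<json digest list>", but when the {{json ...}} template is broken only the id comes back, so the two-way unpack raises. A minimal sketch (the function name mirrors the traceback above; the parsing is simplified for illustration):

```python
import json

def get_image_info_from_inspect(out, image):
    # cephadm expects "<image-id>,<json list of repo digests>"
    image_id, digests = out.split(',', 1)  # raises ValueError if no comma
    return {'image_id': image_id, 'repo_digests': json.loads(digests)}

# healthy podman output: id, comma, JSON digest list
good = '5553b0cb212c,["docker.io/ceph/ceph@sha256:abcd"]'
print(get_image_info_from_inspect(good, 'docker.io/ceph/ceph:v15.2.8'))

# podman 2.2.1 with the broken {{json}} template emits only the id,
# so the unpack fails exactly as in the traceback above
bad = '5553b0cb212c'
try:
    get_image_info_from_inspect(bad, 'docker.io/ceph/ceph:v15.2.8')
except ValueError as e:
    print(e)  # not enough values to unpack (expected 2, got 1)
```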

Before starting the cluster upgrade I updated the system itself, so the podman version in use is:
Version:      2.2.1
API Version:  2.1.0
Go Version:   go1.15.2
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

and the OS is Ubuntu Server 20.04.1 (the latest release) with all updates installed.

Could this problem somehow be related to bug #48870, which I filed a few days ago?

It seems that podman might be causing this and some (many, too many?) other issues for Ceph, given the breakneck speed at which its developers release new versions and how hard it is for the Ceph community to keep up. Are there no compatibility checks in place to prevent this? Docker and Podman are the two biggest dependencies of Ceph.

History

#1 Updated by Sebastian Wagner about 1 month ago

  • Priority changed from Normal to High

#2 Updated by Sebastian Wagner 23 days ago

[23:15:06] <dpivonka> came across an issue today. with podman 2.2.1 doing an upgrade will fail with an error saying it failed to pull the image https://pastebin.com/PpRY4Tax
[23:15:06] <dpivonka> the trace back from the log show its crashing in the get_image_info_from_inspect function in the binary.  https://pastebin.com/XQrhh5Rh
[23:15:06] <dpivonka> that function processes the results of 'podman inspect --format '{{.ID}},{{json .RepoDigests}} <image>'
[23:15:06] <dpivonka> which in version 2.2.1 of podman it seem that --format '{{json}}'  is broken https://pastebin.com/2WWxwZt4       https://github.com/containers/podman/issues/8882. 
[23:15:06] <dpivonka> it is fixed in the podman 3.0 prerelease though
[23:21:24] <dpivonka> the last known version it was working in was 2.1.1. i didnt test 2.2.0 though
[ceph: root@vm-00 /]# ceph orch upgrade status
{
    "target_image": "docker.io/dpivonka/ceph:test",
    "in_progress": true,
    "services_complete": [],
    "message": "Error: UPGRADE_FAILED_PULL: Upgrade: failed to pull target image" 
}
2021-02-02T17:20:32.066040+0000 mgr.vm-00.yfmoku (mgr.14162) 322 : cephadm [ERR] cephadm exited with an error code: 1, stderr:Pulling container image docker.io/dpivonka/ceph:test...
Traceback (most recent call last):
  File "<stdin>", line 7637, in <module>
  File "<stdin>", line 7626, in main
  File "<stdin>", line 1700, in _infer_image
  File "<stdin>", line 3099, in command_pull
  File "<stdin>", line 1700, in _infer_image
  File "<stdin>", line 3141, in command_inspect_image
  File "<stdin>", line 3152, in get_image_info_from_inspect
ValueError: not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1050, in _remote_connection
    yield (conn, connr)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 978, in _run_cephadm
    code, '\n'.join(err)))
orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Pulling container image docker.io/dpivonka/ceph:test...
Traceback (most recent call last):
  File "<stdin>", line 7637, in <module>
  File "<stdin>", line 7626, in main
  File "<stdin>", line 1700, in _infer_image
  File "<stdin>", line 3099, in command_pull
  File "<stdin>", line 1700, in _infer_image
  File "<stdin>", line 3141, in command_inspect_image
  File "<stdin>", line 3152, in get_image_info_from_inspect
ValueError: not enough values to unpack (expected 2, got 1)
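Until a cephadm with the fix is available, the parsing could in principle tolerate the truncated podman 2.2.1 output by treating a comma-less result as "id only, no digests". This is a hypothetical defensive variant for illustration, not the change made in the pull request referenced below:

```python
import json

def parse_inspect_output(out):
    """Parse "<id>,<json digests>" but tolerate a bare "<id>".

    Hypothetical sketch: podman 2.2.1's broken {{json .RepoDigests}}
    returns only the image id, with no comma or digest list.
    """
    if ',' in out:
        image_id, digests = out.split(',', 1)
        try:
            repo_digests = json.loads(digests)
        except json.JSONDecodeError:
            repo_digests = []  # malformed JSON: keep the id, drop digests
    else:
        # broken template: only the image id came back
        image_id, repo_digests = out, []
    return {'image_id': image_id.strip(), 'repo_digests': repo_digests}

# truncated output from the broken podman no longer raises
print(parse_inspect_output('5553b0cb212c'))
# well-formed output still parses fully
print(parse_inspect_output('5553b0cb212c,["docker.io/ceph/ceph@sha256:ab"]'))
```

The trade-off is that silently accepting an empty digest list hides the underlying podman regression, which is presumably why pinning a fixed podman (or the 3.0 prerelease mentioned above) is the safer workaround.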

#3 Updated by Sebastian Wagner 23 days ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 39069
