Ceph : Issues
https://tracker.ceph.com/
2020-12-08T11:06:56Z
Ceph
Redmine
CephFS - Bug #48491 (Resolved): tasks.cephfs.test_nfs.TestNFS.test_cluster_info: IP mismatch
https://tracker.ceph.com/issues/48491
2020-12-08T11:06:56Z
Sebastian Wagner
<pre>
2020-12-07T20:32:02.847 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2020-12-07T20:32:02.847 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-swagner3-testing-2020-12-07-1114/qa/tasks/cephfs/test_nfs.py", line 436, in test_cluster_info
2020-12-07T20:32:02.847 INFO:tasks.cephfs_test_runner: self.assertDictEqual(info_output, host_details)
2020-12-07T20:32:02.848 INFO:tasks.cephfs_test_runner:AssertionError: {'tes[21 chars]ithi069', 'ip': ['172.21.15.69', '127.0.1.1'], 'port': 2049}]} != {'tes[21 chars]ithi069', 'ip': ['172.17.0.1', '172.21.15.69'], 'port': 2049}]}
2020-12-07T20:32:02.848 INFO:tasks.cephfs_test_runner: {'test': [{'hostname': 'smithi069',
2020-12-07T20:32:02.848 INFO:tasks.cephfs_test_runner:- 'ip': ['172.21.15.69', '127.0.1.1'],
2020-12-07T20:32:02.848 INFO:tasks.cephfs_test_runner:+ 'ip': ['172.17.0.1', '172.21.15.69'],
2020-12-07T20:32:02.849 INFO:tasks.cephfs_test_runner: 'port': 2049}]}
2020-12-07T20:32:02.849 INFO:tasks.cephfs_test_runner:
2020-12-07T20:32:02.849 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2020-12-07T20:32:02.850 INFO:tasks.cephfs_test_runner:Ran 1 test in 32.117s
</pre>
<p><a class="external" href="https://pulpito.ceph.com/swagner-2020-12-07_12:07:52-rados:cephadm-wip-swagner3-testing-2020-12-07-1114-distro-basic-smithi/5689544/">https://pulpito.ceph.com/swagner-2020-12-07_12:07:52-rados:cephadm-wip-swagner3-testing-2020-12-07-1114-distro-basic-smithi/5689544/</a></p>
<p>log snippet:<br /><pre>
2020-12-07T20:31:47.592 INFO:tasks.cephfs_test_runner:Starting test: test_cluster_info (tasks.cephfs.test_nfs.TestNFS)
2020-12-07T20:31:47.596 INFO:teuthology.orchestra.run.smithi069:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph log 'Starting test tasks.cephfs.test_nfs.TestNFS.test_cluster_info'
2020-12-07T20:31:48.934 INFO:teuthology.orchestra.run.smithi069:> sudo systemctl status nfs-server
2020-12-07T20:31:48.982 INFO:teuthology.orchestra.run.smithi069.stdout:* nfs-server.service - NFS server and services
2020-12-07T20:31:48.983 INFO:teuthology.orchestra.run.smithi069.stdout: Loaded: loaded (/lib/systemd/system/nfs-server.service; enabled; vendor preset: enabled)
2020-12-07T20:31:48.983 INFO:teuthology.orchestra.run.smithi069.stdout: Active: active (exited) since Mon 2020-12-07 20:22:19 UTC; 9min ago
2020-12-07T20:31:48.984 INFO:teuthology.orchestra.run.smithi069.stdout: Main PID: 1291 (code=exited, status=0/SUCCESS)
2020-12-07T20:31:48.985 INFO:teuthology.orchestra.run.smithi069.stdout: Tasks: 0 (limit: 4915)
2020-12-07T20:31:48.985 INFO:teuthology.orchestra.run.smithi069.stdout: CGroup: /system.slice/nfs-server.service
2020-12-07T20:31:48.986 INFO:teuthology.orchestra.run.smithi069.stdout:
2020-12-07T20:31:48.987 INFO:teuthology.orchestra.run.smithi069.stdout:Dec 07 20:22:19 smithi121 systemd[1]: Starting NFS server and services...
2020-12-07T20:31:48.988 INFO:teuthology.orchestra.run.smithi069.stdout:Dec 07 20:22:19 smithi121 systemd[1]: Started NFS server and services.
2020-12-07T20:31:48.991 INFO:tasks.cephfs.test_nfs:Disabling NFS
2020-12-07T20:31:48.993 INFO:teuthology.orchestra.run.smithi069:> sudo systemctl disable nfs-server --now
2020-12-07T20:31:49.057 INFO:teuthology.orchestra.run.smithi069.stderr:Removed /etc/systemd/system/multi-user.target.wants/nfs-server.service.
2020-12-07T20:31:49.194 INFO:teuthology.orchestra.run.smithi069:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph nfs cluster create cephfs test
2020-12-07T20:31:51.150 INFO:teuthology.orchestra.run.smithi069.stdout:NFS Cluster Created Successfully
2020-12-07T20:32:01.168 INFO:teuthology.orchestra.run.smithi069:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph orch ps --service_name=nfs.ganesha-test
2020-12-07T20:32:01.498 INFO:teuthology.orchestra.run.smithi069.stdout:NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
2020-12-07T20:32:01.498 INFO:teuthology.orchestra.run.smithi069.stdout:nfs.ganesha-test.smithi069 smithi069 running (6s) 4s ago 6s 3.3 quay.ceph.io/ceph-ci/ceph:acb9f90cf72d3c35abbc0efbef808026624ef6c0 59dd54023d46 4282061f6a85
2020-12-07T20:32:01.512 INFO:teuthology.orchestra.run.smithi069:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph nfs cluster info test
2020-12-07T20:32:01.862 INFO:teuthology.orchestra.run.smithi069.stdout:{
2020-12-07T20:32:02.312 INFO:teuthology.orchestra.run.smithi069.stdout: "test": [
2020-12-07T20:32:02.312 INFO:teuthology.orchestra.run.smithi069.stdout: {
2020-12-07T20:32:02.313 INFO:teuthology.orchestra.run.smithi069.stdout: "hostname": "smithi069",
2020-12-07T20:32:02.313 INFO:teuthology.orchestra.run.smithi069.stdout: "ip": [
2020-12-07T20:32:02.313 INFO:teuthology.orchestra.run.smithi069.stdout: "172.21.15.69",
2020-12-07T20:32:02.313 INFO:teuthology.orchestra.run.smithi069.stdout: "127.0.1.1"
2020-12-07T20:32:02.314 INFO:teuthology.orchestra.run.smithi069.stdout: ],
2020-12-07T20:32:02.314 INFO:teuthology.orchestra.run.smithi069.stdout: "port": 2049
2020-12-07T20:32:02.314 INFO:teuthology.orchestra.run.smithi069.stdout: }
2020-12-07T20:32:02.314 INFO:teuthology.orchestra.run.smithi069.stdout: ]
2020-12-07T20:32:02.314 INFO:teuthology.orchestra.run.smithi069.stdout:}
2020-12-07T20:32:02.316 INFO:teuthology.orchestra.run.smithi069:> sudo hostname
2020-12-07T20:32:02.329 INFO:teuthology.orchestra.run.smithi069.stdout:smithi069
2020-12-07T20:32:02.331 INFO:teuthology.orchestra.run.smithi069:> sudo hostname -I
2020-12-07T20:32:02.387 INFO:teuthology.orchestra.run.smithi069.stdout:172.21.15.69 172.17.0.1
</pre></p>
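<p>The mismatch comes from the extra addresses each side reports: `ceph nfs cluster info` includes `127.0.1.1` (likely the entry from `/etc/hosts`), while `hostname -I` includes the docker bridge address `172.17.0.1`. A minimal, self-contained sketch of a looser comparison for the test (purely illustrative, not the actual fix):</p>
<pre>
# hypothetical looser check, using the values from the failure above
info_output = {'test': [{'hostname': 'smithi069', 'ip': ['172.21.15.69', '127.0.1.1'], 'port': 2049}]}
host_details = {'test': [{'hostname': 'smithi069', 'ip': ['172.17.0.1', '172.21.15.69'], 'port': 2049}]}

entry, expected = info_output['test'][0], host_details['test'][0]
assert entry['hostname'] == expected['hostname']
assert entry['port'] == expected['port']
assert set(entry['ip']) & set(expected['ip'])   # require overlap instead of exact order/content
</pre>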
Orchestrator - Bug #46534 (Resolved): cephadm podman pull: Digest did not match
https://tracker.ceph.com/issues/46534
2020-07-14T13:07:14Z
Sebastian Wagner
<p><a class="external" href="https://pulpito.ceph.com/swagner-2020-07-14_12:03:52-rados:cephadm-wip-swagner-testing-2020-07-14-1125-distro-basic-smithi/">https://pulpito.ceph.com/swagner-2020-07-14_12:03:52-rados:cephadm-wip-swagner-testing-2020-07-14-1125-distro-basic-smithi/</a></p>
<pre>
INFO:cephadm:Verifying port 3000 ...
INFO:cephadm:Non-zero exit code 125 from /bin/podman run --rm --net=host -e CONTAINER_IMAGE=ceph/ceph-grafana:latest -e NODE_NAME=smithi141 --entrypoint stat ceph/ceph-grafana:latest -c %u %g /var/lib/grafana
INFO:cephadm:stat:stderr Trying to pull registry.access.redhat.com/ceph/ceph-grafana:latest...
INFO:cephadm:stat:stderr name unknown: Repo not found
INFO:cephadm:stat:stderr Trying to pull registry.fedoraproject.org/ceph/ceph-grafana:latest...
INFO:cephadm:stat:stderr manifest unknown: manifest unknown
INFO:cephadm:stat:stderr Trying to pull registry.centos.org/ceph/ceph-grafana:latest...
INFO:cephadm:stat:stderr manifest unknown: manifest unknown
INFO:cephadm:stat:stderr Trying to pull docker.io/ceph/ceph-grafana:latest...
INFO:cephadm:stat:stderr Getting image source signatures
INFO:cephadm:stat:stderr Copying blob sha256:003efafe5a84678b585af8a06810c47079aa4705e60d07f1c31a52f0e35ce0b5
INFO:cephadm:stat:stderr Digest did not match, expected sha256:003efafe5a84678b585af8a06810c47079aa4705e60d07f1c31a52f0e35ce0b5, got sha256:f4c0a426e0aa470680560b267b79382c183c9608f3da72acce8cc5beaf8ebe31
INFO:cephadm:stat:stderr Error: unable to pull ceph/ceph-grafana:latest: 4 errors occurred:
INFO:cephadm:stat:stderr * Error initializing source docker://registry.access.redhat.com/ceph/ceph-grafana:latest: Error reading manifest latest in registry.access.redhat.com/ceph/ceph-grafana: name unknown: Repo not found
INFO:cephadm:stat:stderr * Error initializing source docker://registry.fedoraproject.org/ceph/ceph-grafana:latest: Error reading manifest latest in registry.fedoraproject.org/ceph/ceph-grafana: manifest unknown: manifest unknown
INFO:cephadm:stat:stderr * Error initializing source docker://registry.centos.org/ceph/ceph-grafana:latest: Error reading manifest latest in registry.centos.org/ceph/ceph-grafana: manifest unknown: manifest unknown
INFO:cephadm:stat:stderr * Error writing blob: error storing blob to file "/var/tmp/storage976712658/1": Digest did not match, expected sha256:003efafe5a84678b585af8a06810c47079aa4705e60d07f1c31a52f0e35ce0b5, got sha256:f4c0a426e0aa470680560b267b79382c183c9608f3da72acce8cc5beaf8ebe31
INFO:cephadm:stat:stderr
Traceback (most recent call last):
File "<stdin>", line 4247, in <module>
File "<stdin>", line 968, in _default_image
File "<stdin>", line 2508, in command_deploy
File "<stdin>", line 2452, in extract_uid_gid_monitoring
File "<stdin>", line 1535, in extract_uid_gid
File "<stdin>", line 1974, in run
File "<stdin>", line 696, in call_throws
RuntimeError: Failed command: /bin/podman run --rm --net=host -e CONTAINER_IMAGE=ceph/ceph-grafana:latest -e NODE_NAME=smithi141 --entrypoint stat ceph/ceph-grafana:latest -c %u %g /var/lib/grafana
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/module.py", line 1613, in _run_cephadm
code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:Deploying daemon grafana.a ...
</pre>
Orchestrator - Bug #45427 (Resolved): cephadm: auth get failed: invalid entity_auth mon
https://tracker.ceph.com/issues/45427
2020-05-07T10:13:25Z
Sebastian Wagner
<p><a class="external" href="http://pulpito.ceph.com/mgfritch-2020-05-07_02:27:06-rados-wip-mgfritch-testing-2020-05-06-1821-distro-basic-smithi/5029062">http://pulpito.ceph.com/mgfritch-2020-05-07_02:27:06-rados-wip-mgfritch-testing-2020-05-06-1821-distro-basic-smithi/5029062</a></p>
<pre>
cephadm 2020-05-07T03:43:08.989542+0000 mgr.smithi154.qjpiuj (mgr.27922) 6 : cephadm [ERR] Failed to apply node-exporter spec ServiceSpec({'placement': PlacementSpec(host_pattern='*'), 'service_type': 'node-exporter', 'service_id': None, 'unmanaged': False}): auth get failed: invalid entity_auth mon
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/module.py", line 2219, in _apply_all_services
if self._apply_service(spec):
File "/usr/share/ceph/mgr/cephadm/module.py", line 2190, in _apply_service
create_func(daemon_id, host) # type: ignore
File "/usr/share/ceph/mgr/cephadm/module.py", line 2967, in _create_node_exporter
return self._create_daemon('node-exporter', daemon_id, host)
File "/usr/share/ceph/mgr/cephadm/module.py", line 2021, in _create_daemon
extra_ceph_config=extra_config.pop('config', ''))
File "/usr/share/ceph/mgr/cephadm/module.py", line 1974, in _get_config_and_keyring
'entity': ename,
File "/usr/share/ceph/mgr/mgr_module.py", line 1096, in check_mon_command
raise MonCommandFailed(f'{cmd_dict["prefix"]} failed: {r.stderr}')
mgr_module.MonCommandFailed: auth get failed: invalid entity_auth mon
</pre>
<p>(As a side note, why do we need the mon keyring for the node_exporter?)</p>
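<p>One way to read the error: the entity name passed to `auth get` is not a valid auth entity. An illustration of the failing command (assumption only; this shows why the call is rejected, not the actual fix):</p>
<pre>
# hypothetical illustration (inside a mgr module, e.g. cephadm's _get_config_and_keyring):
# "mon" alone is not a valid auth entity name; the monitor keyring entity is spelled "mon."
bad_cmd  = {'prefix': 'auth get', 'entity': 'mon'}   # -> "auth get failed: invalid entity_auth mon"
good_cmd = {'prefix': 'auth get', 'entity': 'mon.'}  # valid entity name for the monitor keyring
# self.check_mon_command(bad_cmd) raises MonCommandFailed; self.check_mon_command(good_cmd) succeeds
</pre>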
Orchestrator - Documentation #45411 (Resolved): cephadm: add section about container images
https://tracker.ceph.com/issues/45411
2020-05-06T16:57:38Z
Sebastian Wagner
<ul>
<li>We recommend against using the `:latest` tag for images. There is no guarantee that you will get the same image on every host, so different nodes can end up running different images, and upgrades no longer work properly.</li>
</ul>
<p>Instead, we recommend always using explicit tags or image IDs.</p>
<p>Basically the same as <a class="external" href="https://kubernetes.io/docs/concepts/configuration/overview/#container-images">https://kubernetes.io/docs/concepts/configuration/overview/#container-images</a></p>
Orchestrator - Cleanup #45321 (Resolved): Service spec: unify `spec:` vs omitting `spec:`
https://tracker.ceph.com/issues/45321
2020-04-29T09:01:05Z
Sebastian Wagner
<pre><code class="yaml syntaxhl"><span class="CodeRay"><span class="key">service_type</span>: <span class="string"><span class="content">iscsi </span></span>
<span class="key">service_id</span>: <span class="string"><span class="content">test</span></span>
<span class="key">placement</span>:
<span class="key">hosts</span>:
- <span class="string"><span class="content">osd0</span></span>
<span class="key">spec</span>:
<span class="key">pool</span>: <span class="string"><span class="content">rbd</span></span>
<span class="key">api_user</span>: <span class="string"><span class="content">admin</span></span>
<span class="key">api_password</span>: <span class="string"><span class="content">admin</span></span>
<span class="key">trusted_ip_list</span>: <span class="string"><span class="content">192.168.121.1</span></span>
</span></code></pre>
<pre><code class="yaml syntaxhl"><span class="CodeRay"><span class="key">service_type</span>: <span class="string"><span class="content">iscsi </span></span>
<span class="key">service_id</span>: <span class="string"><span class="content">test</span></span>
<span class="key">placement</span>:
<span class="key">hosts</span>:
- <span class="string"><span class="content">osd0</span></span>
<span class="key">pool</span>: <span class="string"><span class="content">rbd</span></span>
<span class="key">api_user</span>: <span class="string"><span class="content">admin</span></span>
<span class="key">api_password</span>: <span class="string"><span class="content">admin</span></span>
<span class="key">trusted_ip_list</span>: <span class="string"><span class="content">192.168.121.1</span></span>
</span></code></pre>
<p>are the same data. We should unify them to one form or the other.</p>
<p>This is especially relevant for OSD specs.</p>
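<p>A minimal sketch of what unifying could look like on the loading side (hypothetical helper, not the actual ServiceSpec parser): fold any service-specific top-level keys into `spec` before validation, so both forms end up identical.</p>
<pre>
# hypothetical normalization sketch
GENERIC_KEYS = {'service_type', 'service_id', 'placement', 'unmanaged', 'spec'}

def normalize_service_spec(data: dict) -> dict:
    """Fold flattened service-specific keys into the nested 'spec' section."""
    spec = dict(data.get('spec', {}))
    for key in list(data):
        if key not in GENERIC_KEYS:
            spec[key] = data.pop(key)
    if spec:
        data['spec'] = spec
    return data
</pre>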
Orchestrator - Cleanup #45118 (Closed): orch (pacific): cleanup CLI
https://tracker.ceph.com/issues/45118
2020-04-16T18:43:25Z
Sebastian Wagner
<p>Use</p>
<pre>
ceph orch <verb> <object>
</pre>
<p>for everything. For example:</p>
<pre>
host rm -> rm host
</pre>
Orchestrator - Bug #45016 (Resolved): mgr: `ceph tell mgr mgr_status` hangs
https://tracker.ceph.com/issues/45016
2020-04-09T13:11:21Z
Sebastian Wagner
<p>cephadm bootstrap hangs:</p>
<pre>
root@buster:/cephadm# ./cephadm --image quay.io/ceph-ci/ceph:master bootstrap --mon-ip '[::1]' --skip-mon-network
INFO:cephadm:Verifying podman|docker is present...
INFO:cephadm:Verifying lvm2 is present...
INFO:cephadm:Verifying time synchronization is in place...
INFO:cephadm:Unit systemd-timesyncd.service is enabled and running
INFO:cephadm:Repeating the final host check...
INFO:cephadm:podman|docker (/usr/bin/podman) is present
INFO:cephadm:systemctl is present
INFO:cephadm:lvcreate is present
INFO:cephadm:Unit systemd-timesyncd.service is enabled and running
INFO:cephadm:Host looks OK
INFO:root:Cluster fsid: 160e9ea8-7a60-11ea-b487-525400e3bceb
INFO:cephadm:Verifying IP [::1] port 3300 ...
INFO:cephadm:Verifying IP [::1] port 6789 ...
INFO:cephadm:Pulling latest quay.io/ceph-ci/ceph:master container...
INFO:cephadm:Extracting ceph user uid/gid from container image...
INFO:cephadm:Creating initial keys...
INFO:cephadm:Creating initial monmap...
INFO:cephadm:Creating mon...
INFO:cephadm:Waiting for mon to start...
INFO:cephadm:Waiting for mon...
INFO:cephadm:Assimilating anything we can from ceph.conf...
INFO:cephadm:Generating new minimal ceph.conf...
INFO:cephadm:Restarting the monitor...
INFO:cephadm:Creating mgr...
INFO:cephadm:Wrote keyring to /etc/ceph/ceph.client.admin.keyring
INFO:cephadm:Wrote config to /etc/ceph/ceph.conf
INFO:cephadm:Waiting for mgr to start...
INFO:cephadm:Waiting for mgr...
INFO:cephadm:mgr not available, waiting (1/10)...
INFO:cephadm:mgr not available, waiting (2/10)...
INFO:cephadm:mgr not available, waiting (3/10)...
INFO:cephadm:mgr not available, waiting (4/10)...
INFO:cephadm:Enabling cephadm module...
INFO:cephadm:Waiting for the mgr to restart...
INFO:cephadm:Waiting for mgr epoch 5...
^C
</pre>
<p>While `mgr dump` works:</p>
<pre>
root@buster:/cephadm# ./cephadm shell -- ceph mgr dump | jq .epoch
INFO:cephadm:Inferring fsid 160e9ea8-7a60-11ea-b487-525400e3bceb
INFO:cephadm:Using recent ceph image quay.io/ceph-ci/ceph:master
8
</pre>
<p>But `ceph tell mgr mgr_status` hangs:</p>
<pre>
root@buster:/cephadm# ./cephadm shell -- ceph tell mgr mgr_status
INFO:cephadm:Inferring fsid 160e9ea8-7a60-11ea-b487-525400e3bceb
INFO:cephadm:Using recent ceph image quay.io/ceph-ci/ceph:master
^CInterrupted
</pre>
<p>Neither the mon nor the mgr logs show any trace of the attempted call.</p>
<pre>
root@buster:/cephadm# ./cephadm shell -- ceph config generate-minimal-conf
# minimal ceph.conf for 160e9ea8-7a60-11ea-b487-525400e3bceb
[global]
fsid = 160e9ea8-7a60-11ea-b487-525400e3bceb
mon_host = [v2:[::1]:3300/0,v1:[::1]:6789/0]
</pre>
<p>Any idea where to look for clues?</p>
Orchestrator - Feature #44625 (Resolved): cephadm: test dmcrypt
https://tracker.ceph.com/issues/44625
2020-03-16T14:26:54Z
Sebastian Wagner
<p>We need to verify that dmcrypt actually works with cephadm.</p>
Orchestrator - Feature #43839 (Resolved): enhance `host ls`
https://tracker.ceph.com/issues/43839
2020-01-27T16:40:17Z
Sebastian Wagner
<p>Right now, `host ls` only returns the host names. We need more:</p>
<pre>
E.g. STATUS:
NAME STATUS ROLES AGE VERSION
worker-1 Ready,SchedulingDisabled <none> 20h v1.15.2
</pre>
<p>What additional things do we need to show? IPs?</p>
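<p>A rough sketch of what an extended listing could print (the column set is an assumption, mirroring the example listing above):</p>
<pre>
# hypothetical sketch of an extended 'host ls' table
hosts = [
    {'hostname': 'worker-1', 'addr': '172.21.15.69', 'labels': ['mon', 'osd'], 'status': 'Ready'},
]
print(f"{'HOST':<12} {'ADDR':<16} {'LABELS':<12} {'STATUS'}")
for h in hosts:
    print(f"{h['hostname']:<12} {h['addr']:<16} {','.join(h['labels']):<12} {h['status']}")
</pre>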
Orchestrator - Bug #43713 (Resolved): drive group filters: use `and` instead of `or`
https://tracker.ceph.com/issues/43713
2020-01-20T15:42:27Z
Sebastian Wagner
Orchestrator - Documentation #43683 (Resolved): Missing docs for HostSpec
https://tracker.ceph.com/issues/43683
2020-01-20T13:57:51Z
Sebastian Wagner
<p><del>Also `Host 'node1' is missing a network spec` must not be a `RuntimeError`, but an OrchestratorValidationError</del> done</p>
Orchestrator - Cleanup #43674 (Resolved): rename/merge orchestrator_cli -> orchestrator
https://tracker.ceph.com/issues/43674
2020-01-20T13:44:30Z
Sebastian Wagner
Orchestrator - Feature #43673 (Resolved): ceph-ansible playbook: pivot to cephadm
https://tracker.ceph.com/issues/43673
2020-01-20T13:44:05Z
Sebastian Wagner
<p>(This issue was imported from the old Trello board.)</p>
<p>Make ceph-ansible upgrade to cephadm</p>
<p>see <a class="external" href="https://github.com/ceph/ceph/pull/33459">https://github.com/ceph/ceph/pull/33459</a> for the adoption process documentation</p>
CephFS - Bug #40429 (Resolved): mgr/volumes: subvolume.py calls Exceptions with too few arguments.
https://tracker.ceph.com/issues/40429
2019-06-19T09:45:11Z
Sebastian Wagner
<p>mypy revealed:</p>
<pre>
+pybind/mgr/volumes/fs/subvolume.py: note: In member "get_subvolume_path" of class "SubVolume":
+pybind/mgr/volumes/fs/subvolume.py:167: error: Too few arguments for "VolumeException"
+pybind/mgr/volumes/fs/subvolume.py: note: In member "_get_ancestor_xattr" of class "SubVolume":
+pybind/mgr/volumes/fs/subvolume.py:203: error: Too few arguments for "NoData"
</pre>
<p>Both of these errors are actual bugs in the code and need to be fixed.</p>
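<p>A self-contained sketch of the kind of fix the first mypy error implies, assuming `VolumeException` takes an error code plus a message (a stand-in class is used here; the real one lives in pybind/mgr/volumes/fs/exception.py):</p>
<pre>
import errno

class VolumeException(Exception):
    # stand-in with the assumed signature (error code + message)
    def __init__(self, error_code, error_message):
        self.errno = error_code
        self.error_str = error_message
        super().__init__(error_message)

subvolname = "sv1"  # placeholder
# "Too few arguments" means the exception was raised with only one argument;
# a correct call supplies both the code and the message:
raise VolumeException(-errno.ENOENT, f"subvolume '{subvolname}' does not exist")
</pre>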
CephFS - Bug #40014 (Resolved): mgr/volumes: Name 'sub_name' is not defined
https://tracker.ceph.com/issues/40014
2019-05-23T10:11:45Z
Sebastian Wagner
<p>I'm getting a new mypy error in master:</p>
<pre>
pybind/mgr/volumes/module.py: note: In member "_cmd_fs_subvolumegroup_snapshot_rm" of class "Module":
pybind/mgr/volumes/module.py:552: error: Name 'sub_name' is not defined
</pre>
<p>Regression introduced by <a class="external" href="https://github.com/ceph/ceph/pull/27594">https://github.com/ceph/ceph/pull/27594</a></p>