Bug #41386
pg_autoscaler: pool id key not present in pool_stats
Description
2019-08-21T17:08:41.253 INFO:tasks.workunit.client.0.smithi159.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:435: test_tiering_1: ceph osd pool delete slow2 slow2 --yes-i-really-really-mean-it
2019-08-21T17:08:42.130 INFO:tasks.ceph.mgr.x.smithi159.stderr:2019-08-21T17:08:42.124+0000 7fc0d7c48700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'pg_autoscaler' while running on mgr.x: (3L,)
2019-08-21T17:08:42.130 INFO:tasks.ceph.mgr.x.smithi159.stderr:2019-08-21T17:08:42.124+0000 7fc0d7c48700 -1 pg_autoscaler.serve:
2019-08-21T17:08:42.130 INFO:tasks.ceph.mgr.x.smithi159.stderr:2019-08-21T17:08:42.124+0000 7fc0d7c48700 -1 Traceback (most recent call last):
2019-08-21T17:08:42.130 INFO:tasks.ceph.mgr.x.smithi159.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 164, in serve
2019-08-21T17:08:42.130 INFO:tasks.ceph.mgr.x.smithi159.stderr: self._maybe_adjust()
2019-08-21T17:08:42.131 INFO:tasks.ceph.mgr.x.smithi159.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 339, in _maybe_adjust
2019-08-21T17:08:42.131 INFO:tasks.ceph.mgr.x.smithi159.stderr: ps, root_map, pool_root = self._get_pool_status(osdmap, pools)
2019-08-21T17:08:42.131 INFO:tasks.ceph.mgr.x.smithi159.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 273, in _get_pool_status
2019-08-21T17:08:42.131 INFO:tasks.ceph.mgr.x.smithi159.stderr: pool_logical_used = pool_stats[pool_id]['bytes_used']
2019-08-21T17:08:42.131 INFO:tasks.ceph.mgr.x.smithi159.stderr:KeyError: (3L,)
2019-08-21T17:08:42.131 INFO:tasks.ceph.mgr.x.smithi159.stderr:
2019-08-21T17:08:42.240 INFO:tasks.workunit.client.0.smithi159.stderr:pool 'slow2' does not exist
2019-08-21T17:08:42.251 INFO:tasks.workunit.client.0.smithi159.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:436: test_tiering_1: ceph osd pool delete cache cache --yes-i-really-really-mean-it
/a/sage-2019-08-21_15:17:39-rados-wip-sage2-testing-2019-08-20-0935-distro-basic-smithi/4237079
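The traceback shows pool_stats[pool_id] raising right after a pool delete, which suggests the pool list and pool_stats can briefly disagree about which pools exist. A minimal sketch of a tolerant lookup follows. It is an illustration only and is not taken from the linked pull request; the helper name collect_logical_used and the sample data are hypothetical.

```python
def collect_logical_used(pool_ids, pool_stats):
    """Return {pool_id: bytes_used}, skipping pools absent from pool_stats.

    Hypothetical helper for illustration; not the pg_autoscaler code.
    """
    used = {}
    for pool_id in pool_ids:
        stats = pool_stats.get(pool_id)
        if stats is None:
            # The pool may have been deleted between the two snapshots,
            # so there is nothing to account for; skip it.
            continue
        used[pool_id] = stats['bytes_used']
    return used


# Hypothetical data: pool 3 is still in the pool list but already
# missing from the stats, as in the KeyError above.
pool_ids = [1, 2, 3]
pool_stats = {1: {'bytes_used': 4096}, 2: {'bytes_used': 0}}
result = collect_logical_used(pool_ids, pool_stats)
print(result)
```

Skipping the missing pool keeps the serve loop alive for one iteration; the next pass would see a consistent view once the deletion has propagated.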
Related issues
History
#1 Updated by Sage Weil over 4 years ago
- Pull request ID set to 29807
#2 Updated by Kefu Chai over 4 years ago
i ran into a similar issue while testing https://github.com/ceph/ceph/pull/29035,
see https://github.com/ceph/ceph/pull/29035#discussion_r316500629
but i am not sure why the error looks like
KeyError: (3L,)
seems we passed a tuple of "(3,)" instead of an integer to pool_stats as the index.
#3 Updated by Sebastian Wagner over 4 years ago
Don't be totally misled by the tuple. I think this comes down to how the arguments passed to Exception are stored and printed:
In [1]: str(Exception(2))
Out[1]: '2'
In [2]: repr(Exception(2))
Out[2]: 'Exception(2,)'
In [3]: Exception(2).args
Out[3]: (2,)
I remember a similar case where this was misleading.
I might be wrong here, but don't just look at the strange tuple.
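The same point can be checked against a real failed dict lookup. The sketch below assumes Python 3 (the "3L" in the log is Python 2 long notation) and uses a made-up pool_stats dict; it shows that the key stored on the exception is a plain integer and that the tuple is only the exception's args.

```python
# Made-up stats dict for illustration: pool 3 is missing.
pool_stats = {1: {'bytes_used': 4096}}

try:
    pool_stats[3]
except KeyError as e:
    caught = e

args = caught.args      # the tuple of constructor arguments
key = caught.args[0]    # the key that was actually looked up
message = str(caught)   # KeyError renders its single argument via repr()

print(args, type(key).__name__, message)
```

So a log line that prints the exception's args shows a one-element tuple even though the lookup itself used an integer.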
#4 Updated by Sebastian Wagner over 4 years ago
- Category set to pg_autoscaler module
#5 Updated by Sage Weil over 4 years ago
- Status changed from 12 to Pending Backport
#6 Updated by Kefu Chai over 4 years ago
Thanks Sebastian, that explains!
#7 Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41436: nautilus: pg_autoscaler: pool id key not present in pool_stats added
#8 Updated by Kefu Chai over 4 years ago
i still have
2019-08-29T07:52:47.255+0000 7f691c344700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'pg_autoscaler' while running on mgr.x: (1,)
2019-08-29T07:52:47.255+0000 7f691c344700 -1 pg_autoscaler.serve:
2019-08-29T07:52:47.255+0000 7f691c344700 -1 Traceback (most recent call last):
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 175, in serve
self._update_progress_events()
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 353, in _update_progress_events
pool_data = pools[int(pool_id)]
KeyError: (1,)
while testing https://github.com/ceph/ceph/pull/29035
/a/kchai-2019-08-29_03:14:53-rados-wip-kefu-testing-2019-08-27-1807-distro-basic-mira/4260378/
the tested branch contains https://github.com/ceph/ceph/pull/29807
#9 Updated by Sage Weil over 4 years ago
- Status changed from Pending Backport to 12
I saw it again too,
2019-10-04T20:18:20.761 INFO:tasks.ceph.mgr.x.smithi183.stderr:2019-10-04T20:18:20.767+0000 7f67d9e9d700 -1 Traceback (most recent call last):
2019-10-04T20:18:20.761 INFO:tasks.ceph.mgr.x.smithi183.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 175, in serve
2019-10-04T20:18:20.761 INFO:tasks.ceph.mgr.x.smithi183.stderr: self._update_progress_events()
2019-10-04T20:18:20.762 INFO:tasks.ceph.mgr.x.smithi183.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 353, in _update_progress_events
2019-10-04T20:18:20.762 INFO:tasks.ceph.mgr.x.smithi183.stderr: pool_data = pools[int(pool_id)]
2019-10-04T20:18:20.762 INFO:tasks.ceph.mgr.x.smithi183.stderr:KeyError: (1,)
/a/sage-2019-10-04_18:20:43-rados-wip-sage-testing-2019-10-04-0923-distro-basic-smithi/4358946
#10 Updated by Sage Weil over 4 years ago
- Status changed from 12 to Pending Backport
different cause! see #42249
#11 Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".