Bug #38427

open

ceph-mgr: mgr generates a traceback when test_mgr_rest_api.py is run "TypeError: string indices must be integers, not str"

Added by Brad Hubbard about 5 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
restful module
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The original error can be seen in any of the failures here: http://pulpito.ceph.com/bhubbard-2019-02-18_07:30:56-ceph-ansible-wip-badone-testing-distro-basic-ovh/

I added some debugging to capture the log output and traceback below. It looks like the node_ids are not expected to be negative (maybe in nodes_by_id)?

2019-02-22 06:27:29.890 7f2378fd3700  0 mgr[restful] descendent_ids = set([-7L, -5L, -3L])
2019-02-22 06:27:29.890 7f2378fd3700  0 mgr[restful] node_id = -7
2019-02-22 06:27:29.890 7f2378fd3700  0 mgr[restful] desc_node = hash
2019-02-22 06:27:29.891 7f2378fd3700  0 mgr[restful] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 570, in __call__
    self.handle_request(req, resp)
  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 508, in handle_request
    result = controller(*args, **kwargs)
  File "/usr/lib64/ceph/mgr/restful/decorators.py", line 35, in decorated
    return f(*args, **kwargs)
  File "/usr/lib64/ceph/mgr/restful/api/osd.py", line 130, in get
    return context.instance.get_osds(pool_id)
  File "/usr/lib64/ceph/mgr/restful/module.py", line 539, in get_osds
    pools_map = self.get_osd_pools()
  File "/usr/lib64/ceph/mgr/restful/module.py", line 515, in get_osd_pools
    pool_osds = common.crush_rule_osds(crush['buckets'], rule)
  File "/usr/lib64/ceph/mgr/restful/common.py", line 164, in crush_rule_osds
    osds |= _gather_osds(nodes_by_id[step['item']], rule['steps'][i + 1:])
  File "/usr/lib64/ceph/mgr/restful/common.py", line 154, in _gather_osds
    osds |= _gather_leaf_ids(desc_node)
  File "/usr/lib64/ceph/mgr/restful/common.py", line 97, in _gather_leaf_ids
    if node['id'] >= 0:
TypeError: string indices must be integers, not str

Cluster state.

  cluster:
    id:     fee68cf5-2b42-46f1-a64a-1b65d3c448b5
    health: HEALTH_WARN
            clock skew detected on mon.ovh068

  services:
    mon: 3 daemons, quorum ovh083,ovh068,ovh057 (age 2h)
    mgr: ovh068(active, since 7m)
    mds: cephfs:1 {0=ovh083=up:active}, 0 up:standby
    osd: 9 osds: 9 up (since 2h), 9 in (since 2h)

  data:
    pools:   4 pools, 384 pgs
    objects: 22 objects, 2.2 KiB
    usage:   9.1 GiB used, 72 GiB / 81 GiB avail
    pgs:     384 active+clean

Looks like the descendent_ids correspond to the following host buckets from the crushmap.

host ovh068 {
        id -3           # do not change unnecessarily
        id -4 class hdd         # do not change unnecessarily
        # weight 0.026
        alg straw2
        hash 0  # rjenkins1
        item osd.0 weight 0.009
        item osd.3 weight 0.009
        item osd.6 weight 0.009
}
host ovh057 {
        id -5           # do not change unnecessarily
        id -6 class hdd         # do not change unnecessarily
        # weight 0.026
        alg straw2
        hash 0  # rjenkins1
        item osd.1 weight 0.009
        item osd.4 weight 0.009
        item osd.7 weight 0.009
}
host ovh083 {
        id -7           # do not change unnecessarily
        id -8 class hdd         # do not change unnecessarily
        # weight 0.026
        alg straw2
        hash 0  # rjenkins1
        item osd.2 weight 0.009
        item osd.5 weight 0.009
        item osd.8 weight 0.009
}
Actions #1

Updated by Brad Hubbard about 5 years ago

Or is it literally returning the string "hash" from "hash 0 # rjenkins1"?
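
A minimal sketch of how that could happen, assuming nodes_by_id[node_id] is a dict describing one bucket. The bucket below is hypothetical, loosely shaped like an entry from `ceph osd crush dump`, and `is_leaf` only imitates the `node['id'] >= 0` check from the traceback. Iterating a dict yields its keys, so the loop variable would be a string such as "hash" rather than a child node, and indexing that string with 'id' raises TypeError.

```python
# Hypothetical bucket, loosely shaped like an entry from `ceph osd crush dump`.
# The values are illustrative and not taken from the failing cluster.
bucket = {
    "id": -7,
    "name": "ovh083",
    "alg": "straw2",
    "hash": "rjenkins1",
    "items": [{"id": 2}, {"id": 5}, {"id": 8}],
}

# Iterating a dict yields its keys, so every element here is a str.
keys_seen = [desc_node for desc_node in bucket]


def is_leaf(node):
    # Imitates the check at the bottom of the traceback.
    return node["id"] >= 0


try:
    is_leaf("hash")  # a dict key passed where a node dict was expected
    raised = False
except TypeError:
    # Indexing a str with a str key is a TypeError on Python 2 and 3.
    raised = True
```

If this reading is right, the negative node ids are not the problem; the loop is simply walking the bucket's keys instead of its children.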

Actions #2

Updated by Brad Hubbard about 5 years ago

I actually managed to get it to "work" with the following ugly hack. Pretty sure this is not the final solution :)

                    for desc_node in nodes_by_id[node_id]:
                        # Short circuit another iteration to find the emit
                        # and assume anything we've done a chooseleaf on
                        # is going to be part of the selected set of osds
                        context.instance.log.error("desc_node = " + str(desc_node))
                        if desc_node in ("hash", "name", "weight", "type_id", "alg", "type_name", "items", "id"):
                            continue
                        osds |= _gather_leaf_ids(desc_node)

HTH.
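
For comparison, a rough sketch of an alternative to skipping keys by name: walk each bucket's "items" list and resolve child ids through nodes_by_id. This is only an illustration under assumptions, not the fix that eventually landed. The `gather_leaf_ids` helper, its signature, and the dict layout are hypothetical; the host ids and OSD numbers are copied from the crushmap excerpt above.

```python
def gather_leaf_ids(node, nodes_by_id):
    """Return the set of non-negative (OSD) ids at or below `node`.

    Hypothetical helper: assumes each bucket is a dict with an "id" and an
    "items" list whose entries carry the child's "id".
    """
    if node["id"] >= 0:
        return {node["id"]}
    leaves = set()
    for item in node.get("items", []):
        child_id = item["id"]
        if child_id >= 0:
            # Non-negative ids are OSDs (leaves).
            leaves.add(child_id)
        else:
            # Negative ids are buckets; recurse into them.
            leaves |= gather_leaf_ids(nodes_by_id[child_id], nodes_by_id)
    return leaves


# Host buckets from the crushmap excerpt, reduced to the assumed layout.
nodes_by_id = {
    -3: {"id": -3, "name": "ovh068", "items": [{"id": 0}, {"id": 3}, {"id": 6}]},
    -5: {"id": -5, "name": "ovh057", "items": [{"id": 1}, {"id": 4}, {"id": 7}]},
    -7: {"id": -7, "name": "ovh083", "items": [{"id": 2}, {"id": 5}, {"id": 8}]},
}

descendent_ids = {-7, -5, -3}
osds = set()
for node_id in descendent_ids:
    osds |= gather_leaf_ids(nodes_by_id[node_id], nodes_by_id)
```

With the three hosts above, `osds` ends up holding all nine OSD ids, which matches the "9 osds" reported in the cluster state.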

Actions #3

Updated by Brad Hubbard about 5 years ago

  • Severity changed from 3 - minor to 2 - major

Bumping severity since this is stopping the teuthology ceph-ansible (CA) suite from passing.

Actions #4

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to mgr
Actions #5

Updated by Sebastian Wagner over 4 years ago

  • Category set to restful module
Actions #6

Updated by Brad Hubbard over 4 years ago

This may have been resolved by commit 23b6c904941444f0bebb912e7dd069f2d2b1f44a.
