Bug #12181
closed
test: indep mapping fails because an osd is down
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
When running test-erasure-code.sh the mapping of pg 2.7 fails:
<pre>
$ gzip -d < /tmp/bad-report.txt.gz | jq '.pgmap.pg_stats[] | select(.state != "active+clean") | [.pgid, .acting]'
[
  "2.7",
  [ 2147483647, 0, 4 ]
]
</pre>
The first slot of the acting set holds 2147483647 (CRUSH_ITEM_NONE, the placeholder for a shard that could not be mapped) because the mapping misses osd.6, as shown by a report from a good run of the same test:
<pre>
$ gzip -d < /tmp/good-report.txt.gz | jq '.pgmap.pg_stats[] | select(.pgid == "2.7") | .acting'
[ 6, 0, 4 ]
</pre>
The osd map of the bad run shows osd.6 as down and out:
<pre>
$ gzip -d < /tmp/bad-report.txt.gz | jq '.osdmap.osds[] | select(.osd == 6)'
{
  "osd": 6,
  "uuid": "913da64e-3527-4d06-9441-62e8d1145356",
  "up": 0,
  "in": 0,
  "weight": 0,
  "primary_affinity": 1,
  "last_clean_begin": 0,
  "last_clean_end": 0,
  "up_from": 26,
  "up_thru": 0,
  "down_at": 28,
  "lost_at": 0,
  "public_addr": "127.0.0.1:6889/29036",
  "cluster_addr": "127.0.0.1:6890/29036",
  "heartbeat_back_addr": "127.0.0.1:6891/29036",
  "heartbeat_front_addr": "127.0.0.1:6892/29036",
  "state": [ "autoout", "exists" ]
}
</pre>
Nothing in bad.log.gz explains why osd.6 failed. The cause may simply be the host running the test, although dmesg showed no sign of memory starvation or disk trouble.