Bug #12181: test: indep mapping fails because an osd is down - Ceph - Ceph

Bug #12181

closed

test: indep mapping fails because an osd is down

Added by Loïc Dachary almost 9 years ago. Updated over 8 years ago.

Status:

Can't reproduce

Priority:

Normal

Assignee:

Loïc Dachary

Category:

Target version:

% Done:

Source:

other

Tags:

Backport:

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

When running test-erasure-code.sh, the mapping of pg 2.7 fails (the first entry of its acting set is 2147483647 rather than an OSD id):

$ gzip -d < /tmp/bad-report.txt.gz | jq '.pgmap.pg_stats[] | select(.state != "active+clean") | [.pgid, .acting]'
[
  "2.7",
  [
    2147483647,
    0,
    4
  ]
]
The mapping fails because osd.6 is missing from the acting set, as shown by a report from a good run of the same test:
$ gzip -d < /tmp/good-report.txt.gz | jq '.pgmap.pg_stats[] | select(.pgid == "2.7") | .acting'
[
  6,
  0,
  4
]
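The comparison between the two reports can be automated. The following is only a sketch, assuming both files are `ceph report` JSON documents with the `pgmap.pg_stats[].pgid` and `.acting` fields used by the jq filters above. The helper names `load_report`, `acting_sets` and `missing_osds` are hypothetical, and the sample dictionaries are trimmed to the values quoted in this ticket. The value 2147483647 is 0x7fffffff, which CRUSH uses as CRUSH_ITEM_NONE to mark a slot for which no OSD could be mapped.

```python
import gzip
import json

# 2147483647 == 0x7fffffff: CRUSH_ITEM_NONE, meaning "no OSD mapped to this slot".
CRUSH_ITEM_NONE = 0x7FFFFFFF


def load_report(path):
    """Load a gzip-compressed `ceph report` (hypothetical helper)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def acting_sets(report):
    """Map each pgid to its acting set."""
    return {pg["pgid"]: pg["acting"] for pg in report["pgmap"]["pg_stats"]}


def missing_osds(bad_report, good_report):
    """For every PG with a hole in the bad run, list the OSDs that
    occupied the same slots in the good run."""
    good = acting_sets(good_report)
    result = {}
    for pgid, acting in acting_sets(bad_report).items():
        holes = [i for i, osd in enumerate(acting) if osd == CRUSH_ITEM_NONE]
        if holes and pgid in good:
            result[pgid] = [good[pgid][i] for i in holes if i < len(good[pgid])]
    return result


# Stand-ins for the two reports, reduced to the values shown above.
bad = {"pgmap": {"pg_stats": [{"pgid": "2.7", "acting": [2147483647, 0, 4]}]}}
good = {"pgmap": {"pg_stats": [{"pgid": "2.7", "acting": [6, 0, 4]}]}}

print(missing_osds(bad, good))  # {'2.7': [6]}
```

With the attached files, the same function could be called as `missing_osds(load_report("/tmp/bad-report.txt.gz"), load_report("/tmp/good-report.txt.gz"))`. Matching by slot position relies on the pool using indep mapping, where a failed slot is left as a hole instead of the remaining OSDs shifting left.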

The osd map from the bad run shows osd.6 as down and out:

$ gzip -d < /tmp/bad-report.txt.gz | jq '.osdmap.osds[] | select(.osd == 6)'
{
  "osd": 6,
  "uuid": "913da64e-3527-4d06-9441-62e8d1145356",
  "up": 0,
  "in": 0,
  "weight": 0,
  "primary_affinity": 1,
  "last_clean_begin": 0,
  "last_clean_end": 0,
  "up_from": 26,
  "up_thru": 0,
  "down_at": 28,
  "lost_at": 0,
  "public_addr": "127.0.0.1:6889/29036",
  "cluster_addr": "127.0.0.1:6890/29036",
  "heartbeat_back_addr": "127.0.0.1:6891/29036",
  "heartbeat_front_addr": "127.0.0.1:6892/29036",
  "state": [
    "autoout",
    "exists" 
  ]
}
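Instead of querying one OSD at a time, the whole osd map can be scanned for OSDs that are down or out. This is a minimal sketch, assuming the `osdmap.osds[]` layout shown above; `unavailable_osds` is a hypothetical helper, and the sample entry keeps only a few of the fields from the osd.6 output. The `autoout` state flag indicates that the OSD was marked out automatically rather than by an operator.

```python
def unavailable_osds(report):
    """Return (osd id, state flags) for every OSD that is down or out."""
    return [
        (osd["osd"], osd["state"])
        for osd in report["osdmap"]["osds"]
        if osd["up"] == 0 or osd["in"] == 0
    ]


# Stand-in for the bad report, reduced to the osd.6 entry quoted above.
bad = {
    "osdmap": {
        "osds": [
            {
                "osd": 6,
                "up": 0,
                "in": 0,
                "weight": 0,
                "up_from": 26,
                "down_at": 28,
                "state": ["autoout", "exists"],
            }
        ]
    }
}

print(unavailable_osds(bad))  # [(6, ['autoout', 'exists'])]
```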

Nothing in bad.log.gz explains why osd.6 failed. It could simply be that the host running the test failed, although dmesg showed no sign of memory starvation or disk trouble.

Files

bad.log.gz (59.1 KB): bad.log (Loïc Dachary, 06/27/2015 04:29 PM)
bad-report.txt.gz (7.29 KB): ceph report for the bad run (Loïc Dachary, 06/27/2015 04:29 PM)
good-report.txt.gz (7.28 KB): ceph report for the good run (Loïc Dachary, 06/27/2015 04:29 PM)

Updated by Loïc Dachary almost 9 years ago

Updated by Loïc Dachary almost 9 years ago

Updated by Samuel Just almost 9 years ago

Updated by Loïc Dachary almost 9 years ago

Updated by Loïc Dachary over 8 years ago