Bug #8578

teuthology: OSD thrasher killing OSDs which hold the sole copies of PGs

Added by Greg Farnum about 5 years ago. Updated almost 5 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: teuthology
Target version: -
Start date: 06/10/2014
Due date:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

ubuntu@teuthology:/a/gregf-2014-06-09_22:28:53-rados-wip-xattr-spillout-testing-basic-plana/303486

For example:

0.63    0       0       0       0       0       0       0       down+peering    2014-06-10 19:17:03.794611      0'0     1035:18 [4]     4       [4]     4       0'0     2014-06-10 19:07:33.982623      0'0     2014-06-10 18:37:59.6250527.828766
0.62    0       0       0       0       0       0       0       active+clean    2014-06-10 19:16:42.355566      0'0     1035:28 [0,5]   0       [0,5]   0       0'0     2014-06-10 19:07:15.921025      0'0     2014-06-10 18:37:56.7977837.791664
0.61    0       0       0       0       0       0       0       stale+down+peering      2014-06-10 19:16:10.623753      0'0     1019:19 [1]     1       [1]     1       0'0     2014-06-10 19:07:32.444863      0'0     2014-06-10 18:37:56.766243
0.60    0       0       0       0       0       0       0       down+peering    2014-06-10 19:17:03.838535      0'0     1035:17 [0]     0       [0]     0       0'0     2014-06-10 19:07:28.050747      0'0     2014-06-10 18:37:55.855174
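A rough way to pick out the PGs in an excerpt like the one above that have only one OSD in their acting set. This is only a sketch: it assumes the rows come from `ceph pg dump`, that the first bracketed list on each row is the up set and the second is the acting set, and that the state is the ninth whitespace-separated field. The helper names `parse_pg_row` and `single_copy_pgs` are made up for illustration, and the sample rows are the rows above truncated after the acting primary.

```python
import re

# Matches bracketed OSD lists such as "[4]" or "[0,5]".
BRACKETED = re.compile(r"\[([0-9,]*)\]")

# Rows from the excerpt above, truncated after the acting-primary column.
ROWS = [
    "0.63 0 0 0 0 0 0 0 down+peering 2014-06-10 19:17:03.794611 0'0 1035:18 [4] 4 [4] 4",
    "0.62 0 0 0 0 0 0 0 active+clean 2014-06-10 19:16:42.355566 0'0 1035:28 [0,5] 0 [0,5] 0",
    "0.61 0 0 0 0 0 0 0 stale+down+peering 2014-06-10 19:16:10.623753 0'0 1019:19 [1] 1 [1] 1",
    "0.60 0 0 0 0 0 0 0 down+peering 2014-06-10 19:17:03.838535 0'0 1035:17 [0] 0 [0] 0",
]


def parse_pg_row(row):
    """Extract pgid, state, up set and acting set from one row.

    Assumption: first bracketed list = up set, second = acting set.
    """
    fields = row.split()
    sets = [[int(x) for x in m.split(",") if x] for m in BRACKETED.findall(row)]
    return {"pgid": fields[0], "state": fields[8], "up": sets[0], "acting": sets[1]}


def single_copy_pgs(rows):
    """Return parsed rows whose acting set contains exactly one OSD."""
    return [pg for pg in map(parse_pg_row, rows) if len(pg["acting"]) == 1]


if __name__ == "__main__":
    for pg in single_copy_pgs(ROWS):
        print(pg["pgid"], pg["state"], pg["acting"])
```

On these four rows, everything except 0.62 is reported with a single-OSD acting set.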

I'm guessing all of those were mapped to osd.1 when the thrasher killed it:

2014-06-10T19:17:03.034 INFO:teuthology.task.thrashosds.thrasher:in_osds:  [5, 2, 0, 4, 1]  out_osds:  [3] dead_osds:  [2] live_osds:  [0, 1, 4, 5, 3]
2014-06-10T19:17:03.034 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2014-06-10T19:17:03.034 INFO:teuthology.task.thrashosds.thrasher:inject_pause on 1
2014-06-10T19:17:03.034 INFO:teuthology.task.thrashosds.thrasher:Testing filestore_inject_stall pause injection for duration 3
2014-06-10T19:17:03.034 INFO:teuthology.task.thrashosds.thrasher:Checking after 0, should_be_down=False
2014-06-10T19:17:03.035 INFO:teuthology.orchestra.run.plana43:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok config set filestore_inject_stall 3'
2014-06-10T19:17:09.207 INFO:teuthology.task.thrashosds.thrasher:in_osds:  [5, 2, 0, 4, 1]  out_osds:  [3] dead_osds:  [2] live_osds:  [0, 1, 4, 5, 3]
2014-06-10T19:17:09.208 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2014-06-10T19:17:09.208 INFO:teuthology.task.thrashosds.thrasher:Killing osd 1, live_osds are [0, 1, 4, 5, 3]
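To cross-check a log like this against the PG mappings, something along these lines could pull the OSD lists out of the thrasher status line and list the PGs whose acting set is just the OSD about to be killed. It is a sketch, not teuthology code: `parse_status` and `sole_copy_pgs` are hypothetical helpers, and the `ACTING` table is typed in by hand from the PG rows quoted earlier in this report.

```python
import ast
import re

# Matches "name:  [list]" pairs in the thrasher status line.
STATUS = re.compile(r"(in_osds|out_osds|dead_osds|live_osds):\s*(\[[^\]]*\])")

# Message part of the status line logged just before "Killing osd 1".
LINE = "in_osds:  [5, 2, 0, 4, 1]  out_osds:  [3] dead_osds:  [2] live_osds:  [0, 1, 4, 5, 3]"

# Acting sets copied by hand from the PG rows quoted in this report.
ACTING = {"0.63": [4], "0.62": [0, 5], "0.61": [1], "0.60": [0]}


def parse_status(line):
    """Return a dict mapping each list name to its list of OSD ids."""
    return {name: ast.literal_eval(osds) for name, osds in STATUS.findall(line)}


def sole_copy_pgs(osd, acting_by_pg):
    """Return the PGs whose acting set consists of only the given OSD."""
    return sorted(pg for pg, acting in acting_by_pg.items() if acting == [osd])


if __name__ == "__main__":
    status = parse_status(LINE)
    victim = 1  # the OSD named in the "Killing osd 1" log line
    print("live before kill:", status["live_osds"])
    print("sole-copy PGs on osd", victim, ":", sole_copy_pgs(victim, ACTING))
```

With the excerpted rows, only 0.61 shows osd 1 as its sole acting member; the other down PGs would need the mappings from the time of the kill to confirm the guess.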

History

#1 Updated by Sage Weil almost 5 years ago

  • Status changed from New to Can't reproduce
