Project

General

Profile

Actions

Bug #22711

closed

qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test 4096 '!=' 8192

Added by Sage Weil over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Immediate
Assignee:
Category:
Tests
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2018-01-16T03:47:18.410 INFO:tasks.workunit.client.0.smithi058.stderr://home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2366: test_mon_cephdf_commands:  ceph df detail --format=json
2018-01-16T03:47:18.411 INFO:tasks.workunit.client.0.smithi058.stderr://home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2366: test_mon_cephdf_commands:  jq '.pools | .[] | select(.name == "cephdf_for_test") | .stats.raw_bytes_used'
2018-01-16T03:47:18.708 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2366: test_mon_cephdf_commands:  cal_raw_used_size=4096
2018-01-16T03:47:18.709 INFO:tasks.workunit.client.0.smithi058.stderr://home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2367: test_mon_cephdf_commands:  ceph df detail --format=json
2018-01-16T03:47:18.709 INFO:tasks.workunit.client.0.smithi058.stderr://home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2367: test_mon_cephdf_commands:  jq '.pools | .[] | select(.name == "cephdf_for_test") | .stats.bytes_used * 2'
2018-01-16T03:47:18.967 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2367: test_mon_cephdf_commands:  raw_used_size=8192
2018-01-16T03:47:18.967 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2369: test_mon_cephdf_commands:  ceph osd pool delete cephdf_for_test cephdf_for_test --yes-i-really-really-mean-it
2018-01-16T03:47:19.774 INFO:tasks.ceph.osd.0.smithi058.stderr:2018-01-16 03:47:19.764 7fbf20cde700 -1 osd.0 571 handle_osd_map lastmap 571
2018-01-16T03:47:19.775 INFO:tasks.ceph.osd.0.smithi058.stderr:2018-01-16 03:47:19.764 7fbf20cde700 -1 osd.0 571 handle_osd_map i.second 572
2018-01-16T03:47:19.778 INFO:tasks.ceph.osd.1.smithi058.stderr:2018-01-16 03:47:19.764 7f1b1b50d700 -1 osd.1 571 handle_osd_map lastmap 571
2018-01-16T03:47:19.778 INFO:tasks.ceph.osd.1.smithi058.stderr:2018-01-16 03:47:19.764 7f1b1b50d700 -1 osd.1 571 handle_osd_map i.second 572
2018-01-16T03:47:19.929 INFO:tasks.workunit.client.0.smithi058.stderr:pool 'cephdf_for_test' does not exist
2018-01-16T03:47:19.940 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2370: test_mon_cephdf_commands:  rm ./cephdf_for_test
2018-01-16T03:47:19.942 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2372: test_mon_cephdf_commands:  expect_false test 4096 '!=' 8192
2018-01-16T03:47:19.942 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:48: expect_false:  set -x
2018-01-16T03:47:19.942 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:49: expect_false:  test 4096 '!=' 8192
2018-01-16T03:47:19.942 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:49: expect_false:  return 1
2018-01-16T03:47:19.942 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:1: test_mon_cephdf_commands:  rm -fr /tmp/cephtool.QUE
2018-01-16T03:47:19.943 INFO:tasks.workunit:Stopping ['cephtool'] on client.0...

/a/sage-2018-01-16_03:08:54-rados-wip-sage2-testing-2018-01-15-1257-distro-basic-smithi/2077685
Actions #1

Updated by Chang Liu over 6 years ago

2018-01-16T03:47:10.576 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2350: test_mon_cephdf_commands:  ceph osd pool create cephdf_for_test 32 32 replicated
2018-01-16T03:47:11.416 INFO:tasks.workunit.client.0.smithi058.stderr:pool 'cephdf_for_test' already exists

2018-01-16T03:47:18.967 INFO:tasks.workunit.client.0.smithi058.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:2369: test_mon_cephdf_commands:  ceph osd pool delete cephdf_for_test cephdf_for_test --yes-i-really-really-mean-it
2018-01-16T03:47:19.929 INFO:tasks.workunit.client.0.smithi058.stderr:pool 'cephdf_for_test' does not exist

It looks very strange

Actions #2

Updated by Sage Weil over 6 years ago

the curr_object_copies_rate value in PGMap.cc dump_object_stat_sum is .5, which is counteracting the 2x replication factor.

2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:            "stats": {
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "kb_used": 4,
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "bytes_used": 4096,
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "percent_used": 0.00,
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "max_avail": 137773940736,
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "objects": 1,
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "quota_objects": 0,
2018-01-22T01:21:14.105 INFO:tasks.workunit.client.0.smithi101.stdout:                "quota_bytes": 0,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "dirty": 1,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "rd": 0,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "rd_bytes": 0,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "wr": 1,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "wr_bytes": 4096,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "raw_bytes_used": 4096,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "raw_used_rate": 2.000000,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "curr_object_copies_rate": 0.500000,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "num_object_copies": 2,
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:                "num_objects_degraded": 1
2018-01-22T01:21:14.106 INFO:tasks.workunit.client.0.smithi101.stdout:            }

note the extra values in my test run /a/sage-2018-01-22_00:54:47-rados-wip-22668-test-distro-basic-smithi/2096774 .. raw_used-rate is fine.

Actions #3

Updated by Kefu Chai over 6 years ago

the weirdness of this issue is that some PGs are mapped to a single OSD:

$ ceph pg map 46.1 --format=json-pretty

{
    "epoch": 641,
    "raw_pgid": "46.1",
    "pgid": "46.1",
    "up": [
        2
    ],
    "acting": [
        2
    ]
}

that's why quite some PGs are in "active+undersized+degraded" state. i tracked down into do_rule(), turns out the returned osds is actually [2].

Actions #4

Updated by Kefu Chai over 6 years ago

  • Assignee set to Kefu Chai
Actions #5

Updated by Kefu Chai over 6 years ago

  • Category set to Tests
  • Status changed from 12 to Fix Under Review
Actions #6

Updated by Sage Weil over 6 years ago

  • Status changed from Fix Under Review to Resolved
Actions

Also available in: Atom PDF