Bug #25006 (closed): bad csum during upgrade test

Added by Yuri Weinstein over 5 years ago. Updated over 5 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/luminous-x
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/yuriw-2018-07-18_21:43:34-upgrade:luminous-x-mimic-distro-basic-smithi/
Job: 2796264
Logs: http://qa-proxy.ceph.com/teuthology/yuriw-2018-07-18_21:43:34-upgrade:luminous-x-mimic-distro-basic-smithi/2796264/teuthology.log

2018-07-19T12:26:20.247 INFO:teuthology.orchestra.run.smithi130:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_ops_in_flight'
2018-07-19T12:26:20.302 INFO:tasks.ceph.osd.5.smithi130.stderr:2018-07-19 12:26:20.290112 7fcb47faa700 -1 received  signal: Hangup from  PID: 50243 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-07-19T12:26:20.378 INFO:tasks.ceph.osd.5.smithi130.stderr:2018-07-19 12:26:20.375775 7fcb47faa700 -1 received  signal: Hangup from  PID: 50243 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-07-19T12:26:20.391 INFO:tasks.ceph.osd.1.smithi104.stderr:2018-07-19 12:26:20.384 7f56ea308700 -1 bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x10000 checksum at blob offset 0x0, got 0x1c104513, expected 0x315eeaf7, device location [0xed70000~10000], logical extent 0x1b0000~10000, object #14:1d1092d3:::benchmark_data_smithi105_10641_object5831:head#
2018-07-19T12:26:20.391 INFO:tasks.ceph.osd.1.smithi104.stderr:2018-07-19 12:26:20.384 7f56ea308700 -1 log_channel(cluster) log [ERR] : 14.8 missing primary copy of 14:1d1092d3:::benchmark_data_smithi105_10641_object5831:head, unfound
2018-07-19T12:26:20.393 INFO:teuthology.orchestra.run.smithi130:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_blocked_ops'
2018-07-19T12:26:20.503 INFO:tasks.ceph.osd.7.smithi130.stderr:2018-07-19 12:26:20.500133 7f4ba88a1700 -1 received  signal: Hangup from  PID: 15430 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 7  UID: 0
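
For context, BlueStore's _verify_csum recomputes a CRC-32C over each checksummed chunk (0x10000 bytes here) and compares it against the value stored in the blob metadata; a mismatch produces the "bad crc32c/0x10000 checksum" line above and the read fails. The following is a minimal, illustrative sketch of that kind of check in Python, using a generic bit-by-bit CRC-32C (Castagnoli) routine rather than Ceph's optimized implementation; BlueStore's exact seeding and chunking details may differ.

def crc32c(data: bytes, crc: int = 0) -> int:
    # Generic reflected CRC-32C (Castagnoli), polynomial 0x82F63B78.
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def verify_chunk(chunk: bytes, expected_csum: int) -> bool:
    # Recompute the checksum for one chunk and compare with the stored value,
    # printing a message shaped like the BlueStore log line on mismatch.
    got = crc32c(chunk)
    if got != expected_csum:
        print(f"bad crc32c/0x{len(chunk):x} checksum: "
              f"got 0x{got:08x}, expected 0x{expected_csum:08x}")
        return False
    return True

# With the values from the log line above (the actual 64 KiB of on-disk data
# is not available here, so this call is only illustrative):
# verify_chunk(chunk_bytes, 0x315eeaf7)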


Related issues (2): 0 open, 2 closed

Related to bluestore - Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block) (Won't Fix)
Related to bluestore - Bug #24901: Client reads fail due to bad CRC under high memory pressure on OSDs (Resolved, 07/13/2018)

Update #1

Updated by Yuri Weinstein over 5 years ago

  • Project changed from RADOS to Ceph
Update #2

Updated by Nathan Cutler over 5 years ago

The "missing primary copy" message happens on a mixed luminous/mimic cluster (MONs, MGR, and half of OSDs running mimic, other half of OSDs running luminous) in the third or fourth iteration of radosbench within "stress-tasks", right after a bluestore "bad crc32c/0x10000 checksum" message.

Here is the immediately preceding cluster health status:

2018-07-19T12:26:19.206 INFO:teuthology.orchestra.run.smithi104.stdout:
2018-07-19T12:26:19.207 INFO:teuthology.orchestra.run.smithi104.stdout:{
2018-07-19T12:26:19.207 INFO:teuthology.orchestra.run.smithi104.stdout:    "fsid": "6fb081a2-4dae-443e-ad49-08c8b4670dc8",
2018-07-19T12:26:19.207 INFO:teuthology.orchestra.run.smithi104.stdout:    "health": {
2018-07-19T12:26:19.207 INFO:teuthology.orchestra.run.smithi104.stdout:        "checks": {
2018-07-19T12:26:19.207 INFO:teuthology.orchestra.run.smithi104.stdout:            "OSDMAP_FLAGS": {
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:                "severity": "HEALTH_WARN",
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:                "summary": {
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:                    "message": "nodeep-scrub flag(s) set" 
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:                }
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:            "OBJECT_MISPLACED": {
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:                "severity": "HEALTH_WARN",
2018-07-19T12:26:19.208 INFO:teuthology.orchestra.run.smithi104.stdout:                "summary": {
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:                    "message": "166/9278 objects misplaced (1.789%)" 
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:                }
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:            "PG_DEGRADED": {
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:                "severity": "HEALTH_WARN",
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:                "summary": {
2018-07-19T12:26:19.209 INFO:teuthology.orchestra.run.smithi104.stdout:                    "message": "Degraded data redundancy: 211/9278 objects degraded (2.274%), 1 pg degraded" 
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:                }
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:            "SMALLER_PGP_NUM": {
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:                "severity": "HEALTH_WARN",
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:                "summary": {
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:                    "message": "1 pools have pg_num > pgp_num" 
2018-07-19T12:26:19.210 INFO:teuthology.orchestra.run.smithi104.stdout:                }
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:            }
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:        },
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:        "status": "HEALTH_WARN",
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:        "overall_status": "HEALTH_WARN" 
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:    },
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:    "election_epoch": 12,
2018-07-19T12:26:19.211 INFO:teuthology.orchestra.run.smithi104.stdout:    "quorum": [
2018-07-19T12:26:19.212 INFO:teuthology.orchestra.run.smithi104.stdout:        0,
2018-07-19T12:26:19.212 INFO:teuthology.orchestra.run.smithi104.stdout:        1,
2018-07-19T12:26:19.212 INFO:teuthology.orchestra.run.smithi104.stdout:        2
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:    ],
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:    "quorum_names": [
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:        "a",
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:        "b",
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:        "c" 
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:    ],
2018-07-19T12:26:19.213 INFO:teuthology.orchestra.run.smithi104.stdout:    "monmap": {
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:        "epoch": 2,
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:        "fsid": "6fb081a2-4dae-443e-ad49-08c8b4670dc8",
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:        "modified": "2018-07-19 11:55:04.062383",
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:        "created": "2018-07-19 11:53:29.762084",
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:        "features": {
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:            "persistent": [
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:                "kraken",
2018-07-19T12:26:19.214 INFO:teuthology.orchestra.run.smithi104.stdout:                "luminous",
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:                "mimic",
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:                "osdmap-prune" 
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:            ],
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:            "optional": []
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:        },
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:        "mons": [
2018-07-19T12:26:19.215 INFO:teuthology.orchestra.run.smithi104.stdout:            {
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:                "rank": 0,
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:                "name": "a",
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:                "addr": "172.21.15.104:6789/0",
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:                "public_addr": "172.21.15.104:6789/0" 
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:            {
2018-07-19T12:26:19.216 INFO:teuthology.orchestra.run.smithi104.stdout:                "rank": 1,
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:                "name": "b",
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:                "addr": "172.21.15.104:6790/0",
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:                "public_addr": "172.21.15.104:6790/0" 
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:            {
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:                "rank": 2,
2018-07-19T12:26:19.217 INFO:teuthology.orchestra.run.smithi104.stdout:                "name": "c",
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:                "addr": "172.21.15.104:6791/0",
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:                "public_addr": "172.21.15.104:6791/0" 
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:            }
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:        ]
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:    },
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:    "osdmap": {
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:        "osdmap": {
2018-07-19T12:26:19.218 INFO:teuthology.orchestra.run.smithi104.stdout:            "epoch": 949,
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:            "num_osds": 8,
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:            "num_up_osds": 8,
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:            "num_in_osds": 8,
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:            "full": false,
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:            "nearfull": false,
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:            "num_remapped_pgs": 0
2018-07-19T12:26:19.219 INFO:teuthology.orchestra.run.smithi104.stdout:        }
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:    },
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:    "pgmap": {
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:        "pgs_by_state": [
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:            {
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:                "state_name": "active+clean",
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:                "count": 102
2018-07-19T12:26:19.220 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:            {
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:                "state_name": "active+recovering+degraded",
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:                "count": 1
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:            },
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:            {
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:                "state_name": "active+recovering",
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:                "count": 1
2018-07-19T12:26:19.221 INFO:teuthology.orchestra.run.smithi104.stdout:            }
2018-07-19T12:26:19.222 INFO:teuthology.orchestra.run.smithi104.stdout:        ],
2018-07-19T12:26:19.222 INFO:teuthology.orchestra.run.smithi104.stdout:        "num_pgs": 104,
2018-07-19T12:26:19.222 INFO:teuthology.orchestra.run.smithi104.stdout:        "num_pools": 2,
2018-07-19T12:26:19.222 INFO:teuthology.orchestra.run.smithi104.stdout:        "num_objects": 4639,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "data_bytes": 19453181976,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "bytes_used": 58726039552,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "bytes_avail": 714368073728,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "bytes_total": 773094113280,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "degraded_objects": 211,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "degraded_total": 9278,
2018-07-19T12:26:19.223 INFO:teuthology.orchestra.run.smithi104.stdout:        "degraded_ratio": 0.022742,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "misplaced_objects": 166,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "misplaced_total": 9278,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "misplaced_ratio": 0.017892,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "recovering_objects_per_sec": 62,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "recovering_bytes_per_sec": 260815863,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "recovering_keys_per_sec": 0,
2018-07-19T12:26:19.224 INFO:teuthology.orchestra.run.smithi104.stdout:        "num_objects_recovered": 351,
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:        "num_bytes_recovered": 1472200704,
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:        "num_keys_recovered": 0,
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:        "read_bytes_sec": 181,
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:        "write_bytes_sec": 52757806,
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:        "read_op_per_sec": 0,
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:        "write_op_per_sec": 416
2018-07-19T12:26:19.225 INFO:teuthology.orchestra.run.smithi104.stdout:    },
...

and here are the incriminating log messages:

2018-07-19T12:26:20.073 INFO:tasks.ceph.osd.0.smithi104.stderr:2018-07-19 12:26:20.060 7fd055462700 -1 received  signal: Hangup from /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 0  (PID: 40895) UID: 0
2018-07-19T12:26:20.157 INFO:teuthology.orchestra.run.smithi130:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.6.asok dump_historic_ops'
2018-07-19T12:26:20.202 INFO:tasks.ceph.osd.4.smithi130.stderr:2018-07-19 12:26:20.200241 7f110ceee700 -1 received  signal: Hangup from  PID: 12916 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 4  UID: 0
2018-07-19T12:26:20.247 INFO:teuthology.orchestra.run.smithi130:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_ops_in_flight'
2018-07-19T12:26:20.302 INFO:tasks.ceph.osd.5.smithi130.stderr:2018-07-19 12:26:20.290112 7fcb47faa700 -1 received  signal: Hangup from  PID: 50243 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-07-19T12:26:20.378 INFO:tasks.ceph.osd.5.smithi130.stderr:2018-07-19 12:26:20.375775 7fcb47faa700 -1 received  signal: Hangup from  PID: 50243 task name: /usr/bin/python /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 5  UID: 0
2018-07-19T12:26:20.391 INFO:tasks.ceph.osd.1.smithi104.stderr:2018-07-19 12:26:20.384 7f56ea308700 -1 bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x10000 checksum at blob offset 0x0, got 0x1c104513, expected 0x315eeaf7, device location [0xed70000~10000], logical extent 0x1b0000~10000, object #14:1d1092d3:::benchmark_data_smithi105_10641_object5831:head#
2018-07-19T12:26:20.391 INFO:tasks.ceph.osd.1.smithi104.stderr:2018-07-19 12:26:20.384 7f56ea308700 -1 log_channel(cluster) log [ERR] : 14.8 missing primary copy of 14:1d1092d3:::benchmark_data_smithi105_10641_object5831:head, unfound
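
As a side note, here is a minimal sketch (not part of the suite) for pulling these messages out of a downloaded teuthology.log; the file path is hypothetical:

import re

# Match the two error signatures quoted above.
PATTERN = re.compile(r"_verify_csum bad|missing primary copy")

with open("teuthology.log") as log:
    for line in log:
        if PATTERN.search(line):
            print(line.rstrip())
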
Update #3

Updated by Nathan Cutler over 5 years ago

  • Related to Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block) added
Update #4

Updated by Nathan Cutler over 5 years ago

  • Related to Bug #24901: Client reads fail due to bad CRC under high memory pressure on OSDs added
Update #5

Updated by Nathan Cutler over 5 years ago

Also, I noticed this in the test yaml:

  - parallel:
    - stress-tasks
  - install.upgrade:
      client.0: null
      osd.4: null
  - ceph.restart:
      daemons:
      - osd.4
      - osd.5
      - osd.6
      - osd.7
      wait-for-healthy: false
      wait-for-osds-up: true

The only thing within parallel is "stress-tasks". Shouldn't the "install.upgrade" and "ceph.restart" stanzas be included under parallel, as well?

The test in its current form seems to do: (1) upgrade all MONs, MGRs, and half of the OSDs to mimic, (2) stress this "split luminous/mimic" cluster, (3) upgrade the remaining OSDs to mimic.

It does not appear to thrash or otherwise stress the cluster during the upgrade process.
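
To make the ordering concrete, here is a minimal sketch (not part of the suite; it simply re-parses the task fragment above, inlined in equivalent YAML, with PyYAML) showing that only "stress-tasks" sits under parallel, while install.upgrade and ceph.restart run afterwards:

import yaml  # assumes PyYAML is available

# The task fragment from above, inlined so the sketch is self-contained.
TASKS_FRAGMENT = """
- parallel:
  - stress-tasks
- install.upgrade:
    client.0: null
    osd.4: null
- ceph.restart:
    daemons: [osd.4, osd.5, osd.6, osd.7]
    wait-for-healthy: false
    wait-for-osds-up: true
"""

for task in yaml.safe_load(TASKS_FRAGMENT):
    (name, body), = task.items()  # each task is a single-key mapping
    if name == "parallel":
        print("runs in parallel:", body)
    else:
        print("runs only after 'parallel' finishes:", name)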

Update #6

Updated by Patrick Donnelly over 5 years ago

  • Project changed from Ceph to RADOS
Update #7

Updated by Neha Ojha over 5 years ago

  • Assignee set to Neha Ojha
Update #8

Updated by Neha Ojha over 5 years ago

  • Project changed from RADOS to bluestore
  • Assignee changed from Neha Ojha to Sage Weil
Update #9

Updated by Sage Weil over 5 years ago

Nathan Cutler wrote:

Also, I noticed this in the test yaml:

[...]

The only thing within parallel is "stress-tasks". Shouldn't the "install.upgrade" and "ceph.restart" stanzas be included under parallel, as well?

The test in its current form seems to do: (1) upgrade all MONs, MGRs, and half of the OSDs to mimic, (2) stress this "split luminous/mimic" cluster, (3) upgrade the remaining OSDs to mimic.

It does not appear to thrash or otherwise stress the cluster during the upgrade process.

That's by design. The intent is to stress a mixed-version cluster with thrashing. The parallel tests do the upgrade and stress workload in parallel.

Update #10

Updated by Sage Weil over 5 years ago

  • Subject changed from "[ERR] : 14.8 missing primary copy of .." in upgrade:luminous-x-mimic to bad csum during upgrade test
  • Status changed from New to Need More Info
  • Assignee deleted (Sage Weil)
  • Priority changed from Urgent to High

Looking at the log, I don't see any useful clues as to what might have gone wrong. No intervening writes, etc.
