Bug #13370

closed

ceph-disk: ceph-deploy suite fails with dmcrypt (hammer)

Added by Loïc Dachary over 8 years ago. Updated over 8 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
other
Regression:
No
Severity:
3 - minor

Description

Workaround

Run ceph-disk activate manually.

Rationale for Won't fix

The problem revealed by this suite is a race condition that is easy to work around manually. It is fixed in infernalis, together with a number of other udev-related fixes. Backporting it to hammer would not be trivial, and it looks like people usually work around this.

Description

curl --silent 'http://paddles.front.sepia.ceph.com/runs/?suite=ceph-deploy' | jq -r '.[] | .name' | while read -r run ; do curl --silent "http://paddles.front.sepia.ceph.com/runs/$run/jobs/" | jq '.[] | select(.os_version == "6.5" and .status == "pass") | select(.description | contains("dmcrypt")) | .name' ; done

returns nothing and
teuthology-suite --verbose --suite ceph-deploy --filter="ceph-deploy/basic/{ceph-deploy-overrides/ceph_deploy_dmcrypt.yaml config_options/cephdeploy_conf.yaml distros/centos_6.5.yaml tasks/ceph-deploy_hello_world.yaml}" --suite-branch wip-ceph-deploy-test-hammer --ceph hammer-backports --machine-type vps --priority 101

fails consistently (see http://pulpito.ceph.com/loic-2015-10-05_01:40:39-ceph-deploy-hammer-backports---basic-vps/1088457/ for an example).

I suspect this is a combination of incorrect udev rules and racing udev events generated by ceph-disk.

teuthology-suite --verbose --suite ceph-deploy --filter="ceph-deploy/basic/{ceph-deploy-overrides/ceph_deploy_dmcrypt.yaml config_options/cephdeploy_conf.yaml distros/centos_6.5.yaml tasks/ceph-deploy_hello_world.yaml}" --suite-branch wip-ceph-deploy-test-hammer --ceph hammer-backports --machine-type vps --priority 101

will create a teuthology job that never returns (it waits forever for the cluster to be healthy) which is convenient to investigate.
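
When investigating such a stuck cluster by hand, a bounded wait can be more practical than an open-ended one. The following is a minimal sketch, not something taken from teuthology or ceph-deploy: the function name wait_for_health_ok and its arguments are hypothetical, and it assumes that the health command (ceph health by default) prints a line starting with HEALTH_OK once the cluster is healthy.

```shell
# Hypothetical helper (not part of teuthology or ceph-deploy): poll a health
# command until its output starts with HEALTH_OK or the timeout expires.
# Usage: wait_for_health_ok TIMEOUT_SECONDS INTERVAL_SECONDS [COMMAND...]
# Returns 0 when HEALTH_OK is seen, 1 when the timeout is reached.
wait_for_health_ok() {
    timeout=$1
    interval=$2
    shift 2
    # Assumption: default to `ceph health` when no command is given.
    [ "$#" -gt 0 ] || set -- ceph health
    elapsed=0
    while :; do
        case "$("$@" 2>/dev/null)" in
            HEALTH_OK*) return 0 ;;
        esac
        # Give up once the elapsed time has reached the timeout.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}
```

For example, wait_for_health_ok 900 10 would poll every 10 seconds for up to 15 minutes, the limit quoted in the duplicate reports.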


Related issues: 2 (0 open, 2 closed)

Has duplicate Ceph - Bug #13721: ceph-deploy: "unable to get 'HEALTH_OK' after waiting 15 minutes" in ceph-deploy-hammer-distro-basic-vps run (Duplicate, 11/08/2015)
Has duplicate Ceph - Bug #13366: ceph-deploy: "unable to get 'HEALTH_OK' after waiting 15 minutes" in ceph-deploy-hammer-distro-basic-vps run (Duplicate, 10/05/2015)

Actions #1

Updated by Loïc Dachary over 8 years ago

  • Project changed from 18 to Ceph
  • Subject changed from ceph-deploy suite fails on CentOS 6.5 with dmcrypt to ceph-disk: ceph-deploy suite fails on CentOS 6.5 with dmcrypt (hammer)
  • Description updated (diff)
Actions #2

Updated by Loïc Dachary over 8 years ago

  • Description updated (diff)
Actions #3

Updated by Loïc Dachary over 8 years ago

teuthology-suite --verbose --suite ceph-deploy --filter="ceph-deploy/basic/{ceph-deploy-overrides/ceph_deploy_dmcrypt.yaml config_options/cephdeploy_conf.yaml distros/ubuntu_14.04.yaml tasks/ceph-deploy_hello_world.yaml}" --suite-branch hammer --ceph hammer-backports --machine-type vps --priority 101

also fails, and on the first target it looks like /dev/vdb1 failed to be mapped with dmcrypt:

/dev/vdb :
 /dev/vdb1 ceph data (dmcrypt LUKS), not currently mapped
 /dev/vdb2 ceph journal (dmcrypt LUKS /dev/dm-0)
/dev/vdc other, unknown
/dev/vdd other, unknown
ubuntu@vpm040:~$ sudo ceph-disk activate /dev/vdb1
mount: unknown filesystem type 'crypto_LUKS'
ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', 'crypto_LUKS', '-o', '', '--', '/dev/vdb1', '/var/lib/ceph/tmp/mnt.3FLhuJ']' returned non-zero exit status 32
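
The listing above can be checked mechanically for dmcrypt partitions that were left unmapped. This is a minimal sketch that assumes the ceph-disk list line format shown in this comment; the function name unmapped_dmcrypt_partitions is hypothetical and not part of ceph-disk.

```shell
# Hypothetical helper (not part of ceph-disk): read `ceph-disk list` output
# on stdin and print each partition reported as dmcrypt but not mapped.
# Assumes lines of the form shown above, for example:
#   " /dev/vdb1 ceph data (dmcrypt LUKS), not currently mapped"
unmapped_dmcrypt_partitions() {
    # The device path is the first whitespace-separated field of the line.
    awk '/dmcrypt/ && /not currently mapped/ { print $1 }'
}
```

Piping the output of sudo ceph-disk list into unmapped_dmcrypt_partitions would report /dev/vdb1 for the listing in this comment, while /dev/vdb2 (already mapped to /dev/dm-0) is skipped.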
Actions #4

Updated by Loïc Dachary over 8 years ago

  • Subject changed from ceph-disk: ceph-deploy suite fails on CentOS 6.5 with dmcrypt (hammer) to ceph-disk: ceph-deploy suite fails with dmcrypt (hammer)
Actions #6

Updated by Samuel Just over 8 years ago

  • Priority changed from Normal to High
Actions #7

Updated by Loïc Dachary over 8 years ago

  • Has duplicate Bug #13721: ceph-deploy: "unable to get 'HEALTH_OK' after waiting 15 minutes" in ceph-deploy-hammer-distro-basic-vps run added
Actions #8

Updated by Loïc Dachary over 8 years ago

  • Has duplicate Bug #13366: ceph-deploy: "unable to get 'HEALTH_OK' after waiting 15 minutes" in ceph-deploy-hammer-distro-basic-vps run added
Actions #9

Updated by Loïc Dachary over 8 years ago

  • Description updated (diff)
  • Status changed from 12 to Won't Fix