Bug #8963

closed

erasure coding crush ruleset breaks rbd kernel clients on non-ec pools on Ubuntu 14.04

Added by Greg Dahlman over 9 years ago. Updated almost 7 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
firefly
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On a fresh install using ceph-deploy on Ubuntu 14.04, creating any erasure-coded pool breaks rbd kernel clients on Linux 3.13.0-32-generic.

OS: Ubuntu 14.04
ceph-deploy installed packages:
ii ceph 0.80.5-1trusty amd64 distributed storage and file system
ii ceph-common 0.80.5-1trusty amd64 common utilities to mount and interact with a ceph storage cluster
ii ceph-fs-common 0.80.5-1trusty amd64 common utilities to mount and interact with a ceph file system
ii ceph-mds 0.80.5-1trusty amd64 metadata server for the ceph distributed file system
ii librbd1 0.80.5-1trusty amd64 RADOS block device client library
ii python-ceph 0.80.5-1trusty amd64 Python libraries for the Ceph distributed filesystem
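
The kernel-side client matters as much as the userspace packages here; the running kernel can be confirmed on the client with (the version shown is the one stated above, not freshly captured output):

uname -r        # 3.13.0-32-generic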

Deployment steps:

ceph-deploy install sbseaceph3001 sbseaceph3002 sbseaceph3003 sbseaceph3004 sbseaceph3005 sbseaceph3006
ceph-deploy new sbseaceph3001 sbseaceph3002 sbseaceph3003
ceph-deploy mon create sbseaceph3001 sbseaceph3002 sbseaceph3003
ceph-deploy gatherkeys sbseaceph3001

ceph-deploy disk zap sbseaceph3001:sda sbseaceph3001:sdb sbseaceph3001:sdc sbseaceph3001:sdd sbseaceph3001:sdg sbseaceph3001:sdh sbseaceph3001:sdi sbseaceph3001:sdj sbseaceph3001:sdk sbseaceph3001:sdl
ceph-deploy disk zap sbseaceph3002:sda sbseaceph3002:sdb sbseaceph3002:sdc sbseaceph3002:sdd sbseaceph3002:sdg sbseaceph3002:sdh sbseaceph3002:sdi sbseaceph3002:sdj sbseaceph3002:sdk sbseaceph3002:sdl
ceph-deploy disk zap sbseaceph3003:sda sbseaceph3003:sdb sbseaceph3003:sdc sbseaceph3003:sdd sbseaceph3003:sdg sbseaceph3003:sdh sbseaceph3003:sdi sbseaceph3003:sdj sbseaceph3003:sdk sbseaceph3003:sdl
ceph-deploy disk zap sbseaceph3004:sda sbseaceph3004:sdb sbseaceph3004:sdc sbseaceph3004:sdd sbseaceph3004:sdg sbseaceph3004:sdh sbseaceph3004:sdi sbseaceph3004:sdj sbseaceph3004:sdk sbseaceph3004:sdl
ceph-deploy disk zap sbseaceph3005:sda sbseaceph3005:sdb sbseaceph3005:sdc sbseaceph3005:sdd sbseaceph3005:sdg sbseaceph3005:sdh sbseaceph3005:sdi sbseaceph3005:sdj sbseaceph3005:sdk sbseaceph3005:sdl
ceph-deploy disk zap sbseaceph3006:sda sbseaceph3006:sdb sbseaceph3006:sdc sbseaceph3006:sdd sbseaceph3006:sdg sbseaceph3006:sdh sbseaceph3006:sdi sbseaceph3006:sdj sbseaceph3006:sdk sbseaceph3006:sdl

ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sda:sde5 sbseaceph3002:sda:sde5 sbseaceph3003:sda:sde5 sbseaceph3004:sda:sde5 sbseaceph3005:sda:sde5 sbseaceph3006:sda:sde5
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdb:sde6 sbseaceph3002:sdb:sde6 sbseaceph3003:sdb:sde6 sbseaceph3004:sdb:sde6 sbseaceph3005:sdb:sde6 sbseaceph3006:sdb:sde6
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdc:sde7 sbseaceph3002:sdc:sde7 sbseaceph3003:sdc:sde7 sbseaceph3004:sdc:sde7 sbseaceph3005:sdc:sde7 sbseaceph3006:sdc:sde7
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdd:sde8 sbseaceph3002:sdd:sde8 sbseaceph3003:sdd:sde8 sbseaceph3004:sdd:sde8 sbseaceph3005:sdd:sde8 sbseaceph3006:sdd:sde8
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdg:sde9 sbseaceph3002:sdg:sde9 sbseaceph3003:sdg:sde9 sbseaceph3004:sdg:sde9 sbseaceph3005:sdg:sde9 sbseaceph3006:sdg:sde9

ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdh:sdf5 sbseaceph3002:sdh:sdf5 sbseaceph3003:sdh:sdf5 sbseaceph3004:sdh:sdf5 sbseaceph3005:sdh:sdf5 sbseaceph3006:sdh:sdf5
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdi:sdf6 sbseaceph3002:sdi:sdf6 sbseaceph3003:sdi:sdf6 sbseaceph3004:sdi:sdf6 sbseaceph3005:sdi:sdf6 sbseaceph3006:sdi:sdf6
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdj:sdf7 sbseaceph3002:sdj:sdf7 sbseaceph3003:sdj:sdf7 sbseaceph3004:sdj:sdf7 sbseaceph3005:sdj:sdf7 sbseaceph3006:sdj:sdf7
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdk:sdf8 sbseaceph3002:sdk:sdf8 sbseaceph3003:sdk:sdf8 sbseaceph3004:sdk:sdf8 sbseaceph3005:sdk:sdf8 sbseaceph3006:sdk:sdf8
ceph-deploy osd create --fs-type=btrfs sbseaceph3001:sdl:sdf9 sbseaceph3002:sdl:sdf9 sbseaceph3003:sdl:sdf9 sbseaceph3004:sdl:sdf9 sbseaceph3005:sdl:sdf9 sbseaceph3006:sdl:sdf9
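
With the OSDs created, a quick sanity check that the cluster is healthy before running the rbd test can be done with the usual status commands (not part of the original report):

ceph -s
ceph osd tree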

Issue recreation:

Test the cluster prior to creating the ec pool:

  1. rbd create --size 10240 bench
  2. rbd map bench  (the map succeeds)
  3. rbd unmap /dev/rbd1
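
As an aside, the active mapping can be confirmed before unmapping with the standard rbd CLI (not part of the original report):

rbd showmapped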

Create ec pool:

  1. ceph osd pool create ecpool 12 12 erasure
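
For context, since no profile is specified this uses the default erasure-code profile and generates the "erasure-code" crush rule dumped further below; on firefly the profile can be inspected with:

ceph osd erasure-code-profile get default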

Attempt to map the replicated rbd volume in an unrelated (non-ec) pool:

  1. rbd map bench
     rbd: add failed: (5) Input/output error

dmesg output:

[ 8422.748007] libceph: client4235 fsid 4af7828f-9360-45fd-aa82-9e10cd69ac6a
[ 8422.749388] libceph: mon2 10.3.11.203:6789 session established
[ 8422.756190] rbd1: unknown partition table
[ 8422.756267] rbd: rbd1: added with size 0x280000000
[ 8496.226697] libceph: osd21 up
[ 8496.226704] libceph: osd21 weight 0x10000 (in)
[ 8496.226737] libceph: osd22 up
[ 8496.226739] libceph: osd22 weight 0x10000 (in)
[ 8496.226769] libceph: osd23 up
[ 8496.226771] libceph: osd23 weight 0x10000 (in)
[ 8496.226802] libceph: osd24 up
[ 8496.226804] libceph: osd24 weight 0x10000 (in)
[ 8496.226834] libceph: osd25 up
[ 8496.226836] libceph: osd25 weight 0x10000 (in)
[ 8496.226891] libceph: osd26 up
[ 8496.226893] libceph: osd26 weight 0x10000 (in)
[ 8508.456496] libceph: mon2 10.3.11.203:6789 feature set mismatch, my 4a042a42 < server's 104a042a42, missing 1000000000
[ 8508.544665] libceph: mon2 10.3.11.203:6789 socket error on read
[ 8518.463590] libceph: mon0 10.3.11.201:6789 feature set mismatch, my 4a042a42 < server's 104a042a42, missing 1000000000
[ 8518.551548] libceph: mon0 10.3.11.201:6789 socket error on read
[ 8528.469541] libceph: mon2 10.3.11.203:6789 feature set mismatch, my 4a042a42 < server's 104a042a42, missing 1000000000
[ 8528.557437] libceph: mon2 10.3.11.203:6789 socket error on read
[ 8538.475586] libceph: mon1 10.3.11.202:6789 feature set mismatch, my 4a042a42 < server's 104a042a42, missing 1000000000
[ 8538.563276] libceph: mon1 10.3.11.202:6789 socket error on read
[ 8548.481613] libceph: mon1 10.3.11.202:6789 feature set mismatch, my 4a042a42 < server's 104a042a42, missing 1000000000
[ 8548.569251] libceph: mon1 10.3.11.202:6789 socket error on read
[ 8558.487569] libceph: mon1 10.3.11.202:6789 feature set mismatch, my 4a042a42 < server's 104a042a42, missing 1000000000
[ 8558.575080] libceph: mon1 10.3.11.202:6789 socket error on read
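
The "missing 1000000000" value pins down the incompatibility. A quick decode (bash arithmetic; the bit-to-feature mapping is my reading of ceph's include/ceph_features.h, not part of the original report):

printf 'missing: %x\n' $(( 0x104a042a42 & ~0x4a042a42 ))    # -> missing: 1000000000
# 0x1000000000 is bit 36, i.e. CEPH_FEATURE_CRUSH_V2. The v2 crush rule steps
# (chooseleaf_indep, set_chooseleaf_tries) in the generated erasure-code rule
# require that feature, which the 3.13 kernel's libceph does not advertise,
# so the monitors reject the session even for I/O to replicated pools.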

Fix: delete the ec pool and the "erasure-code" crush rule. The rule in question:

  1. ceph osd crush rule dump erasure-code

{ "rule_id": 1,
  "rule_name": "erasure-code",
  "ruleset": 1,
  "type": 3,
  "min_size": 3,
  "max_size": 20,
  "steps": [
        { "op": "set_chooseleaf_tries",
          "num": 5},
        { "op": "take",
          "item": -1,
          "item_name": "default"},
        { "op": "chooseleaf_indep",
          "num": 0,
          "type": "host"},
        { "op": "emit"}]}

The existence of ec pools should not break clients which are not using that functionality.

#1

Updated by Sage Weil over 9 years ago

  • Priority changed from Normal to Urgent
  • Source changed from other to Community (user)
#2

Updated by Sage Weil over 9 years ago

  • Status changed from New to Resolved
  • Backport set to firefly

backported to firefly

#3

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (10)