Bug #10488

closed

osd erasure-code-profile set is sometimes not idempotent

Added by Loïc Dachary over 9 years ago. Updated about 9 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: firefly
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When the code at http://workbench.dachary.org/ceph/ceph/blob/giant/src/mon/OSDMonitor.cc#L4531 compares the profile to be created (for instance k=2 m=1) with the existing one, it does not take into account the key=value pairs that the jerasure plugin adds implicitly. As a result, the profiles will not compare equal when ceph osd erasure-code-profile set myprofile k=3 m=1 is run twice.
This shows up in the mon logs as:

2015-01-07 18:06:34.333705 7fe7dcffd700 10 mon.d@6(peon) e1 handle_route mon_command_ack([{"prefix": "osd erasure-code-profile set", "name": "testprofile", "profile": [ "k=2", "m=1", "ruleset-failure-domain=osd"]}]=-1 will not override erasure code profile testprofile v220) v1 to unknown.0 :/0
2015-01-07 18:06:34.333712 7fe7dcffd700  1 -- 10.214.131.5:6792/0 --> 10.214.134.136:0/64022480 -- mon_command_ack([{"prefix": "osd erasure-code-profile set", "name": "testprofile", "profile": [ "k=2", "m=1", "ruleset-failure-domain=osd"]}]=-1 will not override erasure code profile testprofile v220) v1 -- ?+0 0x20d8960 con 0x32609a0
2015-01-07 18:06:34.335084 7fe7dcffd700 10 mon.d@6(peon) e1 ms_handle_reset 0x32609a0 10.214.134.136:0/64022480

It manifested itself in http://pulpito.ceph.com/sage-2015-01-06_09:44:19-rados-wip-sage-testing-firefly---basic-multi/688168/, reported in http://tracker.ceph.com/issues/10483, where http://workbench.dachary.org/ceph/ceph/blob/master/src/test/librados/test.cc#L57 failed with:
2015-01-07T18:05:54.843 INFO:tasks.workunit.client.0.mira057.stdout:[ RUN      ] LibRadosAioEC.FlushAsync
2015-01-07T18:05:55.228 INFO:teuthology.orchestra.run.plana35:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph -m 10.214.134.136:6792 mon_status'
2015-01-07T18:05:55.607 INFO:tasks.mon_thrash.mon_thrasher:monitors to thrash: ['e', 'i', 'a', 'g']
2015-01-07T18:05:55.607 INFO:tasks.mon_thrash.mon_thrasher:monitors to freeze: []
2015-01-07T18:05:55.608 INFO:tasks.mon_thrash.mon_thrasher:thrashing mon.e
2015-01-07T18:05:55.608 INFO:tasks.mon_thrash.mon_thrasher:killing mon.e
2015-01-07T18:06:01.606 INFO:tasks.ceph.mon.e:Stopped
2015-01-07T18:06:01.606 INFO:tasks.mon_thrash.mon_thrasher:thrashing mon.i
2015-01-07T18:06:01.606 INFO:tasks.mon_thrash.mon_thrasher:killing mon.i
2015-01-07T18:06:07.607 INFO:tasks.ceph.mon.i:Stopped
2015-01-07T18:06:07.847 INFO:tasks.mon_thrash.mon_thrasher:thrashing mon.a
2015-01-07T18:06:07.847 INFO:tasks.mon_thrash.mon_thrasher:killing mon.a
2015-01-07T18:06:13.608 INFO:tasks.ceph.mon.a:Stopped
2015-01-07T18:06:13.608 INFO:tasks.mon_thrash.mon_thrasher:thrashing mon.g
2015-01-07T18:06:13.608 INFO:tasks.mon_thrash.mon_thrasher:killing mon.g
2015-01-07T18:06:19.608 INFO:tasks.ceph.mon.g:Stopped
2015-01-07T18:06:19.608 INFO:tasks.mon_thrash.ceph_manager:waiting for quorum size 5
2015-01-07T18:06:19.609 INFO:teuthology.orchestra.run.plana35:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph quorum_status'
2015-01-07T18:06:19.799 INFO:teuthology.orchestra.run.plana35.stderr:2015-01-07 18:06:19.798735 7fba7c2c1700  0 -- :/1026241 >> 10.214.134.136:6790/0 pipe(0x7fba78039010 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x7fba780392a0).fault
2015-01-07T18:06:28.799 INFO:teuthology.orchestra.run.plana35.stderr:2015-01-07 18:06:28.799283 7fba7c1c0700  0 -- 10.214.131.5:0/1026241 >> 10.214.131.5:6789/0 pipe(0x7fba68000cd0 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x7fba680029c0).fault
2015-01-07T18:06:31.170 INFO:tasks.workunit.client.0.mira057.stdout:test/librados/aio.cc:2279: Failure
2015-01-07T18:06:31.170 INFO:tasks.workunit.client.0.mira057.stdout:Value of: test_data.init()
2015-01-07T18:06:31.170 INFO:tasks.workunit.client.0.mira057.stdout:  Actual: "create_one_ec_pool(test-rados-api-mira057-22480-52) failed: error rados_mon_command erasure-code-profile set name:testprofile failed with error -1" 
2015-01-07T18:06:31.171 INFO:tasks.workunit.client.0.mira057.stdout:Expected: "" 
2015-01-07T18:06:31.171 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] LibRadosAioEC.FlushAsync (36327 ms)
2015-01-07T18:06:31.171 INFO:tasks.workunit.client.0.mira057.stdout:[ RUN      ] LibRadosAioEC.FlushAsyncPP
2015-01-07T18:06:31.194 INFO:tasks.workunit.client.0.mira057.stdout:test/librados/aio.cc:2319: Failure
2015-01-07T18:06:31.194 INFO:tasks.workunit.client.0.mira057.stdout:Value of: test_data.init()
2015-01-07T18:06:31.195 INFO:tasks.workunit.client.0.mira057.stdout:  Actual: "create_one_ec_pool(test-rados-api-mira057-22480-53) failed: error mon_command erasure-code-profile set name:testprofile failed with error -1" 
2015-01-07T18:06:31.195 INFO:tasks.workunit.client.0.mira057.stdout:Expected: "" 
2015-01-07T18:06:31.195 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] LibRadosAioEC.FlushAsyncPP (24 ms)
2015-01-07T18:06:31.196 INFO:tasks.workunit.client.0.mira057.stdout:[ RUN      ] LibRadosAioEC.RoundTripWriteFull
2015-01-07T18:06:31.206 INFO:tasks.workunit.client.0.mira057.stdout:test/librados/aio.cc:2364: Failure
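
The comparison problem described above can be illustrated with a minimal Python sketch. It is not the OSDMonitor.cc code: the function names, the in-memory store, and the set of implicit defaults are all assumptions made for illustration (the keys a plugin really adds depend on the plugin and the Ceph version). It only shows why comparing the raw request against a stored profile that already contains implicit keys fails on the second call, and how applying the same defaults to the request before comparing restores idempotency.

```python
# Hypothetical defaults standing in for the key=value pairs a plugin adds
# implicitly. The real set depends on the plugin and the Ceph version.
ASSUMED_PLUGIN_DEFAULTS = {"plugin": "jerasure", "technique": "reed_sol_van"}


def parse_profile(pairs):
    """Turn ["k=2", "m=1"] into {"k": "2", "m": "1"}."""
    profile = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        profile[key] = value
    return profile


def normalize(profile, defaults=ASSUMED_PLUGIN_DEFAULTS):
    """Return the profile with implicit defaults filled in; explicit keys win."""
    merged = dict(defaults)
    merged.update(profile)
    return merged


def set_profile(store, name, pairs, normalize_first):
    """Hypothetical stand-in for 'osd erasure-code-profile set'.

    Returns 0 on success and -1 when an existing profile would be overridden,
    mirroring the "=-1 will not override erasure code profile" reply in the log.
    """
    requested = parse_profile(pairs)
    if normalize_first:
        requested = normalize(requested)
    existing = store.get(name)
    if existing is not None:
        return 0 if existing == requested else -1
    # The stored form always includes the implicit keys.
    store[name] = normalize(requested)
    return 0


# Comparing the raw request: the second identical call is rejected.
buggy_store = {}
first = set_profile(buggy_store, "testprofile", ["k=2", "m=1"], normalize_first=False)
second = set_profile(buggy_store, "testprofile", ["k=2", "m=1"], normalize_first=False)

# Normalizing before comparing: the second identical call succeeds.
fixed_store = {}
fixed_first = set_profile(fixed_store, "testprofile", ["k=2", "m=1"], normalize_first=True)
fixed_second = set_profile(fixed_store, "testprofile", ["k=2", "m=1"], normalize_first=True)
```

In this sketch a request that genuinely differs from the stored profile (for example k=3 instead of k=2) is still rejected after normalization, so the guard against silently overriding a profile is preserved.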


Related issues 2 (0 open, 2 closed)

Related to Ceph - Bug #11144: erasure-code-profile set races with erasure-code-profile rm (Resolved, Loïc Dachary, 03/18/2015)
Has duplicate Ceph - Bug #10483: erasure-code-profile set fails on mon thrashing (Duplicate, 01/08/2015)
