Bug #13279
upgrade suite: pool_create failed with error -4 EINTR
Description
Run: http://pulpito.ceph.com/teuthology-2015-09-29_10:41:00-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps/
Job: ['1076297']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-09-29_10:41:00-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps/1076297/teuthology.log
2015-09-29T11:13:07.293 INFO:tasks.ceph.osd.3:Restarting daemon
2015-09-29T11:13:07.293 INFO:tasks.ceph.osd.3:Stopping old one...
2015-09-29T11:13:07.293 DEBUG:tasks.ceph.osd.3:waiting for process to exit
2015-09-29T11:13:07.578 INFO:tasks.ceph.osd.2.vpm055.stdout:starting osd.2 at :/0 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
2015-09-29T11:13:07.601 INFO:tasks.ceph.osd.2.vpm055.stderr:2015-09-29 18:13:07.593229 7fa603c65900 -1 filestore(/var/lib/ceph/osd/ceph-2) FileStore::mount : stale version stamp detected: 3. Proceeding, do_update is set, performing disk format upgrade.
2015-09-29T11:13:07.685 INFO:tasks.ceph.osd.2.vpm055.stderr:2015-09-29 18:13:07.680160 7fa603c65900 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2015-09-29T11:13:07.713 INFO:tasks.ceph.osd.2.vpm055.stderr:2015-09-29 18:13:07.707706 7fa603c65900 -1 osd.2 286 PGs are upgrading
2015-09-29T11:13:07.804 INFO:tasks.ceph.osd.2.vpm055.stderr:2015-09-29 18:13:07.799305 7fa603c65900 -1 osd.2 286 log_to_monitors {default=true}
2015-09-29T11:13:10.928 INFO:tasks.ceph.mon.b.vpm055.stderr:2015-09-29 18:13:10.921432 7f871fe56700 -1 mon.b@0(leader).mds e5 Missing health data for MDS 4112
2015-09-29T11:13:10.980 INFO:tasks.workunit.client.0.vpm026.stdout:test/librados/aio.cc:2817: Failure
2015-09-29T11:13:10.980 INFO:tasks.workunit.client.0.vpm026.stdout:Value of: test_data.init()
2015-09-29T11:13:10.980 INFO:tasks.workunit.client.0.vpm026.stdout: Actual: "create_one_ec_pool(test-rados-api-vpm026-12090-64) failed: error mon_command osd pool create pool:test-rados-api-vpm026-12090-64 pool_type:erasure failed with error -4"
2015-09-29T11:13:10.981 INFO:tasks.workunit.client.0.vpm026.stdout:Expected: ""
2015-09-29T11:13:10.981 INFO:tasks.workunit.client.0.vpm026.stdout:[ FAILED ] LibRadosAioEC.MultiWritePP (17363 ms)
2015-09-29T11:13:10.981 INFO:tasks.workunit.client.0.vpm026.stdout:[----------] 31 tests from LibRadosAioEC (294972 ms total)
2015-09-29T11:13:10.981 INFO:tasks.workunit.client.0.vpm026.stdout:
2015-09-29T11:13:10.981 INFO:tasks.workunit.client.0.vpm026.stdout:[----------] Global test environment tear-down
2015-09-29T11:13:10.982 INFO:tasks.workunit.client.0.vpm026.stdout:[==========] 62 tests from 2 test cases ran. (421478 ms total)
2015-09-29T11:13:10.982 INFO:tasks.workunit.client.0.vpm026.stdout:[ PASSED ] 61 tests.
2015-09-29T11:13:10.982 INFO:tasks.workunit.client.0.vpm026.stdout:[ FAILED ] 1 test, listed below:
2015-09-29T11:13:10.982 INFO:tasks.workunit.client.0.vpm026.stdout:[ FAILED ] LibRadosAioEC.MultiWritePP
2015-09-29T11:13:10.982 INFO:tasks.workunit.client.0.vpm026.stdout:
2015-09-29T11:13:10.982 INFO:tasks.workunit.client.0.vpm026.stdout: 1 FAILED TEST
2015-09-29T11:13:10.983 INFO:tasks.workunit:Stopping ['rados/test-upgrade-v9.0.1.sh', 'cls'] on client.0...
2015-09-29T11:13:10.983 INFO:teuthology.orchestra.run.vpm026:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0'
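When comparing many runs, it can help to reduce a log like the one above to the names of the failing tests. The following is a minimal sketch, assuming the log uses the gtest "[ FAILED ] Suite.Test" convention seen here; the function name failed_tests is made up for illustration.

```shell
# failed_tests: read a teuthology log on stdin and print each failing
# gtest name once. The pattern tolerates any amount of spacing inside
# the "[ FAILED ]" tag and skips the "[ FAILED ] 1 test, listed below:"
# summary line, which has no Suite.Test name after the tag.
failed_tests() {
    grep -E '\[ *FAILED *\] +[A-Za-z0-9_]+\.[A-Za-z0-9_]+' |
        sed -E 's/.*\[ *FAILED *\] +([A-Za-z0-9_]+\.[A-Za-z0-9_]+).*/\1/' |
        sort -u
}
```

Typical use would be `failed_tests < teuthology.log`.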
Related issues
History
#1 Updated by Yuri Weinstein over 8 years ago
- ceph-qa-suite upgrade/firefly-hammer-x added
#3 Updated by Yuri Weinstein over 8 years ago
- Subject changed from "[ FAILED ] LibRadosAioEC.MultiWritePP" in upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps run to "[ FAILED ] LibRadosAioEC.*" tests in upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps run
Run: http://pulpito.ceph.com/teuthology-2015-09-30_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/
Jobs: ['1078473', '1078474', '1078475', '1078476']
#4 Updated by Yuri Weinstein over 8 years ago
- Subject changed from "[ FAILED ] LibRadosAioEC.*" tests in upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps run to "[ FAILED ] LibRadosAioEC.*" tests failed in upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps run
#5 Updated by Yuri Weinstein over 8 years ago
- Priority changed from Normal to Urgent
Run: http://pulpito.ceph.com/teuthology-2015-10-02_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/
Jobs: ['1084445', '1084447', '1084448']
#6 Updated by Loïc Dachary over 8 years ago
- Status changed from New to In Progress
- Assignee set to Loïc Dachary
#7 Updated by Yuri Weinstein over 8 years ago
Loïc, I see a similar failure in this run; assuming it is a dupe for now:
http://pulpito.ceph.com/teuthology-2015-10-05_17:05:09-upgrade:giant-x-hammer-distro-basic-vps/
['1089918', '1089925', '1089958']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-10-05_17:05:09-upgrade:giant-x-hammer-distro-basic-vps/1089918/teuthology.log
2015-10-05T21:59:44.107 INFO:tasks.mon_thrash.ceph_manager:quorum is size 2
2015-10-05T21:59:44.107 INFO:teuthology.orchestra.run.vpm185:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph -m 10.214.130.185:6789 mon_status'
2015-10-05T22:00:00.125 INFO:tasks.workunit.client.1.vpm050.stdout:test/librados/aio.cc:167: Failure
2015-10-05T22:00:00.125 INFO:tasks.workunit.client.1.vpm050.stdout:Value of: test_data.init()
2015-10-05T22:00:00.125 INFO:tasks.workunit.client.1.vpm050.stdout: Actual: "create_one_pool(test-rados-api-vpm050-15351-1) failed: error rados_pool_create(test-rados-api-vpm050-15351-1) failed with error -4"
2015-10-05T22:00:00.125 INFO:tasks.workunit.client.1.vpm050.stdout:Expected: ""
2015-10-05T22:00:00.126 INFO:tasks.workunit.client.1.vpm050.stdout:[ FAILED ] LibRadosAio.SimpleWrite (56419 ms)
.......
2015-10-06T06:38:26.649 INFO:tasks.workunit.client.1.vpm050.stdout:[ FAILED ] 1 test, listed below:
2015-10-06T06:38:26.650 INFO:tasks.workunit.client.1.vpm050.stdout:[ FAILED ] LibRadosAio.SimpleWrite
2015-10-06T06:38:26.650 INFO:tasks.workunit.client.1.vpm050.stdout:
2015-10-06T06:38:26.650 INFO:tasks.workunit.client.1.vpm050.stdout: 1 FAILED TEST
#8 Updated by Yuri Weinstein over 8 years ago
- Release set to hammer
- ceph-qa-suite upgrade/giant-x added
#9 Updated by Loïc Dachary over 8 years ago
- Subject changed from "[ FAILED ] LibRadosAioEC.*" tests failed in upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps run to upgrade suite: pool_create failed with error -4 EINTR
- Release deleted (hammer)
- ceph-qa-suite deleted (upgrade/giant-x)
#10 Updated by Loïc Dachary over 8 years ago
The bug was first seen in early September, shortly after the following commits were merged into the hammer branch:
$ git log --merges --since 2015-09-01 --until 2015-09-11 --format='%H' ceph/hammer | while read sha1 ; do echo ; git log --format='** %aD "%s":https://github.com/ceph/ceph/commit/%H' ${sha1}^1..${sha1} ; done | perl -p -e 'print "* \"PR $1\":https://github.com/ceph/ceph/pull/$1\n" if(/Merge pull request #(\d+)/)'
- PR 5769
- Wed, 9 Sep 2015 19:44:09 -0400 Merge pull request #5769 from dachary/wip-12850-hammer
- Tue, 11 Aug 2015 09:26:33 -0400 librbd: prevent race condition between resize requests
- PR 5768
- Wed, 9 Sep 2015 19:44:01 -0400 Merge pull request #5768 from dachary/wip-12849-hammer
- Wed, 29 Jul 2015 12:46:24 -0400 lockdep: allow lockdep to be dynamically enabled/disabled
- Tue, 28 Jul 2015 14:23:14 -0400 tests: librbd API test cannot use private md_config_t struct
- Thu, 9 Apr 2015 15:06:27 -0400 tests: ensure old-format RBD tests still work
- Thu, 30 Jul 2015 09:00:57 -0400 librados_test_stub: implement conf get/set API methods
- Tue, 28 Jul 2015 13:14:29 -0400 crypto: use NSS_InitContext/NSS_ShutdownContex to avoid memory leak
- Sat, 21 Mar 2015 07:13:51 +0800 auth: use crypto_init_mutex to protect NSS_Shutdown()
- Sat, 21 Mar 2015 01:02:42 +0800 auth: reinitialize NSS modules after fork()
- PR 5697
- Wed, 9 Sep 2015 16:58:56 +0200 Merge pull request #5697 from tchaikov/wip-12638-hammer
- Mon, 10 Aug 2015 04:25:03 -0700 mon: add a cache layer over MonitorDBStore
- PR 5381
- Wed, 9 Sep 2015 14:52:54 +0200 Merge pull request #5381 from dachary/wip-12499-hammer
- Thu, 16 Jul 2015 04:45:05 -0700 Client: check dir is still complete after dropping locks in _readdir_cache_cb
- PR 5757
- Tue, 8 Sep 2015 14:58:16 -0700 Merge pull request #5757 from dachary/wip-12836-hammer
- Tue, 7 Jul 2015 08:49:54 -0700 WBThrottle::clear_object: signal if we cleared an object
- PR 5759
- Mon, 7 Sep 2015 10:06:30 +0200 Merge pull request #5759 from dachary/wip-12841-hammer
- Mon, 24 Aug 2015 15:40:39 -0700 config: skip lockdep for intentionally recursive md_config_t lock
- PR 5761
- Mon, 7 Sep 2015 10:05:41 +0200 Merge pull request #5761 from dachary/wip-12843-hammer
- Tue, 21 Jul 2015 11:31:12 -0700 OSD: break connection->session->waiting message->connection cycle
- PR 5762
- Mon, 7 Sep 2015 10:04:51 +0200 Merge pull request #5762 from dachary/wip-12844-hammer
- Wed, 29 Jul 2015 21:47:17 +0000 osd: copy the RecoveryCtx::handle when creating a new RecoveryCtx instance from another one
- PR 5763
- Mon, 7 Sep 2015 10:04:03 +0200 Merge pull request #5763 from dachary/wip-12846-hammer
- Sun, 9 Aug 2015 10:46:10 -0400 osd/PGLog: dirty_to is inclusive
- PR 5764
- Mon, 7 Sep 2015 10:03:10 +0200 Merge pull request #5764 from dachary/wip-12847-hammer
- Mon, 24 Aug 2015 23:02:10 +0800 common: fix code format
- Mon, 24 Aug 2015 22:59:40 +0800 test: add test case for insert empty ptr when buffer rebuild
- Mon, 24 Aug 2015 23:01:27 +0800 common: fix insert empty ptr when bufferlist rebuild
- PR 5373
- Mon, 7 Sep 2015 10:02:14 +0200 Merge pull request #5373 from dachary/wip-12489-hammer
- Thu, 2 Jul 2015 05:29:47 +0000 osd: pg_interval_t::check_new_interval should not rely on pool.min_size to determine if the PG was active
- Wed, 1 Jul 2015 20:26:54 +0000 osd: Move IsRecoverablePredicate/IsReadablePredicate to osd_types.h
- PR 5383
- Mon, 7 Sep 2015 10:00:32 +0200 Merge pull request #5383 from dachary/wip-12504-hammer
- Thu, 16 Jul 2015 09:42:55 +0800 rest_bench: bucketname is not mandatory as we have a default name
- Thu, 16 Jul 2015 09:17:59 +0800 rest_bench: drain the work queue to fix a crash Fixes: #3896 Signed-off-by: huangjun <hjwsm1989@gmail.com>
- PR 5765
- Mon, 7 Sep 2015 09:54:07 +0200 Merge pull request #5765 from dachary/wip-12883-hammer
- Thu, 13 Aug 2015 19:41:47 +0200 tests: tiering agent and proxy read
- Thu, 13 Aug 2015 13:47:24 +0200 osd: trigger the cache agent after a promotion
- PR 5754
- Mon, 7 Sep 2015 09:53:14 +0200 Merge pull request #5754 from dachary/wip-12588-hammer
- Wed, 8 Jul 2015 10:35:49 +0800 librados: Make librados pool_create respect default_crush_ruleset
- PR 5377
- Mon, 7 Sep 2015 09:51:50 +0200 Merge pull request #5377 from dachary/wip-12396-hammer
- Fri, 3 Jul 2015 18:27:13 +0800 mon/PGMonitor: bug fix pg monitor get crush rule
- PR 5758
- Sun, 6 Sep 2015 21:07:38 -0400 Merge pull request #5758 from dachary/wip-12839-hammer
- Thu, 23 Jul 2015 16:36:19 -0700 osd: Keep a reference count on Connection while calling send_message()
- PR 5276
- Sun, 6 Sep 2015 23:17:22 +0200 Merge pull request #5276 from dachary/wip-11824-hammer
- Thu, 16 Jul 2015 18:02:02 +0200 mon: test the crush ruleset when creating a pool
- Sat, 30 May 2015 12:40:26 +0200 erasure-code: set max_size to chunk_count() instead of 20 for shec
- Thu, 26 Feb 2015 21:22:31 +0200 vstart.sh: set PATH to include pwd
- PR 5382
- Sun, 6 Sep 2015 17:24:43 +0200 Merge pull request #5382 from dachary/wip-12500-hammer
- Tue, 21 Jul 2015 16:09:32 +0100 auth: check return value of keyring->get_secret
- PR 5367
- Sun, 6 Sep 2015 17:23:19 +0200 Merge pull request #5367 from dachary/wip-12311-hammer
- Wed, 18 Mar 2015 13:49:20 -0700 os/chain_xattr: handle read on chnk-aligned xattr
- PR 5223
- Fri, 4 Sep 2015 15:38:43 -0600 Merge pull request #5223 from SUSE/wip-12305-hammer
- Mon, 13 Jul 2015 18:12:01 +0200 ceph.spec.in: do not run fdupes, even on SLE/openSUSE
- PR 5716
- Thu, 3 Sep 2015 12:20:38 +0200 Merge pull request #5716 from dachary/wip-12851-hammer
- Mon, 20 Jul 2015 20:27:33 -0700 rgw: avoid using slashes for generated secret keys
- PR 5717
- Thu, 3 Sep 2015 12:11:24 +0200 Merge pull request #5717 from dachary/wip-12591-hammer
- Mon, 29 Jun 2015 15:35:04 -0700 rgw: api adjustment following a rebase
- Mon, 29 Jun 2015 15:34:44 -0700 rgw: orphans, fix check on number of shards
- Mon, 29 Jun 2015 15:34:11 -0700 rgw: orphans, change default number of shards
- Tue, 5 May 2015 14:43:05 -0700 rgw: change error output related to orphans
- Mon, 4 May 2015 17:02:29 -0700 rgw: orphan, fix truncated detection
- Mon, 4 May 2015 16:32:57 -0700 radosgw-admin: simplify orphan command
- Mon, 4 May 2015 15:24:00 -0700 radosgw-admin: stat orphan objects before reporting leakage
- Mon, 4 May 2015 14:39:20 -0700 radosgw-admin: orphans finish command
- Sat, 2 May 2015 17:28:30 -0700 rgw: cannot re-init an orphan scan job
- Sat, 2 May 2015 16:38:08 -0700 rgw: stat_async() sets the object locator appropriately
- Sat, 2 May 2015 16:34:09 -0700 rgw: list_objects() sets namespace appropriately
- Fri, 1 May 2015 17:23:44 -0700 rgw: modify orphan search fingerprints
- Fri, 1 May 2015 15:17:10 -0700 rgw: compare oids and dump leaked objects
- Thu, 30 Apr 2015 16:17:54 -0700 rgw: keep accurate state for linked objects orphan scan
- Wed, 29 Apr 2015 17:12:34 -0700 rgw: iterate over linked objects, store them
- Wed, 29 Apr 2015 17:12:00 -0700 rgw: add rgw_obj::parse_raw_oid()
- Wed, 29 Apr 2015 14:50:15 -0700 rgw: iterate asynchronously over linked objects
- Wed, 29 Apr 2015 14:15:33 -0700 rgw: async object stat functionality
- Tue, 28 Apr 2015 16:45:49 -0700 rgw-admin: build index of bucket indexes
- Sat, 25 Apr 2015 09:37:53 -0700 rgw: initial work of orphan detection tool implementation
- Wed, 29 Apr 2015 13:35:29 +0530 Avoid an extra read on the atomic variable
- Wed, 8 Apr 2015 18:53:14 +0530 RGW: Make RADOS handles in RGW to be a configurable option
- PR 5755
- Wed, 2 Sep 2015 23:35:58 +0200 Merge pull request #5755 from dachary/wip-12589-hammer
- Sun, 31 May 2015 19:42:45 +0200 ceph-disk: always check zap is applied on a full device
- PR 5732
- Wed, 2 Sep 2015 23:22:59 +0200 Merge pull request #5732 from ceph/wip-11455-hammer
- Wed, 26 Aug 2015 14:34:30 -0700 rgw: init some manifest fields when handling explicit objs
- PR 5721
- Wed, 2 Sep 2015 23:19:02 +0200 Merge pull request #5721 from dachary/wip-12853-hammer
- Thu, 6 Aug 2015 15:52:58 +0200 rgw: rework X-Trans-Id header to be conform with Swift API.
- Mon, 8 Jun 2015 22:59:54 +0530 Transaction Id added in response
- PR 5498
- Wed, 2 Sep 2015 23:08:24 +0200 Merge pull request #5498 from ceph/wip-12432-hammer
- Wed, 22 Jul 2015 10:01:00 -0700 rgw: set http status in civetweb
- Fri, 31 Jul 2015 11:03:29 -0700 civetweb: update submodule to support setting of http status
- PR 5527
- Wed, 2 Sep 2015 12:00:11 +0200 Merge pull request #5527 from SUSE/wip-12585-hammer
- Thu, 30 Jul 2015 14:20:56 +0100 osd/OSDMap: handle incrementals that modify+del pool
- PR 5551
- Wed, 2 Sep 2015 02:19:26 +0200 Merge pull request #5551 from ceph/wip-corpus-hammer
- Tue, 1 Sep 2015 17:44:06 -0400 ceph-object-corpus: add 0.94.2-207-g88e7ee7 hammer objects
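The one-liner that produced the list above is dense, so here is a more readable sketch of the same pipeline. The function names pr_links and list_merged_prs are made up for illustration, and sed stands in for the perl step; the git options are the ones used in the original command.

```shell
# pr_links: for every "Merge pull request #N" line on stdin, emit a
# Redmine-style link line first, then the original line unchanged.
# Lines without a merge subject pass through untouched.
pr_links() {
    sed -E 's|^(.*Merge pull request #([0-9]+).*)$|* "PR \2":https://github.com/ceph/ceph/pull/\2\
\1|'
}

# list_merged_prs BRANCH SINCE UPTO: list the commits brought in by each
# merge on BRANCH between the two dates, one group per merge.
list_merged_prs() {
    branch=$1 since=$2 upto=$3
    git log --merges --since "$since" --until "$upto" --format='%H' "$branch" |
        while read -r sha1; do
            echo
            git log --format='** %aD "%s":https://github.com/ceph/ceph/commit/%H' \
                "${sha1}^1..${sha1}"
        done | pr_links
}
```

For the hammer list this would be called as `list_merged_prs ceph/hammer 2015-09-01 2015-09-11` from inside a ceph clone.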
#11 Updated by Loïc Dachary over 8 years ago
On the teuthology machine, in the /a directory,
for run in *2015-{05,06,07,08,09,10}*upgrade* ; do for job in $run/* ; do test -d $job || continue ; config=$job/config.yaml ; test -f $config || continue ; summary=$job/summary.yaml ; test -f $summary || continue ; if shyaml get-value branch < $config | grep -q hammer && shyaml get-value success < $summary | grep -qi false && grep -q 'error -4' $job/teuthology.log ; then echo $job ; fi ; done ; done
teuthology-2015-09-11_17:18:07-upgrade:firefly-x-hammer-distro-basic-vps/1051109
teuthology-2015-09-14_17:18:07-upgrade:firefly-x-hammer-distro-basic-vps/1056283
teuthology-2015-09-25_17:18:08-upgrade:firefly-x-hammer-distro-basic-vps/1069265
teuthology-2015-10-02_17:05:01-upgrade:giant-x-hammer-distro-basic-vps/1084647
teuthology-2015-10-05_17:05:09-upgrade:giant-x-hammer-distro-basic-vps/1089918
teuthology-2015-10-05_17:05:09-upgrade:giant-x-hammer-distro-basic-vps/1089925
This shows the error appeared for the first time on September 9th, 2015, with upgrade:firefly-x/stress-split/{0-cluster/start.yaml 1-firefly-install/firefly.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/{rbd-cls.yaml rbd-import-export.yaml readwrite.yaml snaps-few-objects.yaml} 6-next-mon/monb.yaml 7-workload/{radosbench.yaml rbd_api.yaml} 8-next-mon/monc.yaml 9-workload/{rbd-python.yaml rgw-swift.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
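The archive scan above can also be written as a small function, which is easier to rerun for another branch. This is a sketch under assumptions: it reads config.yaml and summary.yaml as plain "key: value" text instead of calling shyaml, so it only works when branch and success are top-level keys on their own lines, and the names yaml_value and failed_jobs are made up for illustration.

```shell
# yaml_value FILE KEY: print the value of the first top-level
# "KEY: value" line. A plain-text approximation, not a YAML parser.
yaml_value() {
    sed -n "s/^$2:[[:space:]]*//p" "$1" | head -n 1
}

# failed_jobs ROOT BRANCH: print every job directory under ROOT/*upgrade*
# whose config targets BRANCH, whose summary reports failure, and whose
# teuthology.log contains 'error -4'.
failed_jobs() {
    root=$1 branch=$2
    for job in "$root"/*upgrade*/*; do
        [ -d "$job" ] || continue
        config=$job/config.yaml
        summary=$job/summary.yaml
        [ -f "$config" ] && [ -f "$summary" ] || continue
        yaml_value "$config" branch | grep -q "$branch" || continue
        yaml_value "$summary" success | grep -qi false || continue
        grep -q 'error -4' "$job/teuthology.log" 2>/dev/null || continue
        echo "$job"
    done
}
```

On the teuthology machine this would be run as `failed_jobs /a hammer`. Unlike the original loop it does not restrict the months in the glob, so it walks every upgrade run in the archive.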
#12 Updated by Loïc Dachary over 8 years ago
$ git log --merges --since 2015-09-01 --until 2015-09-11 --format='%H' ceph/firefly | while read sha1 ; do echo ; git log --format='** %aD "%s":https://github.com/ceph/ceph/commit/%H' ${sha1}^1..${sha1} ; done | perl -p -e 'print "* \"PR $1\":https://github.com/ceph/ceph/pull/$1\n" if(/Merge pull request #(\d+)/)'
- PR 5200
- Thu, 10 Sep 2015 11:46:41 +0200 Merge pull request #5200 from SUSE/wip-12289-firefly
- Tue, 12 Aug 2014 15:24:26 -0700 ReplicatedPG::maybe_handle_cache: do not skip promote for write_ordered
- Mon, 28 Jul 2014 14:06:06 +0800 osd: promotion on 2nd read for cache tiering
- Thu, 31 Jul 2014 15:49:44 -0700 ceph_test_rados_api_tier: test promote-on-second-read behavior
- Fri, 11 Jul 2014 11:31:22 -0700 osd/osd_types: be pedantic about encoding last_force_op_resend without feature bit
- PR 5388
- Wed, 9 Sep 2015 15:45:37 +0200 Merge pull request #5388 from SUSE/wip-12490-firefly
- Thu, 9 Jul 2015 13:32:03 +0800 UnittestBuffer: Add bufferlist zero test case
- Thu, 9 Jul 2015 13:42:42 +0800 buffer: Fix bufferlist::zero bug with special case
- PR 5408
- Wed, 9 Sep 2015 06:10:09 +0200 Merge pull request #5408 from SUSE/wip-12492-firefly
- Tue, 21 Jul 2015 11:20:53 +0100 mon: OSDMonitor: fix hex output on 'osd reweight'
- PR 5404
- Wed, 9 Sep 2015 06:07:51 +0200 Merge pull request #5404 from SUSE/wip-12395-firefly
- Fri, 3 Jul 2015 18:27:13 +0800 mon/PGMonitor: bug fix pg monitor get crush rule
- PR 5199
- Wed, 9 Sep 2015 06:05:39 +0200 Merge pull request #5199 from SUSE/wip-11980-firefly
- Fri, 15 May 2015 22:50:36 +0800 mon: always reply mdsbeacon
- Tue, 2 Jun 2015 23:20:21 -0700 mon/MDSMonitor: rename labels to a better name
- Tue, 2 Jun 2015 12:55:06 +0800 mon: send no_reply() to peon to drop ignored mdsbeacon
- Tue, 2 Jun 2015 12:22:26 +0800 mon: remove unnecessary error handling
- PR 5410
- Wed, 9 Sep 2015 06:04:15 +0200 Merge pull request #5410 from SUSE/wip-12497-firefly
- Tue, 21 Jul 2015 18:55:00 +0800 Update OSDMonitor.cc
- PR 5409
- Wed, 9 Sep 2015 06:01:50 +0200 Merge pull request #5409 from SUSE/wip-12495-firefly
- Mon, 20 Jul 2015 10:50:20 +0800 mon/PGMonitor: avoid uint64_t overflow when checking pool 'target/max' status. Fixes: #12401
- PR 5358
- Mon, 7 Sep 2015 14:22:52 +0200 Merge pull request #5358 from ceph/wip-11470.firefly
- Fri, 12 Jun 2015 19:21:10 +0100 mon: PaxosService: call post_refresh() instead of post_paxos_update()
- PR 5403
- Mon, 7 Sep 2015 10:20:27 +0200 Merge pull request #5403 from SUSE/wip-12393-firefly
- Tue, 26 May 2015 18:50:17 +0800 Mutex: fix leak of pthread_mutexattr
- PR 5225
- Sun, 6 Sep 2015 21:19:55 +0200 Merge pull request #5225 from SUSE/wip-12266-firefly
- Thu, 25 Jun 2015 22:37:52 +0200 ceph.spec.in: use _udevrulesdir to eliminate conditionals
- PR 5217
- Sun, 6 Sep 2015 21:19:42 +0200 Merge pull request #5217 from SUSE/wip-12268-firefly
- Tue, 16 Jun 2015 18:27:20 +0200 ceph.spec.in: python-argparse only in Python 2.6
- PR 5224
- Fri, 4 Sep 2015 11:50:22 -0600 Merge pull request #5224 from SUSE/wip-12304-firefly
- Mon, 13 Jul 2015 18:12:01 +0200 ceph.spec.in: do not run fdupes, even on SLE/openSUSE
- PR 5394
- Fri, 4 Sep 2015 11:47:49 -0600 Merge pull request #5394 from SUSE/wip-12447-firefly
- Thu, 9 Jul 2015 11:51:13 +0200 ceph.spec.in: drop SUSE-specific %py_requires macro
- PR 5043
- Fri, 4 Sep 2015 17:18:30 +0200 Merge pull request #5043 from SUSE/wip-12007-firefly
- Mon, 15 Sep 2014 11:41:06 +0000 For pgls OP, get/put budget on per list session basis, instead of per OP basis, which could lead to deadlock.
- PR 4769
- Fri, 4 Sep 2015 17:08:41 +0200 Merge pull request #4769 from SUSE/wip-11741-firefly
- Tue, 12 May 2015 16:37:56 -0700 mon: prevent bucket deletion when referenced by a rule
- PR 4788
- Fri, 4 Sep 2015 16:50:45 +0200 Merge pull request #4788 from SUSE/wip-11801-firefly
- Fri, 16 Jan 2015 07:54:22 -0800 mon/OSDMonitor: require mon_allow_pool_delete = true to remove pools
- PR 5235
- Fri, 4 Sep 2015 16:44:57 +0200 Merge pull request #5235 from SUSE/wip-12310-firefly
- Wed, 18 Mar 2015 13:49:20 -0700 os/chain_xattr: handle read on chnk-aligned xattr
- PR 5389
- Fri, 4 Sep 2015 16:42:28 +0200 Merge pull request #5389 from SUSE/wip-12391-firefly
- Fri, 15 May 2015 13:05:40 -0700 OSDMonitor: disallow ec pools as tiers
- PR 5406
- Fri, 4 Sep 2015 09:42:03 -0400 Merge pull request #5406 from ceph/wip-12465-firefly
- Fri, 24 Jul 2015 15:38:18 -0700 Log::reopen_log_file: take m_flush_mutex
- PR 4854
- Thu, 3 Sep 2015 12:34:32 +0200 Merge pull request #4854 from ceph/wip-11769-firefly
- Tue, 2 Jun 2015 10:33:35 -0400 tests: verify librbd blocking aio code path
- Mon, 1 Jun 2015 22:56:11 -0400 librbd: new rbd_non_blocking_aio config option
- Thu, 9 Apr 2015 20:34:28 -0400 PendingReleaseNotes: document changes to librbd's aio_read methods
- Wed, 8 Apr 2015 21:55:36 -0400 tests: update librbd AIO tests to remove result code
- Thu, 9 Apr 2015 13:33:09 -0400 librbd: AioRequest::send no longer returns a result
- Wed, 8 Apr 2015 21:37:50 -0400 librbd: internal AIO methods no longer return result
- Wed, 8 Apr 2015 21:48:21 -0400 Throttle: added pending_error method to SimpleThrottle
- Wed, 8 Apr 2015 20:18:50 -0400 librbd: add new fail method to AioCompletion
- Wed, 8 Apr 2015 19:06:52 -0400 librbd: avoid blocking AIO API methods
- Wed, 8 Apr 2015 17:24:08 -0400 librbd: add task pool / work queue for AIO requests
- Mon, 11 May 2015 17:05:49 -0400 WorkQueue: added virtual destructor
- Wed, 8 Apr 2015 16:46:34 -0400 WorkQueue: add new ContextWQ work queue
- Fri, 5 Dec 2014 14:21:08 -0800 common/ceph_context: don't import std namespace
- Mon, 1 Dec 2014 23:54:16 +0800 CephContext: Add AssociatedSingletonObject to allow CephContext's singleton
- PR 5233
- Wed, 2 Sep 2015 07:55:15 +0200 Merge pull request #5233 from SUSE/wip-12074-firefly
- Wed, 10 Dec 2014 11:53:43 +0100 Unconditionally chown rados log file.
#13 Updated by Loïc Dachary over 8 years ago
The following can't be the cause because it allows pool deletion by default and no ceph-qa-suite branch changes that.
- PR 4788
- Fri, 4 Sep 2015 16:50:45 +0200 Merge pull request #4788 from SUSE/wip-11801-firefly
- Fri, 16 Jan 2015 07:54:22 -0800 mon/OSDMonitor: require mon_allow_pool_delete = true to remove pools
The following, merged into firefly on September 4th, would return ENOTSUP, so it is not a candidate.
- PR 5389
- Fri, 4 Sep 2015 16:42:28 +0200 Merge pull request #5389 from SUSE/wip-12391-firefly
- Fri, 15 May 2015 13:05:40 -0700 OSDMonitor: disallow ec pools as tiers
#14 Updated by Loïc Dachary over 8 years ago
Nothing significant was merged in the hammer ceph-qa-suite branch before the problems started showing up. The giant-x and firefly-x upgrade suites have not been modified since July 2015 in the hammer branch.
#15 Updated by Loïc Dachary over 8 years ago
The first occurrence with the tests involving the infernalis branch was on September 13, 2015.
$ for run in *2015-{08,09}*upgrade* ; do for job in $run/* ; do test -d $job || continue ; config=$job/config.yaml ; test -f $config || continue ; summary=$job/summary.yaml ; test -f $summary || continue ; if shyaml get-value branch < $config | grep -q infernalis && shyaml get-value success < $summary | grep -qi false && grep -q 'error -4' $job/teuthology.log ; then echo $job ; fi ; done ; done
teuthology-2015-09-09_13:18:06-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1047659
teuthology-2015-09-09_13:18:06-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1047660
teuthology-2015-09-09_13:18:06-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1047661
teuthology-2015-09-09_13:18:06-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1047662
teuthology-2015-09-11_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1050696
teuthology-2015-09-11_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1050699
teuthology-2015-09-11_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1050701
teuthology-2015-09-14_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1055798
teuthology-2015-09-14_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1055799
teuthology-2015-09-14_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1055800
teuthology-2015-09-14_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1055801
teuthology-2015-09-16_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1060621
teuthology-2015-09-16_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1060622
teuthology-2015-09-16_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1060623
teuthology-2015-09-16_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1060624
teuthology-2015-09-23_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1066645
teuthology-2015-09-23_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1066646
teuthology-2015-09-23_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1066647
teuthology-2015-09-23_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1066648
teuthology-2015-09-28_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1074178
teuthology-2015-09-28_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1074179
teuthology-2015-09-28_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1074180
teuthology-2015-09-28_13:18:08-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1074181
teuthology-2015-09-29_10:41:00-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps/1076295
teuthology-2015-09-29_10:41:00-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps/1076296
teuthology-2015-09-29_10:41:00-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps/1076297
teuthology-2015-09-29_10:41:00-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-vps/1076298
teuthology-2015-09-30_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1078473
teuthology-2015-09-30_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1078474
teuthology-2015-09-30_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1078475
teuthology-2015-09-30_13:18:07-upgrade:firefly-hammer-x:parallel-infernalis-distro-basic-multi/1078476
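To read the date of the earliest affected run off a job list like this one without scanning it by eye, the run date can be extracted from the directory names. A minimal sketch, assuming every path contains a "teuthology-YYYY-MM-DD_" prefix as in the listing; the function name first_run_date is made up for illustration.

```shell
# first_run_date: read job paths on stdin and print the earliest run
# date. ISO dates (YYYY-MM-DD) sort correctly as plain strings, so a
# lexical sort is enough. Paths without a run date are ignored.
first_run_date() {
    sed -n -E 's/^.*teuthology-([0-9]{4}-[0-9]{2}-[0-9]{2})_.*$/\1/p' |
        sort | head -n 1
}
```

Piping the output of the scan into `first_run_date` gives the scheduling date of the first run that hit the error, which is the date encoded in the run name and not necessarily the day the job executed.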
#16 Updated by Loïc Dachary over 8 years ago
In the hammer upgrade/giant-x suite, the clients are installed with giant. The same giant clients are used during the whole upgrade test, meaning that the tests from giant are actually run all along.
#17 Updated by Loïc Dachary over 8 years ago
Because of the crush ruleset validation introduced by https://github.com/ceph/ceph/pull/5276 in hammer on September 6th, a pool creation may fail if
- the crush rule is invalid
- the validation takes more than X seconds, X being less than the mon lease
The implicit crush ruleset in firefly had max_size 20, which makes it very expensive to verify. This is the most probable cause of EINTR, meaning the crush validation did not complete in time.
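The failure mode described above, a check that is abandoned once it exceeds a deadline and is then reported as error 4, can be illustrated with a small wrapper. This is only a sketch of the pattern, not the monitor's implementation: it assumes GNU coreutils timeout is available, and the function name run_with_deadline is made up for illustration.

```shell
# run_with_deadline SECONDS CMD...: run CMD, giving up after SECONDS.
# GNU timeout exits with status 124 when it had to stop the command;
# that case is reported here as 4, the numeric value of EINTR on Linux.
# Any other outcome returns CMD's own exit status.
run_with_deadline() {
    deadline=$1
    shift
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "validation did not finish within ${deadline}s" >&2
        return 4
    fi
    return "$status"
}
```

With this pattern an expensive check fails in the same way as an interrupted call, even though the rule being checked may be perfectly valid, which matches the symptom seen in the tests.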
#18 Updated by Loïc Dachary over 8 years ago
We want the following commits:
- https://github.com/ceph/ceph/commit/aa238e5ed50f44a94caf84567267e4f6be8732a2
- https://github.com/ceph/ceph/commit/524b0bdcc45c2f4b95f2239c988e93250f337f3d
- https://github.com/ceph/ceph/commit/21a1e75d8a7bad89a48cd9d36902c5d609be5015
- https://github.com/ceph/ceph/commit/1b3090d50e5bd5ca3e6e396b23d2d9826896c718
- https://github.com/ceph/ceph/commit/0f82f461b33d93d868e185912a2c7e4074d06900
#19 Updated by Loïc Dachary over 8 years ago
- Status changed from In Progress to Duplicate
Duplicate of http://tracker.ceph.com/issues/13401
#20 Updated by Yuri Weinstein over 8 years ago
- Related to Bug #13664: tests: testprofile must be removed before it is re-created added