Tasks #17151
closed
hammer v0.94.10
Added by Nathan Cutler over 7 years ago. Updated about 7 years ago.
Description
Workflow
- Preparing the release
- Cutting the release
- Nathan asks Abhishek (Release Manager) if a point release should be published - YES
- Nathan gets approval from all leads
- Yehuda, rgw - YES
- John, CephFS - YES
- Jason, RBD - YES
- Josh, rados - YES
- Nathan prepares draft release notes - DONE https://github.com/ceph/ceph/pull/13152
- ceph-objectstore-tool and ceph-monstore-tool now enable users to rebuild the monitor database from OSDs. This feature is especially useful when all monitors fail to boot due to leveldb corruption.
- Abhishek writes and commits the release notes
- Nathan informs Yuri that the branch is ready for testing - DONE
- Yuri runs additional integration tests - DONE
- If Yuri discovers new bugs that need to be backported urgently (i.e. their priority is set to Urgent), the release goes back to being prepared; it was not ready after all
- Yuri informs Abhishek that the branch is ready for release - DONE
- Abhishek creates the packages and sets the release tag - IN PROGRESS
Release information
- branch to build from: hammer, commit: 83af8cdaaa6d94404e6146b68e532a784e3cc99c
- version: v0.94.10
- type of release: point release
- where to publish the release: http://download.ceph.com/debian-hammer and http://download.ceph.com/rpm-hammer
Updated by Nathan Cutler over 7 years ago
- Subject changed from hammer v0.94.9 to hammer v0.94.10
- Target version changed from v0.94.9 - UNDELETABLE, DO NOT USE to 522
0.94.9 is out already. Repurposing this issue for 0.94.10.
Updated by Nathan Cutler over 7 years ago
- Status changed from New to In Progress
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 11628
- |\
- | + Don't loop forever when reading data from 0 sized segment.
- + Pull request 9873
- |\
- | + init-radosgw: do not use systemd-run in sysvinit
- + Pull request 10238
- |\
- | + mon/MDSMonitor: fix memory leak in prepare_beacon
- + Pull request 10255
- |\
- | + common: fix race during optracker switches between enabled/disabled mode
- | + ReplicatedPG: clearing a whiteout should create the object
- | + common/TrackedOp: Move tracking_enabled check into register_inflight_op()
- | + common/TrackedOp: Handle dump racing with constructor
- | + common/TrackedOp: Missed locking when examining events
- | + CLEANUP: Move locking into dump_ops_in_flight()/dump_historic_ops()
- | + mds, osd: Fix missing locking for dump_blocked_ops
- | + osd: cleanup: Specify both template types for create_request()
- | + osd: add dump_blocked_ops asok command.
- | + common/TrackedOp: make Tracker can dynamic control.
- | + mds: Make mds can dynamic set optracker via asok.
- | + osd: Make osd can dynamic set optracker via asok.
- | + common/TrackedOp: Should lock ops_history_lock when access shutdown.
- | + osd/ReplicatedPG: for osd_op_create, if ob existed don't do t->touch.
- | + common/TrackedOp: checking in flight ops fix
- | + common/OpTracker: don't dump ops if tracking is not enabled
- | + common/TrackedOp: break out of loop when reaching log threshold
- | + common/TrackedOp: check tracking_enabled for event initiated/done .
- | + common/TrackedOp: clean up code make look good.
- + Pull request 10569
- |\
- | + stop.sh: make more portable
- + Pull request 10582
- |\
- | + osd: Mark child of split temp_created if it was created
- | + os/FileStore: better debug output for destroy_collection
- + Pull request 10724
- |\
- | + crush: reset bucket->h.items[i] when removing tree item
- + Pull request 10871
- |\
- | + hammer: resend writes after pool loses full flag
- | + hammer: drop write if pool is full
- | + hammer: osdc: implement Objecter::osdmap_pool_full
- | + hammer: osdc/Objecter: allow per-pool calls to op_cancel_writes
- + Pull request 10904
- |\
- | + mon: return size_t from MonitorDBStore::Transaction::size()
- + Pull request 10905
- |\
- | + ceph.in: improve the error message
- + Pull request 10990
- |\
- | + Merge branch 'hammer' into fix_bug_EXPORT_DIFF1
- | + rbd: this command should be EXPORT_DIFF
- + Pull request 11045
- |\
- | + 13207: Rados Gateway: Anonymous user is able to read bucket with authenticated read ACL
- + Pull request 11125
- |\
- | + doc: fill keyring with caps before passing it to ceph-monstore-tool
- | + tools/ceph_monstore_tool: bail out if no caps found for a key
- | + tools/ceph_monstore_tool: update pgmap_meta also when rebuilding store.db
- | + tools/rebuild_mondb: kill compiling warning
- | + tools/rebuild_mondb: return error if ondisk version of pg_info is incompatible
- | + tools/rebuild_mondb: avoid unnecessary result code cast
- | + doc: add rados/operations/disaster-recovery.rst
- | + tools/ceph_monstore_tool: add rebuild command
- | + tools/ceph-objectstore-tool: add update-mon-db command
- | + mon/AuthMonitor: make AuthMonitor::IncType public
- + Pull request 11273
- |\
- | + mon: OSDMonitor: Missing nearfull flag set
- + Pull request 11457
- |\
- | + mon: send updated monmap to its subscribers
- + Pull request 11618
- |\
- | + hammer: ObjectCacher: fix bh_read_finish offset logic
- | + hammer: test: build a correctness test for the ObjectCacher
- | + hammer: test: split objectcacher test into 'stress' and 'correctness'
- | + hammer: test: add a data-storing MemWriteback for testing ObjectCacher
- | + hammer: objectcacher: introduce ObjectCacher::flush_all()
- | + hammer: osd: provide some contents on ObjectExtent usage in testing
- + Pull request 11676
- |\
- | + PG: update PGPool to detect map gaps and reset cached_removed_snaps
- + Pull request 11809
- |\
- | + rgw: handle empty POST condition
- + Pull request 11899
- |\
- | + rgw: fix the field 'total_time' of log entry in log show opt
- + Pull request 11927
- |\
- | + osd/PGBackend: fix collection_list shadow return value
- + Pull request 11929
- |\
- | + os/filestore/FileJournal: fail out if FileJournal is not block device or regular file
- + Pull request 11930
- |\
- | + cephx: Fix multiple segfaults due to attempts to encrypt or decrypt an empty secret and a null CryptoKeyHandler
- + Pull request 11931
- |\
- | + crush/CrushCompiler: error out as long as parse fails
- + Pull request 11932
- |\
- | + Cleanup: delete find_best_info again
- + Pull request 11933
- |\
- | + PG: use upset rather than up for _update_calc_stats
- | + PG: introduce and maintain upset
- + Pull request 11934
- |\
- | + mon/PGMonitor: calc the %USED of pool using used/
- | + mon/PGMonitor: mark dump_object_stat_sum() as static
- + Pull request 11935
- |\
- | + pg: restore correct behavior of read() callers
- + Pull request 11936
- |\
- | + librados: extend remove interface, add flags parameter
- | + osd: Add func has_flag in MOSDOp.
- | + osd: reject PARALLELEXEC ops with EINVAL
- | + ceph_test_rados_api_misc: test rados op with bad flas
- + Pull request 11937
- |\
- | + OSDMonitor::prepare_pgtemp: only update up_thru if newer
- + Pull request 11938
- |\
- | + OpRequest: release the message throttle when unregistered
- + Pull request 11939
- |\
- | + mds: fix out-of-order messages
- + Pull request 11946
- |\
- | + mon: update mon(peon)'s down_pending_out when osd up
- + Pull request 11948
- |\
- | + rbd: this command should be EXPORT_DIFF
- + Pull request 11949
- |\
- | + librbd: block name prefix might overflow fixed size C-string
- + Pull request 11950
- |\
- | + rgw: adjust manifest head object
- | + rgw: adjust objs when copying obj with explicit_objs set
- | + rgw: patch manifest to handle explicit objs copy issue
- + Pull request 11951
- |\
- | + rgw: fix wrong length in Content-Range HTTP header of Swift's DLO.
- | + rgw: fix wrong first byte pos in Content-Range HTTP header of Swift's DLO.
- + Pull request 11952
- |\
- | + rgw-admin: return error on email address conflict
- | + rgw-admin: convert user email addresses to lower case
- + Pull request 12006
- |\
- | + mon: MonmapMonitor: return success when monitor will be removed
- + Pull request 12018
- + librbd: request exclusive lock if current owner cannot execute op
Updated by Nathan Cutler over 7 years ago
rados
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 20)/20 --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
- new bug http://tracker.ceph.com/issues/17955 rados/test_pool_quota.sh fails
- 558016 rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/rados_api_tests.yaml}
- 558140 rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/many.yaml tasks/rados_api_tests.yaml}
- David Zafman found that the test failures are due to the new "pool full" detection/handling in the Objecter: "it's a feature, not a bug" - just need to fix the test
- known bug http://tracker.ceph.com/issues/15139 Command failed on smithi014 with status 1: "sudo yum -y install '' ceph-radosgw"
- 558060
- probably a WONTFIX
- known bug http://tracker.ceph.com/issues/15345 ./common/RWLock.h: In function 'void RWLock::get_write(bool)' ./common/RWLock.h: 97: FAILED assert(r == 0)
- 558164 rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/fastclose.yaml supported/ubuntu_14.04.yaml thrashers/mapgap.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml}
- 558237 rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml thrashers/mapgap.yaml workloads/readwrite.yaml}
- 558253 rados/thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/pool-snaps-few-objects.yaml}
- addressed by staging backport https://github.com/ceph/ceph/pull/12071
Pushed hammer-backports-12071 branch to ceph/ceph and re-running to see if PR#12071 fixes http://tracker.ceph.com/issues/15345:
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports-12071 --machine-type smithi --filter 'rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/fastclose.yaml supported/ubuntu_14.04.yaml thrashers/mapgap.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml},rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml thrashers/mapgap.yaml workloads/readwrite.yaml},rados/thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/pool-snaps-few-objects.yaml}'
Re-running the test_pool_quota.sh failures on hammer-backports-12071-17955 (which includes cherry-pick of 16ead95daa3d1309e8e76e57416b4201e71d0449):
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports-12071-17955 --machine-type smithi --filter 'rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/rados_api_tests.yaml},rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/many.yaml tasks/rados_api_tests.yaml}'
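Both re-runs above pass several job descriptions to a single `--filter` by joining them with commas. A minimal sketch of that joining step, using a hypothetical helper `join_filters`. It only prints an abbreviated command line and does not invoke teuthology.

```shell
# Hypothetical helper: join job descriptions into one --filter value.
join_filters() {
    joined=
    for desc in "$@"; do
        # add a comma only between entries, not before the first one
        joined="${joined:+$joined,}$desc"
    done
    printf '%s\n' "$joined"
}

filter=$(join_filters \
    'rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/rados_api_tests.yaml}' \
    'rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/many.yaml tasks/rados_api_tests.yaml}')

# Print (not run) the command; the value must stay quoted because the
# descriptions contain spaces and braces.
printf '%s\n' "teuthology-suite --suite rados --ceph hammer-backports-12071-17955 --filter '$filter'"
```

Keeping each description in its own single-quoted argument avoids hand-editing one very long string when a job is added to or dropped from the re-run.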
Updated by Nathan Cutler over 7 years ago
RADOS BASELINE
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 20)/20 --suite-branch hammer --email ncutler@suse.cz --ceph hammer --machine-type smithi
fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2016-11-18_09:47:09-rados-hammer---basic-smithi/
- known bug http://tracker.ceph.com/issues/15139 Command failed on smithi014 with status 1: "sudo yum -y install '' ceph-radosgw"
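The rados runs above pass `--subset $(expr $RANDOM % 20)/20`. Assuming the option takes a value of the form `<index>/<pieces>`, with the suite divided into that many pieces and only the piece with the given index scheduled, a random index selects one piece at random. A sketch of how that value is built, with a hypothetical helper `subset_arg`:

```shell
# Hypothetical helper: print "<index>/<pieces>" with a random index.
subset_arg() {
    pieces=$1
    # $RANDOM is provided by bash, ksh and zsh; where it is unset the
    # index falls back to 0.
    index=$(( ${RANDOM:-0} % $pieces ))
    printf '%s/%s\n' "$index" "$pieces"
}

subset_arg 20    # some value from 0/20 to 19/20
```

A larger piece count, such as the 2000 used in later runs, schedules a correspondingly smaller slice of the suite.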
Updated by Nathan Cutler over 7 years ago
powercycle
fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2016-11-18_09:57:30-powercycle-hammer-backports-testing-basic-smithi/
- ceph packages failed to install
- Loic had the same problem with jewel - the issue is that the suite has changed in ceph-qa-suite master, so we need to run with
--suite-branch hammer
Second try:
./virtualenv/bin/teuthology-suite -l2 -v -c hammer-backports -k testing -m smithi -s powercycle -p 1000 --email ncutler@suse.cz --suite-branch hammer
one passed, one dead http://pulpito.front.sepia.ceph.com:80/smithfarm-2016-11-18_15:27:08-powercycle-hammer-backports-testing-basic-smithi/
Repeat the dead test:
./virtualenv/bin/teuthology-suite -l2 -v -c hammer-backports -k testing -m smithi -s powercycle -p 1000 --email ncutler@suse.cz --suite-branch hammer --filter 'powercycle/osd/{clusters/3osd-1per-target.yaml fs/ext4.yaml powercycle/default.yaml tasks/cfuse_workunit_kernel_untar_build.yaml}'
Updated by Nathan Cutler over 7 years ago
fs
teuthology-suite --priority 1000 --suite fs --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
- new bug - won't fix because btrfs http://tracker.ceph.com/issues/17972 ("ENOSPC handling not implemented" followed by OSD assert)
- known bug http://tracker.ceph.com/issues/9501 (OSD crash, possibly a kernel bug in btrfs?)
- known bug http://tracker.ceph.com/issues/10675 (MON clock skew, unreliable NTP server?)
- known bug http://tracker.ceph.com/issues/14716 (MDS crash due to partially backported OSD full handling changes?)
Re-running all five dead and failed jobs:
- known bug - won't fix because btrfs http://tracker.ceph.com/issues/17972 ("ENOSPC handling not implemented" followed by OSD assert)
- ceph-fuse task fails to start http://tracker.ceph.com/issues/12612
- known bug http://tracker.ceph.com/issues/14716 (MDS crash due to partially backported OSD full handling changes?)
Updated by Nathan Cutler over 7 years ago
rgw
teuthology-suite --priority 1000 --suite rgw --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
- known bug http://tracker.ceph.com/issues/13529 http://pulpito.front.sepia.ceph.com/smithfarm-2016-11-18_10:00:20-rgw-hammer-backports---basic-smithi/558612/
- valgrind log can be found at /a/smithfarm-2016-11-18_10:00:20-rgw-hammer-backports---basic-smithi/558612/remote/smithi068/log/valgrind
- fix was staged for backport, but not merged because it is "problematic"
Re-running the failed test 5 times:
Updated by Nathan Cutler over 7 years ago
rbd
teuthology-suite --priority 1000 --suite rbd --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
- unknown bug - environmental noise? OSDs commit suicide - no idea what's going on here
- known bug http://tracker.ceph.com/issues/10773 qemu-iotests failure
Re-running both jobs:
fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2016-11-20_22:58:18-rbd-hammer-backports---basic-smithi/
- known bug http://tracker.ceph.com/issues/10773 qemu-iotests failure
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 9873
- |\
- | + init-radosgw: do not use systemd-run in sysvinit
- + Pull request 10238
- |\
- | + mon/MDSMonitor: fix memory leak in prepare_beacon
- + Pull request 10255
- |\
- | + common: fix race during optracker switches between enabled/disabled mode
- | + ReplicatedPG: clearing a whiteout should create the object
- | + common/TrackedOp: Move tracking_enabled check into register_inflight_op()
- | + common/TrackedOp: Handle dump racing with constructor
- | + common/TrackedOp: Missed locking when examining events
- | + CLEANUP: Move locking into dump_ops_in_flight()/dump_historic_ops()
- | + mds, osd: Fix missing locking for dump_blocked_ops
- | + osd: cleanup: Specify both template types for create_request()
- | + osd: add dump_blocked_ops asok command.
- | + common/TrackedOp: make Tracker can dynamic control.
- | + mds: Make mds can dynamic set optracker via asok.
- | + osd: Make osd can dynamic set optracker via asok.
- | + common/TrackedOp: Should lock ops_history_lock when access shutdown.
- | + osd/ReplicatedPG: for osd_op_create, if ob existed don't do t->touch.
- | + common/TrackedOp: checking in flight ops fix
- | + common/OpTracker: don't dump ops if tracking is not enabled
- | + common/TrackedOp: break out of loop when reaching log threshold
- | + common/TrackedOp: check tracking_enabled for event initiated/done .
- | + common/TrackedOp: clean up code make look good.
- + Pull request 10569
- |\
- | + stop.sh: make more portable
- + Pull request 10582
- |\
- | + osd: Mark child of split temp_created if it was created
- | + os/FileStore: better debug output for destroy_collection
- + Pull request 10724
- |\
- | + crush: reset bucket->h.items[i] when removing tree item
- + Pull request 10871
- |\
- | + qa: update pool quota test for internal retries
- | + hammer: resend writes after pool loses full flag
- | + hammer: drop write if pool is full
- | + hammer: osdc: implement Objecter::osdmap_pool_full
- | + hammer: osdc/Objecter: allow per-pool calls to op_cancel_writes
- + Pull request 10904
- |\
- | + mon: return size_t from MonitorDBStore::Transaction::size()
- + Pull request 10905
- |\
- | + ceph.in: improve the error message
- + Pull request 11045
- |\
- | + 13207: Rados Gateway: Anonymous user is able to read bucket with authenticated read ACL
- + Pull request 11125
- |\
- | + doc: fill keyring with caps before passing it to ceph-monstore-tool
- | + tools/ceph_monstore_tool: bail out if no caps found for a key
- | + tools/ceph_monstore_tool: update pgmap_meta also when rebuilding store.db
- | + tools/rebuild_mondb: kill compiling warning
- | + tools/rebuild_mondb: return error if ondisk version of pg_info is incompatible
- | + tools/rebuild_mondb: avoid unnecessary result code cast
- | + doc: add rados/operations/disaster-recovery.rst
- | + tools/ceph_monstore_tool: add rebuild command
- | + tools/ceph-objectstore-tool: add update-mon-db command
- | + mon/AuthMonitor: make AuthMonitor::IncType public
- + Pull request 11273
- |\
- | + mon: OSDMonitor: Missing nearfull flag set
- + Pull request 11457
- |\
- | + mon: send updated monmap to its subscribers
- + Pull request 11618
- |\
- | + hammer: ObjectCacher: fix bh_read_finish offset logic
- | + hammer: test: build a correctness test for the ObjectCacher
- | + hammer: test: split objectcacher test into 'stress' and 'correctness'
- | + hammer: test: add a data-storing MemWriteback for testing ObjectCacher
- | + hammer: objectcacher: introduce ObjectCacher::flush_all()
- | + hammer: osd: provide some contents on ObjectExtent usage in testing
- + Pull request 11676
- |\
- | + PG: update PGPool to detect map gaps and reset cached_removed_snaps
- + Pull request 11809
- |\
- | + rgw: handle empty POST condition
- + Pull request 11899
- |\
- | + rgw: fix the field 'total_time' of log entry in log show opt
- + Pull request 11927
- |\
- | + osd/PGBackend: fix collection_list shadow return value
- + Pull request 11929
- |\
- | + os/filestore/FileJournal: fail out if FileJournal is not block device or regular file
- + Pull request 11930
- |\
- | + cephx: Fix multiple segfaults due to attempts to encrypt or decrypt an empty secret and a null CryptoKeyHandler
- + Pull request 11931
- |\
- | + crush/CrushCompiler: error out as long as parse fails
- + Pull request 11932
- |\
- | + Cleanup: delete find_best_info again
- + Pull request 11933
- |\
- | + PG: use upset rather than up for _update_calc_stats
- | + PG: introduce and maintain upset
- + Pull request 11934
- |\
- | + mon/PGMonitor: calc the %USED of pool using used/
- | + mon/PGMonitor: mark dump_object_stat_sum() as static
- + Pull request 11935
- |\
- | + pg: restore correct behavior of read() callers
- + Pull request 11936
- |\
- | + librados: extend remove interface, add flags parameter
- | + osd: Add func has_flag in MOSDOp.
- | + osd: reject PARALLELEXEC ops with EINVAL
- | + ceph_test_rados_api_misc: test rados op with bad flas
- + Pull request 11937
- |\
- | + OSDMonitor::prepare_pgtemp: only update up_thru if newer
- + Pull request 11938
- |\
- | + OpRequest: release the message throttle when unregistered
- + Pull request 11939
- |\
- | + mds: fix out-of-order messages
- + Pull request 11946
- |\
- | + mon: update mon(peon)'s down_pending_out when osd up
- + Pull request 11948
- |\
- | + rbd: this command should be EXPORT_DIFF
- + Pull request 11949
- |\
- | + librbd: block name prefix might overflow fixed size C-string
- + Pull request 11950
- |\
- | + rgw: adjust manifest head object
- | + rgw: adjust objs when copying obj with explicit_objs set
- | + rgw: patch manifest to handle explicit objs copy issue
- + Pull request 11951
- |\
- | + rgw: fix wrong length in Content-Range HTTP header of Swift's DLO.
- | + rgw: fix wrong first byte pos in Content-Range HTTP header of Swift's DLO.
- + Pull request 11952
- |\
- | + rgw-admin: return error on email address conflict
- | + rgw-admin: convert user email addresses to lower case
- + Pull request 12006
- |\
- | + mon: MonmapMonitor: return success when monitor will be removed
- + Pull request 12018
- |\
- | + librbd: request exclusive lock if current owner cannot execute op
- + Pull request 12071
- + os/ObjectStore: fix _update_op for split dest_cid
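Comparing this listing with the earlier one by eye is error-prone (pull requests 11628 and 10990 are absent here, and 12071 is new). A small sketch that pulls the pull request numbers out of a saved listing so that two listings can be diffed. The helper name `pr_numbers` and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: read a listing on stdin, print each PR number once.
pr_numbers() {
    sed -n 's/.*Pull request \([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# Usage with two saved listings (file names are placeholders):
#   pr_numbers < listing-before.txt > before.txt
#   pr_numbers < listing-after.txt  > after.txt
#   diff before.txt after.txt
```

Only lines containing the capitalized "Pull request" label are matched, so commit subjects are ignored.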
Updated by Nathan Cutler over 7 years ago
rados
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 2000)/2000 --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
- known bug http://tracker.ceph.com/issues/17955 rados/test_pool_quota.sh fails
(pool quota tests failed because the run somehow picked up the "old" hammer-backports branch, i.e. 2dd92e79592d27661e0b13d1da5522da995a187a which did not include the fix)
Re-running the pool quota tests 10 times:
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports-12071-17955 --machine-type smithi --filter 'rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/rados_api_tests.yaml},rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/many.yaml tasks/rados_api_tests.yaml}' -N 10
And a slightly modified run of the same two tests:
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi --filter 'rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/rados_api_tests.yaml},rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/many.yaml tasks/rados_api_tests.yaml}' -N 5
Updated by Nathan Cutler over 7 years ago
rados - PR#10255
A 5x run to assert that a failure from https://github.com/ceph/ceph/pull/10255#issuecomment-241138225 is not reproducible:
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi --filter 'thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/xfs.yaml msgr-failures/few.yaml thrashers/mapgap.yaml workloads/cache.yaml}' -N 5
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 9873
- |\
- | + init-radosgw: do not use systemd-run in sysvinit
- + Pull request 10238
- |\
- | + mon/MDSMonitor: fix memory leak in prepare_beacon
- + Pull request 10255
- |\
- | + common: fix race during optracker switches between enabled/disabled mode
- | + ReplicatedPG: clearing a whiteout should create the object
- | + common/TrackedOp: Move tracking_enabled check into register_inflight_op()
- | + common/TrackedOp: Handle dump racing with constructor
- | + common/TrackedOp: Missed locking when examining events
- | + CLEANUP: Move locking into dump_ops_in_flight()/dump_historic_ops()
- | + mds, osd: Fix missing locking for dump_blocked_ops
- | + osd: cleanup: Specify both template types for create_request()
- | + osd: add dump_blocked_ops asok command.
- | + common/TrackedOp: make Tracker can dynamic control.
- | + mds: Make mds can dynamic set optracker via asok.
- | + osd: Make osd can dynamic set optracker via asok.
- | + common/TrackedOp: Should lock ops_history_lock when access shutdown.
- | + osd/ReplicatedPG: for osd_op_create, if ob existed don't do t->touch.
- | + common/TrackedOp: checking in flight ops fix
- | + common/OpTracker: don't dump ops if tracking is not enabled
- | + common/TrackedOp: break out of loop when reaching log threshold
- | + common/TrackedOp: check tracking_enabled for event initiated/done .
- | + common/TrackedOp: clean up code make look good.
- + Pull request 10569
- |\
- | + stop.sh: make more portable
- + Pull request 10582
- |\
- | + osd: Mark child of split temp_created if it was created
- | + os/FileStore: better debug output for destroy_collection
- + Pull request 10724
- |\
- | + crush: reset bucket->h.items[i] when removing tree item
- + Pull request 10871
- |\
- | + qa: update pool quota test for internal retries
- | + hammer: resend writes after pool loses full flag
- | + hammer: drop write if pool is full
- | + hammer: osdc: implement Objecter::osdmap_pool_full
- | + hammer: osdc/Objecter: allow per-pool calls to op_cancel_writes
- + Pull request 10904
- |\
- | + mon: return size_t from MonitorDBStore::Transaction::size()
- + Pull request 10905
- |\
- | + ceph.in: improve the error message
- + Pull request 10990
- |\
- | + Merge branch 'fix_bug_EXPORT_DIFF1' of github.com:YankunLi/ceph into fix_bug_EXPORT_DIFF1
- | |\
- | | + Merge branch 'hammer' into fix_bug_EXPORT_DIFF1
- | | + rbd: this command should be EXPORT_DIFF
- | + rbd: this command should be EXPORT_DIFF
- + Pull request 11045
- |\
- | + 13207: Rados Gateway: Anonymous user is able to read bucket with authenticated read ACL
- + Pull request 11125
- |\
- | + doc: fill keyring with caps before passing it to ceph-monstore-tool
- | + tools/ceph_monstore_tool: bail out if no caps found for a key
- | + tools/ceph_monstore_tool: update pgmap_meta also when rebuilding store.db
- | + tools/rebuild_mondb: kill compiling warning
- | + tools/rebuild_mondb: return error if ondisk version of pg_info is incompatible
- | + tools/rebuild_mondb: avoid unnecessary result code cast
- | + doc: add rados/operations/disaster-recovery.rst
- | + tools/ceph_monstore_tool: add rebuild command
- | + tools/ceph-objectstore-tool: add update-mon-db command
- | + mon/AuthMonitor: make AuthMonitor::IncType public
- + Pull request 11273
- |\
- | + mon: OSDMonitor: Missing nearfull flag set
- + Pull request 11457
- |\
- | + mon: send updated monmap to its subscribers
- + Pull request 11618
- |\
- | + hammer: ObjectCacher: fix bh_read_finish offset logic
- | + hammer: test: build a correctness test for the ObjectCacher
- | + hammer: test: split objectcacher test into 'stress' and 'correctness'
- | + hammer: test: add a data-storing MemWriteback for testing ObjectCacher
- | + hammer: objectcacher: introduce ObjectCacher::flush_all()
- | + hammer: osd: provide some contents on ObjectExtent usage in testing
- + Pull request 11676
- |\
- | + PG: update PGPool to detect map gaps and reset cached_removed_snaps
- + Pull request 11809
- |\
- | + rgw: handle empty POST condition
- + Pull request 11899
- |\
- | + rgw: fix the field 'total_time' of log entry in log show opt
- + Pull request 11927
- |\
- | + osd/PGBackend: fix collection_list shadow return value
- + Pull request 11929
- |\
- | + os/filestore/FileJournal: fail out if FileJournal is not block device or regular file
- + Pull request 11930
- |\
- | + cephx: Fix multiple segfaults due to attempts to encrypt or decrypt an empty secret and a null CryptoKeyHandler
- + Pull request 11931
- |\
- | + crush/CrushCompiler: error out as long as parse fails
- + Pull request 11932
- |\
- | + Cleanup: delete find_best_info again
- + Pull request 11933
- |\
- | + PG: use upset rather than up for _update_calc_stats
- | + PG: introduce and maintain upset
- + Pull request 11934
- |\
- | + mon/PGMonitor: calc the %USED of pool using used/
- | + mon/PGMonitor: mark dump_object_stat_sum() as static
- + Pull request 11935
- |\
- | + pg: restore correct behavior of read() callers
- + Pull request 11936
- |\
- | + librados: extend remove interface, add flags parameter
- | + osd: Add func has_flag in MOSDOp.
- | + osd: reject PARALLELEXEC ops with EINVAL
- | + ceph_test_rados_api_misc: test rados op with bad flas
- + Pull request 11937
- |\
- | + OSDMonitor::prepare_pgtemp: only update up_thru if newer
- + Pull request 11938
- |\
- | + OpRequest: release the message throttle when unregistered
- + Pull request 11939
- |\
- | + mds: fix out-of-order messages
- + Pull request 11946
- |\
- | + mon: update mon(peon)'s down_pending_out when osd up
- + Pull request 11948
- |\
- | + rbd: this command should be EXPORT_DIFF
- + Pull request 11949
- |\
- | + librbd: block name prefix might overflow fixed size C-string
- + Pull request 11950
- |\
- | + rgw: adjust manifest head object
- | + rgw: adjust objs when copying obj with explicit_objs set
- | + rgw: patch manifest to handle explicit objs copy issue
- + Pull request 11951
- |\
- | + rgw: fix wrong length in Content-Range HTTP header of Swift's DLO.
- | + rgw: fix wrong first byte pos in Content-Range HTTP header of Swift's DLO.
- + Pull request 11952
- |\
- | + rgw-admin: return error on email address conflict
- | + rgw-admin: convert user email addresses to lower case
- + Pull request 12006
- |\
- | + mon: MonmapMonitor: return success when monitor will be removed
- + Pull request 12018
- |\
- | + librbd: request exclusive lock if current owner cannot execute op
- + Pull request 12071
- + os/ObjectStore: fix _update_op for split dest_cid
Updated by Nathan Cutler over 7 years ago
rados
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 2000)/2000 --suite-branch wip-16225 --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
Although the run is technically a "fail", there were only two failed tests, both caused by the inability to install very old (dumpling-, firefly-era) packages.
Updated by Nathan Cutler over 7 years ago
upgrade
./virtualenv/bin/teuthology-suite -l 2 -k distro --verbose --suite upgrade/hammer --suite-branch hammer --ceph hammer-backports --machine-type vps --priority 1000 --dry-run machine_types/vps.yaml
Plus this one job:
Updated by Nathan Cutler over 7 years ago
powercycle
./virtualenv/bin/teuthology-suite -l2 -v -c hammer-backports -k distro -m smithi -s powercycle -p 90 --email ncutler@suse.cz --suite-branch hammer
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 10582
- |\
- | + osd: Mark child of split temp_created if it was created
- + Pull request 11457
- |\
- | + mon: send updated monmap to its subscribers
- + Pull request 11615
- |\
- | + FileStore:: fix fiemap issue in xfs when #extents > 1364
- + Pull request 11628
- |\
- | + Don't loop forever when reading data from 0 sized segment.
- + Pull request 11936
- + librados: extend remove interface, add flags parameter
- + osd: Add func has_flag in MOSDOp.
- + osd: reject PARALLELEXEC ops with EINVAL
- + ceph_test_rados_api_misc: test rados op with bad flas
Updated by Nathan Cutler over 7 years ago
rados¶
teuthology-suite --priority 1000 --suite rados --subset $(expr $RANDOM % 2000)/2000 --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
- known bug http://tracker.ceph.com/issues/15139 Command failed on smithi014 with status 1: "sudo yum -y install '' ceph-radosgw"
- 575097
- WONTFIX
So, despite the one failure, the run should be considered a pass.
Updated by Nathan Cutler over 7 years ago
rgw¶
teuthology-suite --priority 1000 --suite rgw --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi
Updated by Nathan Cutler over 7 years ago
fs¶
teuthology-suite --priority 101 --suite fs --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports --machine-type smithi --filter 'recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml}'
The aim is to assert that http://tracker.ceph.com/issues/14716 is not triggered.
Unfortunately, the failure is a case of http://tracker.ceph.com/issues/14716
Updated by Nathan Cutler over 7 years ago
fs¶
Pushed hammer-backports-14716 with revert of http://github.com/ceph/ceph/commit/4a36933 which was recently merged to hammer and might be causing this test to fail:
teuthology-suite --priority 101 --suite fs --suite-branch hammer --email ncutler@suse.cz --ceph hammer-backports-14716 --machine-type smithi --filter 'recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml}'
dead http://pulpito.ceph.com:80/smithfarm-2016-11-28_10:51:30-fs-hammer-backports-14716---basic-smithi/
Updated by Nathan Cutler over 7 years ago
- Target version changed from 522 to v0.94.10
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 11615
- |\
- | + FileStore::_do_fiemap: do not reference fiemap after it is freed
- | + FileStore:: fix fiemap issue in xfs when #extents > 1364
- + Pull request 11936
- |\
- | + osd: Add func has_flag in MOSDOp.
- | + osd: reject PARALLELEXEC ops with EINVAL
- | + ceph_test_rados_api_misc: test rados op with bad flas
- + Pull request 12121
- |\
- | + common/TrackedOp: Move tracking_enabled check into register_inflight_op()
- | + common/TrackedOp: Handle dump racing with constructor
- | + common/TrackedOp: Missed locking when examining events
- | + CLEANUP: Move locking into dump_ops_in_flight()/dump_historic_ops()
- | + mds, osd: Fix missing locking for dump_blocked_ops
- | + osd: cleanup: Specify both template types for create_request()
- | + osd: add dump_blocked_ops asok command.
- | + common/TrackedOp: Should lock ops_history_lock when access shutdown.
- | + common/TrackedOp: checking in flight ops fix
- | + common/OpTracker: don't dump ops if tracking is not enabled
- | + common/TrackedOp: check tracking_enabled for event initiated/done .
- | + common/TrackedOp: clean up code make look good.
- + Pull request 12266
- |\
- | + msg/simple/Pipe: handle addr decode error
- + Pull request 12312
- |\
- | + rbd: fix parameter check
- + Pull request 12398
- |\
- | + rgw: do not abort when accept a CORS request with short origin
- + Pull request 12417
- |\
- | + osd: limit omap data in push op
- + Pull request 12418
- |\
- | + rgw: omap_get_all() fixes
- | + rgw/rgw_rados: do not omap_getvals with (u64)-1 max
- + Pull request 12423
- |\
- | + qa/workunits/rbd: removed qemu-iotest case 077
- + Pull request 12446
- |\
- | + librbd: diffs to clone's first snapshot should include parent diffs
- + Pull request 12685
- |\
- | + qa/tasks/workunit: clear clone dir before retrying checkout
- | + qa/tasks/workunit: retry on ceph.git if checkout fails
- | + qa/workunits: include extension for nose tests
- | + qa/workunits: use relative path instead of wget from git
- | + qa/tasks/workunit.py: add CEPH_BASE env var
- | + qa/tasks/workunit: leave workunits inside git checkout
- + Pull request 12687
- |\
- | + mon/OSDMonitor: only show interesting flags in health warning
- + Pull request 12743
- + qa/tasks/ceph.py: populate mnt_point in hammer
Updated by Nathan Cutler over 7 years ago
rados¶
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro
- new bug http://tracker.ceph.com/issues/18382 (Sepia infrastructure related - impossible to install Xenial build)
Re-running with Xenial excluded:
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- new bug http://tracker.ceph.com/issues/18383 (resolved)
Re-pushed new integration branch with https://github.com/ceph/ceph/pull/12743 and running the full 112 jobs again:
fail http://pulpito.ceph.com:80/smithfarm-2017-01-02_15:48:30-rados-wip-hammer-backports-distro-basic-smithi/ (four fail, three dead)
- new bug http://tracker.ceph.com/issues/18393
- rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/few.yaml thrashers/pggrow.yaml
- rados/thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/admin_socket_objecter_requests.yaml}
- new bug http://tracker.ceph.com/issues/18401 (3 dead jobs)
- rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/few.yaml supported/centos_7.2.yaml thrashers/default.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml}
- rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/fastclose.yaml supported/centos_7.2.yaml thrashers/morepggrow.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml}
- rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/osd-delay.yaml supported/centos_7.2.yaml thrashers/default.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml}
- failed jobs - not expected to succeed because Shaman does not build old (dumpling, firefly) branches
Updated by Nathan Cutler over 7 years ago
rgw¶
teuthology-suite --priority 101 --suite rgw --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- new bug http://tracker.ceph.com/issues/18384 (cannot clone s3-tests.git)
Re-running failed tests with --suite-repo https://github.com/ceph/ceph.git --suite-branch hammer (to avoid issue#18384):
./virtualenv/bin/teuthology-suite --priority 101 --suite rgw --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro --suite-repo https://github.com/ceph/ceph.git --suite-branch hammer --filter="$filter" qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
rbd¶
teuthology-suite --priority 101 --suite rbd --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
fail http://pulpito.ceph.com:80/smithfarm-2017-01-02_16:38:11-rbd-wip-hammer-backports-distro-basic-smithi/ (two fail, one dead)
- new bug http://tracker.ceph.com/issues/18388 - two failed jobs involving test_lock_fence.sh
- rbd/qemu/{cache/writethrough.yaml cachepool/small.yaml clusters/{fixed-3.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml workloads/qemu_xfstests.yaml} (one dead job)
- OSD committed suicide (btrfs, maybe it can be ignored?)
Re-running dead and failed jobs with fix for issue#18388
teuthology-suite --priority 101 --suite rbd --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro --filter="$filter" qa/distros/all/ubuntu_14.04.yaml
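The value of $filter is not recorded in this note. A minimal sketch of how it might be built, assuming --filter takes a comma-separated list of strings that are matched against job descriptions. The first description is the dead job listed above; the second one and the helper name join_with_commas are hypothetical.

```shell
# Join all arguments with commas. The subshell keeps the IFS change local.
join_with_commas() {
  ( IFS=,; printf '%s\n' "$*" )
}

filter=$(join_with_commas \
  'rbd/qemu/{cache/writethrough.yaml cachepool/small.yaml clusters/{fixed-3.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml workloads/qemu_xfstests.yaml}' \
  'rbd/example/{hypothetical-job.yaml}')

# Print the re-run command instead of executing it.
echo teuthology-suite --priority 101 --suite rbd --ceph wip-hammer-backports \
  --machine-type smithi -k distro --filter="$filter" qa/distros/all/ubuntu_14.04.yaml
```

Joining on commas only works as long as no job description itself contains a comma, which holds for the descriptions quoted in this issue.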
Updated by Nathan Cutler over 7 years ago
fs¶
teuthology-suite --priority 101 --suite fs --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
fail http://pulpito.ceph.com:80/smithfarm-2017-01-02_16:39:27-fs-wip-hammer-backports-distro-basic-smithi/ (one dead job)
- fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml}
- MDS says "mds.0.log unhandled error (28) No space left on device, shutting down..." and commits suicide
- matches http://tracker.ceph.com/issues/14716 exactly
Updated by Nathan Cutler over 7 years ago
powercycle¶
./virtualenv/bin/teuthology-suite -l2 -v -c wip-hammer-backports -k distro -m smithi -s powercycle -p 101 --email ncutler@suse.cz
- known bug http://tracker.ceph.com/issues/18382 (Sepia infrastructure related - impossible to install Xenial build)
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 11615
- |\
- | + FileStore::_do_fiemap: do not reference fiemap after it is freed
- | + FileStore:: fix fiemap issue in xfs when #extents > 1364
- + Pull request 11936
- |\
- | + osd: Add func has_flag in MOSDOp.
- | + osd: reject PARALLELEXEC ops with EINVAL
- | + ceph_test_rados_api_misc: test rados op with bad flas
- + Pull request 12121
- |\
- | + common/TrackedOp: Move tracking_enabled check into register_inflight_op()
- | + common/TrackedOp: Handle dump racing with constructor
- | + common/TrackedOp: Missed locking when examining events
- | + CLEANUP: Move locking into dump_ops_in_flight()/dump_historic_ops()
- | + mds, osd: Fix missing locking for dump_blocked_ops
- | + osd: cleanup: Specify both template types for create_request()
- | + osd: add dump_blocked_ops asok command.
- | + common/TrackedOp: Should lock ops_history_lock when access shutdown.
- | + common/TrackedOp: checking in flight ops fix
- | + common/OpTracker: don't dump ops if tracking is not enabled
- | + common/TrackedOp: check tracking_enabled for event initiated/done .
- | + common/TrackedOp: clean up code make look good.
- + Pull request 12266
- |\
- | + msg/simple/Pipe: handle addr decode error
- + Pull request 12312
- |\
- | + rbd: fix parameter check
- + Pull request 12398
- |\
- | + rgw: do not abort when accept a CORS request with short origin
- + Pull request 12417
- |\
- | + osd: limit omap data in push op
- + Pull request 12418
- |\
- | + rgw: omap_get_all() fixes
- | + rgw/rgw_rados: do not omap_getvals with (u64)-1 max
- + Pull request 12423
- |\
- | + qa/workunits/rbd: removed qemu-iotest case 077
- + Pull request 12446
- |\
- | + librbd: diffs to clone's first snapshot should include parent diffs
- + Pull request 12619
- |\
- | + rgw: TempURL in radosgw behaves now like its Swift's counterpart.
- + Pull request 12685
- |\
- | + tests: rbd/test_lock_fence.sh: fix rbdrw.py relative path
- | + qa/tasks/workunit: clear clone dir before retrying checkout
- | + qa/tasks/workunit: retry on ceph.git if checkout fails
- | + qa/workunits: include extension for nose tests
- | + qa/workunits: use relative path instead of wget from git
- | + qa/tasks/workunit.py: add CEPH_BASE env var
- | + qa/tasks/workunit: leave workunits inside git checkout
- + Pull request 12687
- |\
- | + mon/OSDMonitor: only show interesting flags in health warning
- + Pull request 12744
- + use ceph-master branch for s3tests
Updated by Nathan Cutler over 7 years ago
rgw¶
teuthology-suite --priority 101 --suite rgw --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
powercycle¶
./virtualenv/bin/teuthology-suite -l2 -v -c wip-hammer-backports -k distro -m smithi -s powercycle -p 101 --email ncutler@suse.cz qa/distros/all/ubuntu_14.04.yaml
- known bug http://tracker.ceph.com/issues/18393
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 11615
- |\
- | + FileStore::_do_fiemap: do not reference fiemap after it is freed
- | + FileStore:: fix fiemap issue in xfs when #extents > 1364
- + Pull request 11936
- |\
- | + osd: Add func has_flag in MOSDOp.
- | + osd: reject PARALLELEXEC ops with EINVAL
- | + ceph_test_rados_api_misc: test rados op with bad flas
- + Pull request 12121
- |\
- | + common/TrackedOp: Move tracking_enabled check into register_inflight_op()
- | + common/TrackedOp: Handle dump racing with constructor
- | + common/TrackedOp: Missed locking when examining events
- | + CLEANUP: Move locking into dump_ops_in_flight()/dump_historic_ops()
- | + mds, osd: Fix missing locking for dump_blocked_ops
- | + osd: cleanup: Specify both template types for create_request()
- | + osd: add dump_blocked_ops asok command.
- | + common/TrackedOp: Should lock ops_history_lock when access shutdown.
- | + common/TrackedOp: checking in flight ops fix
- | + common/OpTracker: don't dump ops if tracking is not enabled
- | + common/TrackedOp: check tracking_enabled for event initiated/done .
- | + common/TrackedOp: clean up code make look good.
- + Pull request 12266
- |\
- | + msg/simple/Pipe: handle addr decode error
- + Pull request 12417
- |\
- | + osd: limit omap data in push op
- + Pull request 12418
- |\
- | + rgw: omap_get_all() fixes
- | + rgw/rgw_rados: do not omap_getvals with (u64)-1 max
- + Pull request 12423
- |\
- | + qa/workunits/rbd: removed qemu-iotest case 077
- + Pull request 12619
- |\
- | + rgw: TempURL in radosgw behaves now like its Swift's counterpart.
- + Pull request 12685
- |\
- | + tests: rbd/test_lock_fence.sh: fix rbdrw.py relative path
- | + qa/tasks/workunit: clear clone dir before retrying checkout
- | + qa/tasks/workunit: retry on ceph.git if checkout fails
- | + qa/workunits: include extension for nose tests
- | + qa/workunits: use relative path instead of wget from git
- | + qa/tasks/workunit.py: add CEPH_BASE env var
- | + qa/tasks/workunit: leave workunits inside git checkout
- + Pull request 12687
- |\
- | + mon/OSDMonitor: only show interesting flags in health warning
- + Pull request 12744
- |\
- | + use ceph-master branch for s3tests
- + Pull request 12759
- |\
- | + qa/tasks/admin_socket: subst in repo name
- + Pull request 12762
- + qa/distros: add centos yaml; use that instead
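Since the integration branch is re-pushed several times, it can help to see which pull requests were dropped or added between two of these lists. This is a sketch only, using short excerpts of the previous list and the one above rather than the full output; the function name pr_numbers and the variable names are made up for the example.

```shell
# Pull the PR numbers out of a rendered list and sort them for comm.
pr_numbers() {
  sed -n 's/.*Pull request \([0-9][0-9]*\).*/\1/p' | sort -u
}

# Excerpts only: three entries from the previous list, three from this one.
old_list='- + Pull request 12312
- + Pull request 12423
- + Pull request 12744'
new_list='- + Pull request 12423
- + Pull request 12744
- + Pull request 12759'

old_file=$(mktemp)
new_file=$(mktemp)
printf '%s\n' "$old_list" | pr_numbers > "$old_file"
printf '%s\n' "$new_list" | pr_numbers > "$new_file"

# comm -23: lines only in the old list; comm -13: lines only in the new list.
dropped=$(comm -23 "$old_file" "$new_file")
added=$(comm -13 "$old_file" "$new_file")
rm -f "$old_file" "$new_file"

echo "dropped: $dropped"
echo "added: $added"
```

With the full lists as input, the same two comm calls would report every pull request that differs between the two pushes.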
Updated by Nathan Cutler over 7 years ago
rados¶
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- failed jobs - not expected to succeed (?) but fix might be feasible, see http://tracker.ceph.com/issues/18069#note-7
- rados/singleton-nomsgr/{all/11429.yaml}
- rados/singleton-nomsgr/{all/13234.yaml}
- Jobs that need to be re-run without "qa/distros/all/ubuntu_14.04.yaml":
- rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/fastclose.yaml supported/centos.yaml thrashers/pggrow.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml}
- rados/thrash-erasure-code-isa/{arch/x86_64.yaml clusters/{fixed-2.yaml openstack.yaml} fs/ext4.yaml msgr-failures/osd-delay.yaml supported/centos.yaml thrashers/mapgap.yaml workloads/ec-rados-plugin=isa-k=2-m=1.yaml}
- Jobs that need to be re-run with "qa/distros/all/ubuntu_14.04.yaml":
- One job stuck in "Queued" state
Re-running two failed (centos-only) jobs without "qa/distros/all/ubuntu_14.04.yaml":
one pass, one stuck in "queued" state http://pulpito.ceph.com:80/smithfarm-2017-01-04_19:05:09-rados-wip-hammer-backports-distro-basic-smithi/
Re-running one failed job with "qa/distros/all/ubuntu_14.04.yaml":
Re-running one job that seemed to get stuck in "Queued" state:
Re-running last job that got stuck in "Queued" state:
Updated by Nathan Cutler over 7 years ago
rgw¶
teuthology-suite --priority 101 --suite rgw --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
rbd¶
teuthology-suite --priority 101 --suite rbd --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
powercycle¶
./virtualenv/bin/teuthology-suite -l2 -v -c wip-hammer-backports -k distro -m smithi -s powercycle -p 101 --email ncutler@suse.cz qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
fs¶
teuthology-suite --priority 101 --suite fs --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- known bug http://tracker.ceph.com/issues/14716
Though technically "fail", this run is actually "pass" because the only failure can safely be ignored.
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 12227
- + rgw: fix osd crashes when execute radosgw-admin bi list --max-entries=1 command
- + rgw: use hammer rgw_obj_key api
- + Revert rgw: rgw_obj encoding fixes
- + rgw_admin: add bi purge command
- + rgw: bucket resharding, adjust logging
- + cls/rgw: bi_list() fix is_truncated returned param
- + rgw_admin: require --yes-i-really-mean-it for bucket reshard
- + rgw_admin: better bucket reshard logging
- + rgw: limit bucket reshard num shards to max possible
- + rgw_admin: fix bi list command
- + rgw_admin: use aio operations for bucket resharding
- + rgw: bucket reshard updates stats
- + cls/rgw: add bucket_update_stats method
- + rgw_admin: reshard also links to new bucket instance
- + rgw: rgw_link_bucket, use correct bucket structure for entry point
- + radosgw-admin: bucket reshard needs --num-shards to be specified
- + cls/rgw: fix bi_list objclass command
- + rgw_admin: bucket rehsrading, initial work
- + rgw: rgw_obj encoding fixes
- + rgw: utilities to support raw bucket index operations
- + rgw: use bucket_info.bucket_id instead of marker where needed
- + cls/rgw: utilities to support raw bucket index operations
Updated by Nathan Cutler over 7 years ago
rgw¶
teuthology-suite --priority 101 --suite rgw --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
fail http://pulpito.ceph.com:80/smithfarm-2017-01-05_20:47:44-rgw-wip-hammer-backports-distro-basic-smithi/ (one failed job)
- infrastructure noise
2017-01-05T21:28:07.185 INFO:teuthology.task.ansible.out:failed: [smithi052.front.sepia.ceph.com] (item=http://download.ceph.com/keys/autobuild.asc) => {"failed": true, "item": "http://download.ceph.com/keys/autobuild.asc", "msg": "Failed to download key at http://download.ceph.com/keys/autobuild.asc: Request failed: <urlopen error [Errno 101] Network is unreachable>"}
Re-running the failed job:
teuthology-suite --priority 101 --suite rgw --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro --filter="rgw/multifs/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml}" --dry-run qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 12805
- |\
- | + ceph-create-keys: wait 10 minutes to get or create the bootstrap key, not forever
- | + ceph-create-keys: wait 10 minutes to get or create a key, not forever
- | + ceph-create-keys: wait for quorum for ten minutes, not forever
- + Pull request 12819
- |\
- | + os/filestore: FALLOC_FL_PUNCH_HOLE must be used with FALLOC_FL_KEEP_SIZE
- + Pull request 12824
- |\
- | + tests: subst repo and branch in qemu test urls
- | + tests: subst branch and repo in qa/tasks/qemu.py
- | + tests: subst repo name in qa/tasks/cram.py
- | + cram: support fetching from sha1 branch, tag, commit hash
- + Pull request 12906
- + PG: fix cached_removed_snaps bug in PGPool::update after map gap
Updated by Nathan Cutler over 7 years ago
rados¶
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- failed jobs - not expected to succeed (?) but fix might be feasible, see http://tracker.ceph.com/issues/18069#note-7
- rados/singleton-nomsgr/{all/11429.yaml}
- rados/singleton-nomsgr/{all/13234.yaml}
- plus three tests that need to be re-run without "qa/distros/all/ubuntu_14.04.yaml" because teuthology thinks they should be run on "centos 14.04"
Re-running the latter three failed tests without qa/distros/all/ubuntu_14.04.yaml
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro --filter="$filter"
Updated by Nathan Cutler over 7 years ago
powercycle¶
./virtualenv/bin/teuthology-suite -l2 -v -c wip-hammer-backports -k distro -m smithi -s powercycle -p 101 --email ncutler@suse.cz qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 12824
- |\
- | + tests: subst repo and branch in qemu test urls
- | + tests: subst branch and repo in qa/tasks/qemu.py
- | + tests: subst repo name in qa/tasks/cram.py
- | + cram: support fetching from sha1 branch, tag, commit hash
- + Pull request 13022
- + qa: update remaining ceph.com to download.ceph.com
Updated by Nathan Cutler over 7 years ago
rbd¶
teuthology-suite --priority 101 --suite rbd --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
Updated by Nathan Cutler over 7 years ago
fs¶
teuthology-suite --priority 101 --suite fs --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml - Test failure: test_client_pin_root (tasks.mds_client_limits.TestClientLimits) due to "TypeError: string indices must be integers" in cephfs_test_case.py, _session_by_id() does not validate its arguments
- fs/thrash/{ceph-thrash/default.yaml ceph/base.yaml clusters/mds-1active-1standby.yaml debug/mds_client.yaml fs/btrfs.yaml msgr-failures/none.yaml overrides/whitelist_wrongly_marked_down.yaml tasks/cfuse_workunit_suites_pjd.yaml} - somehow btrfs tests are still being run
- fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml} - not expected to pass
Updated by Nathan Cutler over 7 years ago
git --no-pager log --format='%H %s' --graph ceph/hammer..wip-hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
Updated by Nathan Cutler over 7 years ago
fs¶
teuthology-suite --priority 101 --suite fs --subset $(expr $RANDOM % 5)/5 --email ncutler@suse.cz --ceph wip-hammer-backports --machine-type smithi -k distro qa/distros/all/ubuntu_14.04.yaml
- The only failure is fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml} which was not expected to pass
Ruled a pass
Updated by Yuri Weinstein over 7 years ago
QE VALIDATION (STARTED 1/23/17)¶
Re-run command lines and filters are captured in http://pad.ceph.com/p/hammer_v0.94.10_QE_validation_notes
COMMAND LINE => CEPH_BRANCH=83af8cdaaa6d94404e6146b68e532a784e3cc99c; MACHINE_NAME=vps; teuthology-suite -v -S $CEPH_BRANCH -m $MACHINE_NAME -k distro -s rados -e $CEPH_QA_EMAIL --suite-branch hammer
(Note: PASSED / FAILED - indicates "TEST IS IN PROGRESS")
None of the suites will be tested on Ubuntu 16.04 because of http://tracker.ceph.com/issues/18382
The following tests are not expected to pass:
- fs/recovery/{clusters/2-remote-clients.yaml debug/mds_client.yaml mounts/ceph-fuse.yaml tasks/mds-full.yaml} -> http://tracker.ceph.com/issues/14716
- rados/singleton-nomsgr/{all/11429.yaml}
- rados/singleton-nomsgr/{all/13234.yaml}
| Suite | Runs/Reruns | Notes/Issues |
| fs ubuntu | http://pulpito.ceph.com/teuthology-2017-01-26_18:02:24-fs-master-distro-basic-smithi/ | FAILED looks like the same in v0.94.8 http://tracker.ceph.com/issues/15895#note-33 approved by John "The MDL_WriteError one is the bug from when ENOSPC stuff was backported wrongly, the FileJournal/FileStore assertions are from btrfs." |
| | http://pulpito.ceph.com/teuthology-2017-01-28_16:31:30-fs-hammer---basic-smithi/ | |
| | http://pulpito.ceph.com/teuthology-2017-01-31_17:18:26-fs-hammer-distro-basic-smithi/ | run off gitbuilders |
| fs centos | http://pulpito.ceph.com/teuthology-2017-01-26_18:03:38-fs-master-distro-basic-smithi/ | FAILED noise? approved by John |
| | http://pulpito.ceph.com/teuthology-2017-01-28_16:32:30-fs-hammer---basic-smithi/ | |
| krbd ubuntu | http://pulpito.ceph.com/teuthology-2017-01-26_20:58:56-krbd-master-testing-basic-smithi/ | FAILED looks like the same in v0.94.8 http://tracker.ceph.com/issues/15895#note-33 approved by Ilya |
| | http://pulpito.ceph.com/teuthology-2017-01-28_16:38:13-krbd-hammer-testing-basic-smithi/ | |
| krbd centos | http://pulpito.ceph.com/teuthology-2017-01-26_21:01:15-krbd-master-testing-basic-smithi/ | FAILED looks like the same in v0.94.8 http://tracker.ceph.com/issues/15895#note-33 approved by Ilya |
| | http://pulpito.ceph.com/teuthology-2017-01-28_16:38:58-krbd-hammer-testing-basic-smithi/ | |
| samba ubuntu | http://pulpito.ceph.com/teuthology-2017-01-26_21:11:17-samba-master-distro-basic-smithi/ | FAILED shaman build issues approved by Sage |
| | http://pulpito.ceph.com/teuthology-2017-01-30_18:15:33-samba-hammer-distro-basic-smithi/ | run off gitbuilders, packaging issues |
| samba centos | http://pulpito.ceph.com/teuthology-2017-01-26_21:11:58-samba-master-distro-basic-smithi/ | FAILED shaman build issues approved by Sage |
| ceph-deploy centos | http://pulpito.ceph.com/teuthology-2017-01-26_21:15:47-ceph-deploy-master-distro-basic-vps/ | FAILED same as 0.94.08 see #15895 |
| upgrade/client-upgrade ubuntu | http://pulpito.ceph.com/teuthology-2017-01-26_21:29:02-upgrade:client-upgrade-master-distro-basic-vps/ | |
| upgrade/client-upgrade centos | http://pulpito.ceph.com/teuthology-2017-01-26_21:30:09-upgrade:client-upgrade-master-distro-basic-vps/ | |
| | http://pulpito.ceph.com/teuthology-2017-01-30_20:48:08-upgrade:client-upgrade-hammer-distro-basic-vps/ | run off gitbuilders a must! PASSED |
| upgrade/firefly-x (hammer) | http://pulpito.ceph.com/teuthology-2017-01-30_20:52:09-upgrade:firefly-x-hammer-distro-basic-vps/ | FAILED run off gitbuilders, approved by Sage |
| | http://pulpito.ceph.com/teuthology-2017-01-31_16:19:21-upgrade:firefly-x-hammer---basic-smithi/ | |
| upgrade/hammer-x (jewel) | http://pulpito.ceph.com/teuthology-2017-02-07_18:15:23-upgrade:hammer-x-jewel-distro-basic-vps/ | PASSED |
| | http://pulpito.ceph.com/yuriw-2017-02-08_17:22:47-upgrade:hammer-x-jewel---basic-smithi/ | run off gitbuilders |
Suite | Runs/Reruns | Notes/Issues |
PASSED / FAILED | ||
centos
Updated by Nathan Cutler over 7 years ago
rgw valgrind issues bisect¶
ubuntu: ./virtualenv/bin/teuthology-suite -k distro --priority 101 --suite rgw/verify --email ncutler@suse.com --ceph hammer --machine-type smithi --ceph-repo http://github.com/ceph/ceph.git --suite-repo http://github.com/ceph/ceph.git --filter="valgrind" qa/distros/all/ubuntu_14.04.yaml
centos 7.3: ./virtualenv/bin/teuthology-suite -k distro --priority 101 --suite rgw/verify --email ncutler@suse.com --ceph hammer --machine-type smithi --ceph-repo http://github.com/ceph/ceph.git --suite-repo http://github.com/ceph/ceph.git --filter="valgrind" ~/os_centos.yaml
Tip of hammer ("bad" version for bisect) 83af8cdaaa6d94404e6146b68e532a784e3cc99c
Ubuntu 14.04
CentOS 7.3
- fail 7.3 http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-01-29_17:36:17-rgw:verify-hammer-distro-basic-smithi/
- all 24 jobs failed
0.94.9 (hopefully "good" version for bisect) wip-hammer-0-94-9
./virtualenv/bin/teuthology-suite -k distro --priority 101 --suite rgw/verify --email ncutler@suse.com --ceph wip-hammer-0-94-9 --machine-type smithi --ceph-repo http://github.com/ceph/ceph-ci.git --suite-branch hammer --suite-repo http://github.com/ceph/ceph.git --filter="valgrind" ~/os_centos.yaml
- fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-01-29_23:10:19-rgw:verify-wip-hammer-0-94-9-distro-basic-smithi/
- all 24 jobs failed
Since the problem is present in the 0.94.9 release, we can be sure it's not due to a regression introduced in 0.94.10.
Updated by Nathan Cutler about 7 years ago
(04:03:19 PM) sage: smithfarm: i think we can ignore hte notcmalloc hammer stuff and move on...
(04:03:59 PM) smithfarm: sage: ok, so afaict that means hammer 0.94.10 has passed qE
(04:04:33 PM) smithfarm: unless you think we should fix upgrade/firefly-x
(04:04:36 PM) sage: yeah i think so
(04:04:37 PM) sage: nope :)
Updated by Abhishek Lekshmanan about 7 years ago
- Status changed from In Progress to Resolved