Tasks #19538
Status: Closed
Target version: jewel v10.2.8
% Done: 0%
Description
Workflow
- Preparing the release
- Nathan patches upgrade/jewel-x/point-to-point-x to do 10.2.0 -> current Jewel point release -> x - SKIPPED
- Cutting the release
- Loic asks Abhishek L. if a point release should be published - YES
- Loic gets approval from all leads
- Yehuda, rgw - YES
- John, CephFS - YES
- Jason, RBD - YES
- Josh, rados - YES
- Abhishek L. writes and commits the release notes - https://github.com/ceph/ceph/pull/16274 (merged)
- Nathan informs Yuri that the branch is ready for testing - DONE June 30th on ceph-devel ML
- Yuri runs additional integration tests - DONE
- If Yuri discovers new bugs that need to be backported urgently (i.e. their priority is set to Urgent or Immediate), the release goes back to being prepared; it was not ready after all
- Yuri informs Alfredo that the branch is ready for release - DONE
- Alfredo creates the packages and sets the release tag - DONE
- Abhishek L. posts release announcement on https://ceph.com/community/blog - DONE
- Abhishek L. sends release announcement to community mailing lists
- Abhishek L. informs Patrick M. about the release so he can tweet about it
Release information
- branch to build from: jewel, commit: 66dbf9beef04988dbd3653591e51afa6d84e3990
- version: v10.2.8
- type of release: point release
- where to publish the release: http://download.ceph.com/debian-jewel and http://download.ceph.com/rpm-jewel
Updated by Nathan Cutler about 7 years ago
- Target version changed from 536 to v10.2.8
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13107
- |\
- | + librbd: improve debug logging for lock / watch state machines
- | + test: use librados API to retrieve config params
- + Pull request 13154
- |\
- | + librbd: possible deadlock with flush if refresh in-progress
- + Pull request 13212
- |\
- | + test/osd: add test for fast mark down functionality
- | + msg/async: implement ECONNREFUSED detection
- | + messages/MOSDFailure.h: distinguish between timeout and immediate failure
- | + OSD: Implement ms_handle_refused
- | + msg/simple: add ms_handle_refused callback
- | + AsyncConnection: fix delay state using dispatch_queue
- | + AsyncConnection: need to prepare message when features mismatch
- | + AsyncConnection: continue to read when meeting EINTR
- | + AsyncConnection: release dispatch throttle with fast dispatch message
- | + DispatchQueue: remove pipe words
- | + DispatchQueue: add name to separte different instance
- | + AsyncConnection: add DispathQueue throttle
- | + AsyncConnection: change all exception deliver to DispatchQueue
- | + AsyncConnection: make local message deliver via DispatchQueue
- | + AsyncMessenger: introduce DispatchQueue to separate nonfast message
- | + DispatchQueue: move dispatch_throtter from SimpleMessenger to DispatchQueue
- | + DispatchQueue: Move from msg/simple to msg
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 13244
- |\
- | + osdc: cache should ignore error bhs during trim
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 13261
- |\
- | + mon/OSDMonitor: make 'osd crush move ...' work on osds
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13477
- |\
- | + ceph-osd: --flush-journal: sporadic segfaults on exit
- + Pull request 13489
- |\
- | + ceph-disk: Fix getting wrong group name when --setgroup in bluestore
- + Pull request 13492
- |\
- | + systemd: Start OSDs after MONs
- + Pull request 13541
- |\
- | + osd/PG: restrict want_acting to up+acting on recovery completion
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13552
- |\
- | + rgw: assume obj write is a first write
- + Pull request 13585
- |\
- | + msg/simple: set close on exec on server sockets
- | + msg/async: set close on exec on server sockets
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13732
- |\
- | + PendingReleaseNotes: warning about 'osd rm ...' and #19119
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13786
- |\
- | + build/ops: add psmisc dependency to ceph-base
- + Pull request 13787
- |\
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 13788
- |\
- | + os/filestore/HashIndex: be loud about splits
- + Pull request 13809
- |\
- | + librbd: remove image header lock assertions
- + Pull request 13827
- |\
- | + osd/ReplicatedPG: try with pool's use-gmt setting if hitset archive not found
- + Pull request 13831
- |\
- | + server: negative error code when responding to client
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 13932
- |\
- | + rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly
- + Pull request 14044
- |\
- | + os/filestore: fix clang static check warn use-after-free
- + Pull request 14047
- |\
- | + jewel: osd/PGLog: reindex properly on pg log split
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14070
- |\
- | + Revert dummy: reduce run time, run user.yaml playbook
- + Pull request 14083
- |\
- | + doc: update description of rbdmap unmap[-all] behaviour
- | + doc: add verbiage to rbdmap manpage
- | + rbdmap: unmap RBDMAPFILE images unless called with unmap-all
- + Pull request 14112
- |\
- | + brag: count the number of mds in fsmap not in mdsmap
- | + brag: Assume there are 0 MDS instead of crashing when data is missing
- + Pull request 14113
- |\
- | + tools/rados: Check return value of connect
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14148
- |\
- | + rbd: destination pool should be source pool if it is not specified
- + Pull request 14150
- |\
- | + librbd: avoid possible recursive lock when racing acquire lock
- + Pull request 14152
- |\
- | + librbd: Include WorkQueue.h since we use it
- + Pull request 14154
- |\
- | + qa/workunits/rbd: resolve potential rbd-mirror race conditions
- + Pull request 14181
- |\
- | + osd: bypass readonly ops when osd full.
- + Pull request 14236
- |\
- | + mon: remove bad rocksdb option
- + Pull request 14324
- |\
- | + common: fix segfault in public IPv6 addr picking
- + Pull request 14325
- |\
- | + osd: Calculate degraded and misplaced more accurately
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 14329
- |\
- | + ceph-disk: Adding retry loop in get_partition_dev()
- | + ceph-disk: Reporting /sys directory in get_partition_dev()
- + Pull request 14368
- |\
- | + jewel: rgw: fix listing of objects that start with underscore
- + Pull request 14371
- |\
- | + qa/tasks/workunit.py: use overrides as the default settings of workunit
- | + tasks/workunit.py: specify the branch name when cloning a branch
- | + tasks/workunit.py: when cloning, use --depth=1
- + Pull request 14377
- + rgw_file: fix missing unlock in unlink
- + rgw_file: implement reliable has-children check
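For readability, here is an approximate Python re-implementation of the perl one-liner that generated the list above. It is a sketch for reference only (the actual list was produced by the perl filter); `textile_line` is a hypothetical helper name.

```python
import re

def textile_line(line: str) -> str:
    """Convert one line of `git log --format='%H %s' --graph` output into a
    Textile bullet, mirroring the perl filter: merge commits become links to
    their pull request, other commits become links to their SHA."""
    line = line.replace('"', ' ')                     # s/"/ /g
    merge = re.search(r'\w+\s+Merge pull request #(\d+)', line)
    if merge:
        pr = merge.group(1)
        line = re.sub(r'\w+\s+Merge pull request #\d+.*',
                      lambda _m: f'"Pull request {pr}":https://github.com/ceph/ceph/pull/{pr}',
                      line)
    else:
        # First \w+ run is the SHA; the graph characters (| \ *) are not \w.
        line = re.sub(r'(\w+)\s+(.*)',
                      lambda m: f'"{m.group(2)}":https://github.com/ceph/ceph/commit/{m.group(1)}',
                      line, count=1)
    line = line.replace('*', '+', 1)                  # s/\*/+/
    return '* ' + line                                # s/^/* /
```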
Updated by Nathan Cutler about 7 years ago
- Status changed from New to In Progress
Updated by Nathan Cutler about 7 years ago
rados
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail (5 fail, 222 passed, 227 total) http://pulpito.ceph.com:80/smithfarm-2017-04-07_07:52:35-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 5 failed jobs:
- 3 pass, 2 fail http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:21:35-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 2 failed jobs 5 times each:
- 9 fail, 1 pass http://pulpito.ceph.com:80/smithfarm-2017-04-09_20:57:12-rados-wip-jewel-backports-distro-basic-smithi/
Problematic jobs are:
- fails every time:
rados/verify/{1thrash/default.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}
- fails almost every time:
rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}
Of these, Josh thinks only the async messenger-related valgrind issues are a real issue - these might be caused by https://github.com/ceph/ceph/pull/13585 or https://github.com/ceph/ceph/pull/13212
Updated by Nathan Cutler about 7 years ago
powercycle
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 1000 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade jewel point-to-point-x
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 1000 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- fail (2 dead, 15 pass, out of 17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-07_08:06:45-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Re-running the 2 dead jobs:
- 1 pass, 1 fail http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:19:31-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- failure is http://tracker.ceph.com/issues/19556
Re-running the last remaining failed job 5 times:
- 1 fail, 1 dead, 3 pass http://pulpito.ceph.com:80/smithfarm-2017-04-09_17:36:23-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Pushed https://github.com/ceph/ceph-ci/commit/d49d11e714020220e49949c591b0743538212beb to fix http://tracker.ceph.com/issues/19556
Ruled a pass
Updated by Nathan Cutler about 7 years ago
ceph-disk
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
fs
teuthology-suite -k distro --priority 1000 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail (2 failed, 85 passed, 87 total) http://pulpito.ceph.com/smithfarm-2017-04-07_08:09:44-fs-wip-jewel-backports-distro-basic-smithi/
- both failures are btrfs
Re-running the 2 failed jobs:
- fail (1 fail, 1 pass, 2 total) http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:23:36-fs-wip-jewel-backports-distro-basic-smithi/
fs/basic/{clusters/fixed-2-ucephfs.yaml debug/mds_client.yaml dirfrag/frag_enable.yaml fs/btrfs.yaml inline/yes.yaml overrides/whitelist_wrongly_marked_down.yaml tasks/cfuse_workunit_suites_ffsb.yaml}
- assert
2017-04-09T13:40:44.797 INFO:tasks.ceph.osd.3.smithi059.stderr:os/filestore/FileStore.cc: In function 'void FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f65124fc700 time 2017-04-09 13:40:44.804790
2017-04-09T13:40:44.797 INFO:tasks.ceph.osd.3.smithi059.stderr:os/filestore/FileStore.cc: 2920: FAILED assert(0 == "unexpected error")
Re-running the failed job 6 times:
- 1 fail http://pulpito.ceph.com:80/smithfarm-2017-04-09_14:14:03-fs-wip-jewel-backports-distro-basic-smithi/
- 3 fail, 2 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-04-09_14:15:42-fs-wip-jewel-backports-distro-basic-smithi/
All the same error, i.e.:
2017-04-09T14:29:48.837 INFO:tasks.ceph.osd.3.smithi063.stderr:os/filestore/FileStore.cc: In function 'void FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7fcd5a7f8700 time 2017-04-09 14:29:48.854235
2017-04-09T14:29:48.837 INFO:tasks.ceph.osd.3.smithi063.stderr:os/filestore/FileStore.cc: 2920: FAILED assert(0 == "unexpected error")
So fs is ruled a pass, but there are no fs backports staged (correction: there is one).
Updated by Nathan Cutler about 7 years ago
rgw
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-07_09:30:57-rgw-wip-jewel-backports-distro-basic-smithi/
- massive failure (50+ failed jobs), due mostly (if not entirely) to this:
2017-04-09T00:20:37.988 INFO:teuthology.orchestra.run.smithi167.stderr:======================================================================
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:FAIL: s3tests.functional.test_s3.test_versioning_obj_suspend_versions
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:----------------------------------------------------------------------
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:Traceback (most recent call last):
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:    self.test(*self.arg)
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 6385, in test_versioning_obj_suspend_versions
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:    overwrite_suspended_versioning_obj(bucket, objname, k, c, 'null content 2')
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 6243, in overwrite_suspended_versioning_obj
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:    check_obj_versions(bucket, objname, k, c)
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 6080, in check_obj_versions
2017-04-09T00:20:37.990 INFO:teuthology.orchestra.run.smithi167.stderr:    eq(keys[i].version_id or 'null', key.version_id)
2017-04-09T00:20:37.990 INFO:teuthology.orchestra.run.smithi167.stderr:AssertionError: u'yGSvpxjEbJRBE.JL76y4OzeJISqDtmJ' != u'null'
Updated by Nathan Cutler about 7 years ago
rbd
teuthology-suite -k distro --priority 1000 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail (1 dead, 253 passed, 254 total) http://pulpito.ceph.com:80/smithfarm-2017-04-07_09:35:09-rbd-wip-jewel-backports-distro-basic-smithi/
description: rbd/thrash/{base/install.yaml clusters/{fixed-2.yaml openstack.yaml} fs/xfs.yaml msgr-failures/few.yaml thrashers/cache.yaml workloads/rbd_fsx_nbd.yaml}
- assert in ceph-objectstore-tool
2017-04-09T01:48:49.750 INFO:teuthology.orchestra.run.smithi051:Running: 'sudo adjust-ulimits ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --journal-path /var/lib/ceph/osd/ceph-2/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op export --pgid 0.7 --file /home/ubuntu/cephtest/ceph.data/exp.0.7.2'
2017-04-09T01:48:49.850 INFO:teuthology.orchestra.run.smithi051.stderr:osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f052ae89980 time 2017-04-09 01:48:49.853858
2017-04-09T01:48:49.850 INFO:teuthology.orchestra.run.smithi051.stderr:osd/PG.cc: 2967: FAILED assert(values.size() == 2)
Re-running 1 dead job:
- dead http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:27:00-rbd-wip-jewel-backports-distro-basic-smithi/
- same error as before
Ruled a pass by Jason
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13107
- |\
- | + librbd: improve debug logging for lock / watch state machines
- | + test: use librados API to retrieve config params
- + Pull request 13154
- |\
- | + librbd: possible deadlock with flush if refresh in-progress
- + Pull request 13212
- |\
- | + test/osd: add test for fast mark down functionality
- | + msg/async: implement ECONNREFUSED detection
- | + messages/MOSDFailure.h: distinguish between timeout and immediate failure
- | + OSD: Implement ms_handle_refused
- | + msg/simple: add ms_handle_refused callback
- | + AsyncConnection: fix delay state using dispatch_queue
- | + AsyncConnection: need to prepare message when features mismatch
- | + AsyncConnection: continue to read when meeting EINTR
- | + AsyncConnection: release dispatch throttle with fast dispatch message
- | + DispatchQueue: remove pipe words
- | + DispatchQueue: add name to separte different instance
- | + AsyncConnection: add DispathQueue throttle
- | + AsyncConnection: change all exception deliver to DispatchQueue
- | + AsyncConnection: make local message deliver via DispatchQueue
- | + AsyncMessenger: introduce DispatchQueue to separate nonfast message
- | + DispatchQueue: move dispatch_throtter from SimpleMessenger to DispatchQueue
- | + DispatchQueue: Move from msg/simple to msg
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 13244
- |\
- | + osdc: cache should ignore error bhs during trim
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 13261
- |\
- | + mon/OSDMonitor: make 'osd crush move ...' work on osds
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13477
- |\
- | + ceph-osd: --flush-journal: sporadic segfaults on exit
- + Pull request 13489
- |\
- | + ceph-disk: Fix getting wrong group name when --setgroup in bluestore
- + Pull request 13492
- |\
- | + systemd: Start OSDs after MONs
- + Pull request 13541
- |\
- | + osd/PG: restrict want_acting to up+acting on recovery completion
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13552
- |\
- | + rgw: assume obj write is a first write
- + Pull request 13585
- |\
- | + msg/simple: set close on exec on server sockets
- | + msg/async: set close on exec on server sockets
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13732
- |\
- | + PendingReleaseNotes: warning about 'osd rm ...' and #19119
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13786
- |\
- | + build/ops: add psmisc dependency to ceph-base
- + Pull request 13787
- |\
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 13788
- |\
- | + os/filestore/HashIndex: be loud about splits
- + Pull request 13809
- |\
- | + librbd: remove image header lock assertions
- + Pull request 13827
- |\
- | + osd/ReplicatedPG: try with pool's use-gmt setting if hitset archive not found
- + Pull request 13831
- |\
- | + server: negative error code when responding to client
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 13885
- |\
- | + qa/tasks/ceph_manager: use new luminous set-full-ratio etc
- | + qa/tasks/thrashosds: chance_thrash_cluster_full
- | + osdc/Objecter: resend RWORDERED ops on full
- + Pull request 13932
- |\
- | + rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly
- + Pull request 14044
- |\
- | + os/filestore: fix clang static check warn use-after-free
- + Pull request 14047
- |\
- | + jewel: osd/PGLog: reindex properly on pg log split
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14070
- |\
- | + Revert dummy: reduce run time, run user.yaml playbook
- + Pull request 14083
- |\
- | + doc: update description of rbdmap unmap[-all] behaviour
- | + doc: add verbiage to rbdmap manpage
- | + rbdmap: unmap RBDMAPFILE images unless called with unmap-all
- + Pull request 14112
- |\
- | + brag: count the number of mds in fsmap not in mdsmap
- | + brag: Assume there are 0 MDS instead of crashing when data is missing
- + Pull request 14113
- |\
- | + tools/rados: Check return value of connect
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14148
- |\
- | + rbd: destination pool should be source pool if it is not specified
- + Pull request 14150
- |\
- | + librbd: avoid possible recursive lock when racing acquire lock
- + Pull request 14152
- |\
- | + librbd: Include WorkQueue.h since we use it
- + Pull request 14154
- |\
- | + qa/workunits/rbd: resolve potential rbd-mirror race conditions
- + Pull request 14181
- |\
- | + osd: bypass readonly ops when osd full.
- + Pull request 14236
- |\
- | + mon: remove bad rocksdb option
- + Pull request 14324
- |\
- | + common: fix segfault in public IPv6 addr picking
- + Pull request 14325
- |\
- | + osd: Calculate degraded and misplaced more accurately
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 14329
- |\
- | + ceph-disk: Adding retry loop in get_partition_dev()
- | + ceph-disk: Reporting /sys directory in get_partition_dev()
- + Pull request 14371
- |\
- | + qa/tasks/workunit.py: use overrides as the default settings of workunit
- | + tasks/workunit.py: specify the branch name when cloning a branch
- | + tasks/workunit.py: when cloning, use --depth=1
- + Pull request 14377
- |\
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- + Pull request 14383
- |\
- | + debian: replace SysV rbdmap with systemd service
- + Pull request 14416
- + tests: Thrasher: handle OSD has the store locked gracefully
Updated by Nathan Cutler about 7 years ago
rgw
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- run killed in favor of newer build http://pulpito.ceph.com:80/smithfarm-2017-04-09_20:58:27-rgw-wip-jewel-backports-distro-basic-smithi/
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- massive failure http://pulpito.ceph.com:80/smithfarm-2017-04-09_21:01:21-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- timeout in rados/test.sh
- Command failed on vpm099 with status 22: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph osd set-full-ratio .001'
Updated by Nathan Cutler about 7 years ago
upgrade/client-upgrade
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 2 fail, 12 pass, 14 total http://pulpito.ceph.com:80/smithfarm-2017-04-09_21:02:46-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- one failure is http://tracker.ceph.com/issues/18089 and can be ignored
- other failure is in
upgrade:client-upgrade/infernalis-client-x/basic/{0-cluster/start.yaml 1-install/infernalis-client-x.yaml 2-workload/rbd_api_tests.yaml distros/centos_7.2.yaml}
Re-running the 1 problematic job:
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-10_13:30:15-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- failure in TestLibRBD.UpdateFeatures - opened http://tracker.ceph.com/issues/19567 to track, but it is not a blocker
Ruled a pass
Updated by Nathan Cutler about 7 years ago
upgrade/client-upgrade
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 2 fail, 1 dead, 11 pass (14 total) http://pulpito.ceph.com:80/smithfarm-2017-04-10_18:48:08-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
Re-running the 2 failed jobs and 1 dead job:
- 2 fail, 1 pass http://pulpito.ceph.com:80/smithfarm-2017-04-10_19:45:29-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- one failure - upgrade:client-upgrade/firefly-client-x/basic/{0-cluster/start.yaml 1-install/firefly-client-x.yaml 2-workload/rbd_cli_import_export.yaml distros/centos_7.2.yaml} - is http://tracker.ceph.com/issues/19571 and is not expected to pass
- second failure - upgrade:client-upgrade/infernalis-client-x/basic/{0-cluster/start.yaml 1-install/infernalis-client-x.yaml 2-workload/rbd_api_tests.yaml distros/centos_7.2.yaml} - is http://tracker.ceph.com/issues/19567 which is not a blocker
Ruled a pass
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 9 failed, 1 dead, 7 passed http://pulpito.ceph.com:80/smithfarm-2017-04-10_18:49:29-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- timeout in rados/test.sh
Re-running the 9 failed jobs and 1 dead job:
- 5 failed, 1 dead, 4 passed http://pulpito.ceph.com:80/smithfarm-2017-04-12_10:18:53-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- timeout in rados/test.sh
Updated by Nathan Cutler about 7 years ago
rgw
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
Updated by Nathan Cutler about 7 years ago
Made an integration branch "wip-jewel-backports-rgw" consisting only of jewel PRs labeled "rgw". Will try to reproduce the s3tests.functional.test_s3.test_versioning_obj_suspend_versions failure on it.
The hypothesis is that there is a single problematic PR and it carries the label "rgw".
Reproducer: teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports-rgw --machine-type smithi --filter="rgw/verify/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml msgr-failures/few.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml validater/lockdep.yaml}"
Updated by Nathan Cutler about 7 years ago
wip-jewel-backports-rgw
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports-rgw | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13552
- |\
- | + rgw: assume obj write is a first write
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14368
- |\
- | + jewel: rgw: fix listing of objects that start with underscore
- + Pull request 14377
- + rgw_file: fix missing unlock in unlink
- + rgw_file: implement reliable has-children check
Updated by Nathan Cutler about 7 years ago
RGW bisect¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports-rgw --machine-type smithi --filter="rgw/verify/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml msgr-failures/few.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml validater/lockdep.yaml}"
Bug reproduced; hypothesis confirmed!
Re-running with a new integration branch containing just PRs:
- 13865
- 13863
- 13842
- 13837
- 13834
- 13833
- 13779
- 13724
- 13552
Bug reproduced!
Re-running with a new integration branch containing just PRs:
- 13834
- 13833
- 13779
- 13724
- 13552
Bug reproduced!
Re-running with subset:
- 13779
- 13724
- 13552
Bug reproduced!
Last test was run manually with the conclusion that PR#13552 is to blame. The test branch is https://github.com/ceph/ceph-ci/commits/wip-jewel-backports-rgw (contains just v10.2.7 plus this one PR).
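The manual narrowing above is essentially a halving search over the candidate PR list. A minimal sketch of the procedure, with a hypothetical run_test standing in for "build an integration branch from these PRs and run the reproducer" (here it simply fails whenever 13552 is in the set; the sketch assumes exactly one bad PR):

```shell
# Stand-in for a teuthology run against an integration branch built
# from the given PRs; fails iff the (known-bad) PR 13552 is included.
run_test() {
    case " $* " in
        *" 13552 "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Halve the PR list until one PR remains.
bisect() {
    set -- $1                       # split the PR list into positional args
    while [ $# -gt 1 ]; do
        half=$(( $# / 2 ))
        first=$(echo "$@" | cut -d' ' -f1-$half)
        if run_test $first; then
            # first half passes: the culprit is in the second half
            set -- $(echo "$@" | cut -d' ' -f$(( half + 1 ))-)
        else
            set -- $first
        fi
    done
    echo "$1"
}

bisect "13865 13863 13842 13837 13834 13833 13779 13724 13552"
# prints: 13552
```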
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14195
- |\
- | + rgw: use separate http_manager for read_sync_status
- | + rgw: pass cr registry to managers
- | + rgw: use separate cr manager for read_sync_status
- | + rgw: change read_sync_status interface
- | + rgw: don't ignore ENOENT in RGWRemoteDataLog::read_sync_status()
- + Pull request 14368
- |\
- | + jewel: rgw: fix listing of objects that start with underscore
- + Pull request 14377
- |\
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- + Pull request 14383
- |\
- | + debian: replace SysV rbdmap with systemd service
- + Pull request 14449
- + tests: fix oversight in yaml comment
Updated by Nathan Cutler about 7 years ago
assert no massive rgw failure¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --filter="rgw/verify/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml msgr-failures/few.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml validater/lockdep.yaml}"
Bisect result verified.
Updated by Nathan Cutler about 7 years ago
rgw¶
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 3 fail, 189 pass (192 total) http://pulpito.ceph.com:80/smithfarm-2017-04-13_13:49:51-rgw-wip-jewel-backports-distro-basic-smithi/
Re-running 3 failed jobs:
Updated by Nathan Cutler about 7 years ago
assert no async messenger leak¶
teuthology-suite -k distro --priority 101 --suite rados --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --filter="rados/verify/{1thrash/default.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}"
Confirmed that the leak is (most likely) caused by https://github.com/ceph/ceph/pull/13212
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 2 fail, 110 pass (112 total) http://pulpito.ceph.com:80/smithfarm-2017-04-13_13:54:30-rados-wip-jewel-backports-distro-basic-smithi/
- Command failed on smithi001 with status 6: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}
- Command failed on smithi171 with status 11: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph pg scrub 1.0'
- Command failed on smithi001 with status 6: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
Note that the 2 failed jobs also failed in the earlier rados run - #note-4 above
Re-running 2 failed jobs:
- same failure in
rados/singleton-nomsgr/{all/lfn-upgrade-hammer.yaml rados.yaml}
2017-04-14T19:24:04.696 INFO:teuthology.orchestra.run.smithi161:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph pg scrub 1.0'
2017-04-14T19:24:04.838 INFO:teuthology.orchestra.run.smithi161.stderr:Error EAGAIN: pg 1.0 primary osd.1 not up
- same failure in
rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}
The test first installs infernalis (a new build from the tip of the infernalis branch - see #18089) and then upgrades all but one OSD to wip-jewel-backports. Then it runs the "ec_lost_unfound" task, at which point we see:
2017-04-14T19:25:42.516 INFO:teuthology.orchestra.run.smithi132:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
...
2017-04-14T19:25:42.756 INFO:teuthology.orchestra.run.smithi132.stderr:Error ENXIO: problem getting command descriptions from osd.0
Test if failure is reproducible on wip-v10.2.7:
teuthology-suite -k distro --verbose --suite rados --priority 101 --email ncutler@suse.com --ceph wip-v10.2.7 --machine-type smithi --filter="rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}"
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-14_19:46:35-rados-wip-v10.2.7-distro-basic-smithi/
Test if failure is reproducible on wip-v10.2.7:
teuthology-suite -k distro --verbose --suite rados --priority 101 --email ncutler@suse.com --ceph wip-v10.2.7 --machine-type smithi --filter="rados/singleton-nomsgr/{all/lfn-upgrade-hammer.yaml rados.yaml}"
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-14_20:18:59-rados-wip-v10.2.7-distro-basic-smithi/
Test if failure is reproducible on jewel:
teuthology-suite -k distro --verbose --suite rados --ceph jewel --ceph-repo https://github.com/ceph/ceph --suite-repo https://github.com/ceph/ceph --machine-type vps --priority 101 --email ncutler@suse.com --filter="rados/singleton-nomsgr/{all/lfn-upgrade-hammer.yaml rados.yaml}"
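The --subset $(expr $RANDOM % 2000)/2000 argument used for the rados runs above schedules only one of 2000 slices of the suite, chosen pseudo-randomly on each invocation. A small sketch of the slice selection (the PID fallback is an addition for shells without bash's $RANDOM):

```shell
# Pick a slice index N for teuthology's --subset N/2000 argument.
# $RANDOM (bash) is 0..32767; fall back to the PID where unavailable.
# Note: expr exits non-zero when the arithmetic result is 0, hence || true.
i=$(expr ${RANDOM:-$$} % 2000 || true)
echo "--subset $i/2000"
```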
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 5 failed, 7 pass (12 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-04-13_10:51:25-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- reproducible timeout in rados/test.sh
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 13 pass (14 total) http://pulpito.ceph.com:80/smithfarm-2017-04-13_10:52:09-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/19571 (firefly install fails because qemu-kvm caused installation of newer librados2 from distro) http://pulpito.ceph.com/smithfarm-2017-04-13_10:52:09-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/1019795
Ruled a pass
Updated by Nathan Cutler about 7 years ago
bisect regression in jewel¶
It appears we somehow managed to merge a PR that introduced a regression.
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph jewel --ceph-repo https://github.com/ceph/ceph --suite-repo https://github.com/ceph/ceph --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-13_13:59:35-upgrade:hammer-x-jewel-distro-basic-vps/
Running same test on smithi:
- running http://pulpito.ceph.com:80/smithfarm-2017-04-17_11:23:20-upgrade:hammer-x-jewel-distro-basic-smithi/
Re-running 5 times on VPS:
- running http://pulpito.ceph.com:80/smithfarm-2017-04-17_11:24:40-upgrade:hammer-x-jewel-distro-basic-vps/
The next step is to run the reproducer on v10.2.7 to assert it is free of the regression. Assuming the test passes on v10.2.7, we will have to bisect :-(
Pushed wip-v10.2.7 (v10.2.7+PR#14371) to Shaman
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
Preparing first bisect round - here are the PRs merged since v10.2.7:
$ git log --merges --oneline --no-color v10.2.7..HEAD
e31a540 Merge pull request #13834 from smithfarm/wip-18969-jewel
7c36d16 Merge pull request #13833 from smithfarm/wip-18908-jewel
0e3aa2c Merge pull request #13214 from ovh/bp-osd-updateable-throttles-jewel
8d5a5dd Merge pull request #14326 from shinobu-x/wip-15025-jewel
091aaa2 Merge pull request #13874 from smithfarm/wip-19171-jewel
3f2e4cd Merge pull request #13492 from shinobu-x/wip-18516-jewel
ea0bc6c Merge pull request #13254 from shinobu-x/wip-14609-jewel
845972f Merge pull request #13489 from shinobu-x/wip-18955-jewel
a3deef9 Merge pull request #14070 from smithfarm/wip-19339-jewel
702edb5 Merge pull request #14329 from smithfarm/wip-19493-jewel
f509ccc Merge pull request #14427 from smithfarm/wip-19140-jewel
c8c4bff Merge pull request #14324 from shinobu-x/wip-19371-jewel
349baea Merge pull request #14112 from shinobu-x/wip-19192-jewel
dd466b7 Merge pull request #14150 from smithfarm/wip-18823-jewel
b8f8bd0 Merge pull request #14152 from smithfarm/wip-18893-jewel
222916a Merge pull request #14154 from smithfarm/wip-18948-jewel
49f84b1 Merge pull request #14148 from smithfarm/wip-18778-jewel
2a232d4 Merge pull request #14083 from smithfarm/wip-19357-jewel
413ac58 Merge pull request #13154 from smithfarm/wip-18496-jewel
23d595b Merge pull request #13244 from smithfarm/wip-18775-jewel
4add6f5 Merge pull request #13809 from asheplyakov/18321-bp-jewel
37ab19c Merge pull request #13107 from smithfarm/wip-18669-jewel
f7c04e3 Merge pull request #13585 from asheplyakov/jewel-bp-16585
d2909bd Merge pull request #14371 from tchaikov/wip-19429-jewel
cd74860 Merge pull request #14325 from shinobu-x/wip-18619-jewel
1a20c12 Merge pull request #14236 from smithfarm/wip-19392-jewel
4838c4d Merge pull request #14181 from mslovy/wip-19394-jewel
e26b703 Merge pull request #14113 from shinobu-x/wip-19319-jewel
389150b Merge pull request #14047 from asheplyakov/reindex-on-pg-split
a8b1008 Merge pull request #14044 from mslovy/wip-19311-jewel
32ed9b7 Merge pull request #13932 from asheplyakov/18911-bp-jewel
6705e91 Merge pull request #13831 from jan--f/wip-19206-jewel
3d21a00 Merge pull request #13827 from tchaikov/wip-19185-jewel
8a6d643 Merge pull request #13788 from shinobu-x/wip-18235-jewel
f96392a Merge pull request #13786 from shinobu-x/wip-19129-jewel
8fe6ffc Merge pull request #13732 from liewegas/wip-19119-jewel
6f589a1 Merge pull request #13541 from shinobu-x/wip-18929-jewel
b8f2d35 Merge pull request #13477 from asheplyakov/jewel-bp-18951
40d1443 Merge pull request #13261 from shinobu-x/wip-18587-jewel
A total of 40 PRs excluding 14371 (which must be included in any case); grabbing the first 20 (starting from the bottom of the list, which is in reverse chronological order) for the bisect branch. Populating with the following script:
set -ex
reviewer='Nathan Cutler <ncutler@suse.com>'
milestone=jewel
base_branch=wip-v10.2.7
bisect_branch=${base_branch}-bisect
PRS="13261 13477 13541 13732 13786 13788 13827 13831 13932 14044 14047 14113 14181 14236 14325 13585 13107 13809 13244 13154 14083"
git checkout $milestone
git fetch ceph
git branch -D $bisect_branch || :
git checkout -b $bisect_branch ceph/$base_branch
git reset --hard ceph/$base_branch
for pr in $PRS ; do
    eval title=$(curl --silent https://api.github.com/repos/ceph/ceph/pulls/$pr?access_token=$github_token | jq .title)
    echo "PR $pr $title"
    git --no-pager log --oneline ceph/pull/$pr/merge^1..ceph/pull/$pr/merge^2
    git --no-pager merge --no-ff -m "$(echo -e "Merge pull request #$pr: $title\n\nReviewed-by: $reviewer")" ceph/pull/$pr/head
done
git push --force ceph-ci $bisect_branch
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 14083
- |\
- | + doc: update description of rbdmap unmap[-all] behaviour
- | + doc: add verbiage to rbdmap manpage
- | + rbdmap: unmap RBDMAPFILE images unless called with unmap-all
- + Pull request 13154
- |\
- | + librbd: possible deadlock with flush if refresh in-progress
- + Pull request 13244
- |\
- | + osdc: cache should ignore error bhs during trim
- + Pull request 13809
- |\
- | + librbd: remove image header lock assertions
- + Pull request 13107
- |\
- | + librbd: improve debug logging for lock / watch state machines
- | + test: use librados API to retrieve config params
- + Pull request 13585
- |\
- | + msg/simple: set close on exec on server sockets
- | + msg/async: set close on exec on server sockets
- + Pull request 14325
- |\
- | + osd: Calculate degraded and misplaced more accurately
- + Pull request 14236
- |\
- | + mon: remove bad rocksdb option
- + Pull request 14181
- |\
- | + osd: bypass readonly ops when osd full.
- + Pull request 14113
- |\
- | + tools/rados: Check return value of connect
- + Pull request 14047
- |\
- | + jewel: osd/PGLog: reindex properly on pg log split
- + Pull request 14044
- |\
- | + os/filestore: fix clang static check warn use-after-free
- + Pull request 13932
- |\
- | + rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly
- + Pull request 13831
- |\
- | + server: negative error code when responding to client
- + Pull request 13827
- |\
- | + osd/ReplicatedPG: try with pool's use-gmt setting if hitset archive not found
- + Pull request 13788
- |\
- | + os/filestore/HashIndex: be loud about splits
- + Pull request 13786
- |\
- | + build/ops: add psmisc dependency to ceph-base
- + Pull request 13732
- |\
- | + PendingReleaseNotes: warning about 'osd rm ...' and #19119
- + Pull request 13541
- |\
- | + osd/PG: restrict want_acting to up+acting on recovery completion
- + Pull request 13477
- |\
- | + ceph-osd: --flush-journal: sporadic segfaults on exit
- + Pull request 13261
- + mon/OSDMonitor: make 'osd crush move ...' work on osds
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
And wip-v10.2.7 (the base branch) again for comparison:
- pass http://pulpito.ceph.com/smithfarm-2017-04-18_15:31:28-upgrade:hammer-x-wip-v10.2.7-distro-basic-vps/
Will continue the bisect in a new comment.
Updated by Nathan Cutler about 7 years ago
jewel regression bisect, round 2¶
In round 1 we prepared an integration branch consisting of v10.2.7 + PR#14371 (which is required in any case) + the first 21 PRs merged into jewel after the v10.2.7 release. Although logic would dictate that the regression is one of the following PRs:
e31a540 Merge pull request #13834 from smithfarm/wip-18969-jewel
7c36d16 Merge pull request #13833 from smithfarm/wip-18908-jewel
0e3aa2c Merge pull request #13214 from ovh/bp-osd-updateable-throttles-jewel
8d5a5dd Merge pull request #14326 from shinobu-x/wip-15025-jewel
091aaa2 Merge pull request #13874 from smithfarm/wip-19171-jewel
3f2e4cd Merge pull request #13492 from shinobu-x/wip-18516-jewel
ea0bc6c Merge pull request #13254 from shinobu-x/wip-14609-jewel
845972f Merge pull request #13489 from shinobu-x/wip-18955-jewel
a3deef9 Merge pull request #14070 from smithfarm/wip-19339-jewel
702edb5 Merge pull request #14329 from smithfarm/wip-19493-jewel
f509ccc Merge pull request #14427 from smithfarm/wip-19140-jewel
c8c4bff Merge pull request #14324 from shinobu-x/wip-19371-jewel
349baea Merge pull request #14112 from shinobu-x/wip-19192-jewel
dd466b7 Merge pull request #14150 from smithfarm/wip-18823-jewel
b8f8bd0 Merge pull request #14152 from smithfarm/wip-18893-jewel
222916a Merge pull request #14154 from smithfarm/wip-18948-jewel
49f84b1 Merge pull request #14148 from smithfarm/wip-18778-jewel
I would like to get a clear reproducer, so I prepared a wip-v10.2.7-bisect-2 branch consisting of v10.2.7 + PR#14371 + these PRs.
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect-2 | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 13492
- |\
- | + systemd: Start OSDs after MONs
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 13489
- |\
- | + ceph-disk: Fix getting wrong group name when --setgroup in bluestore
- + Pull request 14070
- |\
- | + Revert dummy: reduce run time, run user.yaml playbook
- + Pull request 14329
- |\
- | + ceph-disk: Adding retry loop in get_partition_dev()
- | + ceph-disk: Reporting /sys directory in get_partition_dev()
- + Pull request 14427
- |\
- | + osdc/Objecter: resend RWORDERED ops on full
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 14324
- |\
- | + common: fix segfault in public IPv6 addr picking
- + Pull request 14112
- |\
- | + brag: count the number of mds in fsmap not in mdsmap
- | + brag: Assume there are 0 MDS instead of crashing when data is missing
- + Pull request 14150
- |\
- | + librbd: avoid possible recursive lock when racing acquire lock
- + Pull request 14152
- |\
- | + librbd: Include WorkQueue.h since we use it
- + Pull request 14154
- |\
- | + qa/workunits/rbd: resolve potential rbd-mirror race conditions
- + Pull request 14148
- + rbd: destination pool should be source pool if it is not specified
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect-2 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
builds OK https://shaman.ceph.com/builds/ceph/wip-v10.2.7-bisect-2/2b15bdd5a425e2d20a146af19ad06fda24adc2d2/
Examining the test yaml again, it seems strange that a test called "upgrade:hammer-x/f-h-x-offline" installs firefly and then upgrades directly to "x" (jewel in this case), with no intermediate hammer step. Opened http://tracker.ceph.com/issues/19687 to track.
Updated by Nathan Cutler about 7 years ago
jewel "regression" bisect, round 3¶
Pushed wip-v10.2.7-bisect-3
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect-3 | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 14427
- |\
- | + osdc/Objecter: resend RWORDERED ops on full
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 14324
- + common: fix segfault in public IPv6 addr picking
https://shaman.ceph.com/builds/ceph/wip-v10.2.7-bisect-3/0dfc1333c5ff95624e8825bb4af339b67b2a1d1d/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect-3 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
jewel "regression" bisect, round 4¶
Pushed wip-v10.2.7-bisect-4
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect-4 | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 14326
- + osd: don't share osdmap with objecter when preboot
https://shaman.ceph.com/builds/ceph/wip-v10.2.7-bisect-4/77358532ce0d07ae7afc317c304c2e255058aad0/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect-4 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
Updated by Nathan Cutler about 7 years ago
jewel "regression" grand finale¶
Starting the grand finale by cherry-picking (not merging as before) the following PRs (one of which should be the cause of the "regression" according to the bisect results so far) on top of wip-v10.2.7:
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-10.2.7-13254 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- wip-10.2.7-13254 https://shaman.ceph.com/builds/ceph/wip-10.2.7-13254/772ba99ac653c7e21efb6d506ef418904f6fc1d4/
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-19_13:53:56-upgrade:hammer-x-wip-10.2.7-13254-distro-basic-vps/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-14427 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- wip-v10.2.7-14427 https://shaman.ceph.com/builds/ceph/wip-v10.2.7-14427/783344dc5f35f45f61ac33d5072d64f176e22791/
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-19_13:54:33-upgrade:hammer-x-wip-v10.2.7-14427-distro-basic-vps/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-14324 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- wip-v10.2.7-14324 https://shaman.ceph.com/builds/ceph/wip-v10.2.7-14324/217da3d655e1c23c7ac461cc65f630cd74da2d79/
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-19_13:55:00-upgrade:hammer-x-wip-v10.2.7-14324-distro-basic-vps/
CONCLUSION: https://github.com/ceph/ceph/pull/14427 would seem to be the culprit. Opened https://github.com/ceph/ceph/pull/14643 to revert it.
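Reverting a merged PR (as done in #14643) means reverting a merge commit, which requires -m 1 to designate the first (target-branch) parent as mainline. A self-contained demonstration in a throwaway repo (branch names, file contents, and the merge message are illustrative, not the actual Ceph history):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email tester@example.com
git config user.name tester
echo base > f && git add f && git commit -qm base
git checkout -qb pr
echo bad > f && git commit -qam "bad change"
git checkout -q -                             # back to the default branch
git merge -q --no-ff -m "Merge pull request #14427" pr
git revert -m 1 --no-edit HEAD >/dev/null     # -m 1 = keep the first parent
cat f                                         # prints: base
```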
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13507
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14195
- |\
- | + rgw: use separate http_manager for read_sync_status
- | + rgw: pass cr registry to managers
- | + rgw: use separate cr manager for read_sync_status
- | + rgw: change read_sync_status interface
- | + rgw: don't ignore ENOENT in RGWRemoteDataLog::read_sync_status()
- + Pull request 14204
- |\
- | + filestore, tools: Fix logging of DBObjectMap check() repairs
- | + osd: Simplify DBObjectMap by no longer creating complete tables
- | + ceph-osdomap-tool: Fix seg fault with large amount of check error output
- | + osd: Add automatic repair for DBObjectMap bug
- | + ceph-osdomap-tool: Fix tool exit status
- | + DBObjectMap: rewrite rm_keys and merge_new_complete
- | + DBObjectMap: strengthen in_complete_region post condition
- | + DBObjectMap: fix next_parent()
- | + test_object_map: add tests to trigger some bugs related to 18533
- | + test: Add ceph_test_object_map to make check tests
- | + ceph-osdomap-tool: Add --debug and only show internal logging if enabled
- | + osd: DBOjectMap::check: Dump complete mapping when inconsistency found
- | + test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop testing
- | + tools: Check for overlaps in internal complete table for DBObjectMap
- | + tools: Add dump-headers command to ceph-osdomap-tool
- | + tools: Add --oid option to ceph-osdomap-tool
- | + osd: Remove unnecessary assert and assignment in DBObjectMap
- + Pull request 14377
- |\
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- + Pull request 14383
- |\
- | + debian: replace SysV rbdmap with systemd service
- + Pull request 14416
- |\
- | + tests: Thrasher: handle OSD has the store locked gracefully
- + Pull request 14449
- |\
- | + tests: fix oversight in yaml comment
- + Pull request 14481
- |\
- | + librbd: is_exclusive_lock_owner API should ping OSD
- | + pybind: fix incorrect exception format strings
- + Pull request 14587
- |\
- | + mon/MonClient: make get_mon_log_message() atomic
- + Pull request 14602
- |\
- | + ceph-disk: enable directory backed OSD at boot time
- + Pull request 14605
- |\
- | + rgw: don't return skew time in pre-signed url
- + Pull request 14607
- |\
- | + rgw: fix for null version_id in fetch_remote_obj()
- | + rgw: version id doesn't work in fetch_remote_obj
- + Pull request 14626
- |\
- | + tests: upgrade:hammer-x/f-h-x-offline add missing hammer upgrade
- + Pull request 14635
- |\
- | + doc: mention --show-mappings in crushtool manpage
- + Pull request 14643
- + Revert osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Revert osdc/Objecter: resend RWORDERED ops on full
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- builds OK https://shaman.ceph.com/builds/ceph/wip-jewel-backports/16a335388be2616c68e6895cba3eac7782ebcaa3/
- 4 fail, 108 pass (112 total) http://pulpito.ceph.com:80/smithfarm-2017-04-20_08:48:32-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 4 failed jobs:
- 2 fail, 2 pass http://pulpito.ceph.com:80/smithfarm-2017-04-21_05:45:14-rados-wip-jewel-backports-distro-basic-smithi/
- new bug http://tracker.ceph.com/issues/19737 Error EAGAIN: pg 1.0 primary osd.1 not up
- known bug http://tracker.ceph.com/issues/16239 Error ENXIO: problem getting command descriptions from osd.0 - opened https://github.com/ceph/ceph/pull/14710 to work around it
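The --subset $(expr $RANDOM % 2000)/2000 argument above schedules only a randomly chosen 1/2000th slice of the suite matrix; a minimal sketch of the selection (variable name illustrative, using shell arithmetic in place of expr):

```shell
# $RANDOM is a bash builtin yielding 0..32767, so taking it modulo
# 2000 picks a slice index in 0..1999; teuthology-suite then runs
# only that 1/2000th of the rados suite matrix.
slice=$(( ${RANDOM:-12345} % 2000 ))
echo "--subset ${slice}/2000"
```

Each scheduling run therefore covers a different random slice, which is why repeated runs of the same suite report different job counts.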
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 16 pass (17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-20_08:51:50-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/12973 - opened https://github.com/ceph/ceph/pull/14626 to fix the test
Ruled a pass
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 13 pass (14 total) http://pulpito.ceph.com:80/smithfarm-2017-04-20_08:53:29-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/19571 Command failed on vpm175 with status 1: "sudo yum -y install '' ceph-radosgw" - opened https://github.com/ceph/ceph/pull/14691 to "fix" (by dropping the CentOS version of the test)
Ruled a pass
Updated by Nathan Cutler about 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13507
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14195
- |\
- | + rgw: use separate http_manager for read_sync_status
- | + rgw: pass cr registry to managers
- | + rgw: use separate cr manager for read_sync_status
- | + rgw: change read_sync_status interface
- | + rgw: don't ignore ENOENT in RGWRemoteDataLog::read_sync_status()
- + Pull request 14204
- |\
- | + filestore, tools: Fix logging of DBObjectMap check() repairs
- | + osd: Simplify DBObjectMap by no longer creating complete tables
- | + ceph-osdomap-tool: Fix seg fault with large amount of check error output
- | + osd: Add automatic repair for DBObjectMap bug
- | + ceph-osdomap-tool: Fix tool exit status
- | + DBObjectMap: rewrite rm_keys and merge_new_complete
- | + DBObjectMap: strengthen in_complete_region post condition
- | + DBObjectMap: fix next_parent()
- | + test_object_map: add tests to trigger some bugs related to 18533
- | + test: Add ceph_test_object_map to make check tests
- | + ceph-osdomap-tool: Add --debug and only show internal logging if enabled
- | + osd: DBOjectMap::check: Dump complete mapping when inconsistency found
- | + test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop testing
- | + tools: Check for overlaps in internal complete table for DBObjectMap
- | + tools: Add dump-headers command to ceph-osdomap-tool
- | + tools: Add --oid option to ceph-osdomap-tool
- | + osd: Remove unnecessary assert and assignment in DBObjectMap
- + Pull request 14392
- |\
- | + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- + Pull request 14416
- |\
- | + tests: Thrasher: handle OSD has the store locked gracefully
- + Pull request 14481
- |\
- | + librbd: is_exclusive_lock_owner API should ping OSD
- | + pybind: fix incorrect exception format strings
- + Pull request 14587
- |\
- | + mon/MonClient: make get_mon_log_message() atomic
- + Pull request 14605
- |\
- | + rgw: don't return skew time in pre-signed url
- + Pull request 14607
- |\
- | + rgw: fix for null version_id in fetch_remote_obj()
- | + rgw: version id doesn't work in fetch_remote_obj
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14635
- |\
- | + doc: mention --show-mappings in crushtool manpage
- + Pull request 14643
- |\
- | + Revert osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- | + Revert osdc/Objecter: resend RWORDERED ops on full
- + Pull request 14653
- |\
- | + rgw_file: remove unused rgw_key variable
- | + rgw_file: fix readdir after dirent-change
- | + rgw_file: don't expire directories being read
- | + rgw_file: rgw_readdir: return dot-dirs only when *offset is 0
- | + rgw_file: chunked readdir
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- | + rgw_file: introduce rgw_lookup type hints
- + Pull request 14660
- |\
- | + radosgw-admin: use zone id when creating a zone
- | + qa: rgw task uses period instead of region-map
- | + rgw-admin: remove deprecated regionmap commands
- + Pull request 14661
- |\
- | + rgw: fix crash when listing objects via swift
- + Pull request 14663
- |\
- | + librbd: relax is parent mirrored check when enabling mirroring for pool
- + Pull request 14664
- |\
- | + rbd: prevent adding multiple mirror peers to a single pool
- + Pull request 14665
- |\
- | + test/librados_test_stub: fixed cls_cxx_map_get_keys/vals return value
- + Pull request 14666
- |\
- | + librbd: fix rbd_metadata_list and rbd_metadata_get
- + Pull request 14667
- |\
- | + client: fix the cross-quota rename boundary check conditions
- + Pull request 14668
- |\
- | + mds: don't purge strays when mds is in clientreplay state
- | + mds: skip fragment space check for replayed request
- + Pull request 14669
- |\
- | + tasks/cephfs: switch open vs. write in test_open_inode
- | + qa: fix race in Mount.open_background
- + Pull request 14670
- |\
- | + mds/StrayManager: aviod reusing deleted inode in StrayManager::_purge_stray_logged
- + Pull request 14671
- |\
- | + test/libcephfs: avoid buffer overflow when testing ceph_getdents()
- + Pull request 14672
- |\
- | + mds: reset heartbeat in export_remaining_imported_caps
- | + mds: heartbeat_reset in dispatch
- + Pull request 14674
- |\
- | + mon: fix hiding mdsmonitor informative strings
- + Pull request 14676
- |\
- | + tools/cephfs: set dir_layout when injecting inodes
- + Pull request 14677
- |\
- | + mds: make C_MDSInternalNoop::complete() delete 'this'
- + Pull request 14679
- |\
- | + cephfs: fix mount point break off problem after mds switch occured
- + Pull request 14682
- |\
- | + mds: ignore ENOENT on writing backtrace
- + Pull request 14683
- |\
- | + mds: shut down finisher before objecter
- + Pull request 14684
- |\
- | + cephfs: fix write_buf's _len overflow problem
- + Pull request 14685
- |\
- | + client: wait for lastest osdmap when handling set file/dir layout
- + Pull request 14686
- |\
- | + osd: Give requested scrub work a higher priority
- + Pull request 14691
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Pull request 14694
- |\
- | + use sudo to check check health
- | + Add reboot case for systemd test
- | + Fix distro's, point to latest version
- + Pull request 14698
- |\
- | + client/Client.cc: add feature to reconnect client after MDS reset
- | + doc: cephfs: fix the unexpected indent warning
- | + doc: additional edits in FUSE client config
- | + doc: Dirty data are not the same as corrupted data
- | + doc: minor changes in fuse client config reference
- | + doc: add client config ref
- + Pull request 14699
- |\
- | + mds: include advisory `path` field in damage
- | + mds: populate DamageTable from scrub and log more quietly
- | + mds/DamageTable: move classes to .cc file
- + Pull request 14700
- |\
- | + mds: validate prealloc_inos on sessions after load
- | + mds: operator<< for Session
- + Pull request 14710
- + tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- builds OK https://shaman.ceph.com/builds/ceph/wip-jewel-backports/20c565ede9f11aee034eb9c387fcc07939974a8f/
- 1 fail, 226 pass (227 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:26:17-rados-wip-jewel-backports-distro-basic-smithi/
- AttributeError: managers
--rerun
Updated by Nathan Cutler about 7 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 1000 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 16 pass (17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:31:29-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- [Errno 113] No route to host
Re-running 1 failed job:
Updated by Nathan Cutler about 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:33:03-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- 1 dead job - SSH connection to vpm057 was lost: 'sudo apt-get -y install linux-image-generic' - possibly infrastructure noise
Re-running 1 dead job
Updated by Nathan Cutler about 7 years ago
fs¶
teuthology-suite -k distro --priority 1000 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 3 fail, 84 pass (87 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:33:43-fs-wip-jewel-backports-distro-basic-smithi/
- "cluster [WRN] Scrub error on inode" (warning in mds log)
- java.lang.NoClassDefFoundError: Could not initialize class com.ceph.fs.CephMount
--rerun
- 2 fail, 1 pass http://pulpito.ceph.com/smithfarm-2017-04-22_06:58:50-fs-wip-jewel-backports---basic-smithi/
- "cluster [WRN] Scrub error on inode" (warning in mds log)
Marked https://github.com/ceph/ceph/pull/14699 DNM for now.
Updated by Nathan Cutler about 7 years ago
rgw¶
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 1 fail, 191 pass (192 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:35:15-rgw-wip-jewel-backports-distro-basic-smithi/
- s3tests.functional.test_s3.test_versioned_concurrent_object_create_and_remove ... FAIL
--rerun
Updated by Nathan Cutler about 7 years ago
rbd¶
teuthology-suite -k distro --priority 1000 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
- 8 fail, 101 pass (109 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:39:27-rbd-wip-jewel-backports-distro-basic-smithi/
- TestLibRBD.Mirror
--rerun
- 8 fail http://pulpito.ceph.com/smithfarm-2017-04-22_07:07:13-rbd-wip-jewel-backports---basic-smithi/
- TestLibRBD.Mirror
Marked https://github.com/ceph/ceph/pull/14663 DNM - needs another run on repopulated integration branch.
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13507
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13884
- |\
- | + osd/OSDMap: don't set weight to IN when OSD is destroyed
- + Pull request 13887
- |\
- | + qa/suites/rados/thrash: add no-thrash item to matrix
- | + osd/osd_internal_types: wake snaptrimmer on put_read lock, too
- + Pull request 14204
- |\
- | + filestore, tools: Fix logging of DBObjectMap check() repairs
- | + osd: Simplify DBObjectMap by no longer creating complete tables
- | + ceph-osdomap-tool: Fix seg fault with large amount of check error output
- | + osd: Add automatic repair for DBObjectMap bug
- | + ceph-osdomap-tool: Fix tool exit status
- | + DBObjectMap: rewrite rm_keys and merge_new_complete
- | + DBObjectMap: strengthen in_complete_region post condition
- | + DBObjectMap: fix next_parent()
- | + test_object_map: add tests to trigger some bugs related to 18533
- | + test: Add ceph_test_object_map to make check tests
- | + ceph-osdomap-tool: Add --debug and only show internal logging if enabled
- | + osd: DBOjectMap::check: Dump complete mapping when inconsistency found
- | + test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop testing
- | + tools: Check for overlaps in internal complete table for DBObjectMap
- | + tools: Add dump-headers command to ceph-osdomap-tool
- | + tools: Add --oid option to ceph-osdomap-tool
- | + osd: Remove unnecessary assert and assignment in DBObjectMap
- + Pull request 14332
- |\
- | + osdc/Objecter: respect epoch barrier in _op_submit()
- + Pull request 14392
- |\
- | + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- + Pull request 14481
- |\
- | + librbd: is_exclusive_lock_owner API should ping OSD
- | + pybind: fix incorrect exception format strings
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14659
- |\
- | + rgw: add the remove-x-delete feature to cancel swift object expiration
- + Pull request 14661
- |\
- | + rgw: fix crash when listing objects via swift
- + Pull request 14664
- |\
- | + rbd: prevent adding multiple mirror peers to a single pool
- + Pull request 14666
- |\
- | + librbd: fix rbd_metadata_list and rbd_metadata_get
- + Pull request 14667
- |\
- | + client: fix the cross-quota rename boundary check conditions
- + Pull request 14668
- |\
- | + mds: don't purge strays when mds is in clientreplay state
- | + mds: skip fragment space check for replayed request
- + Pull request 14669
- |\
- | + tasks/cephfs: switch open vs. write in test_open_inode
- | + qa: fix race in Mount.open_background
- + Pull request 14670
- |\
- | + mds/StrayManager: aviod reusing deleted inode in StrayManager::_purge_stray_logged
- + Pull request 14671
- |\
- | + test/libcephfs: avoid buffer overflow when testing ceph_getdents()
- + Pull request 14672
- |\
- | + mds: reset heartbeat in export_remaining_imported_caps
- | + mds: heartbeat_reset in dispatch
- + Pull request 14674
- |\
- | + mon: fix hiding mdsmonitor informative strings
- + Pull request 14676
- |\
- | + tools/cephfs: set dir_layout when injecting inodes
- + Pull request 14677
- |\
- | + mds: make C_MDSInternalNoop::complete() delete 'this'
- + Pull request 14679
- |\
- | + cephfs: fix mount point break off problem after mds switch occured
- + Pull request 14682
- |\
- | + mds: ignore ENOENT on writing backtrace
- + Pull request 14683
- |\
- | + mds: shut down finisher before objecter
- + Pull request 14684
- |\
- | + cephfs: fix write_buf's _len overflow problem
- + Pull request 14685
- |\
- | + client: wait for lastest osdmap when handling set file/dir layout
- + Pull request 14691
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Pull request 14694
- |\
- | + use sudo to check check health
- | + Add reboot case for systemd test
- | + Fix distro's, point to latest version
- + Pull request 14698
- |\
- | + client/Client.cc: add feature to reconnect client after MDS reset
- | + doc: cephfs: fix the unexpected indent warning
- | + doc: additional edits in FUSE client config
- | + doc: Dirty data are not the same as corrupted data
- | + doc: minor changes in fuse client config reference
- | + doc: add client config ref
- + Pull request 14700
- |\
- | + mds: validate prealloc_inos on sessions after load
- | + mds: operator<< for Session
- + Pull request 14710
- |\
- | + tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
- + Pull request 14752
- |\
- | + rgw: data sync skips slo data when syncing the manifest object
- | + rgw: RGWGetObj applies skip_manifest flag to SLO
- | + rgw: allow system users to read SLO parts
- + Pull request 14763
- |\
- | + ceph_test_librados_api_misc: fix stupid LibRadosMiscConnectFailure.ConnectFailure test
- + Pull request 14765
- |\
- | + ceph-disk: dmcrypt activate must use the same cluster as prepare
- + Pull request 14766
- |\
- | + rgw: fix failed to create bucket if a non-master zonegroup has a single zone
- + Pull request 14787
- |\
- | + rgw: add bucket size limit check to radosgw-admin
- + Pull request 14789
- |\
- | + rgw: swift: disable revocation thread if sleep 0 || cache_size 0
- + Pull request 14791
- |\
- | + Fix reveresed promote throttle default parameters.
- + Pull request 14812
- |\
- | + tests: double snap trimming timeout
- + Pull request 14815
- + rgw: add suport for creating S3 type subuser of admin rest api
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/572fb344af805709327f270fcf8743bc62ef4b3d
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 2 fail, 257 pass (259 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:18:43-rados-wip-jewel-backports-distro-basic-smithi/
- "Command failed on smithi066 with status 1: '/home/ubuntu/cephtest/s3-tests/virtualenv/bin/s3tests-test-readwrite'" NOT REPRODUCED
--rerun
- 1 fail, 1 pass http://pulpito.ceph.com/smithfarm-2017-04-27_16:56:17-rados-wip-jewel-backports---basic-smithi/
- known bug http://tracker.ceph.com/issues/19737 Command failed on smithi168 with status 11: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph pg scrub 1.0'
Running the failed job 4 more times:
- 2 pass, 2 fail http://pulpito.ceph.com:80/smithfarm-2017-04-27_17:35:57-rados-wip-jewel-backports-distro-basic-smithi/
Ruled a pass
Updated by Nathan Cutler about 7 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 1000 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 3 dead, 13 pass (17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:23:47-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler about 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 dead, 12 pass (13 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:25:39-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- dead job looks like infrastructure noise
--rerun
Updated by Nathan Cutler about 7 years ago
fs¶
teuthology-suite -k distro --priority 1000 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 3 fail, 84 pass (87 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:26:37-fs-wip-jewel-backports-distro-basic-smithi/
--rerun
- 1 fail, 2 pass (3 total) http://pulpito.ceph.com/smithfarm-2017-04-27_13:37:36-fs-wip-jewel-backports---basic-smithi/
- "java.lang.NoClassDefFoundError: Could not initialize class com.ceph.fs.CephMount" in libcephfs-java workunit leads to Command failed (workunit test libcephfs-java/test.sh) on smithi138 with status 1
Updated by Nathan Cutler about 7 years ago
rgw¶
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
- 10 fail, 86 pass (96 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:29:14-rgw-wip-jewel-backports-distro-basic-smithi/
- saw valgrind issues (9 failures)
- foo (1 failure) NOT REPRODUCED
--rerun
- fail http://pulpito.ceph.com/smithfarm-2017-04-27_16:58:41-rgw-wip-jewel-backports---basic-smithi/
- saw valgrind issues
Updated by Nathan Cutler about 7 years ago
rbd¶
teuthology-suite -k distro --priority 1000 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
Updated by Nathan Cutler about 7 years ago
rgw suite for ragweed support¶
Special request by Yehuda
build: https://shaman.ceph.com/builds/ceph/wip-rgw-support-ragweed-jewel/
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-rgw-support-ragweed-jewel --machine-type smithi --subset $(expr $RANDOM % 2)/2
- 1 fail, rest pass http://pulpito.ceph.com:80/smithfarm-2017-05-03_10:05:41-rgw-wip-rgw-support-ragweed-jewel-distro-basic-smithi/
- multiregion valgrind
Updated by Abhishek Lekshmanan almost 7 years ago
Added https://github.com/ceph/ceph/pull/15208 at Sage's request:
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email abhishek@suse.com --ceph wip-jewel-backports-mon-sortbitwise --machine-type smithi
Updated by Abhishek Lekshmanan almost 7 years ago
Added an integration branch containing only the RGW memleak fix PRs (plus the rados PR above, which was already merged in jewel):
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports-rgw-fixes | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull-?request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Merge pull request #15312
- |\
- | + rgw: rest conn functions cleanup, only append zonegroup if not empty
- | + rgw: rest and http client code to use param vectors
- + Merge pull request #15382
- |\
- | + rgw:fix memory leaks
- + Merge pull request #13450: jewel: msg: IPv6 Heartbeat packets are not marked with DSCP QoS - simple messenger
- + Merge pull request #13507: jewel: Disallow enabling 'hashpspool' option to a pool without some kind of --i-understand-this-will-remap-all-pgs flag
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Merge pull request #13647: jewel: osd: preserve allocation hint attribute during recovery
- + Merge pull request #13884: jewel: pre-jewel osd rm incrementals are misinterpreted
- + Merge pull request #13887: jewel: snap trim blocked behind ec read, never woken, on kraken-x upgrade
- |\
- | + qa/suites/rados/thrash: add no-thrash item to matrix
- | + osd/osd_internal_types: wake snaptrimmer on put_read lock, too
- + Merge pull request #14204: jewel: core: two instances of omap_digest mismatch
- + Merge pull request #14332: jewel: Objecter::epoch_barrier isn't respected in _op_submit()
- + Merge pull request #14392: jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- |\
- | + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- + Merge pull request #14481: jewel: librbd: is_exclusive_lock_owner API should ping OSD
- + Merge pull request #14626: tests: upgrade:hammer-x/f-h-x-offline: 'failed to encode ...' warnings are normal on upgrades
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Merge pull request #14659: jewel: rgw: add the remove-x-delete feature to cancel swift object expiration
- |\
- | + rgw: add the remove-x-delete feature to cancel swift object expiration
- + Merge pull request #14661: jewel: rgw: unsafe access in RGWListBucket_ObjStore_SWIFT::send_response()
- |\
- | + rgw: fix crash when listing objects via swift
- + Merge pull request #14664: jewel: [api] temporarily restrict (rbd_)mirror_peer_add from adding multiple peers
- + Merge pull request #14666: jewel: librbd: Issues with C API image metadata retrieval functions
- + Merge pull request #14667: jewel: client: fix the cross-quota rename boundary check conditions
- + Merge pull request #14668: jewel: mds: fragment space check can cause replayed request fail
- |\
- | + mds: don't purge strays when mds is in clientreplay state
- | + mds: skip fragment space check for replayed request
- + Merge pull request #14669: jewel: cephfs: Test failure: test_open_inode
- |\
- | + tasks/cephfs: switch open vs. write in test_open_inode
- | + qa: fix race in Mount.open_background
- + Merge pull request #14670: jewel: mds: avoid reusing deleted inode in StrayManager::_purge_stray_logged
- |\
- | + mds/StrayManager: aviod reusing deleted inode in StrayManager::_purge_stray_logged
- + Merge pull request #14671: jewel: tests: buffer overflow in test LibCephFS.DirLs
- |\
- | + test/libcephfs: avoid buffer overflow when testing ceph_getdents()
- + Merge pull request #14672: jewel: MDS heartbeat timeout during rejoin, when working with large amount of caps/inodes
- |\
- | + mds: reset heartbeat in export_remaining_imported_caps
- | + mds: heartbeat_reset in dispatch
- + Merge pull request #14674: jewel: cephfs: No output for ceph mds rmfailed 0 --yes-i-really-mean-it command
- |\
- | + mon: fix hiding mdsmonitor informative strings
- + Merge pull request #14676: jewel: cephfs: MDS server crashes due to inconsistent metadata.
- |\
- | + tools/cephfs: set dir_layout when injecting inodes
- + Merge pull request #14677: jewel: mds: C_MDSInternalNoop::complete doesn't free itself
- |\
- | + mds: make C_MDSInternalNoop::complete() delete 'this'
- + Merge pull request #14679: jewel: cephfs: The mount point break off when mds switch hanppened.
- |\
- | + cephfs: fix mount point break off problem after mds switch occured
- + Merge pull request #14682: jewel: cephfs: MDS goes readonly writing backtrace for a file whose data pool has been removed
- |\
- | + mds: ignore ENOENT on writing backtrace
- + Merge pull request #14683: jewel: cephfs: MDS assert failed when shutting down
- |\
- | + mds: shut down finisher before objecter
- + Merge pull request #14684: jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file
- |\
- | + cephfs: fix write_buf's _len overflow problem
- + Merge pull request #14685: jewel: cephfs: Test failure: test_data_isolated
- |\
- | + client: wait for lastest osdmap when handling set file/dir layout
- + Merge pull request #14691: tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Merge pull request #14694: [backport] qa/tasks: systemd test backport to jewel
- + Merge pull request #14698: jewel: cephfs: ceph-fuse does not recover after lost connection to MDS
- |\
- | + client/Client.cc: add feature to reconnect client after MDS reset
- | + doc: cephfs: fix the unexpected indent warning
- | + doc: additional edits in FUSE client config
- | + doc: Dirty data are not the same as corrupted data
- | + doc: minor changes in fuse client config reference
- | + doc: add client config ref
- + Merge pull request #14700: jewel: mds: enable start when session ino info is corrupt
- |\
- | + mds: validate prealloc_inos on sessions after load
- | + mds: operator<< for Session
- + Merge pull request #14710: tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
- |\
- | + tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
- + Merge pull request #14752: jewel: rgw: allow system users to read SLO parts
- |\
- | + rgw: data sync skips slo data when syncing the manifest object
- | + rgw: RGWGetObj applies skip_manifest flag to SLO
- | + rgw: allow system users to read SLO parts
- + Merge pull request #14763: jewel: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure
- + Merge pull request #14765: jewel: ceph-disk does not support cluster names different than 'ceph'
- + Merge pull request #14766: jewel: rgw: fix failed to create bucket if a non-master zonegroup has a single zone
- |\
- | + rgw: fix failed to create bucket if a non-master zonegroup has a single zone
- + Merge pull request #14787: jewel: rgw: add bucket size limit check to radosgw-admin
- |\
- | + rgw: add bucket size limit check to radosgw-admin
- + Merge pull request #14789: jewel: rgw: swift: disable revocation thread if sleep 0 || cache_size 0
- |\
- | + rgw: swift: disable revocation thread if sleep 0 || cache_size 0
- + Merge pull request #14791: jewel: osd: promote throttle parameters are reversed
- + Merge pull request #14812: jewel: tests: upgrade tests failing with AssertionError: failed to complete snap trimming before timeout
- |\
- | + tests: double snap trimming timeout
- + Merge pull request #14815: jewel: rgw: failure to create s3 type subuser from admin rest api
- + rgw: add suport for creating S3 type subuser of admin rest api
Updated by Nathan Cutler almost 7 years ago
rgw suite on wip-jewel-backports-rgw-fixes branch¶
86 pass, 10 fail (96 total) http://pulpito.ceph.com/abhi-2017-06-02_14:29:52-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/
- all 10 failures are '/home/ubuntu/cephtest/archive/syslog/misc.log:2017-06-03T02:38:23.151386+00:00 smithi114 ceph-create-keys[73655]: INFO:ceph-create-keys:ceph-mon admin socket not ready yet. ' in syslog, i.e. no valgrind failures \o/ - the failure is tracked at #20171
- 7 FAILED http://pulpito.ceph.com/abhi-2017-06-06_08:38:40-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/ with the same environment error above
- 4 FAILED http://pulpito.ceph.com/abhi-2017-06-06_13:16:36-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/ (ditto)
- RUNNING http://pulpito.ceph.com/abhi-2017-06-07_08:35:51-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/
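The failure signature above can be confirmed by grepping the archived syslog. A minimal sketch, using a hypothetical local copy of the log (the path and contents here only mirror the excerpt quoted above):

```shell
# Hypothetical local copy of the archived syslog excerpt quoted above.
cat > /tmp/misc.log <<'EOF'
2017-06-03T02:38:23.151386+00:00 smithi114 ceph-create-keys[73655]: INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
EOF

# Count occurrences of the ceph-create-keys warning that marked all 10 failures.
grep -c 'admin socket not ready yet' /tmp/misc.log
```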
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/f77f547ca1680f7f8491c50ddac4b8d45f6882d0/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15842
- |\
- | + qa/suites/upgrade/hammer-x: set sortbitwise for jewel clusters
- + Pull request 15529
- |\
- | + osd: Move scrub sleep timer to osdservice
- | + osd: Implement asynchronous scrub sleep
- + Pull request 15472
- |\
- | + client: update the 'approaching max_size' code
- | + mds: limit client writable range increment
- + Pull request 15468
- |\
- | + osdc/Journaler: avoid executing on_safe contexts prematurely
- | + osdc/Journaler: make header write_pos align to boundary of flushed entry
- + Pull request 15438
- |\
- | + mds: issue new caps when sending reply to client
- + Pull request 15383
- |\
- | + cls/rgw: list_plain_entries() stops before bi_log entries
- + Pull request 15000
- |\
- | + pybind: fix cephfs.OSError initialization
- | + pybind: fix open flags calculation
- | + fs: normalize file open flags internally used by cephfs
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
rados¶
Using teuthology branch wip-20171 to avoid the silly regression http://tracker.ceph.com/issues/20171
teuthology-suite -k distro --priority 101 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20171
2 failed, 225 pass (227 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-22_11:36:26-rados-wip-jewel-backports-distro-basic-smithi/
- infrastructure noise ENOSPC (smithis have smaller disks and some of the tests can max them out) - see https://github.com/ceph/ceph/pull/15529#issuecomment-310476078
- new bug, covered in RGW #20392
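As an aside, the `--subset $(expr $RANDOM % 50)/50` argument above schedules only one of 50 equal slices of the rados suite; the slice index is just a modulus of bash's `$RANDOM`. A sketch of the arithmetic (the fallback value is only for shells without `$RANDOM`):

```shell
# Equivalent of `expr $RANDOM % 50` from the command above, written with
# POSIX arithmetic; $RANDOM is 0..32767 in bash, so the modulus picks a
# slice index in 0..49.
slice=$(( ${RANDOM:-12345} % 50 ))

# This is the value passed to teuthology-suite, e.g. "--subset 17/50".
echo "--subset ${slice}/50"
```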
Updated by Nathan Cutler almost 7 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 101 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler almost 7 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler almost 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20171
- several failures are because the sortbitwise flag is not being set upon upgrade to jewel; addressed by https://github.com/ceph/ceph/pull/15842 (the integration branch needs to be repopulated)
Re-running with https://github.com/ceph/ceph/pull/15842 included:
8 failed, 10 passed (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-22_11:47:33-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- two failures are because I missed a test in https://github.com/ceph/ceph/pull/15842 - fixed
- the remaining failures are all either the RGW swift.py issue (see RGW below) or because hammer does not have a libcephfs-java package for Xenial - maybe a case of http://tracker.ceph.com/issues/19681
Updated by Nathan Cutler almost 7 years ago
fs¶
teuthology-suite -k distro --priority 101 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20171
1 failed, 87 passed (88 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-22_11:51:56-fs-wip-jewel-backports-distro-basic-smithi/
- new bug http://tracker.ceph.com/issues/20412 "Test failure: test_remote_update_write (tasks.cephfs.test_quota.TestQuota)"
Re-running 1 failed job
fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-25_07:23:04-fs-wip-jewel-backports-distro-basic-smithi/ - #20412 again
Updated by Nathan Cutler almost 7 years ago
rgw¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2 --teuthology-branch wip-20171
- new bug #20392 (Most, if not all, of the failures are due to incompatible tests added to ceph/swift.git master and to the fact that we are using a single swift.py task for all Ceph versions.)
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/98045e76d74a57a5d859b4e2e742dc64722f70cb/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 15870
- |\
- | + tests: swift.py: tweak imports
- | + tests: swift.py: clone the ceph-jewel branch
- | + Merge branch 'master' of /home/smithfarm/src/ceph/upstream/teuthology into wip-swift-task-move-jewel
- | + tests: move swift.py task to qa/tasks
- | + swift: added --cluster to rgw-admin command for multisite support
- | + Pull request 470
- | |\
- | + \ Pull request 466
- | |\ \
- | | |/
- | + | Pull request 462
- | |\ \
- | | |/
- | |/|
- | + | Pull request 460
- | |\ \
- | | |/
- | |/|
- | + | swift: set full access to subusers creation
- | |/
- | + Remove most ceph-specific tasks. They are in ceph-qa-suite now.
- | + Revert Lines formerly of the form '(remote,) = ctx.cluster.only(role).remotes.keys()'
- | + Lines formerly of the form '(remote,) = ctx.cluster.only(role).remotes.keys()' and '(remote,) = ctx.cluster.only(role).remotes.iterkeys()' would fail with ValueError and no message if there were less than 0 or more than 1 key. Now a new function, get_single_remote_value() is called which prints out more understandable messages.
- | + Pull request 186
- | |\
- | + \ Pull request 188
- | |\ \
- | | |/
- | + | Pull request 192
- | |\ \
- | + \ \ Pull request 194
- | |\ \ \
- | | |/ /
- | + | | Pull request 193
- | |\ \ \
- | | + | | Add doc strings to Swift tests
- | | |/ /
- | + | | Pull request 187
- | |\ \ \
- | | |/ /
- | |/| /
- | | |/
- | + | Add docstrings to s3 related tasks.
- | |/
- | + Fix namespace collision
- | + Pull request 106
- | |\
- | | + Don't hardcode the git://ceph.com/git/ mirror
- | |/
- | + Pull request 78
- | |\
- | | + Helper scripts live in /usr/local/bin now!
- | |/
- | + s3tests: extend for multi-region tests
- | + Pull request 41
- | |\
- | + \ Pull request 40
- | |\ \
- | | |/
- | |/|
- | + | Fix some instances where print is being used instead of log
- | |/
- | + s3/swift tests: call radosgw-admin as the right client
- | + s3tests: clone correct branch
- | + Merge branch 'master' of github.com:ceph/teuthology
- | |\
- | + \ Merge remote-tracking branch 'origin/wip-sandon-vm'
- | |\ \
- | | |/
- | |/|
- | + | Merge branch 'wip-centos-rgw'
- | |\ \
- | | |/
- | |/|
- | | + s3tests: fix client configurations that aren't dictionaries
- | |/
- | + Pull request 15
- | |\
- | | + enable-coredump -> adjust-ulimits
- | |/
- | + Merge branch 'wip-teuth4768a-wusui'
- | |\
- | + \ Merge branch 'next'
- | |\ \
- | + | | s3tests: add force-branch with higher precdence than 'branch'
- | | |/
- | |/|
- | + | Merge remote branch 'origin/next'
- | |\ \
- | | |/
- | | + fix some errors found by pyflakes
- | | + s3tests: revert useless portion of 1c50db6a4630d07e72144dafd985c397f8a42dc5
- | | + rgw tests: remove users after each test
- | | + rgw tests: clean up immediately after the test
- | | + swift, s3readwrite: add missing yield
- | | + s3tests, s3readwrite, swift: cleanup explicitly
- | |/
- | + Merge remote-tracking branch 'origin/wip-3634'
- | |\
- | + \ Merge branch 'unstable'
- | |\ \
- | | + | Install ceph debs and use installed debs
- | |/ /
- | + | Replace /tmp/cephtest/ with configurable path
- | + | task/swift: change upstream repository url
- | + | Merge branch 'wip-mon-thrasher'
- | |\ \
- | + | | s3tests: fix typo
- | |/ /
- | + | rgw-logsocket: a task to verify opslog socket works
- | |/
- | + s3tests: run against arbitrary branch/sha1 of s3-tests.git
- | + pull s3-tests.git using git, not http
- | + ceph.newdream.net -> ceph.com
- | + Merge branch 'master' of github.com:ceph/teuthology
- | |\
- | | + github.com/NewDreamNetwork -> github.com/ceph
- | |/
- | + Add necessary imports for s3 tasks, and keep them alphabetical.
- | + rgw: access key uses url safe chars
- | + use local mirrors for (most) github urls
- | + Rename testrados and testswift tasks to not begin with test .
- | + testswift: fix config
- | + rgw: add swift task
- | + s3-tests: use radosgw-admin instead of radosgw_admin
- | + s3tests: Clone repository from github.
- | + Move orchestra to teuthology.orchestra so there's just one top-level package.
- | + Callers of task s3tests.create_users don't need to provide dummy fixtures dict.
- | + allow s3tests.create_users defaults be overridden
- | + Make targets a dictionary mapping hosts to ssh host keys.
- | + Skip s3-tests marked fails_on_rgw, they will fail anyway.
- | + The shell exits after the command, hence there is no need for pushd/popd.
- | + Add s3tests task.
- + Pull request 15842
- |\
- | + qa/suites/upgrade/hammer-x: set sortbitwise for jewel clusters
- + Pull request 15468
- |\
- | + osdc/Journaler: avoid executing on_safe contexts prematurely
- | + osdc/Journaler: make header write_pos align to boundary of flushed entry
- + Pull request 15438
- |\
- | + mds: issue new caps when sending reply to client
- + Pull request 15383
- |\
- | + cls/rgw: list_plain_entries() stops before bi_log entries
- + Pull request 15000
- |\
- | + pybind: fix cephfs.OSError initialization
- | + pybind: fix open flags calculation
- | + fs: normalize file open flags internally used by cephfs
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
rgw¶
Partial run to verify fix is viable:
teuthology-suite -k distro --priority 101 --rerun smithfarm-2017-06-22_11:53:25-rgw-wip-jewel-backports-distro-basic-smithi --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20392
Full rgw run:
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2 --teuthology-branch wip-20392
1 fail, 95 pass (96 total) http://pulpito.front.sepia.ceph.com/smithfarm-2017-06-25_16:38:58-rgw-wip-jewel-backports-distro-basic-smithi/
- failed job is with apache frontend, so not a high priority to fix
Re-running 1 failed job:
Updated by Nathan Cutler almost 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20392
3 failed, 15 passed (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-25_17:11:13-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/015dd1136459b15885142a76769efb360c945baf/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
upgrade/hammer-x¶
teuthology-suite -k distro --ceph wip-jewel-backports --rerun smithfarm-2017-06-25_17:11:13-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20392
run still includes the bad PR#14930, but the other two jobs are valid re-runs http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-26_20:18:43-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/9117553ee1ff17c305c86948ea6ae1d167f0cf92/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15936
- |\
- | + qa: enable quotas for pre-luminous quota tests
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 14930
- |\
- | + [SQUASH] drop the wait loop
- | + [SQUASH] fix the test so it fails on jewel
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
fs¶
teuthology-suite -k distro --priority 101 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --rerun smithfarm-2017-06-22_11:51:56-fs-wip-jewel-backports-distro-basic-smithi
Result reported in https://github.com/ceph/ceph/pull/15936 (merged)
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/ca7ab74ae7884f24983d94b729cc262108ff6aba/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20392
2 fail, 16 pass (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-27_15:24:41-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- both failed jobs appear to be http://tracker.ceph.com/issues/13381 (regression?)
Rerun on smithi:
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-30_09:08:18-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
- failure is "ceph-objectstore-tool: exp list-pgs failure with status 1"
Rerun on vps:
Ruled a pass
Updated by Nathan Cutler almost 7 years ago
rados¶
teuthology-suite -k distro --priority 101 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20392
7 fail, 220 pass (227 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-27_19:13:42-rados-wip-jewel-backports-distro-basic-smithi/
- five of the failures are infrastructure noise
- the sixth might be a new bug: http://tracker.ceph.com/issues/20449
- the seventh is ENOSPC, presumably because the smithis have smaller disks (so, infrastructure noise)
Re-run:
3 pass, 4 fail (7 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-28_10:10:16-rados-wip-jewel-backports-distro-basic-smithi/
- all four failures are ansible-related
Re-run:
Ruled a pass
Updated by Nathan Cutler almost 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Yuri Weinstein almost 7 years ago
QE VALIDATION (STARTED 7/1/17)¶
(Note: "PASSED / FAILED" indicates that the test is still in progress)
Re-run command lines and filters are captured in http://pad.ceph.com/p/hammer_v10.2.8_QE_validation_notes
Command line: CEPH_QA_MAIL="ceph-qa@ceph.com"; MACHINE_NAME=smithi; CEPH_BRANCH=jewel; SHA1=53a3be7261cfeb12445fbdba8238eefa40ed09f5 ; teuthology-suite -v --ceph-repo https://github.com/ceph/ceph.git --suite-repo https://github.com/ceph/ceph.git -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -s rados --subset 35/50 -k distro -p 100 -e $CEPH_QA_MAIL --suite-branch jewel --dry-run
Re-run command line: teuthology-suite -v -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -r $RERUN --suite-repo https://github.com/ceph/ceph.git --ceph-repo https://github.com/ceph/ceph.git --suite-branch jewel -p 90 -R fail,dead,running
| Suite | Runs/Reruns | Notes/Issues |
| rgw | http://pulpito.ceph.com/yuriw-2017-06-30_22:56:30-rgw-jewel-distro-basic-smithi/ | PASSED; see one "saw valgrind issues" |
| | http://pulpito.ceph.com/yuriw-2017-07-01_04:05:30-rgw-jewel-distro-basic-smithi/ | passed on rerun |
| rbd | http://pulpito.ceph.com/yuriw-2017-06-30_22:59:10-rbd-jewel-distro-basic-smithi/ | PASSED |
| krbd | http://pulpito.ceph.com/yuriw-2017-07-01_04:10:21-krbd-jewel-testing-basic-smithi/ | FAILED; approved by Ilya |
| | http://pulpito.front.sepia.ceph.com:80/yuriw-2017-07-03_15:58:06-krbd-jewel-testing-basic-smithi/ | rerun per Ilya |
| kcephfs | http://pulpito.ceph.com/yuriw-2017-07-01_04:11:25-kcephfs-jewel-testing-basic-smithi/ | PASSED |
| knfs | http://pulpito.ceph.com/yuriw-2017-07-01_04:12:02-knfs-jewel-testing-basic-smithi/ | PASSED |
| rest | http://pulpito.ceph.com/yuriw-2017-07-01_14:54:12-rest-jewel-distro-basic-smithi/ | PASSED |
| hadoop | http://pulpito.ceph.com/yuriw-2017-07-01_14:54:48-hadoop-jewel-distro-basic-smithi/ | FAILED #19456; EXCLUDED FROM THIS RELEASE |
| samba | | EXCLUDED FROM THIS RELEASE |
| ceph-deploy | http://pulpito.ceph.com/yuriw-2017-07-01_14:55:48-ceph-deploy-jewel-distro-basic-vps/ | PASSED |
| | http://pulpito.ceph.com/yuriw-2017-07-03_23:00:14-ceph-deploy-jewel-distro-basic-vps/ | |
| ceph-disk | http://pulpito.ceph.com/yuriw-2017-07-01_14:56:06-ceph-disk-jewel-distro-basic-vps/ | PASSED |
| upgrade/hammer-x (jewel) | http://pulpito.ceph.com/yuriw-2017-07-01_14:57:44-upgrade:hammer-x-jewel-distro-basic-vps/ | PASSED |
| powercycle | http://pulpito.ceph.com/yuriw-2017-07-01_04:12:42-powercycle-jewel-testing-basic-smithi/ | PASSED |
| ceph-ansible | http://pulpito.ceph.com/yuriw-2017-07-01_15:11:28-ceph-ansible-jewel-distro-basic-vps/ | PASSED |
| PASSED / FAILED | | |
Updated by Nathan Cutler almost 7 years ago
- Description updated (diff)
Updated release SHA1 to 66dbf9beef04988dbd3653591e51afa6d84e3990
Updated by Nathan Cutler almost 7 years ago
- Status changed from In Progress to Resolved