Tasks #19538
Status: Closed
Target version: jewel v10.2.8
% Done: 0%
Description
Workflow
- Preparing the release
- Nathan patches upgrade/jewel-x/point-to-point-x to do 10.2.0 -> current Jewel point release -> x - SKIPPED
- Cutting the release
- Loic asks Abhishek L. if a point release should be published - YES
- Loic gets approval from all leads
- Yehuda, rgw - YES
- John, CephFS - YES
- Jason, RBD - YES
- Josh, rados - YES
- Abhishek L. writes and commits the release notes - https://github.com/ceph/ceph/pull/16274 (merged)
- Nathan informs Yuri that the branch is ready for testing - DONE June 30th on ceph-devel ML
- Yuri runs additional integration tests - DONE
- If Yuri discovers new bugs that need to be backported urgently (i.e. their priority is set to Urgent or Immediate), the release goes back to being prepared; it was not ready after all
- Yuri informs Alfredo that the branch is ready for release - DONE
- Alfredo creates the packages and sets the release tag - DONE
- Abhishek L. posts release announcement on https://ceph.com/community/blog - DONE
- Abhishek L. sends release announcement to community mailing lists
- Abhishek L. informs Patrick M. about the release so he can tweet about it
Release information
- branch to build from: jewel, commit: 66dbf9beef04988dbd3653591e51afa6d84e3990
- version: v10.2.8
- type of release: point release
- where to publish the release: http://download.ceph.com/debian-jewel and http://download.ceph.com/rpm-jewel
Updated by Nathan Cutler about 7 years ago
- Target version changed from 536 to v10.2.8
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13107
- |\
- | + librbd: improve debug logging for lock / watch state machines
- | + test: use librados API to retrieve config params
- + Pull request 13154
- |\
- | + librbd: possible deadlock with flush if refresh in-progress
- + Pull request 13212
- |\
- | + test/osd: add test for fast mark down functionality
- | + msg/async: implement ECONNREFUSED detection
- | + messages/MOSDFailure.h: distinguish between timeout and immediate failure
- | + OSD: Implement ms_handle_refused
- | + msg/simple: add ms_handle_refused callback
- | + AsyncConnection: fix delay state using dispatch_queue
- | + AsyncConnection: need to prepare message when features mismatch
- | + AsyncConnection: continue to read when meeting EINTR
- | + AsyncConnection: release dispatch throttle with fast dispatch message
- | + DispatchQueue: remove pipe words
- | + DispatchQueue: add name to separte different instance
- | + AsyncConnection: add DispathQueue throttle
- | + AsyncConnection: change all exception deliver to DispatchQueue
- | + AsyncConnection: make local message deliver via DispatchQueue
- | + AsyncMessenger: introduce DispatchQueue to separate nonfast message
- | + DispatchQueue: move dispatch_throtter from SimpleMessenger to DispatchQueue
- | + DispatchQueue: Move from msg/simple to msg
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 13244
- |\
- | + osdc: cache should ignore error bhs during trim
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 13261
- |\
- | + mon/OSDMonitor: make 'osd crush move ...' work on osds
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13477
- |\
- | + ceph-osd: --flush-journal: sporadic segfaults on exit
- + Pull request 13489
- |\
- | + ceph-disk: Fix getting wrong group name when --setgroup in bluestore
- + Pull request 13492
- |\
- | + systemd: Start OSDs after MONs
- + Pull request 13541
- |\
- | + osd/PG: restrict want_acting to up+acting on recovery completion
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13552
- |\
- | + rgw: assume obj write is a first write
- + Pull request 13585
- |\
- | + msg/simple: set close on exec on server sockets
- | + msg/async: set close on exec on server sockets
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13732
- |\
- | + PendingReleaseNotes: warning about 'osd rm ...' and #19119
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13786
- |\
- | + build/ops: add psmisc dependency to ceph-base
- + Pull request 13787
- |\
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 13788
- |\
- | + os/filestore/HashIndex: be loud about splits
- + Pull request 13809
- |\
- | + librbd: remove image header lock assertions
- + Pull request 13827
- |\
- | + osd/ReplicatedPG: try with pool's use-gmt setting if hitset archive not found
- + Pull request 13831
- |\
- | + server: negative error code when responding to client
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 13932
- |\
- | + rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly
- + Pull request 14044
- |\
- | + os/filestore: fix clang static check warn use-after-free
- + Pull request 14047
- |\
- | + jewel: osd/PGLog: reindex properly on pg log split
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14070
- |\
- | + Revert dummy: reduce run time, run user.yaml playbook
- + Pull request 14083
- |\
- | + doc: update description of rbdmap unmap[-all] behaviour
- | + doc: add verbiage to rbdmap manpage
- | + rbdmap: unmap RBDMAPFILE images unless called with unmap-all
- + Pull request 14112
- |\
- | + brag: count the number of mds in fsmap not in mdsmap
- | + brag: Assume there are 0 MDS instead of crashing when data is missing
- + Pull request 14113
- |\
- | + tools/rados: Check return value of connect
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14148
- |\
- | + rbd: destination pool should be source pool if it is not specified
- + Pull request 14150
- |\
- | + librbd: avoid possible recursive lock when racing acquire lock
- + Pull request 14152
- |\
- | + librbd: Include WorkQueue.h since we use it
- + Pull request 14154
- |\
- | + qa/workunits/rbd: resolve potential rbd-mirror race conditions
- + Pull request 14181
- |\
- | + osd: bypass readonly ops when osd full.
- + Pull request 14236
- |\
- | + mon: remove bad rocksdb option
- + Pull request 14324
- |\
- | + common: fix segfault in public IPv6 addr picking
- + Pull request 14325
- |\
- | + osd: Calculate degraded and misplaced more accurately
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 14329
- |\
- | + ceph-disk: Adding retry loop in get_partition_dev()
- | + ceph-disk: Reporting /sys directory in get_partition_dev()
- + Pull request 14368
- |\
- | + jewel: rgw: fix listing of objects that start with underscore
- + Pull request 14371
- |\
- | + qa/tasks/workunit.py: use overrides as the default settings of workunit
- | + tasks/workunit.py: specify the branch name when cloning a branch
- | + tasks/workunit.py: when cloning, use --depth=1
- + Pull request 14377
- + rgw_file: fix missing unlock in unlink
- + rgw_file: implement reliable has-children check
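For readability, here is an approximate Python re-implementation of the perl one-liner that generated the list above. It is a sketch for reference only (the actual list was produced by the perl filter); `textile_line` is a hypothetical helper name.

```python
import re

def textile_line(line: str) -> str:
    """Convert one line of `git log --format='%H %s' --graph` output into a
    Textile bullet, mirroring the perl filter: merge commits become links to
    their pull request, other commits become links to their SHA."""
    line = line.replace('"', ' ')                     # s/"/ /g
    merge = re.search(r'\w+\s+Merge pull request #(\d+)', line)
    if merge:
        pr = merge.group(1)
        line = re.sub(r'\w+\s+Merge pull request #\d+.*',
                      lambda _m: f'"Pull request {pr}":https://github.com/ceph/ceph/pull/{pr}',
                      line)
    else:
        # First \w+ run is the SHA; the graph characters (| \ *) are not \w.
        line = re.sub(r'(\w+)\s+(.*)',
                      lambda m: f'"{m.group(2)}":https://github.com/ceph/ceph/commit/{m.group(1)}',
                      line, count=1)
    line = line.replace('*', '+', 1)                  # s/\*/+/
    return '* ' + line                                # s/^/* /
```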
Updated by Nathan Cutler about 7 years ago
- Status changed from New to In Progress
Updated by Nathan Cutler about 7 years ago
rados
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail (5 fail, 222 passed, 227 total) http://pulpito.ceph.com:80/smithfarm-2017-04-07_07:52:35-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 5 failed jobs:
- 3 pass, 2 fail http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:21:35-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 2 failed jobs 5 times each:
- 9 fail, 1 pass http://pulpito.ceph.com:80/smithfarm-2017-04-09_20:57:12-rados-wip-jewel-backports-distro-basic-smithi/
Problematic jobs are:
- fails every time:
rados/verify/{1thrash/default.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}
- fails almost every time:
rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}
Of these, Josh thinks only the async messenger-related valgrind issues are a real issue - these might be caused by https://github.com/ceph/ceph/pull/13585 or https://github.com/ceph/ceph/pull/13212
Updated by Nathan Cutler about 7 years ago
powercycle
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 1000 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade jewel point-to-point-x
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 1000 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- fail (2 dead, 15 pass, out of 17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-07_08:06:45-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Re-running the 2 dead jobs:
- 1 pass, 1 fail http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:19:31-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- failure is http://tracker.ceph.com/issues/19556
Re-running the last remaining failed job 5 times:
- 1 fail, 1 dead, 3 pass http://pulpito.ceph.com:80/smithfarm-2017-04-09_17:36:23-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Pushed https://github.com/ceph/ceph-ci/commit/d49d11e714020220e49949c591b0743538212beb to fix http://tracker.ceph.com/issues/19556
Ruled a pass
Updated by Nathan Cutler about 7 years ago
ceph-disk
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
fs
teuthology-suite -k distro --priority 1000 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail (2 failed, 85 passed, 87 total) http://pulpito.ceph.com/smithfarm-2017-04-07_08:09:44-fs-wip-jewel-backports-distro-basic-smithi/
- both failures are btrfs
Re-running the 2 failed jobs:
- fail (1 fail, 1 pass, 2 total) http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:23:36-fs-wip-jewel-backports-distro-basic-smithi/
fs/basic/{clusters/fixed-2-ucephfs.yaml debug/mds_client.yaml dirfrag/frag_enable.yaml fs/btrfs.yaml inline/yes.yaml overrides/whitelist_wrongly_marked_down.yaml tasks/cfuse_workunit_suites_ffsb.yaml}
- assert
2017-04-09T13:40:44.797 INFO:tasks.ceph.osd.3.smithi059.stderr:os/filestore/FileStore.cc: In function 'void FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f65124fc700 time 2017-04-09 13:40:44.804790
2017-04-09T13:40:44.797 INFO:tasks.ceph.osd.3.smithi059.stderr:os/filestore/FileStore.cc: 2920: FAILED assert(0 == "unexpected error")
Re-running the failed job 6 times:
- 1 fail http://pulpito.ceph.com:80/smithfarm-2017-04-09_14:14:03-fs-wip-jewel-backports-distro-basic-smithi/
- 3 fail, 2 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-04-09_14:15:42-fs-wip-jewel-backports-distro-basic-smithi/
All the same error, i.e.:
2017-04-09T14:29:48.837 INFO:tasks.ceph.osd.3.smithi063.stderr:os/filestore/FileStore.cc: In function 'void FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7fcd5a7f8700 time 2017-04-09 14:29:48.854235
2017-04-09T14:29:48.837 INFO:tasks.ceph.osd.3.smithi063.stderr:os/filestore/FileStore.cc: 2920: FAILED assert(0 == "unexpected error")
So fs is ruled a pass, but there are no fs backports staged (correction: there is one).
Updated by Nathan Cutler about 7 years ago
rgw
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-07_09:30:57-rgw-wip-jewel-backports-distro-basic-smithi/
- massive failure (50+ failed jobs), due mostly (if not entirely) to this:
2017-04-09T00:20:37.988 INFO:teuthology.orchestra.run.smithi167.stderr:======================================================================
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:FAIL: s3tests.functional.test_s3.test_versioning_obj_suspend_versions
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:----------------------------------------------------------------------
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:Traceback (most recent call last):
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/virtualenv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:    self.test(*self.arg)
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 6385, in test_versioning_obj_suspend_versions
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:    overwrite_suspended_versioning_obj(bucket, objname, k, c, 'null content 2')
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 6243, in overwrite_suspended_versioning_obj
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:    check_obj_versions(bucket, objname, k, c)
2017-04-09T00:20:37.989 INFO:teuthology.orchestra.run.smithi167.stderr:  File "/home/ubuntu/cephtest/s3-tests/s3tests/functional/test_s3.py", line 6080, in check_obj_versions
2017-04-09T00:20:37.990 INFO:teuthology.orchestra.run.smithi167.stderr:    eq(keys[i].version_id or 'null', key.version_id)
2017-04-09T00:20:37.990 INFO:teuthology.orchestra.run.smithi167.stderr:AssertionError: u'yGSvpxjEbJRBE.JL76y4OzeJISqDtmJ' != u'null'
Updated by Nathan Cutler about 7 years ago
rbd
teuthology-suite -k distro --priority 1000 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- fail (1 dead, 253 passed, 254 total) http://pulpito.ceph.com:80/smithfarm-2017-04-07_09:35:09-rbd-wip-jewel-backports-distro-basic-smithi/
description: rbd/thrash/{base/install.yaml clusters/{fixed-2.yaml openstack.yaml} fs/xfs.yaml msgr-failures/few.yaml thrashers/cache.yaml workloads/rbd_fsx_nbd.yaml}
- assert in ceph-objectstore-tool
2017-04-09T01:48:49.750 INFO:teuthology.orchestra.run.smithi051:Running: 'sudo adjust-ulimits ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --journal-path /var/lib/ceph/osd/ceph-2/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op export --pgid 0.7 --file /home/ubuntu/cephtest/ceph.data/exp.0.7.2'
2017-04-09T01:48:49.850 INFO:teuthology.orchestra.run.smithi051.stderr:osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f052ae89980 time 2017-04-09 01:48:49.853858
2017-04-09T01:48:49.850 INFO:teuthology.orchestra.run.smithi051.stderr:osd/PG.cc: 2967: FAILED assert(values.size() == 2)
Re-running 1 dead job:
- dead http://pulpito.ceph.com:80/smithfarm-2017-04-09_13:27:00-rbd-wip-jewel-backports-distro-basic-smithi/
- same error as before
Ruled a pass by Jason
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13107
- |\
- | + librbd: improve debug logging for lock / watch state machines
- | + test: use librados API to retrieve config params
- + Pull request 13154
- |\
- | + librbd: possible deadlock with flush if refresh in-progress
- + Pull request 13212
- |\
- | + test/osd: add test for fast mark down functionality
- | + msg/async: implement ECONNREFUSED detection
- | + messages/MOSDFailure.h: distinguish between timeout and immediate failure
- | + OSD: Implement ms_handle_refused
- | + msg/simple: add ms_handle_refused callback
- | + AsyncConnection: fix delay state using dispatch_queue
- | + AsyncConnection: need to prepare message when features mismatch
- | + AsyncConnection: continue to read when meeting EINTR
- | + AsyncConnection: release dispatch throttle with fast dispatch message
- | + DispatchQueue: remove pipe words
- | + DispatchQueue: add name to separte different instance
- | + AsyncConnection: add DispathQueue throttle
- | + AsyncConnection: change all exception deliver to DispatchQueue
- | + AsyncConnection: make local message deliver via DispatchQueue
- | + AsyncMessenger: introduce DispatchQueue to separate nonfast message
- | + DispatchQueue: move dispatch_throtter from SimpleMessenger to DispatchQueue
- | + DispatchQueue: Move from msg/simple to msg
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 13244
- |\
- | + osdc: cache should ignore error bhs during trim
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 13261
- |\
- | + mon/OSDMonitor: make 'osd crush move ...' work on osds
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13477
- |\
- | + ceph-osd: --flush-journal: sporadic segfaults on exit
- + Pull request 13489
- |\
- | + ceph-disk: Fix getting wrong group name when --setgroup in bluestore
- + Pull request 13492
- |\
- | + systemd: Start OSDs after MONs
- + Pull request 13541
- |\
- | + osd/PG: restrict want_acting to up+acting on recovery completion
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13552
- |\
- | + rgw: assume obj write is a first write
- + Pull request 13585
- |\
- | + msg/simple: set close on exec on server sockets
- | + msg/async: set close on exec on server sockets
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13732
- |\
- | + PendingReleaseNotes: warning about 'osd rm ...' and #19119
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13786
- |\
- | + build/ops: add psmisc dependency to ceph-base
- + Pull request 13787
- |\
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 13788
- |\
- | + os/filestore/HashIndex: be loud about splits
- + Pull request 13809
- |\
- | + librbd: remove image header lock assertions
- + Pull request 13827
- |\
- | + osd/ReplicatedPG: try with pool's use-gmt setting if hitset archive not found
- + Pull request 13831
- |\
- | + server: negative error code when responding to client
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 13885
- |\
- | + qa/tasks/ceph_manager: use new luminous set-full-ratio etc
- | + qa/tasks/thrashosds: chance_thrash_cluster_full
- | + osdc/Objecter: resend RWORDERED ops on full
- + Pull request 13932
- |\
- | + rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly
- + Pull request 14044
- |\
- | + os/filestore: fix clang static check warn use-after-free
- + Pull request 14047
- |\
- | + jewel: osd/PGLog: reindex properly on pg log split
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14070
- |\
- | + Revert dummy: reduce run time, run user.yaml playbook
- + Pull request 14083
- |\
- | + doc: update description of rbdmap unmap[-all] behaviour
- | + doc: add verbiage to rbdmap manpage
- | + rbdmap: unmap RBDMAPFILE images unless called with unmap-all
- + Pull request 14112
- |\
- | + brag: count the number of mds in fsmap not in mdsmap
- | + brag: Assume there are 0 MDS instead of crashing when data is missing
- + Pull request 14113
- |\
- | + tools/rados: Check return value of connect
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14148
- |\
- | + rbd: destination pool should be source pool if it is not specified
- + Pull request 14150
- |\
- | + librbd: avoid possible recursive lock when racing acquire lock
- + Pull request 14152
- |\
- | + librbd: Include WorkQueue.h since we use it
- + Pull request 14154
- |\
- | + qa/workunits/rbd: resolve potential rbd-mirror race conditions
- + Pull request 14181
- |\
- | + osd: bypass readonly ops when osd full.
- + Pull request 14236
- |\
- | + mon: remove bad rocksdb option
- + Pull request 14324
- |\
- | + common: fix segfault in public IPv6 addr picking
- + Pull request 14325
- |\
- | + osd: Calculate degraded and misplaced more accurately
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 14329
- |\
- | + ceph-disk: Adding retry loop in get_partition_dev()
- | + ceph-disk: Reporting /sys directory in get_partition_dev()
- + Pull request 14371
- |\
- | + qa/tasks/workunit.py: use overrides as the default settings of workunit
- | + tasks/workunit.py: specify the branch name when cloning a branch
- | + tasks/workunit.py: when cloning, use --depth=1
- + Pull request 14377
- |\
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- + Pull request 14383
- |\
- | + debian: replace SysV rbdmap with systemd service
- + Pull request 14416
- + tests: Thrasher: handle OSD has the store locked gracefully
Updated by Nathan Cutler about 7 years ago
rgw
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- run killed in favor of newer build http://pulpito.ceph.com:80/smithfarm-2017-04-09_20:58:27-rgw-wip-jewel-backports-distro-basic-smithi/
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- massive failure http://pulpito.ceph.com:80/smithfarm-2017-04-09_21:01:21-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- timeout in rados/test.sh
- Command failed on vpm099 with status 22: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph osd set-full-ratio .001'
Updated by Nathan Cutler about 7 years ago
upgrade/client-upgrade
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 2 fail, 12 pass, 14 total http://pulpito.ceph.com:80/smithfarm-2017-04-09_21:02:46-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- one failure is http://tracker.ceph.com/issues/18089 and can be ignored
- other failure is in
upgrade:client-upgrade/infernalis-client-x/basic/{0-cluster/start.yaml 1-install/infernalis-client-x.yaml 2-workload/rbd_api_tests.yaml distros/centos_7.2.yaml}
Re-running the 1 problematic job:
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-10_13:30:15-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- failure in TestLibRBD.UpdateFeatures - opened http://tracker.ceph.com/issues/19567 to track, but it is not a blocker
Ruled a pass
Updated by Nathan Cutler about 7 years ago
upgrade/client-upgrade
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 2 fail, 1 dead, 11 pass (14 total) http://pulpito.ceph.com:80/smithfarm-2017-04-10_18:48:08-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
Re-running the 2 failed jobs and 1 dead job:
- 2 fail, 1 pass http://pulpito.ceph.com:80/smithfarm-2017-04-10_19:45:29-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- one failure - upgrade:client-upgrade/firefly-client-x/basic/{0-cluster/start.yaml 1-install/firefly-client-x.yaml 2-workload/rbd_cli_import_export.yaml distros/centos_7.2.yaml} - is http://tracker.ceph.com/issues/19571 and is not expected to pass
- second failure - upgrade:client-upgrade/infernalis-client-x/basic/{0-cluster/start.yaml 1-install/infernalis-client-x.yaml 2-workload/rbd_api_tests.yaml distros/centos_7.2.yaml} - is http://tracker.ceph.com/issues/19567 which is not a blocker
Ruled a pass
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 9 failed, 1 dead, 7 passed http://pulpito.ceph.com:80/smithfarm-2017-04-10_18:49:29-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- timeout in rados/test.sh
Re-running the 9 failed jobs and 1 dead job:
- 5 failed, 1 dead, 4 passed http://pulpito.ceph.com:80/smithfarm-2017-04-12_10:18:53-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- timeout in rados/test.sh
Updated by Nathan Cutler about 7 years ago
rgw
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
Updated by Nathan Cutler about 7 years ago
Made an integration branch "wip-jewel-backports-rgw" consisting only of jewel PRs labeled "rgw". Will try to reproduce the s3tests.functional.test_s3.test_versioning_obj_suspend_versions failure on it.
The hypothesis is that there is a single problematic PR and it carries the label "rgw".
Reproducer: teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports-rgw --machine-type smithi --filter="rgw/verify/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml msgr-failures/few.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml validater/lockdep.yaml}"
Updated by Nathan Cutler about 7 years ago
wip-jewel-backports-rgw
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports-rgw | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13552
- |\
- | + rgw: assume obj write is a first write
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14368
- |\
- | + jewel: rgw: fix listing of objects that start with underscore
- + Pull request 14377
- + rgw_file: fix missing unlock in unlink
- + rgw_file: implement reliable has-children check
Updated by Nathan Cutler about 7 years ago
RGW bisect¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports-rgw --machine-type smithi --filter="rgw/verify/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml msgr-failures/few.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml validater/lockdep.yaml}"
Bug reproduced; hypothesis confirmed!
Re-running with a new integration branch containing just PRs:
- 13865
- 13863
- 13842
- 13837
- 13834
- 13833
- 13779
- 13724
- 13552
Bug reproduced!
Re-running with a new integration branch containing just PRs:
- 13834
- 13833
- 13779
- 13724
- 13552
Bug reproduced!
Re-running with subset:
- 13779
- 13724
- 13552
Bug reproduced!
Last test was run manually with the conclusion that PR#13552 is to blame. The test branch is https://github.com/ceph/ceph-ci/commits/wip-jewel-backports-rgw (contains just v10.2.7 plus this one PR).
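The manual narrowing above is essentially a halving search over the candidate PR list. A minimal sketch of the procedure, with a hypothetical run_test standing in for "build an integration branch from these PRs and run the reproducer" (here it simply fails whenever 13552 is in the set; the sketch assumes exactly one bad PR):

```shell
# Stand-in for a teuthology run against an integration branch built
# from the given PRs; fails iff the (known-bad) PR 13552 is included.
run_test() {
    case " $* " in
        *" 13552 "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Halve the PR list until one PR remains.
bisect() {
    set -- $1                       # split the PR list into positional args
    while [ $# -gt 1 ]; do
        half=$(( $# / 2 ))
        first=$(echo "$@" | cut -d' ' -f1-$half)
        if run_test $first; then
            # first half passes: the culprit is in the second half
            set -- $(echo "$@" | cut -d' ' -f$(( half + 1 ))-)
        else
            set -- $first
        fi
    done
    echo "$1"
}

bisect "13865 13863 13842 13837 13834 13833 13779 13724 13552"
# prints: 13552
```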
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14195
- |\
- | + rgw: use separate http_manager for read_sync_status
- | + rgw: pass cr registry to managers
- | + rgw: use separate cr manager for read_sync_status
- | + rgw: change read_sync_status interface
- | + rgw: don't ignore ENOENT in RGWRemoteDataLog::read_sync_status()
- + Pull request 14368
- |\
- | + jewel: rgw: fix listing of objects that start with underscore
- + Pull request 14377
- |\
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- + Pull request 14383
- |\
- | + debian: replace SysV rbdmap with systemd service
- + Pull request 14449
- + tests: fix oversight in yaml comment
Updated by Nathan Cutler about 7 years ago
assert no massive rgw failure¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --filter="rgw/verify/{clusters/fixed-2.yaml frontend/apache.yaml fs/btrfs.yaml msgr-failures/few.yaml overrides.yaml rgw_pool_type/ec-cache.yaml tasks/rgw_s3tests.yaml validater/lockdep.yaml}"
Bisect result verified.
Updated by Nathan Cutler about 7 years ago
rgw¶
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 3 fail, 189 pass (192 total) http://pulpito.ceph.com:80/smithfarm-2017-04-13_13:49:51-rgw-wip-jewel-backports-distro-basic-smithi/
Re-running 3 failed jobs:
Updated by Nathan Cutler about 7 years ago
assert no async messenger leak¶
teuthology-suite -k distro --priority 101 --suite rados --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --filter="rados/verify/{1thrash/default.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}"
Confirmed that the leak is (most likely) caused by https://github.com/ceph/ceph/pull/13212
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 2 fail, 110 pass (112 total) http://pulpito.ceph.com:80/smithfarm-2017-04-13_13:54:30-rados-wip-jewel-backports-distro-basic-smithi/
- Command failed on smithi001 with status 6: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}
- Command failed on smithi171 with status 11: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph pg scrub 1.0'
- Command failed on smithi001 with status 6: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
Note that the 2 failed jobs also failed in the earlier rados run - #note-4 above
Re-running 2 failed jobs:
- same failure in
rados/singleton-nomsgr/{all/lfn-upgrade-hammer.yaml rados.yaml}
2017-04-14T19:24:04.696 INFO:teuthology.orchestra.run.smithi161:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph pg scrub 1.0'
2017-04-14T19:24:04.838 INFO:teuthology.orchestra.run.smithi161.stderr:Error EAGAIN: pg 1.0 primary osd.1 not up
- same failure in
rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}
The test first installs infernalis (a new build from the tip of the infernalis branch - see #18089) and then upgrades all but one OSD to wip-jewel-backports. Then it runs the "ec_lost_unfound" task, at which point we see:
2017-04-14T19:25:42.516 INFO:teuthology.orchestra.run.smithi132:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
...
2017-04-14T19:25:42.756 INFO:teuthology.orchestra.run.smithi132.stderr:Error ENXIO: problem getting command descriptions from osd.0
Test if failure is reproducible on wip-v10.2.7:
teuthology-suite -k distro --verbose --suite rados --priority 101 --email ncutler@suse.com --ceph wip-v10.2.7 --machine-type smithi --filter="rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}"
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-14_19:46:35-rados-wip-v10.2.7-distro-basic-smithi/
Test if failure is reproducible on wip-v10.2.7:
teuthology-suite -k distro --verbose --suite rados --priority 101 --email ncutler@suse.com --ceph wip-v10.2.7 --machine-type smithi --filter="rados/singleton-nomsgr/{all/lfn-upgrade-hammer.yaml rados.yaml}"
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-14_20:18:59-rados-wip-v10.2.7-distro-basic-smithi/
Test if failure is reproducible on jewel:
teuthology-suite -k distro --verbose --suite rados --ceph jewel --ceph-repo https://github.com/ceph/ceph --suite-repo https://github.com/ceph/ceph --machine-type vps --priority 101 --email ncutler@suse.com --filter="rados/singleton-nomsgr/{all/lfn-upgrade-hammer.yaml rados.yaml}"
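The --subset $(expr $RANDOM % 2000)/2000 argument used for the rados runs above schedules only one of 2000 slices of the suite, chosen pseudo-randomly on each invocation. A small sketch of the slice selection (the PID fallback is an addition for shells without bash's $RANDOM):

```shell
# Pick a slice index N for teuthology's --subset N/2000 argument.
# $RANDOM (bash) is 0..32767; fall back to the PID where unavailable.
# Note: expr exits non-zero when the arithmetic result is 0, hence || true.
i=$(expr ${RANDOM:-$$} % 2000 || true)
echo "--subset $i/2000"
```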
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 5 failed, 7 pass (12 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-04-13_10:51:25-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- reproducible timeout in rados/test.sh
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 13 pass (14 total) http://pulpito.ceph.com:80/smithfarm-2017-04-13_10:52:09-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/19571 (firefly install fails because qemu-kvm caused installation of newer librados2 from distro) http://pulpito.ceph.com/smithfarm-2017-04-13_10:52:09-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/1019795
Ruled a pass
Updated by Nathan Cutler about 7 years ago
bisect regression in jewel¶
It appears we somehow managed to merge a PR that introduced a regression.
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph jewel --ceph-repo https://github.com/ceph/ceph --suite-repo https://github.com/ceph/ceph --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-13_13:59:35-upgrade:hammer-x-jewel-distro-basic-vps/
Running same test on smithi:
- running http://pulpito.ceph.com:80/smithfarm-2017-04-17_11:23:20-upgrade:hammer-x-jewel-distro-basic-smithi/
Re-running 5 times on VPS:
- running http://pulpito.ceph.com:80/smithfarm-2017-04-17_11:24:40-upgrade:hammer-x-jewel-distro-basic-vps/
The next step is to run the reproducer on v10.2.7 to assert it is free of the regression. Assuming the test passes on v10.2.7, we will have to bisect :-(
Pushed wip-v10.2.7 (v10.2.7+PR#14371) to Shaman
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
Preparing first bisect round - here are the PRs merged since v10.2.7:
$ git log --merges --oneline --no-color v10.2.7..HEAD
e31a540 Merge pull request #13834 from smithfarm/wip-18969-jewel
7c36d16 Merge pull request #13833 from smithfarm/wip-18908-jewel
0e3aa2c Merge pull request #13214 from ovh/bp-osd-updateable-throttles-jewel
8d5a5dd Merge pull request #14326 from shinobu-x/wip-15025-jewel
091aaa2 Merge pull request #13874 from smithfarm/wip-19171-jewel
3f2e4cd Merge pull request #13492 from shinobu-x/wip-18516-jewel
ea0bc6c Merge pull request #13254 from shinobu-x/wip-14609-jewel
845972f Merge pull request #13489 from shinobu-x/wip-18955-jewel
a3deef9 Merge pull request #14070 from smithfarm/wip-19339-jewel
702edb5 Merge pull request #14329 from smithfarm/wip-19493-jewel
f509ccc Merge pull request #14427 from smithfarm/wip-19140-jewel
c8c4bff Merge pull request #14324 from shinobu-x/wip-19371-jewel
349baea Merge pull request #14112 from shinobu-x/wip-19192-jewel
dd466b7 Merge pull request #14150 from smithfarm/wip-18823-jewel
b8f8bd0 Merge pull request #14152 from smithfarm/wip-18893-jewel
222916a Merge pull request #14154 from smithfarm/wip-18948-jewel
49f84b1 Merge pull request #14148 from smithfarm/wip-18778-jewel
2a232d4 Merge pull request #14083 from smithfarm/wip-19357-jewel
413ac58 Merge pull request #13154 from smithfarm/wip-18496-jewel
23d595b Merge pull request #13244 from smithfarm/wip-18775-jewel
4add6f5 Merge pull request #13809 from asheplyakov/18321-bp-jewel
37ab19c Merge pull request #13107 from smithfarm/wip-18669-jewel
f7c04e3 Merge pull request #13585 from asheplyakov/jewel-bp-16585
d2909bd Merge pull request #14371 from tchaikov/wip-19429-jewel
cd74860 Merge pull request #14325 from shinobu-x/wip-18619-jewel
1a20c12 Merge pull request #14236 from smithfarm/wip-19392-jewel
4838c4d Merge pull request #14181 from mslovy/wip-19394-jewel
e26b703 Merge pull request #14113 from shinobu-x/wip-19319-jewel
389150b Merge pull request #14047 from asheplyakov/reindex-on-pg-split
a8b1008 Merge pull request #14044 from mslovy/wip-19311-jewel
32ed9b7 Merge pull request #13932 from asheplyakov/18911-bp-jewel
6705e91 Merge pull request #13831 from jan--f/wip-19206-jewel
3d21a00 Merge pull request #13827 from tchaikov/wip-19185-jewel
8a6d643 Merge pull request #13788 from shinobu-x/wip-18235-jewel
f96392a Merge pull request #13786 from shinobu-x/wip-19129-jewel
8fe6ffc Merge pull request #13732 from liewegas/wip-19119-jewel
6f589a1 Merge pull request #13541 from shinobu-x/wip-18929-jewel
b8f2d35 Merge pull request #13477 from asheplyakov/jewel-bp-18951
40d1443 Merge pull request #13261 from shinobu-x/wip-18587-jewel
A total of 40 PRs excluding 14371 (which must be included in any case); grabbing the first 20 (starting from the bottom of the list, which is in reverse chronological order) for the bisect branch. Populating with the following script:
set -ex
reviewer='Nathan Cutler <ncutler@suse.com>'
milestone=jewel
base_branch=wip-v10.2.7
bisect_branch=${base_branch}-bisect
PRS="13261 13477 13541 13732 13786 13788 13827 13831 13932 14044 14047 14113 14181 14236 14325 13585 13107 13809 13244 13154 14083"
git checkout $milestone
git fetch ceph
git branch -D $bisect_branch || :
git checkout -b $bisect_branch ceph/$base_branch
git reset --hard ceph/$base_branch
for pr in $PRS ; do
    eval title=$(curl --silent https://api.github.com/repos/ceph/ceph/pulls/$pr?access_token=$github_token | jq .title)
    echo "PR $pr $title"
    git --no-pager log --oneline ceph/pull/$pr/merge^1..ceph/pull/$pr/merge^2
    git --no-pager merge --no-ff -m "$(echo -e "Merge pull request #$pr: $title\n\nReviewed-by: $reviewer")" ceph/pull/$pr/head
done
git push --force ceph-ci $bisect_branch
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 14083
- |\
- | + doc: update description of rbdmap unmap[-all] behaviour
- | + doc: add verbiage to rbdmap manpage
- | + rbdmap: unmap RBDMAPFILE images unless called with unmap-all
- + Pull request 13154
- |\
- | + librbd: possible deadlock with flush if refresh in-progress
- + Pull request 13244
- |\
- | + osdc: cache should ignore error bhs during trim
- + Pull request 13809
- |\
- | + librbd: remove image header lock assertions
- + Pull request 13107
- |\
- | + librbd: improve debug logging for lock / watch state machines
- | + test: use librados API to retrieve config params
- + Pull request 13585
- |\
- | + msg/simple: set close on exec on server sockets
- | + msg/async: set close on exec on server sockets
- + Pull request 14325
- |\
- | + osd: Calculate degraded and misplaced more accurately
- + Pull request 14236
- |\
- | + mon: remove bad rocksdb option
- + Pull request 14181
- |\
- | + osd: bypass readonly ops when osd full.
- + Pull request 14113
- |\
- | + tools/rados: Check return value of connect
- + Pull request 14047
- |\
- | + jewel: osd/PGLog: reindex properly on pg log split
- + Pull request 14044
- |\
- | + os/filestore: fix clang static check warn use-after-free
- + Pull request 13932
- |\
- | + rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly
- + Pull request 13831
- |\
- | + server: negative error code when responding to client
- + Pull request 13827
- |\
- | + osd/ReplicatedPG: try with pool's use-gmt setting if hitset archive not found
- + Pull request 13788
- |\
- | + os/filestore/HashIndex: be loud about splits
- + Pull request 13786
- |\
- | + build/ops: add psmisc dependency to ceph-base
- + Pull request 13732
- |\
- | + PendingReleaseNotes: warning about 'osd rm ...' and #19119
- + Pull request 13541
- |\
- | + osd/PG: restrict want_acting to up+acting on recovery completion
- + Pull request 13477
- |\
- | + ceph-osd: --flush-journal: sporadic segfaults on exit
- + Pull request 13261
- + mon/OSDMonitor: make 'osd crush move ...' work on osds
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
And wip-v10.2.7 (the base branch) again for comparison:
- pass http://pulpito.ceph.com/smithfarm-2017-04-18_15:31:28-upgrade:hammer-x-wip-v10.2.7-distro-basic-vps/
Will continue the bisect in a new comment.
Updated by Nathan Cutler about 7 years ago
jewel regression bisect, round 2¶
In round 1 we prepared an integration branch consisting of v10.2.7 + PR#14371 (which is required in any case) + the first 21 PRs merged into jewel after the v10.2.7 release. Although logic would dictate that the regression is one of the following PRs:
e31a540 Merge pull request #13834 from smithfarm/wip-18969-jewel
7c36d16 Merge pull request #13833 from smithfarm/wip-18908-jewel
0e3aa2c Merge pull request #13214 from ovh/bp-osd-updateable-throttles-jewel
8d5a5dd Merge pull request #14326 from shinobu-x/wip-15025-jewel
091aaa2 Merge pull request #13874 from smithfarm/wip-19171-jewel
3f2e4cd Merge pull request #13492 from shinobu-x/wip-18516-jewel
ea0bc6c Merge pull request #13254 from shinobu-x/wip-14609-jewel
845972f Merge pull request #13489 from shinobu-x/wip-18955-jewel
a3deef9 Merge pull request #14070 from smithfarm/wip-19339-jewel
702edb5 Merge pull request #14329 from smithfarm/wip-19493-jewel
f509ccc Merge pull request #14427 from smithfarm/wip-19140-jewel
c8c4bff Merge pull request #14324 from shinobu-x/wip-19371-jewel
349baea Merge pull request #14112 from shinobu-x/wip-19192-jewel
dd466b7 Merge pull request #14150 from smithfarm/wip-18823-jewel
b8f8bd0 Merge pull request #14152 from smithfarm/wip-18893-jewel
222916a Merge pull request #14154 from smithfarm/wip-18948-jewel
49f84b1 Merge pull request #14148 from smithfarm/wip-18778-jewel
I would like to get a clear reproducer, so I prepared a wip-v10.2.7-bisect-2 branch consisting of v10.2.7 + PR#14371 + these PRs.
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect-2 | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13834
- |\
- | + rgw: change log level to 20 for 'System already converted' message
- + Pull request 13833
- |\
- | + rgw: the swift container acl should support field .ref
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 13874
- |\
- | + doc: rgw: make a note abt system users vs normal users
- + Pull request 13492
- |\
- | + systemd: Start OSDs after MONs
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 13489
- |\
- | + ceph-disk: Fix getting wrong group name when --setgroup in bluestore
- + Pull request 14070
- |\
- | + Revert dummy: reduce run time, run user.yaml playbook
- + Pull request 14329
- |\
- | + ceph-disk: Adding retry loop in get_partition_dev()
- | + ceph-disk: Reporting /sys directory in get_partition_dev()
- + Pull request 14427
- |\
- | + osdc/Objecter: resend RWORDERED ops on full
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 14324
- |\
- | + common: fix segfault in public IPv6 addr picking
- + Pull request 14112
- |\
- | + brag: count the number of mds in fsmap not in mdsmap
- | + brag: Assume there are 0 MDS instead of crashing when data is missing
- + Pull request 14150
- |\
- | + librbd: avoid possible recursive lock when racing acquire lock
- + Pull request 14152
- |\
- | + librbd: Include WorkQueue.h since we use it
- + Pull request 14154
- |\
- | + qa/workunits/rbd: resolve potential rbd-mirror race conditions
- + Pull request 14148
- + rbd: destination pool should be source pool if it is not specified
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect-2 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
builds OK https://shaman.ceph.com/builds/ceph/wip-v10.2.7-bisect-2/2b15bdd5a425e2d20a146af19ad06fda24adc2d2/
Examining the test yaml again, it seems strange that a test called "upgrade:hammer-x/f-h-x-offline" installs firefly and then upgrades directly to "x" (jewel in this case), with no intermediate hammer step. Opened http://tracker.ceph.com/issues/19687 to track.
Updated by Nathan Cutler about 7 years ago
jewel "regression" bisect, round 3¶
Pushed wip-v10.2.7-bisect-3
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect-3 | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 14326
- |\
- | + osd: don't share osdmap with objecter when preboot
- + Pull request 13254
- |\
- | + radosstriper : protect aio_write API from calls with 0 bytes
- + Pull request 14427
- |\
- | + osdc/Objecter: resend RWORDERED ops on full
- | + osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Pull request 14324
- + common: fix segfault in public IPv6 addr picking
https://shaman.ceph.com/builds/ceph/wip-v10.2.7-bisect-3/0dfc1333c5ff95624e8825bb4af339b67b2a1d1d/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect-3 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
jewel "regression" bisect, round 4¶
Pushed wip-v10.2.7-bisect-4
git --no-pager log --format='%H %s' --graph ceph-ci/wip-v10.2.7..wip-v10.2.7-bisect-4 | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13214
- |\
- | + OSD: allow client throttler to be adjusted on-fly, without restart
- + Pull request 14326
- + osd: don't share osdmap with objecter when preboot
https://shaman.ceph.com/builds/ceph/wip-v10.2.7-bisect-4/77358532ce0d07ae7afc317c304c2e255058aad0/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-bisect-4 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
Updated by Nathan Cutler about 7 years ago
jewel "regression" grand finale¶
Starting the grand finale by cherry-picking (not merging as before) the following PRs (one of which should be the cause of the "regression" according to the bisect results so far) on top of wip-v10.2.7:
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-10.2.7-13254 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- wip-10.2.7-13254 https://shaman.ceph.com/builds/ceph/wip-10.2.7-13254/772ba99ac653c7e21efb6d506ef418904f6fc1d4/
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-19_13:53:56-upgrade:hammer-x-wip-10.2.7-13254-distro-basic-vps/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-14427 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- wip-v10.2.7-14427 https://shaman.ceph.com/builds/ceph/wip-v10.2.7-14427/783344dc5f35f45f61ac33d5072d64f176e22791/
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-19_13:54:33-upgrade:hammer-x-wip-v10.2.7-14427-distro-basic-vps/
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-v10.2.7-14324 --machine-type vps --priority 101 --email ncutler@suse.com --filter="upgrade:hammer-x/f-h-x-offline/{0-install.yaml 1-pre.yaml 2-upgrade.yaml 3-jewel.yaml 4-after.yaml ubuntu_14.04.yaml}"
- wip-v10.2.7-14324 https://shaman.ceph.com/builds/ceph/wip-v10.2.7-14324/217da3d655e1c23c7ac461cc65f630cd74da2d79/
- pass http://pulpito.ceph.com:80/smithfarm-2017-04-19_13:55:00-upgrade:hammer-x-wip-v10.2.7-14324-distro-basic-vps/
CONCLUSION: https://github.com/ceph/ceph/pull/14427 would seem to be the culprit. Opened https://github.com/ceph/ceph/pull/14643 to revert it.
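Reverting a merged PR (as done in #14643) means reverting a merge commit, which requires -m 1 to designate the first (target-branch) parent as mainline. A self-contained demonstration in a throwaway repo (branch names, file contents, and the merge message are illustrative, not the actual Ceph history):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email tester@example.com
git config user.name tester
echo base > f && git add f && git commit -qm base
git checkout -qb pr
echo bad > f && git commit -qam "bad change"
git checkout -q -                             # back to the default branch
git merge -q --no-ff -m "Merge pull request #14427" pr
git revert -m 1 --no-edit HEAD >/dev/null     # -m 1 = keep the first parent
cat f                                         # prints: base
```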
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13507
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13606
- |\
- | + build/ops: rpm: move $CEPH_EXTRA_CONFIGURE_ARGS to right place
- | + build/ops: rpm: explicitly provide --with-ocf to configure
- | + rpm: build ceph-resource-agents by default
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13779
- |\
- | + rgw: metadata sync info should be shown at master zone of slave zonegroup
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 13863
- |\
- | + rgw: Fixes typo in rgw_admin.cc
- + Pull request 13865
- |\
- | + rgw: Correct the return codes for the health check feature Fixes: http://tracker.ceph.com/issues/19025 Signed-off-by: Pavan Rallabhandi <PRallabhandi@walmartlabs.com>
- + Pull request 13872
- |\
- | + rgw: Let the object stat command be shown in the usage
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14136
- |\
- | + rgw: skip conversion of zones without any zoneparams
- | + rgw: better debug information for upgrade
- | + rgw/rgw_rados.cc: prefer ++operator for non-primitive iterators
- + Pull request 14143
- |\
- | + rgw: use rgw_zone_root_pool for region_map like is done in hammer
- + Pull request 14195
- |\
- | + rgw: use separate http_manager for read_sync_status
- | + rgw: pass cr registry to managers
- | + rgw: use separate cr manager for read_sync_status
- | + rgw: change read_sync_status interface
- | + rgw: don't ignore ENOENT in RGWRemoteDataLog::read_sync_status()
- + Pull request 14204
- |\
- | + filestore, tools: Fix logging of DBObjectMap check() repairs
- | + osd: Simplify DBObjectMap by no longer creating complete tables
- | + ceph-osdomap-tool: Fix seg fault with large amount of check error output
- | + osd: Add automatic repair for DBObjectMap bug
- | + ceph-osdomap-tool: Fix tool exit status
- | + DBObjectMap: rewrite rm_keys and merge_new_complete
- | + DBObjectMap: strengthen in_complete_region post condition
- | + DBObjectMap: fix next_parent()
- | + test_object_map: add tests to trigger some bugs related to 18533
- | + test: Add ceph_test_object_map to make check tests
- | + ceph-osdomap-tool: Add --debug and only show internal logging if enabled
- | + osd: DBOjectMap::check: Dump complete mapping when inconsistency found
- | + test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop testing
- | + tools: Check for overlaps in internal complete table for DBObjectMap
- | + tools: Add dump-headers command to ceph-osdomap-tool
- | + tools: Add --oid option to ceph-osdomap-tool
- | + osd: Remove unnecessary assert and assignment in DBObjectMap
- + Pull request 14377
- |\
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- + Pull request 14383
- |\
- | + debian: replace SysV rbdmap with systemd service
- + Pull request 14416
- |\
- | + tests: Thrasher: handle OSD has the store locked gracefully
- + Pull request 14449
- |\
- | + tests: fix oversight in yaml comment
- + Pull request 14481
- |\
- | + librbd: is_exclusive_lock_owner API should ping OSD
- | + pybind: fix incorrect exception format strings
- + Pull request 14587
- |\
- | + mon/MonClient: make get_mon_log_message() atomic
- + Pull request 14602
- |\
- | + ceph-disk: enable directory backed OSD at boot time
- + Pull request 14605
- |\
- | + rgw: don't return skew time in pre-signed url
- + Pull request 14607
- |\
- | + rgw: fix for null version_id in fetch_remote_obj()
- | + rgw: version id doesn't work in fetch_remote_obj
- + Pull request 14626
- |\
- | + tests: upgrade:hammer-x/f-h-x-offline add missing hammer upgrade
- + Pull request 14635
- |\
- | + doc: mention --show-mappings in crushtool manpage
- + Pull request 14643
- + Revert osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- + Revert osdc/Objecter: resend RWORDERED ops on full
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 2000)/2000 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- builds OK https://shaman.ceph.com/builds/ceph/wip-jewel-backports/16a335388be2616c68e6895cba3eac7782ebcaa3/
- 4 fail, 108 pass (112 total) http://pulpito.ceph.com:80/smithfarm-2017-04-20_08:48:32-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 4 failed jobs:
- 2 fail, 2 pass http://pulpito.ceph.com:80/smithfarm-2017-04-21_05:45:14-rados-wip-jewel-backports-distro-basic-smithi/
- new bug http://tracker.ceph.com/issues/19737 Error EAGAIN: pg 1.0 primary osd.1 not up
- known bug http://tracker.ceph.com/issues/16239 Error ENXIO: problem getting command descriptions from osd.0 - opened https://github.com/ceph/ceph/pull/14710 to work around it
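The --subset $(expr $RANDOM % 2000)/2000 argument above schedules only a randomly chosen 1/2000th slice of the suite matrix; a minimal sketch of the selection (variable name illustrative, using shell arithmetic in place of expr):

```shell
# $RANDOM is a bash builtin yielding 0..32767, so taking it modulo
# 2000 picks a slice index in 0..1999; teuthology-suite then runs
# only that 1/2000th of the rados suite matrix.
slice=$(( ${RANDOM:-12345} % 2000 ))
echo "--subset ${slice}/2000"
```

Each scheduling run therefore covers a different random slice, which is why repeated runs of the same suite report different job counts.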
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 16 pass (17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-20_08:51:50-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/12973 - opened https://github.com/ceph/ceph/pull/14626 to fix the test
Ruled a pass
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 13 pass (14 total) http://pulpito.ceph.com:80/smithfarm-2017-04-20_08:53:29-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/19571 Command failed on vpm175 with status 1: "sudo yum -y install '' ceph-radosgw" - opened https://github.com/ceph/ceph/pull/14691 to "fix" (by dropping the CentOS version of the test)
Ruled a pass
Updated by Nathan Cutler about 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13507
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Pull request 13544
- |\
- | + auth: 'ceph auth import -i' overwrites caps, if caps are not specified in given keyring file, should alert user and should not allow this import. Because in 'ceph auth list' we keep all the keyrings with caps and importing 'client.admin' user keyring without caps locks the cluster with error1 because admin keyring caps are missing in 'ceph auth'.
- + Pull request 13608
- |\
- | + tests: Thrasher: eliminate a race between kill_osd and init
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13724
- |\
- | + rgw: Use decoded URI when verifying TempURL
- + Pull request 13837
- |\
- | + rgw: fix for broken yields in RGWMetaSyncShardCR
- | + rgw: kill a compile warning for rgw_sync
- + Pull request 13842
- |\
- | + rgw: don't init rgw_obj from rgw_obj_key when it's incorrect to do so
- + Pull request 14064
- |\
- | + rgw: delete_system_obj() fails on empty object name
- | + rgw: if user.email is empty, dont try to delete
- + Pull request 14066
- |\
- | + rgw: fix break inside of yield in RGWFetchAllMetaCR
- + Pull request 14195
- |\
- | + rgw: use separate http_manager for read_sync_status
- | + rgw: pass cr registry to managers
- | + rgw: use separate cr manager for read_sync_status
- | + rgw: change read_sync_status interface
- | + rgw: don't ignore ENOENT in RGWRemoteDataLog::read_sync_status()
- + Pull request 14204
- |\
- | + filestore, tools: Fix logging of DBObjectMap check() repairs
- | + osd: Simplify DBObjectMap by no longer creating complete tables
- | + ceph-osdomap-tool: Fix seg fault with large amount of check error output
- | + osd: Add automatic repair for DBObjectMap bug
- | + ceph-osdomap-tool: Fix tool exit status
- | + DBObjectMap: rewrite rm_keys and merge_new_complete
- | + DBObjectMap: strengthen in_complete_region post condition
- | + DBObjectMap: fix next_parent()
- | + test_object_map: add tests to trigger some bugs related to 18533
- | + test: Add ceph_test_object_map to make check tests
- | + ceph-osdomap-tool: Add --debug and only show internal logging if enabled
- | + osd: DBOjectMap::check: Dump complete mapping when inconsistency found
- | + test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop testing
- | + tools: Check for overlaps in internal complete table for DBObjectMap
- | + tools: Add dump-headers command to ceph-osdomap-tool
- | + tools: Add --oid option to ceph-osdomap-tool
- | + osd: Remove unnecessary assert and assignment in DBObjectMap
- + Pull request 14392
- |\
- | + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- + Pull request 14416
- |\
- | + tests: Thrasher: handle OSD has the store locked gracefully
- + Pull request 14481
- |\
- | + librbd: is_exclusive_lock_owner API should ping OSD
- | + pybind: fix incorrect exception format strings
- + Pull request 14587
- |\
- | + mon/MonClient: make get_mon_log_message() atomic
- + Pull request 14605
- |\
- | + rgw: don't return skew time in pre-signed url
- + Pull request 14607
- |\
- | + rgw: fix for null version_id in fetch_remote_obj()
- | + rgw: version id doesn't work in fetch_remote_obj
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14635
- |\
- | + doc: mention --show-mappings in crushtool manpage
- + Pull request 14643
- |\
- | + Revert osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag.
- | + Revert osdc/Objecter: resend RWORDERED ops on full
- + Pull request 14653
- |\
- | + rgw_file: remove unused rgw_key variable
- | + rgw_file: fix readdir after dirent-change
- | + rgw_file: don't expire directories being read
- | + rgw_file: rgw_readdir: return dot-dirs only when *offset is 0
- | + rgw_file: chunked readdir
- | + rgw_file: fix missing unlock in unlink
- | + rgw_file: implement reliable has-children check
- | + rgw_file: introduce rgw_lookup type hints
- + Pull request 14660
- |\
- | + radosgw-admin: use zone id when creating a zone
- | + qa: rgw task uses period instead of region-map
- | + rgw-admin: remove deprecated regionmap commands
- + Pull request 14661
- |\
- | + rgw: fix crash when listing objects via swift
- + Pull request 14663
- |\
- | + librbd: relax is parent mirrored check when enabling mirroring for pool
- + Pull request 14664
- |\
- | + rbd: prevent adding multiple mirror peers to a single pool
- + Pull request 14665
- |\
- | + test/librados_test_stub: fixed cls_cxx_map_get_keys/vals return value
- + Pull request 14666
- |\
- | + librbd: fix rbd_metadata_list and rbd_metadata_get
- + Pull request 14667
- |\
- | + client: fix the cross-quota rename boundary check conditions
- + Pull request 14668
- |\
- | + mds: don't purge strays when mds is in clientreplay state
- | + mds: skip fragment space check for replayed request
- + Pull request 14669
- |\
- | + tasks/cephfs: switch open vs. write in test_open_inode
- | + qa: fix race in Mount.open_background
- + Pull request 14670
- |\
- | + mds/StrayManager: aviod reusing deleted inode in StrayManager::_purge_stray_logged
- + Pull request 14671
- |\
- | + test/libcephfs: avoid buffer overflow when testing ceph_getdents()
- + Pull request 14672
- |\
- | + mds: reset heartbeat in export_remaining_imported_caps
- | + mds: heartbeat_reset in dispatch
- + Pull request 14674
- |\
- | + mon: fix hiding mdsmonitor informative strings
- + Pull request 14676
- |\
- | + tools/cephfs: set dir_layout when injecting inodes
- + Pull request 14677
- |\
- | + mds: make C_MDSInternalNoop::complete() delete 'this'
- + Pull request 14679
- |\
- | + cephfs: fix mount point break off problem after mds switch occured
- + Pull request 14682
- |\
- | + mds: ignore ENOENT on writing backtrace
- + Pull request 14683
- |\
- | + mds: shut down finisher before objecter
- + Pull request 14684
- |\
- | + cephfs: fix write_buf's _len overflow problem
- + Pull request 14685
- |\
- | + client: wait for lastest osdmap when handling set file/dir layout
- + Pull request 14686
- |\
- | + osd: Give requested scrub work a higher priority
- + Pull request 14691
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Pull request 14694
- |\
- | + use sudo to check check health
- | + Add reboot case for systemd test
- | + Fix distro's, point to latest version
- + Pull request 14698
- |\
- | + client/Client.cc: add feature to reconnect client after MDS reset
- | + doc: cephfs: fix the unexpected indent warning
- | + doc: additional edits in FUSE client config
- | + doc: Dirty data are not the same as corrupted data
- | + doc: minor changes in fuse client config reference
- | + doc: add client config ref
- + Pull request 14699
- |\
- | + mds: include advisory `path` field in damage
- | + mds: populate DamageTable from scrub and log more quietly
- | + mds/DamageTable: move classes to .cc file
- + Pull request 14700
- |\
- | + mds: validate prealloc_inos on sessions after load
- | + mds: operator<< for Session
- + Pull request 14710
- + tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- builds OK https://shaman.ceph.com/builds/ceph/wip-jewel-backports/20c565ede9f11aee034eb9c387fcc07939974a8f/
- 1 fail, 226 pass (227 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:26:17-rados-wip-jewel-backports-distro-basic-smithi/
- AttributeError: managers
--rerun
Updated by Nathan Cutler about 7 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 1000 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 16 pass (17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:31:29-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- [Errno 113] No route to host
Re-running 1 failed job:
Updated by Nathan Cutler about 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- fail http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:33:03-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- 1 dead job - SSH connection to vpm057 was lost: 'sudo apt-get -y install linux-image-generic' - possibly infrastructure noise
Re-running 1 dead job
Updated by Nathan Cutler about 7 years ago
fs¶
teuthology-suite -k distro --priority 1000 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 3 fail, 84 pass (87 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:33:43-fs-wip-jewel-backports-distro-basic-smithi/
- "cluster [WRN] Scrub error on inode" (warning in mds log)
- java.lang.NoClassDefFoundError: Could not initialize class com.ceph.fs.CephMount
--rerun
- 2 fail, 1 pass http://pulpito.ceph.com/smithfarm-2017-04-22_06:58:50-fs-wip-jewel-backports---basic-smithi/
- "cluster [WRN] Scrub error on inode" (warning in mds log)
Marked https://github.com/ceph/ceph/pull/14699 DNM for now.
Updated by Nathan Cutler about 7 years ago
rgw¶
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 1 fail, 191 pass (192 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:35:15-rgw-wip-jewel-backports-distro-basic-smithi/
- s3tests.functional.test_s3.test_versioned_concurrent_object_create_and_remove ... FAIL
--rerun
Updated by Nathan Cutler about 7 years ago
rbd¶
teuthology-suite -k distro --priority 1000 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
- 8 fail, 101 pass (109 total) http://pulpito.ceph.com:80/smithfarm-2017-04-21_15:39:27-rbd-wip-jewel-backports-distro-basic-smithi/
- TestLibRBD.Mirror
--rerun
- 8 fail http://pulpito.ceph.com/smithfarm-2017-04-22_07:07:13-rbd-wip-jewel-backports---basic-smithi/
- TestLibRBD.Mirror
Marked https://github.com/ceph/ceph/pull/14663 DNM - needs another run on repopulated integration branch.
Updated by Nathan Cutler about 7 years ago
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 13450
- |\
- | + msg/simple/Pipe: support IPv6 QoS.
- | + msg/simple: cleanups
- + Pull request 13507
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Pull request 13647
- |\
- | + os: make zero values noops for set_alloc_hint() in FileStore
- | + osd: preserve allocation hint attribute during recovery
- + Pull request 13884
- |\
- | + osd/OSDMap: don't set weight to IN when OSD is destroyed
- + Pull request 13887
- |\
- | + qa/suites/rados/thrash: add no-thrash item to matrix
- | + osd/osd_internal_types: wake snaptrimmer on put_read lock, too
- + Pull request 14204
- |\
- | + filestore, tools: Fix logging of DBObjectMap check() repairs
- | + osd: Simplify DBObjectMap by no longer creating complete tables
- | + ceph-osdomap-tool: Fix seg fault with large amount of check error output
- | + osd: Add automatic repair for DBObjectMap bug
- | + ceph-osdomap-tool: Fix tool exit status
- | + DBObjectMap: rewrite rm_keys and merge_new_complete
- | + DBObjectMap: strengthen in_complete_region post condition
- | + DBObjectMap: fix next_parent()
- | + test_object_map: add tests to trigger some bugs related to 18533
- | + test: Add ceph_test_object_map to make check tests
- | + ceph-osdomap-tool: Add --debug and only show internal logging if enabled
- | + osd: DBOjectMap::check: Dump complete mapping when inconsistency found
- | + test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop testing
- | + tools: Check for overlaps in internal complete table for DBObjectMap
- | + tools: Add dump-headers command to ceph-osdomap-tool
- | + tools: Add --oid option to ceph-osdomap-tool
- | + osd: Remove unnecessary assert and assignment in DBObjectMap
- + Pull request 14332
- |\
- | + osdc/Objecter: respect epoch barrier in _op_submit()
- + Pull request 14392
- |\
- | + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- + Pull request 14481
- |\
- | + librbd: is_exclusive_lock_owner API should ping OSD
- | + pybind: fix incorrect exception format strings
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14659
- |\
- | + rgw: add the remove-x-delete feature to cancel swift object expiration
- + Pull request 14661
- |\
- | + rgw: fix crash when listing objects via swift
- + Pull request 14664
- |\
- | + rbd: prevent adding multiple mirror peers to a single pool
- + Pull request 14666
- |\
- | + librbd: fix rbd_metadata_list and rbd_metadata_get
- + Pull request 14667
- |\
- | + client: fix the cross-quota rename boundary check conditions
- + Pull request 14668
- |\
- | + mds: don't purge strays when mds is in clientreplay state
- | + mds: skip fragment space check for replayed request
- + Pull request 14669
- |\
- | + tasks/cephfs: switch open vs. write in test_open_inode
- | + qa: fix race in Mount.open_background
- + Pull request 14670
- |\
- | + mds/StrayManager: aviod reusing deleted inode in StrayManager::_purge_stray_logged
- + Pull request 14671
- |\
- | + test/libcephfs: avoid buffer overflow when testing ceph_getdents()
- + Pull request 14672
- |\
- | + mds: reset heartbeat in export_remaining_imported_caps
- | + mds: heartbeat_reset in dispatch
- + Pull request 14674
- |\
- | + mon: fix hiding mdsmonitor informative strings
- + Pull request 14676
- |\
- | + tools/cephfs: set dir_layout when injecting inodes
- + Pull request 14677
- |\
- | + mds: make C_MDSInternalNoop::complete() delete 'this'
- + Pull request 14679
- |\
- | + cephfs: fix mount point break off problem after mds switch occured
- + Pull request 14682
- |\
- | + mds: ignore ENOENT on writing backtrace
- + Pull request 14683
- |\
- | + mds: shut down finisher before objecter
- + Pull request 14684
- |\
- | + cephfs: fix write_buf's _len overflow problem
- + Pull request 14685
- |\
- | + client: wait for lastest osdmap when handling set file/dir layout
- + Pull request 14691
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Pull request 14694
- |\
- | + use sudo to check check health
- | + Add reboot case for systemd test
- | + Fix distro's, point to latest version
- + Pull request 14698
- |\
- | + client/Client.cc: add feature to reconnect client after MDS reset
- | + doc: cephfs: fix the unexpected indent warning
- | + doc: additional edits in FUSE client config
- | + doc: Dirty data are not the same as corrupted data
- | + doc: minor changes in fuse client config reference
- | + doc: add client config ref
- + Pull request 14700
- |\
- | + mds: validate prealloc_inos on sessions after load
- | + mds: operator<< for Session
- + Pull request 14710
- |\
- | + tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
- + Pull request 14752
- |\
- | + rgw: data sync skips slo data when syncing the manifest object
- | + rgw: RGWGetObj applies skip_manifest flag to SLO
- | + rgw: allow system users to read SLO parts
- + Pull request 14763
- |\
- | + ceph_test_librados_api_misc: fix stupid LibRadosMiscConnectFailure.ConnectFailure test
- + Pull request 14765
- |\
- | + ceph-disk: dmcrypt activate must use the same cluster as prepare
- + Pull request 14766
- |\
- | + rgw: fix failed to create bucket if a non-master zonegroup has a single zone
- + Pull request 14787
- |\
- | + rgw: add bucket size limit check to radosgw-admin
- + Pull request 14789
- |\
- | + rgw: swift: disable revocation thread if sleep 0 || cache_size 0
- + Pull request 14791
- |\
- | + Fix reveresed promote throttle default parameters.
- + Pull request 14812
- |\
- | + tests: double snap trimming timeout
- + Pull request 14815
- + rgw: add suport for creating S3 type subuser of admin rest api
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/572fb344af805709327f270fcf8743bc62ef4b3d
Updated by Nathan Cutler about 7 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 2 fail, 257 pass (259 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:18:43-rados-wip-jewel-backports-distro-basic-smithi/
- "Command failed on smithi066 with status 1: '/home/ubuntu/cephtest/s3-tests/virtualenv/bin/s3tests-test-readwrite'" NOT REPRODUCED
--rerun
- 1 fail, 1 pass http://pulpito.ceph.com/smithfarm-2017-04-27_16:56:17-rados-wip-jewel-backports---basic-smithi/
- known bug http://tracker.ceph.com/issues/19737 Command failed on smithi168 with status 11: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph pg scrub 1.0'
Running the failed job 4 more times:
- 2 pass, 2 fail http://pulpito.ceph.com:80/smithfarm-2017-04-27_17:35:57-rados-wip-jewel-backports-distro-basic-smithi/
Ruled a pass
Updated by Nathan Cutler about 7 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 1000 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 fail, 3 dead, 13 pass (17 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:23:47-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler about 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler about 7 years ago
upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
- 1 dead, 12 pass (13 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:25:39-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- dead job looks like infrastructure noise
--rerun
Updated by Nathan Cutler about 7 years ago
fs¶
teuthology-suite -k distro --priority 1000 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
- 3 fail, 84 pass (87 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:26:37-fs-wip-jewel-backports-distro-basic-smithi/
--rerun
- 1 fail, 2 pass (3 total) http://pulpito.ceph.com/smithfarm-2017-04-27_13:37:36-fs-wip-jewel-backports---basic-smithi/
- "java.lang.NoClassDefFoundError: Could not initialize class com.ceph.fs.CephMount" in libcephfs-java workunit leads to Command failed (workunit test libcephfs-java/test.sh) on smithi138 with status 1
Updated by Nathan Cutler about 7 years ago
rgw¶
teuthology-suite -k distro --priority 1000 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
- 10 fail, 86 pass (96 total) http://pulpito.ceph.com:80/smithfarm-2017-04-27_11:29:14-rgw-wip-jewel-backports-distro-basic-smithi/
- saw valgrind issues (9 failures)
- foo (1 failure) NOT REPRODUCED
--rerun
- fail http://pulpito.ceph.com/smithfarm-2017-04-27_16:58:41-rgw-wip-jewel-backports---basic-smithi/
- saw valgrind issues
Updated by Nathan Cutler about 7 years ago
rbd¶
teuthology-suite -k distro --priority 1000 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
Updated by Nathan Cutler about 7 years ago
rgw suite for ragweed support¶
Special request by Yehuda
build: https://shaman.ceph.com/builds/ceph/wip-rgw-support-ragweed-jewel/
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-rgw-support-ragweed-jewel --machine-type smithi --subset $(expr $RANDOM % 2)/2
- 1 fail, rest pass http://pulpito.ceph.com:80/smithfarm-2017-05-03_10:05:41-rgw-wip-rgw-support-ragweed-jewel-distro-basic-smithi/
- multiregion valgrind
Updated by Abhishek Lekshmanan almost 7 years ago
Added https://github.com/ceph/ceph/pull/15208 at Sage's request:
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email abhishek@suse.com --ceph wip-jewel-backports-mon-sortbitwise --machine-type smithi
Updated by Abhishek Lekshmanan almost 7 years ago
Added an integration branch containing only the RGW memleak fix PRs (plus the rados PR above, which was already merged in jewel):
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports-rgw-fixes | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull-?request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Merge pull request #15312
- |\
- | + rgw: rest conn functions cleanup, only append zonegroup if not empty
- | + rgw: rest and http client code to use param vectors
- + Merge pull request #15382
- |\
- | + rgw:fix memory leaks
- + Merge pull request #13450: jewel: msg: IPv6 Heartbeat packets are not marked with DSCP QoS - simple messenger
- + Merge pull request #13507: jewel: Disallow enabling 'hashpspool' option to a pool without some kind of --i-understand-this-will-remap-all-pgs flag
- |\
- | + osd/Pool: Disallow enabling 'hashpspool' option to a pool without '--yes-i-really-mean-it'
- + Merge pull request #13647: jewel: osd: preserve allocation hint attribute during recovery
- + Merge pull request #13884: jewel: pre-jewel osd rm incrementals are misinterpreted
- + Merge pull request #13887: jewel: snap trim blocked behind ec read, never woken, on kraken-x upgrade
- |\
- | + qa/suites/rados/thrash: add no-thrash item to matrix
- | + osd/osd_internal_types: wake snaptrimmer on put_read lock, too
- + Merge pull request #14204: jewel: core: two instances of omap_digest mismatch
- + Merge pull request #14332: jewel: Objecter::epoch_barrier isn't respected in _op_submit()
- + Merge pull request #14392: jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- |\
- | + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
- + Merge pull request #14481: jewel: librbd: is_exclusive_lock_owner API should ping OSD
- + Merge pull request #14626: tests: upgrade:hammer-x/f-h-x-offline: 'failed to encode ...' warnings are normal on upgrades
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Merge pull request #14659: jewel: rgw: add the remove-x-delete feature to cancel swift object expiration
- |\
- | + rgw: add the remove-x-delete feature to cancel swift object expiration
- + Merge pull request #14661: jewel: rgw: unsafe access in RGWListBucket_ObjStore_SWIFT::send_response()
- |\
- | + rgw: fix crash when listing objects via swift
- + Merge pull request #14664: jewel: [api] temporarily restrict (rbd_)mirror_peer_add from adding multiple peers
- + Merge pull request #14666: jewel: librbd: Issues with C API image metadata retrieval functions
- + Merge pull request #14667: jewel: client: fix the cross-quota rename boundary check conditions
- + Merge pull request #14668: jewel: mds: fragment space check can cause replayed request fail
- |\
- | + mds: don't purge strays when mds is in clientreplay state
- | + mds: skip fragment space check for replayed request
- + Merge pull request #14669: jewel: cephfs: Test failure: test_open_inode
- |\
- | + tasks/cephfs: switch open vs. write in test_open_inode
- | + qa: fix race in Mount.open_background
- + Merge pull request #14670: jewel: mds: avoid reusing deleted inode in StrayManager::_purge_stray_logged
- |\
- | + mds/StrayManager: aviod reusing deleted inode in StrayManager::_purge_stray_logged
- + Merge pull request #14671: jewel: tests: buffer overflow in test LibCephFS.DirLs
- |\
- | + test/libcephfs: avoid buffer overflow when testing ceph_getdents()
- + Merge pull request #14672: jewel: MDS heartbeat timeout during rejoin, when working with large amount of caps/inodes
- |\
- | + mds: reset heartbeat in export_remaining_imported_caps
- | + mds: heartbeat_reset in dispatch
- + Merge pull request #14674: jewel: cephfs: No output for ceph mds rmfailed 0 --yes-i-really-mean-it command
- |\
- | + mon: fix hiding mdsmonitor informative strings
- + Merge pull request #14676: jewel: cephfs: MDS server crashes due to inconsistent metadata.
- |\
- | + tools/cephfs: set dir_layout when injecting inodes
- + Merge pull request #14677: jewel: mds: C_MDSInternalNoop::complete doesn't free itself
- |\
- | + mds: make C_MDSInternalNoop::complete() delete 'this'
- + Merge pull request #14679: jewel: cephfs: The mount point break off when mds switch hanppened.
- |\
- | + cephfs: fix mount point break off problem after mds switch occured
- + Merge pull request #14682: jewel: cephfs: MDS goes readonly writing backtrace for a file whose data pool has been removed
- |\
- | + mds: ignore ENOENT on writing backtrace
- + Merge pull request #14683: jewel: cephfs: MDS assert failed when shutting down
- |\
- | + mds: shut down finisher before objecter
- + Merge pull request #14684: jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file
- |\
- | + cephfs: fix write_buf's _len overflow problem
- + Merge pull request #14685: jewel: cephfs: Test failure: test_data_isolated
- |\
- | + client: wait for lastest osdmap when handling set file/dir layout
- + Merge pull request #14691: tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Merge pull request #14694: [backport] qa/tasks: systemd test backport to jewel
- + Merge pull request #14698: jewel: cephfs: ceph-fuse does not recover after lost connection to MDS
- |\
- | + client/Client.cc: add feature to reconnect client after MDS reset
- | + doc: cephfs: fix the unexpected indent warning
- | + doc: additional edits in FUSE client config
- | + doc: Dirty data are not the same as corrupted data
- | + doc: minor changes in fuse client config reference
- | + doc: add client config ref
- + Merge pull request #14700: jewel: mds: enable start when session ino info is corrupt
- |\
- | + mds: validate prealloc_inos on sessions after load
- | + mds: operator<< for Session
- + Merge pull request #14710: tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
- |\
- | + tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart
- + Merge pull request #14752: jewel: rgw: allow system users to read SLO parts
- |\
- | + rgw: data sync skips slo data when syncing the manifest object
- | + rgw: RGWGetObj applies skip_manifest flag to SLO
- | + rgw: allow system users to read SLO parts
- + Merge pull request #14763: jewel: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure
- + Merge pull request #14765: jewel: ceph-disk does not support cluster names different than 'ceph'
- + Merge pull request #14766: jewel: rgw: fix failed to create bucket if a non-master zonegroup has a single zone
- |\
- | + rgw: fix failed to create bucket if a non-master zonegroup has a single zone
- + Merge pull request #14787: jewel: rgw: add bucket size limit check to radosgw-admin
- |\
- | + rgw: add bucket size limit check to radosgw-admin
- + Merge pull request #14789: jewel: rgw: swift: disable revocation thread if sleep 0 || cache_size 0
- |\
- | + rgw: swift: disable revocation thread if sleep 0 || cache_size 0
- + Merge pull request #14791: jewel: osd: promote throttle parameters are reversed
- + Merge pull request #14812: jewel: tests: upgrade tests failing with AssertionError: failed to complete snap trimming before timeout
- |\
- | + tests: double snap trimming timeout
- + Merge pull request #14815: jewel: rgw: failure to create s3 type subuser from admin rest api
- + rgw: add suport for creating S3 type subuser of admin rest api
Updated by Nathan Cutler almost 7 years ago
rgw suite on wip-jewel-backports-rgw-fixes branch¶
86 pass, 10 fail (96 total) http://pulpito.ceph.com/abhi-2017-06-02_14:29:52-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/
- all 10 failures are '/home/ubuntu/cephtest/archive/syslog/misc.log:2017-06-03T02:38:23.151386+00:00 smithi114 ceph-create-keys[73655]: INFO:ceph-create-keys:ceph-mon admin socket not ready yet. ' in syslog, i.e. no valgrind failures \o/ - the failure is tracked at #20171
- 7 FAILED http://pulpito.ceph.com/abhi-2017-06-06_08:38:40-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/ with the same environment error above
- 4 FAILED http://pulpito.ceph.com/abhi-2017-06-06_13:16:36-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/ (ditto)
- RUNNING http://pulpito.ceph.com/abhi-2017-06-07_08:35:51-rgw-wip-jewel-backports-rgw-fixes-distro-basic-smithi/
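The failure signature above can be confirmed by grepping the archived syslog. A minimal sketch, using a hypothetical local copy of the log (the path and contents here only mirror the excerpt quoted above):

```shell
# Hypothetical local copy of the archived syslog excerpt quoted above.
cat > /tmp/misc.log <<'EOF'
2017-06-03T02:38:23.151386+00:00 smithi114 ceph-create-keys[73655]: INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
EOF

# Count occurrences of the ceph-create-keys warning that marked all 10 failures.
grep -c 'admin socket not ready yet' /tmp/misc.log
```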
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/f77f547ca1680f7f8491c50ddac4b8d45f6882d0/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15842
- |\
- | + qa/suites/upgrade/hammer-x: set sortbitwise for jewel clusters
- + Pull request 15529
- |\
- | + osd: Move scrub sleep timer to osdservice
- | + osd: Implement asynchronous scrub sleep
- + Pull request 15472
- |\
- | + client: update the 'approaching max_size' code
- | + mds: limit client writable range increment
- + Pull request 15468
- |\
- | + osdc/Journaler: avoid executing on_safe contexts prematurely
- | + osdc/Journaler: make header write_pos align to boundary of flushed entry
- + Pull request 15438
- |\
- | + mds: issue new caps when sending reply to client
- + Pull request 15383
- |\
- | + cls/rgw: list_plain_entries() stops before bi_log entries
- + Pull request 15000
- |\
- | + pybind: fix cephfs.OSError initialization
- | + pybind: fix open flags calculation
- | + fs: normalize file open flags internally used by cephfs
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
rados¶
Using teuthology branch wip-20171 to avoid the silly regression http://tracker.ceph.com/issues/20171
teuthology-suite -k distro --priority 101 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20171
2 failed, 225 pass (227 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-22_11:36:26-rados-wip-jewel-backports-distro-basic-smithi/
- infrastructure noise ENOSPC (smithis have smaller disks and some of the tests can max them out) - see https://github.com/ceph/ceph/pull/15529#issuecomment-310476078
- new bug, covered in RGW #20392
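As an aside, the `--subset $(expr $RANDOM % 50)/50` argument above schedules only one of 50 equal slices of the rados suite; the slice index is just a modulus of bash's `$RANDOM`. A sketch of the arithmetic (the fallback value is only for shells without `$RANDOM`):

```shell
# Equivalent of `expr $RANDOM % 50` from the command above, written with
# POSIX arithmetic; $RANDOM is 0..32767 in bash, so the modulus picks a
# slice index in 0..49.
slice=$(( ${RANDOM:-12345} % 50 ))

# This is the value passed to teuthology-suite, e.g. "--subset 17/50".
echo "--subset ${slice}/50"
```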
Updated by Nathan Cutler almost 7 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 101 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler almost 7 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler almost 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20171
- several failures are because the sortbitwise flag is not being set upon upgrade to jewel; addressed by https://github.com/ceph/ceph/pull/15842 (the integration branch needs to be repopulated)
Re-running with https://github.com/ceph/ceph/pull/15842 included:
8 failed, 10 passed (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-22_11:47:33-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- two failures are because I missed a test in https://github.com/ceph/ceph/pull/15842 - fixed
- the remaining failures are all either the RGW swift.py issue (see RGW below) or because hammer does not have a libcephfs-java package for Xenial - maybe a case of http://tracker.ceph.com/issues/19681
Updated by Nathan Cutler almost 7 years ago
fs¶
teuthology-suite -k distro --priority 101 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20171
1 failed, 87 passed (88 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-22_11:51:56-fs-wip-jewel-backports-distro-basic-smithi/
- new bug http://tracker.ceph.com/issues/20412 "Test failure: test_remote_update_write (tasks.cephfs.test_quota.TestQuota)"
Re-running 1 failed job
fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-25_07:23:04-fs-wip-jewel-backports-distro-basic-smithi/ - #20412 again
Updated by Nathan Cutler almost 7 years ago
rgw¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2 --teuthology-branch wip-20171
- new bug #20392 (Most, if not all, of the failures are due to incompatible tests added to ceph/swift.git master and to the fact that we are using a single swift.py task for all Ceph versions.)
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/98045e76d74a57a5d859b4e2e742dc64722f70cb/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 15870
- |\
- | + tests: swift.py: tweak imports
- | + tests: swift.py: clone the ceph-jewel branch
- | + Merge branch 'master' of /home/smithfarm/src/ceph/upstream/teuthology into wip-swift-task-move-jewel
- | + tests: move swift.py task to qa/tasks
- | + swift: added --cluster to rgw-admin command for multisite support
- | + Pull request 470
- | |\
- | + \ Pull request 466
- | |\ \
- | | |/
- | + | Pull request 462
- | |\ \
- | | |/
- | |/|
- | + | Pull request 460
- | |\ \
- | | |/
- | |/|
- | + | swift: set full access to subusers creation
- | |/
- | + Remove most ceph-specific tasks. They are in ceph-qa-suite now.
- | + Revert Lines formerly of the form '(remote,) = ctx.cluster.only(role).remotes.keys()'
- | + Lines formerly of the form '(remote,) = ctx.cluster.only(role).remotes.keys()' and '(remote,) = ctx.cluster.only(role).remotes.iterkeys()' would fail with ValueError and no message if there were less than 0 or more than 1 key. Now a new function, get_single_remote_value() is called which prints out more understandable messages.
- | + Pull request 186
- | |\
- | + \ Pull request 188
- | |\ \
- | | |/
- | + | Pull request 192
- | |\ \
- | + \ \ Pull request 194
- | |\ \ \
- | | |/ /
- | + | | Pull request 193
- | |\ \ \
- | | + | | Add doc strings to Swift tests
- | | |/ /
- | + | | Pull request 187
- | |\ \ \
- | | |/ /
- | |/| /
- | | |/
- | + | Add docstrings to s3 related tasks.
- | |/
- | + Fix namespace collision
- | + Pull request 106
- | |\
- | | + Don't hardcode the git://ceph.com/git/ mirror
- | |/
- | + Pull request 78
- | |\
- | | + Helper scripts live in /usr/local/bin now!
- | |/
- | + s3tests: extend for multi-region tests
- | + Pull request 41
- | |\
- | + \ Pull request 40
- | |\ \
- | | |/
- | |/|
- | + | Fix some instances where print is being used instead of log
- | |/
- | + s3/swift tests: call radosgw-admin as the right client
- | + s3tests: clone correct branch
- | + Merge branch 'master' of github.com:ceph/teuthology
- | |\
- | + \ Merge remote-tracking branch 'origin/wip-sandon-vm'
- | |\ \
- | | |/
- | |/|
- | + | Merge branch 'wip-centos-rgw'
- | |\ \
- | | |/
- | |/|
- | | + s3tests: fix client configurations that aren't dictionaries
- | |/
- | + Pull request 15
- | |\
- | | + enable-coredump -> adjust-ulimits
- | |/
- | + Merge branch 'wip-teuth4768a-wusui'
- | |\
- | + \ Merge branch 'next'
- | |\ \
- | + | | s3tests: add force-branch with higher precdence than 'branch'
- | | |/
- | |/|
- | + | Merge remote branch 'origin/next'
- | |\ \
- | | |/
- | | + fix some errors found by pyflakes
- | | + s3tests: revert useless portion of 1c50db6a4630d07e72144dafd985c397f8a42dc5
- | | + rgw tests: remove users after each test
- | | + rgw tests: clean up immediately after the test
- | | + swift, s3readwrite: add missing yield
- | | + s3tests, s3readwrite, swift: cleanup explicitly
- | |/
- | + Merge remote-tracking branch 'origin/wip-3634'
- | |\
- | + \ Merge branch 'unstable'
- | |\ \
- | | + | Install ceph debs and use installed debs
- | |/ /
- | + | Replace /tmp/cephtest/ with configurable path
- | + | task/swift: change upstream repository url
- | + | Merge branch 'wip-mon-thrasher'
- | |\ \
- | + | | s3tests: fix typo
- | |/ /
- | + | rgw-logsocket: a task to verify opslog socket works
- | |/
- | + s3tests: run against arbitrary branch/sha1 of s3-tests.git
- | + pull s3-tests.git using git, not http
- | + ceph.newdream.net -> ceph.com
- | + Merge branch 'master' of github.com:ceph/teuthology
- | |\
- | | + github.com/NewDreamNetwork -> github.com/ceph
- | |/
- | + Add necessary imports for s3 tasks, and keep them alphabetical.
- | + rgw: access key uses url safe chars
- | + use local mirrors for (most) github urls
- | + Rename testrados and testswift tasks to not begin with test .
- | + testswift: fix config
- | + rgw: add swift task
- | + s3-tests: use radosgw-admin instead of radosgw_admin
- | + s3tests: Clone repository from github.
- | + Move orchestra to teuthology.orchestra so there's just one top-level package.
- | + Callers of task s3tests.create_users don't need to provide dummy fixtures dict.
- | + allow s3tests.create_users defaults be overridden
- | + Make targets a dictionary mapping hosts to ssh host keys.
- | + Skip s3-tests marked fails_on_rgw, they will fail anyway.
- | + The shell exits after the command, hence there is no need for pushd/popd.
- | + Add s3tests task.
- + Pull request 15842
- |\
- | + qa/suites/upgrade/hammer-x: set sortbitwise for jewel clusters
- + Pull request 15468
- |\
- | + osdc/Journaler: avoid executing on_safe contexts prematurely
- | + osdc/Journaler: make header write_pos align to boundary of flushed entry
- + Pull request 15438
- |\
- | + mds: issue new caps when sending reply to client
- + Pull request 15383
- |\
- | + cls/rgw: list_plain_entries() stops before bi_log entries
- + Pull request 15000
- |\
- | + pybind: fix cephfs.OSError initialization
- | + pybind: fix open flags calculation
- | + fs: normalize file open flags internally used by cephfs
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
rgw¶
Partial run to verify fix is viable:
teuthology-suite -k distro --priority 101 --rerun smithfarm-2017-06-22_11:53:25-rgw-wip-jewel-backports-distro-basic-smithi --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20392
Full rgw run:
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2 --teuthology-branch wip-20392
1 fail, 95 pass (96 total) http://pulpito.front.sepia.ceph.com/smithfarm-2017-06-25_16:38:58-rgw-wip-jewel-backports-distro-basic-smithi/
- failed job is with apache frontend, so not a high priority to fix
Re-running 1 failed job:
Updated by Nathan Cutler almost 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20392
3 failed, 15 passed (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-25_17:11:13-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/015dd1136459b15885142a76769efb360c945baf/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
upgrade/hammer-x¶
teuthology-suite -k distro --ceph wip-jewel-backports --rerun smithfarm-2017-06-25_17:11:13-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20392
run still includes the bad PR#14930, but the other two jobs are valid re-runs http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-26_20:18:43-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/9117553ee1ff17c305c86948ea6ae1d167f0cf92/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15936
- |\
- | + qa: enable quotas for pre-luminous quota tests
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 14930
- |\
- | + [SQUASH] drop the wait loop
- | + [SQUASH] fix the test so it fails on jewel
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14626
- |\
- | + tests: 'failed to encode ...' warnings are normal on upgrades
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
fs¶
teuthology-suite -k distro --priority 101 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --rerun smithfarm-2017-06-22_11:51:56-fs-wip-jewel-backports-distro-basic-smithi
Result reported in https://github.com/ceph/ceph/pull/15936 (merged)
Updated by Nathan Cutler almost 7 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/ca7ab74ae7884f24983d94b729cc262108ff6aba/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15904
- |\
- | + tests: upgrade/hammer-x/stress-split: tweak packages list
- + Pull request 14930
- |\
- | + tests: upgrade/hammer-x/v0-94-6-mon-overload: tweak packages list
- | + tests: upgrade/hammer-x: new v0-94-6-mon-overload subsuite
- + Pull request 14392
- + jewel: osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6
Updated by Nathan Cutler almost 7 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com --teuthology-branch wip-20392
2 fail, 16 pass (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-27_15:24:41-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- both failed jobs appear to be http://tracker.ceph.com/issues/13381 (regression?)
Rerun on smithi:
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-30_09:08:18-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
- failure is "ceph-objectstore-tool: exp list-pgs failure with status 1"
Rerun on vps:
Ruled a pass
Updated by Nathan Cutler almost 7 years ago
rados¶
teuthology-suite -k distro --priority 101 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --teuthology-branch wip-20392
7 fail, 220 pass (227 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-27_19:13:42-rados-wip-jewel-backports-distro-basic-smithi/
- five of the failures are infrastructure noise
- the sixth might be a new bug: http://tracker.ceph.com/issues/20449
- the seventh is ENOSPC, presumably because the smithis have smaller disks (so, infrastructure noise)
Re-run:
3 pass, 4 fail (7 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-06-28_10:10:16-rados-wip-jewel-backports-distro-basic-smithi/
- all four failures are ansible-related
Re-run:
Ruled a pass
Updated by Nathan Cutler almost 7 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Yuri Weinstein almost 7 years ago
QE VALIDATION (STARTED 7/1/17)¶
(Note: "PASSED / FAILED" indicates that the test is still in progress)
Re-run command lines and filters are captured in http://pad.ceph.com/p/hammer_v10.2.8_QE_validation_notes
Command line: CEPH_QA_MAIL="ceph-qa@ceph.com"; MACHINE_NAME=smithi; CEPH_BRANCH=jewel; SHA1=53a3be7261cfeb12445fbdba8238eefa40ed09f5 ; teuthology-suite -v --ceph-repo https://github.com/ceph/ceph.git --suite-repo https://github.com/ceph/ceph.git -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -s rados --subset 35/50 -k distro -p 100 -e $CEPH_QA_MAIL --suite-branch jewel --dry-run
Re-run command line: teuthology-suite -v -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -r $RERUN --suite-repo https://github.com/ceph/ceph.git --ceph-repo https://github.com/ceph/ceph.git --suite-branch jewel -p 90 -R fail,dead,running
| Suite | Runs/Reruns | Notes/Issues |
| rgw | http://pulpito.ceph.com/yuriw-2017-06-30_22:56:30-rgw-jewel-distro-basic-smithi/ | PASSED; see one "saw valgrind issues" |
| | http://pulpito.ceph.com/yuriw-2017-07-01_04:05:30-rgw-jewel-distro-basic-smithi/ | passed on rerun |
| rbd | http://pulpito.ceph.com/yuriw-2017-06-30_22:59:10-rbd-jewel-distro-basic-smithi/ | PASSED |
| krbd | http://pulpito.ceph.com/yuriw-2017-07-01_04:10:21-krbd-jewel-testing-basic-smithi/ | FAILED; approved by Ilya |
| | http://pulpito.front.sepia.ceph.com:80/yuriw-2017-07-03_15:58:06-krbd-jewel-testing-basic-smithi/ | rerun per Ilya |
| kcephfs | http://pulpito.ceph.com/yuriw-2017-07-01_04:11:25-kcephfs-jewel-testing-basic-smithi/ | PASSED |
| knfs | http://pulpito.ceph.com/yuriw-2017-07-01_04:12:02-knfs-jewel-testing-basic-smithi/ | PASSED |
| rest | http://pulpito.ceph.com/yuriw-2017-07-01_14:54:12-rest-jewel-distro-basic-smithi/ | PASSED |
| hadoop | http://pulpito.ceph.com/yuriw-2017-07-01_14:54:48-hadoop-jewel-distro-basic-smithi/ | FAILED #19456; EXCLUDED FROM THIS RELEASE |
| samba | | EXCLUDED FROM THIS RELEASE |
| ceph-deploy | http://pulpito.ceph.com/yuriw-2017-07-01_14:55:48-ceph-deploy-jewel-distro-basic-vps/ | PASSED |
| | http://pulpito.ceph.com/yuriw-2017-07-03_23:00:14-ceph-deploy-jewel-distro-basic-vps/ | |
| ceph-disk | http://pulpito.ceph.com/yuriw-2017-07-01_14:56:06-ceph-disk-jewel-distro-basic-vps/ | PASSED |
| upgrade/hammer-x (jewel) | http://pulpito.ceph.com/yuriw-2017-07-01_14:57:44-upgrade:hammer-x-jewel-distro-basic-vps/ | PASSED |
| powercycle | http://pulpito.ceph.com/yuriw-2017-07-01_04:12:42-powercycle-jewel-testing-basic-smithi/ | PASSED |
| ceph-ansible | http://pulpito.ceph.com/yuriw-2017-07-01_15:11:28-ceph-ansible-jewel-distro-basic-vps/ | PASSED |
| PASSED / FAILED | | |
Updated by Nathan Cutler almost 7 years ago
- Description updated (diff)
Updated release SHA1 to 66dbf9beef04988dbd3653591e51afa6d84e3990
Updated by Nathan Cutler almost 7 years ago
- Status changed from In Progress to Resolved