Tasks #20613
Status: Closed. Target version: jewel v10.2.10
Description
Workflow¶
- Preparing the release
- Nathan patches upgrade/jewel-x/point-to-point-x to do 10.2.0 -> current Jewel point release -> x - SKIPPED (this test is currently broken because of a PR prematurely merged to the s3-tests repo; the fix will be in 10.2.10, after which we can update the test and it should pass)
- Cutting the release
- Loic asks Abhishek L. if a point release should be published - YES 20170907
- Loic gets approval from all leads
- Yehuda, rgw - YES 20170912
- Patrick, fs - YES 20170907
- Jason, rbd - YES 20170907
- Josh, rados - YES 20170907
- Abhishek L. writes and commits the release notes
- Nathan informs Yuri that the branch is ready for testing - DONE 20170919
- Yuri runs additional integration tests
- If Yuri discovers new bugs that need to be backported urgently (i.e. their priority is set to Urgent or Immediate), the release goes back to being prepared; it was not ready after all
- Yuri informs Alfredo that the branch is ready for release - DONE
- Alfredo creates the packages and sets the release tag - DONE
- Abhishek L. posts release announcement on https://ceph.com/community/blog - DONE https://ceph.com/releases/v10-2-10-jewel-released/
Release information¶
- branch to build from: jewel, commit: 750e67cab8fd0498ca6d843f25007904041d49cd
- version: v10.2.10
- type of release: point release
- where to publish the release: http://download.ceph.com/debian-jewel and http://download.ceph.com/rpm-jewel
Updated by Nathan Cutler almost 7 years ago
- Status changed from New to In Progress
- Priority changed from Normal to Urgent
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/4a8e1d8d2302849964e39405fa78ce4c6f553378/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15448
- |\
- | + rgw: fix for zonegroup redirect url
- | + rgw: use zonegroup's master zone endpoints for bucket redirect
- + Pull request 15447
- |\
- | + rgw: allow larger payload for period commit
- + Pull request 15442
- |\
- | + osdc/Filer: truncate large file party by party
- + Pull request 15428
- |\
- | + build/ops: deb: fix logrotate packaging
- + Pull request 15322
- |\
- | + osd: do not default-abort on leaked pg refs
- | + osd: Reset() the snaptrimmer on shutdown
- + Pull request 15236
- |\
- | + mon/PGMap: factor mon_osd_full_ratio into MAX AVAIL calc
- + Pull request 15197
- |\
- | + rgw: remove unnecessary output
- + Pull request 15196
- |\
- | + build/ops: rpm: fix python-Sphinx package name for SUSE
- + Pull request 15189
- |\
- | + os/filestore: fix infinit loops in fiemap()
- + Pull request 15083
- |\
- | + mon: check is_shutdown() in timer callbacks
- | + mon/Elector: call cancel_timer() in shutdown()
- | + jewel: mon: add override annotation to callback classes
- | + mon/PaxosService: move classes to cc file
- | + mon/Paxos: move classes to .cc file
- | + mon/Elector:move C_ElectionExpire class to cc file
- | + mon/Monitor: move C_Scrub, C_ScrubTimeout to .cc
- + Pull request 15065
- |\
- | + osd/PrimaryLogPG: do not call on_shutdown() if
- + Pull request 15051
- |\
- | + systemd/ceph-disk: make it possible to customize timeout
- + Pull request 15050
- |\
- | + jewel: mon: Fix status output warning for mon_warn_osd_usage_min_max_delta
- | + mon/PGMonitor: clean up min/max span warning
- | + osd: Round fullness in message to correspond to df -h
- | + filestore: Account for dirty journal data in statfs
- | + mon: Add warning if diff in OSD usage > config mon_warn_osd_usage_percent
- | + mon: Bump min in ratio to 75%
- | + osd: Fix ENOSPC crash message text
- + Pull request 14977
- |\
- | + librbd: add no-op event when promoting an image
- | + rbd-mirror: prevent infinite loop when computing replay status
- + Pull request 14943
- |\
- | + osd: fix occasional MOSDMap leak
- + Pull request 14874
- |\
- | + librbd: default features should be negotiated with the OSD
- | + cls/rbd: add get_all_features on client side
- + Pull request 14699
- |\
- | + suites: update log whitelist for scrub msg
- | + mds: include advisory `path` field in damage
- | + mds: populate DamageTable from scrub and log more quietly
- | + mds: tidy up ScrubHeader
- | + mds: remove redundant checks for null ScrubHeader
- | + mds/DamageTable: move classes to .cc file
- + Pull request 14691
- |\
- | + tests: upgrade:client-upgrade/firefly-client-x: drop CentOS
- + Pull request 14673
- |\
- | + mds: set ceph-mds name uncond for external tools
- + Pull request 14663
- |\
- | + test: remove hard-coded image name from RBD metadata test
- | + librbd: relax is parent mirrored check when enabling mirroring for pool
- + Pull request 14659
- |\
- | + rgw: add the remove-x-delete feature to cancel swift object expiration
- + Pull request 14346
- + rpm: Fix undefined FIRST_ARG
- + selinux: Install ceph-base before ceph-selinux
- + rpm: Move ceph-disk to ceph-base
- + ceph-disk: Fix the file ownership, skip missing
- + selinux: Do parallel relabel on package install
- + ceph-disk: Add --system option for fix command
- + ceph-disk: Add more fix targets
- + ceph-disk: Add unit test for fix command
- + ceph-disk: Add fix subcommand
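The perl one-liner above turns `git log --format='%H %s' --graph` output into the Textile link list shown; a rough Python equivalent (a hypothetical re-implementation mirroring the same regexes, not the script actually used):

```python
import re

def textile_link(line):
    """Convert one `git log --format='%H %s' --graph` line into a
    Textile-style link, mirroring the perl one-liner (sketch only)."""
    line = line.replace('"', ' ')
    if re.search(r'\w+\s+Merge pull request #(\d+)', line):
        # merge commits become pull-request links
        line = re.sub(r'\w+\s+Merge pull request #(\d+).*',
                      r'"Pull request \1":https://github.com/ceph/ceph/pull/\1',
                      line)
    else:
        # ordinary commits become commit links keyed on the SHA1
        line = re.sub(r'(\w+)\s+(.*)',
                      r'"\2":https://github.com/ceph/ceph/commit/\1',
                      line)
    # turn the graph bullet '*' into '+' and prefix a Textile list marker
    return '* ' + line.replace('*', '+', 1)
```

Applied to each graph line, this yields the `+ Pull request NNNNN` / `| + commit title` entries seen in the lists above.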
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 1000 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
5 fail, 2 dead, 220 pass (227 total) http://pulpito.ceph.com:80/smithfarm-2017-08-21_19:38:42-rados-wip-jewel-backports-distro-basic-smithi/
- SELinux denials
- infrastructure noise? dpkg: error: dpkg status database is locked by another process
- "EAGAIN: pg 1.0 primary osd.1 not up" on ceph pg scrub
- 1547951 -> test_mon_ping: ceph ping 'mon.*' causes /usr/bin/python to dump core (!) in conjunction with injected socket failure
- 1547956 -> failed test ObjectStore/StoreTest.FiemapHoles/3, where GetParam() = "kstore" (test/objectstore/store_test.cc:310: Failure, Expected: (m[SKIP_STEP]) >= (3u), actual: 0 vs 3)
Rerun:
1 fail, 2 dead, 4 pass (7 total) http://pulpito.ceph.com:80/smithfarm-2017-08-22_13:00:32-rados-wip-jewel-backports-distro-basic-smithi/
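The `--subset $(expr $RANDOM % 50)/50` argument schedules only one of 50 equal slices of the suite, with the slice index chosen at random so repeated scheduling samples different jobs. A sketch of the selection (not teuthology's own code):

```python
import random

def random_subset(divisions=50):
    """Pick slice i of `divisions` equal slices of the suite,
    like --subset $(expr $RANDOM % 50)/50 on the command line."""
    return "{}/{}".format(random.randrange(divisions), divisions)

# e.g. "17/50": run only the 18th of 50 equal slices of the suite matrix
subset = random_subset()
i, n = (int(x) for x in subset.split("/"))
assert 0 <= i < n == 50
```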
Updated by Nathan Cutler over 6 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 999 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
2 fail, 1 dead, 15 pass (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-21_20:01:15-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Rerun on smithi:
2 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-22_12:32:46-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
- SELinux denials
- failed to complete snaptrimming before timeout
Updated by Nathan Cutler over 6 years ago
fs¶
teuthology-suite -k distro --priority 999 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
2 fail, 86 pass (88 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-21_20:02:22-fs-wip-jewel-backports-distro-basic-smithi/
- known bug http://tracker.ceph.com/issues/16881
Re-run:
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-22_08:31:06-fs-wip-jewel-backports-distro-basic-smithi/
- known bug http://tracker.ceph.com/issues/16881
One of the failures appears to be reproducible, rerunning three times:
Since http://tracker.ceph.com/issues/16881 is a known issue due to a racy test, ruled a pass
Updated by Nathan Cutler over 6 years ago
rgw¶
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
Updated by Nathan Cutler over 6 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
27 fail, 82 pass (109 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-21_20:07:45-rbd-wip-jewel-backports-distro-basic-smithi/
- problem with a backport https://github.com/ceph/ceph/pull/14874
Updated by Nathan Cutler over 6 years ago
Upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
1 fail, 13 pass (14 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-22_08:10:05-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- known bug http://tracker.ceph.com/issues/19571
Rerun on vps and smithi:
- known bug http://tracker.ceph.com/issues/19571
Ruled a pass
Updated by Nathan Cutler over 6 years ago
rbd for PR#14663¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-19228-jewel --machine-type smithi --subset $(expr $RANDOM % 4)/4
Updated by Nathan Cutler over 6 years ago
ceph-disk for pr#17133¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-21035-jewel --machine-type vps --priority 999 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/bdfc04f416a87e5c1c0f6010b28ab7be0e3ded2e/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15322
- |\
- | + osd: do not default-abort on leaked pg refs
- | + osd: Reset() the snaptrimmer on shutdown
- + Pull request 15236
- |\
- | + mon/PGMap: factor mon_osd_full_ratio into MAX AVAIL calc
- + Pull request 15083
- |\
- | + mon: check is_shutdown() in timer callbacks
- | + mon/Elector: call cancel_timer() in shutdown()
- | + jewel: mon: add override annotation to callback classes
- | + mon/PaxosService: move classes to cc file
- | + mon/Paxos: move classes to .cc file
- | + mon/Elector:move C_ElectionExpire class to cc file
- | + mon/Monitor: move C_Scrub, C_ScrubTimeout to .cc
- + Pull request 15065
- |\
- | + osd/PrimaryLogPG: do not call on_shutdown() if
- + Pull request 15050
- |\
- | + jewel: mon: Fix status output warning for mon_warn_osd_usage_min_max_delta
- | + mon/PGMonitor: clean up min/max span warning
- | + osd: Round fullness in message to correspond to df -h
- | + filestore: Account for dirty journal data in statfs
- | + mon: Add warning if diff in OSD usage > config mon_warn_osd_usage_percent
- | + mon: Bump min in ratio to 75%
- | + osd: Fix ENOSPC crash message text
- + Pull request 14977
- |\
- | + librbd: add no-op event when promoting an image
- | + rbd-mirror: prevent infinite loop when computing replay status
- + Pull request 14943
- |\
- | + osd: fix occasional MOSDMap leak
- + Pull request 14699
- |\
- | + suites: update log whitelist for scrub msg
- | + mds: include advisory `path` field in damage
- | + mds: populate DamageTable from scrub and log more quietly
- | + mds: tidy up ScrubHeader
- | + mds: remove redundant checks for null ScrubHeader
- | + mds/DamageTable: move classes to .cc file
- + Pull request 14673
- |\
- | + mds: set ceph-mds name uncond for external tools
- + Pull request 14663
- |\
- | + test: remove hard-coded image name from RBD metadata test
- | + librbd: relax is parent mirrored check when enabling mirroring for pool
- + Pull request 14659
- |\
- | + rgw: add the remove-x-delete feature to cancel swift object expiration
- + Pull request 14346
- + rpm: Fix undefined FIRST_ARG
- + selinux: Install ceph-base before ceph-selinux
- + rpm: Move ceph-disk to ceph-base
- + ceph-disk: Fix the file ownership, skip missing
- + selinux: Do parallel relabel on package install
- + ceph-disk: Add --system option for fix command
- + ceph-disk: Add more fix targets
- + ceph-disk: Add unit test for fix command
- + ceph-disk: Add fix subcommand
Updated by Nathan Cutler over 6 years ago
rados¶
Rerunning 2 dead and 1 failed job from the earlier run:
teuthology-suite -k distro --priority 101 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --rerun smithfarm-2017-08-22_13:00:32-rados-wip-jewel-backports-distro-basic-smithi
1 fail, 2 pass (3 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-23_08:55:27-rados-wip-jewel-backports-distro-basic-smithi/
- segfault added to http://tracker.ceph.com/issues/21063
Since the segfault is in KStore which is not supported, ruled a pass
Updated by Nathan Cutler over 6 years ago
Upgrade hammer-x¶
Re-running 2 failed jobs from the previous run:
teuthology-suite -k distro --verbose --ceph wip-jewel-backports --machine-type smithi --priority 101 --email ncutler@suse.com --rerun smithfarm-2017-08-22_12:32:46-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-23_08:57:27-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
Failure is SELinux related; ruled a pass
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/8594575b28187a778fcacc2b4313e3506502bbee/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15465
- |\
- | + rgw: segment fault when shard id out of range
- + Pull request 15464
- |\
- | + librbd: potential read IO hang when image is flattened
- + Pull request 15463
- |\
- | + rbd-nbd: relax size check for newer kernel versions
- + Pull request 15461
- |\
- | + test/librbd/test_notify.py: don't disable feature in slave
- + Pull request 15460
- |\
- | + librbd: batch ObjectMap updations upon trim
- + Pull request 15459
- |\
- | + rgw_file: fix fs_inst progression
- | + rgw_file: remove post-unlink lookup check
- | + rgw_file: release rgw_fh lock and ref on ENOTEMPTY
- | + rgw_file: remove hidden uxattr objects from buckets on delete
- + Pull request 15457
- |\
- | + rgw: RGWPeriodPusher spawns http thread before cr thread
- | + rgw: should delete in_stream_req if conn->get_obj(...) return not zero value
- | + rgw: dont spawn error_repo until lease is acquired
- + Pull request 15456
- |\
- | + rgw_file: v3: fix write-timer action
- + Pull request 15455
- |\
- | + cls/log/cls_log.cc: reduce logging noise
- + Pull request 15454
- |\
- | + radosgw-admin: warn that 'realm rename' does not update other clusters
- + Pull request 15453
- |\
- | + rgw: update bucket cors in secondary zonegroup should forward to master
- | + rgw: fix for EINVAL errors on forwarded bucket put_acl requests
- | + rgw: enable to update acl of bucket created in slave zonegroup
- + Pull request 15452
- |\
- | + rgw: fix versioned bucket data sync fail when upload is busy
- + Pull request 15451
- |\
- | + rgw: put object's acl can't work well on the latest object when versioning is enabled.
- + Pull request 15450
- |\
- | + rgw: when create_bucket use the same num_shards with info.num_shards
- | + rgw: using the same bucket num_shards as master zg when create bucket in secondary zg
- + Pull request 14977
- |\
- | + librbd: add no-op event when promoting an image
- | + rbd-mirror: prevent infinite loop when computing replay status
- + Pull request 14874
- + librbd: default features should be negotiated with the OSD
- + cls/rbd: add get_all_features on client side
Updated by Nathan Cutler over 6 years ago
rgw¶
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
- https://github.com/ceph/ceph/pull/15460 caused a regression
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
2 fail, 1 dead, 224 pass (227 total) http://pulpito.ceph.com:80/smithfarm-2017-08-24_16:00:40-rados-wip-jewel-backports-distro-basic-smithi/
Re-running 2 failed and 1 dead job:
1 fail, 2 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-25_17:29:04-rados-wip-jewel-backports-distro-basic-smithi/
- Failure is SELinux related, ignoring.
Ruled a pass
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/daf32194cadd7fbe1c96ebee10069e4e4a25738d/ (build failure: env noise)
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/883ecf1b30a9fe876efba26fec4ccbec24bc8b09/ (rebased to trigger a new build)
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 15465
- |\
- | + rgw: segment fault when shard id out of range
- + Pull request 15464
- |\
- | + librbd: potential read IO hang when image is flattened
- + Pull request 15463
- |\
- | + rbd-nbd: relax size check for newer kernel versions
- + Pull request 15461
- |\
- | + test/librbd/test_notify.py: don't disable feature in slave
- + Pull request 15459
- |\
- | + rgw_file: fix fs_inst progression
- | + rgw_file: remove post-unlink lookup check
- | + rgw_file: release rgw_fh lock and ref on ENOTEMPTY
- | + rgw_file: remove hidden uxattr objects from buckets on delete
- + Pull request 15457
- |\
- | + rgw: RGWPeriodPusher spawns http thread before cr thread
- | + rgw: should delete in_stream_req if conn->get_obj(...) return not zero value
- | + rgw: dont spawn error_repo until lease is acquired
- + Pull request 15456
- |\
- | + rgw_file: v3: fix write-timer action
- + Pull request 15455
- |\
- | + cls/log/cls_log.cc: reduce logging noise
- + Pull request 15454
- |\
- | + radosgw-admin: warn that 'realm rename' does not update other clusters
- + Pull request 15453
- |\
- | + rgw: update bucket cors in secondary zonegroup should forward to master
- | + rgw: fix for EINVAL errors on forwarded bucket put_acl requests
- | + rgw: enable to update acl of bucket created in slave zonegroup
- + Pull request 15452
- |\
- | + rgw: fix versioned bucket data sync fail when upload is busy
- + Pull request 15451
- |\
- | + rgw: put object's acl can't work well on the latest object when versioning is enabled.
- + Pull request 15450
- |\
- | + rgw: when create_bucket use the same num_shards with info.num_shards
- | + rgw: using the same bucket num_shards as master zg when create bucket in secondary zg
- + Pull request 14977
- |\
- | + librbd: add no-op event when promoting an image
- | + rbd-mirror: prevent infinite loop when computing replay status
- + Pull request 14874
- + librbd: default features should be negotiated with the OSD
- + cls/rbd: add get_all_features on client side
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
Updated by Nathan Cutler over 6 years ago
cephfs note¶
https://github.com/ceph/ceph/pull/16248 was merged by oversight, without any testing; if the following scheduled run includes it, that run should provide coverage (the SHA1 needs double-checking):
running http://pulpito.ceph.com/teuthology-2017-08-27_04:10:02-fs-jewel-distro-basic-smithi/
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/fe8b99d8aec719b5dd9aacaf126db1854f34cdc8/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 16124
- |\
- | + tests: librbd: adapt test_mock_RefreshRequest for jewel
- | + librbd: acquire exclusive-lock during copy on read
- + Pull request 16061
- |\
- | + jewel:ceph-disk:remove the special check to bcache devices
- + Pull request 16059
- |\
- | + messages/MOSDPing: optimize encode and decode of dummy payload
- | + messages/MOSDPing: fix the inflation amount calculation
- | + OSD: mark two heartbeat config opts as observed
- | + messages/MOSDPing: initialize MOSDPing padding
- | + osd: heartbeat with packets large enough to require working jumbo frames.
- + Pull request 16015
- |\
- | + osd/osd_internal_types: wake snaptrimmer on put_read lock, too
- + Pull request 15988
- |\
- | + rgw: rest handlers for mdlog and datalog list dont loop
- | + rgw: fix RGWMetadataLog::list_entries() for null last_marker
- | + rgw: RGWMetadataLog::list_entries() no longer splices
- | + fix infinite loop in rest api for log list
- + Pull request 15947
- |\
- | + jewel: osd: unlock sdata_op_ordering_lock with sdata_lock hold to avoid missing wakeup signal
- + Pull request 15762
- |\
- | + test: timeout verification that mon is unreachable
- | + cli: retry when the mon is not configured
- + Pull request 15760
- |\
- | + libradosstriper: delete striped objects of zero length
- + Pull request 15726
- |\
- | + msg/async: go to open new session when existing already closed
- | + msg/async: fix accept_conn not remove entry in conns when lazy delete
- | + msg/AsyncMessenger.h:remove unneeded use of count
- + Pull request 15719
- |\
- | + rgw: fix 'gc list --include-all' command infinite loop the first 1000 items
- + Pull request 15602
- |\
- | + test/librbd: decouple ceph_test_librbd_api from libceph-common
- | + test/librados: extract functions using libcommon in test.cc into test_common.cc
- | + test/librbd: replace libcommon classes using standard library
- + Pull request 15556
- |\
- | + test/rgw: wait for realm reload after set_master_zone
- | + test/rgw: fixes for test_multi_period_incremental_sync()
- | + test/rgw: meta checkpoint compares realm epoch
- | + rgw: remove rgw_realm_reconfigure_delay
- | + rgw: require --yes-i-really-mean-it to promote zone with stale metadata
- | + rgw: period commit uses sync status markers
- | + rgw: use RGWShardCollectCR for RGWReadSyncStatusCoroutine
- | + rgw: change metadata read_sync_status interface
- | + rgw: store realm epoch with sync status markers
- | + rgw: RGWBackoffControlCR only retries until success
- | + rgw: clean up RGWInitDataSyncStatusCoroutine
- | + rgw: fix marker comparison to detect end of mdlog period
- | + rgw: add == and != operators for period history cursor
- | + rgw: add empty_on_enoent flag to RGWSimpleRadosReadCR
- + Pull request 15503
- |\
- | + ceph-disk: separate ceph-osd --check-needs-* logs
- + Pull request 15488
- |\
- | + rbd-mirror: ensure missing images are re-synced when detected
- + Pull request 15477
- |\
- | + rgw: delete non-empty buckets in slave zonegroup returns error but the buckets have actually been deleted.
- + Pull request 15475
- |\
- | + qa: add a sleep after restarting osd before tell ing it
- + Pull request 15474
- |\
- | + osd/PrimaryLogPG: do not expect FULL_TRY ops to get resent
- + Pull request 15473
- |\
- | + Set subman cron attributes in spec file
- + Pull request 15460
- + librbd: clean up object map update interface, revisited
- + librbd: batch ObjectMap updations upon trim
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
4 fail, 2 dead, 221 pass (227 total) http://pulpito.ceph.com:80/smithfarm-2017-08-27_18:29:25-rados-wip-jewel-backports-distro-basic-smithi/
- rados/thrash/...thrashers/pggrow.yaml workloads/rgw_snaps.yaml failed because something in s3-tests (read-write tests) apparently sent a negative size to os.urandom(size)
- rados/thrash/...thrashers/morepggrow.yaml workloads/rgw_snaps.yaml failed because something in s3-tests (read-write tests) apparently sent a negative size to os.urandom(size)
Rerunning 4 failed and 2 dead jobs:
3 fail, 2 dead, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-28_13:25:20-rados-wip-jewel-backports-distro-basic-smithi/
- rados/thrash/...thrashers/pggrow.yaml workloads/rgw_snaps.yaml failed because it's possibly racing with rgw to create .rgw.control
- rados/thrash/...thrashers/morepggrow.yaml workloads/rgw_snaps.yaml failed because it's possibly racing with rgw to create .rgw.control
Rerunning 3 failed jobs individually:
The 2 dead jobs are objectstore idempotent tests that have been problematic from the beginning and do not pass in master, either. Opened https://github.com/ceph/ceph/pull/17317 to drop them, but Josh determined that this is known bug http://tracker.ceph.com/issues/20981
Ruled a pass
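For reference, a negative size like the one mentioned above makes `os.urandom()` fail immediately rather than produce data (Python 3 raises `ValueError`; older interpreters raised `OSError`):

```python
import os

# os.urandom() rejects a negative length outright, consistent with the
# s3-tests read-write failure seen in the rgw_snaps workloads above.
try:
    os.urandom(-1)
    raised = False
except (ValueError, OSError):
    raised = True
assert raised
assert len(os.urandom(16)) == 16  # normal use: n random bytes
```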
Updated by Nathan Cutler over 6 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 999 -l 2 --email ncutler@suse.com
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-27_18:55:53-powercycle-wip-jewel-backports-distro-basic-smithi/
Rerunning the failed job:
Updated by Nathan Cutler over 6 years ago
Upgrade jewel point-to-point-x¶
teuthology-suite -k distro --verbose --suite upgrade/jewel-x/point-to-point-x --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
- FAIL: s3tests.functional.test_s3.test_versioned_object_acl_no_version_specified
Rerunning:
- FAIL: s3tests.functional.test_s3.test_versioned_object_acl_no_version_specified
The failure is due to a new test that was merged into s3-tests repo: https://github.com/ceph/s3-tests/pull/160
Ignoring for now.
Updated by Nathan Cutler over 6 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
2 fail, 16 pass (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-27_18:56:34-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
- timed out waiting for admin_socket to appear after osd.4 restart
Re-running 2 failed jobs on smithi:
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-28_13:27:46-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
- failure is SELinux-related
Re-running 1 failed job on vps:
Ruled a pass
Updated by Nathan Cutler over 6 years ago
rgw¶
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
Updated by Nathan Cutler over 6 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
1 dead, 106 pass (107 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-27_18:58:54-rbd-wip-jewel-backports-distro-basic-smithi/
Rerunning 1 dead job:
Updated by Nathan Cutler over 6 years ago
Upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/0856b44fc586f2d8620f4841e3e792c34b4affad/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 17210
- |\
- | + config: disable skewed utilization warning by default
- + Pull request 16405
- |\
- | + osd: scrub_to specifies clone ver, but transaction include head write ver
- + Pull request 16295
- |\
- | + rbd: properly decode features when using image name optional
- + Pull request 16294
- |\
- | + upstart: start radosgw-all according to runlevel
- + Pull request 16293
- |\
- | + osdc/Objecter: release message if it is not handled
- | + crypto: allow PK11 module to load even if it's already initialized
- + Pull request 16285
- |\
- | + rbd-mirror: set SEQUENTIAL and NOCACHE advise flags on image sync
- + Pull request 16276
- |\
- | + rgw: fix next marker to pass test_bucket_list_prefix in s3test
- | + rgw: fix listing of objects that start with underscore
- + Pull request 16268
- |\
- | + rgw/rgw_common.cc: modify the end check in RGWHTTPArgs::sys_get
- + Pull request 16266
- |\
- | + rgw: multipart copy-part remove '/' for s3 java sdk request header.
- + Pull request 16240
- |\
- | + Don't increment bi list entry count on error to not distort error code
- + Pull request 16169
- |\
- | + osd/PrimaryLogPG solve cache tier osd high memory consumption
- + Pull request 16167
- |\
- | + osd/ReplicatedBackend: reset thread heartbeat after every omap entry in deep-scrub
- + Pull request 16151
- |\
- | + client: avoid returning negative space available
- + Pull request 16150
- |\
- | + mds: save projected path into inode_t::stray_prior_path
- + Pull request 16144
- |\
- | + mon: osd crush set crushmap need sanity check
- | + crush: when take place the crush map should consider the rule is in used
- + Pull request 16141
- |\
- | + ceph_test_rados_api_misc: fix LibRadosMiscConnectFailure.ConnectFailure retry
- + Pull request 15988
- |\
- | + rgw: rest handlers for mdlog and datalog list dont loop
- | + rgw: fix RGWMetadataLog::list_entries() for null last_marker
- | + rgw: RGWMetadataLog::list_entries() no longer splices
- | + fix infinite loop in rest api for log list
- + Pull request 15966
- |\
- | + rgw: add a field to store generic user data in the bucket index, that can be populated/fetched via a configurable custom http header
- + Pull request 15556
- |\
- | + test/rgw: wait for realm reload after set_master_zone
- | + test/rgw: fixes for test_multi_period_incremental_sync()
- | + test/rgw: meta checkpoint compares realm epoch
- | + rgw: remove rgw_realm_reconfigure_delay
- | + rgw: require --yes-i-really-mean-it to promote zone with stale metadata
- | + rgw: period commit uses sync status markers
- | + rgw: use RGWShardCollectCR for RGWReadSyncStatusCoroutine
- | + rgw: change metadata read_sync_status interface
- | + rgw: store realm epoch with sync status markers
- | + rgw: RGWBackoffControlCR only retries until success
- | + rgw: clean up RGWInitDataSyncStatusCoroutine
- | + rgw: fix marker comparison to detect end of mdlog period
- | + rgw: add == and != operators for period history cursor
- | + rgw: add empty_on_enoent flag to RGWSimpleRadosReadCR
- + Pull request 15449
- + rgw_file: pre-compute unix attrs in write_finish()
Bisect:
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/3b2d20d4819d8e43915cce0d95850a7f0971e818/
- + Pull request 16293
- |\
- | + osdc/Objecter: release message if it is not handled
- | + crypto: allow PK11 module to load even if it's already initialized
- + Pull request 16169
- |\
- | + osd/PrimaryLogPG solve cache tier osd high memory consumption
- + Pull request 16167
- |\
- | + osd/ReplicatedBackend: reset thread heartbeat after every omap entry in deep-scrub
- + Pull request 16144
- |\
- | + mon: osd crush set crushmap need sanity check
- | + crush: when take place the crush map should consider the rule is in used
- + Pull request 16141
- + ceph_test_rados_api_misc: fix LibRadosMiscConnectFailure.ConnectFailure retry
Bisect:
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/50d2a25f7249e4ea746b6479e2ff90881bc09641/
- + Pull request 16167
- |\
- | + osd/ReplicatedBackend: reset thread heartbeat after every omap entry in deep-scrub
- + Pull request 16144
- |\
- | + mon: osd crush set crushmap need sanity check
- | + crush: when take place the crush map should consider the rule is in used
- + Pull request 16141
- + ceph_test_rados_api_misc: fix LibRadosMiscConnectFailure.ConnectFailure retry
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
massive failure http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-31_18:01:53-rados-wip-jewel-backports-distro-basic-smithi/
Bisect 3b2d20d4819d8e43915cce0d95850a7f0971e818
teuthology-suite -k distro --priority 999 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --limit 20 --rerun smithfarm-2017-08-31_18:01:53-rados-wip-jewel-backports-distro-basic-smithi
Looks like I got lucky.
Bisect 50d2a25f7249e4ea746b6479e2ff90881bc09641
Bisect continues:
Now I know that either https://github.com/ceph/ceph/pull/16169 or https://github.com/ceph/ceph/pull/16293 is the culprit.
16169: https://shaman.ceph.com/builds/ceph/wip-jewel-backports/947b78cc61c3750bebe036440a6bf444fd864213/
16293: https://shaman.ceph.com/builds/ceph/wip-20460-jewel/d2eea3f7d59507714b04563c6811a29c8d7120b7/
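The halving procedure used in these bisects (rebuild the integration branch with only half the candidate PRs, rerun the failing job, keep the half that still fails) is a binary search over the PR list. A minimal sketch, where `fails` is a hypothetical callback standing in for one rebuild-and-rerun cycle:

```python
def bisect_prs(prs, fails):
    """Narrow a list of candidate PR numbers down to the one whose
    inclusion makes the run fail. fails(subset) stands in for: rebuild
    the branch with only `subset` merged, rerun, return True on failure."""
    while len(prs) > 1:
        half = prs[:len(prs) // 2]
        # Keep whichever half still reproduces the failure.
        prs = half if fails(half) else prs[len(prs) // 2:]
    return prs[0]
```

Each iteration costs one shaman build plus one teuthology run, so narrowing N candidate PRs takes about log2(N) rebuilds.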
Updated by Nathan Cutler over 6 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
4 fail, 1 dead, 13 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-08-31_18:06:37-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Updated by Nathan Cutler over 6 years ago
Upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
Bisect 3b2d20d4819d8e43915cce0d95850a7f0971e818
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/10bf477a3226b55773dc9863cf8e489354007519/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 17210
- |\
- | + config: disable skewed utilization warning by default
- + Pull request 16405
- |\
- | + osd: scrub_to specifies clone ver, but transaction include head write ver
- + Pull request 16295
- |\
- | + rbd: properly decode features when using image name optional
- + Pull request 16294
- |\
- | + upstart: start radosgw-all according to runlevel
- + Pull request 16285
- |\
- | + rbd-mirror: set SEQUENTIAL and NOCACHE advise flags on image sync
- + Pull request 16276
- |\
- | + rgw: fix next marker to pass test_bucket_list_prefix in s3test
- | + rgw: fix listing of objects that start with underscore
- + Pull request 16268
- |\
- | + rgw/rgw_common.cc: modify the end check in RGWHTTPArgs::sys_get
- + Pull request 16266
- |\
- | + rgw: multipart copy-part remove '/' for s3 java sdk request header.
- + Pull request 16240
- |\
- | + Don't increment bi list entry count on error to not distort error code
- + Pull request 16169
- |\
- | + osd/PrimaryLogPG solve cache tier osd high memory consumption
- + Pull request 16167
- |\
- | + osd/ReplicatedBackend: reset thread heartbeat after every omap entry in deep-scrub
- + Pull request 16151
- |\
- | + client: avoid returning negative space available
- + Pull request 16150
- |\
- | + mds: save projected path into inode_t::stray_prior_path
- + Pull request 16144
- |\
- | + mon: osd crush set crushmap need sanity check
- | + crush: when take place the crush map should consider the rule is in used
- + Pull request 16141
- |\
- | + ceph_test_rados_api_misc: fix LibRadosMiscConnectFailure.ConnectFailure retry
- + Pull request 15988
- |\
- | + rgw: rest handlers for mdlog and datalog list dont loop
- | + rgw: fix RGWMetadataLog::list_entries() for null last_marker
- | + rgw: RGWMetadataLog::list_entries() no longer splices
- | + fix infinite loop in rest api for log list
- + Pull request 15966
- |\
- | + rgw: add a field to store generic user data in the bucket index, that can be populated/fetched via a configurable custom http header
- + Pull request 15556
- |\
- | + test/rgw: wait for realm reload after set_master_zone
- | + test/rgw: fixes for test_multi_period_incremental_sync()
- | + test/rgw: meta checkpoint compares realm epoch
- | + rgw: remove rgw_realm_reconfigure_delay
- | + rgw: require --yes-i-really-mean-it to promote zone with stale metadata
- | + rgw: period commit uses sync status markers
- | + rgw: use RGWShardCollectCR for RGWReadSyncStatusCoroutine
- | + rgw: change metadata read_sync_status interface
- | + rgw: store realm epoch with sync status markers
- | + rgw: RGWBackoffControlCR only retries until success
- | + rgw: clean up RGWInitDataSyncStatusCoroutine
- | + rgw: fix marker comparison to detect end of mdlog period
- | + rgw: add == and != operators for period history cursor
- | + rgw: add empty_on_enoent flag to RGWSimpleRadosReadCR
- + Pull request 15449
- + rgw_file: pre-compute unix attrs in write_finish()
Updated by Nathan Cutler over 6 years ago
Upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
4 fail, 4 dead, 219 pass (227 total) http://pulpito.ceph.com:80/smithfarm-2017-09-01_19:59:45-rados-wip-jewel-backports-distro-basic-smithi/
- 1587155 SELinux denial in ceph-mds
- 1587225, 1587250 known bug http://tracker.ceph.com/issues/20981
- 1587244 failed to recover before timeout expired
- 1587271, 1587323 known bug, won't fix http://tracker.ceph.com/issues/18739
- 1587349 Command failed on smithi200 with status 1: '/home/ubuntu/cephtest/s3-tests/virtualenv/bin/s3tests-test-readwrite'
- 1587357 Socket is closed -> actually osd crash with FAILED assert(0 "unexpected error") in cls_rgw.test_implicit
Rerunning 4 failed and 4 dead jobs
1 fail, 2 dead, 5 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-02_10:38:36-rados-wip-jewel-backports-distro-basic-smithi/
- 1590956 Socket is closed -> actually osd crash with FAILED assert(0 "unexpected error") in cls_rgw.test_implicit
- the 2 dead are known bug http://tracker.ceph.com/issues/20981 -> ignoring
Reproducer for the failed job is: teuthology-suite -k distro --priority 101 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --suite rados --filter="rados/verify/{1thrash/default.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml msgr-failures/few.yaml msgr/random.yaml rados.yaml tasks/rados_cls_all.yaml validater/lockdep.yaml}"
Rerunning just the reproducer:
pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-02_15:41:33-rados-wip-jewel-backports-distro-basic-smithi/ (Ubuntu)
See if it only fails on CentOS?
pass --num 1 http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-03_07:24:07-rados-wip-jewel-backports-distro-basic-smithi/
pass --num 5 http://pulpito.front.sepia.ceph.com/smithfarm-2017-09-03_07:48:08-rados-wip-jewel-backports-distro-basic-smithi/
pass --num 20 http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-03_08:29:02-rados-wip-jewel-backports-distro-basic-smithi/
Updated by Nathan Cutler over 6 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 999 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
fs¶
teuthology-suite -k distro --priority 999 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
4 fail, 84 pass (88 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-01_20:13:34-fs-wip-jewel-backports-distro-basic-smithi/
- java (btrfs)
- bad handshake (btrfs)
- coredumps (btrfs)
- Test failure: test_files_throttle (tasks.cephfs.test_strays.TestStrays) <- this was on XFS
Re-running 4 failed jobs
1 fail, 3 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-02_08:23:50-fs-wip-jewel-backports-distro-basic-smithi/
- java (btrfs) -> ignoring
Ruled a pass
Updated by Nathan Cutler over 6 years ago
rgw¶
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
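The `--subset $(expr $RANDOM % 50)/50` idiom used in these commands schedules only one random 1-in-N slice of the suite, so repeated schedulings sample different jobs. A sketch of what the shell expands it to:

```shell
# $RANDOM % 50 yields an integer in 0..49, so the argument becomes
# e.g. "37/50", i.e. "run slice 37 of 50".
subset="$(expr $RANDOM % 50)/50"
echo "$subset"
```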
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
2 fail, 107 pass (109 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-01_20:16:01-rbd-wip-jewel-backports-distro-basic-smithi/
- both failures are SELinux denials in logrotate
Rerunning 2 failed jobs
pass http://pulpito.ceph.com/smithfarm-2017-09-02_08:19:06-rbd-wip-jewel-backports-distro-basic-smithi/
Updated by Nathan Cutler over 6 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
2 fail, 16 pass (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-01_19:39:12-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
- failed to recover before timeout expired
- Command failed on vpm003 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f --cluster ceph -i 4'
Re-running 2 failed jobs
1 fail, 1 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-02_08:21:46-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
- failure due to SELinux denials in ceph-mon, ignoring
Ruled a pass
Updated by Josh Durgin over 6 years ago
"actually osd crash with FAILED assert(0 "unexpected error")" -> this usually means an fs bug (we saw this with btrfs giving ENOSPC very early in recent months) or a failing disk giving EIO
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/4540388d4b7a960e74c1c2b59f220a603b0333c4/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 17514
- |\
- | + interval_set: optimize intersect_of for identical spans
- | + interval_set: optimize intersect_of insert operations
- + Pull request 17412
- |\
- | + librbd: prevent self-blacklisting during break lock
- + Pull request 17402
- |\
- | + librbd: fix missing write block validation in IO work queue
- | + qa/suites/rbd: test dynamic features with cache disabled
- | + qa/tasks/qemu: rbd cache is enabled by default
- | + test: unit tests for librbd IO work queue failure path
- | + librbd: cleanup interface between IO work queue and IO requests
- | + common: improve the ability to mock PointerWQ classes
- | + librbd: exclusive lock failures should bubble up to IO
- | + librbd: directly inform IO work queue when locks are required
- | + librbd: clean up variable naming in IO work queue
- | + librbd: convert ImageRequestWQ to template
- + Pull request 17396
- |\
- | + client: skip lookupname if writing to unlinked file
- + Pull request 17385
- |\
- | + librbd: reacquire lock should update lock owner client id
- + Pull request 17133
- |\
- | + ceph-disk: set the default systemd unit timeout to 3h
- + Pull request 17084
- |\
- | + ceph-disk: Use stdin for 'config-key put' command
- | + ceph: allow '-' with -i and -o for stdin/stdout
- | + ceph-disk: implement command_with_stdin
- + Pull request 17009
- |\
- | + rgw: aws4: add rgw_s3_auth_aws4_force_boto2_compat conf option
- + Pull request 17008
- |\
- | + jewel: mon: fix force_pg_create pg stuck in creating bug
- + Pull request 16963
- |\
- | + ceph-fuse: start up log on parent process before shutdown
- + Pull request 16952
- |\
- | + rgw-admin: fix bucket limit check argparse, div
- + Pull request 16951
- |\
- | + rgw: replace '+' with %20 in canonical query string for s3 v4 auth.
- + Pull request 16880
- |\
- | + rgw: Fix up to 1000 entries at a time in check_bad_index_multipart
- + Pull request 16767
- |\
- | + rgw : fix race in RGWCompleteMultipart
- + Pull request 16720
- |\
- | + rgw: Prevent overflow of stats cached values
- | + rgw: Do not decrement stats cache when the cache values are zero
- + Pull request 16711
- |\
- | + rgw: lease_stack: use reset method instead of assignment
- | + rgw: meta sync thread crash at RGWMetaSyncShardCR
- + Pull request 16703
- |\
- | + ceph-disk: don't activate suppressed journal devices
- + Pull request 16473
- |\
- | + osd: Reverse order of op_has_sufficient_caps and do_pg_op
- + Pull request 16355
- |\
- | + OSD: also check the exsistence of clone obc for CEPH_SNAPDIR requests
- + Pull request 16316
- |\
- | + rgw: VersionIdMarker and NextVersionIdMarker should be returned when listing object versions if necessary.
- + Pull request 16299
- |\
- | + rgw: datalog trim and mdlog trim handles the result returned by osd incorrectly.
- + Pull request 16297
- |\
- | + rbd: do not attempt to load key if auth is disabled
- + Pull request 16296
- |\
- | + librbd: filter expected error codes from is_exclusive_lock_owner
- + Pull request 16294
- |\
- | + upstart: start radosgw-all according to runlevel
- + Pull request 16293
- |\
- | + osdc/Objecter: release message if it is not handled
- | + crypto: allow PK11 module to load even if it's already initialized
- + Pull request 15556
- |\
- | + test/rgw: wait for realm reload after set_master_zone
- | + test/rgw: fixes for test_multi_period_incremental_sync()
- | + test/rgw: meta checkpoint compares realm epoch
- | + rgw: remove rgw_realm_reconfigure_delay
- | + rgw: require --yes-i-really-mean-it to promote zone with stale metadata
- | + rgw: period commit uses sync status markers
- | + rgw: use RGWShardCollectCR for RGWReadSyncStatusCoroutine
- | + rgw: change metadata read_sync_status interface
- | + rgw: store realm epoch with sync status markers
- | + rgw: RGWBackoffControlCR only retries until success
- | + rgw: clean up RGWInitDataSyncStatusCoroutine
- | + rgw: fix marker comparison to detect end of mdlog period
- | + rgw: add == and != operators for period history cursor
- | + rgw: add empty_on_enoent flag to RGWSimpleRadosReadCR
- + Pull request 15189
- + os:kstore fix unittest for FiemapHole
- + os/filestore: fix infinit loops in fiemap()
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
Initial run with --limit 10 to rule out a major regression:
Full run:
3 fail, 3 dead, 222 pass (228 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-06_16:01:34-rados-wip-jewel-backports-distro-basic-smithi/
- SELinux denials
- 1601811 rados/singleton-nomsgr/{all/pool-access.yaml rados.yaml} "WARNING: max attr value size (1024) is smaller than osd_max_object_name_len (2048). Your backend filesystem appears to not support attrs large enough to handle the configured max rados name size. You may get unexpected ENAMETOOLONG errors on rados operations or buggy behavior" followed by "AttributeError: managers" (caused by including "mgr.x" role in backport https://github.com/ceph/ceph/pull/16473 - pretty sure it does not affect any other backports in this run)
- [ FAILED ] ObjectStore/StoreTest.FiemapHoles/3, where GetParam() = "kstore" known bug caused by PR#15189
- 1601664 FAILED assert(0 == "unexpected error") in FileStore::_do_transaction -> on BTRFS, ignoring
- 1601695, 1601722 known bug http://tracker.ceph.com/issues/20981 -> ignoring
Rerunning 6 jobs:
2 fail, 4 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-07_07:08:05-rados-wip-jewel-backports-distro-basic-smithi/
- 1604961 rados/singleton-nomsgr/{all/pool-access.yaml rados.yaml} "AttributeError: managers" (known issue with PR#16473) -> ignoring
- 1604962 ObjectStore/StoreTest.FiemapHoles/3, where GetParam() = "kstore" (known issue with PR#15189) -> ignoring
Ruled a pass
Updated by Nathan Cutler over 6 years ago
Upgrade client-upgrade¶
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
Upgrade hammer-x¶
teuthology-suite -k distro --verbose --suite upgrade/hammer-x --ceph wip-jewel-backports --machine-type vps --priority 101 --email ncutler@suse.com
3 fail, 15 pass (18 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-06_15:15:02-upgrade:hammer-x-wip-jewel-backports-distro-basic-vps/
Rerunning 3 failed jobs
Updated by Nathan Cutler over 6 years ago
powercycle¶
teuthology-suite -v -c wip-jewel-backports -k distro -m smithi -s powercycle -p 999 -l 2 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
fs¶
teuthology-suite -k distro --priority 999 --suite fs --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
2 fail, 86 pass (88 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-06_17:19:07-fs-wip-jewel-backports-distro-basic-smithi/
- one failure is java-related
- the other is "cluster [ERR] unmatched rstat on 100, inode has n(v1 rc2017-09-07 00:31:31.113264 10=0+10), dirfrags have n(v0 rc2017-09-07 00:31:31.113264 11=0+11)" in the cluster log
Rerunning 2 failed jobs:
Updated by Nathan Cutler over 6 years ago
rgw¶
teuthology-suite -k distro --priority 999 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
1 fail, 95 pass (96 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-06_17:17:25-rgw-wip-jewel-backports-distro-basic-smithi/
- saw valgrind issues with apache frontend
Rerunning 1 failed job
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
1 fail, 1 dead, 106 pass (108 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-06_16:07:51-rbd-wip-jewel-backports-distro-basic-smithi/
- rbd/maintenance/... -> OSD fails to start because underlying XFS data store is unavailable
- kernel crash (oops)
Rerunning 2 jobs:
1 pass, 1 fail http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-07_06:16:37-rbd-wip-jewel-backports-distro-basic-smithi/
- rbd/maintenance failure emanating from https://github.com/ceph/ceph/pull/17402 is now fixed
Updated by Nathan Cutler over 6 years ago
ceph-disk¶
teuthology-suite -k distro --verbose --suite ceph-disk --ceph wip-jewel-backports --machine-type vps --priority 999 --email ncutler@suse.com
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/a396dcfc519dd631a7e8bea62bd5e66b489e5ff9/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 17574
- |\
- | + libradosstriper: remove format injection vulnerability
- + Pull request 17412
- |\
- | + librbd: prevent self-blacklisting during break lock
- + Pull request 17402
- |\
- | + qa/suites/rbd: fixed cache override
- | + librbd: fix missing write block validation in IO work queue
- | + qa/suites/rbd: test dynamic features with cache disabled
- | + qa/tasks/qemu: rbd cache is enabled by default
- | + test: unit tests for librbd IO work queue failure path
- | + librbd: cleanup interface between IO work queue and IO requests
- | + common: improve the ability to mock PointerWQ classes
- | + librbd: exclusive lock failures should bubble up to IO
- | + librbd: directly inform IO work queue when locks are required
- | + librbd: clean up variable naming in IO work queue
- | + librbd: convert ImageRequestWQ to template
- + Pull request 17385
- |\
- | + librbd: reacquire lock should update lock owner client id
- + Pull request 16767
- |\
- | + rgw : fix race in RGWCompleteMultipart
- + Pull request 16711
- |\
- | + rgw: lease_stack: use reset method instead of assignment
- | + rgw: meta sync thread crash at RGWMetaSyncShardCR
- + Pull request 16473
- |\
- | + osd: Reverse order of op_has_sufficient_caps and do_pg_op
- + Pull request 16355
- |\
- | + OSD: also check the exsistence of clone obc for CEPH_SNAPDIR requests
- + Pull request 16299
- |\
- | + rgw: datalog trim and mdlog trim handles the result returned by osd incorrectly.
- + Pull request 16297
- |\
- | + rbd: do not attempt to load key if auth is disabled
- + Pull request 16296
- |\
- | + librbd: filter expected error codes from is_exclusive_lock_owner
- + Pull request 16294
- |\
- | + upstart: start radosgw-all according to runlevel
- + Pull request 15556
- |\
- | + test/rgw: wait for realm reload after set_master_zone
- | + test/rgw: fixes for test_multi_period_incremental_sync()
- | + test/rgw: meta checkpoint compares realm epoch
- | + rgw: remove rgw_realm_reconfigure_delay
- | + rgw: require --yes-i-really-mean-it to promote zone with stale metadata
- | + rgw: period commit uses sync status markers
- | + rgw: use RGWShardCollectCR for RGWReadSyncStatusCoroutine
- | + rgw: change metadata read_sync_status interface
- | + rgw: store realm epoch with sync status markers
- | + rgw: RGWBackoffControlCR only retries until success
- | + rgw: clean up RGWInitDataSyncStatusCoroutine
- | + rgw: fix marker comparison to detect end of mdlog period
- | + rgw: add == and != operators for period history cursor
- | + rgw: add empty_on_enoent flag to RGWSimpleRadosReadCR
- + Pull request 15189
- + ceph_test_objectstore: disable filestore_fiemap
- + os:kstore fix unittest for FiemapHole
- + os/filestore: fix infinit loops in fiemap()
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
4 fail, 2 dead, 222 pass (228 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-07_18:58:43-rados-wip-jewel-backports-distro-basic-smithi/
Rerun:
2 fail, 4 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-10_07:03:27-rados-wip-jewel-backports-distro-basic-smithi/
Pushed two additional commits to https://github.com/ceph/ceph/pull/16473 to address the "managers" failure. Rerunning just that one test:
Pushed two more commits to https://github.com/ceph/ceph/pull/16473 and rerunning:
Updated by Nathan Cutler over 6 years ago
rbd¶
teuthology-suite -k distro --priority 999 --suite rbd --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 4)/4
1 fail, 107 pass (108 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-07_19:03:02-rbd-wip-jewel-backports-distro-basic-smithi/
- Log excerpt from failed job:
2017-09-10T02:24:24.308 INFO:tasks.thrashosds.thrasher:Reweighting osd 1 to 0.765611010463
2017-09-10T02:24:24.311 INFO:teuthology.orchestra.run.smithi193:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph osd reweight 1 0.765611010463'
2017-09-10T02:24:24.372 INFO:teuthology.orchestra.run.smithi193.stdout:Size error: expected 0x1a1200 stat 0x0
2017-09-10T02:24:24.376 INFO:teuthology.orchestra.run.smithi193.stdout:LOG DUMP (3091 total operations):
Rerun:
Updated by Nathan Cutler over 6 years ago
https://shaman.ceph.com/builds/ceph/wip-jewel-backports/34f51d2dca8149110a6a335eb865800a36ce7d1b/
git --no-pager log --format='%H %s' --graph ceph/jewel..wip-jewel-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 17626
- |\
- | + kv: let ceph_logger destructed after db reset
- + Pull request 17597
- |\
- | + rgw_file: fix LRU lane lock in evict_block()
- + Pull request 17574
- |\
- | + libradosstriper: remove format injection vulnerability
- + Pull request 17287
- |\
- | + rgw: break sending data-log list infinitely
- + Pull request 17285
- |\
- | + rgw_file: properly & |'d flags
- + Pull request 17281
- |\
- | + rgw: fix rgw hang when do RGWRealmReloader::reload after go SIGHUP
- + Pull request 17280
- |\
- | + rgw: rgw_website.h doesn't assume inclusion of the std namespace anymore.
- | + rgw: never let http_redirect_code of RGWRedirectInfo to stay uninitialized.
- + Pull request 17279
- |\
- | + rgw: fix the UTF8 check on bucket entry name in rgw_log_op().
- + Pull request 17277
- |\
- | + rgw: Fix a bug that multipart upload may exceed quota ...
- + Pull request 17166
- |\
- | + cls/refcount: store and use list of retired tags
- + Pull request 17165
- |\
- | + rgw: fix radosgw-admin data sync run crash
- + Pull request 17164
- |\
- | + rgw: fix not initialized pointer which cause rgw crash with ec data pool
- | + rgw: don't do unneccesary write if buffer with zero length
- + Pull request 17159
- |\
- | + rgw: fix error message in removing bucket with --bypass-gc flag
- + Pull request 17156
- |\
- | + rgw:multisite: fix RGWRadosRemoveOmapKeysCR
- + Pull request 17148
- |\
- | + rgw: we no longer use log_meta
- | + rgw: is_single_zonegroup doesn't use store or cct
- | + rgw: log_meta only for more than one zone
- | + rgw: only log metadata on metadata master zone
- + Pull request 17147
- |\
- | + rgw_file: prevent conflict of mkdir between restarts
- + Pull request 16856
- |\
- | + rgw: bucket index check in radosgw-admin removes valid index.
- + Pull request 16767
- |\
- | + rgw : fix race in RGWCompleteMultipart
- + Pull request 16473
- |\
- | + tests: use XFS explicitly in singleton-nomsgr/pool-access.yaml
- | + qa/suites/rados/singleton-nomsgr: fix syntax
- | + qa/suites/rados/singleton-nomsgr/pool-access: behave on ext4
- | + tasks/ceph: construct CephManager earlier
- | + osd: Reverse order of op_has_sufficient_caps and do_pg_op
- + Pull request 16355
- + OSD: also check the exsistence of clone obc for CEPH_SNAPDIR requests
Updated by Nathan Cutler over 6 years ago
rados¶
teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 50)/50 --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi
3 fail, 2 dead, 223 pass (228 total) http://pulpito.ceph.com/smithfarm-2017-09-12_17:11:54-rados-wip-jewel-backports-distro-basic-smithi/
- The two dead jobs are known bug http://tracker.ceph.com/issues/18739
- One of the failed jobs is a known test issue fixed by https://github.com/ceph/ceph/pull/17677
Rerun 5 jobs:
2 fail, 3 pass http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-13_02:44:35-rados-wip-jewel-backports-distro-basic-smithi/
- One of the failed jobs is a known test issue fixed by https://github.com/ceph/ceph/pull/17677
- The other failed job is rados/singleton-nomsgr/{all/11429.yaml rados.yaml} - different failure message from last time
upgrade/client-upgrade¶
To test https://github.com/ceph/ceph/pull/17780
teuthology-suite -k distro --verbose --suite upgrade/client-upgrade --ceph-repo https://github.com/ceph/ceph-ci.git --ceph wip-jewel-backports --suite-repo https://github.com/smithfarm/ceph.git --suite-branch wip-rh-74-jewel --machine-type vps --priority 101 --email ncutler@suse.com
4 fail, 9 pass (13 total) http://pulpito.front.sepia.ceph.com:80/smithfarm-2017-09-18_20:26:52-upgrade:client-upgrade-wip-jewel-backports-distro-basic-vps/
- All four failures were due to absence of CentOS 7.4 on vps; hopefully fixed yesterday by David G.
Rerunning 4 CentOS 7.4 jobs:
Updated by Nathan Cutler over 6 years ago
rgw¶
teuthology-suite -k distro --priority 101 --suite rgw --email ncutler@suse.com --ceph wip-jewel-backports --machine-type smithi --subset $(expr $RANDOM % 2)/2
Updated by Yuri Weinstein over 6 years ago
QE VALIDATION (STARTED 9/19/17)¶
(Note: PASSED / FAILED - indicates "TEST IS IN PROGRESS")
Re-run command lines and filters are captured in http://pad.ceph.com/p/hammer_v10.2.10_QE_validation_notes
command line CEPH_QA_MAIL="ceph-qa@ceph.com"; MACHINE_NAME=smithi; CEPH_BRANCH=jewel; SHA1=189f0c6f2703758a6be917a3c4086f6a26e42366 ; teuthology-suite -v --ceph-repo https://github.com/ceph/ceph.git --suite-repo https://github.com/ceph/ceph.git -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -s rados --subset 35/50 -k distro -p 100 -e $CEPH_QA_MAIL --suite-branch jewel --dry-run
teuthology-suite -v -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -r $RERUN --suite-repo https://github.com/ceph/ceph.git --ceph-repo https://github.com/ceph/ceph.git --suite-branch jewel -p 90 -R fail,dead,running
| Suite | Runs/Reruns | Notes/Issues |
| rgw | http://pulpito.ceph.com/yuriw-2017-09-19_20:47:02-rgw-jewel-distro-basic-smithi/ http://pulpito.ceph.com/yuriw-2017-09-20_15:30:09-rgw-jewel-distro-basic-smithi/ | PASSED |
| rbd | http://pulpito.ceph.com/yuriw-2017-09-19_20:49:39-rbd-jewel-distro-basic-smithi/ http://pulpito.ceph.com/yuriw-2017-09-20_15:30:44-rbd-jewel-distro-basic-smithi/ | PASSED |
| fs | http://pulpito.ceph.com/yuriw-2017-09-19_20:52:59-fs-jewel-distro-basic-smithi/ http://pulpito.ceph.com/yuriw-2017-09-20_15:31:18-fs-jewel-distro-basic-smithi/ | FAILED #21481 approved by Patrick |
| kcephfs | http://pulpito.ceph.com/yuriw-2017-09-19_20:55:34-kcephfs-jewel-testing-basic-smithi/ | PASSED |
| knfs | http://pulpito.ceph.com/yuriw-2017-09-19_20:56:28-knfs-jewel-testing-basic-smithi/ | PASSED |
| rest | http://pulpito.ceph.com/yuriw-2017-09-19_20:58:27-rest-jewel-distro-basic-smithi/ | PASSED |
| hadoop | | EXCLUDED FROM THIS RELEASE |
| samba | | EXCLUDED FROM THIS RELEASE |
| ceph-disk | http://pulpito.ceph.com/yuriw-2017-09-19_20:59:46-ceph-disk-jewel-distro-basic-vps/ | PASSED |
| powercycle | http://pulpito.ceph.com/yuriw-2017-09-19_20:57:15-powercycle-jewel-testing-basic-smithi/ | PASSED |
| ceph-ansible | http://pulpito.ceph.com/yuriw-2017-09-19_21:04:38-ceph-ansible-jewel-distro-basic-vps/ | FAILED approved by Vasu |
| | | per Vasu: "Ceph-ansible is green and thats what i told in irc as well http://pulpito.ceph.com/vasu-2017-09-20_19:51:59-ceph-ansible-jewel-distro-basic-vps/ (it uses stable-2.1 branch of ceph-ansible) The failing tests during purge-cluster at the end can be ignored for now." |
| PASSED / FAILED | | |
Updated by Nathan Cutler over 6 years ago
- Description updated (diff)
- Status changed from In Progress to Resolved