Tasks #21830
luminous v12.2.2
Description
Workflow¶
- Preparing the release IN PROGRESS
- Cutting the release
- Abhishek L gets approval from all leads
- Yehuda, rgw
- Patrick, fs
- Jason, rbd
- Josh, rados
- Abhishek L. writes and commits the release notes
- Karol/Abhishek informs Yuri that the branch is ready for testing; SHA1 83684b91a3c6b31419114b83fc22106146885fb6 NEEDS TO BE UPDATED (several commits have been added on top)
- Abhishek L gets approval from all leads
- Yuri runs additional integration tests - DONE
- If Yuri discovers new bugs that need to be backported urgently (i.e. their priority is set to Urgent or Immediate), the release goes back to being prepared; it was not ready after all - DONE
- Yuri informs Alfredo that the branch is ready for release - DONE
- Alfredo creates the packages and sets the release tag - DONE
- Abhishek L. posts release announcement on https://ceph.com/community/blog
Release information¶
- branch to build from: luminous, commit: 83684b91a3c6b31419114b83fc22106146885fb6
- version: v12.2.2
- type of release: point release
- where to publish the release: http://download.ceph.com/debian-luminous and http://download.ceph.com/rpm-luminous
History
#1 Updated by Abhishek Lekshmanan over 6 years ago
- Copied from Tasks #21296: luminous v12.2.1 added
#2 Updated by Abhishek Lekshmanan over 6 years ago
- Copied from deleted (Tasks #21296: luminous v12.2.1)
#3 Updated by Abhishek Lekshmanan over 6 years ago
- Description updated (diff)
#4 Updated by Anonymous over 6 years ago
- Assignee set to Anonymous
#5 Updated by Abhishek Lekshmanan over 6 years ago
git --no-pager log --format='%H %s' --graph ceph/luminous..wip-luminous-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 17729
- |\
- | + ceph.in: validate service glob
- + Pull request 17856
- |\
- | + rgw: rgw_rados: set_attrs now sets the same time for BI & object
- + Pull request 17857
- |\
- | + rgw: return bucket's location no matter which zonegroup it located in.
- + Pull request 17858
- |\
- | + rgw: fix accessing expired memory in PrefixableSignatureHelper.
- + Pull request 17859
- |\
- | + rgw: fix lc process only schdule the first item of lc objects
- + Pull request 17860
- |\
- | + rbd-mirror: potential lockdep issue
- | + rbd-mirror: update asok hook name on image rename
- + Pull request 17861
- |\
- | + rbd: mirror get actions now have cleaner error messages
- | + cls/rbd: avoid recursively listing the watchers on rbd_mirroring object
- + Pull request 17889
- |\
- | + osd: Only scan for omap corruption once
- | + tools: Add --backend option to ceph-osdomap-tool default to rocksdb
- | + osd, mds, tools: drop the invalid comment and some unused variables
- | + tools: Add the ability to reset state to v2
- | + tools: Show DB state information
- + Pull request 17921
- |\
- | + ceph_volume_client: perform snapshot operations in
- + Pull request 17994
- |\
- | + mds: make sure snap inode's last matches its parent dentry's last
- + Pull request 18004
- |\
- | + rgw_file: fix write error when the write offset overlaps.
- + Pull request 18030
- |\
- | + qa: relax cap expected value check
- | + mds: improve cap min/max ratio descriptions
- | + mds: fix whitespace
- | + mds: cap client recall to min caps per client
- | + mds: fix conf types
- | + mds: fix whitespace
- | + doc/cephfs: add client min cache and max cache ratio describe
- | + mds: adding tunable features for caps_per_client
- + Pull request 18085
- |\
- | + ceph_volume_client: fix setting caps for IDs
- + Pull request 18138
- |\
- | + rgw: stop/join TokenCache revoke thread only if started.
- + Pull request 18287
- |\
- | + rgw: Remove assertions in IAM Policy
- + Pull request 18293
- |\
- | + arch/arm: set ceph_arch_aarch64_crc32 only if the build host supports crc32cx
- + Pull request 18298
- |\
- | + osdc/ObjectCacher: limit memory usage of BufferHead
- + Pull request 18299
- |\
- | + mds: update client metadata for already open session
- + Pull request 18300
- |\
- | + mds: keep CInode::STATE_QUEUEDEXPORTPIN state when exporting inode
- + Pull request 18316
- |\
- | + mds: prevent trim count from underflowing
- + Pull request 18334
- |\
- | + cls/rgw: increment header version to avoid overwriting bilog entries
- | + test/rgw: add test_multipart_object_sync
- + Pull request 18336
- |\
- | + librbd: snapshots should be created/removed against data pool
- + Pull request 18337
- |\
- | + rbd-mirror: ensure forced-failover cannot result in sync state
- | + rbd-mirror: forced-promotion should interrupt replay delay to shut down
- + Pull request 18364
- |\
- | + mon/OSDMonitor: mon osd feature checks with 0 up osds
- | + osd/OSDMap: ignore xinfo if features == 0
- + Pull request 18385
- |\
- | + mds: fix race in PurgeQueue::wait_for_recovery()
- | + mds: open purge queue when transitioning out of standby replay
- | + mds: always re-probe mds log when standby replay done
- + Pull request 18410
- |\
- | + qa/suites/rest/basic/tasks/rest_test: whiltelist OSD_DOWN
- | + qa/suites/rest/basic/tasks/rest_test: more whitelisting
- + Pull request 18412
- |\
- | + mgr: fix crashable DaemonStateIndex::get calls
- + Pull request 18413
- |\
- | + osd: additional protection for out-of-bounds EC reads
- + Pull request 18416
- |\
- | + librbd: batch large object map updates into multiple chunks
- | + test/librbd: initial test cases for trim state machine
- | + librbd: tweaks to support testing of trim state machine
- | + librbd: combine trim state machine object map batch update states
- | + cls/rbd: object map update now utilizes constant-time bit vector operations
- | + common/bit_vector: provide constant time iteration of underlying bufferlist
- | + common/buffer: expose hidden const deference operator
- + Pull request 18417
- |\
- | + cls/journal: fixed possible infinite loop which could kill the OSD
- | + test: ceph_test_cls_journal was dropped when converting to cmake
- + Pull request 18429
- |\
- | + rgw: encryption add exception handling for from_base64 on bad input
- | + rgw: encryption fix the issue when not provide encryption mode
- | + rgw: encryption SSE-KMS add the details of error msg in response
- | + rgw: encryption SSE-C add the details of error msg in response
- + Pull request 18430
- |\
- | + rgw: release cls lock if taken in RGWCompleteMultipart
- + Pull request 18431
- |\
- | + rgw: Torrents are not supported for objects encrypted using SSE-C
- + Pull request 18432
- |\
- | + rgw: disable dynamic resharding in multisite environment
- + Pull request 18433
- |\
- | + rgw_file: fix write error when the write offset overlaps.
- + Pull request 18434
- |\
- | + rgw: 'zone placement' commands validate compression type
- + Pull request 18435
- |\
- | + RGW: Multipart upload may double the quota
- + Pull request 18436
- |\
- | + rgw: RGWUser::init no longer overwrites user_id
- + Pull request 18437
- |\
- | + rgw: update the usage read iterator in truncated scenario Fixes: http://tracker.ceph.com/issues/21196
- + Pull request 18438
- |\
- | + RGW: fix a bug about inconsistent unit of comparison
- + Pull request 18439
- |\
- | + rgw: admin api - add ability to sync user stats from admin api
- + Pull request 18440
- |\
- | + rgw: Check bucket versioning operations in policy
- | + rgw: Check payment operations in policy
- + Pull request 18441
- |\
- | + rgw: defer constructing keystone engine unless url is configured
- + Pull request 18442
- |\
- | + rgw: include SSE-KMS headers in encrypted upload response
- + Pull request 18443
- |\
- | + rgw: Check bucket GetBucketLocation in policy
- + Pull request 18444
- |\
- | + rgw: Check bucket CORS operations in policy
- + Pull request 18445
- |\
- | + rgw: Check bucket Website operations in policy
- + Pull request 18446
- |\
- | + rgw_file: explicit NFSv3 open() emulation
- + Pull request 18456
- + qa/suites/upgrade/jewel-x/parallel: run some jewel after completed upgrade
- + qa/suites/upgrade/jewel-x/: set up compat weight-set after cluster upgrade
- + messages/MOSDMap: do compat reencode of crush map, too
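For readability, the one-liner that produced the list above can be unpacked into a small shell function. This is a sketch, not the exact command: `format_log` is a hypothetical helper name, and graph characters from `--graph` are not handled here.

```shell
#!/bin/sh
# Sketch of the log-to-textile conversion used above (format_log is a
# hypothetical helper name). Merge commits become "Pull request N" links;
# all other commits become links to the commit itself.
format_log() {
  while IFS=' ' read -r hash subject; do
    case "$subject" in
      'Merge pull request #'*)
        pr=$(printf '%s\n' "$subject" | sed -n 's/^Merge pull request #\([0-9][0-9]*\).*/\1/p')
        printf '* "Pull request %s":https://github.com/ceph/ceph/pull/%s\n' "$pr" "$pr"
        ;;
      *)
        printf '* "%s":https://github.com/ceph/ceph/commit/%s\n' "$subject" "$hash"
        ;;
    esac
  done
}

# In a real run you would pipe in:
#   git --no-pager log --format='%H %s' ceph/luminous..wip-luminous-backports
printf 'abc123 Merge pull request #17729 from ceph/wip-branch\n' | format_log
# -> * "Pull request 17729":https://github.com/ceph/ceph/pull/17729
```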
#6 Updated by Abhishek Lekshmanan over 6 years ago
RADOS¶
$ teuthology-suite -k distro --priority 999 --suite rados --subset $(expr $RANDOM % 999)/999 --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
- 1 failed, 225 passed http://pulpito.ceph.com/abhi-2017-10-21_15:46:36-rados-wip-luminous-backports-distro-basic-smithi/
Kefu mentioned this as a known, hard-to-reproduce failure: #20738
rerun: passed http://pulpito.ceph.com/abhi-2017-10-23_09:11:03-rados-wip-luminous-backports-distro-basic-smithi/
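The `--subset $(expr $RANDOM % 999)/999` argument above schedules only a random 1/999 slice of the suite's job matrix (the same idiom appears below with smaller denominators). A minimal illustration of how that fraction is built, with `${RANDOM:-123}` as a hedge for shells without `$RANDOM`:

```shell
#!/bin/bash
# Illustration of the --subset idiom used in the teuthology-suite commands:
# --subset i/n runs roughly the i-th 1/n slice of the suite's job matrix.
# $RANDOM (a bash built-in, 0..32767) picks a different slice on each run;
# the :-123 fallback only guards against non-bash shells.
n=999
i=$(expr ${RANDOM:-123} % $n)   # slice index in 0..n-1
echo "--subset $i/$n"           # e.g. --subset 123/999
```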
#7 Updated by Abhishek Lekshmanan over 6 years ago
RBD¶
$ teuthology-suite -k distro --priority 999 --suite rbd --subset $(expr $RANDOM % 5)/5 --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
Run 1 Results¶
- 7 failed, 3 dead, 441 passed http://pulpito.ceph.com/abhi-2017-10-21_15:51:20-rbd-wip-luminous-backports-distro-basic-smithi/
- 1758022: Only obvious PR is: https://github.com/ceph/ceph/pull/17860
- UPDATE: jdillaman says it's likely a repeat of http://tracker.ceph.com/issues/16019. Edge race, re-run should clear.
2017-10-21T17:34:29.229 INFO:tasks.workunit.client.0.smithi049.stdout:[ RUN ] TestJournalReplay.Rename
2017-10-21T17:34:29.487 INFO:tasks.workunit.client.0.smithi049.stdout:/build/ceph-12.2.1-368-g683eb49/src/test/librbd/journal/test_Replay.cc:611: Failure
2017-10-21T17:34:29.487 INFO:tasks.workunit.client.0.smithi049.stdout: Expected: 0
2017-10-21T17:34:29.487 INFO:tasks.workunit.client.0.smithi049.stdout:To be equal to: rbd.rename(m_ioctx, new_image_name.c_str(), m_image_name.c_str())
2017-10-21T17:34:29.487 INFO:tasks.workunit.client.0.smithi049.stdout: Which is: -2
2017-10-21T17:34:29.497 INFO:tasks.workunit.client.0.smithi049.stdout:[ FAILED ] TestJournalReplay.Rename (270 ms)
- 1758026: Nothing obvious.
- 1758034: OSD mount errors. Nothing else obvious.
2017-10-21T17:30:57.183 INFO:tasks.ceph:mount /dev/vg_nvme/lv_1 on ubuntu@smithi048.front.sepia.ceph.com -o noatime
2017-10-21T17:30:57.183 INFO:teuthology.orchestra.run.smithi048:Running: 'sudo mount -t xfs -o noatime /dev/vg_nvme/lv_1 /var/lib/ceph/osd/ceph-1'
2017-10-21T17:30:57.257 INFO:teuthology.orchestra.run.smithi048.stderr:mount: /dev/mapper/vg_nvme-lv_1 already mounted or /var/lib/ceph/osd/ceph-1 busy
2017-10-21T17:30:57.260 ERROR:teuthology.contextutil:Saw exception from nested tasks
- 1758044: Nothing obvious. Env?
- 1758106: Nothing obvious. Env?
- 1758259: Erasure coding/bluestore?
2017-10-21T18:36:42.808 INFO:tasks.ceph.osd.2.smithi194.stderr:/build/ceph-12.2.1-368-g683eb49/src/os/bluestore/bluestore_types.h: In function 'int bluestore_blob_t::map(uint64_t, uint64_t, std::function<int(long unsigned int, long unsigned int)>) const' thread 7f514ff90700 time 2017-10-21 18:36:42.807831
2017-10-21T18:36:42.808 INFO:tasks.ceph.osd.2.smithi194.stderr:/build/ceph-12.2.1-368-g683eb49/src/os/bluestore/bluestore_types.h: 742: FAILED assert(p != extents.end())
2017-10-21T18:36:42.808 INFO:tasks.ceph.osd.5.smithi006.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1-368-g683eb49/rpm/el7/BUILD/ceph-12.2.1-368-g683eb49/src/os/bluestore/bluestore_types.h: In function 'int bluestore_blob_t::map(uint64_t, uint64_t, std::function<int(long unsigned int, long unsigned int)>) const' thread 7f52a82da700 time 2017-10-21 18:36:42.803664
2017-10-21T18:36:42.809 INFO:tasks.ceph.osd.5.smithi006.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1-368-g683eb49/rpm/el7/BUILD/ceph-12.2.1-368-g683eb49/src/os/bluestore/bluestore_types.h: 742: FAILED assert(p != extents.end())
2017-10-21T18:36:42.809 INFO:tasks.ceph.osd.7.smithi006.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1-368-g683eb49/rpm/el7/BUILD/ceph-12.2.1-368-g683eb49/src/os/bluestore/bluestore_types.h: In function 'int bluestore_blob_t::map(uint64_t, uint64_t, std::function<int(long unsigned int, long unsigned int)>) const' thread 7fa4b22d9700 time 2017-10-21 18:36:42.803811
2017-10-21T18:36:42.809 INFO:tasks.ceph.osd.7.smithi006.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1-368-g683eb49/rpm/el7/BUILD/ceph-12.2.1-368-g683eb49/src/os/bluestore/bluestore_types.h: 742: FAILED assert(p != extents.end())
2017-10-21T18:36:42.811 INFO:tasks.ceph.osd.2.smithi194.stderr: ceph version 12.2.1-368-g683eb49 (683eb4916e8acda8816d081bc651ac48bc587ad8) luminous (stable)
2017-10-21T18:36:42.812 INFO:tasks.ceph.osd.2.smithi194.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55576dfbe752]
2017-10-21T18:36:42.812 INFO:tasks.ceph.osd.2.smithi194.stderr: 2: (bluestore_blob_t::map(unsigned long, unsigned long, std::function<int (unsigned long, unsigned long)>) const+0x157) [0x55576de85cd7]
2017-10-21T18:36:42.812 INFO:tasks.ceph.osd.2.smithi194.stderr: 3: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xad4) [0x55576de5f534]
2017-10-21T18:36:42.812 INFO:tasks.ceph.osd.2.smithi194.stderr: 4: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x7b) [0x55576de603eb]
2017-10-21T18:36:42.812 INFO:tasks.ceph.osd.2.smithi194.stderr: 5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1bbc) [0x55576de76b5c]
2017-10-21T18:36:42.813 INFO:tasks.ceph.osd.2.smithi194.stderr: 6: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x52e) [0x55576de77c5e]
2017-10-21T18:36:42.813 INFO:tasks.ceph.osd.2.smithi194.stderr: 7: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x66) [0x55576db9f096]
2017-10-21T18:36:42.813 INFO:tasks.ceph.osd.2.smithi194.stderr: 8: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x889) [0x55576dce0f19]
2017-10-21T18:36:42.813 INFO:tasks.ceph.osd.2.smithi194.stderr: 9: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x331) [0x55576dcfa091]
2017-10-21T18:36:42.813 INFO:tasks.ceph.osd.2.smithi194.stderr: 10: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55576dbdc9b0]
2017-10-21T18:36:42.814 INFO:tasks.ceph.osd.2.smithi194.stderr: 11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x55d) [0x55576db4170d]
2017-10-21T18:36:42.814 INFO:tasks.ceph.osd.2.smithi194.stderr: 12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x55576d9bf2e9]
2017-10-21T18:36:42.814 INFO:tasks.ceph.osd.2.smithi194.stderr: 13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55576dc5dd47]
2017-10-21T18:36:42.814 INFO:tasks.ceph.osd.2.smithi194.stderr: 14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x130e) [0x55576d9e7e8e]
2017-10-21T18:36:42.814 INFO:tasks.ceph.osd.2.smithi194.stderr: 15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x55576dfc3544]
2017-10-21T18:36:42.815 INFO:tasks.ceph.osd.2.smithi194.stderr: 16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55576dfc6580]
2017-10-21T18:36:42.815 INFO:tasks.ceph.osd.2.smithi194.stderr: 17: (()+0x76ba) [0x7f516a2d96ba]
2017-10-21T18:36:42.815 INFO:tasks.ceph.osd.2.smithi194.stderr: 18: (clone()+0x6d) [0x7f51693503dd]
- 1758265: Looks to be an issue with bringing up the cluster.
- 1758351: OSD mount errors. Nothing else obvious.
2017-10-21T19:08:51.770 INFO:tasks.ceph:mount /dev/vg_nvme/lv_2 on ubuntu@smithi053.front.sepia.ceph.com -o noatime
2017-10-21T19:08:51.770 INFO:teuthology.orchestra.run.smithi053:Running: 'sudo mount -t xfs -o noatime /dev/vg_nvme/lv_2 /var/lib/ceph/osd/ceph-6'
2017-10-21T19:08:51.803 INFO:teuthology.orchestra.run.smithi053.stderr:mount: /dev/mapper/vg_nvme-lv_2 already mounted or /var/lib/ceph/osd/ceph-6 busy
2017-10-21T19:08:51.804 ERROR:teuthology.contextutil:Saw exception from nested tasks
- 1758354: jdillaman says it's likely http://tracker.ceph.com/issues/11502 (cache tier can go back in time)
2017-10-21T19:11:05.122 INFO:tasks.workunit.client.0.smithi013.stdout:[==========] 201 tests from 11 test cases ran. (282890 ms total)
2017-10-21T19:11:05.122 INFO:tasks.workunit.client.0.smithi013.stdout:[ PASSED ] 200 tests.
2017-10-21T19:11:05.122 INFO:tasks.workunit.client.0.smithi013.stdout:[ FAILED ] 1 test, listed below:
2017-10-21T19:11:05.122 INFO:tasks.workunit.client.0.smithi013.stdout:[ FAILED ] TestLibRBD.LockingPP
2017-10-21T19:11:05.122 INFO:tasks.workunit.client.0.smithi013.stdout:
2017-10-21T19:11:05.123 INFO:tasks.workunit.client.0.smithi013.stdout: 1 FAILED TEST
2017-10-21T19:11:05.139 INFO:tasks.workunit:Stopping ['rbd/test_librbd.sh'] on client.0...
2017-10-21T19:11:05.139 INFO:teuthology.orchestra.run.smithi013:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0'
2017-10-21T19:11:05.372 ERROR:teuthology.run_tasks:Saw exception from tasks.
- 1758599: EC/bluestore... see 1758259 above
Run 2 Results¶
- 1 failed, 2 dead http://pulpito.ceph.com/abhi-2017-10-24_13:20:16-rbd-wip-luminous-backports-distro-basic-smithi/
- Dead cases show the same EC/bluestore traces as above - should be fixed by http://tracker.ceph.com/issues/21766 (the PR for which has been merged to the luminous branch).
- Failed case caused by an update failure (package hash sum mismatch)
#8 Updated by Abhishek Lekshmanan over 6 years ago
RGW¶
$ teuthology-suite -k distro --priority 999 --suite rgw --subset $(expr $RANDOM % 2)/2 --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
- 5 failed, 101 passed http://pulpito.ceph.com/abhi-2017-10-21_15:52:29-rgw-wip-luminous-backports-distro-basic-smithi/
- 4 failed http://pulpito.ceph.com/abhi-2017-10-24_13:21:33-rgw-wip-luminous-backports-distro-basic-smithi/
The 2 multisite failures are the known valgrind failures; the other 2 are environmental.
#9 Updated by Abhishek Lekshmanan over 6 years ago
FS¶
$ teuthology-suite -k distro --priority 999 --suite fs --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
Run 1 Results¶
- 1 failed, 210 passed http://pulpito.ceph.com/abhi-2017-10-21_15:56:08-fs-wip-luminous-backports-distro-basic-smithi/
- 1758697: From the log, the copy of the fsid file is only performed for OSD 1, and it happens before unmounting. It is not clear why OSD 1 is apparently missing its fsid file.
2017-10-21T21:01:42.430 INFO:tasks.ceph:Starting osd daemons in cluster ceph...
2017-10-21T21:01:42.430 INFO:teuthology.orchestra.run.smithi101:Running: "python -c 'import os; import tempfile; import sys;(fd,fname) = tempfile.mkstemp();os.close(fd);sys.stdout.write(fname.rstrip());sys.stdout.flush()'"
2017-10-21T21:01:42.465 INFO:teuthology.orchestra.run.smithi101.stdout:/tmp/tmpyay5oL
2017-10-21T21:01:42.465 INFO:teuthology.orchestra.run.smithi101:Running: 'sudo cp /var/lib/ceph/osd/ceph-1/fsid /tmp/tmpyay5oL'
2017-10-21T21:01:42.570 INFO:teuthology.orchestra.run.smithi101.stderr:cp: cannot stat '/var/lib/ceph/osd/ceph-1/fsid': No such file or directory
2017-10-21T21:01:42.570 ERROR:teuthology.contextutil:Saw exception from nested tasks
Run 2 Results¶
- 1769519: Same issue as above.
Patrick reported these as known failures. Deemed pass.
#10 Updated by Abhishek Lekshmanan over 6 years ago
ceph-disk¶
$ teuthology-suite -k distro --priority 999 --suite ceph-disk --email abhishek@suse.com --ceph wip-luminous-backports -m vps http://pulpito.ceph.com/abhi-2017-10-21_15:57:45-ceph-disk-wip-luminous-backports-distro-basic-vps
1 failed, 2 passed http://pulpito.ceph.com/abhi-2017-10-21_15:57:45-ceph-disk-wip-luminous-backports-distro-basic-vps/
- 1758657: A pool near full health warning along with ceph-disk dmcrypt errors (trace below)
2017-10-21T16:24:33.843 INFO:tasks.workunit.client.0.vpm027.stderr:command: Running command: /usr/sbin/blkid -o udev -p /dev/vdb2
2017-10-21T16:24:33.843 INFO:tasks.workunit.client.0.vpm027.stderr:get_dmcrypt_key: no `ceph_fsid` found falling back to 'ceph' for cluster name
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr:Traceback (most recent call last):
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/sbin/ceph-disk", line 9, in <module>
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5695, in run
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: main(sys.argv[1:])
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5646, in main
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: args.func(args)
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5396, in <lambda>
2017-10-21T16:24:33.844 INFO:tasks.workunit.client.0.vpm027.stderr: func=lambda args: main_activate_space(name, args),
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4111, in main_activate_space
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr: dev = dmcrypt_map(args.dev, args.dmcrypt_key_dir)
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3453, in dmcrypt_map
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr: dmcrypt_key = get_dmcrypt_key(part_uuid, dmcrypt_key_dir, luks)
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1314, in get_dmcrypt_key
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr: raise Error('unknown key-management-mode ' + str(mode))
2017-10-21T16:24:33.845 INFO:tasks.workunit.client.0.vpm027.stderr:ceph_disk.main.Error: Error: unknown key-management-mode None
Possibly related issue: #18945
#11 Updated by Abhishek Lekshmanan over 6 years ago
Powercycle¶
$ teuthology-suite -k distro --priority 999 --suite powercycle -l 2 --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
#12 Updated by Abhishek Lekshmanan over 6 years ago
upgrade jewel-x¶
$ teuthology-suite -k distro --priority 999 --suite upgrade/jewel-x --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
- 40 failed http://pulpito.ceph.com/abhi-2017-10-21_15:58:19-upgrade:jewel-x-wip-luminous-backports-distro-basic-smithi
Sage mentioned this was due to PR https://github.com/ceph/ceph/pull/18456, which was included in the run and is broken; it is now marked as DNM.
#13 Updated by Abhishek Lekshmanan over 6 years ago
upgrade luminous-x¶
$ teuthology-suite -k distro --priority 999 --suite upgrade/luminous-x --email abhishek@suse.com --ceph wip-luminous-backports -m smithi
6 failed, 1 dead http://pulpito.ceph.com/abhi-2017-10-21_15:58:01-upgrade:luminous-x-wip-luminous-backports-distro-basic-smithi
environmental noise
#14 Updated by Abhishek Lekshmanan over 6 years ago
- + Pull request 18008
- |\
- | + mds: fix CDir::log_mark_dirty()
- + Pull request 18516
- |\
- | + qa/rgw: ignore errors from 'pool application enable'
- + Pull request 18539
- |\
- | + You can find the problem do like this:
- + Pull request 18564
- |\
- | + librbd: list_children should not attempt to refresh image
- + Pull request 18566
- |\
- | + rbd-mirror: strip environment/CLI overrides for remote cluster
- + Pull request 18569
- |\
- | + rgw:fix list objects with marker when bucket is enable versioning
- + Pull request 18589
- |\
- | + debian: fix package relationships after d3ac8d18
- | + debian: fix package relationships after 40caf6a6
- | + build/ops: deb: move ceph-*-tool binaries out of ceph-test subpackage
- | + build/ops: rpm: move ceph-*-tool binaries out of ceph-test subpackage
- + Pull request 18591
- |\
- | + rgw: RGWDataSyncControlCR retries on all errors
- | + rgw: fix error handling in ListBucketIndexesCR
- | + rgw: ListBucketIndexesCR spawns entries_index after listing metadata
- + Pull request 18596
- |\
- | + qa/cephfs: test ec data pool
- + Pull request 18599
- |\
- | + rgw_file: set s->obj_size from bytes_written
- + Pull request 18626
- |\
- | + qa/suites/rbd: run cls tests for all dependencies
- | + cls/journal: fixed possible infinite loop in expire_tags
- + Pull request 18628
- |\
- | + MDSMonitor: wait for readable OSDMap before sanitizing
- | + mds: clean up non-existent data pools in MDSMap
- | + mds: reduce variable scope
- + Pull request 18673
- |\
- | + osd: build_past_intervals_parallel: Ignore new partially created PGs
- + Pull request 18688
- |\
- | + mgr/balancer: simplify pool_info tracking
- | + mgr/balancer: less verbose on 'eval' by default; add 'eval-verbose'
- | + mgr/balancer: fix pg vs object terminology
- | + mgr/balancer: restrict to time of day
- | + mgr/module: adjust osd_weight min step to .005
- | + mgr/balancer: if score regresses, take a few more steps
- | + mgr/balancer: allow 5% misplaced
- | + mgr/balancer: more aggressive steps
- | + qa/suites/rados/thrash/d-balancer: enable balancer in various modes
- | + mgr/balancer: crush-compat: phase out osd_weights
- | + mgr/balancer: crush_compat: cope with 'out' osds
- | + mgr/balancer: stop if we get a perfect score
- | + mgr/balancer: more dead code
- | + mgr/balancer: crush-compat: throttle changes based on max_misplaced
- | + mgr/balancer: remove dead code
- | + mgr/balancer: include pg up mapping in MappingState
- | + mgr/balancer: normalize weight-set weights to sum to target weight
- | + mgr/balancer: note root id in Eval
- | + mgr/balancer: make crush-compat mode work!
- + Pull request 18446
- + rgw_file: explicit NFSv3 open() emulation
#16 Updated by Abhishek Lekshmanan over 6 years ago
RADOS¶
- 3 failed, rest passed http://pulpito.ceph.com/abhi-2017-11-05_16:35:44-rados-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/
2 environmental: http://qa-proxy.ceph.com/teuthology/abhi-2017-11-05_16:35:44-rados-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/1815714/
1 possible timing issue, reported at http://tracker.ceph.com/issues/22047; rerun passed http://pulpito.ceph.com/abhi-2017-11-06_13:14:17-rados-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/
#18 Updated by Abhishek Lekshmanan over 6 years ago
RGW¶
- 33 failed http://pulpito.ceph.com/abhi-2017-11-05_16:46:58-rgw-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/
Probably needs https://github.com/ceph/s3-tests/pull/195 to be backported
- s3-tests was fixed with a push to the ceph-luminous branch - rerunning http://pulpito.ceph.com/abhi-2017-11-06_15:37:57-rgw-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/
- 1820535 http://tracker.ceph.com/issues/22052
#19 Updated by Abhishek Lekshmanan over 6 years ago
#21 Updated by Abhishek Lekshmanan over 6 years ago
Upgrade Luminous-x¶
- 6 failed, running http://pulpito.ceph.com/abhi-2017-11-05_16:56:34-upgrade:luminous-x-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/
Reproduced on rerun: Command failed on smithi011 with status 1: "rpmq ceph-common --qf '%{VERSION}%{RELEASE}'", and a similar Debian equivalent
#22 Updated by Abhishek Lekshmanan over 6 years ago
Upgrade Jewel-x¶
#23 Updated by Abhishek Lekshmanan over 6 years ago
cephfs¶
- running, 6 failed http://pulpito.ceph.com/abhi-2017-11-05_16:49:08-fs-wip-abhi-testing-2017-11-05-1320-distro-basic-smithi/
Probably seeing http://tracker.ceph.com/issues/22039#change-101941 in multiple instances
#24 Updated by Abhishek Lekshmanan over 6 years ago
- Status changed from New to In Progress
#25 Updated by Abhishek Lekshmanan over 6 years ago
- Description updated (diff)
#26 Updated by Yuri Weinstein over 6 years ago
QE VALIDATION (STARTED 11/9/17)¶
RELEASE APPROVED ON 11/30/17
(Note: PASSED / FAILED - indicates "TEST IS IN PROGRESS")
See details of approvals => https://marc.info/?t=151078342600002&r=1&w=2
Re-run command lines and filters are captured in http://pad.ceph.com/p/luminous_v12.2.2_QE_validation_notes
command line =>
CEPH_QA_MAIL="ceph-qa@ceph.com"; MACHINE_NAME=smithi; CEPH_BRANCH=luminous; SHA1=83684b91a3c6b31419114b83fc22106146885fb6 - FINAL
(and many more older SHA1s: 1071fdcf73faa387d0df18489ab7b0359a0c0afb, a7c8c8101d4b78b4d6e437620b2c1a38cd752c3f)
teuthology-suite -v --ceph-repo https://github.com/ceph/ceph.git --suite-repo https://github.com/ceph/ceph.git -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -s rados --subset 35/50 -k distro -p 100 -e $CEPH_QA_MAIL --suite-branch $CEPH_BRANCH --dry-run
rerun command line =>
teuthology-suite -v -c $CEPH_BRANCH -S $SHA1 -m $MACHINE_NAME -r $RERUN --suite-repo https://github.com/ceph/ceph.git --ceph-repo https://github.com/ceph/ceph.git --suite-branch $CEPH_BRANCH -p 90 -R fail,dead,running
Suite | Runs/Reruns | Notes/Issues |
rgw | http://pulpito.ceph.com/yuriw-2017-11-09_16:59:06-rgw-luminous-distro-basic-smithi/ | FAILED #21154 Casey approved, ansible errors need more reviewing |
http://pulpito.ceph.com/yuriw-2017-11-10_15:59:57-rgw-luminous-distro-basic-smithi/ | ||
http://pulpito.ceph.com/yuriw-2017-11-28_17:27:51-rgw-luminous-distro-basic-smithi/ | rerun on latest sha1, approved by Matt #21154 | |
From IRC "(01:31:21 PM) mattbenjamin: yuriw: I reviewed http://qa-proxy.ceph.com/teuthology/yuriw-2017-11-28_17:27:51-rgw-luminous-distro-basic-smithi/1901431/remote/smithi067/log/valgrind/ and http://qa-proxy.ceph.com/teuthology/yuriw-2017-11-10_15:59:57-rgw-luminous-distro-basic-smithi/1834895/remote/smithi137/log/valgrind/ ; I think these are the only not-trivially false positive run fails; I think they are clearly the same issue, which is a use-after-free in multi-siite; Casey b (01:32:29 PM) cbodley: the tracker for that one is http://tracker.ceph.com/issues/21154" |
hadoop | EXCLUDED FROM THIS RELEASE | |
samba | EXCLUDED FROM THIS RELEASE | |
upgrade/client-upgrade-hammer (luminous) | http://pulpito.ceph.com/yuriw-2017-11-09_17:28:10-upgrade:client-upgrade-hammer-luminous-distro-basic-ovh/ | PASSED |
upgrade/client-upgrade-kraken (luminous) | http://pulpito.ceph.com/yuriw-2017-11-09_17:28:49-upgrade:client-upgrade-kraken-luminous-distro-basic-ovh/ | PASSED |
upgrade/client-upgrade-jewel (luminous) | http://pulpito.ceph.com/yuriw-2017-11-09_17:29:21-upgrade:client-upgrade-jewel-luminous-distro-basic-ovh/ | PASSED |
upgrade/kraken-x (luminous) | http://pulpito.ceph.com/teuthology-2017-11-13_03:25:02-upgrade:kraken-x-luminous-distro-basic-ovh/ | PASSED |
upgrade/luminous-x (master) | EXCLUDED FROM THIS RELEASE | |
powercycle | http://pulpito.ceph.com/yuriw-2017-11-09_17:23:42-powercycle-smithi-testing-basic-smithi/ | |
http://pulpito.ceph.com/yuriw-2017-11-10_17:15:33-powercycle-luminous-distro-basic-smithi/ | FAILED #22108 need Sage review | |
http://pulpito.ceph.com/yuriw-2017-11-28_17:32:12-powercycle-luminous-distro-basic-smithi/ | on latest sha1 approved by Sage | |
PASSED / FAILED | ||
#27 Updated by Yuri Weinstein over 6 years ago
- Description updated (diff)
#28 Updated by Yuri Weinstein over 6 years ago
- Description updated (diff)
#29 Updated by Nathan Cutler over 6 years ago
- Description updated (diff)
- Status changed from In Progress to Resolved