Tasks #12701
Status: Closed
Target version: hammer v0.94.4
% Done: 0%
Description
Workflow
- Preparing the release OK
- Cutting the release
- Loic asks Sage if a point release should be published OK
- Loic gets approval from all leads
- Yehuda, rgw: OK
- Gregory, CephFS: no patch of concern to CephFS
- Josh, RBD: OK
- Sam, rados: OK
- Sage writes and commits the release notes IN PROGRESS
- Loic informs Yuri that the branch is ready for testing DONE
- Yuri runs additional integration tests DONE
- If Yuri discovers new bugs that need to be backported urgently (i.e. their priority is set to Urgent), the release goes back to the preparation stage: it was not ready after all
- Yuri informs Alfredo that the branch is ready for release DONE
- Alfredo creates the packages and sets the release tag DONE
Release information
- branch to build from: hammer, commit 7f485ed5aa620fe982561663bf64356b7e2c38f2
- version: v0.94.4
- type of release: point release
- where to publish the release: http://ceph.com/debian-hammer and http://ceph.com/rpm-hammer
git --no-pager log --format='%H %s' --graph tags/v0.94.3..ceph/hammer | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 6161
- |\
- | + init-radosgw.sysv: remove
- | + init-radosgw: specify pid file to start-stop-daemon
- | + rgw: fix radosgw start-up script.
- | + init-radosgw: unify init-radosgw[.sysv]
- | + init-radosgw: look in /var/lib/ceph/radosgw
- | + doc: rgw: fix typo in comments
- | + rgw: init script waits until the radosgw stops
- + | Pull request 6166
- |\ \
- | + | rgw : setting max number of buckets for users via ceph.conf option
- | |/
- + | Pull request 6039
- |\ \
- | + | rgw: preserve all attrs if intra-zone copy
- | + | rgw: don't preserve acls when copying object
- | + | rgw: fix assignment of copy obj attributes
- + | | Pull request 6042
- |\ \ \
- | + | | rgw: set default value for env->get() call
- | + | | rgw: remove trailing :port from host for purposes of subdomain matching
- + | | | Pull request 6160
- |\ \ \ \
- | + | | | rgw: don't read actual data on user manifest HEAD
- | | |_|/
- | |/| |
- + | | | Pull request 6162
- |\ \ \ \
- | + | | | rgw: be more flexible with iso8601 timestamps
- | |/ / /
- + | | | Pull request 6163
- |\ \ \ \
- | + | | | rgw:add --reset-regions for regionmap update
- | |/ / /
- + | | | Pull request 6164
- |\ \ \ \
- | + | | | rgw: delete finisher only after finalizing watches
- | |/ / /
- + | | | Pull request 5718
- |\ \ \ \
- | + | | | rgw: send Content-Length in response for GET on Swift account.
- | + | | | rgw: force content_type for swift bucket stats request
- | + | | | rgw: we should not overide Swift sent content type
- | + | | | rgw: enforce Content-Type in Swift responses.
- | + | | | rgw: force content-type header for swift account responses without body
- | + | | | rgw: shouldn't return content-type: application/xml if content length is 0
- + | | | | Pull request 5860
- |\ \ \ \ \
- | + | | | | rgw: add delimiter to prefix only when path is specified
- | | |_|_|/
- | |/| | |
- + | | | | Pull request 6165
- |\ \ \ \ \
- | + | | | | rgw: init_rados failed leads to repeated delete
- | | |_|/ /
- | |/| | |
- + | | | | Pull request 6201
- |\ \ \ \ \
- | + | | | | tests: robust test for the pool create crushmap test
- + | | | | | Pull request 5885
- |\ \ \ \ \ \
- | |/ / / / /
- |/| | | | |
- | + | | | | ReplicatedPG,Objecter: copy_get should include truncate_seq and size
- + | | | | | Pull request 6192
- |\ \ \ \ \ \
- | + | | | | | crush/CrushTester: test fewer inputs when running crushtool
- | + | | | | | tests: update to match crushmap validation message
- | + | | | | | mon/OSDMonitor: fix crush injection error message
- | + | | | | | mon/OSDMonitor: only test crush ruleset for the newly created pool
- | + | | | | | crush/CrushTester: allow testing by ruleset
- |/ / / / / /
- + | | | | | Pull request 5887
- |\ \ \ \ \ \
- | + | | | | | crypto: fix unbalanced ceph::crypto::init/ceph::crypto:shutdown
- | |/ / / / /
- + | | | | | Pull request 6172
- |\ \ \ \ \ \
- | + | | | | | qa/workunits/cephtool/test.sh: don't assume crash_replay_interval=45
- | | |/ / / /
- | |/| | | |
- + | | | | | Pull request 6156
- |\ \ \ \ \ \
- | |/ / / / /
- |/| | | | |
- | + | | | | doc: remove mention of ceph-extra as a requirement
- | + | | | | doc: remove ceph-extras
- | + | | | | doc: correct links to download.ceph.com
- | + | | | | doc: Added Hammer in the list of major releases.
- |/ / / / /
- + | | | | osd/ReplicatedPG: tolerate promotion completion with stopped agent
- + | | | | Pull request 5715
- |\ \ \ \ \
- | + | | | | rgw: url encode exposed bucket
- + | | | | | Pull request 5719
- |\ \ \ \ \ \
- | + | | | | | rgw:segmentation fault when rgw_gc_max_objs > HASH_PRIME
- + | | | | | | Pull request 5720
- |\ \ \ \ \ \ \
- | |_|_|_|_|_|/
- |/| | | | | |
- | + | | | | | rgw:the arguments 'domain' should not be assigned when return false
- | |/ / / / /
- + | | | | | Pull request 5930
- |\ \ \ \ \ \
- | + | | | | | upstart: limit respawn to 3 in 30 mins
- + | | | | | | Pull request 5908
- |\ \ \ \ \ \ \
- | |/ / / / / /
- |/| | | | | |
- | + | | | | | Pipe: Drop connect_seq increase line
- |/ / / / / /
- + | | | | | Pull request 5767
- |\ \ \ \ \ \
- | + | | | | | librbd: Add a paramter:purge_on_error in ImageCtx::invalidate_cache().
- | + | | | | | librbd: Remvoe unused func ImageCtx::read_from_cache.
- | + | | | | | osdc: clean up code in ObjectCacher::Object::map_write
- | + | | | | | osdc: Don't pass mutex into ObjectCacher::_wait_for_write.
- | + | | | | | osdc: After write try merge bh.
- | + | | | | | osdc: Make last missing bh to wake up the reader.
- | + | | | | | osdc: For trust_enoent is true, there is only one extent.
- | + | | | | | osdc: In _readx() only no error can tidy read result.
- | | |_|_|_|/
- | |/| | | |
- + | | | | | Pull request 5687
- |\ \ \ \ \ \
- | + | | | | | include/ceph_features: define HAMMER_0_94_4 feature
- + | | | | | | Pull request 5892
- |\ \ \ \ \ \ \
- | |_|_|_|_|/ /
- |/| | | | | |
- | + | | | | | osd/PG: peek_map_epoch: skip legacy PGs if infos object is missing
- | + | | | | | osd: allow peek_map_epoch to return an error
- |/ / / / / /
- + | | | | | Pull request 5769
- |\ \ \ \ \ \
- | + | | | | | librbd: prevent race condition between resize requests
- | | |/ / / /
- | |/| | | |
- + | | | | | Pull request 5768
- |\ \ \ \ \ \
- | |_|_|_|_|/
- |/| | | | |
- | + | | | | lockdep: allow lockdep to be dynamically enabled/disabled
- | + | | | | tests: librbd API test cannot use private md_config_t struct
- | + | | | | tests: ensure old-format RBD tests still work
- | + | | | | librados_test_stub: implement conf get/set API methods
- | + | | | | crypto: use NSS_InitContext/NSS_ShutdownContex to avoid memory leak
- | + | | | | auth: use crypto_init_mutex to protect NSS_Shutdown()
- | + | | | | auth: reinitialize NSS modules after fork()
- + | | | | | Pull request 5697
- |\ \ \ \ \ \
- | + | | | | | mon: add a cache layer over MonitorDBStore
- | | |_|/ / /
- | |/| | | |
- + | | | | | Pull request 5381
- |\ \ \ \ \ \
- | + | | | | | Client: check dir is still complete after dropping locks in _readdir_cache_cb
- | / / / / /
- + | | | | | Pull request 5757
- |\ \ \ \ \ \
- | + | | | | | WBThrottle::clear_object: signal if we cleared an object
- + | | | | | | Pull request 5759
- |\ \ \ \ \ \ \
- | + | | | | | | config: skip lockdep for intentionally recursive md_config_t lock
- | |/ / / / / /
- + | | | | | | Pull request 5761
- |\ \ \ \ \ \ \
- | + | | | | | | OSD: break connection->session->waiting message->connection cycle
- + | | | | | | | Pull request 5762
- |\ \ \ \ \ \ \ \
- | + | | | | | | | osd: copy the RecoveryCtx::handle when creating a new RecoveryCtx instance from another one
- | | |/ / / / / /
- | |/| | | | | |
- + | | | | | | | Pull request 5763
- |\ \ \ \ \ \ \ \
- | + | | | | | | | osd/PGLog: dirty_to is inclusive
- | | |/ / / / / /
- | |/| | | | | |
- + | | | | | | | Pull request 5764
- |\ \ \ \ \ \ \ \
- | + | | | | | | | common: fix code format
- | + | | | | | | | test: add test case for insert empty ptr when buffer rebuild
- | + | | | | | | | common: fix insert empty ptr when bufferlist rebuild
- | |/ / / / / / /
- + | | | | | | | Pull request 5373
- |\ \ \ \ \ \ \ \
- | + | | | | | | | osd: pg_interval_t::check_new_interval should not rely on pool.min_size to determine if the PG was active
- | + | | | | | | | osd: Move IsRecoverablePredicate/IsReadablePredicate to osd_types.h
- | / / / / / / /
- + | | | | | | | Pull request 5383
- |\ \ \ \ \ \ \ \
- | + | | | | | | | rest_bench: bucketname is not mandatory as we have a default name
- | + | | | | | | | rest_bench: drain the work queue to fix a crash Fixes: #3896 Signed-off-by: huangjun <hjwsm1989@gmail.com>
- | / / / / / / /
- + | | | | | | | Pull request 5765
- |\ \ \ \ \ \ \ \
- | + | | | | | | | tests: tiering agent and proxy read
- | + | | | | | | | osd: trigger the cache agent after a promotion
- + | | | | | | | | Pull request 5754
- |\ \ \ \ \ \ \ \ \
- | + | | | | | | | | librados: Make librados pool_create respect default_crush_ruleset
- | | |_|/ / / / / /
- | |/| | | | | | |
- + | | | | | | | | Pull request 5377
- |\ \ \ \ \ \ \ \ \
- | + | | | | | | | | mon/PGMonitor: bug fix pg monitor get crush rule
- | / / / / / / / /
- + | | | | | | | | Pull request 5758
- |\ \ \ \ \ \ \ \ \
- | + | | | | | | | | osd: Keep a reference count on Connection while calling send_message()
- | |/ / / / / / / /
- + | | | | | | | | Pull request 5276
- |\ \ \ \ \ \ \ \ \
- | |_|/ / / / / / /
- |/| | | | | | | |
- | + | | | | | | | mon: test the crush ruleset when creating a pool
- | + | | | | | | | erasure-code: set max_size to chunk_count() instead of 20 for shec
- | + | | | | | | | vstart.sh: set PATH to include pwd
- + | | | | | | | | Pull request 5382
- |\ \ \ \ \ \ \ \ \
- | + | | | | | | | | auth: check return value of keyring->get_secret
- | / / / / / / / /
- + | | | | | | | | Pull request 5367
- |\ \ \ \ \ \ \ \ \
- | |_|_|_|_|/ / / /
- |/| | | | | | | |
- | + | | | | | | | os/chain_xattr: handle read on chnk-aligned xattr
- | / / / / / / /
- + | | | | | | | Pull request 5223
- |\ \ \ \ \ \ \ \
- | + | | | | | | | ceph.spec.in: do not run fdupes, even on SLE/openSUSE
- | / / / / / / /
- + | | | | | | | Pull request 5716
- |\ \ \ \ \ \ \ \
- | + | | | | | | | rgw: avoid using slashes for generated secret keys
- | | |_|_|_|_|_|/
- | |/| | | | | |
- + | | | | | | | Pull request 5717
- |\ \ \ \ \ \ \ \
- | |_|_|_|_|_|/ /
- |/| | | | | | |
- | + | | | | | | rgw: api adjustment following a rebase
- | + | | | | | | rgw: orphans, fix check on number of shards
- | + | | | | | | rgw: orphans, change default number of shards
- | + | | | | | | rgw: change error output related to orphans
- | + | | | | | | rgw: orphan, fix truncated detection
- | + | | | | | | radosgw-admin: simplify orphan command
- | + | | | | | | radosgw-admin: stat orphan objects before reporting leakage
- | + | | | | | | radosgw-admin: orphans finish command
- | + | | | | | | rgw: cannot re-init an orphan scan job
- | + | | | | | | rgw: stat_async() sets the object locator appropriately
- | + | | | | | | rgw: list_objects() sets namespace appropriately
- | + | | | | | | rgw: modify orphan search fingerprints
- | + | | | | | | rgw: compare oids and dump leaked objects
- | + | | | | | | rgw: keep accurate state for linked objects orphan scan
- | + | | | | | | rgw: iterate over linked objects, store them
- | + | | | | | | rgw: add rgw_obj::parse_raw_oid()
- | + | | | | | | rgw: iterate asynchronously over linked objects
- | + | | | | | | rgw: async object stat functionality
- | + | | | | | | rgw-admin: build index of bucket indexes
- | + | | | | | | rgw: initial work of orphan detection tool implementation
- | + | | | | | | Avoid an extra read on the atomic variable
- | + | | | | | | RGW: Make RADOS handles in RGW to be a configurable option
- | | |/ / / / /
- | |/| | | | |
- + | | | | | | Pull request 5755
- |\ \ \ \ \ \ \
- | + | | | | | | ceph-disk: always check zap is applied on a full device
- | | |_|/ / / /
- | |/| | | | |
- + | | | | | | Pull request 5732
- |\ \ \ \ \ \ \
- | + | | | | | | rgw: init some manifest fields when handling explicit objs
- | |/ / / / / /
- + | | | | | | Pull request 5721
- |\ \ \ \ \ \ \
- | + | | | | | | rgw: rework X-Trans-Id header to be conform with Swift API.
- | + | | | | | | Transaction Id added in response
- | | |/ / / / /
- | |/| | | | |
- + | | | | | | Pull request 5498
- |\ \ \ \ \ \ \
- | |_|_|_|/ / /
- |/| | | | | |
- | + | | | | | rgw: set http status in civetweb
- | + | | | | | civetweb: update submodule to support setting of http status
- | / / / / /
- + | | | | | Pull request 5527
- |\ \ \ \ \ \
- | + | | | | | osd/OSDMap: handle incrementals that modify+del pool
- | / / / / /
- + | | | | | Pull request 5551
- |\ \ \ \ \ \
- | |_|/ / / /
- |/| | | | |
- | + | | | | ceph-object-corpus: add 0.94.2-207-g88e7ee7 hammer objects
- | / / / /
- + | | | | Pull request 5365
- |\ \ \ \ \
- | + | | | | buffer: Fix bufferlist::zero bug with special case
- | + | | | | UnittestBuffer: Add bufferlist zero test case
- | / / / /
- + | | | | Pull request 5369
- |\ \ \ \ \
- | + | | | | Update OSDMonitor.cc
- | / / / /
- + | | | | Pull request 5370
- |\ \ \ \ \
- | + | | | | mon/PGMonitor: avoid uint64_t overflow when checking pool 'target/max' status. Fixes: #12401
- | / / / /
- + | | | | Pull request 5378
- |\ \ \ \ \
- | + | | | | Mutex: fix leak of pthread_mutexattr
- | / / / /
- + | | | | Pull request 5372
- |\ \ \ \ \
- | |/ / / /
- |/| | | |
- | + | | | mon: OSDMonitor: fix hex output on 'osd reweight'
- | / / /
- + | | | Pull request 5374
- |\ \ \ \
- | + | | | crush/CrushWrapper: fix adjust_subtree_weight debug
- | + | | | crush/CrushWrapper: return changed from adjust_subtree_weight
- | + | | | crush/CrushWrapper: adjust subtree base in adjust_subtree_weight
- | + | | | unittest_crush_wrapper: test adjust_subtree_weight
- | + | | | unittest_crush_wrapper: attach buckets to root in adjust_item_weight test
- | + | | | unittest_crush_wrapper: parse env
- | / / /
- + | | | Pull request 5380
- |\ \ \ \
- | |_|_|/
- |/| | |
- | + | | TestPGLog: fix invalid proc_replica_log test caes
- | + | | TestPGLog: fix noop log proc_replica_log test case
- | + | | TestPGLog: add test for 11358
- | + | | PGLog::proc_replica_log: handle split out overlapping entries
- | / /
- + | | Pull request 5366
- |\ \ \
- | + | | common/Cycles.cc: skip initialization if rdtsc is not implemented
- | / /
- + | | Pull request 5202
- |\ \ \
- | + | | rpm: add missing Java conditionals
- | / /
- + | | Pull request 5203
- |\ \ \
- | + | | Add rpm conditionals : cephfs_java
- | / /
- + | | Pull request 5204
- |\ \ \
- | + | | ceph.spec.in: SUSE/openSUSE builds need libbz2-devel
- | / /
- + | | Pull request 5207
- |\ \ \
- | + | | ceph.spec.in: use _udevrulesdir to eliminate conditionals
- | / /
- + | | Pull request 5216
- |\ \ \
- | + | | ceph.spec.in: python-argparse only in Python 2.6
- | / /
- + | | Pull request 5264
- |\ \ \
- | + | | ceph.spec.in: snappy-devel for all supported distros
- | / /
- + | | Pull request 5368
- |\ \ \
- | + | | ceph.in: do not throw on unknown errno
- | / /
- + | | Pull request 5371
- |\ \ \
- | + | | ceph.in: print more detailed warning for 'ceph <type> tell'
- | + | | ceph.in: print more detailed error message for 'tell' command
- | / /
- + | | Pull request 5385
- |\ \ \
- | + | | packaging: RGW depends on /etc/mime.types
- | / /
- + | | Pull request 5411
- |\ \ \
- | + | | ceph.spec.in: remove SUSE-specific apache2-mod_fcgid dependency
- | / /
- + | | Pull request 5412
- |\ \ \
- | + | | ceph.spec.in: drop SUSE-specific %py_requires macro
- | / /
- + | | Pull request 5318
- |\ \ \
- | + | | tests: verify that image shrink properly handles flush op
- | + | | librbd: invalidate cache outside cache callback context
- + | | | Pull request 5319
- |\ \ \ \
- | + | | | librbd: don't cancel request lock early
- | + | | | tests: new test for transitioning exclusive lock
- | + | | | tests: verify that librbd will periodically resend lock request
- | + | | | common: Mutex shouldn't register w/ lockdep if disabled
- | + | | | librbd: improve debugging output for ImageWatcher
- | + | | | librados_test_stub: watcher id should be the instance id
- | + | | | librbd: retry lock requests periodically until acquired
- | + | | | librbd: don't hold owner_lock for write during flush
- | |/ / /
- + | | | Pull request 5296
- |\ \ \ \
- | |/ / /
- | + | | lockdep: do not automatically collect all backtraces
- | + | | librbd: flush operations need to acquire owner lock
- | + | | librbd: avoid infinite loop if copyup fails
- | + | | librbd: flush pending ops while not holding lock
- | + | | tests: fix possible deadlock in librbd ImageWatcher tests
- | + | | tests: enable lockdep for librbd unit tests
- | + | | librbd: owner_lock should be held during flush request
- | + | | osdc: ObjectCacher flusher might needs additional locks
- | + | | librbd: fix recursive locking issues
- | + | | librbd: simplify state machine handling of exclusive lock
- | + | | librbd: ObjectMap::aio_update can acquire snap_lock out-of-order
- | + | | librbd: move copyup class method call to CopyupRequest
- | + | | librbd: simplify AioRequest constructor parameters
- | + | | librbd/AioRequest.h: fix UNINIT_CTOR
- | + | | librbd: add object state accessor to ObjectMap
- | + | | librbd: AsyncObjectThrottle should always hold owner_lock
- | + | | librbd: execute flush completion outside of cache_lock
- | + | | librbd: add AsyncRequest task enqueue helper method
- | + | | librbd: disable lockdep on AioCompletion
- | + | | librbd: AioCompletion shouldn't hold its lock during callback
- | + | | librbd: give locks unique names to prevent false lockdep failures
- | + | | librbd: complete cache read in a new thread context
- | + | | librbd: require callers to ObjectMap::aio_update to acquire lock
- | + | | log: fix helgrind warnings regarding possible data race
- | + | | librados_test_stub: fix helgrind warnings
- | + | | librados_test_stub: add support for flushing watches
- | + | | common: lockdep now support unregistering once destructed
- | + | | common: add valgrind.h convenience wrapper
- | + | | librbd: add work queue for op completions
- | + | | WorkQueue: ContextWQ can now accept a return code
- | / /
- + | | Pull request 5559
- |\ \ \
- | + | | tests: increase test coverage for partial encodes/decodes
- | + | | common: bit_vector extent calculation incorrect for last page
- | / /
- + | | Pull request 5468
- |\ \ \
- | + | | osd: include newlines in scrub errors
- | + | | osd: fix condition for loggin scrub errors
- | + | | osd: fix fallback logic; move into be_select_auth_object
- | + | | osd: log a scrub error when we can't pick an auth object
- | + | | osd: repair record digest if all replicas match but do not match
- | + | | osd: move recorded vs on disk digest warning into be_compare_scrubmaps
- | + | | osd: be slightly paranoid about value of okseed
- | + | | osd: be precise about known vs best guess
- | + | | osd: record digest if object is clean
- | / /
- + | | Pull request 5376
- |\ \ \
- | + | | mon: ceph osd map shows NONE when an osd is missing
- | / /
- + | | Pull request 5359
- |\ \ \
- | |/ /
- |/| |
- | + | mon: PaxosService: call post_refresh() instead of post_paxos_update()
- | /
- + | Pull request 5691
- |\ \
- | |/
- |/|
- | + Objecter: pg_interval_t::is_new_interval needs pgid from previous pool
- | + osd_types::is_new_interval: size change triggers new interval
- |/
- + Merge remote-tracking branch 'gh/wip-12536-hammer' into hammer
- + Merge remote-tracking branch 'gh/wip-osd-compat-hammer' into wip-12536-hammer
- |\
- | + mon: disallow post-hammer OSDs if there are up pre-hammer OSDs
- | + include/ceph_features: define MON_METADATA feature
- + hobject_t: fix get_boundary to work with new sorting regime
- + hobject_t: decode future hobject_t::get_min() properly
- + OSDMonitor::preprocess_get_osdmap: send the last map as well
teuthology run commit e1dadd3da9e39daf669f94715c7833d2b280bbed (HAMMER BACKPORTS August-14)
git --no-pager log --format='%H %s' --graph ceph/hammer..ceph/hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge (\d+)/) { s|\w+\s+Merge (\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 5202
- |\
- | + rpm: add missing Java conditionals
- + Pull request 5203
- |\
- | + Add rpm conditionals : cephfs_java
- + Pull request 5204
- |\
- | + ceph.spec.in: SUSE/openSUSE builds need libbz2-devel
- + Pull request 5207
- |\
- | + ceph.spec.in: use _udevrulesdir to eliminate conditionals
- + Pull request 5216
- |\
- | + ceph.spec.in: python-argparse only in Python 2.6
- + Pull request 5223
- |\
- | + ceph.spec.in: do not run fdupes, even on SLE/openSUSE
- + Pull request 5264
- |\
- | + ceph.spec.in: snappy-devel for all supported distros
- + Pull request 5318
- |\
- | + tests: verify that image shrink properly handles flush op
- | + librbd: invalidate cache outside cache callback context
- + | Pull request 5319
- |\ \
- | + | librbd: don't cancel request lock early
- | + | tests: new test for transitioning exclusive lock
- | + | tests: verify that librbd will periodically resend lock request
- | + | common: Mutex shouldn't register w/ lockdep if disabled
- | + | librbd: improve debugging output for ImageWatcher
- | + | librados_test_stub: watcher id should be the instance id
- | + | librbd: retry lock requests periodically until acquired
- | + | librbd: don't hold owner_lock for write during flush
- | |/
- | + lockdep: do not automatically collect all backtraces
- | + librbd: flush operations need to acquire owner lock
- | + librbd: avoid infinite loop if copyup fails
- | + librbd: flush pending ops while not holding lock
- | + tests: fix possible deadlock in librbd ImageWatcher tests
- | + tests: enable lockdep for librbd unit tests
- | + librbd: owner_lock should be held during flush request
- | + osdc: ObjectCacher flusher might needs additional locks
- | + librbd: fix recursive locking issues
- | + librbd: simplify state machine handling of exclusive lock
- | + librbd: ObjectMap::aio_update can acquire snap_lock out-of-order
- | + librbd: move copyup class method call to CopyupRequest
- | + librbd: simplify AioRequest constructor parameters
- | + librbd/AioRequest.h: fix UNINIT_CTOR
- | + librbd: add object state accessor to ObjectMap
- | + librbd: AsyncObjectThrottle should always hold owner_lock
- | + librbd: execute flush completion outside of cache_lock
- | + librbd: add AsyncRequest task enqueue helper method
- | + librbd: disable lockdep on AioCompletion
- | + librbd: AioCompletion shouldn't hold its lock during callback
- | + librbd: give locks unique names to prevent false lockdep failures
- | + librbd: complete cache read in a new thread context
- | + librbd: require callers to ObjectMap::aio_update to acquire lock
- | + log: fix helgrind warnings regarding possible data race
- | + librados_test_stub: fix helgrind warnings
- | + librados_test_stub: add support for flushing watches
- | + common: lockdep now support unregistering once destructed
- | + common: add valgrind.h convenience wrapper
- | + librbd: add work queue for op completions
- | + WorkQueue: ContextWQ can now accept a return code
- + Pull request 5359
- |\
- | + mon: PaxosService: call post_refresh() instead of post_paxos_update()
- + Pull request 5361
- |\
- | + mon: MonitorDBStore: get_next_key() only if prefix matches
- + Pull request 5365
- |\
- | + buffer: Fix bufferlist::zero bug with special case
- | + UnittestBuffer: Add bufferlist zero test case
- + Pull request 5366
- |\
- | + common/Cycles.cc: skip initialization if rdtsc is not implemented
- + Pull request 5367
- |\
- | + os/chain_xattr: handle read on chnk-aligned xattr
- + Pull request 5368
- |\
- | + ceph.in: do not throw on unknown errno
- + Pull request 5369
- |\
- | + Update OSDMonitor.cc
- + Pull request 5370
- |\
- | + mon/PGMonitor: avoid uint64_t overflow when checking pool 'target/max' status. Fixes: #12401
- + Pull request 5371
- |\
- | + ceph.in: print more detailed warning for 'ceph <type> tell'
- | + ceph.in: print more detailed error message for 'tell' command
- + Pull request 5372
- |\
- | + mon: OSDMonitor: fix hex output on 'osd reweight'
- + Pull request 5373
- |\
- | + osd: pg_interval_t::check_new_interval should not rely on pool.min_size to determine if the PG was active
- | + osd: Move IsRecoverablePredicate/IsReadablePredicate to osd_types.h
- + Pull request 5374
- |\
- | + crush/CrushWrapper: fix adjust_subtree_weight debug
- | + crush/CrushWrapper: return changed from adjust_subtree_weight
- | + crush/CrushWrapper: adjust subtree base in adjust_subtree_weight
- | + unittest_crush_wrapper: test adjust_subtree_weight
- | + unittest_crush_wrapper: attach buckets to root in adjust_item_weight test
- | + unittest_crush_wrapper: parse env
- + Pull request 5376
- |\
- | + mon: ceph osd map shows NONE when an osd is missing
- + Pull request 5377
- |\
- | + mon/PGMonitor: bug fix pg monitor get crush rule
- + Pull request 5378
- |\
- | + Mutex: fix leak of pthread_mutexattr
- + Pull request 5380
- |\
- | + TestPGLog: fix invalid proc_replica_log test caes
- | + TestPGLog: fix noop log proc_replica_log test case
- | + TestPGLog: add test for 11358
- | + PGLog::proc_replica_log: handle split out overlapping entries
- + Pull request 5381
- |\
- | + Client: check dir is still complete after dropping locks in _readdir_cache_cb
- + Pull request 5382
- |\
- | + auth: check return value of keyring->get_secret
- + Pull request 5383
- |\
- | + rest_bench: bucketname is not mandatory as we have a default name
- | + rest_bench: drain the work queue to fix a crash Fixes: #3896 Signed-off-by: huangjun <hjwsm1989@gmail.com>
- + Pull request 5385
- |\
- | + packaging: RGW depends on /etc/mime.types
- + Pull request 5387
- |\
- | + rgw: fix assignment of copy obj attributes
- + Pull request 5411
- |\
- | + ceph.spec.in: remove SUSE-specific apache2-mod_fcgid dependency
- + Pull request 5412
- |\
- | + ceph.spec.in: drop SUSE-specific %py_requires macro
- + Pull request 5468
- |\
- | + osd: include newlines in scrub errors
- | + osd: fix condition for loggin scrub errors
- | + osd: fix fallback logic; move into be_select_auth_object
- | + osd: log a scrub error when we can't pick an auth object
- | + osd: repair record digest if all replicas match but do not match
- | + osd: move recorded vs on disk digest warning into be_compare_scrubmaps
- | + osd: be slightly paranoid about value of okseed
- | + osd: be precise about known vs best guess
- | + osd: record digest if object is clean
- + Pull request 5471
- |\
- | + mon: disallow post-hammer OSDs if there are up pre-hammer OSDs
- | + include/ceph_features: define MON_METADATA feature
- + Pull request 5482
- |\
- | + rgw: enforce Content-Type in Swift responses.
- | + rgw: force content-type header for swift account responses without body
- | + rgw: shouldn't return content-type: application/xml if content length is 0
- + Pull request 5498
- |\
- | + rgw: set http status in civetweb
- | + civetweb: update submodule to support setting of http status
- + Pull request 5527
- |\
- | + osd/OSDMap: handle incrementals that modify+del pool
- + Pull request 5559
- + tests: increase test coverage for partial encodes/decodes
- + common: bit_vector extent calculation incorrect for last page
rbd
./virtualenv/bin/teuthology-suite --priority 1000 --suite rbd --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports --machine-type plana,burnupi,mira
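The --subset argument above schedules only one randomly chosen fifth of the rbd suite; a minimal sketch of how the value is formed (with a fixed number standing in for $RANDOM):

```shell
n=$(expr 12347 % 5)   # $RANDOM % 5 in the real invocation; 12347 is a stand-in
echo "--subset $n/5"  # prints --subset 2/5
```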
- fail http://pulpito.ceph.com/loic-2015-08-15_21:59:25-rbd-hammer-backports---basic-multi/
- 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper term qemu-system-x86_64 -enable-kvm -nographic -m 4096 -drive file=/home/ubuntu/cephtest/qemu/base.client.0.qcow2,format=qcow2,if=virtio -cdrom /home/ubuntu/cephtest/qemu/client.0.iso -drive file=rbd:rbd/client.0.0:id=0,format=raw,if=virtio,cache=none'
- {'plana19.front.sepia.ceph.com': "error while evaluating conditional: ssh_key_update.state == 'present'"}
- 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e1dadd3da9e39daf669f94715c7833d2b280bbed TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rbd/qemu-iotests.sh'
paddles=paddles.front.sepia.ceph.com
run=loic-2015-08-15_21:59:25-rbd-hammer-backports---basic-multi
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
./virtualenv/bin/teuthology-suite --priority 1000 --suite rbd --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports --machine-type plana,burnupi,mira
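The rerun above builds its --filter value from the paddles JSON. A minimal reproduction of the jq selection on a made-up run document (field names are taken from the command itself; the sample jobs and the /tmp path are invented):

```shell
cat > /tmp/sample-run.json <<'EOF'
{"jobs": [
  {"status": "pass", "description": "rbd/basic/{clusters/fixed-1.yaml}"},
  {"status": "fail", "description": "rbd/singleton/{all/qemu-iotests-writeback.yaml}"},
  {"status": "dead", "description": "rbd/qemu/{cache/none.yaml}"}
]}
EOF
# Keep only failed/dead job descriptions; the while/sed pair joins them with
# commas so the result can be passed to teuthology-suite --filter.
filter=$(jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' /tmp/sample-run.json | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
echo "$filter"
```

Note that jq without -r keeps the double quotes around each description, matching the quoted descriptions the original filter passes along.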
- fail http://pulpito.ceph.com/loic-2015-08-19_18:04:45-rbd-hammer-backports---basic-multi/
- can be ignored 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e1dadd3da9e39daf669f94715c7833d2b280bbed TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rbd/qemu-iotests.sh'
- rbd/singleton/{all/qemu-iotests-writethrough.yaml}
- rbd/singleton/{all/qemu-iotests-writeback.yaml}
The same two jobs also consistently fail on hammer, see http://pulpito.ceph.com/loic-2015-08-30_11:19:51-rbd-hammer-testing-basic-multi
- can be ignored: same 'mkdir' qemu-iotests failure as above
rados
Together with http://pulpito.ceph.com/loic-2015-08-29_20:19:58-rados-hammer-backports---basic-multi/ (a re-run of the failed tests after removing https://github.com/ceph/ceph/pull/5361 from the integration branch), the following amounts to a successful run of the rados suite.
./virtualenv/bin/teuthology-suite --priority 1000 --suite rados --subset $(expr $RANDOM % 18)/18 --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-08-15_22:00:55-rados-hammer-backports---basic-multi
- failed to become clean before timeout expired
- 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 2'
paddles=paddles.front.sepia.ceph.com
run=loic-2015-08-15_22:00:55-rados-hammer-backports---basic-multi
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
./virtualenv/bin/teuthology-suite --priority 1000 --suite rados --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports --machine-type plana,burnupi,mira
rgw
./virtualenv/bin/teuthology-suite --priority 1000 --suite rgw --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-08-15_22:08:45-rgw-hammer-backports---basic-multi/
- "S3TEST_CONF=/home/ubuntu/cephtest/archive/s3-tests.client.0.conf BOTO_CONFIG=/home/ubuntu/cephtest/boto.cfg
Updated by Abhishek Varshney over 8 years ago
- Start date changed from 06/12/2015 to 08/15/2015
Updated by Loïc Dachary over 8 years ago
http://pulpito.ceph.com/loic-2015-08-15_22:14:11-upgrade:hammer-hammer-backports---basic-multi/
Sage: every single test was stuck, and on the 3 I checked,
all mons were stuck in leveldb from scrub using 100% CPU:
#0 0x00007fa14bd164a0 in std::string::_Rep::_M_destroy(std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00000000008a4c45 in _M_dispose (__a=..., this=<optimized out>) at /usr/include/c++/4.8/bits/basic_string.h:249
#2 ~basic_string (this=0x7fa145fe1880, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/basic_string.h:539
#3 LevelDBStore::split_key (in=..., prefix=prefix@entry=0x0, key=key@entry=0x7fa145fe1990) at os/LevelDBStore.cc:230
#4 0x00000000008a73d6 in LevelDBStore::LevelDBWholeSpaceIteratorImpl::key (this=<optimized out>) at os/LevelDBStore.h:271
#5 0x000000000059a86e in KeyValueDB::IteratorImpl::key (this=<optimized out>) at ./os/KeyValueDB.h:146
#6 0x00000000008a5fdf in LevelDBStore::get (this=0x361b080, prefix=..., keys=..., out=0x7fa145fe1b60) at os/LevelDBStore.cc:195
#7 0x000000000059fc0a in MonitorDBStore::get (this=0x361b1e0, prefix=..., key=..., bl=...) at mon/MonitorDBStore.h:499
#8 0x00000000005b268a in Monitor::_scrub (this=this@entry=0x3792000, r=0x360d358) at mon/Monitor.cc:4272
#9 0x00000000005c2531 in Monitor::scrub (this=this@entry=0x3792000) at mon/Monitor.cc:4217
#10 0x00000000005cc6b7 in Monitor::handle_command (this=this@entry=0x3792000, m=m@entry=0x3ad3a00) at mon/Monitor.cc:2711
#11 0x00000000005cefa9 in Monitor::dispatch (this=this@entry=0x3792000, s=s@entry=0x424ae00, m=m@entry=0x3ad3a00, src_is_mon=src_is_mon@entry=false) at mon/Monitor.cc:3457
#12 0x00000000005cfc26 in Monitor::_ms_dispatch (this=this@entry=0x3792000, m=m@entry=0x3ad3a00) at mon/Monitor.cc:3376
#13 0x00000000005cea35 in Monitor::handle_forward (this=this@entry=0x3792000, m=m@entry=0x3c0f080) at mon/Monitor.cc:3068
#14 0x00000000005cf66d in Monitor::dispatch (this=this@entry=0x3792000, s=s@entry=0x3636e00, m=m@entry=0x3c0f080, src_is_mon=src_is_mon@entry=true) at mon/Monitor.cc:3589
#15 0x00000000005cfc26 in Monitor::_ms_dispatch (this=this@entry=0x3792000, m=m@entry=0x3c0f080) at mon/Monitor.cc:3376
#16 0x00000000005ede23 in Monitor::ms_dispatch (this=0x3792000, m=0x3c0f080) at mon/Monitor.h:833
#17 0x00000000009277c9 in ms_deliver_dispatch (m=0x3c0f080, this=0x37a2700) at ./msg/Messenger.h:567
#18 DispatchQueue::entry (this=0x37a28c8) at msg/simple/DispatchQueue.cc:185
#19 0x00000000007c7fcd in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/simple/DispatchQueue.h:103
#20 0x00007fa14cf0a182 in start_thread (arg=0x7fa145fe4700) at pthread_create.c:312
#21 0x00007fa14b474fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Updated by Joao Eduardo Luis over 8 years ago
The stack trace appears to get stuck while dealloc'ing a string. I don't really know how this can happen.
Updated by Sage Weil over 8 years ago
Joao Luis wrote:
The stack trace appears to get stuck while dealloc'ing a string. I don't really know how this can happen.
The stack traces vary slightly, they're all in get_key() but with various other bits after that:
Thread 31 (Thread 0x7f3c24479700 (LWP 12869)):
#0 LevelDBStore::get (this=0x41231e0, prefix=..., keys=..., out=0x7f3c24477e70) at /usr/include/c++/4.8/bits/stl_set.h:299
#1 0x000000000059fc0a in MonitorDBStore::get (this=0x4123340, prefix=..., key=..., bl=...) at mon/MonitorDBStore.h:499
#2 0x00000000005b268a in Monitor::_scrub (this=this@entry=0x4282000, r=r@entry=0x4897848) at mon/Monitor.cc:4272

Thread 33 (Thread 0x7fdad6593700 (LWP 12898)):
#0 LevelDBStore::split_key (in=..., prefix=prefix@entry=0x7fdad6590890, key=key@entry=0x7fdad65908a0) at os/LevelDBStore.cc:226
#1 0x00000000008a7462 in LevelDBStore::LevelDBWholeSpaceIteratorImpl::raw_key (this=<optimized out>) at os/LevelDBStore.h:276
#2 0x000000000059aaaf in KeyValueDB::IteratorImpl::valid (this=0x5139e20) at ./os/KeyValueDB.h:132
#3 0x00000000008a5fc0 in LevelDBStore::get (this=0x50b8f20, prefix=..., keys=..., out=0x7fdad6590b60) at os/LevelDBStore.cc:195
#4 0x000000000059fc0a in MonitorDBStore::get (this=0x50b9080, prefix=..., key=..., bl=...) at mon/MonitorDBStore.h:499
#5 0x00000000005b268a in Monitor::_scrub (this=this@entry=0x521e000, r=0x50aa398) at mon/Monitor.cc:4272

Thread 31 (Thread 0x7f93f0e82700 (LWP 12899)):
#0 0x00007f93f658be70 in std::string::compare(std::string const&) const () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00000000005b2875 in operator< <char, std::char_traits<char>, std::allocator<char> > (__rhs=..., __lhs=...) at /usr/include/c++/4.8/bits/basic_string.h:2571
#2 operator() (this=<optimized out>, __y=..., __x=...) at /usr/include/c++/4.8/bits/stl_function.h:235
#3 operator[] (__k=..., this=0x4b005f8) at /usr/include/c++/4.8/bits/stl_map.h:463
#4 Monitor::_scrub (this=this@entry=0x465a000, r=r@entry=0x4b005c8) at mon/Monitor.cc:4274

Thread 33 (Thread 0x7f0194204700 (LWP 14872)):
#0 ~basic_string (this=0x7f0194202b90, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/basic_string.h:539
#1 raw (l=1652, this=0x3aea0b0) at common/buffer.cc:135
#2 raw_char (l=1652, this=0x3aea0b0) at common/buffer.cc:483
#3 ceph::buffer::copy (c=0x7f018bad1012 "\001\002\002\202\002", len=1652) at common/buffer.cc:589
#4 0x000000000084f888 in ceph::buffer::ptr::ptr (this=0x7f0194202be0, d=<optimized out>, l=<optimized out>) at common/buffer.cc:651
#5 0x00000000008a59d4 in LevelDBStore::to_bufferlist (in=...) at os/LevelDBStore.cc:215
#6 0x00000000008a788e in LevelDBStore::LevelDBWholeSpaceIteratorImpl::value (this=<optimized out>) at os/LevelDBStore.h:280
#7 0x000000000059a88e in KeyValueDB::IteratorImpl::value (this=<optimized out>) at ./os/KeyValueDB.h:149
#8 0x00000000008a6053 in LevelDBStore::get (this=0x38c8f20, prefix=..., keys=..., out=0x7f0194202e70) at os/LevelDBStore.cc:196
#9 0x000000000059fc0a in MonitorDBStore::get (this=0x38c9080, prefix=..., key=..., bl=...) at mon/MonitorDBStore.h:499
#10 0x00000000005b268a in Monitor::_scrub (this=this@entry=0x3a6c000, r=r@entry=0x4067c48) at mon/Monitor.cc:4272

Thread 33 (Thread 0x7fcb0a153700 (LWP 14871)):
#0 0x00000000008a73c3 in LevelDBStore::LevelDBWholeSpaceIteratorImpl::key (this=0x3b807d0) at os/LevelDBStore.h:271
#1 0x000000000059a86e in KeyValueDB::IteratorImpl::key (this=<optimized out>) at ./os/KeyValueDB.h:146
#2 0x00000000008a5fdf in LevelDBStore::get (this=0x3ba4f20, prefix=..., keys=..., out=0x7fcb0a151e70) at os/LevelDBStore.cc:195
#3 0x000000000059fc0a in MonitorDBStore::get (this=0x3ba5080, prefix=..., key=..., bl=...) at mon/MonitorDBStore.h:499
#4 0x00000000005b268a in Monitor::_scrub (this=this@entry=0x3d48000, r=r@entry=0x3fb8388) at mon/Monitor.cc:4272
and all are at 100% CPU. Pretty clearly busy-looping in get()!
Probably something in the caller is different on hammer vs master? :/
In any case, I'd drop that commit!
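For illustration only (this is not the Ceph LevelDBStore code): the hang Sage describes is the classic shape of a key-lookup loop where some code path fails to advance its iterator, so get() spins at 100% CPU. A minimal Python sketch of the invariant such a loop must keep, namely that the cursor moves forward on every pass:

```python
# Illustration only: NOT the Ceph implementation. Looks up sorted `keys`
# with a single forward scan over the sorted items of `db`; the inner loop
# advances unconditionally, so no input can make it spin in place.
def get_values(db: dict, keys: list) -> list:
    items = sorted(db.items())
    i, out = 0, []
    for key in keys:
        while i < len(items) and items[i][0] < key:
            i += 1  # unconditional forward progress: no branch stalls here
        if i < len(items) and items[i][0] == key:
            out.append(items[i][1])
    return out

print(get_values({"a": "1", "c": "3"}, ["a", "b", "c"]))  # ['1', '3']
```

A missing `i += 1` on any path of the inner loop turns this into exactly the kind of CPU-bound hang seen in the backtraces above.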
Updated by Loïc Dachary over 8 years ago
- Description updated (diff)
filter="rgw/multifs/{overrides.yaml clusters/fixed-2.yaml frontend/civetweb.yaml fs/xfs.yaml rgw_pool_type/replicated.yaml tasks/rgw_swift.yaml}"
./virtualenv/bin/teuthology-suite --priority 101 --suite rgw --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
- fail hammer branch and the following http://pulpito.ceph.com/loic-2015-08-31_19:10:43-rgw-hammer-backports-loic---basic-multi
- https://github.com/ceph/ceph/pull/5721
- https://github.com/ceph/ceph/pull/5383
- https://github.com/ceph/ceph/pull/5498
- https://github.com/ceph/ceph/pull/5715
- https://github.com/ceph/ceph/pull/5717
- https://github.com/ceph/ceph/pull/5719
- https://github.com/ceph/ceph/pull/5720
- https://github.com/ceph/ceph/pull/5718
- fail hammer branch and the following http://pulpito.ceph.com/loic-2015-09-01_00:12:35-rgw-hammer-backports-loic---basic-multi
- pass hammer branch http://pulpito.ceph.com/loic-2015-08-31_19:11:00-rgw-hammer---basic-multi
Updated by Loïc Dachary over 8 years ago
$ git rev-parse ceph/hammer-backports
63c3d50ace54238418cec1d5ebb5a32364058cad
git --no-pager log --format='%H %s' --graph ceph/hammer..ceph/hammer-backports | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
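The perl filter above rewrites each `git log --format='%H %s' --graph` line into a Textile bullet with a link: merge commits become pull-request links, everything else links to the commit. An equivalent Python sketch (the commit hash `abc1234` in the example is hypothetical):

```python
import re

def to_textile(line: str) -> str:
    """Port of the perl one-liner above: turn one line of
    `git log --format='%H %s' --graph` output into a Textile bullet."""
    line = line.replace('"', ' ')
    if re.search(r'\w+\s+Merge pull request #\d+', line):
        line = re.sub(r'\w+\s+Merge pull request #(\d+).*',
                      r'"Pull request \1":https://github.com/ceph/ceph/pull/\1',
                      line)
    else:
        line = re.sub(r'(\w+)\s+(.*)',
                      r'"\2":https://github.com/ceph/ceph/commit/\1',
                      line)
    line = line.replace('*', '+', 1)  # perl: s/\*/+/ (first graph marker only)
    return '* ' + line                # perl: s/^/* /

print(to_textile('* abc1234 Merge pull request #6161 from ceph/branch'))
# * + "Pull request 6161":https://github.com/ceph/ceph/pull/6161
```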
- + Pull request 5276
- |\
- | + mon: test the crush ruleset when creating a pool
- | + erasure-code: set max_size to chunk_count() instead of 20 for shec
- | + vstart.sh: set PATH to include pwd
- + Pull request 5367
- |\
- | + os/chain_xattr: handle read on chnk-aligned xattr
- + Pull request 5373
- |\
- | + osd: pg_interval_t::check_new_interval should not rely on pool.min_size to determine if the PG was active
- | + osd: Move IsRecoverablePredicate/IsReadablePredicate to osd_types.h
- + Pull request 5377
- |\
- | + mon/PGMonitor: bug fix pg monitor get crush rule
- + Pull request 5381
- |\
- | + Client: check dir is still complete after dropping locks in _readdir_cache_cb
- + Pull request 5382
- |\
- | + auth: check return value of keyring->get_secret
- + Pull request 5383
- |\
- | + rest_bench: bucketname is not mandatory as we have a default name
- | + rest_bench: drain the work queue to fix a crash Fixes: #3896 Signed-off-by: huangjun <hjwsm1989@gmail.com>
- + Pull request 5498
- |\
- | + rgw: set http status in civetweb
- | + civetweb: update submodule to support setting of http status
- + Pull request 5527
- |\
- | + osd/OSDMap: handle incrementals that modify+del pool
- + Pull request 5697
- |\
- | + mon: add a cache layer over MonitorDBStore
- + Pull request 5715
- |\
- | + rgw: url encode exposed bucket
- + Pull request 5716
- |\
- | + rgw: avoid using slashes for generated secret keys
- + Pull request 5717
- |\
- | + rgw: api adjustment following a rebase
- | + rgw: orphans, fix check on number of shards
- | + rgw: orphans, change default number of shards
- | + rgw: change error output related to orphans
- | + rgw: orphan, fix truncated detection
- | + radosgw-admin: simplify orphan command
- | + radosgw-admin: stat orphan objects before reporting leakage
- | + radosgw-admin: orphans finish command
- | + rgw: cannot re-init an orphan scan job
- | + rgw: stat_async() sets the object locator appropriately
- | + rgw: list_objects() sets namespace appropriately
- | + rgw: modify orphan search fingerprints
- | + rgw: compare oids and dump leaked objects
- | + rgw: keep accurate state for linked objects orphan scan
- | + rgw: iterate over linked objects, store them
- | + rgw: add rgw_obj::parse_raw_oid()
- | + rgw: iterate asynchronously over linked objects
- | + rgw: async object stat functionality
- | + rgw-admin: build index of bucket indexes
- | + rgw: initial work of orphan detection tool implementation
- | + Avoid an extra read on the atomic variable
- | + RGW: Make RADOS handles in RGW to be a configurable option
- + Pull request 5719
- |\
- | + rgw:segmentation fault when rgw_gc_max_objs > HASH_PRIME
- + Pull request 5720
- |\
- | + rgw:the arguments 'domain' should not be assigned when return false
- + Pull request 5721
- |\
- | + rgw: rework X-Trans-Id header to be conform with Swift API.
- | + Transaction Id added in response
- + Pull request 5732
- + rgw: init some manifest fields when handling explicit objs
Updated by Loïc Dachary over 8 years ago
./virtualenv/bin/teuthology-suite --priority 1000 --subset $(expr $RANDOM % 5)/5 --suite rgw --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports --machine-type plana,burnupi,mira
Updated by Loïc Dachary over 8 years ago
No need to run rgw suite on this batch because all pull requests have already been tested in the previous batch. No need to run the fs suite either because there is a single pull request for fs and it has already been tested. It is included merely because it is pending approval.
$ git rev-parse ceph/hammer-backports-loic
11265088285fe45c345a8771dfc2a39918a81857
git --no-pager log --format='%H %s' --graph ceph/hammer..ceph/hammer-backports-loic | perl -p -e 's/"/ /g; if (/\w+\s+Merge pull request #(\d+)/) { s|\w+\s+Merge pull request #(\d+).*|"Pull request $1":https://github.com/ceph/ceph/pull/$1|; } else { s|(\w+)\s+(.*)|"$2":https://github.com/ceph/ceph/commit/$1|; } s/\*/+/; s/^/* /;'
- + Pull request 5276
- + Pull request 5367
- + Pull request 5373
- + Pull request 5377
- + Pull request 5381
- + Pull request 5382
- + Pull request 5383
- + Pull request 5498
- + Pull request 5527
- + Pull request 5697
- + Pull request 5715
- |\
- | + rgw: url encode exposed bucket
- + Pull request 5716
- + Pull request 5717
- + Pull request 5719
- |\
- | + rgw:segmentation fault when rgw_gc_max_objs > HASH_PRIME
- + Pull request 5720
- |\
- | + rgw:the arguments 'domain' should not be assigned when return false
- + Pull request 5721
- + Pull request 5732
- + Pull request 5754
- + Pull request 5755
- + Pull request 5757
- + Pull request 5758
- + Pull request 5759
- + Pull request 5761
- |\
- | + OSD: break connection->session->waiting message->connection cycle
- + Pull request 5762
- + Pull request 5763
- |\
- | + osd/PGLog: dirty_to is inclusive
- + Pull request 5764
- |\
- | + common: fix code format
- | + test: add test case for insert empty ptr when buffer rebuild
- | + common: fix insert empty ptr when bufferlist rebuild
- + Pull request 5765
- |\
- | + tests: tiering agent and proxy read
- | + osd: trigger the cache agent after a promotion
- + Pull request 5766
- |\
- | + mon: fix checks on mds add_data_pool
- + Pull request 5767
- |\
- | + librbd: Add a paramter:purge_on_error in ImageCtx::invalidate_cache().
- | + librbd: Remvoe unused func ImageCtx::read_from_cache.
- | + osdc: clean up code in ObjectCacher::Object::map_write
- | + osdc: Don't pass mutex into ObjectCacher::_wait_for_write.
- | + osdc: After write try merge bh.
- | + osdc: Make last missing bh to wake up the reader.
- | + osdc: For trust_enoent is true, there is only one extent.
- | + osdc: In _readx() only no error can tidy read result.
- + Pull request 5768
- |\
- | + lockdep: allow lockdep to be dynamically enabled/disabled
- | + tests: librbd API test cannot use private md_config_t struct
- | + tests: ensure old-format RBD tests still work
- | + librados_test_stub: implement conf get/set API methods
- | + crypto: use NSS_InitContext/NSS_ShutdownContex to avoid memory leak
- | + auth: use crypto_init_mutex to protect NSS_Shutdown()
- | + auth: reinitialize NSS modules after fork()
- + Pull request 5769
- |\
- | + librbd: prevent race condition between resize requests
- + Pull request 5770
- + rgw: init some manifest fields when handling explicit objs
Updated by Loïc Dachary over 8 years ago
rbd¶
./virtualenv/bin/teuthology-suite --priority 1000 --suite rbd --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-09-02_13:56:47-rbd-hammer-backports-loic---basic-multi/
- 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=11265088285fe45c345a8771dfc2a39918a81857 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin RBD_FEATURES=13 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rbd/test_librbd.sh'
- 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=11265088285fe45c345a8771dfc2a39918a81857 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rbd/qemu-iotests.sh'
- 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 4'
- 'mkdir
paddles=paddles.front.sepia.ceph.com
run=loic-2015-09-02_13:56:47-rbd-hammer-backports-loic---basic-multi
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
./virtualenv/bin/teuthology-suite --priority 101 --suite rbd --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
Updated by Loïc Dachary over 8 years ago
rados¶
./virtualenv/bin/teuthology-suite --priority 1000 --suite rados --subset $(expr $RANDOM % 18)/18 --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-09-02_13:58:31-rados-hammer-backports-loic---basic-multi/
- 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=11265088285fe45c345a8771dfc2a39918a81857 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rados/test.sh'
- saw valgrind issues
- 'mkdir
paddles=paddles.front.sepia.ceph.com
run=loic-2015-09-02_13:58:31-rados-hammer-backports-loic---basic-multi
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
Verifying that https://github.com/ceph/ceph/pull/5887 does not cause problems with the rados suite.
teuthology-suite --priority 101 --suite rados --subset $(expr $RANDOM % 18)/18 --suite-branch hammer --distro ubuntu --email loic@dachary.org --filter-out=rhel --ceph wip-pr-5887 --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-10-03_11:11:28-rados-wip-pr-5887---basic-multi/
- bug: more hammer git.ceph.com updates
- rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/fixed-2.yaml fs/ext4.yaml msgr-failures/fastclose.yaml thrashers/mapgap.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/few.yaml thrashers/pggrow.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/fixed-2.yaml fs/ext4.yaml msgr-failures/fastclose.yaml thrashers/morepggrow.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/osd-delay.yaml thrashers/mapgap.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/few.yaml thrashers/default.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/fixed-2.yaml fs/ext4.yaml msgr-failures/fastclose.yaml thrashers/pggrow.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/2-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/few.yaml thrashers/mapgap.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/fixed-2.yaml fs/ext4.yaml msgr-failures/fastclose.yaml thrashers/default.yaml workloads/admin_socket_objecter_requests.yaml}
- rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/admin_socket_objecter_requests.yaml}
- 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 5'
- qa/workunits/cephtool/test.sh: don't assume crash_replay_interval=45
- "2015-10-04 05:34:47.220275 osd.1 10.214.136.12:6804/62701 10 : cluster [ERR] 9.0s0 shard 2(1): soid 3df68405/repair_test_obj/head//9 extra attr hinfo_key" in cluster log
- bug: more hammer git.ceph.com updates
run=loic-2015-10-03_11:11:28-rados-wip-pr-5887---basic-multi
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
teuthology-suite --priority 101 --suite rados --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph wip-pr-5887 --machine-type plana,burnupi,mira
Updated by Loïc Dachary over 8 years ago
powercycle¶
./virtualenv/bin/teuthology-suite -l2 -v -c hammer-backports-loic -k testing -m plana,burnupi,mira -s powercycle -p 1000 --email loic@dachary.org
Updated by Loïc Dachary over 8 years ago
upgrade¶
teuthology-suite --verbose --suite upgrade/hammer --filter=centos_6,ubuntu_14 --suite-branch hammer --ceph hammer-backports --machine-type vps --priority 1000
- fail http://pulpito.ceph.com/loic-2015-09-02_14:09:47-upgrade:hammer-hammer-backports-loic---basic-vps
- timeout waiting for machines
- upgrade:hammer/older/{0-cluster/start.yaml 1-install/v0.94.yaml 2-workload/blogbench.yaml 3-upgrade-sequence/upgrade-osd-mon-mds.yaml 4-final/{monthrash.yaml osdthrash.yaml testrados.yaml} distros/centos_6.5.yaml}
- upgrade:hammer/older/{0-cluster/start.yaml 1-install/v0.94.1.yaml 2-workload/rbd.yaml 3-upgrade-sequence/upgrade-osd-mon-mds.yaml 4-final/{monthrash.yaml osdthrash.yaml testrados.yaml} distros/centos_6.5.yaml}
- "sudo find /var/log/ceph -name '*.log' -print0 | sudo xargs -0 --no-run-if-empty -- gzip --"
- timeout waiting for machines
Updated by Loïc Dachary over 8 years ago
ceph-deploy¶
teuthology-suite --verbose --suite ceph-deploy --filter=centos_6,ubuntu_14 --suite-branch hammer --ceph hammer-backports-loic --machine-type vps --priority 1000
- fail http://pulpito.ceph.com/loic-2015-09-02_14:58:36-ceph-deploy-hammer-backports-loic---basic-vps
- ceph health was unable to get 'HEALTH_OK' after waiting 15 minutes
- ceph-deploy/basic/{ceph-deploy-overrides/enable_dmcrypt_diff_journal_disk.yaml config_options/cephdeploy_conf.yaml distros/centos_6.5.yaml tasks/ceph-deploy_hello_world.yaml}
- ceph-deploy/basic/{ceph-deploy-overrides/ceph_deploy_dmcrypt.yaml config_options/cephdeploy_conf.yaml distros/centos_6.5.yaml tasks/ceph-deploy_hello_world.yaml}
- ceph health was unable to get 'HEALTH_OK' after waiting 15 minutes
Note: centos 6.5 was removed from distros
Updated by Loïc Dachary over 8 years ago
fs¶
./virtualenv/bin/teuthology-suite --priority 1000 --suite fs --subset $(expr $RANDOM % 5)/5 --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-09-06_23:31:17-fs-hammer-backports-loic---basic-multi/
- 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=11265088285fe45c345a8771dfc2a39918a81857 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/libcephfs/test.sh'
- fs/basic/{clusters/fixed-3-cephfs.yaml debug/mds_client.yaml fs/btrfs.yaml inline/yes.yaml overrides/whitelist_wrongly_marked_down.yaml tasks/libcephfs_interface_tests.yaml}
- fs/verify/{clusters/fixed-3-cephfs.yaml debug/mds_client.yaml fs/btrfs.yaml overrides/whitelist_wrongly_marked_down.yaml tasks/libcephfs_interface_tests.yaml validater/lockdep.yaml}
- 'mkdir
run=loic-2015-09-06_23:31:17-fs-hammer-backports-loic---basic-multi
paddles=paddles.front.sepia.ceph.com
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
teuthology-suite --priority 1000 --suite fs --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
- fail http://pulpito.ceph.com/loic-2015-09-08_13:46:08-fs-hammer-backports-loic---basic-multi/
- , 'failed': True, 'item': 'http://git.ceph.com/release.asc', 'msg': 'Failed to download key at http://git.ceph.com/release.asc: Request failed: <urlopen error timed out>'}, 'burnupi19.front.sepia.ceph.com': {'invocation': {'module_name': 'apt_key', 'module_args': ''}, 'failed': True, 'item': 'http://git.ceph.com/release.asc', 'msg': 'Failed to download key at http://git.ceph.com/release.asc: Request failed: <urlopen error timed out>'}, 'plana65.front.sepia.ceph.com': {'invocation': {'module_name': 'apt_key', 'module_args': ''}, 'failed': True, 'item': 'http://git.ceph.com/release.asc', 'msg': 'Failed to download key at http://git.ceph.com/release.asc: Request failed: <urlopen error timed out>'}}
- 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=11265088285fe45c345a8771dfc2a39918a81857 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/libcephfs/test.sh'
Assuming transient errors / environmental errors, running again
run=loic-2015-09-06_23:31:17-fs-hammer-backports-loic---basic-multi
paddles=paddles.front.sepia.ceph.com
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
teuthology-suite --priority 1000 --suite fs --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph hammer-backports-loic --machine-type plana,burnupi,mira
The errors are LibCephFS.GetPoolId failures; verifying https://github.com/ceph/ceph/pull/5887
run=loic-2015-09-06_23:31:17-fs-hammer-backports-loic---basic-multi
paddles=paddles.front.sepia.ceph.com
eval filter=$(curl --silent http://$paddles/runs/$run/ | jq '.jobs[] | select(.status == "dead" or .status == "fail") | .description' | while read description ; do echo -n $description, ; done | sed -e 's/,$//')
teuthology-suite --priority 101 --suite fs --filter="$filter" --suite-branch hammer --distro ubuntu --email loic@dachary.org --ceph wip-pr-5887 --machine-type plana,burnupi,mira
Updated by Yuri Weinstein over 8 years ago
QE Validation (started 10/12/15)¶
re-runs command lines and filters are captured in http://pad.ceph.com/p/hammer_v0.94.4_QE_validation_notes
| Suite | Runs/Reruns | Notes/Issues |
| rgw | http://pulpito.ceph.com/teuthology-2015-10-12_17:31:55-rgw-hammer-distro-basic-multi/ | PASSED |
| | http://pulpito.ceph.com/teuthology-2015-10-13_13:07:07-rgw-hammer-distro-basic-multi/ | |
| knfs | http://pulpito.ceph.com/teuthology-2015-10-13_08:10:14-knfs-hammer-testing-basic-multi/ | PASSED |
| hadoop | http://pulpito.ceph.com/teuthology-2015-10-14_09:45:17-hadoop-hammer---basic-multi/ or http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2015-10-15_21:24:32-hadoop-hammer---basic-openstack/ | FAILED env noise, John, pls double check download.ceph.com URL in hammer |
| | http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2015-10-18_18:12:02-hadoop-hammer---basic-openstack/ | |
| multimds | not required, optional | |
| rest | http://pulpito.ceph.com/teuthology-2015-10-14_09:48:09-rest-hammer---basic-multi/ | PASSED |
| upgrade/client-upgrade | http://pulpito.ceph.com/teuthology-2015-10-14_14:04:59-upgrade:client-upgrade-hammer-distro-basic-multi/ | PASSED |
| | http://149.202.176.126:8081/teuthology-2015-10-14_21:18:02-upgrade:client-upgrade-hammer-distro-basic-openstack/ | same as above in openstack |
| upgrade/dumpling-firefly-x ubuntu14 | http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2015-10-15_21:44:53-upgrade:dumpling-firefly-x-hammer-distro-basic-openstack/ | FAILED #11104 |
| upgrade/dumpling-firefly-x - vps | http://149.202.176.126:8081/teuthology-2015-10-14_21:39:06-upgrade:dumpling-firefly-x-hammer-distro-basic-openstack/ | this suite runs out of memory on vps |
| upgrade/giant-x - vps | http://pulpito.ceph.com/teuthology-2015-10-16_17:05:08-upgrade:giant-x-hammer-distro-basic-vps/ | FAILED #11104 |
| upgrade/hammer - vps | http://pulpito.ceph.com/teuthology-2015-10-15_14:13:10-upgrade:hammer-hammer-distro-basic-vps/ | FAILED #11104 |
Updated by Loïc Dachary over 8 years ago
- Description updated (diff)
- Status changed from In Progress to Resolved