Activity
From 12/05/2019 to 01/03/2020
01/03/2020
- 11:54 PM Bug #43421 (Fix Under Review): mon spends too much time to build incremental osdmap
- 10:09 AM Bug #43421: mon spends too much time to build incremental osdmap
- It takes 5 seconds to build 640 incremental osdmaps for one client.
- 08:15 AM Bug #43421: mon spends too much time to build incremental osdmap
- Sorry, it took 5 seconds.
- 11:49 PM Bug #43185 (Need More Info): ceph -s not showing client activity
- super xor wrote:
> Possible relation to https://tracker.ceph.com/issues/43364 and https://tracker.ceph.com/issues/43...
- 10:48 PM Bug #43311 (Pending Backport): asynchronous recovery + backfill might spin pg undersized for a lo...
- 09:01 PM Feature #40870: Implement mon_memory_target
- Another follow-on fix: https://github.com/ceph/ceph/pull/32473
- 09:00 PM Bug #43454 (Fix Under Review): ceph monitor crashes after updating 'mon_memory_target' config set...
- 08:24 AM Bug #43454 (Resolved): ceph monitor crashes after updating 'mon_memory_target' config setting.
- Refer to Bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=1760257 for more details.
- 08:06 PM Backport #42197: nautilus: osd/PrimaryLogPG.cc: 13068: FAILED ceph_assert(obc)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31028
merged
- 04:39 PM Bug #43334: nautilus: rados/test_envlibrados_for_rocksdb.sh broken packages with ubuntu_16.04.yaml
- /a/yuriw-2019-12-23_20:23:51-rados-wip-yuri-testing-2019-12-16-2241-nautilus-distro-basic-smithi/4628899/
01/02/2020
- 03:41 PM Bug #43403: unittest_lockdep unreliable
- Happened in https://github.com/ceph/ceph/pull/27792 (among others)
01/01/2020
- 11:01 AM Documentation #42315: Improve rados command usage, man page and tutorial
- RADOS(8) Ceph RADOS(8)
NAME
rados - rados object s...
- 10:52 AM Documentation #42315: Improve rados command usage, man page and tutorial
- [zdover@192-168-1-112 ~]$ rados -h
usage: rados [options] [commands]
POOL COMMANDS
lspools ...
12/25/2019
- 03:24 PM Bug #43422 (Resolved): qa/standalone/mon/osd-pool-create.sh fails to grep utf8 pool name
- ...
- 12:33 PM Bug #43421: mon spends too much time to build incremental osdmap
- In my cluster, it took five minutes to build 1300 versions of incremental osdmap.
patch: https://github.com/ceph/ceph/...
- 09:49 AM Bug #43421 (Fix Under Review): mon spends too much time to build incremental osdmap
- If a client's osdmap version is too low, the mon spends too much time building incremental osdmaps.
Mon can't handle norma...
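A minimal sketch of the cost pattern described above, assuming one incremental map must be loaded and encoded per epoch the client is behind; load_incremental is a hypothetical stand-in for the mon's per-epoch store read, not the actual mon code:
```
// Hypothetical illustration of the linear cost: one incremental map is
// loaded and encoded for every epoch the client is behind.
#include <cstdint>
#include <vector>

struct Incremental { std::vector<uint8_t> bytes; };

// Stand-in for the monitor's per-epoch store read + encode (assumption, not real code).
static Incremental load_incremental(uint64_t epoch) {
  return Incremental{std::vector<uint8_t>(static_cast<size_t>(epoch % 64))};
}

static std::vector<Incremental> build_incrementals(uint64_t client_epoch, uint64_t latest_epoch) {
  std::vector<Incremental> out;
  for (uint64_t e = client_epoch + 1; e <= latest_epoch; ++e)
    out.push_back(load_incremental(e));   // 640+ iterations for a very stale client
  return out;
}

int main() {
  auto incs = build_incrementals(100, 740);   // client 640 epochs behind
  return incs.size() == 640 ? 0 : 1;
}
```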
12/24/2019
- 05:03 AM Bug #43308 (Pending Backport): negative num_objects can set PG_STATE_DEGRADED
- 05:02 AM Bug #42780 (Pending Backport): recursive lock of OpTracker::lock (70)
- 01:53 AM Bug #43413 (New): Virtual IP address of iface lo results in failing to start an OSD
- We added a virtual IP on the loopback interface lo to complete the LVS configuration....
12/23/2019
- 11:54 PM Bug #43412 (Resolved): cephadm ceph_manager IndexError: list index out of range
- ...
- 08:26 PM Backport #43140: nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not respected
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/32028
merged
Reviewed-by: Ricardo Dias <rdias@suse.com>
- 02:18 PM Bug #43174: pgs inconsistent, union_shard_errors=missing
- Hi David.
> Are you running your own Ceph build?
No, we use the official (community) build.
> Sortbitwise needed to...
12/21/2019
12/20/2019
- 11:39 PM Bug #42328 (Resolved): osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- I can't check the original reports (logs have been removed), but assuming it's the same root cause PR #32382 5bb932c3...
- 01:31 AM Bug #42328: osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- I observed something similar on a ceph_test_rados teuthology run: sjust-2019-12-19_20:05:13-rados-wip-sjust-read-from...
- 11:37 PM Bug #43394 (Resolved): crimson::dmclock segv in crimson::IndIntruHeap
- Should be fixed with PR #32380 2c9542901532feafd569d92e9f67ccd2e1af3129
- 08:53 PM Bug #43403 (Resolved): unittest_lockdep unreliable
- ...
- 08:22 AM Bug #41255: backfill_toofull seen on cluster where the most full OSD is at 1%
- Hi David:
Good to know the bug is indeed fixed ... too bad it didn't make it in 13.2.8. Anyways ... building patch...
- 04:50 AM Bug #38345 (In Progress): mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- 01:50 AM Bug #43174: pgs inconsistent, union_shard_errors=missing
Scrub incorrectly thinks the object really isn't there, but we know it is.
The way that you can see missing obje...
12/19/2019
- 11:57 PM Bug #42780 (Fix Under Review): recursive lock of OpTracker::lock (70)
- https://github.com/ceph/ceph/pull/32364
- 12:09 PM Bug #42780 (In Progress): recursive lock of OpTracker::lock (70)
- 10:30 PM Bug #43307 (Fix Under Review): Remove use of rules batching for upmap balancer
- 10:27 PM Bug #43397 (Resolved): FS_DEGRADED to cluster log despite --no-mon-health-to-clog
- ...
- 09:38 PM Bug #43394 (Resolved): crimson::dmclock segv in crimson::IndIntruHeap
- ...
- 07:06 PM Bug #41255: backfill_toofull seen on cluster where the most full OSD is at 1%
- A backport to Mimic of the fix can be found here:
https://github.com/ceph/ceph/pull/32361
Or if you can build fro...
- 02:34 PM Bug #41255: backfill_toofull seen on cluster where the most full OSD is at 1%
- We added a CRUSH policy (replicated_nvme) and set this policy on our cephfs metadata pool (with 1.2 billion objects) a...
- 07:02 PM Backport #41584 (In Progress): mimic: backfill_toofull seen on cluster where the most full OSD is...
- 02:29 PM Bug #43306: segv in collect_sys_info
- Neha Ojha wrote:
> This looks similar to https://tracker.ceph.com/issues/38296, though the mon seems to have been up...
- 02:22 PM Backport #39474 (In Progress): luminous: segv in fgets() in collect_sys_info reading /proc/cpuinfo
- 02:18 PM Bug #41383 (Resolved): scrub object count mismatch on device_health_metrics pool
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 02:14 PM Backport #42739 (Resolved): nautilus: scrub object count mismatch on device_health_metrics pool
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31735
m...
- 07:39 AM Bug #43382: medium io/system load causes quorum failure
- Or due to limited bandwidth? 10G NICs dedicated.
- 07:36 AM Bug #43382 (New): medium io/system load causes quorum failure
- We just found out that if you put some io pressure on your system by e.g. big rsync, the mon process has issues proba...
- 05:44 AM Bug #43126 (Fix Under Review): OSD_SLOW_PING_TIME_BACK nits
- 02:20 AM Bug #43318: monitor mark all services(osd mgr) down
- The mgr produces no log output even with debug_mgr set to 40.
12/18/2019
- 10:31 PM Bug #43193 (Need More Info): "ceph ping mon.<id>" cannot work
- Can you provide the sequence of commands that fail? Also, please attach the monitor names and monmap.
- 10:25 PM Bug #43305 (Won't Fix): "psutil.NoSuchProcess process no longer exists" error in luminous-x-nauti...
- This is an infra issue....
- 10:23 PM Bug #43306: segv in collect_sys_info
- This looks similar to https://tracker.ceph.com/issues/38296, though the mon seems to have been upgraded to nautilus(w...
- 10:17 PM Bug #43318 (Need More Info): monitor mark all services(osd mgr) down
- Can you provide mgr logs from when this happened?
- 10:12 PM Feature #43377 (Resolved): Make Zstandard compression level a configurable option
- I've played with using the different compression algorithms on the RGWs and the default compression level for Zstanda...
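For reference, a minimal sketch of where a configurable level would enter libzstd's one-shot API; the buffer handling and the level value are placeholders, not the actual Ceph compressor plugin wiring:
```
// Minimal libzstd usage showing where a configurable compression level is applied.
#include <zstd.h>
#include <string>
#include <vector>

static std::vector<char> zstd_compress(const std::string& in, int level /* e.g. 1..22 */) {
  std::vector<char> out(ZSTD_compressBound(in.size()));
  size_t n = ZSTD_compress(out.data(), out.size(), in.data(), in.size(), level);
  if (ZSTD_isError(n))
    return {};
  out.resize(n);
  return out;
}

int main() {
  auto z = zstd_compress(std::string(1000, 'a'), 3);  // 3 is zstd's usual default level
  return z.empty() ? 1 : 0;
}
```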
- 07:38 PM Backport #42739: nautilus: scrub object count mismatch on device_health_metrics pool
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31735
merged
- 03:53 PM Backport #43316 (Resolved): nautilus:wrong datatype describing crush_rule
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/32254
m...
- 12:11 PM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
- So it's asserting inside of to_timespan, and the Paxos code triggering that assert is
> auto start = ceph::coarse_...
- 12:03 PM Bug #43365 (Resolved): Nautilus: Random mon crashes in failed assertion at ceph::time_detail::sig...
- Thanks to 14.2.5 auto warning for recent crashes, we are observing frequent (somewhat daily period) random crashes of...
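For readers following the to_timespan discussion above, a self-contained sketch of the failure mode, assuming to_timespan() rejects negative spans; std::chrono types stand in for ceph::signedspan and ceph::timespan:
```
// Stand-ins for ceph::signedspan / ceph::timespan; the assert models the one
// reported in this tracker (assumption: to_timespan rejects negative spans).
#include <cassert>
#include <chrono>
#include <cstdint>

using signedspan = std::chrono::nanoseconds;                    // signed duration
using timespan   = std::chrono::duration<uint64_t, std::nano>;  // unsigned duration

static timespan to_timespan(signedspan z) {
  assert(z >= signedspan::zero());          // fires if the measured span is negative
  return std::chrono::duration_cast<timespan>(z);
}

int main() {
  auto start = std::chrono::steady_clock::now();
  // ... Paxos round runs here ...
  auto end = std::chrono::steady_clock::now();
  to_timespan(end - start);                 // a clock reading that went backwards would abort here
  return 0;
}
```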
- 09:35 AM Bug #43185: ceph -s not showing client activity
- Possible relation to https://tracker.ceph.com/issues/43364 and https://tracker.ceph.com/issues/43317
12/17/2019
- 05:39 PM Bug #43308 (Fix Under Review): negative num_objects can set PG_STATE_DEGRADED
- 09:19 AM Backport #43346 (Resolved): nautilus: short pg log + cache tier ceph_test_rados out of order reply
- https://github.com/ceph/ceph/pull/32848
- 06:47 AM Bug #41950 (Can't reproduce): crimson compile
- 06:46 AM Bug #41950: crimson compile
- I assume that you were trying to compile crimson-osd, not crimson-old. Please check the submodule of seastar to unders...
12/16/2019
- 10:36 PM Bug #43296 (Need More Info): Ceph assimilate-conf results in config entries which can not be removed
- Can you attach the (relevant) output from "ceph config-key dump | grep config"? I think the keys are being installed...
- 10:22 PM Bug #43296: Ceph assimilate-conf results in config entries which can not be removed
- Might be related to #42964?
- 10:06 PM Bug #43334 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh broken packages with ubunt...
- Run: http://pulpito.ceph.com/yuriw-2019-12-15_16:25:11-rados-wip-yuri-nautilus-baseline_12.13.19-distro-basic-smithi/...
- 08:36 PM Bug #38358 (Pending Backport): short pg log + cache tier ceph_test_rados out of order reply
- Seen in nautilus: /a/yuriw-2019-12-15_16:25:11-rados-wip-yuri-nautilus-baseline_12.13.19-distro-basic-smithi/4605500/
- 12:40 PM Bug #43174 (New): pgs inconsistent, union_shard_errors=missing
- Hmm this may be something else then. David, does it look familiar?
- 08:40 AM Feature #43324: Make zlib windowBits configurable for compression
- Xiyuan Wang wrote:
> Now the zlib windowBits is hardcoding as -15[1]. But it should be set to different value for di...
- 03:38 AM Feature #43324 (Resolved): Make zlib windowBits configurable for compression
- Now the zlib windowBits is hardcoded as -15 [1], but it should be set to a different value for different cases.
Accor...
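As background, a sketch of where windowBits enters the zlib API; -15 selects a raw deflate stream with the full 32 KB window, and making it configurable would mean passing a different value here (illustrative only, not the Ceph compressor plugin code):
```
// zlib raw-deflate setup; windowBits = -15 is the hardcoded value discussed above.
#include <zlib.h>

static bool init_deflate(z_stream* strm, int window_bits /* e.g. -9 .. -15 for raw deflate */) {
  *strm = z_stream{};  // zalloc/zfree/opaque left as Z_NULL
  int rc = deflateInit2(strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                        window_bits, /*memLevel=*/8, Z_DEFAULT_STRATEGY);
  return rc == Z_OK;
}
```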
- 07:27 AM Backport #43325 (In Progress): luminous: wrong datatype describing crush_rule
- 07:24 AM Backport #43325 (New): luminous: wrong datatype describing crush_rule
- 07:24 AM Backport #43325 (Resolved): luminous: wrong datatype describing crush_rule
- https://github.com/ceph/ceph/pull/32267
12/15/2019
- 10:04 PM Documentation #41389 (Pending Backport): wrong datatype describing crush_rule
- 03:55 PM Bug #38076 (Resolved): osds allows to partially start more than N+2
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:53 PM Feature #40528 (Resolved): Better default value for osd_snap_trim_sleep
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:53 PM Backport #43320 (Resolved): mimic: PeeringState::GoClean will call purge_strays unconditionally
- https://github.com/ceph/ceph/pull/33329
- 03:53 PM Backport #43319 (Resolved): nautilus: PeeringState::GoClean will call purge_strays unconditionally
- https://github.com/ceph/ceph/pull/32847
- 01:27 PM Bug #42328: osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- Looking at the historical test runs, it seems to have started after [1] but before [2].
[1] http://pulpito.ceph.co...
- 01:30 AM Bug #42328: osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- http://qa-proxy.ceph.com/teuthology/teuthology-2019-12-02_02:01:02-rbd-master-distro-basic-smithi/4559106/teuthology.log
- 01:29 AM Bug #42328: osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- http://qa-proxy.ceph.com/teuthology/jdillaman-2019-12-14_17:15:11-rbd-wip-jd-testing-distro-basic-smithi/4603518/teut...
- 06:55 AM Bug #43318 (Need More Info): monitor mark all services(osd mgr) down
- Suddenly, all mgrs and osds in my cluster began to be set to down by the monitor.
The monitor log looks like this:
```
...
12/14/2019
- 08:28 AM Documentation #41389 (In Progress): wrong datatype describing crush_rule
- 07:21 AM Documentation #41389 (Pending Backport): wrong datatype describing crush_rule
- 02:42 AM Documentation #41389: wrong datatype describing crush_rule
- Just needs a cherry-pick of 3ed3de6c964ba998d5b18ceb997d1a6dffe355db
- 08:26 AM Backport #43315 (In Progress): mimic:wrong datatype describing crush_rule
- 08:02 AM Backport #43315 (Resolved): mimic:wrong datatype describing crush_rule
- https://github.com/ceph/ceph/pull/32255
- 08:24 AM Backport #43316 (In Progress): nautilus:wrong datatype describing crush_rule
- 08:03 AM Backport #43316 (Resolved): nautilus:wrong datatype describing crush_rule
- https://github.com/ceph/ceph/pull/32254
- 02:50 AM Bug #43307 (In Progress): Remove use of rules batching for upmap balancer
- 02:49 AM Bug #43312 (In Progress): Change default upmap_max_deviation to 5
- 02:06 AM Bug #43312 (Resolved): Change default upmap_max_deviation to 5
- 12:24 AM Bug #43311 (Resolved): asynchronous recovery + backfill might spin pg undersized for a long time
- When an osd that is part of current up set gets chosen as an
async_recovery_target, it gets removed from the acting ...
- 12:16 AM Bug #43308 (In Progress): negative num_objects can set PG_STATE_DEGRADED
12/13/2019
- 08:40 PM Bug #40963 (Resolved): mimic: MQuery during Deleting state
- 08:40 PM Bug #41317 (Pending Backport): PeeringState::GoClean will call purge_strays unconditionally
- 07:47 PM Bug #43308 (Resolved): negative num_objects can set PG_STATE_DEGRADED
- ...
- 07:05 PM Bug #43296: Ceph assimilate-conf results in config entries which can not be removed
- Alwin from Proxmox provided a workaround, but this still appears to be a bug:
https://forum.proxmox.com/threads/ceph...
- 04:51 PM Bug #43296: Ceph assimilate-conf results in config entries which can not be removed
- Setting debug_rdb to 5/5 unfortunately doesn't reveal anything:
Commands:...
- 03:37 AM Bug #43296 (Resolved): Ceph assimilate-conf results in config entries which can not be removed
- We assimilated our Ceph configuration file and subsequently have a minimal config file. We are subsequently not able ...
- 04:31 PM Bug #43307 (Resolved): Remove use of rules batching for upmap balancer
Due to cost of calculations for very large PG/shard counts, we will settle for balancing each pool individually for...
- 03:43 PM Bug #25174 (Can't reproduce): osd: assert failure with FAILED assert(repop_queue.front() == repop...
- 02:43 PM Bug #43306 (Resolved): segv in collect_sys_info
- Run: http://pulpito.ceph.com/teuthology-2019-12-13_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Job: '4...
- 02:40 PM Bug #43305 (Won't Fix): "psutil.NoSuchProcess process no longer exists" error in luminous-x-nauti...
- Run: http://pulpito.ceph.com/teuthology-2019-12-13_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Jobs: '...
- 08:23 AM Backport #42259 (Resolved): nautilus: document new option mon_max_pg_per_osd
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31300
m...
- 08:22 AM Backport #40947 (Resolved): luminous: Better default value for osd_snap_trim_sleep
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31857
m...
- 08:22 AM Backport #38205 (Resolved): luminous: osds allows to partially start more than N+2
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31858
m...
- 08:22 AM Backport #43093 (Resolved): luminous: Improve OSDMap::calc_pg_upmaps() efficiency
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31992
m...
- 06:17 AM Bug #40712: ceph-mon crash with assert(err == 0) after rocksdb->get
- We met this problem recently.
We believe this is related more to rocksdb than to ceph.
12/12/2019
- 04:41 PM Backport #40947: luminous: Better default value for osd_snap_trim_sleep
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31857
merged
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
- 04:41 PM Backport #38205: luminous: osds allows to partially start more than N+2
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31858
merged
- 04:40 PM Backport #43093: luminous: Improve OSDMap::calc_pg_upmaps() efficiency
- David Zafman wrote:
> https://github.com/ceph/ceph/pull/31992
merged
- 10:16 AM Bug #43174: pgs inconsistent, union_shard_errors=missing
- Greg thanks for the reply.
Greg Farnum wrote:
> If you fetch an object in RGW and its backing RADOS objects are m...
- 09:41 AM Bug #38330 (Resolved): osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:23 AM Backport #43119 (Resolved): mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/32000
m...
- 08:44 AM Bug #43193: "ceph ping mon.<id>" cannot work
- The command "ceph ping mon.a" or "ceph ping mon.b" or "ceph ping mon.c" works fine.
If the mon id is not specified, ...
- 05:31 AM Bug #41317 (Fix Under Review): PeeringState::GoClean will call purge_strays unconditionally
- 12:04 AM Bug #43267 (Rejected): unexpected error in BlueStore::_txc_add_transaction
- 12:02 AM Bug #43267: unexpected error in BlueStore::_txc_add_transaction
- Nope, it was full. Well spotted:...
12/11/2019
- 11:28 PM Bug #43267: unexpected error in BlueStore::_txc_add_transaction
This is caused by an out of space condition that won't usually happen. Check your BlueStore configuration.
Is ...
- 10:21 PM Bug #43267: unexpected error in BlueStore::_txc_add_transaction
- This is simply an out-of-space condition, see:
-6> 2019-12-11T16:13:44.466-0500 7fcbe4ecd700 -1 bluestore(/build/ce...
- 09:39 PM Bug #43267 (Rejected): unexpected error in BlueStore::_txc_add_transaction
- I was testing kcephfs vs. a vstart cluster and the OSD crashed. fsstress was running at the time, so it was being kep...
- 10:26 PM Bug #43268 (New): Restrict admin socket commands more from the Ceph tool
- https://bugzilla.redhat.com/show_bug.cgi?id=1780458
It sounds like we've given admin socket access to any cephx us... - 10:17 PM Bug #43106 (Resolved): mimic: crash in build_incremental_map_msg
- Marking this resolved as all the backports are now in place.
- 10:17 PM Bug #43174 (Closed): pgs inconsistent, union_shard_errors=missing
- If you fetch an object in RGW and its backing RADOS objects are missing, it just fills in the space with zeros. It so...
- 10:15 PM Bug #43173 (Duplicate): pgs inconsistent, union_shard_errors=missing
- 08:07 PM Bug #43266 (Fix Under Review): common: admin socket compiler warning
- 08:03 PM Bug #43266 (Resolved): common: admin socket compiler warning
- ...
- 01:38 PM Backport #43257 (Resolved): mimic: monitor config store: Deleting logging config settings does no...
- https://github.com/ceph/ceph/pull/33327
- 01:38 PM Backport #43256 (Resolved): nautilus: monitor config store: Deleting logging config settings does...
- https://github.com/ceph/ceph/pull/32846
- 04:05 AM Bug #42964 (Pending Backport): monitor config store: Deleting logging config settings does not de...
12/10/2019
- 08:44 PM Backport #40890 (In Progress): mimic: Pool settings aren't populated to OSD after restart.
- 08:41 PM Backport #40891 (In Progress): nautilus: Pool settings aren't populated to OSD after restart.
- 08:34 PM Backport #43246 (Resolved): nautilus: Nearfull warnings are incorrect
- https://github.com/ceph/ceph/pull/32773
- 08:29 PM Backport #43245 (Resolved): nautilus: osd: increase priority in certain OSD perf counters
- https://github.com/ceph/ceph/pull/32845
- 08:25 PM Backport #43239 (Resolved): nautilus: ok-to-stop incorrect for some ec pgs
- https://github.com/ceph/ceph/pull/32844
- 08:24 PM Backport #43232 (Rejected): nautilus: pgs stuck in laggy state
- 04:10 PM Bug #42346 (Pending Backport): Nearfull warnings are incorrect
- 03:26 PM Bug #42961 (Pending Backport): osd: increase priority in certain OSD perf counters
- 02:51 PM Bug #43189 (Pending Backport): pgs stuck in laggy state
- I'm not sure whether we should backport this to nautilus or not. We only noticed qa failures because the new octopus...
- 02:50 PM Bug #43189 (Resolved): pgs stuck in laggy state
- 01:48 AM Bug #43048: nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
- /a/yuriw-2019-12-06_21:30:44-upgrade:mimic-x-nautilus-distro-basic-smithi/4576681
12/09/2019
- 10:07 PM Bug #43067: Git Master: src/compressor/zlib/ZlibCompressor.cc / src/compressor/zlib/CMakeLists.txt
- Thanks Lee!
We generally do patch contributions through Github; can you submit a PR there?
If not, we need a spec...
- 09:53 PM Bug #43176 (Duplicate): pgs inconsistent, union_shard_errors=missing
- 09:53 PM Bug #43175 (Duplicate): pgs inconsistent, union_shard_errors=missing
- 09:35 PM Bug #43151 (Pending Backport): ok-to-stop incorrect for some ec pgs
- 04:58 PM Bug #43189 (Fix Under Review): pgs stuck in laggy state
- 03:15 PM Bug #43189: pgs stuck in laggy state
- The problem is the role. The proc_lease() method does this check...
- 02:33 PM Bug #43189 (In Progress): pgs stuck in laggy state
- 04:50 PM Bug #43213 (New): OSDMap::pg_to_up_acting etc specify primary as osd, not pg_shard_t(osd+shard)
- The OSD methods to map a PG return primary as an int, not pg_shard_t (osd + shard).
Objecter compensates for this ...
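A rough, simplified sketch of the distinction being made; the field names follow the upstream pg_shard_t, but this is only an illustration, not the real definition:
```
// Simplified illustration of why "primary as an int" loses information for EC pools.
#include <cstdint>

using shard_id = int8_t;
constexpr shard_id NO_SHARD = -1;          // replicated pools have no shard

struct pg_shard_t {                        // simplified stand-in for the real type
  int32_t osd = -1;                        // which OSD
  shard_id shard = NO_SHARD;               // which erasure-coded shard on that OSD
};

// An int can only say *which OSD* is primary; pg_shard_t also says which shard,
// which is what the Objecter currently has to reconstruct on its own.
int primary_as_int(const pg_shard_t& p) { return p.osd; }
```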
- 04:06 PM Bug #40963: mimic: MQuery during Deleting state
- /a/sage-2019-12-08_05:43:33-rados-nautilus-distro-basic-smithi/4580545
- 12:59 PM Backport #40890: mimic: Pool settings aren't populated to OSD after restart.
- Here's my attempt at the backport: https://github.com/ceph/ceph/pull/32125
- 12:53 PM Backport #40891: nautilus: Pool settings aren't populated to OSD after restart.
- Here's my attempt at the backport: https://github.com/ceph/ceph/pull/32123
- 08:55 AM Bug #43193 (Rejected): "ceph ping mon.<id>" cannot work
- The command "ceph ping mon.<id>" returns an error output:...
- 06:35 AM Bug #42706: LibRadosList.EnumerateObjectsSplit fails
- rados_cluster handler will be freed if set_pg_num failed,...
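A hedged illustration of the lifetime hazard being described, using the public librados C API; the failing set_pg_num step is represented by a plain flag, since the real test code is not shown here:
```
// Illustration only: if a helper shuts the cluster handle down on failure,
// the caller must not call rados_shutdown() on it again.
#include <rados/librados.h>

static int run(bool helper_failed) {
  rados_t cluster = nullptr;
  if (rados_create(&cluster, nullptr) < 0)
    return -1;
  if (helper_failed) {            // e.g. the set_pg_num step mentioned above
    rados_shutdown(cluster);      // handle released here ...
    cluster = nullptr;            // ... so it must not be released again below
    return -1;
  }
  rados_shutdown(cluster);
  return 0;
}

int main() { return run(true) == -1 ? 0 : 1; }
```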
- 03:35 AM Bug #42861: Libceph-common.so needs to use private link attribute when including dpdk static library
- The dpdk library initializes the EAL using constructors and global
variables, and cannot be re-initialized. Both tes...
12/08/2019
- 11:22 PM Bug #43190 (New): qa/standalone/osd/osd-recovery-prio.sh has a race
http://pulpito.ceph.com/dzafman-2019-12-08_11:51:45-rados-master-distro-basic-smithi/4582053/
The test expected ...
- 09:25 PM Bug #43189: pgs stuck in laggy state
- more logs here:
/a/sage-2019-12-07_18:31:18-rados:thrash-erasure-code-wip-sage3-testing-2019-12-05-0959-distro-basic...
- 09:23 PM Bug #43189 (Resolved): pgs stuck in laggy state
- ...
12/07/2019
- 06:28 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
- 02:47 PM Bug #41313: PG distribution completely messed up since Nautilus
- ceph balancer status
{
"active": true,
"plans": [],
"mode": "upmap"
}
bad distribution:
<p...
- 02:45 PM Bug #43185: ceph -s not showing client activity
- ceph -s only looks like this:
ceph -s
cluster:
id: c4068f25-d46d-438d-af63-5679a2d56efb
health: H...
- 02:44 PM Bug #43185 (Resolved): ceph -s not showing client activity
- Since Nautilus upgrade ceph -s often (2 out of 3 times) does not show any client or recovery activity. Right now it's...
12/06/2019
- 05:21 PM Bug #42964 (Fix Under Review): monitor config store: Deleting logging config settings does not de...
- 04:07 PM Bug #42347: nautilus assert during osd shutdown: FAILED ceph_assert((sharded_in_flight_list.back(...
- Seen in this scrub test run during osd-scrub-repair.sh.
http://pulpito.ceph.com/dzafman-2019-12-05_19:53:40-rados-...
- 02:01 PM Bug #43176 (Duplicate): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 02:01 PM Bug #43175 (Duplicate): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 02:01 PM Bug #43174 (Resolved): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 02:00 PM Bug #43173 (Duplicate): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 12:55 PM Backport #42997 (In Progress): nautilus: acting_recovery_backfill won't catch all up peers
- 12:48 PM Backport #42878 (In Progress): nautilus: ceph_test_admin_socket_output fails in rados qa suite
- 12:48 PM Backport #42853 (In Progress): nautilus: format error: ceph osd stat --format=json
- 12:47 PM Backport #42847 (Need More Info): mimic: "failing miserably..." in Infiniband.cc
- non-trivial
- 12:47 PM Backport #42848 (Need More Info): nautilus: "failing miserably..." in Infiniband.cc
- non-trivial
- 04:23 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- Oops. I think the more significant issue is that short_pg_log.yaml isn't involved.
- 02:09 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- David Zafman wrote:
> Seen in a non-upgrade test:
This is an upgrade test: "rados/upgrade/jewel-x-singleton/{0-c...
- 02:00 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- Seen in a -non-upgrade- test with description:
rados/upgrade/jewel-x-singleton/{0-cluster/{openstack.yaml start.ya...
12/05/2019
- 11:28 PM Bug #41240 (Can't reproduce): All of the cluster SSDs aborted at around the same time and will no...
- 09:37 PM Bug #41240 (New): All of the cluster SSDs aborted at around the same time and will not start.
- 11:24 PM Bug #38892 (Closed): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation...
- 09:45 PM Bug #38892 (Fix Under Review): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Se...
- 09:44 PM Bug #23590 (Fix Under Review): kstore: statfs: (95) Operation not supported
- 09:44 PM Bug #23297 (Fix Under Review): mon-seesaw 'failed to become clean before timeout' due to laggy pg...
- 09:43 PM Bug #13111 (Fix Under Review): replicatedPG:the assert occurs in the fuction ReplicatedPG::on_loc...
- 09:40 PM Feature #38653 (New): Enhance health message when pool quota fills up
- 09:40 PM Bug #38783 (New): Changing mon_pg_warn_max_object_skew has no effect.
- 09:40 PM Feature #3764 (New): osd: async replicas
- 09:37 PM Bug #43048 (New): nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
- 09:37 PM Bug #42918 (New): memory corruption and lockups with I-Object
- 09:37 PM Bug #42780 (New): recursive lock of OpTracker::lock (70)
- 09:37 PM Bug #42706 (New): LibRadosList.EnumerateObjectsSplit fails
- 09:37 PM Bug #42666 (New): mgropen from mgr comes from unknown.$id instead of mgr.$id
- 09:37 PM Bug #42186 (New): "2019-10-04T19:31:51.053283+0000 osd.7 (osd.7) 108 : cluster [ERR] 2.5s0 shard ...
- 09:37 PM Bug #41406 (New): common: SafeTimer reinit doesn't fix up "stopping" bool, used in MonClient boot...
- 09:37 PM Bug #40963 (New): mimic: MQuery during Deleting state
- 06:31 PM Bug #40963: mimic: MQuery during Deleting state
- yuriw-2019-12-04_22:44:10-rados-wip-yuri2-testing-2019-12-04-1938-mimic-distro-basic-smithi/4567200/
DeleteStart e... - 09:37 PM Bug #40868 (New): src/common/config_proxy.h: 70: FAILED ceph_assert(p != obs_call_gate.end())
- 09:37 PM Bug #40820 (New): standalone/scrub/osd-scrub-test.sh +3 day failed assert
- 09:37 PM Bug #40666 (New): osd fails to get latest map
- 09:37 PM Fix #40564 (New): Objecter does not have perfcounters for op latency
- 09:37 PM Bug #40522 (New): on_local_recover doesn't touch?
- 09:37 PM Bug #40454 (New): snap_mapper error, scrub gets r -2..repaired
- 09:37 PM Bug #40521 (New): cli timeout (e.g., ceph pg dump)
- 09:37 PM Bug #40367 (New): "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-nautilus
- 09:37 PM Bug #40410 (New): ceph pg query Segmentation fault in 12.2.10
- 09:36 PM Feature #39966 (New): mon: allow log messages to be throttled and/or force trimming
- 09:36 PM Bug #40000 (New): osds do not bound xattrs and/or aggregate xattr data in pg log
- 09:36 PM Bug #39366 (New): ClsLock.TestRenew failure
- 09:36 PM Bug #39145 (New): luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine eve...
- 09:36 PM Bug #39148 (New): luminous: powercycle: reached maximum tries (500) after waiting for 3000 seconds
- 09:36 PM Bug #39039 (New): mon connection reset, command not resent
- 09:36 PM Fix #39071 (New): monclient: initial probe is non-optimal with v2+v1
- 09:36 PM Bug #38656 (New): scrub reservation leak?
- 09:36 PM Bug #38718 (New): 'osd crush weight-set create-compat' (and other OSDMonitor commands) can leak u...
- 09:36 PM Bug #38624 (New): crush: get_rule_weight_osd_map does not handle multi-take rules
- 09:36 PM Bug #38513 (New): luminous: "AsyncReserver.h: 190: FAILED assert(!queue_pointers.count(item) && !...
- 09:36 PM Bug #38402 (New): ceph-objectstore-tool on down osd w/ not enough in osds
- 09:36 PM Bug #38417 (New): ceph tell mon.a help timeout
- 09:36 PM Bug #38357 (New): ClsLock.TestExclusiveEphemeralStealEphemeral failed
- 09:36 PM Bug #38358 (New): short pg log + cache tier ceph_test_rados out of order reply
- 09:36 PM Bug #38195 (New): osd-backfill-space.sh exposes rocksdb hang
- 09:36 PM Bug #38345 (New): mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- 09:36 PM Bug #38184 (New): osd: recovery does not preserve copy-on-write allocations between object clones...
- 09:36 PM Bug #38159 (New): ec does not recover below min_size
- 09:36 PM Bug #38172 (New): segv in rocksdb NewIterator
- 09:36 PM Bug #38151 (New): cephx: service ticket validity doubled
- 09:36 PM Bug #38082 (New): mimic: mon/caps.sh fails with "Expected return 0, got 110"
- 09:36 PM Bug #38064 (New): librados::OPERATION_FULL_TRY not completely implemented, test LibRadosAio.PoolQ...
- 09:36 PM Bug #37582 (New): luminous: ceph -s client gets all mgrmaps
- 09:36 PM Bug #37532 (New): mon: expected_num_objects warning triggers on bluestore-only setups
- 09:36 PM Bug #37509 (New): require past_interval bounds mismatch due to osd oldest_map
- 09:36 PM Bug #36748 (New): ms_deliver_verify_authorizer no AuthAuthorizeHandler found for protocol 0
- 09:36 PM Bug #37289 (New): Issue with overfilled OSD for cache-tier pools
- 09:36 PM Bug #36634 (New): LibRadosWatchNotify.WatchNotify2Timeout failure
- 09:36 PM Bug #36337 (New): OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
- 09:36 PM Bug #36164 (New): cephtool/test fails 'ceph tell mon.a help' with EINTR
- 09:36 PM Bug #36113 (New): fusestore test umount failed?
- 09:36 PM Bug #35075 (New): copy-get stuck sending osd_op
- 09:36 PM Bug #36040 (New): mon: Valgrind: mon (InvalidFree, InvalidWrite, InvalidRead)
- 09:36 PM Bug #24874 (New): ec fast reads can trigger read errors in log
- 09:36 PM Bug #26891 (New): backfill reservation deadlock/stall
- 09:36 PM Bug #24242 (New): tcmalloc::ThreadCache::ReleaseToCentralCache on rhel (w/ centos packages)
- 09:36 PM Bug #24339 (New): FULL_FORCE ops are dropped if fail-safe full check fails, but not resent in sca...
- 09:36 PM Bug #23965 (New): FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cach...
- 09:36 PM Bug #23857 (New): flush (manifest) vs async recovery causes out of order op
- 09:36 PM Bug #23879 (New): test_mon_osdmap_prune.sh fails
- 09:36 PM Bug #23828 (New): ec gen object leaks into different filestore collection just after split
- 09:36 PM Bug #23760 (New): mon: `config get <who>` does not allow `who` as 'mon'/'osd'
- 09:36 PM Bug #23767 (New): "ceph ping mon" doesn't work
- 09:36 PM Bug #23270 (New): failed mutex assert in PipeConnection::try_get_pipe() (via OSD::do_command())
- 09:36 PM Bug #23428 (New): Snapset inconsistency is hard to diagnose because authoritative copy used by li...
- 09:36 PM Bug #23029 (New): osd does not handle eio on meta objects (e.g., osdmap)
- 09:36 PM Bug #22656 (New): scrub mismatch on bytes (cache pools)
- 09:36 PM Bug #21592 (New): LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
- 09:36 PM Bug #21495 (New): src/osd/OSD.cc: 346: FAILED assert(piter != rev_pending_splits.end())
- 09:36 PM Bug #21129 (New): 'ceph -s' hang
- 09:36 PM Bug #21194 (New): mon clock skew test is fragile
- 09:36 PM Bug #20960 (New): ceph_test_rados: mismatched version (due to pg import/export)
- 09:35 PM Bug #20952 (New): Glitchy monitor quorum causes spurious test failure
- 09:35 PM Bug #20922 (New): misdirected op with localize_reads set
- 09:35 PM Bug #20846 (New): ceph_test_rados_list_parallel: options dtor racing with DispatchQueue lockdep -...
- 09:35 PM Bug #20770 (New): test_pidfile.sh test is failing 2 places
- 09:35 PM Bug #20730 (New): need new OSD_SKEWED_USAGE implementation
- 09:35 PM Bug #20370 (New): leaked MOSDOp via PrimaryLogPG::_copy_some and PrimaryLogPG::do_proxy_write
- 09:35 PM Bug #20646 (New): run_seed_to_range.sh: segv, tp_fstore_op timeout
- 09:35 PM Bug #20360 (New): rados/verify valgrind tests: osds fail to start (xenial valgrind)
- 09:35 PM Bug #20369 (New): segv in OSD::ShardedOpWQ::_process
- 09:35 PM Bug #20221 (New): kill osd + osd out leads to stale PGs
- 09:35 PM Bug #20169 (New): filestore+btrfs occasionally returns ENOSPC
- 09:35 PM Bug #20053 (New): crush compile / decompile loses precision on weight
- 09:35 PM Bug #19700 (New): OSD remained up despite cluster network being inactive?
- 09:35 PM Bug #19486 (New): Rebalancing can propagate corrupt copy of replicated object
- 09:35 PM Bug #19518 (New): log entry does not include per-op rvals?
- 09:35 PM Bug #19440 (New): osd: trims maps that pgs haven't consumed yet when there are gaps
- 09:35 PM Bug #17257 (New): ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
- 09:35 PM Bug #15015 (New): prepare_new_pool doesn't return failure string ss
- 09:35 PM Bug #14115 (New): crypto: race in nss init
- 09:35 PM Bug #13385 (New): cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final ro...
- 09:35 PM Bug #12687 (New): osd thrashing + pg import/export can cause maybe_went_rw intervals to be missed
- 09:35 PM Bug #12615 (New): Repair of Erasure Coded pool with an unrepairable object causes pg state to los...
- 09:35 PM Bug #11235 (New): test_rados.py test_aio_read is racy
- 09:35 PM Bug #9606 (New): mon: ambiguous error_status returned to user when type is wrong in a command
- 08:31 PM Bug #43151 (Fix Under Review): ok-to-stop incorrect for some ec pgs
- 04:33 PM Bug #43151 (Resolved): ok-to-stop incorrect for some ec pgs
- before,...
- 08:16 PM Backport #43119: mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/32000
merged
- 08:01 PM Backport #41238: nautilus: Implement mon_memory_target
- Follow-on fix: https://github.com/ceph/ceph/pull/32045
- 08:00 PM Feature #40870: Implement mon_memory_target
- This has a follow-on fix: https://github.com/ceph/ceph/pull/32044
- 06:01 PM Bug #38040: osd_map_message_max default is too high?
- Luminous backport analysis:
* https://github.com/ceph/ceph/pull/26340 - two of three commits backported to luminou...
- 05:50 PM Bug #43150 (In Progress): osd-scrub-snaps.sh fails
- 05:21 PM Bug #43150: osd-scrub-snaps.sh fails
- During testing I saw this even though it isn't what happened in the teuthology runs. I think in all cases we have sc...
- 03:51 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
- /a/sage-2019-12-04_19:33:15-rados-wip-sage2-testing-2019-12-04-0856-distro-basic-smithi/4567061
/a/sage-2019-12-04_1...
- 05:41 PM Bug #43106: mimic: crash in build_incremental_map_msg
- The three PRs that need to be backported to mimic are:
* https://github.com/ceph/ceph/pull/26340 - backported to m...
- 01:41 PM Backport #43140 (In Progress): nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not resp...
- 11:07 AM Backport #43140 (Resolved): nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not respected
- https://github.com/ceph/ceph/pull/32028
- 01:34 PM Bug #42485: verify_upmaps can not cancel invalid upmap_items in some cases
- NOTE: https://github.com/ceph/ceph/pull/31131 was merged to master and backported to nautilus and luminous, before it...
- 04:04 AM Bug #42485 (Resolved): verify_upmaps can not cancel invalid upmap_items in some cases
- 01:32 PM Backport #42547: nautilus: verify_upmaps can not cancel invalid upmap_items in some cases
- NOTE: reverted by https://github.com/ceph/ceph/pull/32018
- 01:30 PM Backport #42548: luminous: verify_upmaps can not cancel invalid upmap_items in some cases
- Note: reverted by https://github.com/ceph/ceph/pull/32019
- 07:52 AM Bug #42906 (Pending Backport): ceph-mon --mkfs: public_address type (v1|v2) is not respected
- 06:22 AM Bug #37968 (Resolved): maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps
- 06:21 AM Backport #38163 (Resolved): mimic: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps
- 04:04 AM Backport #42546 (Rejected): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
- This change has been reverted so we won't backport.
- 12:43 AM Bug #43124: Probably legal crush rules cause upmaps to be cleaned
We are reverting the original pull request which changed verify_upmaps(): https://github.com/ceph/ceph/pull/31131
...