Activity
From 11/08/2019 to 12/07/2019
12/07/2019
- 06:28 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
- 02:47 PM Bug #41313: PG distribution completely messed up since Nautilus
- ceph balancer status
{
"active": true,
"plans": [],
"mode": "upmap"
}
bad distribution:
<p...
- 02:45 PM Bug #43185: ceph -s not showing client activity
- ceph -s only looks like this:
ceph -s
cluster:
id: c4068f25-d46d-438d-af63-5679a2d56efb
health: H...
- 02:44 PM Bug #43185 (Resolved): ceph -s not showing client activity
- Since Nautilus upgrade ceph -s often (2 out of 3 times) does not show any client or recovery activity. Right now it's...
12/06/2019
- 05:21 PM Bug #42964 (Fix Under Review): monitor config store: Deleting logging config settings does not de...
- 04:07 PM Bug #42347: nautilus assert during osd shutdown: FAILED ceph_assert((sharded_in_flight_list.back(...
- Seen in this scrub test run during osd-scrub-repair.sh.
http://pulpito.ceph.com/dzafman-2019-12-05_19:53:40-rados-...
- 02:01 PM Bug #43176 (Duplicate): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 02:01 PM Bug #43175 (Duplicate): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 02:01 PM Bug #43174 (Resolved): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 02:00 PM Bug #43173 (Duplicate): pgs inconsistent, union_shard_errors=missing
- Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
- 12:55 PM Backport #42997 (In Progress): nautilus: acting_recovery_backfill won't catch all up peers
- 12:48 PM Backport #42878 (In Progress): nautilus: ceph_test_admin_socket_output fails in rados qa suite
- 12:48 PM Backport #42853 (In Progress): nautilus: format error: ceph osd stat --format=json
- 12:47 PM Backport #42847 (Need More Info): mimic: "failing miserably..." in Infiniband.cc
- non-trivial
- 12:47 PM Backport #42848 (Need More Info): nautilus: "failing miserably..." in Infiniband.cc
- non-trivial
- 04:23 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- Oops. I think the more significant issue is that short_pg_log.yaml isn't involved.
- 02:09 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- David Zafman wrote:
> Seen in a non-upgrade test:
This is an upgrade test: "rados/upgrade/jewel-x-singleton/{0-c...
- 02:00 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- Seen in a -non-upgrade- test with description:
rados/upgrade/jewel-x-singleton/{0-cluster/{openstack.yaml start.ya...
12/05/2019
- 11:28 PM Bug #41240 (Can't reproduce): All of the cluster SSDs aborted at around the same time and will no...
- 09:37 PM Bug #41240 (New): All of the cluster SSDs aborted at around the same time and will not start.
- 11:24 PM Bug #38892 (Closed): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation...
- 09:45 PM Bug #38892 (Fix Under Review): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Se...
- 09:44 PM Bug #23590 (Fix Under Review): kstore: statfs: (95) Operation not supported
- 09:44 PM Bug #23297 (Fix Under Review): mon-seesaw 'failed to become clean before timeout' due to laggy pg...
- 09:43 PM Bug #13111 (Fix Under Review): replicatedPG:the assert occurs in the fuction ReplicatedPG::on_loc...
- 09:40 PM Feature #38653 (New): Enhance health message when pool quota fills up
- 09:40 PM Bug #38783 (New): Changing mon_pg_warn_max_object_skew has no effect.
- 09:40 PM Feature #3764 (New): osd: async replicas
- 09:37 PM Bug #43048 (New): nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
- 09:37 PM Bug #42918 (New): memory corruption and lockups with I-Object
- 09:37 PM Bug #42780 (New): recursive lock of OpTracker::lock (70)
- 09:37 PM Bug #42706 (New): LibRadosList.EnumerateObjectsSplit fails
- 09:37 PM Bug #42666 (New): mgropen from mgr comes from unknown.$id instead of mgr.$id
- 09:37 PM Bug #42186 (New): "2019-10-04T19:31:51.053283+0000 osd.7 (osd.7) 108 : cluster [ERR] 2.5s0 shard ...
- 09:37 PM Bug #41406 (New): common: SafeTimer reinit doesn't fix up "stopping" bool, used in MonClient boot...
- 09:37 PM Bug #40963 (New): mimic: MQuery during Deleting state
- 06:31 PM Bug #40963: mimic: MQuery during Deleting state
- yuriw-2019-12-04_22:44:10-rados-wip-yuri2-testing-2019-12-04-1938-mimic-distro-basic-smithi/4567200/
DeleteStart e...
- 09:37 PM Bug #40868 (New): src/common/config_proxy.h: 70: FAILED ceph_assert(p != obs_call_gate.end())
- 09:37 PM Bug #40820 (New): standalone/scrub/osd-scrub-test.sh +3 day failed assert
- 09:37 PM Bug #40666 (New): osd fails to get latest map
- 09:37 PM Fix #40564 (New): Objecter does not have perfcounters for op latency
- 09:37 PM Bug #40522 (New): on_local_recover doesn't touch?
- 09:37 PM Bug #40454 (New): snap_mapper error, scrub gets r -2..repaired
- 09:37 PM Bug #40521 (New): cli timeout (e.g., ceph pg dump)
- 09:37 PM Bug #40367 (New): "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-nautilus
- 09:37 PM Bug #40410 (New): ceph pg query Segmentation fault in 12.2.10
- 09:36 PM Feature #39966 (New): mon: allow log messages to be throttled and/or force trimming
- 09:36 PM Bug #40000 (New): osds do not bound xattrs and/or aggregate xattr data in pg log
- 09:36 PM Bug #39366 (New): ClsLock.TestRenew failure
- 09:36 PM Bug #39145 (New): luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine eve...
- 09:36 PM Bug #39148 (New): luminous: powercycle: reached maximum tries (500) after waiting for 3000 seconds
- 09:36 PM Bug #39039 (New): mon connection reset, command not resent
- 09:36 PM Fix #39071 (New): monclient: initial probe is non-optimal with v2+v1
- 09:36 PM Bug #38656 (New): scrub reservation leak?
- 09:36 PM Bug #38718 (New): 'osd crush weight-set create-compat' (and other OSDMonitor commands) can leak u...
- 09:36 PM Bug #38624 (New): crush: get_rule_weight_osd_map does not handle multi-take rules
- 09:36 PM Bug #38513 (New): luminous: "AsyncReserver.h: 190: FAILED assert(!queue_pointers.count(item) && !...
- 09:36 PM Bug #38402 (New): ceph-objectstore-tool on down osd w/ not enough in osds
- 09:36 PM Bug #38417 (New): ceph tell mon.a help timeout
- 09:36 PM Bug #38357 (New): ClsLock.TestExclusiveEphemeralStealEphemeral failed
- 09:36 PM Bug #38358 (New): short pg log + cache tier ceph_test_rados out of order reply
- 09:36 PM Bug #38195 (New): osd-backfill-space.sh exposes rocksdb hang
- 09:36 PM Bug #38345 (New): mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- 09:36 PM Bug #38184 (New): osd: recovery does not preserve copy-on-write allocations between object clones...
- 09:36 PM Bug #38159 (New): ec does not recover below min_size
- 09:36 PM Bug #38172 (New): segv in rocksdb NewIterator
- 09:36 PM Bug #38151 (New): cephx: service ticket validity doubled
- 09:36 PM Bug #38082 (New): mimic: mon/caps.sh fails with "Expected return 0, got 110"
- 09:36 PM Bug #38064 (New): librados::OPERATION_FULL_TRY not completely implemented, test LibRadosAio.PoolQ...
- 09:36 PM Bug #37582 (New): luminous: ceph -s client gets all mgrmaps
- 09:36 PM Bug #37532 (New): mon: expected_num_objects warning triggers on bluestore-only setups
- 09:36 PM Bug #37509 (New): require past_interval bounds mismatch due to osd oldest_map
- 09:36 PM Bug #36748 (New): ms_deliver_verify_authorizer no AuthAuthorizeHandler found for protocol 0
- 09:36 PM Bug #37289 (New): Issue with overfilled OSD for cache-tier pools
- 09:36 PM Bug #36634 (New): LibRadosWatchNotify.WatchNotify2Timeout failure
- 09:36 PM Bug #36337 (New): OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
- 09:36 PM Bug #36164 (New): cephtool/test fails 'ceph tell mon.a help' with EINTR
- 09:36 PM Bug #36113 (New): fusestore test umount failed?
- 09:36 PM Bug #35075 (New): copy-get stuck sending osd_op
- 09:36 PM Bug #36040 (New): mon: Valgrind: mon (InvalidFree, InvalidWrite, InvalidRead)
- 09:36 PM Bug #24874 (New): ec fast reads can trigger read errors in log
- 09:36 PM Bug #26891 (New): backfill reservation deadlock/stall
- 09:36 PM Bug #24242 (New): tcmalloc::ThreadCache::ReleaseToCentralCache on rhel (w/ centos packages)
- 09:36 PM Bug #24339 (New): FULL_FORCE ops are dropped if fail-safe full check fails, but not resent in sca...
- 09:36 PM Bug #23965 (New): FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cach...
- 09:36 PM Bug #23857 (New): flush (manifest) vs async recovery causes out of order op
- 09:36 PM Bug #23879 (New): test_mon_osdmap_prune.sh fails
- 09:36 PM Bug #23828 (New): ec gen object leaks into different filestore collection just after split
- 09:36 PM Bug #23760 (New): mon: `config get <who>` does not allow `who` as 'mon'/'osd'
- 09:36 PM Bug #23767 (New): "ceph ping mon" doesn't work
- 09:36 PM Bug #23270 (New): failed mutex assert in PipeConnection::try_get_pipe() (via OSD::do_command())
- 09:36 PM Bug #23428 (New): Snapset inconsistency is hard to diagnose because authoritative copy used by li...
- 09:36 PM Bug #23029 (New): osd does not handle eio on meta objects (e.g., osdmap)
- 09:36 PM Bug #22656 (New): scrub mismatch on bytes (cache pools)
- 09:36 PM Bug #21592 (New): LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
- 09:36 PM Bug #21495 (New): src/osd/OSD.cc: 346: FAILED assert(piter != rev_pending_splits.end())
- 09:36 PM Bug #21129 (New): 'ceph -s' hang
- 09:36 PM Bug #21194 (New): mon clock skew test is fragile
- 09:36 PM Bug #20960 (New): ceph_test_rados: mismatched version (due to pg import/export)
- 09:35 PM Bug #20952 (New): Glitchy monitor quorum causes spurious test failure
- 09:35 PM Bug #20922 (New): misdirected op with localize_reads set
- 09:35 PM Bug #20846 (New): ceph_test_rados_list_parallel: options dtor racing with DispatchQueue lockdep -...
- 09:35 PM Bug #20770 (New): test_pidfile.sh test is failing 2 places
- 09:35 PM Bug #20730 (New): need new OSD_SKEWED_USAGE implementation
- 09:35 PM Bug #20370 (New): leaked MOSDOp via PrimaryLogPG::_copy_some and PrimaryLogPG::do_proxy_write
- 09:35 PM Bug #20646 (New): run_seed_to_range.sh: segv, tp_fstore_op timeout
- 09:35 PM Bug #20360 (New): rados/verify valgrind tests: osds fail to start (xenial valgrind)
- 09:35 PM Bug #20369 (New): segv in OSD::ShardedOpWQ::_process
- 09:35 PM Bug #20221 (New): kill osd + osd out leads to stale PGs
- 09:35 PM Bug #20169 (New): filestore+btrfs occasionally returns ENOSPC
- 09:35 PM Bug #20053 (New): crush compile / decompile loses precision on weight
- 09:35 PM Bug #19700 (New): OSD remained up despite cluster network being inactive?
- 09:35 PM Bug #19486 (New): Rebalancing can propagate corrupt copy of replicated object
- 09:35 PM Bug #19518 (New): log entry does not include per-op rvals?
- 09:35 PM Bug #19440 (New): osd: trims maps that pgs haven't consumed yet when there are gaps
- 09:35 PM Bug #17257 (New): ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
- 09:35 PM Bug #15015 (New): prepare_new_pool doesn't return failure string ss
- 09:35 PM Bug #14115 (New): crypto: race in nss init
- 09:35 PM Bug #13385 (New): cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final ro...
- 09:35 PM Bug #12687 (New): osd thrashing + pg import/export can cause maybe_went_rw intervals to be missed
- 09:35 PM Bug #12615 (New): Repair of Erasure Coded pool with an unrepairable object causes pg state to los...
- 09:35 PM Bug #11235 (New): test_rados.py test_aio_read is racy
- 09:35 PM Bug #9606 (New): mon: ambiguous error_status returned to user when type is wrong in a command
- 08:31 PM Bug #43151 (Fix Under Review): ok-to-stop incorrect for some ec pgs
- 04:33 PM Bug #43151 (Resolved): ok-to-stop incorrect for some ec pgs
- before,...
- 08:16 PM Backport #43119: mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/32000
merged
- 08:01 PM Backport #41238: nautilus: Implement mon_memory_target
- Follow-on fix: https://github.com/ceph/ceph/pull/32045
- 08:00 PM Feature #40870: Implement mon_memory_target
- This has a follow-on fix: https://github.com/ceph/ceph/pull/32044
- 06:01 PM Bug #38040: osd_map_message_max default is too high?
- Luminous backport analysis:
* https://github.com/ceph/ceph/pull/26340 - two of three commits backported to luminou...
- 05:50 PM Bug #43150 (In Progress): osd-scrub-snaps.sh fails
- 05:21 PM Bug #43150: osd-scrub-snaps.sh fails
- During testing I saw this even though it isn't what happened in the teuthology runs. I think in all cases we have sc...
- 03:51 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
- /a/sage-2019-12-04_19:33:15-rados-wip-sage2-testing-2019-12-04-0856-distro-basic-smithi/4567061
/a/sage-2019-12-04_1...
- 05:41 PM Bug #43106: mimic: crash in build_incremental_map_msg
- The three PRs that need to be backported to mimic are:
* https://github.com/ceph/ceph/pull/26340 - backported to m...
- 01:41 PM Backport #43140 (In Progress): nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not resp...
- 11:07 AM Backport #43140 (Resolved): nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not respected
- https://github.com/ceph/ceph/pull/32028
- 01:34 PM Bug #42485: verify_upmaps can not cancel invalid upmap_items in some cases
- NOTE: https://github.com/ceph/ceph/pull/31131 was merged to master and backported to nautilus and luminous, before it...
- 04:04 AM Bug #42485 (Resolved): verify_upmaps can not cancel invalid upmap_items in some cases
- 01:32 PM Backport #42547: nautilus: verify_upmaps can not cancel invalid upmap_items in some cases
- NOTE: reverted by https://github.com/ceph/ceph/pull/32018
- 01:30 PM Backport #42548: luminous: verify_upmaps can not cancel invalid upmap_items in some cases
- Note: reverted by https://github.com/ceph/ceph/pull/32019
- 07:52 AM Bug #42906 (Pending Backport): ceph-mon --mkfs: public_address type (v1|v2) is not respected
- 06:22 AM Bug #37968 (Resolved): maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps
- 06:21 AM Backport #38163 (Resolved): mimic: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps
- 04:04 AM Backport #42546 (Rejected): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
- This change has been reverted so we won't backport.
- 12:43 AM Bug #43124: Probably legal crush rules cause upmaps to be cleaned
We are reverting the original pull request which changed verify_upmaps(): https://github.com/ceph/ceph/pull/31131
...
12/04/2019
- 08:51 PM Bug #43126 (Resolved): OSD_SLOW_PING_TIME_BACK nits
From Sage e-mail:
Long heartbeat ping times on back interface seen, longest is 1315.510 msec (OSD_SLOW_PING_TIME...
- 08:46 PM Bug #43124 (Resolved): Probably legal crush rules cause upmaps to be cleaned
- I've seen multiple user sites with crush rules for EC pools which will trigger the verify_upmap() to detect an error....
- 08:24 PM Backport #42546 (In Progress): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
- 08:13 PM Backport #42546 (Resolved): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
- 12:13 PM Bug #38330: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- @Dan, @Neha - mimic backport staged at https://github.com/ceph/ceph/pull/26448
- 02:33 AM Bug #38330 (Pending Backport): osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- Based on https://tracker.ceph.com/issues/43106#note-1 and https://tracker.ceph.com/issues/38282#note-14
- 12:11 PM Backport #43119 (In Progress): mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map...
- 12:08 PM Backport #43119 (Resolved): mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- https://github.com/ceph/ceph/pull/32000
- 02:30 AM Bug #43106: mimic: crash in build_incremental_map_msg
- I think you are right. We should have backported all three PRs according to https://tracker.ceph.com/issues/38040#not...
12/03/2019
- 07:37 PM Bug #43110 (Duplicate): rados/test.sh failure: ceph_test_rados_api_watch_notify_pp
- https://tracker.ceph.com/issues/42933
- 07:27 PM Bug #43110: rados/test.sh failure: ceph_test_rados_api_watch_notify_pp
- Neha pointed out the core info is obviously helpful:
> 1575385919.6406.core: ELF 64-bit LSB core file x86-64, vers...
- 07:18 PM Bug #43110 (Duplicate): rados/test.sh failure: ceph_test_rados_api_watch_notify_pp
- I noticed this in a branch of my own, but it appears to be showing up in the master smoke tests too.
rados/test.sh...
- 05:25 PM Backport #43093: luminous: Improve OSDMap::calc_pg_upmaps() efficiency
- @David Does this need https://github.com/ceph/ceph/pull/31944 as well?
- 05:05 PM Backport #43093 (In Progress): luminous: Improve OSDMap::calc_pg_upmaps() efficiency
- 03:40 PM Bug #43106 (Resolved): mimic: crash in build_incremental_map_msg
- Since upgrading from 13.2.6 to 13.2.7 we get this around once per 10 minutes on a cluster with 500 out of 1500 OSDs u...
- 03:27 PM Bug #38330: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
- https://tracker.ceph.com/issues/38282 was backported to mimic in 13.2.7.
Does this need a backport also ?
(we ha...
- 09:53 AM Bug #42961: osd: increase priority in certain OSD perf counters
- Neha Ojha wrote:
> Ernesto, while we are at it, are there any other specific stats that you've gotten requests for?
...
- 02:00 AM Bug #42961 (Fix Under Review): osd: increase priority in certain OSD perf counters
- Ernesto, while we are at it, are there any other specific stats that you've gotten requests for?
- 09:13 AM Backport #43099 (Resolved): nautilus: nautilus:osd: network numa affinity not supporting subnet port
- https://github.com/ceph/ceph/pull/32843
- 02:53 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- I think we can dispense with the session put when we call 'remove_session' since we call it when we replace the sessi...
- 01:17 AM Backport #43094 (In Progress): mimic: Improve OSDMap::calc_pg_upmaps() efficiency
- 01:15 AM Backport #43092 (In Progress): nautilus: Improve OSDMap::calc_pg_upmaps() efficiency
12/02/2019
- 11:30 PM Bug #42346 (In Progress): Nearfull warnings are incorrect
Spurious nearfull warnings caused by backfill reservation mechanism during rebalancing. The nearfull ratio was com...
- 11:23 PM Bug #42718: Improve OSDMap::calc_pg_upmaps() efficiency
- https://github.com/ceph/ceph/pull/31944 is a follow-on fix for https://github.com/ceph/ceph/pull/31774
- 09:56 PM Bug #42718 (Pending Backport): Improve OSDMap::calc_pg_upmaps() efficiency
- 09:59 PM Backport #43094 (Resolved): mimic: Improve OSDMap::calc_pg_upmaps() efficiency
- https://github.com/ceph/ceph/pull/31957
- 09:58 PM Backport #43093 (Resolved): luminous: Improve OSDMap::calc_pg_upmaps() efficiency
- https://github.com/ceph/ceph/pull/31992
- 09:58 PM Backport #43092 (Resolved): nautilus: Improve OSDMap::calc_pg_upmaps() efficiency
- https://github.com/ceph/ceph/pull/31956
- 06:56 PM Bug #42411 (Pending Backport): nautilus:osd: network numa affinity not supporting subnet port
- 05:54 AM Bug #41313: PG distribution completely messed up since Nautilus
- This happens with active PG balancer if the cluster is in WARN state.
...
51 hdd 9.09470 1.00000 9.1 TiB 5.8 ...
- 02:11 AM Bug #42102: use-after-free in Objecter timer handling
- ...
12/01/2019
- 01:23 AM Bug #43067 (New): Git Master: src/compressor/zlib/ZlibCompressor.cc / src/compressor/zlib/CMakeLi...
- When Ceph is built without support for CPU feature SSE4_1 (HAVE_INTEL_SSE4_1), the CMake build system does not link ...
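A hedged sketch of how such a build might be exercised (the compiler flags are illustrative; core2 is a CPU target without SSE4.1, so HAVE_INTEL_SSE4_1 ends up unset):
cmake -DCMAKE_C_FLAGS='-march=core2' -DCMAKE_CXX_FLAGS='-march=core2' ..   # configure without SSE4.1 support
make                                                                        # the zlib compressor link step is where the report says it fails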
11/29/2019
- 06:01 PM Bug #42780: recursive lock of OpTracker::lock (70)
- I will be working on this bug after returning from PTO (ETA: 16 Dec 2019).
- 05:41 PM Bug #42780: recursive lock of OpTracker::lock (70)
- The problem comes from OSD::get_health_metrics(), where the visitor lambda is holding the lock and also drops a refere...
11/27/2019
- 08:40 PM Bug #43048: nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
- https://www.spinics.net/lists/ceph-users/msg54910.html - could also be related.
- 08:36 PM Bug #43048 (Won't Fix - EOL): nautilus: upgrade/mimic-x/stress-split: failed to recover before ti...
- ...
- 04:52 PM Bug #24419: ceph-objectstore-tool unable to open mon store
- To be clear, this isn't an issue in mimic or later releases.
- 04:50 PM Bug #24419 (Won't Fix): ceph-objectstore-tool unable to open mon store
- It looks like this is due to bluestore setting the rocksdb_db_paths config option in luminous. This causes the ceph-o...
11/26/2019
- 10:18 PM Bug #42978 (Resolved): ops waiting for lock not requeued; client sees misordering
- 10:16 PM Bug #42012 (Resolved): mon osd_snap keys grow unbounded
- 09:13 AM Backport #42258 (In Progress): mimic: document new option mon_max_pg_per_osd
- 05:18 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- Theory:
In Monitor::_ms_dispatch() when we detect a feature change we end up with the following sequence.
Monit...
- 03:45 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- ...
- 03:20 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- I'm wondering if maybe this happens due to the feature change and the session being removed during the upgrade proces...
11/25/2019
- 07:32 PM Bug #42012 (Fix Under Review): mon osd_snap keys grow unbounded
- 07:29 PM Bug #42012 (In Progress): mon osd_snap keys grow unbounded
- Okay, in octopus, there are now 2 sets of keys
- purged_snap_*: map intervals of snaps that are purged. adjacent r...
- 07:16 PM Bug #42978 (Fix Under Review): ops waiting for lock not requeued; client sees misordering
- 03:35 PM Feature #39066 (Resolved): src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 02:58 PM Feature #39066: src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
- Rejecting luminous backport - luminous is EOL.
- 03:33 PM Bug #40910 (Resolved): mon/OSDMonitor.cc: better error message about min_size
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 02:57 PM Bug #40910: mon/OSDMonitor.cc: better error message about min_size
- Rejecting luminous backport - luminous is EOL.
- 03:33 PM Bug #41017 (Resolved): Change default for bluestore_fsck_on_mount_deep as false
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 02:56 PM Bug #41017: Change default for bluestore_fsck_on_mount_deep as false
- Rejecting luminous backport - luminous is EOL.
- 03:29 PM Bug #42933 (Rejected): LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify2/1
- we reverted, see https://github.com/ceph/ceph/pull/31790
- 03:23 PM Backport #38205 (In Progress): luminous: osds allows to partially start more than N+2
- 03:19 PM Backport #40947 (In Progress): luminous: Better default value for osd_snap_trim_sleep
- 03:13 PM Backport #41730 (Need More Info): luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(pe...
- 03:05 PM Backport #41730 (In Progress): luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_...
- 02:57 PM Backport #39381 (Rejected): luminous: src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
- 02:57 PM Backport #40941 (Rejected): luminous: mon/OSDMonitor.cc: better error message about min_size
- 02:56 PM Backport #41085 (Rejected): luminous: Change default for bluestore_fsck_on_mount_deep as false
- 09:44 AM Backport #42998 (Resolved): mimic: acting_recovery_backfill won't catch all up peers
- https://github.com/ceph/ceph/pull/33324
- 09:43 AM Backport #42997 (Resolved): nautilus: acting_recovery_backfill won't catch all up peers
- https://github.com/ceph/ceph/pull/32064
- 09:43 AM Backport #42996 (Rejected): luminous: acting_recovery_backfill won't catch all up peers
- https://github.com/ceph/ceph/pull/33326
- 06:56 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- I'm interested to see more cores in case one sheds more light. I've started a few runs in the hope they will fail.
- 05:14 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- In both cases so far the Message type appears to be MSG_MON_PAXOS and priority is CEPH_MSG_PRIO_HIGH....
- 03:40 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- Neha sent me another instance of this issue available at http://pulpito.ceph.com/nojha-2019-11-22_18:41:03-rados:upgr...
- 12:58 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- ...
- 05:27 AM Bug #42971: mgr hangs with upmap balancer
- So I wrote my own upmap balancer this weekend and after running it for a bit I found the same problem. It appears th...
- 04:28 AM Backport #42662 (In Progress): nautilus:Issue a HEALTH_WARN when a Pool is configured with [min_]...
11/24/2019
- 06:17 PM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- /a/sage-2019-11-24_06:32:18-rados-wip-sage-testing-2019-11-23-2031-distro-basic-smithi/4538572...
- 04:58 PM Bug #42577 (Pending Backport): acting_recovery_backfill won't catch all up peers
- 04:57 PM Bug #42782: nautilus: rados/test_librados_build.sh build failure
- https://github.com/ceph/ceph/pull/31693
11/22/2019
- 10:22 PM Bug #42975 (Duplicate): out of order ops in rados/upgrade/nautilus-x-singleton
- 05:49 PM Bug #42975: out of order ops in rados/upgrade/nautilus-x-singleton
- Another out of order bug https://tracker.ceph.com/issues/42328.
- 05:47 PM Bug #42975 (Duplicate): out of order ops in rados/upgrade/nautilus-x-singleton
- ...
- 09:36 PM Bug #42968 (Duplicate): TestClsRbd.mirror_image_status failure during luminous->nautilus upgrade
- Duplicating to https://tracker.ceph.com/issues/42891 as its the same issue.
- 03:06 PM Bug #42968 (Duplicate): TestClsRbd.mirror_image_status failure during luminous->nautilus upgrade
- Run: http://pulpito.ceph.com/teuthology-2019-11-22_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Jobs:'4...
- 07:44 PM Bug #42978: ops waiting for lock not requeued; client sees misordering
- reproduces with suite: rados:upgrade:nautilus-x-singleton
filter: '0-cluster/{openstack.yaml start.yaml} 1-install...
- 07:09 PM Bug #42978: ops waiting for lock not requeued; client sees misordering
- ok, 99% sure the problem is this bit of code in release_object_locks()...
- 07:05 PM Bug #42978 (Resolved): ops waiting for lock not requeued; client sees misordering
- a ceph_test_rados sequence of ops come in, but replies go back out of order...
- 06:44 PM Bug #42977 (Resolved): mon/Elector.cc: FAILED ceph_assert(m->epoch == get_epoch())
- ...
- 04:37 PM Bug #42971: mgr hangs with upmap balancer
- We are using device classes.
- 04:28 PM Bug #42971: mgr hangs with upmap balancer
- Hey Bryan, David's been fixing a couple issues in the balancer that sound like what you're running into:
1) https:...
- 04:15 PM Bug #42971 (New): mgr hangs with upmap balancer
- On multiple clusters we are seeing the mgr hang frequently when the balancer is enabled. It seems that the balancer ...
- 01:24 PM Bug #42964 (Resolved): monitor config store: Deleting logging config settings does not decrease l...
- How to reproduce:
1. increase log level of mds:
ceph config set mds debug_mds 10/10
2. try to revert this:
ce...
- 11:22 AM Bug #42477: Rados should use the '-o outfile' convention
- @Nathan, I think that's the right decision in this case mate. It should be less disruptive hopefully.
- 08:45 AM Bug #42477: Rados should use the '-o outfile' convention
- @Brad - got it, thanks. So, the issue is fixed as of Octopus and the fix will not be backported for the reason you st...
- 08:44 AM Bug #42477 (Resolved): Rados should use the '-o outfile' convention
- 11:16 AM Bug #42961 (Resolved): osd: increase priority in certain OSD perf counters
- There are reports from users missing stats in dashboard/prometheus mgr modules about the following perf counters:
<p...
11/21/2019
- 03:56 PM Bug #42933 (Rejected): LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify2/1
- ...
- 03:23 PM Bug #42918: memory corruption and lockups with I-Object
- I managed to grab the stack traces from when it locks up instead of crashing -- also around watch/notify in the face ...
- 03:18 PM Bug #42918: memory corruption and lockups with I-Object
- Ilya Dryomov wrote:
> Haven't tried without failure injection yet, but it's probably related to ms_inject_socket_fai...
- 02:37 PM Bug #42918: memory corruption and lockups with I-Object
- one segfault related to watch/notify is fixed in https://github.com/ceph/ceph/pull/31768, but testing in the rgw suit...
- 01:31 PM Bug #42918: memory corruption and lockups with I-Object
- Haven't tried without failure injection yet, but it's probably related to ms_inject_socket_failures (and resulting wa...
- 01:28 PM Bug #42918: memory corruption and lockups with I-Object
- @Ilya: does it reproduce when you have injected socket failures disabled? From your initial logs and from the backtra...
- 01:25 PM Bug #42918: memory corruption and lockups with I-Object
- Excellent sleuthing -- thanks! I am going to bump this over to the RADOS project since I can't see how this is purely...
- 11:56 AM Bug #42918: memory corruption and lockups with I-Object
- Got actionable stack traces on 669453138d89:...
- 10:42 AM Bug #42918: memory corruption and lockups with I-Object
- Looks real and seems to be introduced with I-Object: no issues with 36f5fcbb97eb ("Merge PR #31672 into master") and ...
- 05:26 AM Bug #42718: Improve OSDMap::calc_pg_upmaps() efficiency
The rules-based pool groups being passed to calc_pg_upmaps() are the better method, so we don't want to revert.
try...
- 01:58 AM Backport #41531 (Resolved): nautilus: Move bluefs alloc size initialization log message to log le...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30229
m...
11/20/2019
- 11:02 PM Bug #42921 (Can't reproduce): osd: segmentation fault in PGLog::check
- ...
- 10:21 PM Bug #42918: memory corruption and lockups with I-Object
- The stack is corrupt in both:...
- 10:11 PM Bug #42918 (Closed): memory corruption and lockups with I-Object
- http://pulpito.ceph.com/dis-2019-11-20_20:35:04-krbd-master-wip-krbd-readonly-basic-mira/4526411...
- 10:06 PM Bug #42783 (Resolved): test failure: due to client closed connection
- 08:07 PM Backport #41531: nautilus: Move bluefs alloc size initialization log message to log level 1
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30229
merged
- 05:47 PM Bug #42782 (Resolved): nautilus: rados/test_librados_build.sh build failure
- 02:54 PM Bug #42906 (Resolved): ceph-mon --mkfs: public_address type (v1|v2) is not respected
- When calling `ceph-mon --mkfs ... --public_address v1:<ip_address>:<random_port>`
the `v1:` type is ignored and the ...
- 08:07 AM Backport #42796 (Resolved): luminous: unnecessary error message "calc_pg_upmaps failed to build o...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31598
m...
- 02:35 AM Bug #42890 (New): Deadlock occurs when exiting with dpdkstack
- exit() will call pthread_cond_destroy attempting to destroy dpdk::eal::cond
upon which other threads are currently b... - 02:14 AM Bug #40367: "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-nautilus
- /a/sage-2019-11-19_05:29:27-rados-wip-sage-testing-2019-11-18-1656-distro-basic-smithi/4522662
11/19/2019
- 09:45 PM Backport #42796: luminous: unnecessary error message "calc_pg_upmaps failed to build overfull/und...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31598
merged
- 03:24 PM Bug #42884: OSDMapTest.CleanPGUpmaps failure
- not reproducible locally. i am testing 9b61479da4f89014b6d1857287102bbc9db13e6e
- 02:00 PM Bug #42884 (New): OSDMapTest.CleanPGUpmaps failure
- During a make check build for a PR, I got this crash during OSD testing:
https://jenkins.ceph.com/job/ceph-pull-re...
- 12:55 PM Backport #42846 (In Progress): nautilus: src/msg/async/net_handler.cc: Fix compilation
- 12:50 PM Backport #42739 (In Progress): nautilus: scrub object count mismatch on device_health_metrics pool
- 09:01 AM Bug #41669 (Resolved): Make dumping of reservation info congruent between scrub and recovery
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:01 AM Bug #41924 (Resolved): asynchronous recovery can not function under certain circumstances
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:00 AM Bug #42015 (Resolved): Remove unused full and nearful output from OSDMap summary
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:00 AM Backport #42879 (Resolved): mimic: ceph_test_admin_socket_output fails in rados qa suite
- https://github.com/ceph/ceph/pull/33323
- 09:00 AM Backport #42878 (Resolved): nautilus: ceph_test_admin_socket_output fails in rados qa suite
- https://github.com/ceph/ceph/pull/32063
- 08:54 AM Backport #41785 (Resolved): nautilus: Make dumping of reservation info congruent between scrub an...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31444
m...
- 08:39 AM Backport #42141 (Resolved): nautilus: asynchronous recovery can not function under certain circum...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31077
m...
- 08:39 AM Backport #42136 (Resolved): nautilus: Remove unused full and nearful output from OSDMap summary
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30900
m...
- 06:25 AM Bug #42387 (Pending Backport): ceph_test_admin_socket_output fails in rados qa suite
11/18/2019
- 09:45 PM Bug #42830: problem returning mon to cluster
- I forgot, Ceph is at version 14.2.1 on our side.
- 09:44 PM Bug #42830: problem returning mon to cluster
- We encountered the same problem last week, after stopping a monitor service on a server on the cluster, trying to sta...
- 09:13 PM Bug #42387 (In Progress): ceph_test_admin_socket_output fails in rados qa suite
- 09:09 AM Bug #42387 (Fix Under Review): ceph_test_admin_socket_output fails in rados qa suite
- 08:14 PM Bug #42782: nautilus: rados/test_librados_build.sh build failure
- https://github.com/ceph/ceph/pull/31604 merged
- 08:13 PM Backport #41785: nautilus: Make dumping of reservation info congruent between scrub and recovery
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31444
merged
- 08:08 PM Backport #42141: nautilus: asynchronous recovery can not function under certain circumstances
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31077
merged
- 08:05 PM Backport #42136: nautilus: Remove unused full and nearful output from OSDMap summary
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30900
merged
- 08:29 AM Bug #42592 (Duplicate): ceph-mon/mgr PGstat Segmentation Fault
- 07:43 AM Bug #42577 (Fix Under Review): acting_recovery_backfill won't catch all up peers
- 07:14 AM Bug #42861 (Fix Under Review): Libceph-common.so needs to use private link attribute when includi...
- Libceph-common.so does not specify a link attribute containing the dpdk library,
dpdk global variables and function...
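A hedged way to check whether dpdk symbols end up exported from the shared library (the installed path is illustrative; dpdk's public symbols use the rte_ prefix):
nm -D /usr/lib64/ceph/libceph-common.so.0 | grep ' rte_'   # any hits suggest dpdk symbols leak out of libceph-common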
11/17/2019
- 10:27 PM Bug #42477: Rados should use the '-o outfile' convention
- Note that the reason I did not set this for backport is that it has the potential to break existing scripts and funct...
- 06:02 PM Bug #42082 (Resolved): pybind/rados: set_omap() crash on py3
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 06:00 PM Backport #42853 (Resolved): nautilus: format error: ceph osd stat --format=json
- https://github.com/ceph/ceph/pull/32062
- 06:00 PM Backport #42852 (Resolved): mimic: format error: ceph osd stat --format=json
- https://github.com/ceph/ceph/pull/33322
- 05:59 PM Backport #42848 (Rejected): nautilus: "failing miserably..." in Infiniband.cc
- 05:59 PM Backport #42847 (Rejected): mimic: "failing miserably..." in Infiniband.cc
- 05:59 PM Backport #42846 (Resolved): nautilus: src/msg/async/net_handler.cc: Fix compilation
- https://github.com/ceph/ceph/pull/31736
- 05:53 PM Bug #42821 (Pending Backport): src/msg/async/net_handler.cc: Fix compilation
- 04:08 PM Bug #42845 (New): CVE-2019-14818
- https://nvd.nist.gov/vuln/detail/CVE-2019-14818 affects many versions of `dpdk`.
In ceph, you appear to bundle ver...
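A hedged way to see which dpdk version a ceph checkout bundles (src/dpdk is the submodule path in the ceph git repository):
git submodule status src/dpdk      # shows the pinned submodule commit
git -C src/dpdk describe --tags    # maps that commit to a dpdk release tag, if tags are fetched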
11/16/2019
- 05:16 PM Bug #42742 (Pending Backport): "failing miserably..." in Infiniband.cc
- 05:12 PM Bug #42477 (Pending Backport): Rados should use the '-o outfile' convention
- 05:09 PM Bug #42501 (Pending Backport): format error: ceph osd stat --format=json
- 05:02 PM Bug #42501 (Resolved): format error: ceph osd stat --format=json
- 04:55 PM Bug #42689 (Duplicate): nautilus mon/mgr: ceph status:pool number display is not right
- 06:38 AM Bug #42332 (Resolved): CephContext::CephContextServiceThread might pause for 5 seconds at shutdown
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 06:38 AM Bug #42360 (Resolved): python3-cephfs should provide python36-cephfs
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 06:34 AM Backport #42363 (Resolved): nautilus: python3-cephfs should provide python36-cephfs
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30983
m...
- 06:33 AM Backport #42395 (Resolved): nautilus: CephContext::CephContextServiceThread might pause for 5 sec...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31097
m...
11/15/2019
- 10:56 PM Documentation #12059: rados/troubleshooting/troubleshooting-mon: spurious `` in titles
- Looks like this was fixed by https://github.com/ceph/ceph/pull/5004, not https://github.com/ceph/ceph/pull/4988.
- 10:53 PM Documentation #12059 (Resolved): rados/troubleshooting/troubleshooting-mon: spurious `` in titles
- 08:37 AM Documentation #12059: rados/troubleshooting/troubleshooting-mon: spurious `` in titles
- This issue has already been resolved. Updated the pull request ID for the same.
- 10:39 PM Backport #42363: nautilus: python3-cephfs should provide python36-cephfs
- Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/30983
merged
- 10:38 PM Backport #42395: nautilus: CephContext::CephContextServiceThread might pause for 5 seconds at shu...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31097
merged
- 09:46 PM Bug #42824 (Resolved): mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
- 05:58 PM Bug #42824 (Fix Under Review): mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
- 05:39 AM Bug #42824: mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
- ...
- 07:31 AM Bug #42830 (New): problem returning mon to cluster
- as discussed on the list, here https://www.spinics.net/lists/ceph-users/msg55977.html
After rebooting one of the n...
- 03:42 AM Bug #42821 (Fix Under Review): src/msg/async/net_handler.cc: Fix compilation
11/14/2019
- 09:58 PM Bug #42824 (Resolved): mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
- The assert got added recently https://github.com/ceph/ceph/commit/43662bd4266a4fcc8db62f4bc9beb0de78eef355#diff-431a9...
- 09:37 PM Bug #22656: scrub mismatch on bytes (cache pools)
- mimic with cache pools: /a/yuriw-2019-11-09_18:55:35-rados-wip-yuri-mimic_13.2.7_RC2-distro-basic-smithi/4489809/
...
- 07:33 PM Bug #42821 (Resolved): src/msg/async/net_handler.cc: Fix compilation
- On a Cray system I'm working on, it seems that SO_PRIORITY is defined, but IPTOS_CLASS_CS6 is not. Without this patch...
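A hedged way to check whether the build platform defines IPTOS_CLASS_CS6 at all (the header path is the usual glibc location and may differ on Cray):
grep -n 'IPTOS_CLASS_CS6' /usr/include/netinet/ip.h || echo 'IPTOS_CLASS_CS6 not defined on this system'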
- 05:17 PM Backport #41092: nautilus: rocksdb: enable rocksdb_rmrange=true by default and make delete range ...
- partially reverted by https://github.com/ceph/ceph/pull/31612
- 04:44 PM Bug #37875: osdmaps aren't being cleaned up automatically on healthy cluster
- I think this is when this started:...
- 03:46 PM Bug #37875: osdmaps aren't being cleaned up automatically on healthy cluster
- I may have found more about what's causing this.
I have a cluster with ~1600 uncleaned maps. ...
- 04:14 PM Bug #42783 (Fix Under Review): test failure: due to client closed connection
- This is just the 'mon tell' implementation sucking up through nautilus. it's rewritten to not suck for octopus.
F...
- 03:54 PM Bug #24531 (Resolved): Mimic MONs have slow/long running ops
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:54 PM Bug #37654 (Resolved): FAILED ceph_assert(info.history.same_interval_since != 0) in PG::start_pee...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:53 PM Bug #38483 (Resolved): FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_spl...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:53 PM Feature #39162 (Resolved): Improvements to standalone tests.
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:52 PM Bug #40287 (Resolved): OSDMonitor: missing `pool_id` field in `osd pool ls` command
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:51 PM Bug #40620 (Resolved): Explicitly requested repair of an inconsistent PG cannot be scheduled time...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:51 PM Feature #40870 (Resolved): Implement mon_memory_target
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:50 PM Bug #41210 (Resolved): osd: failure result of do_osd_ops not logged in prepare_transaction function
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:49 PM Bug #41330 (Resolved): hidden corei7 requirement in binary packages
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:49 PM Bug #41816 (Resolved): Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert inf...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:48 PM Bug #42052 (Resolved): mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:48 PM Bug #42111 (Resolved): max_size from crushmap ignored when increasing size on pool
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:37 PM Backport #42547 (Resolved): nautilus: verify_upmaps can not cancel invalid upmap_items in some cases
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30899
m...
- 03:37 PM Backport #42126 (Resolved): nautilus: mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30899
m...
- 03:36 PM Backport #41238 (Resolved): nautilus: Implement mon_memory_target
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30419
m...
- 03:34 PM Backport #42326 (Resolved): nautilus: max_size from crushmap ignored when increasing size on pool
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30941
m...
- 03:34 PM Backport #42242: nautilus: Adding Placement Group id in Large omap log message
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30923
m...
- 03:33 PM Backport #40840 (Resolved): nautilus: Explicitly requested repair of an inconsistent PG cannot be...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29748
m...
- 03:32 PM Backport #41917 (Resolved): nautilus: osd: failure result of do_osd_ops not logged in prepare_tra...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30546
m...
- 03:32 PM Backport #39517 (Resolved): nautilus: Improvements to standalone tests.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30528
m...
- 03:31 PM Backport #42014 (Resolved): nautilus: Enable auto-scaler and get src/osd/PeeringState.cc:3671: fa...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30528
m...
- 03:30 PM Backport #41921 (Resolved): nautilus: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30486
m...
- 03:30 PM Backport #41862 (Resolved): nautilus: Mimic MONs have slow/long running ops
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30480
m...
- 03:30 PM Backport #41712 (Resolved): nautilus: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::regist...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30371
m...
- 03:30 PM Backport #41640 (Resolved): nautilus: FAILED ceph_assert(info.history.same_interval_since != 0) i...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30280
m...
- 12:36 PM Backport #41350 (Resolved): nautilus: hidden corei7 requirement in binary packages
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29772
m...
- 10:05 AM Bug #42810 (Resolved): ceph config rm does not revert debug_mon to default
- When we increase the debug_mon level to 10, then rm the setting, the effective log level is stuck at 10 in the ceph-m...
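A minimal reproduction sketch of the behaviour described above, assuming a test cluster; the monitor name mon.a is illustrative:
ceph config set mon debug_mon 10          # raise the mon debug level via the central config store
ceph config rm mon debug_mon              # remove the override again
ceph daemon mon.a config get debug_mon    # reportedly still shows 10 instead of the compiled-in default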
- 10:03 AM Feature #41359 (Resolved): Adding Placement Group id in Large omap log message
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 04:54 AM Bug #42387: ceph_test_admin_socket_output fails in rados qa suite
- As suspected this is due to the length of time the 'bench' command takes to run and return a result to the socket.
...
11/13/2019
- 10:44 PM Bug #42328: osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- /a/trociny-2019-10-15_07:49:13-rbd-master-distro-basic-smithi/4414497/
- 10:32 PM Bug #42503 (Closed): There are a lot of OSD downturns on this node. After PG is redistributed, a ...
- Yes, sometimes CRUSH selection fails when you have a very small number of choices compared to the number of required ...
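A hedged way to check a map offline for such failed selections (the file name and replica count are illustrative):
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --num-rep 3 --show-bad-mappings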
- 10:27 PM Bug #42529 (Closed): memory bloat + OSD process crash
- 10:26 PM Bug #42577 (Rejected): acting_recovery_backfill won't catch all up peers
- Xie, feel free to reopen it with more explanation, if you still think this is a problem.
- 10:07 PM Backport #42242 (Resolved): nautilus: Adding Placement Group id in Large omap log message
- 08:31 PM Backport #42547: nautilus: verify_upmaps can not cancel invalid upmap_items in some cases
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30899
merged
- 08:30 PM Backport #42126: nautilus: mgr/balancer FAILED ceph_assert(osd_weight.count(i.first))
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30899
merged
- 08:25 PM Backport #41238: nautilus: Implement mon_memory_target
- Sridhar Seshasayee wrote:
> https://github.com/ceph/ceph/pull/30419
merged
- 08:21 PM Backport #42326: nautilus: max_size from crushmap ignored when increasing size on pool
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30941
merged
- 08:20 PM Feature #41359: Adding Placement Group id in Large omap log message
- https://github.com/ceph/ceph/pull/30923 merged
- 08:13 PM Backport #40840: nautilus: Explicitly requested repair of an inconsistent PG cannot be scheduled ...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29748
merged
- 08:09 PM Backport #41917: nautilus: osd: failure result of do_osd_ops not logged in prepare_transaction fu...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30546
merged
- 08:09 PM Backport #39517: nautilus: Improvements to standalone tests.
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30528
merged
- 08:09 PM Backport #42014: nautilus: Enable auto-scaler and get src/osd/PeeringState.cc:3671: failed assert...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30528
merged
- 08:07 PM Backport #41921: nautilus: OSDMonitor: missing `pool_id` field in `osd pool ls` command
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30486
merged
- 08:07 PM Backport #41862: nautilus: Mimic MONs have slow/long running ops
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30480
merged
- 08:06 PM Backport #41712: nautilus: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30371
merged
- 08:06 PM Backport #41640: nautilus: FAILED ceph_assert(info.history.same_interval_since != 0) in PG::start...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30280
merged
- 07:14 PM Bug #42783: test failure: due to client closed connection
- Marking this high, since this is showing up a lot on nautilus.
- 03:32 AM Bug #42783: test failure: due to client closed connection
- the connections were closed by the client repeatedly. I wonder if it's expected: we have "msgr-failures/fastclose.yaml". an...
- 03:29 AM Bug #42783 (Resolved): test failure: due to client closed connection
- on client side:...
- 04:45 PM Backport #41350: nautilus: hidden corei7 requirement in binary packages
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29772
merged
- 01:46 PM Bug #42782: nautilus: rados/test_librados_build.sh build failure
- Note: when octopus is split off from master, it will start having this same problem.
- 01:46 PM Bug #42782: nautilus: rados/test_librados_build.sh build failure
- mimic and luminous already have this fix. It was backported from master before the nautilus release.
- 01:26 PM Bug #42782 (Fix Under Review): nautilus: rados/test_librados_build.sh build failure
- 12:41 PM Backport #42798 (In Progress): mimic: unnecessary error message "calc_pg_upmaps failed to build o...
- 12:25 PM Backport #42798 (Resolved): mimic: unnecessary error message "calc_pg_upmaps failed to build over...
- https://github.com/ceph/ceph/pull/31957
- 12:41 PM Backport #42797 (In Progress): nautilus: unnecessary error message "calc_pg_upmaps failed to buil...
- 12:25 PM Backport #42797 (Resolved): nautilus: unnecessary error message "calc_pg_upmaps failed to build o...
- https://github.com/ceph/ceph/pull/31956
- 12:40 PM Backport #42796 (In Progress): luminous: unnecessary error message "calc_pg_upmaps failed to buil...
- 12:24 PM Backport #42796 (Resolved): luminous: unnecessary error message "calc_pg_upmaps failed to build o...
- https://github.com/ceph/ceph/pull/31598
- 12:26 PM Bug #41680 (Resolved): Removed OSDs with outstanding peer failure reports crash the monitor
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:18 PM Backport #41695: nautilus: Network ping monitoring
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30195
m...
- 03:33 AM Backport #41695 (Resolved): nautilus: Network ping monitoring
- 12:17 PM Backport #42152 (Resolved): nautilus: Removed OSDs with outstanding peer failure reports crash th...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30904
m...
- 07:59 AM Bug #42387: ceph_test_admin_socket_output fails in rados qa suite
- It's only the bench command that causes the issue....
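For reference, the slow command can be timed by hand; a hedged sketch (osd.0 is illustrative, and the test itself drives the command over the admin socket rather than via ceph tell):
time ceph tell osd.0 bench    # the long runtime is what delays the admin socket response in the test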
- 03:39 AM Feature #40640 (Resolved): Network ping monitoring
- 03:39 AM Backport #41697: luminous: Network ping monitoring
- Backporting this requires https://github.com/ceph/ceph/pull/31277
- 03:37 AM Backport #41696: mimic: Network ping monitoring
- Backporting this requires https://github.com/ceph/ceph/pull/31275 from https://tracker.ceph.com/issues/42570
11/12/2019
- 11:40 PM Backport #41695: nautilus: Network ping monitoring
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30195
merged
- 11:40 PM Backport #42152: nautilus: Removed OSDs with outstanding peer failure reports crash the monitor
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30904
merged
- 11:32 PM Bug #42782 (Resolved): nautilus: rados/test_librados_build.sh build failure
- ...
- 08:22 PM Bug #42780 (Resolved): recursive lock of OpTracker::lock (70)
- I was testing 2 cephfs clients vs. a vstart cluster and the osd crashed....
- 06:24 PM Bug #42756 (Pending Backport): unnecessary error message "calc_pg_upmaps failed to build overfull...
- 04:31 PM Bug #41362 (Resolved): Rados bench sequential and random read: not behaving as expected when op s...
- 07:06 AM Feature #41666 (Pending Backport): Issue a HEALTH_WARN when a Pool is configured with [min_]size ...
- PR #31416 - https://github.com/ceph/ceph/pull/31416 is now merged in to master.
- 03:37 AM Bug #42387 (New): ceph_test_admin_socket_output fails in rados qa suite
- ...
11/11/2019
- 11:00 PM Feature #14865: Permit cache eviction of watched object
- See [1] for an abandoned PR
[1] http://tracker.ceph.com/issues/14865
- 10:08 PM Support #42584 (Closed): MGR error: auth: could not find secret_id=<number>
- I'm closing this as I think it got addressed on the mailing list?
- 02:39 PM Support #42584: MGR error: auth: could not find secret_id=<number>
- This error message is *not* only written to the active MGR log but to specific OSD logs, too.
- 09:45 PM Bug #42756 (Fix Under Review): unnecessary error message "calc_pg_upmaps failed to build overfull...
- 08:25 PM Bug #42756 (Resolved): unnecessary error message "calc_pg_upmaps failed to build overfull/underfull"
- After enabling ceph-mgr module balancer in upmap mode, we can see in ceph-mgr logs messages like:
-1 calc_pg_upmaps ...
- 08:00 PM Bug #41191 (Resolved): osd: scrub error on big objects; make bluestore refuse to start on big obj...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 07:58 PM Bug #41936 (Resolved): scrub errors after quick split/merge cycle
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:48 PM Bug #42501 (Fix Under Review): format error: ceph osd stat --format=json
- 02:24 PM Bug #42742 (Resolved): "failing miserably..." in Infiniband.cc
- lockdep should be initialized before creating any mutex.
as RDMA is always enabled when building ceph. and global ... - 12:52 PM Backport #41958 (Resolved): nautilus: scrub errors after quick split/merge cycle
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30643
m...
- 12:52 PM Backport #42095: nautilus: global osd crash in DynamicPerfStats::add_to_reports
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30648
m...
- 12:51 PM Backport #41920 (Resolved): nautilus: osd: scrub error on big objects; make bluestore refuse to s...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30783
m...
- 12:36 PM Backport #42739 (Resolved): nautilus: scrub object count mismatch on device_health_metrics pool
- https://github.com/ceph/ceph/pull/31735
11/09/2019
- 09:39 PM Bug #41383 (Pending Backport): scrub object count mismatch on device_health_metrics pool
- 01:22 AM Bug #42718 (Resolved): Improve OSDMap::calc_pg_upmaps() efficiency
We should eliminate the rules based pool sets being passed to calc_pg_upmaps()
Also, osdmaptool --upmap should b...
11/08/2019
- 09:50 PM Bug #42716 (Resolved): Pool creation error message is hidden on FileStore-backed pools
- When trying to create a pool with an incorrect PG number, the error message is hidden by a warning message.
os...
- 05:48 PM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- ...
- 04:11 PM Bug #42175: _txc_add_transaction error (2) No such file or directory not handled on operation 15
Seen in luminous for final point release:
http://pulpito.ceph.com/yuriw-2019-11-08_02:53:57-rados-wip-yuri8-test...
- 02:54 PM Bug #42706 (Can't reproduce): LibRadosList.EnumerateObjectsSplit fails
- ...
- 01:16 PM Bug #41891 (Resolved): global osd crash in DynamicPerfStats::add_to_reports
- 01:16 PM Backport #42095 (Resolved): nautilus: global osd crash in DynamicPerfStats::add_to_reports
- 03:59 AM Bug #42689 (Duplicate): nautilus mon/mgr: ceph status:pool number display is not right
- When I create a pool and then remove it, the pool number display is not right in the ceph status dump info.
!pool_info...
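A hedged reproduction sketch, assuming a test cluster where pool deletion is enabled; the pool name is illustrative:
ceph osd pool create testpool 8
ceph osd pool rm testpool testpool --yes-i-really-really-mean-it
ceph -s    # the pool count in the "data:" section reportedly does not drop back after removal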