Activity

From 11/15/2019 to 12/14/2019

12/14/2019

08:28 AM Documentation #41389 (In Progress): wrong datatype describing crush_rule
Deepika Upadhyay
07:21 AM Documentation #41389 (Pending Backport): wrong datatype describing crush_rule
Deepika Upadhyay
02:42 AM Documentation #41389: wrong datatype describing crush_rule
Just needs a cherry-pick of 3ed3de6c964ba998d5b18ceb997d1a6dffe355db Neha Ojha
08:26 AM Backport #43315 (In Progress): mimic:wrong datatype describing crush_rule
Deepika Upadhyay
08:02 AM Backport #43315 (Resolved): mimic:wrong datatype describing crush_rule
https://github.com/ceph/ceph/pull/32255 Deepika Upadhyay
08:24 AM Backport #43316 (In Progress): nautilus:wrong datatype describing crush_rule
Deepika Upadhyay
08:03 AM Backport #43316 (Resolved): nautilus:wrong datatype describing crush_rule
https://github.com/ceph/ceph/pull/32254 Deepika Upadhyay
02:50 AM Bug #43307 (In Progress): Remove use of rules batching for upmap balancer
David Zafman
02:49 AM Bug #43312 (In Progress): Change default upmap_max_deviation to 5
David Zafman
02:06 AM Bug #43312 (Resolved): Change default upmap_max_deviation to 5
David Zafman
12:24 AM Bug #43311 (Resolved): asynchronous recovery + backfill might spin pg undersized for a long time
When an osd that is part of current up set gets chosen as an
async_recovery_target, it gets removed from the acting ...
xie xingguo
12:16 AM Bug #43308 (In Progress): negative num_objects can set PG_STATE_DEGRADED
Neha Ojha

12/13/2019

08:40 PM Bug #40963 (Resolved): mimic: MQuery during Deleting state
Sage Weil
08:40 PM Bug #41317 (Pending Backport): PeeringState::GoClean will call purge_strays unconditionally
Sage Weil
07:47 PM Bug #43308 (Resolved): negative num_objects can set PG_STATE_DEGRADED
... Neha Ojha
07:05 PM Bug #43296: Ceph assimilate-conf results in config entries which can not be removed
Alwin from Proxmox provided a workaround, but this still appears to be a bug:
https://forum.proxmox.com/threads/ceph...
David Herselman
04:51 PM Bug #43296: Ceph assimilate-conf results in config entries which can not be removed
Setting debug_rdb to 5/5 unfortunately doesn't reveal anything:
Commands:...
David Herselman
03:37 AM Bug #43296 (Resolved): Ceph assimilate-conf results in config entries which can not be removed
We assimilated our Ceph configuration file and subsequently have a minimal config file. We are subsequently not able ... David Herselman
04:31 PM Bug #43307 (Resolved): Remove use of rules batching for upmap balancer
Due to the cost of calculations for very large PG/shard counts, we will settle for balancing each pool individually for...
David Zafman
03:43 PM Bug #25174 (Can't reproduce): osd: assert failure with FAILED assert(repop_queue.front() == repop...
Neha Ojha
02:43 PM Bug #43306 (Resolved): segv in collect_sys_info
Run: http://pulpito.ceph.com/teuthology-2019-12-13_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Job: '4...
Yuri Weinstein
02:40 PM Bug #43305 (Won't Fix): "psutil.NoSuchProcess process no longer exists" error in luminous-x-nauti...
Run: http://pulpito.ceph.com/teuthology-2019-12-13_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Jobs: '...
Yuri Weinstein
08:23 AM Backport #42259 (Resolved): nautilus: document new option mon_max_pg_per_osd
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31300
m...
Nathan Cutler
08:22 AM Backport #40947 (Resolved): luminous: Better default value for osd_snap_trim_sleep
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31857
m...
Nathan Cutler
08:22 AM Backport #38205 (Resolved): luminous: osds allows to partially start more than N+2
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31858
m...
Nathan Cutler
08:22 AM Backport #43093 (Resolved): luminous: Improve OSDMap::calc_pg_upmaps() efficiency
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31992
m...
Nathan Cutler
06:17 AM Bug #40712: ceph-mon crash with assert(err == 0) after rocksdb->get
We hit this problem recently.
We are inclined to think it is related more to RocksDB than to Ceph.
huang jun

12/12/2019

04:41 PM Backport #40947: luminous: Better default value for osd_snap_trim_sleep
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31857
merged
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Yuri Weinstein
04:41 PM Backport #38205: luminous: osds allows to partially start more than N+2
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31858
merged
Yuri Weinstein
04:40 PM Backport #43093: luminous: Improve OSDMap::calc_pg_upmaps() efficiency
David Zafman wrote:
> https://github.com/ceph/ceph/pull/31992
merged
Yuri Weinstein
10:16 AM Bug #43174: pgs inconsistent, union_shard_errors=missing
Greg, thanks for the reply.
Greg Farnum wrote:
> If you fetch an object in RGW and its backing RADOS objects are m...
Aleksandr Rudenko
09:41 AM Bug #38330 (Resolved): osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:23 AM Backport #43119 (Resolved): mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/32000
m...
Nathan Cutler
08:44 AM Bug #43193: "ceph ping mon.<id>" cannot work
The command "ceph ping mon.a" or "ceph ping mon.b" or "ceph ping mon.c" works fine.
If the mon id is not specified, ...
Min Shi
05:31 AM Bug #41317 (Fix Under Review): PeeringState::GoClean will call purge_strays unconditionally
Neha Ojha
12:04 AM Bug #43267 (Rejected): unexpected error in BlueStore::_txc_add_transaction
Jeff Layton
12:02 AM Bug #43267: unexpected error in BlueStore::_txc_add_transaction
Nope, it was full. Well spotted:... Jeff Layton

12/11/2019

11:28 PM Bug #43267: unexpected error in BlueStore::_txc_add_transaction
This is caused by an out-of-space condition that won't usually happen. Check your BlueStore configuration.
Is ...
David Zafman
10:21 PM Bug #43267: unexpected error in BlueStore::_txc_add_transaction
This is simply an out-of-space condition, see:
-6> 2019-12-11T16:13:44.466-0500 7fcbe4ecd700 -1 bluestore(/build/ce...
Igor Fedotov
09:39 PM Bug #43267 (Rejected): unexpected error in BlueStore::_txc_add_transaction
I was testing kcephfs vs. a vstart cluster and the OSD crashed. fsstress was running at the time, so it was being kep... Jeff Layton
10:26 PM Bug #43268 (New): Restrict admin socket commands more from the Ceph tool
https://bugzilla.redhat.com/show_bug.cgi?id=1780458
It sounds like we've given admin socket access to any cephx us...
Greg Farnum
10:17 PM Bug #43106 (Resolved): mimic: crash in build_incremental_map_msg
Marking this resolved as all the backports are now in place. Neha Ojha
10:17 PM Bug #43174 (Closed): pgs inconsistent, union_shard_errors=missing
If you fetch an object in RGW and its backing RADOS objects are missing, it just fills in the space with zeros. It so... Greg Farnum
10:15 PM Bug #43173 (Duplicate): pgs inconsistent, union_shard_errors=missing
Neha Ojha
08:07 PM Bug #43266 (Fix Under Review): common: admin socket compiler warning
Patrick Donnelly
08:03 PM Bug #43266 (Resolved): common: admin socket compiler warning
... Patrick Donnelly
01:38 PM Backport #43257 (Resolved): mimic: monitor config store: Deleting logging config settings does no...
https://github.com/ceph/ceph/pull/33327 Nathan Cutler
01:38 PM Backport #43256 (Resolved): nautilus: monitor config store: Deleting logging config settings does...
https://github.com/ceph/ceph/pull/32846 Nathan Cutler
04:05 AM Bug #42964 (Pending Backport): monitor config store: Deleting logging config settings does not de...
Sage Weil

12/10/2019

08:44 PM Backport #40890 (In Progress): mimic: Pool settings aren't populated to OSD after restart.
Nathan Cutler
08:41 PM Backport #40891 (In Progress): nautilus: Pool settings aren't populated to OSD after restart.
Nathan Cutler
08:34 PM Backport #43246 (Resolved): nautilus: Nearfull warnings are incorrect
https://github.com/ceph/ceph/pull/32773 Nathan Cutler
08:29 PM Backport #43245 (Resolved): nautilus: osd: increase priority in certain OSD perf counters
https://github.com/ceph/ceph/pull/32845 Nathan Cutler
08:25 PM Backport #43239 (Resolved): nautilus: ok-to-stop incorrect for some ec pgs
https://github.com/ceph/ceph/pull/32844 Nathan Cutler
08:24 PM Backport #43232 (Rejected): nautilus: pgs stuck in laggy state
Nathan Cutler
04:10 PM Bug #42346 (Pending Backport): Nearfull warnings are incorrect
David Zafman
03:26 PM Bug #42961 (Pending Backport): osd: increase priority in certain OSD perf counters
Neha Ojha
02:51 PM Bug #43189 (Pending Backport): pgs stuck in laggy state
I'm not sure whether we should backport this to nautilus or not. We only noticed qa failures because the new octopus... Sage Weil
02:50 PM Bug #43189 (Resolved): pgs stuck in laggy state
Sage Weil
01:48 AM Bug #43048: nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
/a/yuriw-2019-12-06_21:30:44-upgrade:mimic-x-nautilus-distro-basic-smithi/4576681 Neha Ojha

12/09/2019

10:07 PM Bug #43067: Git Master: src/compressor/zlib/ZlibCompressor.cc / src/compressor/zlib/CMakeLists.txt
Thanks Lee!
We generally do patch contributions through Github; can you submit a PR there?
If not, we need a spec...
Greg Farnum
09:53 PM Bug #43176 (Duplicate): pgs inconsistent, union_shard_errors=missing
Nathan Cutler
09:53 PM Bug #43175 (Duplicate): pgs inconsistent, union_shard_errors=missing
Nathan Cutler
09:35 PM Bug #43151 (Pending Backport): ok-to-stop incorrect for some ec pgs
Sage Weil
04:58 PM Bug #43189 (Fix Under Review): pgs stuck in laggy state
Sage Weil
03:15 PM Bug #43189: pgs stuck in laggy state
The problem is the role. The proc_lease() method does this check... Sage Weil
02:33 PM Bug #43189 (In Progress): pgs stuck in laggy state
Sage Weil
04:50 PM Bug #43213 (New): OSDMap::pg_to_up_acting etc specify primary as osd, not pg_shard_t(osd+shard)
The OSD methods to map a PG return primary as an int, not pg_shard_t (osd + shard).
Objecter compensates for this ...
Sage Weil
04:06 PM Bug #40963: mimic: MQuery during Deleting state
/a/sage-2019-12-08_05:43:33-rados-nautilus-distro-basic-smithi/4580545 Neha Ojha
12:59 PM Backport #40890: mimic: Pool settings aren't populated to OSD after restart.
Here's my attempt at the backport: https://github.com/ceph/ceph/pull/32125 Dan van der Ster
12:53 PM Backport #40891: nautilus: Pool settings aren't populated to OSD after restart.
Here's my attempt at the backport: https://github.com/ceph/ceph/pull/32123 Dan van der Ster
08:55 AM Bug #43193 (Rejected): "ceph ping mon.<id>" cannot work
The command "ceph ping mon.<id>" returns an error output:... Min Shi
06:35 AM Bug #42706: LibRadosList.EnumerateObjectsSplit fails
rados_cluster handler will be freed if set_pg_num failed,... huang jun
03:35 AM Bug #42861: Libceph-common.so needs to use private link attribute when including dpdk static library
The dpdk library initializes the EAL using constructors and global
variables, and cannot be re-initialized. Both tes...
chunsong feng

12/08/2019

11:22 PM Bug #43190 (New): qa/standalone/osd/osd-recovery-prio.sh has a race
http://pulpito.ceph.com/dzafman-2019-12-08_11:51:45-rados-master-distro-basic-smithi/4582053/
The test expected ...
David Zafman
09:25 PM Bug #43189: pgs stuck in laggy state
more logs here:
/a/sage-2019-12-07_18:31:18-rados:thrash-erasure-code-wip-sage3-testing-2019-12-05-0959-distro-basic...
Sage Weil
09:23 PM Bug #43189 (Resolved): pgs stuck in laggy state
... Sage Weil

12/07/2019

06:28 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
Sage Weil
02:47 PM Bug #41313: PG distribution completely messed up since Nautilus
ceph balancer status
{
"active": true,
"plans": [],
"mode": "upmap"
}
bad distribution:
<p...
Anonymous
02:45 PM Bug #43185: ceph -s not showing client activity
ceph -s only looks like this:
ceph -s
cluster:
id: c4068f25-d46d-438d-af63-5679a2d56efb
health: H...
Anonymous
02:44 PM Bug #43185 (Resolved): ceph -s not showing client activity
Since the Nautilus upgrade, ceph -s often (2 out of 3 times) does not show any client or recovery activity. Right now it's... Anonymous

12/06/2019

05:21 PM Bug #42964 (Fix Under Review): monitor config store: Deleting logging config settings does not de...
Sage Weil
04:07 PM Bug #42347: nautilus assert during osd shutdown: FAILED ceph_assert((sharded_in_flight_list.back(...
Seen in this scrub test run during osd-scrub-repair.sh.
http://pulpito.ceph.com/dzafman-2019-12-05_19:53:40-rados-...
David Zafman
02:01 PM Bug #43176 (Duplicate): pgs inconsistent, union_shard_errors=missing
Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
Aleksandr Rudenko
02:01 PM Bug #43175 (Duplicate): pgs inconsistent, union_shard_errors=missing
Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
Aleksandr Rudenko
02:01 PM Bug #43174 (Resolved): pgs inconsistent, union_shard_errors=missing
Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
Aleksandr Rudenko
02:00 PM Bug #43173 (Duplicate): pgs inconsistent, union_shard_errors=missing
Hi,
Luminous 12.2.12.
2/3 OSDs - Filestore, 1/3 - Bluestore
size=3, min_size=2
Cluster used as S3 (RadosGW).
...
Aleksandr Rudenko
12:55 PM Backport #42997 (In Progress): nautilus: acting_recovery_backfill won't catch all up peers
Nathan Cutler
12:48 PM Backport #42878 (In Progress): nautilus: ceph_test_admin_socket_output fails in rados qa suite
Nathan Cutler
12:48 PM Backport #42853 (In Progress): nautilus: format error: ceph osd stat --format=json
Nathan Cutler
12:47 PM Backport #42847 (Need More Info): mimic: "failing miserably..." in Infiniband.cc
non-trivial Nathan Cutler
12:47 PM Backport #42848 (Need More Info): nautilus: "failing miserably..." in Infiniband.cc
non-trivial Nathan Cutler
04:23 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
Oops. I think the more significant issue is that short_pg_log.yaml isn't involved. David Zafman
02:09 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
David Zafman wrote:
> Seen in a non-upgrade test:
This is an upgrade test: "rados/upgrade/jewel-x-singleton/{0-c...
Neha Ojha
02:00 AM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
Seen in a -non-upgrade- test with description:
rados/upgrade/jewel-x-singleton/{0-cluster/{openstack.yaml start.ya...
David Zafman

12/05/2019

11:28 PM Bug #41240 (Can't reproduce): All of the cluster SSDs aborted at around the same time and will no...
Brad Hubbard
09:37 PM Bug #41240 (New): All of the cluster SSDs aborted at around the same time and will not start.
Patrick Donnelly
11:24 PM Bug #38892 (Closed): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Segmentation...
Brad Hubbard
09:45 PM Bug #38892 (Fix Under Review): /ceph/src/tools/kvstore_tool.cc:266:1: internal compiler error: Se...
Patrick Donnelly
09:44 PM Bug #23590 (Fix Under Review): kstore: statfs: (95) Operation not supported
Patrick Donnelly
09:44 PM Bug #23297 (Fix Under Review): mon-seesaw 'failed to become clean before timeout' due to laggy pg...
Patrick Donnelly
09:43 PM Bug #13111 (Fix Under Review): replicatedPG:the assert occurs in the fuction ReplicatedPG::on_loc...
Patrick Donnelly
09:40 PM Feature #38653 (New): Enhance health message when pool quota fills up
Patrick Donnelly
09:40 PM Bug #38783 (New): Changing mon_pg_warn_max_object_skew has no effect.
Patrick Donnelly
09:40 PM Feature #3764 (New): osd: async replicas
Patrick Donnelly
09:37 PM Bug #43048 (New): nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
Patrick Donnelly
09:37 PM Bug #42918 (New): memory corruption and lockups with I-Object
Patrick Donnelly
09:37 PM Bug #42780 (New): recursive lock of OpTracker::lock (70)
Patrick Donnelly
09:37 PM Bug #42706 (New): LibRadosList.EnumerateObjectsSplit fails
Patrick Donnelly
09:37 PM Bug #42666 (New): mgropen from mgr comes from unknown.$id instead of mgr.$id
Patrick Donnelly
09:37 PM Bug #42186 (New): "2019-10-04T19:31:51.053283+0000 osd.7 (osd.7) 108 : cluster [ERR] 2.5s0 shard ...
Patrick Donnelly
09:37 PM Bug #41406 (New): common: SafeTimer reinit doesn't fix up "stopping" bool, used in MonClient boot...
Patrick Donnelly
09:37 PM Bug #40963 (New): mimic: MQuery during Deleting state
Patrick Donnelly
06:31 PM Bug #40963: mimic: MQuery during Deleting state
yuriw-2019-12-04_22:44:10-rados-wip-yuri2-testing-2019-12-04-1938-mimic-distro-basic-smithi/4567200/
DeleteStart e...
David Zafman
09:37 PM Bug #40868 (New): src/common/config_proxy.h: 70: FAILED ceph_assert(p != obs_call_gate.end())
Patrick Donnelly
09:37 PM Bug #40820 (New): standalone/scrub/osd-scrub-test.sh +3 day failed assert
Patrick Donnelly
09:37 PM Bug #40666 (New): osd fails to get latest map
Patrick Donnelly
09:37 PM Fix #40564 (New): Objecter does not have perfcounters for op latency
Patrick Donnelly
09:37 PM Bug #40522 (New): on_local_recover doesn't touch?
Patrick Donnelly
09:37 PM Bug #40454 (New): snap_mapper error, scrub gets r -2..repaired
Patrick Donnelly
09:37 PM Bug #40521 (New): cli timeout (e.g., ceph pg dump)
Patrick Donnelly
09:37 PM Bug #40367 (New): "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-nautilus
Patrick Donnelly
09:37 PM Bug #40410 (New): ceph pg query Segmentation fault in 12.2.10
Patrick Donnelly
09:36 PM Feature #39966 (New): mon: allow log messages to be throttled and/or force trimming
Patrick Donnelly
09:36 PM Bug #40000 (New): osds do not bound xattrs and/or aggregate xattr data in pg log
Patrick Donnelly
09:36 PM Bug #39366 (New): ClsLock.TestRenew failure
Patrick Donnelly
09:36 PM Bug #39145 (New): luminous: jewel-x-singleton: FAILED assert(0 == "we got a bad state machine eve...
Patrick Donnelly
09:36 PM Bug #39148 (New): luminous: powercycle: reached maximum tries (500) after waiting for 3000 seconds
Patrick Donnelly
09:36 PM Bug #39039 (New): mon connection reset, command not resent
Patrick Donnelly
09:36 PM Fix #39071 (New): monclient: initial probe is non-optimal with v2+v1
Patrick Donnelly
09:36 PM Bug #38656 (New): scrub reservation leak?
Patrick Donnelly
09:36 PM Bug #38718 (New): 'osd crush weight-set create-compat' (and other OSDMonitor commands) can leak u...
Patrick Donnelly
09:36 PM Bug #38624 (New): crush: get_rule_weight_osd_map does not handle multi-take rules
Patrick Donnelly
09:36 PM Bug #38513 (New): luminous: "AsyncReserver.h: 190: FAILED assert(!queue_pointers.count(item) && !...
Patrick Donnelly
09:36 PM Bug #38402 (New): ceph-objectstore-tool on down osd w/ not enough in osds
Patrick Donnelly
09:36 PM Bug #38417 (New): ceph tell mon.a help timeout
Patrick Donnelly
09:36 PM Bug #38357 (New): ClsLock.TestExclusiveEphemeralStealEphemeral failed
Patrick Donnelly
09:36 PM Bug #38358 (New): short pg log + cache tier ceph_test_rados out of order reply
Patrick Donnelly
09:36 PM Bug #38195 (New): osd-backfill-space.sh exposes rocksdb hang
Patrick Donnelly
09:36 PM Bug #38345 (New): mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
Patrick Donnelly
09:36 PM Bug #38184 (New): osd: recovery does not preserve copy-on-write allocations between object clones...
Patrick Donnelly
09:36 PM Bug #38159 (New): ec does not recover below min_size
Patrick Donnelly
09:36 PM Bug #38172 (New): segv in rocksdb NewIterator
Patrick Donnelly
09:36 PM Bug #38151 (New): cephx: service ticket validity dobuled
Patrick Donnelly
09:36 PM Bug #38082 (New): mimic: mon/caps.sh fails with "Expected return 0, got 110"
Patrick Donnelly
09:36 PM Bug #38064 (New): librados::OPERATION_FULL_TRY not completely implemented, test LibRadosAio.PoolQ...
Patrick Donnelly
09:36 PM Bug #37582 (New): luminous: ceph -s client gets all mgrmaps
Patrick Donnelly
09:36 PM Bug #37532 (New): mon: expected_num_objects warning triggers on bluestore-only setups
Patrick Donnelly
09:36 PM Bug #37509 (New): require past_interval bounds mismatch due to osd oldest_map
Patrick Donnelly
09:36 PM Bug #36748 (New): ms_deliver_verify_authorizer no AuthAuthorizeHandler found for protocol 0
Patrick Donnelly
09:36 PM Bug #37289 (New): Issue with overfilled OSD for cache-tier pools
Patrick Donnelly
09:36 PM Bug #36634 (New): LibRadosWatchNotify.WatchNotify2Timeout failure
Patrick Donnelly
09:36 PM Bug #36337 (New): OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
Patrick Donnelly
09:36 PM Bug #36164 (New): cephtool/test fails 'ceph tell mon.a help' with EINTR
Patrick Donnelly
09:36 PM Bug #36113 (New): fusestore test umount failed?
Patrick Donnelly
09:36 PM Bug #35075 (New): copy-get stuck sending osd_op
Patrick Donnelly
09:36 PM Bug #36040 (New): mon: Valgrind: mon (InvalidFree, InvalidWrite, InvalidRead)
Patrick Donnelly
09:36 PM Bug #24874 (New): ec fast reads can trigger read errors in log
Patrick Donnelly
09:36 PM Bug #26891 (New): backfill reservation deadlock/stall
Patrick Donnelly
09:36 PM Bug #24242 (New): tcmalloc::ThreadCache::ReleaseToCentralCache on rhel (w/ centos packages)
Patrick Donnelly
09:36 PM Bug #24339 (New): FULL_FORCE ops are dropped if fail-safe full check fails, but not resent in sca...
Patrick Donnelly
09:36 PM Bug #23965 (New): FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cach...
Patrick Donnelly
09:36 PM Bug #23857 (New): flush (manifest) vs async recovery causes out of order op
Patrick Donnelly
09:36 PM Bug #23879 (New): test_mon_osdmap_prune.sh fails
Patrick Donnelly
09:36 PM Bug #23828 (New): ec gen object leaks into different filestore collection just after split
Patrick Donnelly
09:36 PM Bug #23760 (New): mon: `config get <who>` does not allow `who` as 'mon'/'osd'
Patrick Donnelly
09:36 PM Bug #23767 (New): "ceph ping mon" doesn't work
Patrick Donnelly
09:36 PM Bug #23270 (New): failed mutex assert in PipeConnection::try_get_pipe() (via OSD::do_command())
Patrick Donnelly
09:36 PM Bug #23428 (New): Snapset inconsistency is hard to diagnose because authoritative copy used by li...
Patrick Donnelly
09:36 PM Bug #23029 (New): osd does not handle eio on meta objects (e.g., osdmap)
Patrick Donnelly
09:36 PM Bug #22656 (New): scrub mismatch on bytes (cache pools)
Patrick Donnelly
09:36 PM Bug #21592 (New): LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
Patrick Donnelly
09:36 PM Bug #21495 (New): src/osd/OSD.cc: 346: FAILED assert(piter != rev_pending_splits.end())
Patrick Donnelly
09:36 PM Bug #21129 (New): 'ceph -s' hang
Patrick Donnelly
09:36 PM Bug #21194 (New): mon clock skew test is fragile
Patrick Donnelly
09:36 PM Bug #20960 (New): ceph_test_rados: mismatched version (due to pg import/export)
Patrick Donnelly
09:35 PM Bug #20952 (New): Glitchy monitor quorum causes spurious test failure
Patrick Donnelly
09:35 PM Bug #20922 (New): misdirected op with localize_reads set
Patrick Donnelly
09:35 PM Bug #20846 (New): ceph_test_rados_list_parallel: options dtor racing with DispatchQueue lockdep -...
Patrick Donnelly
09:35 PM Bug #20770 (New): test_pidfile.sh test is failing 2 places
Patrick Donnelly
09:35 PM Bug #20730 (New): need new OSD_SKEWED_USAGE implementation
Patrick Donnelly
09:35 PM Bug #20370 (New): leaked MOSDOp via PrimaryLogPG::_copy_some and PrimaryLogPG::do_proxy_write
Patrick Donnelly
09:35 PM Bug #20646 (New): run_seed_to_range.sh: segv, tp_fstore_op timeout
Patrick Donnelly
09:35 PM Bug #20360 (New): rados/verify valgrind tests: osds fail to start (xenial valgrind)
Patrick Donnelly
09:35 PM Bug #20369 (New): segv in OSD::ShardedOpWQ::_process
Patrick Donnelly
09:35 PM Bug #20221 (New): kill osd + osd out leads to stale PGs
Patrick Donnelly
09:35 PM Bug #20169 (New): filestore+btrfs occasionally returns ENOSPC
Patrick Donnelly
09:35 PM Bug #20053 (New): crush compile / decompile looses precision on weight
Patrick Donnelly
09:35 PM Bug #19700 (New): OSD remained up despite cluster network being inactive?
Patrick Donnelly
09:35 PM Bug #19486 (New): Rebalancing can propagate corrupt copy of replicated object
Patrick Donnelly
09:35 PM Bug #19518 (New): log entry does not include per-op rvals?
Patrick Donnelly
09:35 PM Bug #19440 (New): osd: trims maps taht pgs haven't consumed yet when there are gaps
Patrick Donnelly
09:35 PM Bug #17257 (New): ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
Patrick Donnelly
09:35 PM Bug #15015 (New): prepare_new_pool doesn't return failure string ss
Patrick Donnelly
09:35 PM Bug #14115 (New): crypto: race in nss init
Patrick Donnelly
09:35 PM Bug #13385 (New): cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final ro...
Patrick Donnelly
09:35 PM Bug #12687 (New): osd thrashing + pg import/export can cause maybe_went_rw intervals to be missed
Patrick Donnelly
09:35 PM Bug #12615 (New): Repair of Erasure Coded pool with an unrepairable object causes pg state to los...
Patrick Donnelly
09:35 PM Bug #11235 (New): test_rados.py test_aio_read is racy
Patrick Donnelly
09:35 PM Bug #9606 (New): mon: ambiguous error_status returned to user when type is wrong in a command
Patrick Donnelly
08:31 PM Bug #43151 (Fix Under Review): ok-to-stop incorrect for some ec pgs
Sage Weil
04:33 PM Bug #43151 (Resolved): ok-to-stop incorrect for some ec pgs
before,... Sage Weil
08:16 PM Backport #43119: mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/32000
merged
Yuri Weinstein
08:01 PM Backport #41238: nautilus: Implement mon_memory_target
Follow-on fix: https://github.com/ceph/ceph/pull/32045 Neha Ojha
08:00 PM Feature #40870: Implement mon_memory_target
This has a follow-on fix: https://github.com/ceph/ceph/pull/32044 Neha Ojha
06:01 PM Bug #38040: osd_map_message_max default is too high?
Luminous backport analysis:
* https://github.com/ceph/ceph/pull/26340 - two of three commits backported to luminou...
Nathan Cutler
05:50 PM Bug #43150 (In Progress): osd-scrub-snaps.sh fails
David Zafman
05:21 PM Bug #43150: osd-scrub-snaps.sh fails
During testing I saw this even though it isn't what happened in the teuthology runs. I think in all cases we have sc... David Zafman
03:51 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
/a/sage-2019-12-04_19:33:15-rados-wip-sage2-testing-2019-12-04-0856-distro-basic-smithi/4567061
/a/sage-2019-12-04_1...
Sage Weil
05:41 PM Bug #43106: mimic: crash in build_incremental_map_msg
The three PRs that need to be backported to mimic are:
* https://github.com/ceph/ceph/pull/26340 - backported to m...
Nathan Cutler
01:41 PM Backport #43140 (In Progress): nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not resp...
Nathan Cutler
11:07 AM Backport #43140 (Resolved): nautilus: ceph-mon --mkfs: public_address type (v1|v2) is not respected
https://github.com/ceph/ceph/pull/32028 Nathan Cutler
01:34 PM Bug #42485: verify_upmaps can not cancel invalid upmap_items in some cases
NOTE: https://github.com/ceph/ceph/pull/31131 was merged to master and backported to nautilus and luminous, before it... Nathan Cutler
04:04 AM Bug #42485 (Resolved): verify_upmaps can not cancel invalid upmap_items in some cases
David Zafman
01:32 PM Backport #42547: nautilus: verify_upmaps can not cancel invalid upmap_items in some cases
NOTE: reverted by https://github.com/ceph/ceph/pull/32018 Nathan Cutler
01:30 PM Backport #42548: luminous: verify_upmaps can not cancel invalid upmap_items in some cases
Note: reverted by https://github.com/ceph/ceph/pull/32019 Nathan Cutler
07:52 AM Bug #42906 (Pending Backport): ceph-mon --mkfs: public_address type (v1|v2) is not respected
Kefu Chai
06:22 AM Bug #37968 (Resolved): maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps
David Zafman
06:21 AM Backport #38163 (Resolved): mimic: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps
David Zafman
04:04 AM Backport #42546 (Rejected): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
This change has been reverted so we won't backport. David Zafman
12:43 AM Bug #43124: Probably legal crush rules cause upmaps to be cleaned
We are reverting the original pull request which changed verify_upmaps(): https://github.com/ceph/ceph/pull/31131
...
David Zafman

12/04/2019

08:51 PM Bug #43126 (Resolved): OSD_SLOW_PING_TIME_BACK nits
From Sage e-mail:
Long heartbeat ping times on back interface seen, longest is 1315.510 msec (OSD_SLOW_PING_TIME...
David Zafman
08:46 PM Bug #43124 (Resolved): Probably legal crush rules cause upmaps to be cleaned
I've seen multiple user sites with crush rules for EC pools which will trigger the verify_upmap() to detect an error.... David Zafman
08:24 PM Backport #42546 (In Progress): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
David Zafman
08:13 PM Backport #42546 (Resolved): mimic: verify_upmaps can not cancel invalid upmap_items in some cases
David Zafman
12:13 PM Bug #38330: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
@Dan, @Neha - mimic backport staged at https://github.com/ceph/ceph/pull/26448 Nathan Cutler
02:33 AM Bug #38330 (Pending Backport): osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
Based on https://tracker.ceph.com/issues/43106#note-1 and https://tracker.ceph.com/issues/38282#note-14 Neha Ojha
12:11 PM Backport #43119 (In Progress): mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map...
Nathan Cutler
12:08 PM Backport #43119 (Resolved): mimic: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
https://github.com/ceph/ceph/pull/32000 Nathan Cutler
02:30 AM Bug #43106: mimic: crash in build_incremental_map_msg
I think you are right. We should have backported all three PRs according to https://tracker.ceph.com/issues/38040#not... Neha Ojha

12/03/2019

07:37 PM Bug #43110 (Duplicate): rados/test.sh failure: ceph_test_rados_api_watch_notify_pp
https://tracker.ceph.com/issues/42933 Greg Farnum
07:27 PM Bug #43110: rados/test.sh failure: ceph_test_rados_api_watch_notify_pp
Neha pointed out the core info is obviously helpful:
> 1575385919.6406.core: ELF 64-bit LSB core file x86-64, vers...
Greg Farnum
07:18 PM Bug #43110 (Duplicate): rados/test.sh failure: ceph_test_rados_api_watch_notify_pp
I noticed this in a branch of my own, but it appears to be showing up in the master smoke tests too.
rados/test.sh...
Greg Farnum
05:25 PM Backport #43093: luminous: Improve OSDMap::calc_pg_upmaps() efficiency
@David Does this need https://github.com/ceph/ceph/pull/31944 as well? Nathan Cutler
05:05 PM Backport #43093 (In Progress): luminous: Improve OSDMap::calc_pg_upmaps() efficiency
David Zafman
03:40 PM Bug #43106 (Resolved): mimic: crash in build_incremental_map_msg
Since upgrading from 13.2.6 to 13.2.7 we get this around once per 10 minutes on a cluster with 500 out of 1500 OSDs u... Dan van der Ster
03:27 PM Bug #38330: osd/OSD.cc: 1515: abort() in Service::build_incremental_map_msg
https://tracker.ceph.com/issues/38282 was backported to mimic in 13.2.7.
Does this need a backport also?
(we ha...
Dan van der Ster
09:53 AM Bug #42961: osd: increase priority in certain OSD perf counters
Neha Ojha wrote:
> Ernesto, while we are at it, are there any other specific stats that you've gotten requests for?
...
Ernesto Puerta
02:00 AM Bug #42961 (Fix Under Review): osd: increase priority in certain OSD perf counters
Ernesto, while we are at it, are there any other specific stats that you've gotten requests for? Neha Ojha
09:13 AM Backport #43099 (Resolved): nautilus: nautilus:osd: network numa affinity not supporting subnet port
https://github.com/ceph/ceph/pull/32843 Nathan Cutler
02:53 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
I think we can dispense with the session put when we call 'remove_session' since we call it when we replace the sessi... Brad Hubbard
01:17 AM Backport #43094 (In Progress): mimic: Improve OSDMap::calc_pg_upmaps() efficiency
David Zafman
01:15 AM Backport #43092 (In Progress): nautilus: Improve OSDMap::calc_pg_upmaps() efficiency
David Zafman

12/02/2019

11:30 PM Bug #42346 (In Progress): Nearfull warnings are incorrect
Spurious nearfull warnings caused by backfill reservation mechanism during rebalancing. The nearfull ratio was com...
David Zafman
11:23 PM Bug #42718: Improve OSDMap::calc_pg_upmaps() efficiency
https://github.com/ceph/ceph/pull/31944 is a follow-on fix for https://github.com/ceph/ceph/pull/31774 Neha Ojha
09:56 PM Bug #42718 (Pending Backport): Improve OSDMap::calc_pg_upmaps() efficiency
David Zafman
09:59 PM Backport #43094 (Resolved): mimic: Improve OSDMap::calc_pg_upmaps() efficiency
https://github.com/ceph/ceph/pull/31957 David Zafman
09:58 PM Backport #43093 (Resolved): luminous: Improve OSDMap::calc_pg_upmaps() efficiency
https://github.com/ceph/ceph/pull/31992 David Zafman
09:58 PM Backport #43092 (Resolved): nautilus: Improve OSDMap::calc_pg_upmaps() efficiency
https://github.com/ceph/ceph/pull/31956 David Zafman
06:56 PM Bug #42411 (Pending Backport): nautilus:osd: network numa affinity not supporting subnet port
Sage Weil
05:54 AM Bug #41313: PG distribution completely messed up since Nautilus
This happens with active PG balancer if the cluster is in WARN state.
...
51 hdd 9.09470 1.00000 9.1 TiB 5.8 ...
Anonymous
02:11 AM Bug #42102: use-after-free in Objecter timer handing
... Sage Weil

12/01/2019

01:23 AM Bug #43067 (New): Git Master: src/compressor/zlib/ZlibCompressor.cc / src/compressor/zlib/CMakeLi...
When Ceph is built without support for CPU feature SSE4_1 (HAVE_INTEL_SSE4_1), the CMake build system does not link ... Lee Leahu

11/29/2019

06:01 PM Bug #42780: recursive lock of OpTracker::lock (70)
I will be working on this bug after returning from PTO (ETA: 16 Dec 2019). Radoslaw Zarzynski
05:41 PM Bug #42780: recursive lock of OpTracker::lock (70)
The problem comes from OSD::get_health_metrics(), where the visitor lambda is holding the lock and also drops a refer... Sage Weil

11/27/2019

08:40 PM Bug #43048: nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
https://www.spinics.net/lists/ceph-users/msg54910.html - could also be related. Neha Ojha
08:36 PM Bug #43048 (Won't Fix - EOL): nautilus: upgrade/mimic-x/stress-split: failed to recover before ti...
... Neha Ojha
04:52 PM Bug #24419: ceph-objectstore-tool unable to open mon store
To be clear, this isn't an issue in mimic or later releases. Josh Durgin
04:50 PM Bug #24419 (Won't Fix): ceph-objectstore-tool unable to open mon store
It looks like this is due to bluestore setting the rocksdb_db_paths config option in luminous. This causes the ceph-o... Josh Durgin

11/26/2019

10:18 PM Bug #42978 (Resolved): ops waiting for lock not requeued; client sees misordering
Sage Weil
10:16 PM Bug #42012 (Resolved): mon osd_snap keys grow unbounded
Sage Weil
09:13 AM Backport #42258 (In Progress): mimic: document new option mon_max_pg_per_osd
Nathan Cutler
05:18 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
Theory:
In Monitor::_ms_dispatch() when we detect a feature change we end up with the following sequence.
Monit...
Brad Hubbard
03:45 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
... Brad Hubbard
03:20 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
I'm wondering if maybe this happens due to the feature change and the session being removed during the upgrade proces... Brad Hubbard

11/25/2019

07:32 PM Bug #42012 (Fix Under Review): mon osd_snap keys grow unbounded
Sage Weil
07:29 PM Bug #42012 (In Progress): mon osd_snap keys grow unbounded
Okay, in octopus, there are now 2 sets of keys
- purged_snap_*: map intervals of snaps that are purged. adjacent r...
Sage Weil
07:16 PM Bug #42978 (Fix Under Review): ops waiting for lock not requeued; client sees misordering
Sage Weil
03:35 PM Feature #39066 (Resolved): src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:58 PM Feature #39066: src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
Rejecting luminous backport - luminous is EOL. Nathan Cutler
03:33 PM Bug #40910 (Resolved): mon/OSDMonitor.cc: better error message about min_size
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:57 PM Bug #40910: mon/OSDMonitor.cc: better error message about min_size
Rejecting luminous backport - luminous is EOL. Nathan Cutler
03:33 PM Bug #41017 (Resolved): Change default for bluestore_fsck_on_mount_deep as false
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:56 PM Bug #41017: Change default for bluestore_fsck_on_mount_deep as false
Rejecting luminous backport - luminous is EOL. Nathan Cutler
03:29 PM Bug #42933 (Rejected): LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify2/1
we reverted, see https://github.com/ceph/ceph/pull/31790 Sage Weil
03:23 PM Backport #38205 (In Progress): luminous: osds allows to partially start more than N+2
Nathan Cutler
03:19 PM Backport #40947 (In Progress): luminous: Better default value for osd_snap_trim_sleep
Nathan Cutler
03:13 PM Backport #41730 (Need More Info): luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(pe...
Nathan Cutler
03:05 PM Backport #41730 (In Progress): luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_...
Nathan Cutler
02:57 PM Backport #39381 (Rejected): luminous: src/ceph-disk/tests/ceph-disk.sh is using hardcoded port
Nathan Cutler
02:57 PM Backport #40941 (Rejected): luminous: mon/OSDMonitor.cc: better error message about min_size
Nathan Cutler
02:56 PM Backport #41085 (Rejected): luminous: Change default for bluestore_fsck_on_mount_deep as false
Nathan Cutler
09:44 AM Backport #42998 (Resolved): mimic: acting_recovery_backfill won't catch all up peers
https://github.com/ceph/ceph/pull/33324 Nathan Cutler
09:43 AM Backport #42997 (Resolved): nautilus: acting_recovery_backfill won't catch all up peers
https://github.com/ceph/ceph/pull/32064 Nathan Cutler
09:43 AM Backport #42996 (Rejected): luminous: acting_recovery_backfill won't catch all up peers
https://github.com/ceph/ceph/pull/33326 Nathan Cutler
06:56 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
I'm interested to see more cores in case one sheds more light. I've started a few runs in the hope they will fail. Brad Hubbard
05:14 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
In both cases so far the Message type appears to be MSG_MON_PAXOS and priority is CEPH_MSG_PRIO_HIGH.... Brad Hubbard
03:40 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
Neha sent me another instance of this issue available at http://pulpito.ceph.com/nojha-2019-11-22_18:41:03-rados:upgr... Brad Hubbard
12:58 AM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
... Brad Hubbard
05:27 AM Bug #42971: mgr hangs with upmap balancer
So I wrote my own upmap balancer this weekend and after running it for a bit I found the same problem. It appears th... Bryan Stillwell
04:28 AM Backport #42662 (In Progress): nautilus:Issue a HEALTH_WARN when a Pool is configured with [min_]...
Sridhar Seshasayee

11/24/2019

06:17 PM Bug #38345: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
/a/sage-2019-11-24_06:32:18-rados-wip-sage-testing-2019-11-23-2031-distro-basic-smithi/4538572... Sage Weil
04:58 PM Bug #42577 (Pending Backport): acting_recovery_backfill won't catch all up peers
Kefu Chai
04:57 PM Bug #42782: nautilus: rados/test_librados_build.sh build failure
https://github.com/ceph/ceph/pull/31693 Kefu Chai

11/22/2019

10:22 PM Bug #42975 (Duplicate): out of order ops in rados/upgrade/nautilus-x-singleton
Neha Ojha
05:49 PM Bug #42975: out of order ops in rados/upgrade/nautilus-x-singleton
Another out of order bug https://tracker.ceph.com/issues/42328. Neha Ojha
05:47 PM Bug #42975 (Duplicate): out of order ops in rados/upgrade/nautilus-x-singleton
... Neha Ojha
09:36 PM Bug #42968 (Duplicate): TestClsRbd.mirror_image_status failure during luminous->nautilus upgrade
Duplicating to https://tracker.ceph.com/issues/42891 as it's the same issue. Jason Dillaman
03:06 PM Bug #42968 (Duplicate): TestClsRbd.mirror_image_status failure during luminous->nautilus upgrade
Run: http://pulpito.ceph.com/teuthology-2019-11-22_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Jobs:'4...
Yuri Weinstein
07:44 PM Bug #42978: ops waiting for lock not requeued; client sees misordering
reproduces with suite: rados:upgrade:nautilus-x-singleton
filter: '0-cluster/{openstack.yaml start.yaml} 1-install...
Neha Ojha
07:09 PM Bug #42978: ops waiting for lock not requeued; client sees misordering
ok, 99% sure the problem is this bit of code in release_object_locks()... Sage Weil
07:05 PM Bug #42978 (Resolved): ops waiting for lock not requeued; client sees misordering
a ceph_test_rados sequence of ops come in, but replies go back out of order... Sage Weil
06:44 PM Bug #42977 (Resolved): mon/Elector.cc: FAILED ceph_assert(m->epoch == get_epoch())
... Neha Ojha
04:37 PM Bug #42971: mgr hangs with upmap balancer
We are using device classes. Bryan Stillwell
04:28 PM Bug #42971: mgr hangs with upmap balancer
Hey Bryan, David's been fixing a couple issues in the balancer that sound like what you're running into:
1) https:...
Josh Durgin
04:15 PM Bug #42971 (New): mgr hangs with upmap balancer
On multiple clusters we are seeing the mgr hang frequently when the balancer is enabled. It seems that the balancer ... Bryan Stillwell
01:24 PM Bug #42964 (Resolved): monitor config store: Deleting logging config settings does not decrease l...
How to reproduce:
1. increase log level of mds:
ceph config set mds debug_mds 10/10
2. try to revert this:
ce...
Марк Коренберг
11:22 AM Bug #42477: Rados should use the '-o outfile' convention
@Nathan, I think that's the right decision in this case mate. It should be less disruptive hopefully. Brad Hubbard
08:45 AM Bug #42477: Rados should use the '-o outfile' convention
@Brad - got it, thanks. So, the issue is fixed as of Octopus and the fix will not be backported for the reason you st... Nathan Cutler
08:44 AM Bug #42477 (Resolved): Rados should use the '-o outfile' convention
Nathan Cutler
11:16 AM Bug #42961 (Resolved): osd: increase priority in certain OSD perf counters
There are reports from users missing stats in dashboard/prometheus mgr modules about the following perf counters:
<p...
Ernesto Puerta

11/21/2019

03:56 PM Bug #42933 (Rejected): LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify2/1
... Sage Weil
03:23 PM Bug #42918: memory corruption and lockups with I-Object
I managed to grab the stack traces from when it locks up instead of crashing -- also around watch/notify in the face ... Ilya Dryomov
03:18 PM Bug #42918: memory corruption and lockups with I-Object
Ilya Dryomov wrote:
> Haven't tried without failure injection yet, but it's probably related to ms_inject_socket_fai...
Ilya Dryomov
02:37 PM Bug #42918: memory corruption and lockups with I-Object
one segfault related to watch/notify is fixed in https://github.com/ceph/ceph/pull/31768, but testing in the rgw suit... Casey Bodley
01:31 PM Bug #42918: memory corruption and lockups with I-Object
Haven't tried without failure injection yet, but it's probably related to ms_inject_socket_failures (and resulting wa... Ilya Dryomov
01:28 PM Bug #42918: memory corruption and lockups with I-Object
@Ilya: does it reproduce when you have injected socket failures disabled? From your initial logs and from the backtra... Jason Dillaman
01:25 PM Bug #42918: memory corruption and lockups with I-Object
Excellent sleuthing -- thanks! I am going to bump this over to the RADOS project since I can't see how this is purely... Jason Dillaman
11:56 AM Bug #42918: memory corruption and lockups with I-Object
Got actionable stack traces on 669453138d89:... Ilya Dryomov
10:42 AM Bug #42918: memory corruption and lockups with I-Object
Looks real and seems to be introduced with I-Object: no issues with 36f5fcbb97eb ("Merge PR #31672 into master") and ... Ilya Dryomov
05:26 AM Bug #42718: Improve OSDMap::calc_pg_upmaps() efficiency
The rules based pool groups being passed to calc_pg_upmaps() is a better method, so we don't want to revert.
try...
David Zafman
01:58 AM Backport #41531 (Resolved): nautilus: Move bluefs alloc size initialization log message to log le...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30229
m...
Nathan Cutler

11/20/2019

11:02 PM Bug #42921 (Can't reproduce): osd: segmentation fault in PGLog::check
... Patrick Donnelly
10:21 PM Bug #42918: memory corruption and lockups with I-Object
The stack is corrupt in both:... Ilya Dryomov
10:11 PM Bug #42918 (Closed): memory corruption and lockups with I-Object
http://pulpito.ceph.com/dis-2019-11-20_20:35:04-krbd-master-wip-krbd-readonly-basic-mira/4526411... Ilya Dryomov
10:06 PM Bug #42783 (Resolved): test failure: due to client closed connection
Neha Ojha
08:07 PM Backport #41531: nautilus: Move bluefs alloc size initialization log message to log level 1
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30229
merged
Yuri Weinstein
05:47 PM Bug #42782 (Resolved): nautilus: rados/test_librados_build.sh build failure
Nathan Cutler
02:54 PM Bug #42906 (Resolved): ceph-mon --mkfs: public_address type (v1|v2) is not respected
When calling `ceph-mon --mkfs ... --public_address v1:<ip_address>:<random_port>`
the `v1:` type is ignored and the ...
Ricardo Dias
08:07 AM Backport #42796 (Resolved): luminous: unnecessary error message "calc_pg_upmaps failed to build o...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31598
m...
Nathan Cutler
02:35 AM Bug #42890 (New): Deadlock occurs when exiting with dpdkstack
exit() will call pthread_cond_destroy attempting to destroy dpdk::eal::cond
upon which other threads are currently b...
chunsong feng
02:14 AM Bug #40367: "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-nautilus
/a/sage-2019-11-19_05:29:27-rados-wip-sage-testing-2019-11-18-1656-distro-basic-smithi/4522662
Sage Weil

11/19/2019

09:45 PM Backport #42796: luminous: unnecessary error message "calc_pg_upmaps failed to build overfull/und...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31598
merged
Yuri Weinstein
03:24 PM Bug #42884: OSDMapTest.CleanPGUpmaps failure
Not reproducible locally. I am testing 9b61479da4f89014b6d1857287102bbc9db13e6e Kefu Chai
02:00 PM Bug #42884 (New): OSDMapTest.CleanPGUpmaps failure
During a make check build for a PR, I got this crash during OSD testing:
https://jenkins.ceph.com/job/ceph-pull-re...
Jeff Layton
12:55 PM Backport #42846 (In Progress): nautilus: src/msg/async/net_handler.cc: Fix compilation
Nathan Cutler
12:50 PM Backport #42739 (In Progress): nautilus: scrub object count mismatch on device_health_metrics pool
Nathan Cutler
09:01 AM Bug #41669 (Resolved): Make dumping of reservation info congruent between scrub and recovery
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:01 AM Bug #41924 (Resolved): asynchronous recovery can not function under certain circumstances
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:00 AM Bug #42015 (Resolved): Remove unused full and nearful output from OSDMap summary
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:00 AM Backport #42879 (Resolved): mimic: ceph_test_admin_socket_output fails in rados qa suite
https://github.com/ceph/ceph/pull/33323 Nathan Cutler
09:00 AM Backport #42878 (Resolved): nautilus: ceph_test_admin_socket_output fails in rados qa suite
https://github.com/ceph/ceph/pull/32063 Nathan Cutler
08:54 AM Backport #41785 (Resolved): nautilus: Make dumping of reservation info congruent between scrub an...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31444
m...
Nathan Cutler
08:39 AM Backport #42141 (Resolved): nautilus: asynchronous recovery can not function under certain circum...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31077
m...
Nathan Cutler
08:39 AM Backport #42136 (Resolved): nautilus: Remove unused full and nearful output from OSDMap summary
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30900
m...
Nathan Cutler
06:25 AM Bug #42387 (Pending Backport): ceph_test_admin_socket_output fails in rados qa suite
Kefu Chai

11/18/2019

09:45 PM Bug #42830: problem returning mon to cluster
I forgot, Ceph is at version 14.2.1 on our side. Jérôme Poulin
09:44 PM Bug #42830: problem returning mon to cluster
We encountered the same problem last week, after stopping a monitor service on a server on the cluster, trying to sta... Jérôme Poulin
09:13 PM Bug #42387 (In Progress): ceph_test_admin_socket_output fails in rados qa suite
Brad Hubbard
09:09 AM Bug #42387 (Fix Under Review): ceph_test_admin_socket_output fails in rados qa suite
Nathan Cutler
08:14 PM Bug #42782: nautilus: rados/test_librados_build.sh build failure
https://github.com/ceph/ceph/pull/31604 merged Yuri Weinstein
08:13 PM Backport #41785: nautilus: Make dumping of reservation info congruent between scrub and recovery
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31444
merged
Yuri Weinstein
08:08 PM Backport #42141: nautilus: asynchronous recovery can not function under certain circumstances
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31077
merged
Yuri Weinstein
08:05 PM Backport #42136: nautilus: Remove unused full and nearful output from OSDMap summary
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30900
merged
Yuri Weinstein
08:29 AM Bug #42592 (Duplicate): ceph-mon/mgr PGstat Segmentation Fault
Nathan Cutler
07:43 AM Bug #42577 (Fix Under Review): acting_recovery_backfill won't catch all up peers
xie xingguo
07:14 AM Bug #42861 (Fix Under Review): Libceph-common.so needs to use private link attribute when includi...
Libceph-common.so does not specify a link attribute containing the dpdk library,
dpdk global variables and function...
chunsong feng

11/17/2019

10:27 PM Bug #42477: Rados should use the '-o outfile' convention
Note that the reason I did not set this for backport is that it has the potential to break existing scripts and funct... Brad Hubbard
06:02 PM Bug #42082 (Resolved): pybind/rados: set_omap() crash on py3
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
06:00 PM Backport #42853 (Resolved): nautilus: format error: ceph osd stat --format=json
https://github.com/ceph/ceph/pull/32062 Nathan Cutler
06:00 PM Backport #42852 (Resolved): mimic: format error: ceph osd stat --format=json
https://github.com/ceph/ceph/pull/33322 Nathan Cutler
05:59 PM Backport #42848 (Rejected): nautilus: "failing miserably..." in Infiniband.cc
Nathan Cutler
05:59 PM Backport #42847 (Rejected): mimic: "failing miserably..." in Infiniband.cc
Nathan Cutler
05:59 PM Backport #42846 (Resolved): nautilus: src/msg/async/net_handler.cc: Fix compilation
https://github.com/ceph/ceph/pull/31736 Nathan Cutler
05:53 PM Bug #42821 (Pending Backport): src/msg/async/net_handler.cc: Fix compilation
Kefu Chai
04:08 PM Bug #42845 (New): CVE-2019-14818
https://nvd.nist.gov/vuln/detail/CVE-2019-14818 affects many versions of `dpdk`.
In ceph, you appear to bundle ver...
Robert Scott

11/16/2019

05:16 PM Bug #42742 (Pending Backport): "failing miserably..." in Infiniband.cc
Kefu Chai
05:12 PM Bug #42477 (Pending Backport): Rados should use the '-o outfile' convention
Kefu Chai
05:09 PM Bug #42501 (Pending Backport): format error: ceph osd stat --format=json
Kefu Chai
05:02 PM Bug #42501 (Resolved): format error: ceph osd stat --format=json
Kefu Chai
04:55 PM Bug #42689 (Duplicate): nautilus mon/mgr: ceph status:pool number display is not right
Kefu Chai
06:38 AM Bug #42332 (Resolved): CephContext::CephContextServiceThread might pause for 5 seconds at shutdown
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
06:38 AM Bug #42360 (Resolved): python3-cephfs should provide python36-cephfs
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
06:34 AM Backport #42363 (Resolved): nautilus: python3-cephfs should provide python36-cephfs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30983
m...
Nathan Cutler
06:33 AM Backport #42395 (Resolved): nautilus: CephContext::CephContextServiceThread might pause for 5 sec...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31097
m...
Nathan Cutler

11/15/2019

10:56 PM Documentation #12059: rados/troubleshooting/troubleshooting-mon: suprious `` in titles
Looks like this was fixed by https://github.com/ceph/ceph/pull/5004, not https://github.com/ceph/ceph/pull/4988. Nathan Cutler
10:53 PM Documentation #12059 (Resolved): rados/troubleshooting/troubleshooting-mon: suprious `` in titles
Nathan Cutler
08:37 AM Documentation #12059: rados/troubleshooting/troubleshooting-mon: suprious `` in titles
This issue has already been resolved. Updated the pull request ID for the same. Deepika Upadhyay
10:39 PM Backport #42363: nautilus: python3-cephfs should provide python36-cephfs
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/30983
merged
Yuri Weinstein
10:38 PM Backport #42395: nautilus: CephContext::CephContextServiceThread might pause for 5 seconds at shu...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/31097
merged
Yuri Weinstein
09:46 PM Bug #42824 (Resolved): mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
Nathan Cutler
05:58 PM Bug #42824 (Fix Under Review): mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
Neha Ojha
05:39 AM Bug #42824: mimic: rebuild_mondb.cc: FAILED assert(0) in update_osdmap()
... Brad Hubbard
07:31 AM Bug #42830 (New): problem returning mon to cluster
as discussed on the list, here https://www.spinics.net/lists/ceph-users/msg55977.html
After rebooting one of the n...
Nikola Ciprich
03:42 AM Bug #42821 (Fix Under Review): src/msg/async/net_handler.cc: Fix compilation
Kefu Chai
 
