Activity

From 05/26/2022 to 06/24/2022

06/24/2022

07:45 PM Bug #48029: Exiting scrub checking -- not all pgs scrubbed.
/a/yuriw-2022-06-22_22:13:20-rados-wip-yuri3-testing-2022-06-22-1121-pacific-distro-default-smithi/6892691
Descrip...
Laura Flores
04:18 PM Bug #45702: PGLog::read_log_and_missing: ceph_assert(miter == missing.get_items().end() || (miter...
/a/yuriw-2022-06-23_21:29:45-rados-wip-yuri4-testing-2022-06-22-1415-pacific-distro-default-smithi/6895353 Laura Flores
09:45 AM Bug #56386: Writes to a cephfs after metadata pool snapshot causes inconsistent objects
> Removing the pool snap then deep scrubbing again removes the inconsistent objects.
This isn't true -- my quick t...
Dan van der Ster
07:26 AM Bug #56386 (Can't reproduce): Writes to a cephfs after metadata pool snapshot causes inconsistent...
If you take a snapshot of the meta pool, then decrease max_mds, metadata objects will be inconsistent.
Removing the ...
Dan van der Ster
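A minimal reproduction sketch following the steps described above, assuming a filesystem named cephfs whose metadata pool is cephfs_metadata and an example metadata PG id of 2.0 (all names are placeholders):

$ ceph osd pool mksnap cephfs_metadata metasnap    # pool-level snapshot of the metadata pool
$ ceph fs set cephfs max_mds 1                     # decrease max_mds as described
$ ceph pg deep-scrub 2.0                           # deep-scrub a metadata PG
$ rados list-inconsistent-obj 2.0 --format=json-pretty   # inspect the reported inconsistencies
$ ceph osd pool rmsnap cephfs_metadata metasnap    # per the 09:45 AM comment, removing the snap alone did not clear the inconsistency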
03:14 AM Bug #56377 (New): crash: MOSDRepOp::encode_payload(unsigned long)

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=e94095d0cd0fcf3fd898984b...
Telemetry Bot
03:13 AM Bug #56371 (Duplicate): crash: MOSDPGLog::encode_payload(unsigned long)

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=2260a57d5917388881ad6b24...
Telemetry Bot
03:13 AM Bug #56352 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=3348f49ddb73c803861097dc...
Telemetry Bot
03:13 AM Bug #56351 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=3aca522a7914d781399c656e...
Telemetry Bot
03:13 AM Bug #56350 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=279837e667d5bd5af7117e58...
Telemetry Bot
03:12 AM Bug #56349 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=0af06b2db676dc127cf14736...
Telemetry Bot
03:12 AM Bug #56348 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=2c87a7239d9493be78ec973d...
Telemetry Bot
03:12 AM Bug #56347 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=d89aad10db4ba24f32836f6b...
Telemetry Bot
03:12 AM Bug #56341 (New): crash: __cxa_rethrow()

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=4ea8075453d1e75053186bfb...
Telemetry Bot
03:12 AM Bug #56340 (New): crash: MOSDRepOp::encode_payload(unsigned long)

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=109df62e078655f21a42e939...
Telemetry Bot
03:12 AM Bug #56337 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=2966e604246718b92712c37f...
Telemetry Bot
03:12 AM Bug #56336 (New): crash: MOSDPGScan::encode_payload(unsigned long)

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=6a63e3ada81c75347510f5f6...
Telemetry Bot
03:12 AM Bug #56333 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=ea60d26b0fb86048ba4db78d...
Telemetry Bot
03:12 AM Bug #56332 (New): crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=c743495a46a830b11d21142d...
Telemetry Bot
03:12 AM Bug #56331 (New): crash: MOSDPGLog::encode_payload(unsigned long)

*New crash events were reported via Telemetry with newer versions (['17.2.0']) than encountered in Tracker (0.0.0)....
Telemetry Bot
03:12 AM Bug #56330 (New): crash: void pg_missing_set<TrackChanges>::got(const hobject_t&, eversion_t) [wi...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=2c5246221b9449c92df232e9...
Telemetry Bot
03:12 AM Bug #56329 (New): crash: rocksdb::DBImpl::CompactRange(rocksdb::CompactRangeOptions const&, rocks...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=bfc26738960c1795f6c7671a...
Telemetry Bot
03:12 AM Bug #56326 (New): crash: void PeeringState::add_log_entry(const pg_log_entry_t&, bool): assert(e....

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=2786349b0161a62145b35214...
Telemetry Bot
03:11 AM Bug #56325 (New): crash: void PeeringState::add_log_entry(const pg_log_entry_t&, bool): assert(e....

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=86280552d1b3deaaae5d29d2...
Telemetry Bot
03:11 AM Bug #56324 (New): crash: MOSDPGLog::encode_payload(unsigned long)

*New crash events were reported via Telemetry with newer versions (['17.2.0']) than encountered in Tracker (0.0.0)....
Telemetry Bot
03:11 AM Bug #56320 (New): crash: int OSDMap::build_simple_optioned(ceph::common::CephContext*, epoch_t, u...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=3d21b380da7e67a51e3c3ff5...
Telemetry Bot
03:11 AM Bug #56319 (New): crash: int OSDMap::build_simple_optioned(ceph::common::CephContext*, epoch_t, u...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=e1f3eef62cf680a2a1e12fe3...
Telemetry Bot
03:11 AM Bug #56307 (New): crash: virtual void PrimaryLogPG::on_local_recover(const hobject_t&, const Obje...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=1bffe92eb6037f0945a6822d...
Telemetry Bot
03:11 AM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=97cfb7606f247983cba0a9666...
Telemetry Bot
03:11 AM Bug #56303 (New): crash: virtual bool PrimaryLogPG::should_send_op(pg_shard_t, const hobject_t&):...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=03f9b6cfdf027552d7733607...
Telemetry Bot
03:10 AM Bug #56300 (New): crash: void MonitorDBStore::_open(const string&): abort

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=8fdacfdb77748da3299053f8...
Telemetry Bot
03:10 AM Bug #56292 (New): crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=0523fecb66e5a47efa4b27b4...
Telemetry Bot
03:10 AM Bug #56289 (Duplicate): crash: void PeeringState::check_past_interval_bounds() const: abort

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=657abbfaddca21e4f153180f...
Telemetry Bot
03:09 AM Bug #56265 (New): crash: void MonitorDBStore::_open(const string&): abort

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=5a64b6a073f492ff2e80966e...
Telemetry Bot
03:08 AM Bug #56247 (New): crash: BackfillInterval::pop_front()

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=b80697d3e5dc1d900a588df5...
Telemetry Bot
03:08 AM Bug #56244 (New): crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): a...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=3db4d3c60ed74d15ae58c626...
Telemetry Bot
03:08 AM Bug #56243 (New): crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): a...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=9b71de1eabc47b3fb6580322...
Telemetry Bot
03:08 AM Bug #56242 (New): crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): a...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=8da89c9532956acaff6e7f70...
Telemetry Bot
03:08 AM Bug #56241 (New): crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): a...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=66ed6db515cc489439bc0f6a...
Telemetry Bot
03:08 AM Bug #56238 (New): crash: non-virtual thunk to PrimaryLogPG::op_applied(eversion_t const&)

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=b4a30ba69ea953149454037b...
Telemetry Bot
03:06 AM Bug #56207 (New): crash: void ECBackend::handle_sub_write_reply(pg_shard_t, const ECSubWriteReply...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=7f6b4f1dc6564900d71f71e9...
Telemetry Bot
03:06 AM Bug #56203 (New): crash: void pg_missing_set<TrackChanges>::got(const hobject_t&, eversion_t) [wi...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=f69a87d11f743c4125be0748...
Telemetry Bot
03:05 AM Bug #56201 (New): crash: void OSD::do_recovery(PG*, epoch_t, uint64_t, ThreadPool::TPHandle&): as...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=e11765511e1628fcf2a52548...
Telemetry Bot
03:05 AM Bug #56198 (New): crash: rocksdb::port::Mutex::Unlock()

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=2f6c08ed7f0db8e9480a0cde...
Telemetry Bot
03:05 AM Bug #56194 (New): crash: OpTracker::~OpTracker(): assert((sharded_in_flight_list.back())->ops_in_...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=e86e0b84f213b49af7dc5555...
Telemetry Bot
03:05 AM Bug #56192 (Pending Backport): crash: virtual Monitor::~Monitor(): assert(session_map.sessions.em...

*New crash events were reported via Telemetry with newer versions (['16.2.9', '17.2.0']) than encountered in Tracke...
Telemetry Bot
03:04 AM Bug #56191 (New): crash: std::vector<std::filesystem::path::_Cmpt, std::allocator<std::filesystem...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=d7dee2fb28426f502f494906...
Telemetry Bot
03:04 AM Bug #56188 (New): crash: void PGLog::IndexedLog::add(const pg_log_entry_t&, bool): assert(head.ve...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=1ced48493cc0b5d62eb75f1e...
Telemetry Bot

06/23/2022

08:44 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2022-06-14_20:42:00-rados-wip-yuri2-testing-2022-06-14-0949-octopus-distro-default-smithi/6878271 Laura Flores
08:43 PM Bug #52737 (Duplicate): osd/tests: stat mismatch
Laura Flores
08:40 PM Bug #43584: MON_DOWN during mon_join process
/a/yuriw-2022-06-14_20:42:00-rados-wip-yuri2-testing-2022-06-14-0949-octopus-distro-default-smithi/6878197 Laura Flores
03:06 PM Bug #56147: snapshots will not be deleted after upgrade from nautilus to pacific
Yes. For the debug logs, I tested this with nautilus (14.2.22) to octopus (15.2.16). The behavior is the same as descri... Manuel Lausch
01:22 PM Bug #52416 (Resolved): devices: mon devices appear empty when scraping SMART metrics
Yaarit Hatuka
01:22 PM Backport #54233 (Resolved): octopus: devices: mon devices appear empty when scraping SMART metrics
Yaarit Hatuka
06:26 AM Fix #50574 (Resolved): qa/standalone: Modify/re-write failing standalone tests with mclock scheduler
Sridhar Seshasayee
06:18 AM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
I think this is the same issue as https://tracker.ceph.com/issues/53855. Myoungwon Oh
06:09 AM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
OK, I'll take a look. Myoungwon Oh
05:10 AM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
I have been able to easily reproduce this by running the following test:
rados/verify/{centos_latest ceph clusters/...
Aishwarya Mathuria

06/22/2022

09:45 PM Bug #53969 (Resolved): BufferList.rebuild_aligned_size_and_memory failure
Neha Ojha
09:45 PM Backport #53972 (Resolved): pacific: BufferList.rebuild_aligned_size_and_memory failure
Neha Ojha
09:42 PM Bug #53677 (Resolved): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
Neha Ojha
09:42 PM Bug #53308 (Resolved): pg-temp entries are not cleared for PGs that no longer exist
Neha Ojha
09:41 PM Bug #54593 (Resolved): librados: check latest osdmap on ENOENT in pool_reverse_lookup()
Neha Ojha
09:41 PM Backport #55012 (Resolved): octopus: librados: check latest osdmap on ENOENT in pool_reverse_look...
Neha Ojha
09:40 PM Backport #55013 (Resolved): pacific: librados: check latest osdmap on ENOENT in pool_reverse_look...
Neha Ojha
09:36 PM Bug #54592 (Resolved): partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
Neha Ojha
09:36 PM Backport #55019 (Resolved): octopus: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark oma...
Neha Ojha
09:34 PM Backport #55984 (Resolved): pacific: log the numbers of dups in PG Log
Neha Ojha
08:52 PM Bug #56101 (New): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Vikhyat Umrao
08:51 PM Bug #56101 (Duplicate): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Vikhyat Umrao
05:54 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=97cfb7606f247983cba0a9666bb882d9e1... Neha Ojha
12:58 AM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Looking into this further in today's team meeting we discussed the fact that these segfaults appear to occur in pthre... Brad Hubbard
08:51 PM Bug #56102 (Duplicate): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in - RocksDBStore::e...
Vikhyat Umrao
08:36 PM Bug #53729 (In Progress): ceph-osd takes all memory before oom on boot
Neha Ojha
08:25 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2022-06-21_16:28:27-rados-wip-yuri4-testing-2022-06-21-0704-pacific-distro-default-smithi/6889549 Laura Flores
06:39 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2022-06-18_00:01:31-rados-quincy-release-distro-default-smithi/6884838 Laura Flores
06:24 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
Hi Myoungwon Oh, can you please help take a look at this issue? Neha Ojha
06:17 PM Bug #56030 (Fix Under Review): frequently down and up a osd may cause recovery not in asynchronous
Neha Ojha
06:15 PM Bug #56147 (Need More Info): snapshots will not be deleted after upgrade from nautilus to pacific
> Also I could this observer on a update from nautilus to ocotopus.
Just to confirm: am I correct that the issue is visi...
Radoslaw Zarzynski
05:36 PM Bug #55695 (Fix Under Review): Shutting down a monitor forces Paxos to restart and sometimes disr...
Neha Ojha
03:34 PM Bug #56149: thrash-erasure-code: AssertionError: wait_for_recovery timeout due to "active+recover...
According to some tests I ran on main and pacific, this kind of failure does not happen very frequently:
Main:
ht...
Laura Flores
02:59 PM Bug #54172 (Fix Under Review): ceph version 16.2.7 PG scrubs not progressing
Neha Ojha
06:33 AM Feature #56153 (Resolved): add option to dump pg log to pg command
Currently we need to stop the cluster and use ceph_objectstore_tool to dump pg log
Command: ceph pg n.n log
will...
Nitzan Mordechai
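For context, a sketch of the offline workflow this feature would replace, on a systemd deployment; the OSD id, data path, and PG id are placeholders, and the proposed online command does not exist yet:

$ systemctl stop ceph-osd@3        # today the OSD must be stopped first
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 --pgid 2.7 --op log
$ systemctl start ceph-osd@3
$ ceph pg 2.7 log                  # proposed: dump the PG log online, without stopping the OSD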

06/21/2022

10:46 PM Backport #53339 (In Progress): pacific: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<c...
Neha Ojha
08:01 PM Bug #56151 (Resolved): mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change should be gap >= max...
The output should say that gap >= max_pg_num_change when a pg is trying to scale beyond pgp_num in mgr/DaemonServer:: adjust... Kamoltat (Junior) Sirivadhna
07:31 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Oh interesting. Is there a reason that bug would only be affecting minsize_recovery? Laura Flores
08:03 AM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Talked with Ronen Friedman regarding that issue; it may be related to another bug he is working on, where the scrub ... Nitzan Mordechai
07:20 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
Sridhar Seshasayee wrote:
> /a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-distro-defa...
Laura Flores
05:23 PM Bug #56149 (New): thrash-erasure-code: AssertionError: wait_for_recovery timeout due to "active+r...
Description:
rados/thrash-erasure-code/{ceph clusters/{fixed-2 openstack} fast/normal mon_election/connectivity msgr...
Laura Flores
05:04 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- OSD.392 logs are available in gibba037 following path:... Vikhyat Umrao
05:03 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Looks like there is a commonality: this crash happens during shutdown/restart, so it looks like some issue during ... Vikhyat Umrao
04:50 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- After upgrading the LRC cluster, the same crash was seen in one of the OSDs in LRC.... Vikhyat Umrao
02:31 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
... Vikhyat Umrao
04:53 PM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
/a/yuriw-2022-06-16_19:58:30-rados-wip-yuri7-testing-2022-06-16-1051-pacific-distro-default-smithi/6882914 Laura Flores
04:37 PM Backport #56099: pacific: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
https://github.com/ceph/ceph/pull/46748 Laura Flores
04:35 PM Bug #56102: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in - RocksDBStore::estimate_pref...
Wasn't sure if I should keep the files on here due to privacy, but Neha said it's okay. Laura Flores
04:27 PM Bug #56102: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in - RocksDBStore::estimate_pref...
@Adam I have uploaded all of the relevant logs from gibba019, osd.31 here. I found the crash in ceph-osd.31.log-20220... Laura Flores
02:25 PM Bug #56147 (Resolved): snapshots will not be deleted after upgrade from nautilus to pacific
After upgrading from 14.2.22 to 16.2.9, snapshot deletion does not remove "clones" from the pool.
More precisely: Objects in...
Manuel Lausch
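A minimal sketch for reproducing and inspecting the reported leftover clones with plain rados commands; pool, object, and snapshot names are placeholders:

$ rados -p testpool put obj1 /etc/hosts      # write an object
$ rados -p testpool mksnap snap1             # pool snapshot
$ rados -p testpool put obj1 /etc/services   # overwrite, which creates a clone for snap1
$ rados -p testpool rmsnap snap1             # delete the snapshot; the clone should eventually be trimmed
$ rados -p testpool listsnaps obj1           # per the report above, the clone remains listed after the upgrade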
07:15 AM Bug #56136 (Fix Under Review): [Progress] Do not show NEW PG_NUM value for pool if autoscaler is ...
Prashant D
06:51 AM Bug #56136 (Resolved): [Progress] Do not show NEW PG_NUM value for pool if autoscaler is set to off
When noautoscale is set, autoscale-status shows a NEW PG_NUM value if pool pg_num is more than 96.
$ ./bin/ceph osd p...
Prashant D
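A short sketch for observing the behaviour described above, assuming a release that has the global noautoscale flag:

$ ceph osd pool set noautoscale      # turn the autoscaler off globally
$ ceph osd pool autoscale-status     # NEW PG_NUM should stay empty, but a value appears once pg_num exceeds 96
$ ceph osd pool unset noautoscale    # re-enable when done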
06:50 AM Backport #56135 (Resolved): pacific: scrub starts message missing in cluster log
https://github.com/ceph/ceph/pull/48070 Backport Bot
06:50 AM Backport #56134 (Resolved): quincy: scrub starts message missing in cluster log
https://github.com/ceph/ceph/pull/47621 Backport Bot
06:45 AM Bug #55798 (Pending Backport): scrub starts message missing in cluster log
Prashant D

06/20/2022

07:38 AM Bug #53855: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
https://github.com/ceph/ceph/pull/46748 Myoungwon Oh
06:48 AM Bug #56030: frequently down and up a osd may cause recovery not in asynchronous
Added more logs. zhouyue zhou
04:52 AM Backport #56059 (Resolved): pacific: Assertion failure (ceph_assert(have_pending)) when creating ...
Sridhar Seshasayee

06/19/2022

01:44 PM Bug #54172: ceph version 16.2.7 PG scrubs not progressing
@CorySnider: thanks. Your suggestion is spot on. The suggested fix
solves one issue. There is another problem relate...
Ronen Friedman

06/17/2022

08:39 PM Bug #56102 (Duplicate): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in - RocksDBStore::e...
... Vikhyat Umrao
08:37 PM Bug #56101 (Resolved): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
... Vikhyat Umrao
08:26 PM Feature #55982: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46608 merged Yuri Weinstein
08:19 PM Bug #53294 (New): rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
There are still some occurrences of this type of failure in Quincy, which includes the backport of #53855. So, I am r... Laura Flores
08:06 PM Backport #56099 (Resolved): pacific: rados/test.sh hangs while running LibRadosTwoPoolsPP.Manifes...
Backport Bot
08:02 PM Bug #53855 (Pending Backport): rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlush...
Laura Flores
06:54 AM Bug #53855: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
ok Myoungwon Oh
07:28 PM Bug #56097 (Fix Under Review): Timeout on `sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtes...
/a/yuriw-2022-06-16_18:33:18-rados-wip-yuri5-testing-2022-06-16-0649-distro-default-smithi/6882594... Laura Flores
06:45 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/yuriw-2022-06-16_18:33:18-rados-wip-yuri5-testing-2022-06-16-0649-distro-default-smithi/6882724 Laura Flores
04:44 PM Backport #56059: pacific: Assertion failure (ceph_assert(have_pending)) when creating new OSDs du...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46691
merged
Yuri Weinstein
02:47 PM Bug #55355: osd thread deadlock
Radoslaw Zarzynski wrote:
> Big, big thanks, jianwei zhang, for your analysis. It was extremely helpful!
Good news,...
jianwei zhang
12:31 PM Bug #55355: osd thread deadlock
Big, big thanks, jianwei zhang, for your analysis. It was extremely helpful! Radoslaw Zarzynski
12:30 PM Bug #55355 (Fix Under Review): osd thread deadlock
Radoslaw Zarzynski
08:39 AM Bug #54172: ceph version 16.2.7 PG scrubs not progressing
We've experienced this issue as well, on both 16.2.6 and 16.2.7, and I've identified the cause. Here's the scenario:
...
Cory Snyder

06/16/2022

03:28 PM Bug #53685: Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.
/a/yuriw-2022-06-11_02:24:12-rados-quincy-release-distro-default-smithi/6873771 Laura Flores
12:40 PM Backport #53338: pacific: osd/scrub: src/osd/scrub_machine.cc: 55: FAILED ceph_assert(state_cast<...
It looks like this backport has been merged in https://github.com/ceph/ceph/pull/45374, and released in 16.2.8, so I ... Benoît Knecht
09:38 AM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
+*Observed this in a pacific run:*+
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-dis...
Sridhar Seshasayee
09:23 AM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-distro-default-smithi/6881131
Descrip...
Sridhar Seshasayee
09:14 AM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-06-15_18:29:33-rados-wip-yuri4-testing-2022-06-15-1000-pacific-distro-default-smithi/6881215 Sridhar Seshasayee
08:05 AM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - casuing high IO latency on clients
... Denis Polom
05:04 AM Bug #55153 (Fix Under Review): Make the mClock config options related to [res, wgt, lim] modifiab...
Sridhar Seshasayee
01:54 AM Bug #55750: mon: slow request of very long time
Neha Ojha wrote:
> yite gu wrote:
> > It appears that this mon request has been completed,but it have no erase from...
yite gu

06/15/2022

06:56 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
This `wait_for_clean` assertion failure is happening with the minsize_recovery thrasher, which is used by rados/thras... Laura Flores
06:21 PM Bug #55750: mon: slow request of very long time
yite gu wrote:
> It appears that this mon request has been completed,but it have no erase from ops_in_flight_sharded...
Neha Ojha
06:11 PM Bug #55776: octopus: map exx had wrong cluster addr
... Radoslaw Zarzynski
05:54 PM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - casuing high IO latency on clients
Could you please provide the output from @ceph osd lspools@ as well? Radoslaw Zarzynski
05:51 PM Bug #47300 (Resolved): mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
05:50 PM Backport #55513 (Resolved): quincy: mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
05:50 PM Backport #55514 (Resolved): pacific: mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
05:06 PM Backport #55514: pacific: mount.ceph fails to understand AAAA records from SRV record
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46112
merged
Yuri Weinstein
05:03 PM Backport #55296: pacific: malformed json in a Ceph RESTful API call can stop all ceph-mon services
nikhil kshirsagar wrote:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/...
Yuri Weinstein
10:05 AM Bug #56057 (Fix Under Review): Add health error if one or more OSDs registered v1/v2 public ip ad...
Prashant D
07:13 AM Bug #56057 (Pending Backport): Add health error if one or more OSDs registered v1/v2 public ip ad...
In a containerized environment, after an OSD node reboot, some OSDs registered their public v1/v2 addresses on cluster ... Prashant D
09:10 AM Backport #56059 (In Progress): pacific: Assertion failure (ceph_assert(have_pending)) when creati...
Sridhar Seshasayee
08:55 AM Backport #56059 (Resolved): pacific: Assertion failure (ceph_assert(have_pending)) when creating ...
https://github.com/ceph/ceph/pull/46691 Backport Bot
09:07 AM Backport #56060 (In Progress): quincy: Assertion failure (ceph_assert(have_pending)) when creatin...
Sridhar Seshasayee
08:55 AM Backport #56060 (Resolved): quincy: Assertion failure (ceph_assert(have_pending)) when creating n...
https://github.com/ceph/ceph/pull/46689 Backport Bot
08:51 AM Bug #55773 (Pending Backport): Assertion failure (ceph_assert(have_pending)) when creating new OS...
Sridhar Seshasayee

06/14/2022

09:40 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
Running some tests to try and reproduce the issue and get a sense of how frequently it fails. This has actually been ... Laura Flores
09:05 PM Backport #51287 (In Progress): pacific: LibRadosService.StatusFormat failed, Expected: (0) != (re...
Laura Flores
08:08 PM Bug #53855: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
@Myoungwon Oh does this look like the same thing to you? Perhaps your fix needs to be backported to Pacific.
/a/yu...
Laura Flores
03:03 PM Bug #52316: qa/tasks/mon_thrash.py: _do_thrash AssertionError len(s['quorum']) == len(mons)
/a/yuriw-2022-06-13_16:36:31-rados-wip-yuri7-testing-2022-06-13-0706-distro-default-smithi/6876523
Description: ra...
Laura Flores
02:37 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
Another detail to note is that this particular test has the pg autoscaler enabled, as opposed to TEST_divergent_2(), ... Laura Flores
10:48 AM Bug #56034 (Resolved): qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
/a/yuriw-2022-06-13_16:36:31-rados-wip-yuri7-testing-2022-06-13-0706-distro-default-smithi/6876516
Also historical...
Sridhar Seshasayee
06:22 AM Bug #56030: frequently down and up a osd may cause recovery not in asynchronous
I set osd_async_recovery_min_cost = 0, hoping to force async recovery anyway. zhouyue zhou
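For reference, a hedged sketch of how such an override is usually applied through the config subsystem; osd.0 is a placeholder:

$ ceph config set osd osd_async_recovery_min_cost 0    # with a cost threshold of 0, async recovery is selected whenever applicable
$ ceph config get osd.0 osd_async_recovery_min_cost    # confirm the value an individual OSD sees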
03:57 AM Bug #56030 (Fix Under Review): frequently down and up a osd may cause recovery not in asynchronous
ceph version: octopus 15.2.13
In my test cluster there are 6 OSDs: 3 for the bucket index pool and 3 for other pools; there ar...
zhouyue zhou

06/13/2022

10:40 PM Bug #56028 (New): thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.vers...
This assertion is resurfacing in Pacific runs. The last fix for this was tracked in #46323, but this test branch incl... Laura Flores
10:27 PM Bug #52737: osd/tests: stat mismatch
@Ronen I'm pretty sure this is a duplicate of #50222 Laura Flores
10:26 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2022-06-07_19:48:58-rados-wip-yuri6-testing-2022-06-07-0955-pacific-distro-default-smithi/6866688 Laura Flores
03:12 AM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
/a/yuriw-2022-06-09_03:58:30-smoke-quincy-release-distro-default-smithi/6869659/
Test description: smoke/basic/{clus...
Aishwarya Mathuria

06/10/2022

05:38 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
Sridhar Seshasayee wrote:
> *+Quick Update+*
> This was again hit recently in
> /a/yuriw-2022-06-09_03:58:30-smoke...
Yuri Weinstein
05:10 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
verifying again
http://pulpito.front.sepia.ceph.com/yuriw-2022-06-11_02:22:38-smoke-quincy-release-distro-default-sm...
Yuri Weinstein
03:47 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
*+Quick Update+*
This was again hit recently in
/a/yuriw-2022-06-09_03:58:30-smoke-quincy-release-distro-default-sm...
Sridhar Seshasayee
05:20 PM Backport #55981: quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/46605
merged
Yuri Weinstein
04:52 PM Bug #55001: rados/test.sh: Early exit right after LibRados global tests complete
/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6871955
Coredump...
Laura Flores
04:48 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
Nitzan Mordechai wrote:
> That could work, but when we have socket failure injection, the error callback will not be...
Laura Flores
04:44 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6872050 Laura Flores
04:39 PM Feature #55982: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46607 merged Yuri Weinstein
08:18 AM Bug #55995 (New): OSD Crash: /lib64/libpthread.so.0(+0x12ce0) [0x7f94cdcbbce0]
Hi,
I recently upgraded my ceph cluster from 14.2.x to 16.2.7 and switched to a docker deployment. Since then, I see...
Kilian Ries

06/09/2022

09:42 PM Backport #55981: quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
https://github.com/ceph/ceph/pull/46605 Radoslaw Zarzynski
06:36 PM Backport #55981 (Resolved): quincy: don't trim excessive PGLog::IndexedLog::dups entries on-line
Radoslaw Zarzynski
08:42 PM Backport #55985 (In Progress): octopus: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46609 Radoslaw Zarzynski
08:35 PM Backport #55985 (Resolved): octopus: log the numbers of dups in PG Log
Backport Bot
08:40 PM Backport #55984 (In Progress): pacific: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46608 Radoslaw Zarzynski
08:35 PM Backport #55984 (Resolved): pacific: log the numbers of dups in PG Log
Backport Bot
08:38 PM Backport #55983 (In Progress): quincy: log the numbers of dups in PG Log
https://github.com/ceph/ceph/pull/46607 Radoslaw Zarzynski
08:35 PM Backport #55983 (Resolved): quincy: log the numbers of dups in PG Log
Backport Bot
08:32 PM Feature #55982 (Pending Backport): log the numbers of dups in PG Log
Approved for `main`; QA is ongoing. Switching to _Pending backport_ before the merge to unblock backports. Radoslaw Zarzynski
08:03 PM Feature #55982 (Fix Under Review): log the numbers of dups in PG Log
Radoslaw Zarzynski
07:59 PM Feature #55982 (Resolved): log the numbers of dups in PG Log
This is a feature request that is critical for investigating / verifying the dups inflation issue. Radoslaw Zarzynski
01:28 PM Backport #55747: pacific: Support blocklisting a CIDR range
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46470
merged
Yuri Weinstein

06/08/2022

06:23 PM Bug #52969 (Fix Under Review): use "ceph df" command found pool max avail increase when there are...
Radoslaw Zarzynski
06:22 PM Backport #55973 (New): pacific: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi101215...
Backport Bot
06:22 PM Backport #55972 (Resolved): quincy: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10...
Backport Bot
06:16 PM Bug #49525 (Pending Backport): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi1012151...
Neha Ojha
06:13 PM Bug #55407 (Rejected): quincy osd's fail to boot and crash
Closing this ticket. The new crash is tracked independently (https://tracker.ceph.com/issues/55698). Radoslaw Zarzynski
06:10 PM Bug #55851: Assert in Ceph messenger
From Neha:
* http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=12eed3bdd041d05365...
Radoslaw Zarzynski
06:04 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
This isn't octopus-specific, as we saw it in pacific as well. Radoslaw Zarzynski
05:52 PM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
Not high priority. Possibly a test issue. Radoslaw Zarzynski
05:49 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
Maybe let's talk about that in one of the RADOS Team meetings. Radoslaw Zarzynski
03:25 PM Backport #55309: pacific: prometheus metrics shows incorrect ceph version for upgraded ceph daemon
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46429
merged
Yuri Weinstein
03:25 PM Backport #55308: pacific: Manager is failing to keep updated metadata in daemon_state for upgrade...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46427
merged
Yuri Weinstein
03:12 PM Bug #52724 (Duplicate): octopus: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
Laura Flores
03:09 PM Bug #53855 (Resolved): rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
Laura Flores
02:38 PM Bug #51076 (Resolved): "wait_for_recovery: failed before timeout expired" during thrashosd test w...
Laura Flores
02:38 PM Backport #55743 (Resolved): octopus: "wait_for_recovery: failed before timeout expired" during th...
Laura Flores
07:53 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
That could work, but when we have socket failure injection, the error callback will not be called in the python API ... Nitzan Mordechai
06:17 AM Bug #55836 (Fix Under Review): add an asok command for pg log investigations
Nitzan Mordechai
02:24 AM Backport #55305 (In Progress): quincy: Manager is failing to keep updated metadata in daemon_stat...
Prashant D

06/07/2022

05:42 PM Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
Neha Ojha
05:42 PM Bug #54296 (Resolved): OSDs using too much memory
Neha Ojha
05:41 PM Backport #55633 (Resolved): octopus: ceph-osd takes all memory before oom on boot
Neha Ojha
05:41 PM Backport #55631 (Resolved): pacific: ceph-osd takes all memory before oom on boot
Neha Ojha
05:13 PM Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after wait...
/a/yuriw-2022-05-31_21:35:41-rados-wip-yuri2-testing-2022-05-31-1300-pacific-distro-default-smithi/6856269
Descrip...
Laura Flores
04:04 PM Backport #53972: pacific: BufferList.rebuild_aligned_size_and_memory failure
Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/46215
merged
Yuri Weinstein
04:03 PM Bug #50806: osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_mis...
https://github.com/ceph/ceph/pull/46120 merged Yuri Weinstein
04:02 PM Backport #55281: pacific: mon/OSDMonitor: properly set last_force_op_resend in stretch mode
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45870
merged
Yuri Weinstein
03:42 PM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
/a/yuriw-2022-06-02_00:50:42-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-smithi/6859734
Descrip...
Laura Flores
03:30 PM Bug #48965: qa/standalone/osd/osd-force-create-pg.sh: TEST_reuse_id: return 1
/a/yuriw-2022-06-02_00:50:42-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-smithi/6859929 Laura Flores
03:29 PM Bug #55906: cephfs/metrics/Types.h: In function 'std::ostream& operator<<(std::ostream&, const Cl...
Oops, updated the wrong Tracker. Laura Flores
06:01 AM Bug #55906: cephfs/metrics/Types.h: In function 'std::ostream& operator<<(std::ostream&, const Cl...
This has been fixed by https://tracker.ceph.com/issues/50822 Xiubo Li
05:55 AM Bug #55906 (New): cephfs/metrics/Types.h: In function 'std::ostream& operator<<(std::ostream&, co...
/home/teuthworker/archive/yuriw-2022-06-02_14:44:32-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-sm... Nitzan Mordechai
03:20 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
/a/yuriw-2022-06-02_00:50:42-rados-wip-yuri4-testing-2022-06-01-1350-pacific-distro-default-smithi/6859916 Laura Flores
02:01 PM Backport #55298: octopus: malformed json in a Ceph RESTful API call can stop all ceph-mon services
nikhil kshirsagar wrote:
> please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/...
Yuri Weinstein
01:56 AM Bug #55905 (New): Failed to build rados.cpython-310-x86_64-linux-gnu.so
I build ceph on Ubuntu22.04, but I meet the error. And under my research, I found a way to solve the error, but I don... Hualong Feng

06/06/2022

08:06 PM Bug #55836: add an asok command for pg log investigations
It'd be nice if we could retrieve pg log dups length by means of an existing command. FWIW, we log the "approx pg log... Neha Ojha
06:38 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Tested with the fixed version and now it is working fine!... Vikhyat Umrao
04:56 PM Bug #51076 (Pending Backport): "wait_for_recovery: failed before timeout expired" during thrashos...
Laura Flores
04:55 PM Bug #51076 (Resolved): "wait_for_recovery: failed before timeout expired" during thrashosd test w...
Laura Flores
04:55 PM Backport #55745 (Resolved): pacific: "wait_for_recovery: failed before timeout expired" during th...
Laura Flores
04:55 PM Bug #50842 (Resolved): pacific: recovery does not complete because of rw_manager lock not being ...
Laura Flores
02:58 PM Backport #55746: quincy: Support blocklisting a CIDR range
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46469
merged
Yuri Weinstein

06/05/2022

10:18 AM Bug #55407: quincy osd's fail to boot and crash
Radoslaw Zarzynski wrote:
> This looks like something new and unrelated to other crashes in this ticket, so created ...
Gonzalo Aguilar Delgado

06/03/2022

10:14 PM Bug #46877: mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
Spotted in Quincy:
/a/yuriw-2022-06-02_20:24:42-rados-wip-yuri5-testing-2022-06-02-0825-quincy-distro-default-smit...
Laura Flores
02:50 PM Bug #55851 (Resolved): Assert in Ceph messenger
Context:
Ceph balancer was busy balancing: PG remaps...
Stefan Kooman
01:23 AM Backport #55306 (Resolved): quincy: prometheus metrics shows incorrect ceph version for upgraded ...
should be fixed in 17.2.1 Adam King

06/02/2022

07:26 PM Bug #55836 (Resolved): add an asok command for pg log investigations
The rationale is that @ceph-objectstore-tool --op log@ requires stopping the OSD, and is thus intrusive.
This feature i...
Radoslaw Zarzynski
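A sketch of the admin-socket invocation pattern such a command would follow; only the first two calls exist today, and the command name in the last line is purely hypothetical:

$ ceph daemon osd.3 help                       # lists the asok commands the live daemon exposes
$ ceph daemon osd.3 dump_ops_in_flight         # example of an existing, non-intrusive asok query
$ ceph daemon osd.3 <new-pg-log-command> 2.7   # hypothetical: what the requested pg log dump could look like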
09:39 AM Backport #55767 (In Progress): octopus: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Nitzan Mordechai
09:38 AM Backport #55768 (In Progress): pacific: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Nitzan Mordechai
08:35 AM Bug #54004 (Rejected): When creating erasure-code-profile incorrectly set parameters, it can be c...
Nitzan Mordechai
08:35 AM Bug #54004: When creating erasure-code-profile incorrectly set parameters, it can be created succ...
The profile can be created, but that doesn't mean you are using it yet.
As long as you are not using it, no err...
Nitzan Mordechai
06:32 AM Bug #54172: ceph version 16.2.7 PG scrubs not progressing

We've seen this at a customer's cluster as well. A simple repeer of the pg gets it unstuck. We've not investigated a... Wout van Heeswijk
Wout van Heeswijk
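For reference, the re-peer workaround mentioned above is a single command; the PG id is a placeholder:

$ ceph pg repeer 2.7    # force the stuck PG to re-peer, which reportedly lets the scrub proceed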

06/01/2022

02:52 PM Backport #55631: pacific: ceph-osd takes all memory before oom on boot
This PR is ready to merge. Can it be merged so this change ends up in the next Pacific release? Wout van Heeswijk
10:47 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
jianwei zhang wrote:
> https://github.com/ceph/ceph/pull/46478
test result
jianwei zhang
10:40 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
https://github.com/ceph/ceph/pull/46478 jianwei zhang
08:15 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
... jianwei zhang
07:18 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
The original intention of raising this question is that testers (users) are confused as to why MAX_AVAIL does not dec... jianwei zhang
06:17 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
step4 vs step5:
4. kill 9 osd.0.pid - OSD.0 OUT unset nobackfill --> recovery HEALTH_OK
STORED = 1.1G ///increase 1...
jianwei zhang
06:15 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
5. remove out osd.0... jianwei zhang
06:03 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
for Problem2 step1 vs step4:
osd.0 already out and recovery complete HEALTH_OK, but STORED/(DATA) 1.0G increase ...
jianwei zhang
05:59 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
for ceph df detail commands
I don't think raw_used_rate should be adjusted:...
jianwei zhang
05:51 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
For the MAX AVAIL field, I think the down or out osd.0 should be excluded... jianwei zhang
05:44 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
Problem1 step1 vs step2:
1. ceph cluster initial state
STORED = 1.0G
(DATA) = 1.0G
MAX AVAIL = 260G
2. ...
jianwei zhang
05:30 AM Bug #52969: use "ceph df" command found pool max avail increase when there are degraded objects i...
ceph v15.2.13
I found same problem
1. ceph cluster initial state...
jianwei zhang
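A hedged sketch of the observation workflow the comments above describe, using systemctl in place of the raw kill; osd.0 is a placeholder:

$ ceph df detail               # baseline STORED / MAX AVAIL
$ ceph osd set nobackfill      # hold backfill while the OSD is taken down
$ systemctl stop ceph-osd@0
$ ceph osd out 0
$ ceph osd unset nobackfill    # let recovery run to HEALTH_OK
$ ceph df detail               # compare STORED / MAX AVAIL with the baseline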
06:23 AM Fix #54565 (Resolved): Add snaptrim stats to the existing PG stats.
Sridhar Seshasayee
06:23 AM Backport #54612 (Resolved): quincy: Add snaptrim stats to the existing PG stats.
Sridhar Seshasayee
06:22 AM Bug #55186 (Resolved): Doc: Update mclock release notes regarding an existing issue.
Sridhar Seshasayee
06:21 AM Backport #55219 (Resolved): quincy: Doc: Update mclock release notes regarding an existing issue.
Sridhar Seshasayee
06:19 AM Feature #51984 (Resolved): [RFE] Provide warning when the 'require-osd-release' flag does not mat...
Sridhar Seshasayee
06:18 AM Backport #53549 (Rejected): nautilus: [RFE] Provide warning when the 'require-osd-release' flag d...
The backport to nautilus was deemed not needed. See BZ https://bugzilla.redhat.com/show_bug.cgi?id=2033078 for more d... Sridhar Seshasayee
05:57 AM Backport #53550 (Resolved): octopus: [RFE] Provide warning when the 'require-osd-release' flag do...
Sridhar Seshasayee
04:58 AM Bug #49525 (Fix Under Review): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi1012151...
Indeed caused by scrub starting while the PG is being snap-trimmed.
Ronen Friedman
04:51 AM Bug #55794 (Duplicate): scrub: scrub is not prevented from started while snap-trimming is in prog...
Laura Flores wrote:
> @Ronen is this already tracked in #49525?
Yes. Thanks. I will mark as duplicate.
Ronen Friedman

05/31/2022

11:52 PM Bug #54316 (Resolved): mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat (Junior) Sirivadhna
11:51 PM Backport #54567 (Resolved): pacific: mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat (Junior) Sirivadhna
11:50 PM Backport #54568 (Resolved): octopus: mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat (Junior) Sirivadhna
11:33 PM Backport #55747 (In Progress): pacific: Support blocklisting a CIDR range
Greg Farnum
11:18 PM Backport #55746 (In Progress): quincy: Support blocklisting a CIDR range
Greg Farnum
10:26 PM Bug #55794: scrub: scrub is not prevented from started while snap-trimming is in progress
@Ronen is this already tracked in #49525? Laura Flores
09:38 PM Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
Laura Flores wrote:
> /a/yuriw-2022-05-27_21:59:17-rados-wip-yuri-testing-2022-05-27-0934-distro-default-smithi/6851...
Laura Flores
09:35 PM Bug #55809 (New): "Leak_IndirectlyLost" valgrind report on mon.c
/a/yuriw-2022-05-27_21:59:17-rados-wip-yuri-testing-2022-05-27-0934-distro-default-smithi/6851271/remote/smithi085/lo... Laura Flores
06:13 PM Backport #53971 (Resolved): octopus: BufferList.rebuild_aligned_size_and_memory failure
Neha Ojha
06:07 PM Backport #53971: octopus: BufferList.rebuild_aligned_size_and_memory failure
Radoslaw Zarzynski wrote:
> https://github.com/ceph/ceph/pull/46216
merged
Yuri Weinstein
03:10 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
Other reported instances of this `wait_for_clean` assertion failure where the pgmap has a pg stuck in recovery have l... Laura Flores
03:04 PM Bug #55726: Drained OSDs are still ACTIVE_PRIMARY - casuing high IO latency on clients
Hi,
I set debug mode on OSDs and MONs but didn't find the string 'choose_acting'.
Also, what I found is that our EC profile ...
Denis Polom
02:42 PM Bug #39150 (Resolved): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
Neha Ojha
02:41 PM Bug #50659 (Resolved): Segmentation fault under Pacific 16.2.1 when using a custom crush location...
Neha Ojha
02:39 PM Bug #53306 (Resolved): ceph -s mon quorum age negative number
Neha Ojha
02:38 PM Backport #55280 (Resolved): quincy: mon/OSDMonitor: properly set last_force_op_resend in stretch ...
Neha Ojha
02:37 PM Bug #53327 (Resolved): osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shut...
Neha Ojha
02:34 PM Backport #55632 (Resolved): quincy: ceph-osd takes all memory before oom on boot
Neha Ojha
12:52 PM Bug #55435 (Fix Under Review): mon/Elector: notify_ranked_removed() does not properly erase dead_...
Kamoltat (Junior) Sirivadhna
05:18 AM Bug #55798 (Fix Under Review): scrub starts message missing in cluster log
Prashant D
05:15 AM Bug #55798 (Pending Backport): scrub starts message missing in cluster log
We used to log "scrub starts" and "deep-scrub starts" messages when the scrub/deep-scrub process started for the pg... Prashant D

05/30/2022

01:27 PM Bug #55773 (Fix Under Review): Assertion failure (ceph_assert(have_pending)) when creating new OS...
Sridhar Seshasayee
01:16 PM Backport #55309 (In Progress): pacific: prometheus metrics shows incorrect ceph version for upgra...
Prashant D
01:13 PM Backport #55308 (In Progress): pacific: Manager is failing to keep updated metadata in daemon_sta...
Prashant D
12:24 PM Bug #55794 (Duplicate): scrub: scrub is not prevented from started while snap-trimming is in prog...
The scrub code only tests the target PG for 'active' & 'clean', and snap-trimming PGs are 'clean'.
For example:
http...
Ronen Friedman
09:25 AM Backport #55792 (Rejected): octopus: CEPH Graylog Logging Missing "host" Field
Konstantin Shalygin
09:25 AM Backport #55791 (Rejected): pacific: CEPH Graylog Logging Missing "host" Field
Konstantin Shalygin

05/27/2022

10:29 PM Bug #55787 (New): mon/crush_ops.sh: Error ENOENT: item osd.7 does not exist
Found in an Octopus teuthology run:
/a/yuriw-2022-05-14_14:30:10-rados-wip-yuri5-testing-2022-05-13-1402-octopus-d...
Laura Flores
03:59 PM Bug #55383 (Resolved): monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
03:59 PM Backport #55742 (Resolved): quincy: monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
03:50 PM Backport #55742: quincy: monitor cluster logs(ceph.log) appear empty until rotated
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46374
merged
Yuri Weinstein
03:35 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2022-05-26_23:23:48-rados-wip-yuri2-testing-2022-05-26-1430-quincy-distro-default-smithi/6849426$... Laura Flores

05/26/2022

10:56 PM Bug #55776 (New): octopus: map exx had wrong cluster addr
Description: rados/objectstore/{backends/ceph_objectstore_tool supported-random-distro$/{ubuntu_18.04}}
/a/yuriw-2...
Laura Flores
10:33 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
/a/yuriw-2022-05-13_14:13:55-rados-wip-yuri3-testing-2022-05-12-1609-octopus-distro-default-smithi/6832544
Descrip...
Laura Flores
05:06 PM Bug #55773: Assertion failure (ceph_assert(have_pending)) when creating new OSDs during OSD deplo...
+*ANALYSIS*+
Note that the analysis is for the first crash when the leader was: mon.f25-h23-000-r730xd.rdu2.scalel...
Sridhar Seshasayee
04:54 PM Bug #55773 (Resolved): Assertion failure (ceph_assert(have_pending)) when creating new OSDs durin...
See https://bugzilla.redhat.com/show_bug.cgi?id=2086419 for more details.
+*Assertion Failure*+...
Sridhar Seshasayee
08:52 AM Bug #55355: osd thread deadlock
I think this may be a problem with ProtocolV2... jianwei zhang
01:56 AM Bug #55750: mon: slow request of very long time
https://github.com/ceph/ceph/pull/41516
https://github.com/ceph/ceph/commit/a124ee85b03e15f4ea371358008ecac65f9f4e50...
yite gu
 
