Activity
From 04/15/2021 to 05/14/2021
05/14/2021
- 09:58 PM Bug #50692 (Resolved): nautilus: ERROR: test_rados.TestIoctx.test_service_daemon
- 09:56 PM Bug #50746: osd: terminate called after throwing an instance of 'std::out_of_range'
- I ran the same command "MDS=3 OSD=3 MON=3 MGR=1 ../src/vstart.sh -n -X -G --msgr1 --memstore" and everything works fi...
- 09:43 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Do the OSDs hitting this assert come up fine on restarting? or are they repeatedly hitting this assert?
- 01:34 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Just purely based on the numbering of OSDs I know for a fact that osd.47 was upgraded before osd.59 so based on that ...
- 08:51 PM Bug #50775: mds and osd unable to obtain rotating service keys
- Hi Song,
Could you please confirm the ceph version with the output of "ceph-mds --version"?
- 08:33 PM Bug #50657 (In Progress): smart query on monitors
- Hi Jan-Philipp,
Thanks for reporting this.
Can you please provide the output of `df` on the host where a monito...
- 05:46 PM Backport #50703: octopus: Data loss propagation after backfill
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41237
merged
- 05:43 PM Backport #49993: octopus: unittest_mempool.check_shard_select failed
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39978
merged
- 05:43 PM Backport #49053: octopus: pick_a_shard() always select shard 0
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39978
merged
- 04:42 PM Bug #47380 (Resolved): mon: slow ops due to osd_failure
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:47 PM Backport #49919 (Resolved): nautilus: mon: slow ops due to osd_failure
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41213
m...
- 02:02 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- ...
- 07:40 AM Bug #50813 (Duplicate): mon/OSDMonitor: should clear new flag when do destroy
- the new flag in the osdmap affects whether the osd is marked up, according to the option mon_osd_auto_mark_new_in. So it is safer...
- 01:41 AM Bug #50806 (Resolved): osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recovery_state.get_pg_lo...
- ...
05/13/2021
- 11:30 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- The crashed OSD was running 15.2.11 but do you happen to know what version osd.59(the primary for pg 6.7a) was runnin...
- 03:59 PM Bug #50692 (Fix Under Review): nautilus: ERROR: test_rados.TestIoctx.test_service_daemon
- 03:15 PM Bug #50761: ceph mon hangs forever while trying to parse config
- removed logs since they are unrelated
- 08:08 AM Backport #50793 (In Progress): octopus: osd: FAILED ceph_assert(recovering.count(*i)) after non-p...
- 07:05 AM Backport #50793 (Resolved): octopus: osd: FAILED ceph_assert(recovering.count(*i)) after non-prim...
- https://github.com/ceph/ceph/pull/41321
- 08:07 AM Backport #50794 (In Progress): pacific: osd: FAILED ceph_assert(recovering.count(*i)) after non-p...
- 07:05 AM Backport #50794 (Resolved): pacific: osd: FAILED ceph_assert(recovering.count(*i)) after non-prim...
- https://github.com/ceph/ceph/pull/41320
- 07:12 AM Backport #50792 (In Progress): nautilus: osd: FAILED ceph_assert(recovering.count(*i)) after non-...
- The backport is included in https://github.com/ceph/ceph/pull/41293
- 07:05 AM Backport #50792 (Rejected): nautilus: osd: FAILED ceph_assert(recovering.count(*i)) after non-pri...
- 07:05 AM Backport #50797 (Resolved): pacific: mon: spawn loop after mon reinstalled
- https://github.com/ceph/ceph/pull/41768
- 07:05 AM Backport #50796 (Resolved): octopus: mon: spawn loop after mon reinstalled
- https://github.com/ceph/ceph/pull/41621
- 07:05 AM Backport #50795 (Resolved): nautilus: mon: spawn loop after mon reinstalled
- https://github.com/ceph/ceph/pull/41762
- 07:02 AM Bug #50351 (Pending Backport): osd: FAILED ceph_assert(recovering.count(*i)) after non-primary os...
- 07:01 AM Bug #50230 (Pending Backport): mon: spawn loop after mon reinstalled
- 06:55 AM Backport #50791 (Resolved): pacific: osd: write_trunc omitted to clear data digest
- https://github.com/ceph/ceph/pull/42019
- 06:55 AM Backport #50790 (Resolved): octopus: osd: write_trunc omitted to clear data digest
- https://github.com/ceph/ceph/pull/41620
- 06:55 AM Backport #50789 (Rejected): nautilus: osd: write_trunc omitted to clear data digest
- 06:54 AM Bug #50763 (Pending Backport): osd: write_trunc omitted to clear data digest
05/12/2021
- 10:44 PM Bug #49688: FAILED ceph_assert(is_primary()) in submit_log_entries during PromoteManifestCallback...
- Myoungwon Oh, any idea what could be causing this? Feel free to unassign, if you are not aware of what is causing thi...
- 10:39 PM Bug #49688: FAILED ceph_assert(is_primary()) in submit_log_entries during PromoteManifestCallback...
- /a/yuriw-2021-05-11_19:33:39-rados-wip-yuri2-testing-2021-05-11-1032-pacific-distro-basic-smithi/6110085
- 06:02 PM Bug #50761: ceph mon hangs forever while trying to parse config
- Ilya Dryomov wrote:
> Are you sure that client.admin.4638.log was generated by your hello-world binary? Because the...
- 05:15 PM Bug #50761: ceph mon hangs forever while trying to parse config
- Are you sure that client.admin.4638.log was generated by your hello-world binary? Because the complete log attached ...
- 04:38 PM Bug #50761: ceph mon hangs forever while trying to parse config
- logs: https://drive.google.com/file/d/1RdLToyo3vpL3nFMI2U3hfGrY4tpfQ9Az/view?usp=sharing
- 04:03 PM Bug #50761: ceph mon hangs forever while trying to parse config
- Ilya Dryomov wrote:
> Where did client.admin.4803.log come from? How was it captured?
I ran a hello world script:
<...
- 03:51 PM Bug #50761: ceph mon hangs forever while trying to parse config
- Where did client.admin.4803.log come from? How was it captured?
- 02:24 PM Bug #50775: mds and osd unable to obtain rotating service keys
- I will reproduce, fix, and verify this bug, then send the bugfix for code review.
- 02:23 PM Bug #50775 (Fix Under Review): mds and osd unable to obtain rotating service keys
- version-15.2.0
error message:
2021-05-04T05:51:54.719+0800 7f105b2737c0 -1 mds.c unable to obtain rotating serv...
- 02:19 PM Bug #50384 (Fix Under Review): pacific ceph-mon: mon initial failed on aarch64
- 12:09 PM Backport #50666 (Resolved): pacific: upgrade:nautilus-x-pacific: LibRadosService.StatusFormat fai...
- 12:09 PM Bug #50595 (Resolved): upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- 07:06 AM Bug #50747 (Fix Under Review): nautilus: osd: backfill_unfound state reset to clean after osd res...
- 05:46 AM Bug #50763 (Fix Under Review): osd: write_trunc omitted to clear data digest
- 02:20 AM Bug #50763: osd: write_trunc omitted to clear data digest
- https://github.com/ceph/ceph/pull/41290
- 02:13 AM Bug #50763 (Resolved): osd: write_trunc omitted to clear data digest
05/11/2021
- 07:43 PM Backport #50666 (In Progress): pacific: upgrade:nautilus-x-pacific: LibRadosService.StatusFormat ...
- 04:32 PM Bug #50692: nautilus: ERROR: test_rados.TestIoctx.test_service_daemon
- /a/yuriw-2021-05-11_14:36:21-rados-wip-yuri2-testing-2021-05-10-1557-nautilus-distro-basic-smithi/6109477
- 03:37 PM Bug #50761 (New): ceph mon hangs forever while trying to parse config
- ...
- 08:58 AM Bug #50004 (Resolved): mon: Modify Paxos trim logic to be more efficient
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:57 AM Bug #50395 (Resolved): filestore: ENODATA error after directory split confuses transaction
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:52 AM Backport #50125 (Resolved): nautilus: mon: Modify Paxos trim logic to be more efficient
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41099
m...
- 08:52 AM Backport #50506 (Resolved): nautilus: mon/MonClient: reset authenticate_err in _reopen_session()
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41016
m...
- 08:52 AM Backport #50481 (Resolved): nautilus: filestore: ENODATA error after directory split confuses tra...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40987
m...
- 08:50 AM Backport #50504 (Resolved): octopus: mon/MonClient: reset authenticate_err in _reopen_session()
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41017
m...
- 08:50 AM Backport #50479 (Resolved): octopus: filestore: ENODATA error after directory split confuses tran...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40988
m...
- 07:48 AM Backport #49918 (Resolved): pacific: mon: slow ops due to osd_failure
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41090
m...
- 07:45 AM Backport #50750 (Resolved): octopus: max_misplaced was replaced by target_max_misplaced_ratio
- https://github.com/ceph/ceph/pull/41624
- 07:45 AM Backport #50749 (Rejected): nautilus: max_misplaced was replaced by target_max_misplaced_ratio
- 07:45 AM Backport #50748 (Resolved): pacific: max_misplaced was replaced by target_max_misplaced_ratio
- https://github.com/ceph/ceph/pull/42250
- 07:41 AM Bug #50745 (Pending Backport): max_misplaced was replaced by target_max_misplaced_ratio
- It would be great if we could backport https://github.com/ceph/ceph/pull/41207 along with the https://github.com/ceph...
- 04:22 AM Bug #50745 (Resolved): max_misplaced was replaced by target_max_misplaced_ratio
- but the document was not sync'ed.
- 07:07 AM Bug #50747 (Fix Under Review): nautilus: osd: backfill_unfound state reset to clean after osd res...
- On nautilus we have been observing an issue when an EC pg is in active+backfill_unfound+degraded state (which happens...
- 07:00 AM Bug #50351 (Fix Under Review): osd: FAILED ceph_assert(recovering.count(*i)) after non-primary os...
- In the mailing list thread [1] I provided some details why I think the current behaviour of `PrimaryLogPG::on_failed_...
- 04:38 AM Bug #50746 (New): osd: terminate called after throwing an instance of 'std::out_of_range'
- ...
- 02:30 AM Bug #50743 (Need More Info): *: crash in pthread_getname_np
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=8032fa5f1f2107af12b68e6f...
05/10/2021
- 05:07 PM Bug #50681: memstore: apparent memory leak when removing objects
- How long did you wait to see if memory usage dropped? Did you look at any logs or dump any pool object info?
I rea...
- 07:35 AM Bug #46670: refuse to remove mon from the monmap if the mon is in quorum
- I still believe the "extra" security is important; we do this for pools, and mons are almost equally critical...
- 04:03 AM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
- I have encountered this 4 more times on a 20-OSD cluster now running 16.2.3. If needed, I can provide more info.
05/09/2021
- 05:17 PM Bug #45690: pg_interval_t::check_new_interval is overly generous about guessing when EC PGs could...
- This description doesn't seem quite right to me -- OSDs 1-3 were part of the interval in step 4 so they know that not...
- 06:09 AM Bug #45390 (Closed): FreeBSD: osdmap decode and encode does not give the same OSDMap
- I assume this is fixed by now since the FreeBSD port is under active development? :)
- 05:45 AM Bug #46670: refuse to remove mon from the monmap if the mon is in quorum
- I'm inclined to say that this is fine as-is? I don't off-hand know how we remove monitors from quorum from the Ceph CLI.
- 05:42 AM Bug #46876 (Resolved): osd/ECBackend: optimize remaining read as readop contain multiple objects
- 05:08 AM Feature #47666: Ceph pool history
- Much of this is also maintained in the audit log, but that's not easily digestible back in by Ceph.
- 03:42 AM Feature #48151 (Closed): osd: allow remote read by calling cls method from within cls context
- Remote calls like this are unfortunately not plausible to implement within the object handling workflow.
- 03:19 AM Support #48530 (Closed): ceph pg status in incomplete.
- This kind of question is best served on the ceph-users@ceph.io mailing list if you can't find the answer in the docum...
05/08/2021
- 11:04 PM Bug #49158: doc: ceph-monstore-tools might create wrong monitor store
- This problem was fixed in the following PR.
https://github.com/ceph/ceph/pull/39288
- 02:49 PM Bug #48468: ceph-osd crash before being up again
- Hi Sage,
Hum, I've finally managed to recover my cluster after countless osd restart procedures, until they star...
- 11:21 AM Backport #50701 (In Progress): nautilus: Data loss propagation after backfill
- 08:40 AM Backport #50701 (Resolved): nautilus: Data loss propagation after backfill
- https://github.com/ceph/ceph/pull/41238
- 11:20 AM Backport #50703 (In Progress): octopus: Data loss propagation after backfill
- 08:40 AM Backport #50703 (Resolved): octopus: Data loss propagation after backfill
- https://github.com/ceph/ceph/pull/41237
- 11:19 AM Backport #50702 (In Progress): pacific: Data loss propagation after backfill
- 08:40 AM Backport #50702 (Resolved): pacific: Data loss propagation after backfill
- https://github.com/ceph/ceph/pull/41236
- 09:40 AM Backport #50706 (Resolved): pacific: _delete_some additional unexpected onode list
- https://github.com/ceph/ceph/pull/41680
- 09:40 AM Backport #50705 (Resolved): octopus: _delete_some additional unexpected onode list
- https://github.com/ceph/ceph/pull/41623
- 09:40 AM Backport #50704 (Resolved): nautilus: _delete_some additional unexpected onode list
- https://github.com/ceph/ceph/pull/41682
- 09:37 AM Bug #50466 (Pending Backport): _delete_some additional unexpected onode list
- 08:37 AM Bug #50558 (Pending Backport): Data loss propagation after backfill
- 08:30 AM Backport #50697 (Resolved): pacific: common: the dump of thread IDs is in dec instead of hex
- https://github.com/ceph/ceph/pull/53465
- 08:29 AM Bug #50653 (Pending Backport): common: the dump of thread IDs is in dec instead of hex
- 03:26 AM Feature #49089 (In Progress): msg: add new func support_reencode
- 03:24 AM Support #49268 (Closed): Blocked IOs up to 30 seconds when host powered down
- You can also tune how quickly the OSDs report their peers down from missing heartbeats, but in general losing a monit...
- 03:17 AM Support #49489: Getting Long heartbeat and slow requests on ceph luminous 12.2.13
- This is almost certainly a result of cache tiering (which we generally discourage from use) being a bad fit or incorr...
05/07/2021
- 11:39 PM Bug #50659: Segmentation fault under Pacific 16.2.1 when using a custom crush location hook
- I have attached a coredump. This hook works fine in 15.2.9. I can also run it fine manually from inside a launched OS...
- 09:57 PM Bug #50659 (Need More Info): Segmentation fault under Pacific 16.2.1 when using a custom crush lo...
- Is it possible for you to capture a coredump? Did the same crush_location_hook work fine on your 15.2.9 cluster?
- 10:12 PM Bug #50637: OSD slow ops warning stuck after OSD fail
- This sounds like a bug; we shouldn't be accounting for down+out osds when counting slow ops.
- 08:57 AM Bug #50637: OSD slow ops warning stuck after OSD fail
- I now zapped and re-created the OSDs on this disk. As expected, purging OSD 580 from the cluster cleared the health w...
- 10:01 PM Bug #50657: smart query on monitors
- Yaarit, can you help take a look at this?
- 09:45 PM Bug #50682: Pacific - OSD not starting after upgrade
- This issue has been fixed by https://github.com/ceph/ceph/pull/40845 and will be released in the next pacific point r...
- 07:48 PM Bug #47949: scrub/osd-scrub-repair.sh: TEST_auto_repair_bluestore_scrub: return 1
- /a/yuriw-2021-05-06_15:20:22-rados-wip-yuri4-testing-2021-05-05-1236-nautilus-distro-basic-smithi/6101282
- 07:47 PM Bug #50692 (Resolved): nautilus: ERROR: test_rados.TestIoctx.test_service_daemon
- ...
- 02:25 PM Bug #50688 (Duplicate): Ceph can't be deployed using cephadm on nodes with /32 ip addresses
- *Preamble*
In certain data centers it is common to assign a /32 ip address to a node and let bgp handle the reacha...
- 12:34 PM Bug #50681: memstore: apparent memory leak when removing objects
- Thanks Greg for your answer. So my expectation was that, at least when there is memory pressure or I am unmounting th...
- 09:16 AM Bug #50683: [RBD] master - cluster [WRN] Health check failed: mon is allowing insecure global_id ...
- Hi Harish,...
- 09:04 AM Bug #50683 (Rejected): [RBD] master - cluster [WRN] Health check failed: mon is allowing insecure...
- Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e826082...
- 09:06 AM Bug #49231: MONs unresponsive over extended periods of time
- After running for a few months with the modified setting, it seems that it fixes the issue. I still see CPU load- and...
- 03:10 AM Backport #49919 (In Progress): nautilus: mon: slow ops due to osd_failure
- 02:13 AM Bug #50245: TEST_recovery_scrub_2: Not enough recovery started simultaneously
- /a/nojha-2021-05-06_22:58:00-rados-wip-default-mclock-2021-05-06-distro-basic-smithi/6102970
- 01:22 AM Bug #50162 (Won't Fix): Backport to Natilus of automatic lowering min_size for repairing tasks (o...
- Nathan Cutler wrote:
> This needs a pull request ID, or a list of master commits that are requested to be backported...
05/06/2021
- 11:48 PM Bug #50682 (New): Pacific - OSD not starting after upgrade
- Copied from https://tracker.ceph.com/issues/50169
Using Ubuntu 20.04, non-cephadm, packages from ceph repositorie...
- 09:49 PM Bug #50681: memstore: apparent memory leak when removing objects
- I’m not totally clear on what you’re doing here and what you think the erroneous behavior is. Memstore only stores da...
- 07:05 PM Bug #50681: memstore: apparent memory leak when removing objects
- The title should say "osd objectstore = memstore"
- 06:31 PM Bug #50681 (New): memstore: apparent memory leak when removing objects
- When I create and unlink big files like in this[1] little program in my development environment, the OSD daemon keeps...
- 02:26 AM Bug #50558: Data loss propagation after backfill
- For the record, the following is the sequence of the data loss propagation when readdir error happens on filestore du...
05/05/2021
- 09:38 PM Bug #49809: 1 out of 3 mon crashed in MonitorDBStore::get_synchronizer
- Hi Christian,
No, unfortunately I hit a dead end on this as the log message issue was a red herring.
I'm afraid...
- 02:36 PM Bug #49809: 1 out of 3 mon crashed in MonitorDBStore::get_synchronizer
- Brad were you able to find out more about the root cause of this crash?
- 09:09 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- Possibly related to 40119
- 06:05 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- /a/sage-2021-05-05_15:58:13-rados-wip-sage-testing-2021-05-04-1814-distro-basic-smithi/6099487
- 07:26 PM Backport #49918: pacific: mon: slow ops due to osd_failure
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41090
merged
- 06:10 PM Backport #50666 (Resolved): pacific: upgrade:nautilus-x-pacific: LibRadosService.StatusFormat fai...
- https://github.com/ceph/ceph/pull/41182
- 06:09 PM Bug #50595 (Pending Backport): upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- 04:26 PM Backport #50504: octopus: mon/MonClient: reset authenticate_err in _reopen_session()
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41017
merged
- 04:25 PM Backport #50479: octopus: filestore: ENODATA error after directory split confuses transaction
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40988
merged
- 03:33 PM Bug #17257 (Can't reproduce): ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
- 02:29 PM Bug #50659: Segmentation fault under Pacific 16.2.1 when using a custom crush location hook
- I forgot to add that I tried to diff code I thought was relevant between tags v15.2.9 and v16.2.1 and thought I saw s...
- 02:21 PM Bug #50659 (Resolved): Segmentation fault under Pacific 16.2.1 when using a custom crush location...
- I feel like if this wasn't somehow just my problem, there'd be an issue open on it already, but I'm not seeing one, a...
- 01:58 PM Bug #50658 (New): TEST_backfill_pool_priority fails
- ...
- 12:39 PM Bug #50657 (Resolved): smart query on monitors
- Since the upgrade to Pacific, our manager queries each daemon for smart statistics.
This is fine on the OSDs (at l...
- 10:51 AM Bug #49962 (Fix Under Review): 'sudo ceph --cluster ceph osd crush tunables default' fails due to...
- https://github.com/ceph/ceph/pull/41169
- 03:56 AM Bug #40119: api_tier_pp hung causing a dead job
- /a/bhubbard-2021-04-26_22:38:21-rados-master-distro-basic-smithi/6075940
In this instance the slow requests are on...
05/04/2021
- 07:28 PM Bug #50647: common: the fault handling becomes inoperational when multiple faults happen the same...
- Just to the record: https://gist.github.com/rzarzynski/eb21e48a4458b593912eccd50ab8da46.
- 07:09 PM Bug #50647 (Fix Under Review): common: the fault handling becomes inoperational when multiple fau...
- https://github.com/ceph/ceph/pull/41154
- 02:42 PM Bug #50647 (Fix Under Review): common: the fault handling becomes inoperational when multiple fau...
- The problem arises due to installing the fault handlers with the flag @SA_RESETHAND@. It instructs the kernel to rest...
- 07:26 PM Bug #50653 (Fix Under Review): common: the dump of thread IDs is in dec instead of hex
- https://github.com/ceph/ceph/pull/41155
- 07:08 PM Bug #50653 (Resolved): common: the dump of thread IDs is in dec instead of hex
- It's a fallout from 5b8274f09951c7f36eb1ca1a234e7c8a08c30c9c.
- 04:39 PM Backport #50125: nautilus: mon: Modify Paxos trim logic to be more efficient
- https://github.com/ceph/ceph/pull/41099 merged
- 04:38 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- https://github.com/ceph/ceph/pull/41098 merged
- 04:37 PM Backport #50506: nautilus: mon/MonClient: reset authenticate_err in _reopen_session()
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41016
merged
- 04:02 PM Bug #50595 (Fix Under Review): upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- ...
- 03:31 PM Backport #50481: nautilus: filestore: ENODATA error after directory split confuses transaction
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40987
merged
- 03:16 PM Bug #50648 (New): nautilus: ceph status check times out
- ...
- 07:47 AM Bug #50637 (Duplicate): OSD slow ops warning stuck after OSD fail
- We had a disk fail with 2 OSDs deployed on it, ids=580, 581. Since then, the health warning @430 slow ops, oldest one...
05/03/2021
- 11:05 PM Backport #50344: pacific: mon: stretch state is inconsistently-maintained on peons, preventing pr...
- https://github.com/ceph/ceph/pull/41130
- 10:59 PM Backport #50087 (In Progress): pacific: test_mon_pg: mon fails to join quorum to due election str...
- 09:57 PM Backport #50087: pacific: test_mon_pg: mon fails to join quorum to due election strategy mismatch
- https://github.com/ceph/ceph/pull/40484
- 08:53 PM Bug #47719 (Resolved): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:53 PM Bug #48946 (Resolved): Disable and re-enable clog_to_monitors could trigger assertion
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:52 PM Bug #49392 (Resolved): osd ok-to-stop too conservative
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:50 PM Backport #49640 (Resolved): nautilus: Disable and re-enable clog_to_monitors could trigger assertion
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39912
m...
- 08:49 PM Backport #49567 (Resolved): nautilus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40697
m...
- 08:47 PM Backport #50130: nautilus: monmaptool --create --add nodeA --clobber monmap aborts in entity_addr...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40700
m...
- 08:47 PM Backport #49531 (Resolved): nautilus: osd ok-to-stop too conservative
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40676
m...
- 08:46 PM Backport #50459: nautilus: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) mgr/...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40959
m...
- 07:17 PM Bug #48732: Marking OSDs out causes mon daemons to crash following tcmalloc: large alloc
- Wes Dillingham wrote:
> Hello Dan and Neha. Shortly after filing this bug I went on paternity leave but have returne...
- 07:04 PM Bug #48732: Marking OSDs out causes mon daemons to crash following tcmalloc: large alloc
- Hello Dan and Neha. Shortly after filing this bug I went on paternity leave but have returned today. I will try and a...
- 05:48 PM Bug #50595: upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- The upgrade test does not fail all the time.
upgrade:nautilus-x/parallel/{0-cluster/{openstack start} 1-ceph-install...
- 05:21 PM Bug #50595: upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- I think the upgrade test just needs to skip that test. It's just looking for a specific string that changed in pacifi...
- 04:31 PM Bug #50595 (Triaged): upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- This seems related to ab0d8f2ae9f551e15a4c7bacbf69161e91263785.
Reverting makes the issue go away http://pulpito.fro...
- 04:36 PM Bug #48997: rados/singleton/all/recovery-preemption: defer backfill|defer recovery not found in logs
- /a/yuriw-2021-04-30_13:45:05-rados-pacific-distro-basic-smithi/6086228
- 04:22 PM Backport #50129: octopus: monmaptool --create --add nodeA --clobber monmap aborts in entity_addr_...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40758
m...
- 04:21 PM Backport #49917: octopus: mon: slow ops due to osd_failure
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40558
m...
- 04:21 PM Backport #50123: octopus: mon: Modify Paxos trim logic to be more efficient
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40699
m...
- 04:21 PM Backport #49566: octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40756
m...
- 04:20 PM Backport #49816: octopus: mon: promote_standby does not update available_modules
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40757
m...
- 04:19 PM Backport #50457: octopus: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) mgr/d...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40958
m...
- 04:16 PM Bug #48336 (Resolved): monmaptool --create --add nodeA --clobber monmap aborts in entity_addr_t::...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 04:14 PM Bug #49778 (Resolved): mon: promote_standby does not update available_modules
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 04:04 PM Backport #50124 (Resolved): pacific: mon: Modify Paxos trim logic to be more efficient
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40691
m...
- 04:04 PM Backport #50480: pacific: filestore: ENODATA error after directory split confuses transaction
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40989
m...
- 04:04 PM Backport #50154 (Resolved): pacific: Reproduce https://tracker.ceph.com/issues/48417
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40759
m...
- 04:04 PM Backport #50131 (Resolved): pacific: monmaptool --create --add nodeA --clobber monmap aborts in e...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40690
m...
- 03:53 PM Backport #50458: pacific: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) mgr/d...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40957
m...
- 12:07 PM Bug #47299: Assertion in pg_missing_set: p->second.need <= v || p->second.is_delete()
- I had multiple OSDs die during an upgrade from Nautilus to Octopus with this trace. See the attached crash2.txt
- 12:05 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Longer debug_osd output in crash1.txt file....
- 12:04 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Longer debug_osd output in crash1.txt file.
-1> 2021-04-29T13:51:57.756+0200 7fa2edae2700 -1 /home/jenkins-b...
- 12:02 PM Bug #50608 (Need More Info): ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- This was happening on a cluster running Nautilus 14.2.18/14.2.19 during upgrading to Octopus 15.2.11 some 3-4 OSDs cr...
- 09:00 AM Backport #50606 (In Progress): pacific: osd/scheduler/mClockScheduler: Async reservers are not up...
- 08:12 AM Backport #50606 (Resolved): pacific: osd/scheduler/mClockScheduler: Async reservers are not updat...
- https://github.com/ceph/ceph/pull/41125
05/02/2021
- 01:44 PM Bug #50510: OSD will return -EAGAIN on balance_reads although it can return the data
- Neha Ojha wrote:
> Can you please provide osd logs from the primary and replica with debug_osd=20 and debug_ms=1? Th...
05/01/2021
- 09:54 PM Support #49847 (Closed): OSD Fails to init after upgrading to octopus: _deferred_replay failed to...
- 12:12 AM Bug #50595: upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- ...
04/30/2021
- 09:53 PM Bug #50420 (Need More Info): all osd down after mon scrub too long
- Can you provide us with the cluster log from this time? How large is your mon db?
- 09:21 PM Bug #50422: Error: finished tid 1 when last_acked_tid was 2
- Looks like a cache tiering+short pg log bug...
- 08:29 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- Dan van der Ster wrote:
> Leaving this open to address the msgr2 abort. Presumably this is caused by the >4GB messag...
- 06:36 AM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- Josh Durgin wrote:
> Ah good catch Dan, that loop does appear to be generating millions of '=' in nautilus. Sounds l...
- 06:07 AM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- Leaving this open to address the msgr2 abort. Presumably this is caused by the >4GB message generated to respond to `...
- 01:08 AM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- Ah good catch Dan, that loop does appear to be generating millions of '=' in nautilus. Sounds like we need to fix tha...
- 07:38 PM Bug #50595 (Resolved): upgrade:nautilus-x-pacific: LibRadosService.StatusFormat failure
- Seems to be ubuntu 18.04 specific
Run: https://pulpito.ceph.com/yuriw-2021-04-29_16:10:26-upgrade:nautilus-x-pacif...
- 06:56 PM Backport #49919: nautilus: mon: slow ops due to osd_failure
- Kefu, can you please help with a minimal backport for nautilus?
- 05:38 PM Bug #50501 (Pending Backport): osd/scheduler/mClockScheduler: Async reservers are not updated wit...
- 04:27 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2021-04-30_12:58:14-rados-wip-yuri2-testing-2021-04-29-1501-pacific-distro-basic-smithi/6086154
- 04:25 PM Bug #50042: rados/test.sh: api_watch_notify failures
- ...
- 09:21 AM Backport #50125: nautilus: mon: Modify Paxos trim logic to be more efficient
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/41099
ceph-backport.sh versi...
04/29/2021
- 11:16 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- commit 5f95ec4457059889bc4dbc2ad25cdc0537255f69 removed that loop in Monitor.cc but wasn't backported to nautilus.
...
- 10:39 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- Something else weird in the mgr log: can negative progress events break the mon ?...
- 10:26 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- Here are some notes on the timelines between various actors at the start of the incident:
mon.cephbeesly-mon-2a00f...
- 07:14 PM Bug #50587 (Resolved): mon election storm following osd recreation: huge tcmalloc and ceph::msgr:...
- We recreated an osd and seconds later our mons started using 100% CPU and going into an election storm which lasted n...
- 08:55 PM Bug #48732: Marking OSDs out causes mon daemons to crash following tcmalloc: large alloc
- Wes, did you ever find out more about the root cause of this? We saw something similar today in #50587
- 07:51 PM Bug #50510 (Need More Info): OSD will return -EAGAIN on balance_reads although it can return the ...
- Can you please provide osd logs from the primary and replica with debug_osd=20 and debug_ms=1? That will help us unde...
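For reference, a sketch of how those debug levels can be raised: either in ceph.conf on the hosts running the primary and replica OSDs (restart required), or at runtime with `ceph tell osd.N config set debug_osd 20/20` for each OSD in the acting set (osd IDs here are placeholders):

```
[osd]
    debug osd = 20/20
    debug ms = 1/1
```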
- 05:41 PM Backport #49918 (In Progress): pacific: mon: slow ops due to osd_failure
- 05:06 PM Backport #50480 (Resolved): pacific: filestore: ENODATA error after directory split confuses tran...
- 12:05 PM Bug #50558 (Fix Under Review): Data loss propagation after backfill
- 11:25 AM Bug #50558: Data loss propagation after backfill
- Hi
I worked with hase-san and submitted a PR to handle the readdir error correctly in the filestore code: https://github.com...
- 10:38 AM Fix #50574: qa/standalone: Modify/re-write failing standalone tests with mclock scheduler
- Standalone failures observed here:
https://pulpito.ceph.com/sseshasa-2021-04-23_15:37:51-rados-wip-mclock-max-backfi...
- 05:43 AM Fix #50574 (Resolved): qa/standalone: Modify/re-write failing standalone tests with mclock scheduler
- A subset of the existing qa/standalone tests is failing with osd_op_queue set to "mclock_scheduler".
This is mainl...
- 10:19 AM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
- We ran into the same assert over and over on one OSD. We were upgrading from luminous to nautilus, ceph version 14.2.2...
04/28/2021
- 12:17 PM Bug #50558 (Resolved): Data loss propagation after backfill
- Situation:
An OSD's data loss was propagated to other OSDs. If backfill is performed when a shard is missing in a p...
04/27/2021
- 09:14 PM Backport #50130 (Resolved): nautilus: monmaptool --create --add nodeA --clobber monmap aborts in ...
- 04:48 PM Backport #50130: nautilus: monmaptool --create --add nodeA --clobber monmap aborts in entity_addr...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40700
merged - 09:06 PM Backport #49567: nautilus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40697
merged - 04:47 PM Backport #49531: nautilus: osd ok-to-stop too conservative
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40676
merged - 12:37 PM Bug #50536 (New): "Command failed (workunit test rados/test.sh)" - rados/test.sh times out on mas...
- /a/sseshasa-2021-04-23_18:11:53-rados-wip-sseshasa-testing-2021-04-23-2212-distro-basic-smithi/6068991
Noticed a t... - 11:23 AM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
- Observed the same failure here:
/a/sseshasa-2021-04-23_18:11:53-rados-wip-sseshasa-testing-2021-04-23-2212-distro-ba... - 06:08 AM Backport #50129 (Resolved): octopus: monmaptool --create --add nodeA --clobber monmap aborts in e...
04/26/2021
- 09:33 PM Backport #50124: pacific: mon: Modify Paxos trim logic to be more efficient
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40691
merged - 09:30 PM Backport #50480: pacific: filestore: ENODATA error after directory split confuses transaction
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40989
merged - 09:29 PM Backport #50154: pacific: Reproduce https://tracker.ceph.com/issues/48417
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40759
merged - 09:27 PM Backport #50131: pacific: monmaptool --create --add nodeA --clobber monmap aborts in entity_addr_...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40690
merged - 06:32 PM Bug #50512: upgrade:nautilus-p2p-nautilus: unhandled event in ToDelete
- ...
- 09:54 AM Bug #50351: osd: FAILED ceph_assert(recovering.count(*i)) after non-primary osd restart when in b...
- `PrimaryLogPG::on_failed_pull` [1] looks suspicious to me. We remove the oid from `backfills_in_flight` here only if ...
04/25/2021
- 11:41 PM Bug #50512 (Won't Fix - EOL): upgrade:nautilus-p2p-nautilus: unhandled event in ToDelete
- Run: https://pulpito.ceph.com/teuthology-2021-04-22_01:25:03-upgrade:nautilus-p2p-nautilus-distro-basic-smithi/
Job:...
- 05:59 PM Bug #50510 (Need More Info): OSD will return -EAGAIN on balance_reads although it can return the ...
- PrimaryLogPG.cc:
if (!is_primary()) {
  if (!recovery_state.can_serve_replica_read(oid)) {
    dout(20) <...
- 02:23 PM Bug #50508: ceph-mon crash when create pool
- Hi, the ceph-mon process crashed when I created a pool. ...
- 02:18 PM Bug #50508: ceph-mon crash when create pool
- [root@arsenal-ceph-test-167 ~]# ceph crash ls
ID ENTIT...
- 02:15 PM Bug #50508 (New): ceph-mon crash when create pool
- 10:00 AM Backport #50505 (In Progress): pacific: mon/MonClient: reset authenticate_err in _reopen_session()
- 10:00 AM Backport #50504 (In Progress): octopus: mon/MonClient: reset authenticate_err in _reopen_session()
- 09:59 AM Backport #50506 (In Progress): nautilus: mon/MonClient: reset authenticate_err in _reopen_session()
- 02:52 AM Backport #49917 (Resolved): octopus: mon: slow ops due to osd_failure
- 02:51 AM Backport #50123 (Resolved): octopus: mon: Modify Paxos trim logic to be more efficient
- 02:49 AM Backport #49566 (Resolved): octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- 02:49 AM Backport #49816 (Resolved): octopus: mon: promote_standby does not update available_modules
04/24/2021
- 05:55 AM Backport #50506 (Resolved): nautilus: mon/MonClient: reset authenticate_err in _reopen_session()
- https://github.com/ceph/ceph/pull/41016
- 05:55 AM Backport #50505 (Resolved): pacific: mon/MonClient: reset authenticate_err in _reopen_session()
- https://github.com/ceph/ceph/pull/41019
- 05:55 AM Backport #50504 (Resolved): octopus: mon/MonClient: reset authenticate_err in _reopen_session()
- https://github.com/ceph/ceph/pull/41017
- 05:51 AM Bug #50477 (Pending Backport): mon/MonClient: reset authenticate_err in _reopen_session()
- 05:20 AM Bug #49961: scrub/osd-recovery-scrub.sh: TEST_recovery_scrub_1 failed
- /a/kchai-2021-04-24_04:07:09-rados-wip-kefu-testing-2021-04-23-2026-distro-basic-smithi/6070018
04/23/2021
- 02:40 PM Bug #43489: PG.cc: 953: FAILED assert(0 == "past_interval start interval mismatch")
- This has been observed a handful of times at AT&T over the last six months or so. I'm afraid I don't have logs, but t...
- 01:45 PM Bug #50501 (Fix Under Review): osd/scheduler/mClockScheduler: Async reservers are not updated wit...
- 01:06 PM Bug #50501 (Resolved): osd/scheduler/mClockScheduler: Async reservers are not updated with the ov...
- The local and remote Async reserver objects are not updated with the new overridden values as part of mClockScheduler...
- 08:58 AM Bug #50466: _delete_some additional unexpected onode list
- Konstantin Shalygin wrote:
> Actually, when PG objects deleted with sleep 1 (default) - NVMe is not loaded, but when...
- 08:52 AM Bug #50466: _delete_some additional unexpected onode list
- Actually, when PG objects deleted with sleep 1 (default) - NVMe is not loaded, but when pg header is deleted - is hug...
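The "sleep" discussed above is the OSD's deletion throttle, tunable via the osd_delete_sleep options (a sketch, assuming a Nautilus-era cluster; the value is seconds to sleep between removal transactions, and the 1-second figure comes from the comment above):

```
[osd]
    # seconds to sleep between PG object removal transactions
    osd delete sleep = 1
    # device-type-specific variants also exist, e.g. for SSD-backed OSDs:
    osd delete sleep ssd = 1
```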
04/22/2021
- 05:53 PM Bug #50466: _delete_some additional unexpected onode list
- https://github.com/ceph/ceph/pull/40993
- 04:09 PM Bug #49591: no active mgr (MGR_DOWN)" in cluster log
- /a/yuriw-2021-04-21_15:39:30-rados-wip-yuri5-testing-2021-04-20-0819-pacific-distro-basic-smithi/6061973/
- 03:24 PM Backport #50480 (In Progress): pacific: filestore: ENODATA error after directory split confuses t...
- 01:25 PM Backport #50480 (Resolved): pacific: filestore: ENODATA error after directory split confuses tran...
- https://github.com/ceph/ceph/pull/40989
- 03:23 PM Backport #50479 (In Progress): octopus: filestore: ENODATA error after directory split confuses t...
- 01:25 PM Backport #50479 (Resolved): octopus: filestore: ENODATA error after directory split confuses tran...
- https://github.com/ceph/ceph/pull/40988
- 03:20 PM Backport #50481 (In Progress): nautilus: filestore: ENODATA error after directory split confuses ...
- 01:25 PM Backport #50481 (Resolved): nautilus: filestore: ENODATA error after directory split confuses tra...
- https://github.com/ceph/ceph/pull/40987
- 01:25 PM Bug #50352 (Resolved): LibRadosTwoPoolsPP.ManifestSnapRefcount failure
- 01:20 PM Bug #50395 (Pending Backport): filestore: ENODATA error after directory split confuses transaction
- 10:36 AM Bug #50477 (Fix Under Review): mon/MonClient: reset authenticate_err in _reopen_session()
- 10:25 AM Bug #50477 (Resolved): mon/MonClient: reset authenticate_err in _reopen_session()
- Otherwise, if "mon host" list has at least one unqualified IP address without a port and both msgr1 and msgr2 are tur...
- 07:11 AM Bug #50245: TEST_recovery_scrub_2: Not enough recovery started simultaneously
- /a/kchai-2021-04-22_05:10:24-rados-wip-kefu-testing-2021-04-22-1017-distro-basic-smithi/6063735/
- 02:41 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
- /a/sage-2021-03-28_19:04:26-rados-wip-sage2-testing-2021-03-28-0933-pacific-distro-basic-smithi/6007274
- 02:40 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
- /a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049636...
- 02:40 AM Bug #50042: rados/test.sh: api_watch_notify failures
- The actual failure that caused the segfault for /a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049...
- 12:39 AM Bug #50042: rados/test.sh: api_watch_notify failures
- -Core from /a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049636 is the issue seen in https://trac...
- 02:31 AM Bug #50473: ceph_test_rados_api_lock_pp segfault in librados::v14_2_0::RadosClient::wait_for_osdm...
- I suspect the Rados object was deleted in another, now finished, thread while this thread was still using it. This is...
- 02:16 AM Bug #50473 (Can't reproduce): ceph_test_rados_api_lock_pp segfault in librados::v14_2_0::RadosCli...
- /a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049636...
04/21/2021
- 11:31 PM Bug #50466: _delete_some additional unexpected onode list
- I think #178:ec000000::::head# is just a pgmeta object which is skipped until all other objects are removed. Note how...
- 02:42 PM Bug #50466: _delete_some additional unexpected onode list
- osd log with debugging:
ceph-post-file: 09094430-abdb-4248-812c-47b7babae06c
- 02:39 PM Bug #50466 (Resolved): _delete_some additional unexpected onode list
- After updating to 14.2.19 and then moving some PGs around we have a few warnings related to the new efficient PG remo...
- 06:06 PM Bug #50468 (New): Simultaneous mon daemon crash on cephfs mount
- I have a newly installed 16.2.0 cluster consisting of 5 nodes deployed using cephadm on Ubuntu 18.04.5. I have create...
- 05:30 PM Backport #50457 (Resolved): octopus: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReq...
- 10:49 AM Backport #50457 (In Progress): octopus: ERROR: test_version (tasks.mgr.dashboard.test_api.Version...
- 10:44 AM Backport #50457 (Resolved): octopus: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReq...
- https://github.com/ceph/ceph/pull/40958
- 05:30 PM Bug #50374 (Resolved): ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) mgr/dash...
- 10:43 AM Bug #50374 (Pending Backport): ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) ...
- 05:29 PM Backport #50458 (Resolved): pacific: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReq...
- 10:48 AM Backport #50458 (In Progress): pacific: ERROR: test_version (tasks.mgr.dashboard.test_api.Version...
- 10:44 AM Backport #50458 (Resolved): pacific: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReq...
- https://github.com/ceph/ceph/pull/40957
- 01:56 PM Backport #50459 (Resolved): nautilus: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionRe...
- 10:52 AM Backport #50459 (In Progress): nautilus: ERROR: test_version (tasks.mgr.dashboard.test_api.Versio...
- 10:45 AM Backport #50459 (Resolved): nautilus: ERROR: test_version (tasks.mgr.dashboard.test_api.VersionRe...
- https://github.com/ceph/ceph/pull/40959
- 11:30 AM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
- Finally, the correct format.
The issue started on luminous and it looked like an instance of https://tracker.ceph.com...
- 11:20 AM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
- Sorry for the bad formatting.
The issue started on luminous and it looked like an instance of https://tracker.c...
- 11:17 AM Bug #50462 (Won't Fix - EOL): OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.co...
- The issue started on luminous and it looked like an instance of https://tracker.ceph.com/issues/23030, so we decide...
- 09:01 AM Backport #49991 (Resolved): nautilus: unittest_mempool.check_shard_select failed
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40567
m...
- 06:45 AM Bug #50042: rados/test.sh: api_watch_notify failures
- Neha Ojha wrote:
> Looks similar to https://tracker.ceph.com/issues/50042#note-2, feel free to create a separate tra... - 04:16 AM Bug #50042: rados/test.sh: api_watch_notify failures
- I believe the log snippet below shows a race. We have just called and sent the unwatch command to osd4 but just after...
- 03:05 AM Bug #50371 (Closed): Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
- This looks like an issue that only comes about because of the problem seen in https://tracker.ceph.com/issues/50042. ...
- 12:34 AM Bug #50446 (Fix Under Review): PGs always go into active+clean+scrubbing+deep+repair in the LRC
- 12:29 AM Bug #50446 (Triaged): PGs always go into active+clean+scrubbing+deep+repair in the LRC
- ...
04/20/2021
- 11:51 PM Bug #50446 (Pending Backport): PGs always go into active+clean+scrubbing+deep+repair in the LRC
- ...
- 10:08 PM Bug #50042: rados/test.sh: api_watch_notify failures
- Looking at the latest issue (ignoring the segfault which is being tracked in https://tracker.ceph.com/issues/50371) t...
- 03:00 PM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
- We are using two distinct conditions to decide whether a candidate PG is already being scrubbed. The OSD checks pg->i...
- 12:59 PM Bug #50441 (Rejected): cephadm bootstrap on arm64 fails to start ceph/ceph-grafana service
- Hello,
I installed a new Ceph 15.2.10 cluster on Ubuntu 20.04 arm64 bare metal starting with a first monitor/manag...
04/19/2021
- 11:02 PM Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors i...
- showed up in a pacific->master upgrade test...
- 02:49 PM Bug #50368 (Resolved): common/PriorityCache.cc: FAILED ceph_assert(mem_avail >= 0) in radosbench_...
- 12:29 PM Bug #50422 (New): Error: finished tid 1 when last_acked_tid was 2
- ...
- 12:26 PM Bug #50396: leak in PrimaryLogPG::inc_refcount_by_set
- /a/sage-2021-04-18_22:27:23-rados-wip-sage-testing-2021-04-18-1607-distro-basic-smithi/6056492
/a/sage-2021-04-18_22...
- 10:56 AM Bug #50393 (Resolved): CommandCrashedError: Command crashed: 'mkdir -p -- /home/ubuntu/cephtest/m...
- 10:07 AM Bug #50420 (Need More Info): all osd down after mon scrub too long
- Hi all.
My cluster has 5 mons. Everything is ok.
My ceph mon config ...
04/18/2021
- 01:16 AM Bug #50352: LibRadosTwoPoolsPP.ManifestSnapRefcount failure
- https://github.com/ceph/ceph/pull/40900
04/17/2021
- 12:59 AM Bug #50368 (Fix Under Review): common/PriorityCache.cc: FAILED ceph_assert(mem_avail >= 0) in rad...
- Neha Ojha wrote:
> Tests are passing with fdb4f834486, the commit before https://github.com/ceph/ceph/pull/40731 mer... - 12:04 AM Bug #50368 (Triaged): common/PriorityCache.cc: FAILED ceph_assert(mem_avail >= 0) in radosbench_o...
- Tests are passing with fdb4f834486, the commit before https://github.com/ceph/ceph/pull/40731 merged - https://pulpit...
04/16/2021
- 10:20 PM Backport #50406 (Resolved): pacific: mon: new monitors may direct MMonJoin to a peon instead of t...
- https://github.com/ceph/ceph/pull/41131
- 10:16 PM Bug #50345 (Pending Backport): mon: new monitors may direct MMonJoin to a peon instead of the leader
- 10:14 PM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
- Ronen, can you please take a look at this bug.
- 09:51 PM Bug #50396 (Duplicate): leak in PrimaryLogPG::inc_refcount_by_set
- 12:04 PM Bug #50396 (Duplicate): leak in PrimaryLogPG::inc_refcount_by_set
- ...
- 08:57 PM Bug #50368: common/PriorityCache.cc: FAILED ceph_assert(mem_avail >= 0) in radosbench_omap_write ...
- The tests are consistently failing in master - https://pulpito.ceph.com/nojha-2021-04-16_15:33:38-rados:perf-wip-5021...
- 03:23 PM Bug #50368: common/PriorityCache.cc: FAILED ceph_assert(mem_avail >= 0) in radosbench_omap_write ...
- rados/perf/{ceph mon_election/classic objectstore/bluestore-basic-min-osd-mem-target openstack scheduler/dmclock_1Sha...
- 02:33 PM Bug #50368: common/PriorityCache.cc: FAILED ceph_assert(mem_avail >= 0) in radosbench_omap_write ...
- rados/perf/{ceph mon_election/classic objectstore/bluestore-low-osd-mem-target openstack scheduler/dmclock_default_sh...
- 08:13 PM Bug #50398 (Duplicate): cls_cas.dup_get fails with ENOENT
- 12:10 PM Bug #50398 (Duplicate): cls_cas.dup_get fails with ENOENT
- ...
- 08:05 PM Bug #50404 (New): qa/workunits/mon/crush_ops.sh: Error ENOENT: no weight-set for pool
- ...
- 05:43 PM Bug #50042: rados/test.sh: api_watch_notify failures
- Looks similar to https://tracker.ceph.com/issues/50042#note-2, feel free to create a separate tracker, Brad....
- 05:39 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
- /a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049676
- 02:31 PM Bug #50397 (Duplicate): src/common/PriorityCache.cc: 301: FAILED ceph_assert(mem_avail >= 0)
- 12:06 PM Bug #50397 (Duplicate): src/common/PriorityCache.cc: 301: FAILED ceph_assert(mem_avail >= 0)
- ...
- 08:38 AM Bug #50395 (Resolved): filestore: ENODATA error after directory split confuses transaction
- We had a case reported by our customer, when a faulty disk was returning ENODATA error on directory split and it crea...
- 04:19 AM Bug #50393 (Fix Under Review): CommandCrashedError: Command crashed: 'mkdir -p -- /home/ubuntu/ce...
- 04:17 AM Bug #50393: CommandCrashedError: Command crashed: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client...
- mon/test_mon_config_key.py
/a/kchai-2021-04-15_08:31:03-rados-wip-kefu-testing-2021-04-15-1359-distro-basic-smithi...
- 04:16 AM Bug #50393 (Resolved): CommandCrashedError: Command crashed: 'mkdir -p -- /home/ubuntu/cephtest/m...
- https://sentry.ceph.com/organizations/ceph/issues/7316/...
- 03:42 AM Bug #50299 (Fix Under Review): PrimaryLogPG::inc_refcount_by_set leak
- 03:05 AM Bug #50299: PrimaryLogPG::inc_refcount_by_set leak
- https://github.com/ceph/ceph/pull/40879
- 01:43 AM Bug #50299: PrimaryLogPG::inc_refcount_by_set leak
- Created https://github.com/ceph/ceph/pull/40878 while looking into this issue.
- 01:25 AM Bug #50299: PrimaryLogPG::inc_refcount_by_set leak
- I'll take a look.
- 12:54 AM Bug #50299: PrimaryLogPG::inc_refcount_by_set leak
- /a/kchai-2021-04-15_08:31:03-rados-wip-kefu-testing-2021-04-15-1359-distro-basic-smithi/6048611
- 12:33 AM Bug #50299: PrimaryLogPG::inc_refcount_by_set leak
- seems related to https://github.com/ceph/ceph/pull/39216
- 12:25 AM Bug #50299: PrimaryLogPG::inc_refcount_by_set leak
- /a/nojha-2021-04-15_20:05:27-rados-wip-50217-distro-basic-smithi/6049402
- 02:58 AM Bug #37808: osd: osdmap cache weak_refs assert during shutdown
- /ceph/teuthology-archive/pdonnell-2021-04-15_01:35:57-fs-wip-pdonnell-testing-20210414.230315-distro-basic-smithi/604...
- 02:55 AM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
- https://pulpito.ceph.com/pdonnell-2021-04-15_01:35:57-fs-wip-pdonnell-testing-20210414.230315-distro-basic-smithi/604...
- 01:48 AM Bug #50384: pacific ceph-mon: mon initial failed on aarch64
- As recorded in bug #48681,
https://tracker.ceph.com/issues/48681#note-14 rosin luo wrote:
> this bug has been solved...
- 01:31 AM Bug #50384 (Resolved): pacific ceph-mon: mon initial failed on aarch64
- OS: centos8.1, 4.18.0, selinux is disabled
ceph: pacific 16.2.0 aarch64
platform: Kunpeng920 5250@2.6GHz
rpm bin...
04/15/2021
- 11:41 PM Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
- Here's the log from the coredump that I believe should pinpoint the issue but I'll need to analyse it further today.
...
- 05:35 AM Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
- Looking at the coredump....
- 03:43 AM Bug #50371 (New): Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
- /a/nojha-2021-04-14_00:54:53-rados-master-distro-basic-smithi/6044164...
- 06:13 PM Bug #50376 (New): upgrade-clients:client-upgrade-octopus-pacific: cluster [WRN] Health check fail...
- https://pulpito.ceph.com/ideepika-2021-04-15_12:31:36-upgrade-clients:client-upgrade-octopus-pacific-wip-rgw-dpp-upda...
- 05:42 PM Bug #50376 (Resolved): upgrade-clients:client-upgrade-octopus-pacific: cluster [WRN] Health check...
- just need to disable it for upgrade suite for now
- 02:12 PM Bug #50376: upgrade-clients:client-upgrade-octopus-pacific: cluster [WRN] Health check failed: mo...
- https://sentry.ceph.com/organizations/ceph/issues/739/events/8a47fcc562d74339989ae5d93c3d1669/events/?project=2
- 11:05 AM Bug #50376 (New): upgrade-clients:client-upgrade-octopus-pacific: cluster [WRN] Health check fail...
- ...
- 01:33 PM Bug #50374 (Resolved): ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) mgr/dash...
- 01:11 PM Bug #50374 (Fix Under Review): ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) ...
- 05:44 AM Bug #50374 (Resolved): ERROR: test_version (tasks.mgr.dashboard.test_api.VersionReqTest) mgr/dash...
- ...
- 01:25 PM Bug #50339 (Resolved): test_cls_cas failure: FAILED cls_cas.dup_get
- 03:37 AM Bug #50339 (Fix Under Review): test_cls_cas failure: FAILED cls_cas.dup_get
- 03:44 AM Bug #50042: rados/test.sh: api_watch_notify failures
- Created https://tracker.ceph.com/issues/50371 for the segfault analysis from /a/nojha-2021-04-14_00:54:53-rados-maste...