Activity

From 09/04/2020 to 10/03/2020

10/03/2020

02:09 AM Bug #37532 (Resolved): mon: expected_num_objects warning triggers on bluestore-only setups
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:05 AM Bug #44815 (Resolved): Pool stats increase after PG merged (PGMap::apply_incremental doesn't subt...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:04 AM Bug #46024 (Resolved): larger osd_scrub_max_preemptions values cause Floating point exception
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:04 AM Bug #46216 (Resolved): mon: log entry with garbage generated by bad memory access
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:03 AM Bug #46705 (Resolved): Negative peer_num_objects crashes osd
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:03 AM Bug #46824 (Resolved): "No such file or directory" when exporting or importing a pool if locator ...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:02 AM Bug #47159 (Resolved): add ability to clean_temps in osdmaptool
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:02 AM Bug #47309 (Resolved): mon/mon-last-epoch-clean.sh failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:36 AM Backport #47346 (Resolved): octopus: mon/mon-last-epoch-clean.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37349
m...
Nathan Cutler
01:36 AM Backport #47251 (Resolved): octopus: add ability to clean_temps in osdmaptool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37348
m...
Nathan Cutler
01:25 AM Backport #47345 (Resolved): nautilus: mon/mon-last-epoch-clean.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37478
m...
Nathan Cutler
01:25 AM Backport #47250 (Resolved): nautilus: add ability to clean_temps in osdmaptool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37477
m...
Nathan Cutler
01:24 AM Backport #46965 (Resolved): nautilus: Pool stats increase after PG merged (PGMap::apply_increment...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37476
m...
Nathan Cutler
01:24 AM Backport #46935 (Resolved): nautilus: "No such file or directory" when exporting or importing a p...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37475
m...
Nathan Cutler
01:24 AM Backport #46738 (Resolved): nautilus: mon: expected_num_objects warning triggers on bluestore-onl...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37474
m...
Nathan Cutler
01:23 AM Backport #46710 (Resolved): nautilus: Negative peer_num_objects crashes osd
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37473
m...
Nathan Cutler
01:23 AM Backport #46262 (Resolved): nautilus: larger osd_scrub_max_preemptions values cause Floating poin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37470
m...
Nathan Cutler
01:23 AM Backport #46461 (Resolved): nautilus: pybind/mgr/balancer: should use "==" and "!=" for comparing...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37471
m...
Nathan Cutler

10/02/2020

10:09 PM Bug #40367 (Can't reproduce): "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-n...
Neha Ojha
10:07 PM Bug #40081 (Closed): mon: luminous crash attempting to decode maps after nautilus quorum has been...
Doesn't apply to nautilus and future releases. Luminous and mimic are EOL. Neha Ojha
10:02 PM Bug #40029 (Resolved): ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(Cep...
Neha Ojha
10:02 PM Bug #40029 (Rejected): ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(Cep...
This only seems to be a problem with luminous and mimic, which are EOL now. Neha Ojha
09:56 PM Bug #39366 (Can't reproduce): ClsLock.TestRenew failure
Neha Ojha
09:55 PM Bug #38513 (Rejected): luminous: "AsyncReserver.h: 190: FAILED assert(!queue_pointers.count(item)...
Luminous is EOL. Neha Ojha
09:54 PM Bug #38402 (Can't reproduce): ceph-objectstore-tool on down osd w/ not enough in osds
Neha Ojha
09:53 PM Bug #38375 (Need More Info): OSD segmentation fault on rbd create
Seems like we have lost the ceph-post-file due to one of the lab incidents. Neha Ojha
09:46 PM Bug #38064 (Duplicate): librados::OPERATION_FULL_TRY not completely implemented, test LibRadosAio...
Josh Durgin
09:42 PM Bug #35974: Apparent export-diff/import-diff corruption
Looking at this again, it seems like a potential bug when reading from replicas and encountering an EIO - this should... Josh Durgin
09:18 PM Bug #46318: mon_recovery: quorum_status times out
Joao, are you working on a fix for this? Neha Ojha
09:15 PM Bug #36304: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_split_child(PG*)
Haven't seen this in a while. Neha Ojha
09:11 PM Bug #44362 (Can't reproduce): osd: uninitialized memory in sendmsg
Neha Ojha
09:03 PM Bug #46405 (Fix Under Review): osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Neha Ojha
06:30 PM Feature #47732 (New): Issue health warning if a performance issue is occurring especially for cep...

This feature would identify a false network ping warning which might occur with a very busy ceph-osd(s).
The mon...
David Zafman
06:26 PM Backport #47346: octopus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37349
merged
Yuri Weinstein
06:25 PM Backport #47251: octopus: add ability to clean_temps in osdmaptool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37348
merged
Yuri Weinstein
04:30 PM Bug #45191: erasure-code/test-erasure-eio.sh: TEST_ec_single_recovery_error fails
http://qa-proxy.ceph.com/teuthology/yuriw-2020-10-01_17:46:11-rados-wip-yuri5-testing-2020-10-01-0834-octopus-distro-... Deepika Upadhyay

10/01/2020

11:04 PM Backport #46287 (Rejected): nautilus: mon: log entry with garbage generated by bad memory access
Not required for Nautilus. Patrick Donnelly
09:18 PM Bug #47508 (In Progress): Multiple read errors cause repeated entry/exit recovery for each error
David Zafman
06:10 PM Bug #47692: qa/standalone/osd/osd-backfill-stats.sh TEST_backfill_sizeup wait_for_clean timeout

qa/standalone/osd/osd-backfill-stats.sh:213: TEST_backfill_sizeup_out
https://pulpito.ceph.com/dzafman-2020-09...
David Zafman
05:18 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/teuthology-2020-10-01_07:01:02-rados-master-distro-basic-smithi/5486214 Neha Ojha
05:14 PM Bug #47719 (Resolved): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
... Neha Ojha
04:54 PM Backport #47345: nautilus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37478
merged
Yuri Weinstein
04:54 PM Backport #47250: nautilus: add ability to clean_temps in osdmaptool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37477
merged
Yuri Weinstein
04:53 PM Backport #46965: nautilus: Pool stats increase after PG merged (PGMap::apply_incremental doesn't ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37476
merged
Yuri Weinstein
04:53 PM Backport #46935: nautilus: "No such file or directory" when exporting or importing a pool if loca...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37475
merged
Yuri Weinstein
04:52 PM Backport #46738: nautilus: mon: expected_num_objects warning triggers on bluestore-only setups
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37474
merged
Yuri Weinstein
04:51 PM Backport #46710: nautilus: Negative peer_num_objects crashes osd
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37473
merged
Yuri Weinstein
04:51 PM Backport #46262: nautilus: larger osd_scrub_max_preemptions values cause Floating point exception
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37470
merged
Yuri Weinstein
04:38 PM Backport #46461: nautilus: pybind/mgr/balancer: should use "==" and "!=" for comparing strings
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37471
merged
Yuri Weinstein
08:58 AM Bug #47697: mon: set session_timeout when adding to session_map
Not to my knowledge. When it happens, it is mostly transparent to the user -- the peer reopens the socket and attemp... Ilya Dryomov
02:22 AM Bug #47697: mon: set session_timeout when adding to session_map
Were there upstream QA tests that failed because of this? How did you learn of this problem? Patrick Donnelly
07:56 AM Bug #47712 (New): hdd pg's migrating when converting one ssd class osd to dmcrypt
I have PGs of hdd pools remapping when I take out an ssd osd.
change crush reweight of ssd osd 33 to 0.0
[@ce...
none none

09/30/2020

05:58 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
rados/singleton/{all/osd-recovery-incomplete mon_election/connectivity msgr-failures/many msgr/async-v2only objectsto... Neha Ojha
05:56 PM Bug #47654: test_mon_pg: mon fails to join quorum to due election strategy mismatch
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483682 Neha Ojha
05:55 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483631 Neha Ojha
03:41 PM Documentation #46531 (Resolved): The default value of osd_scrub_during_recovery is false since v1...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:41 PM Feature #46663 (Resolved): Add pg count for pools in the `ceph df` command
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:40 PM Bug #46914 (Resolved): mon: stuck osd_pgtemp message forwards
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:39 PM Bug #47180 (Resolved): qa/standalone/mon/mon-handle-forward.sh failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:39 AM Bug #47697 (Fix Under Review): mon: set session_timeout when adding to session_map
Ilya Dryomov
11:30 AM Bug #47697 (Resolved): mon: set session_timeout when adding to session_map
With msgr2, the session is added in Monitor::ms_handle_accept() which is queued by ProtocolV2 at the end of handling ... Ilya Dryomov
07:14 AM Backport #46587 (Resolved): nautilus: The default value of osd_scrub_during_recovery is false sin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37472
m...
Nathan Cutler
07:13 AM Backport #47091 (Resolved): octopus: mon: stuck osd_pgtemp message forwards
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37347
m...
Nathan Cutler
07:12 AM Backport #47258 (Resolved): octopus: Add pg count for pools in the `ceph df` command
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36945
m...
Nathan Cutler
01:06 AM Bug #47692: qa/standalone/osd/osd-backfill-stats.sh TEST_backfill_sizeup wait_for_clean timeout

qa/standalone/osd/osd-recovery-stats.sh TEST_recovery_replicated_out1 wait_for_clean time out
https://pulpito.c...
David Zafman
01:01 AM Bug #47692 (New): qa/standalone/osd/osd-backfill-stats.sh TEST_backfill_sizeup wait_for_clean ti...

wait_for_clean timeout in TEST_backfill_sizeup
https://pulpito.ceph.com/dzafman-2020-09-29_20:13:01-rados-wip-za...
David Zafman
12:57 AM Bug #47691 (New): osd-markdown.sh TEST_markdown_boot, osd not having enough time to boot

qa/standalone/osd/osd-markdown.sh:102: TEST_markdown_boot: ceph tell osd.0 get_latest_osdmap
2020-09-29T22:35:45....
David Zafman

09/29/2020

10:46 PM Bug #46405 (In Progress): osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
David Zafman
08:11 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/teuthology-2020-09-29_07:01:02-rados-master-distro-basic-smithi/5480928 Neha Ojha
06:00 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
This change fixes the odd object names in the subtest, but shouldn't help fix this problem. On my build machi... David Zafman
09:49 PM Backport #47091: octopus: mon: stuck osd_pgtemp message forwards
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37347
merged
Yuri Weinstein
09:47 PM Feature #46663: Add pg count for pools in the `ceph df` command
https://github.com/ceph/ceph/pull/36945 merged Yuri Weinstein
08:39 PM Bug #46603: osd/osd-backfill-space.sh: TEST_ec_backfill_simple: return 1

http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-28_18:47:33-rados-wip-yuri-testing-2020-09-28-1007-octopus-distro...
Deepika Upadhyay
08:31 PM Bug #47153: monitor crash during upgrade due to LogSummary encoding changes between luminous and ...
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/36838 is not a fix for this issue. It was being used to reprodu...
Yuri Weinstein
08:27 PM Bug #47153 (New): monitor crash during upgrade due to LogSummary encoding changes between luminou...
https://github.com/ceph/ceph/pull/36838 is not a fix for this issue. It was being used to reproduce this issue but ha... Neha Ojha
08:25 PM Bug #47153: monitor crash during upgrade due to LogSummary encoding changes between luminous and ...
since the fix is targeting nautilus, I'll go out on a limb and fill in "Affected versions" with a guess. Nathan Cutler
08:24 PM Bug #47153 (Fix Under Review): monitor crash during upgrade due to LogSummary encoding changes be...
Nathan Cutler
04:56 PM Backport #47345 (In Progress): nautilus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler
04:55 PM Backport #47250 (In Progress): nautilus: add ability to clean_temps in osdmaptool
Nathan Cutler
04:38 PM Backport #46965 (In Progress): nautilus: Pool stats increase after PG merged (PGMap::apply_increm...
Nathan Cutler
04:37 PM Backport #46935 (In Progress): nautilus: "No such file or directory" when exporting or importing ...
Nathan Cutler
04:36 PM Backport #46738 (In Progress): nautilus: mon: expected_num_objects warning triggers on bluestore-...
Nathan Cutler
04:36 PM Backport #46710 (In Progress): nautilus: Negative peer_num_objects crashes osd
Nathan Cutler
04:33 PM Backport #46587 (In Progress): nautilus: The default value of osd_scrub_during_recovery is false ...
Nathan Cutler
04:32 PM Backport #46461 (In Progress): nautilus: pybind/mgr/balancer: should use "==" and "!=" for compar...
Nathan Cutler
04:31 PM Backport #46287: nautilus: mon: log entry with garbage generated by bad memory access
don't see how the code change applies to nautilus Nathan Cutler
04:31 PM Backport #46287 (Need More Info): nautilus: mon: log entry with garbage generated by bad memory a...
Nathan Cutler
04:27 PM Backport #46262 (In Progress): nautilus: larger osd_scrub_max_preemptions values cause Floating p...
Nathan Cutler
12:12 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
鑫 王 wrote:
>
> In addition, Can you tell me which fields you care about perf counters.
Everything under "bluest...
Igor Fedotov
11:38 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> So a short summary for now is:
> 1) High memory consumption is just temporary and goes away o...
Stellar Wang
11:09 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Would you please collect perf counter dumps for both running benchmark (e.g. in the middle of it) and on its completi... Igor Fedotov
11:08 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
鑫 王 wrote:
>
> *Question:*
> In mempool dump information, bluestore_writing takes up most of the memory. How is t...
Igor Fedotov
11:00 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
So a short summary for now is:
1) High memory consumption is just temporary and goes away on writing benchmark compl...
Igor Fedotov
10:56 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Hi Igor,
*I did the following today.*
# Adjust rocksDB parameters (max_write_buffer_number=4,write_buffer_size=128...
Stellar Wang
03:12 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> And could you please try the same benchmark against replicated pool? Would this have the same ...
Stellar Wang
02:46 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> Does memory consumption stay that high after benchmark is completed/terminated?
Answer: Mem...
Stellar Wang
12:32 AM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
rados/singleton/{all/peer mon_election/connectivity msgr-failures/many msgr/async-v2only objectstore/bluestore-bitmap... Neha Ojha
12:30 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/teuthology-2020-09-28_07:01:02-rados-master-distro-basic-smithi/5476775... Neha Ojha

09/28/2020

07:30 PM Backport #47599 (Resolved): octopus: qa/standalone/mon/mon-handle-forward.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36705
m...
Nathan Cutler
07:30 PM Backport #47600 (Resolved): nautilus: qa/standalone/mon/mon-handle-forward.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36704
m...
Nathan Cutler
02:51 PM Backport #47600: nautilus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36704
merged
Yuri Weinstein
06:05 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
And could you please try the same benchmark against replicated pool? Would this have the same problem? Igor Fedotov
06:04 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Does memory consumption stay that high after benchmark is completed/terminated?
Igor Fedotov
02:09 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Hi Igor,
Thank you for your quick feedback. OSD memory still exceeds the set threshold of 2G when I run it again,...
Stellar Wang
12:16 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
This might be related to https://tracker.ceph.com/issues/46658
Could you please collect a mempool dump from an OSD...
Igor Fedotov
11:47 AM Bug #47673 (New): cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
A 4K random write scenario in a single-node full SSD cephfs will cause the OSD memory space to grow indefinitely and ... Stellar Wang
05:08 PM Bug #47654: test_mon_pg: mon fails to join quorum to due election strategy mismatch
/a/teuthology-2020-09-28_07:01:02-rados-master-distro-basic-smithi/5476834 Neha Ojha
01:57 PM Documentation #47523 (Duplicate): ceph df documentation is outdated
Zac Dover
01:57 PM Documentation #47523: ceph df documentation is outdated
See issue #47522.
https://tracker.ceph.com/issues/47522
Zac Dover
01:53 AM Feature #47666: Ceph pool history
I second this, having benefited from ZFS history in my Solaris 10 days.
For years we've relied on shell history fo...
Anthony D'Atri

09/27/2020

10:59 PM Backport #47599: octopus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36705
merged
Yuri Weinstein
10:18 PM Feature #47666 (New): Ceph pool history
Introduce a "ceph pool $pool history" command to obtain historic information that modified pool state. I.e. changing ... Stefan Kooman
04:14 PM Bug #47452 (Resolved): invalid values of crush-failure-domain should not be allowed while creatin...
Kefu Chai

09/26/2020

04:36 PM Bug #47590: osd do not respect scrub schedule
Neha Ojha wrote:
> Can you provide the output of "ceph config dump"?
WHO MASK LEVEL OPTION ...
Petr Bena

09/25/2020

09:21 PM Bug #47590 (Need More Info): osd do not respect scrub schedule
Can you provide the output of "ceph config dump"? Neha Ojha
08:32 PM Bug #47654 (Resolved): test_mon_pg: mon fails to join quorum to due election strategy mismatch
... Neha Ojha
08:10 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/teuthology-2020-09-25_07:01:01-rados-master-distro-basic-smithi/5466817 Neha Ojha
08:09 PM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
/a/teuthology-2020-09-25_07:01:01-rados-master-distro-basic-smithi/5466707 Neha Ojha
02:44 PM Bug #46877: mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
Deepika Upadhyay wrote:
> this time job got dead after this warning:
>
> /a/yuriw-2020-08-20_19:48:15-rados-wip-...
Neha Ojha

09/24/2020

08:07 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-23_21:32:26-rados-wip-yuri4-testing-2020-09-23-1206-nautilus-distro... Deepika Upadhyay
02:37 PM Bug #46877: mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
this time job got dead after this warning:
/a/yuriw-2020-08-20_19:48:15-rados-wip-yuri-testing-2020-08-17-1723-oc...
Deepika Upadhyay
01:18 PM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
http://pulpito.ceph.com/yuriw-2020-09-23_15:16:58-rados-wip-yuri-testing-2020-09-22-1332-octopus-distro-basic-smithi/... Deepika Upadhyay
09:14 AM Bug #47626 (New): process will crash by invalidate pointer
1:Version: mimic 13.2.9-0.el7.aarch64.rpm
2: coredump gdb info
(gdb) bt
#0 now (this=0x30) at /usr/src/debug/c...
Yi Li
12:42 AM Documentation #47522: Document "ceph df detail"
https://docs.ceph.com/en/latest/man/8/ceph/#df -- man page location
https://docs.ceph.com/en/latest/rados/operatio...
Zac Dover

09/23/2020

03:04 PM Bug #47617 (New): rebuild_mondb: daemon-helper: command failed with exit status 1
/a/yuriw-2020-09-16_23:57:37-rados-wip-yuri8-testing-2020-09-16-2220-octopus-distro-basic-smithi/5441511/teuthology.l... Deepika Upadhyay
11:52 AM Backport #47599 (In Progress): octopus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler
09:08 AM Backport #47599 (Resolved): octopus: qa/standalone/mon/mon-handle-forward.sh failure
https://github.com/ceph/ceph/pull/36705 Nathan Cutler
11:49 AM Backport #47346 (In Progress): octopus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler
11:48 AM Backport #47251 (In Progress): octopus: add ability to clean_temps in osdmaptool
Nathan Cutler
11:47 AM Backport #47091 (In Progress): octopus: mon: stuck osd_pgtemp message forwards
Nathan Cutler
11:15 AM Bug #47290 (Resolved): osdmaps aren't being cleaned up automatically on healthy cluster
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:09 AM Backport #47600 (In Progress): nautilus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler
09:08 AM Backport #47600 (Resolved): nautilus: qa/standalone/mon/mon-handle-forward.sh failure
https://github.com/ceph/ceph/pull/36704 Nathan Cutler
08:19 AM Backport #47362 (Resolved): nautilus: pgs inconsistent, union_shard_errors=missing
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37051
m...
Nathan Cutler
08:05 AM Backport #47297 (Resolved): octopus: osdmaps aren't being cleaned up automatically on healthy clu...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36981
m...
Nathan Cutler

09/22/2020

11:22 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
rados/singleton/{all/osd-recovery-incomplete mon_election/connectivity msgr-failures/many msgr/async-v1only objectsto... Neha Ojha
11:20 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
Neha Ojha wrote:
> /a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458722
rados/thrash/{0-siz...
Neha Ojha
04:52 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
/a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458722 Neha Ojha
11:21 PM Bug #47592 (New): extract-monmap changes permission on some files
Running ceph-mon with the --extract-monmap flag (which must be done with root privileges apparently) seems to cause s... Nate Morrison
10:22 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
... Neha Ojha
08:27 PM Bug #47590 (Need More Info): osd do not respect scrub schedule
Since update to Nautilus (14.2.4 -> 14.2.10) scrub schedules stopped working, I used to have them in config file, but... Petr Bena
07:38 PM Bug #47589 (Can't reproduce): radosbench times out "reached maximum tries (800) after waiting for...
... Neha Ojha
02:07 PM Bug #47309: mon/mon-last-epoch-clean.sh failure
/a/yuriw-2020-09-16_23:57:37-rados-wip-yuri8-testing-2020-09-16-2220-octopus-distro-basic-smithi/5441469/teuthology.l... Deepika Upadhyay
01:18 PM Bug #47044 (In Progress): PG::_delete_some isn't optimal iterating objects
Igor Fedotov
04:39 AM Bug #47352 (In Progress): rados ls improvements.
David Zafman
04:39 AM Bug #47352: rados ls improvements.

There are 2 issues here. 1. This is making ls more expensive. 2. How do I create ghobject_t for all cases.
<pre...
David Zafman

09/21/2020

11:57 PM Bug #47361 (Rejected): invalid upmap not getting cleaned

I diagnosed this issue running the following with the supplied osd map.
CEPH_ARGS=" --debug_osd=30" osdmaptool -...
David Zafman
07:50 PM Backport #47362: nautilus: pgs inconsistent, union_shard_errors=missing
Mykola Golub wrote:
> https://github.com/ceph/ceph/pull/37051
merged
Yuri Weinstein
09:39 AM Bug #47492 (Resolved): tools/osdmaptool.cc: fix inaccurate pg map result when simulating osd out
Kefu Chai

09/19/2020

01:58 AM Bug #47180 (Pending Backport): qa/standalone/mon/mon-handle-forward.sh failure
Patrick Donnelly

09/18/2020

10:29 PM Documentation #47522 (Duplicate): Document "ceph df detail"
Josh Durgin
09:34 PM Bug #47508 (Fix Under Review): Multiple read errors cause repeated entry/exit recovery for each e...
Neha Ojha
12:38 AM Bug #47508: Multiple read errors cause repeated entry/exit recovery for each error
Without this fix every object is a recovery. Only with 2 added dout()s.... David Zafman
09:32 PM Bug #47509: test: ceph_test_argparse.py no longer passes/is correctly invoked in CI
We'll revisit this test to see what's going on but not urgent IMO. Neha Ojha
07:17 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
this still fails consistently
/a/teuthology-2020-09-17_07:01:02-rados-master-distro-basic-smithi/5443303
Neha Ojha
02:01 AM Feature #39012 (Resolved): osd: distinguish unfound + impossible to find, vs start some down OSDs...
David Zafman

09/17/2020

10:55 PM Bug #47492 (Fix Under Review): tools/osdmaptool.cc: fix inaccurate pg map result when simulating ...
Neha Ojha
09:31 PM Documentation #47523 (Resolved): ceph df documentation is outdated
Fields have changed meaning and new ones have been added, in the 'ceph df detail' output:
https://github.com/ceph/...
Josh Durgin
09:07 PM Documentation #47522 (Closed): Document "ceph df detail"
The output of "ceph df detail" has a lot of useful information including per pool stats. Let's document whatever is n... Neha Ojha
07:03 PM Backport #47092 (Resolved): nautilus: mon: stuck osd_pgtemp message forwards
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37171
m...
Nathan Cutler
07:03 PM Backport #47296 (Resolved): nautilus: osdmaps aren't being cleaned up automatically on healthy cl...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36982
m...
Nathan Cutler
07:02 PM Backport #47257: nautilus: Add pg count for pools in the `ceph df` command
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36944
m...
Nathan Cutler
04:46 PM Feature #47519: Gracefully detect MTU mismatch
Good suggestion, the pings are with large sizes already, but we don't particularly warn about MTU right now, you just... Josh Durgin
04:22 PM Feature #47519 (New): Gracefully detect MTU mismatch
I ran into an issue this morning of flapping OSDs. The ... David Galloway
03:59 AM Bug #47508 (In Progress): Multiple read errors cause repeated entry/exit recovery for each error
David Zafman
01:08 AM Bug #47509 (New): test: ceph_test_argparse.py no longer passes/is correctly invoked in CI
Despite the test showing up as passed in Jenkins, it's apparent that most of the actual tests are now being skipped i... Greg Farnum

09/16/2020

09:32 PM Bug #47180 (Fix Under Review): qa/standalone/mon/mon-handle-forward.sh failure
Patrick Donnelly
09:06 PM Backport #47092: nautilus: mon: stuck osd_pgtemp message forwards
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37171
merged
Yuri Weinstein
07:21 PM Bug #47447 (Resolved): test_osd_cannot_recover (tasks.mgr.test_progress.TestProgress) fails
Patrick Donnelly
06:15 PM Bug #47508: Multiple read errors cause repeated entry/exit recovery for each error
... David Zafman
06:12 PM Bug #47508 (In Progress): Multiple read errors cause repeated entry/exit recovery for each error

After looking at https://github.com/ceph/ceph/pull/36989 I realized that after the first read error all the other g...
David Zafman
04:35 PM Bug #47239 (Resolved): thrashosds.thrasher error in rados
Neha Ojha
03:44 PM Backport #47296: nautilus: osdmaps aren't being cleaned up automatically on healthy cluster
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/36982
merged
Yuri Weinstein
03:22 PM Bug #38219: rebuild-mondb hangs
/a/yuriw-2020-09-16_01:27:14-rados-wip-yuri3-testing-2020-09-16-0014-nautilus-distro-basic-smithi/5437537/teuthology.log Deepika Upadhyay
08:42 AM Bug #47492 (Resolved): tools/osdmaptool.cc: fix inaccurate pg map result when simulating osd out
When simulating osd out, it will always adjust this osd's crush weight to 1.0. Hence the pg map result is not the same as... Zhi Zhang
07:55 AM Backport #47257 (Resolved): nautilus: Add pg count for pools in the `ceph df` command
Vikhyat Umrao

09/15/2020

11:59 PM Backport #47092 (In Progress): nautilus: mon: stuck osd_pgtemp message forwards
Neha Ojha
08:39 PM Backport #47257: nautilus: Add pg count for pools in the `ceph df` command
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36944
merged
Yuri Weinstein
08:35 PM Bug #47239: thrashosds.thrasher error in rados
/a/gregf-2020-09-14_05:25:36-rados-wip-stretch-mode-distro-basic-smithi/5433134 Neha Ojha
08:07 PM Bug #47239 (Fix Under Review): thrashosds.thrasher error in rados
Neha Ojha
04:13 PM Bug #47440: nautilus: valgrind caught leak in Messenger::ms_deliver_verify_authorizer
... Deepika Upadhyay
12:54 PM Bug #47447 (Fix Under Review): test_osd_cannot_recover (tasks.mgr.test_progress.TestProgress) fails
Rishabh Dave
12:02 PM Bug #47223 (Rejected): Change the default value of option osd_async_recovery_min_cost from 100 to 10
Vikhyat Umrao
03:17 AM Bug #47452 (Fix Under Review): invalid values of crush-failure-domain should not be allowed while...
Prashant D
02:26 AM Bug #47452 (Resolved): invalid values of crush-failure-domain should not be allowed while creatin...
# ceph osd erasure-code-profile set testprofile k=4 m=2 crush-failure-domain=x
# ceph osd erasure-code-profile get...
Prashant D
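
The check requested in Bug #47452 could look roughly like the following Python sketch. It is an illustration only, not the fix that was merged: the helper names `validate_failure_domain` and `parse_profile` are hypothetical, and the set of type names assumes the stock CRUSH type list; a real cluster's CRUSH map may define different types.

```python
# Assumption: the stock CRUSH bucket type names. A real cluster's CRUSH map
# may define a different set, so a real check would read the map instead.
DEFAULT_CRUSH_TYPES = frozenset({
    "osd", "host", "chassis", "rack", "row", "pdu",
    "pod", "room", "datacenter", "zone", "region", "root",
})


def validate_failure_domain(value, known_types=DEFAULT_CRUSH_TYPES):
    """Return value if it names a known CRUSH type, else raise ValueError."""
    if value not in known_types:
        raise ValueError(
            f"unknown crush-failure-domain {value!r}; "
            f"expected one of {sorted(known_types)}"
        )
    return value


def parse_profile(tokens):
    """Parse key=value tokens (hypothetical helper) and validate the domain."""
    profile = dict(token.split("=", 1) for token in tokens)
    if "crush-failure-domain" in profile:
        validate_failure_domain(profile["crush-failure-domain"])
    return profile
```

With this sketch, `parse_profile(["k=4", "m=2", "crush-failure-domain=x"])` raises `ValueError` instead of accepting the profile from the report.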

09/14/2020

10:36 PM Bug #47239: thrashosds.thrasher error in rados
I think 530982129ec131ef78e2f9989abfaeddb0959c65 caused this issue. Neha Ojha
09:53 PM Bug #47239: thrashosds.thrasher error in rados
/a/teuthology-2020-09-14_07:01:01-rados-master-distro-basic-smithi/5433978 Neha Ojha
08:42 PM Bug #47447 (Triaged): test_osd_cannot_recover (tasks.mgr.test_progress.TestProgress) fails
The same test passed with this revert https://github.com/ceph/ceph/pull/37122.
https://pulpito.ceph.com/nojha-2020...
Neha Ojha
08:18 PM Bug #47447 (Resolved): test_osd_cannot_recover (tasks.mgr.test_progress.TestProgress) fails
... Neha Ojha
01:38 PM Bug #47440 (New): nautilus: valgrind caught leak in Messenger::ms_deliver_verify_authorizer
http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-10_15:32:32-rados-wip-yuri2-testing-2020-09-08-1946-nautilus-distro... Deepika Upadhyay

09/13/2020

05:50 PM Feature #39012 (Fix Under Review): osd: distinguish unfound + impossible to find, vs start some d...
David Zafman

09/11/2020

09:49 PM Bug #47420 (New): nautilus: test_rados.TestIoctx.test_aio_read fails with AssertionError: 5 != 2

http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-10_22:09:32-rados-wip-yuri2-testing-2020-09-10-0000-nautilus-dist...
Deepika Upadhyay
09:39 PM Bug #47419 (Resolved): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench...
The PR did not change code at all. Log is attached. Josh Durgin
09:19 PM Bug #47344 (Resolved): osd: Poor client IO throughput/latency observed with dmclock scheduler dur...
Neha Ojha
03:03 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Brad, thanks. Will create a separate ticket. Kefu Chai
03:23 AM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Kefu,
/a/kchai-2020-09-10_16:44:13-rados-wip-kefu-testing-2020-09-10-1633-distro-basic-smithi/5421813/teuthology.l...
Brad Hubbard
02:40 AM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
... Kefu Chai
05:07 AM Bug #47395 (New): ceph raw used is more than total used in all pools (ceph df detail)
In my Ceph cluster, when I run the *ceph df detail* command, it shows the following result... Morteza Bashsiz
02:36 AM Bug #47024: rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
... Kefu Chai

09/10/2020

04:43 AM Bug #24531: Mimic MONs have slow/long running ops
please note, this fix was backported as f0697a9af54bf866572036bd6d582abd5299d0c8... Kefu Chai

09/09/2020

12:58 PM Bug #47361: invalid upmap not getting cleaned
As a workaround for our cluster operations I have removed the unused "rack" level from our osd tree, and now the upm... Dan van der Ster
10:56 AM Bug #47361: invalid upmap not getting cleaned
This seems to be still breaking in master.
osdmap is attached.
Dan van der Ster
12:32 PM Bug #47380 (Resolved): mon: slow ops due to osd_failure
... Honggang Yang
09:15 AM Backport #47364 (In Progress): luminous: pgs inconsistent, union_shard_errors=missing
Mykola Golub

09/08/2020

06:06 PM Backport #47365 (In Progress): mimic: pgs inconsistent, union_shard_errors=missing
Mykola Golub
02:48 PM Backport #47365 (Resolved): mimic: pgs inconsistent, union_shard_errors=missing
https://github.com/ceph/ceph/pull/37053 Nathan Cutler
05:47 PM Backport #47362 (In Progress): nautilus: pgs inconsistent, union_shard_errors=missing
Mykola Golub
02:01 PM Backport #47362 (Resolved): nautilus: pgs inconsistent, union_shard_errors=missing
https://github.com/ceph/ceph/pull/37051 Mykola Golub
03:23 PM Backport #47363 (In Progress): octopus: pgs inconsistent, union_shard_errors=missing
Mykola Golub
02:01 PM Backport #47363 (Resolved): octopus: pgs inconsistent, union_shard_errors=missing
https://github.com/ceph/ceph/pull/37048 Mykola Golub
02:48 PM Backport #47364 (Resolved): luminous: pgs inconsistent, union_shard_errors=missing
https://github.com/ceph/ceph/pull/37062 Nathan Cutler
01:52 PM Bug #43174 (Pending Backport): pgs inconsistent, union_shard_errors=missing
Josh Durgin
01:35 PM Bug #43174 (Resolved): pgs inconsistent, union_shard_errors=missing
Kefu Chai
01:38 PM Bug #47361: invalid upmap not getting cleaned
We deleted all the pg_upmap_items and let the balancer start again. It created bad upmap rules again in the first ite... Dan van der Ster
01:08 PM Bug #47361 (Rejected): invalid upmap not getting cleaned
In v14.2.11 we have some invalid upmaps which don't get cleaned. (And I presume they were created by the balancer).
...
Dan van der Ster
03:35 AM Bug #47344 (Fix Under Review): osd: Poor client IO throughput/latency observed with dmclock sched...
Sridhar Seshasayee

09/07/2020

10:00 PM Bug #47352 (In Progress): rados ls improvements.
"rados ls" should NOT include deleted head objects. With JSON output, it should include snapshots but add snapid fields to the output. David Zafman
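
The listing behaviour requested in Bug #47352 can be sketched in Python as follows. This is a toy model, not the rados tool's code: the entry layout (`name`, `snapid`, `deleted`) and the function `list_objects` are hypothetical, and it assumes that plain output lists head objects only.

```python
import json


def list_objects(entries, as_json=False):
    """Toy model of the requested listing rules.

    entries: iterable of dicts with hypothetical keys
      'name'    - object name
      'snapid'  - None for the head object, an int for a snapshot clone
      'deleted' - True if this entry has been deleted
    """
    # Rule 1: never show a head object that has been deleted.
    visible = [
        e for e in entries
        if not (e["snapid"] is None and e["deleted"])
    ]
    if as_json:
        # Rule 2: JSON output includes snapshots and carries a snapid field.
        return json.dumps(
            [{"name": e["name"], "snapid": e["snapid"]} for e in visible]
        )
    # Assumption: plain output lists head objects only.
    return "\n".join(e["name"] for e in visible if e["snapid"] is None)
```

In this model, a deleted head object with a surviving snapshot clone disappears from the plain listing but still shows up in the JSON listing, tagged with its snapid.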
08:09 PM Feature #46842 (Resolved): librados: add LIBRBD_SUPPORTS_GETADDRS support
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:08 PM Backport #47346 (Resolved): octopus: mon/mon-last-epoch-clean.sh failure
https://github.com/ceph/ceph/pull/37349 Nathan Cutler
08:08 PM Backport #47345 (Resolved): nautilus: mon/mon-last-epoch-clean.sh failure
https://github.com/ceph/ceph/pull/37478 Nathan Cutler
08:04 PM Bug #47344 (Resolved): osd: Poor client IO throughput/latency observed with dmclock scheduler dur...
Regardless of the higher weightage given to client IO when compared to
recovery IO, poor client throughput/latency i...
Sridhar Seshasayee
06:41 AM Bug #47309 (Pending Backport): mon/mon-last-epoch-clean.sh failure
Kefu Chai

09/06/2020

06:55 PM Bug #47328 (Resolved): nautilus: ObjectStore/SimpleCloneTest: invalid rm coll
The job ends up dead, but I'm not sure what killed it (is it because smithi was not able to handle the said number of threads?... Deepika Upadhyay
10:26 AM Backport #46932 (Resolved): nautilus: librados: add LIBRBD_SUPPORTS_GETADDRS support
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36853
m...
Nathan Cutler
09:11 AM Bug #47299: Assertion in pg_missing_set: p->second.need <= v || p->second.is_delete()
Neha Ojha wrote:
> Is it possible for you to capture osd logs with debug_osd=30? We'll also try to reproduce this at...
Denis Krienbühl

09/04/2020

11:15 PM Bug #47309 (Fix Under Review): mon/mon-last-epoch-clean.sh failure
Neha Ojha
09:50 PM Bug #47309 (Resolved): mon/mon-last-epoch-clean.sh failure
The test needs to be updated based on https://github.com/ceph/ceph/pull/36977... Neha Ojha
11:07 PM Backport #47297: octopus: osdmaps aren't being cleaned up automatically on healthy cluster
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/36981
merged
Yuri Weinstein
12:17 AM Backport #47297 (In Progress): octopus: osdmaps aren't being cleaned up automatically on healthy ...
Neha Ojha
12:08 AM Backport #47297 (Resolved): octopus: osdmaps aren't being cleaned up automatically on healthy clu...
https://github.com/ceph/ceph/pull/36981 Neha Ojha
10:32 PM Bug #47204: ceph osd getting shutdown after joining to cluster
The log shows systemd is stopping the osd, doing a clean shutdown via SIGTERM. It's unclear what caused systemd to st... Josh Durgin
10:23 PM Bug #47299 (Need More Info): Assertion in pg_missing_set: p->second.need <= v || p->second.is_del...
Is it possible for you to capture osd logs with debug_osd=30? We'll also try to reproduce this at our end. Neha Ojha
06:57 AM Bug #47299 (Need More Info): Assertion in pg_missing_set: p->second.need <= v || p->second.is_del...
Some of our OSDs will sometimes crash with the following message:... Denis Krienbühl
10:08 PM Bug #47300: mount.ceph fails to understand AAAA records from SRV record
Thanks for the detailed description. The earlier fix clearly depends on ms_bind_ipv6: https://github.com/ceph/ceph/pu... Josh Durgin
07:49 AM Bug #47300 (Resolved): mount.ceph fails to understand AAAA records from SRV record
Hello,
Unsure if this belongs to CephFS or RADOS :-). I have seen numerous issues here regarding IPv6/AAAA re...
Daniël Vos
12:32 AM Backport #47296 (In Progress): nautilus: osdmaps aren't being cleaned up automatically on healthy...
Neha Ojha
12:07 AM Backport #47296 (Resolved): nautilus: osdmaps aren't being cleaned up automatically on healthy cl...
https://github.com/ceph/ceph/pull/36982 Neha Ojha
12:04 AM Bug #47290 (Pending Backport): osdmaps aren't being cleaned up automatically on healthy cluster
Neha Ojha
 
