Activity

From 09/23/2020 to 10/22/2020

10/22/2020

10:11 PM Bug #47930: scrub/osd-recovery-scrub.sh: TEST_recovery_scrub: wait_background: return 1
/a/kchai-2020-10-21_07:01:44-rados-wip-kefu-testing-2020-10-21-1144-distro-basic-smithi/5545065 Neha Ojha
07:52 PM Bug #47952: Replicated pool creation fails Nautilus 14.2.12 build when cluster runs with filestor...
14.2.12 introduced the following change in https://github.com/ceph/ceph/pull/37474, which is probably the case you ar... Neha Ojha
07:08 PM Bug #47952 (New): Replicated pool creation fails Nautilus 14.2.12 build when cluster runs with fi...
Tried pool creation using ceph-ansibles-4.0, and the replication pool failed with the following error:
Build : Nautilus 1...
Prashant Tambe
06:10 PM Bug #47929: Huge RAM Usage on OSD recovery
No, the export-import approach does not work: during recovery, when that PG needs to be recovered, it gets OOM-killed. Luis Felipe Domínguez Vega
01:52 PM Bug #47929: Huge RAM Usage on OSD recovery
There is some strange behavior, because now on another failing OSD it does not work at all, and I execute the export-remove a... Luis Felipe Domínguez Vega
03:47 AM Bug #47929: Huge RAM Usage on OSD recovery

Changed and used the ...
Luis Felipe Domínguez Vega
05:10 PM Bug #47951 (Fix Under Review): MonClient: mon_host with DNS Round Robin results in 'unable to par...
Patrick Donnelly
05:06 PM Bug #47951 (In Progress): MonClient: mon_host with DNS Round Robin results in 'unable to parse ad...
Patrick Donnelly
04:34 PM Bug #47951 (Resolved): MonClient: mon_host with DNS Round Robin results in 'unable to parse addrs'
I performed a test upgrade to 14.2.12 today on a cluster using IPv6 with Round Robin DNS for mon_host... Wido den Hollander
04:33 PM Bug #47949: scrub/osd-scrub-repair.sh: TEST_auto_repair_bluestore_scrub: return 1
... Neha Ojha
02:54 PM Bug #47949: scrub/osd-scrub-repair.sh: TEST_auto_repair_bluestore_scrub: return 1
Deepika Upadhyay wrote:
http://qa-proxy.ceph.com/teuthology/yuriw-2020-10-20_15:30:01-rados-wip-yuri5-testing-2020...
Deepika Upadhyay
01:06 PM Bug #47949 (New): scrub/osd-scrub-repair.sh: TEST_auto_repair_bluestore_scrub: return 1
... Deepika Upadhyay
03:53 PM Bug #40777 (New): hit assert in AuthMonitor::update_from_paxos
Neha Ojha
03:22 PM Bug #47767: octopus: setting noscrub crashed osd process
It happened again moments after setting nodeep-scrub:... Dan van der Ster
11:00 AM Bug #46732: teuthology.exceptions.MaxWhileTries: 'check for active or peered' reached maximum tri...
saw this recently, with same configuration description:
/a/yuriw-2020-10-20_15:30:01-rados-wip-yuri5-testing-2020-...
Deepika Upadhyay
07:41 AM Bug #47945 (Duplicate): scrubbing failure
description: rados/thrash/{0-size-min-size-overrides/3-size-2-min-size 1-pg-log-overrides/short_
2-recovery-overri...
Deepika Upadhyay
04:59 AM Bug #46845 (Resolved): Newly orchestrated OSD fails with 'unable to find any IPv4 address in netw...
https://github.com/ceph/ceph/pull/37709 Kefu Chai

10/21/2020

10:41 PM Bug #47929: Huge RAM Usage on OSD recovery
Well try with:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<osd id> --pgid "<stuck_pg get from log>" ...
Luis Felipe Domínguez Vega
06:15 PM Bug #47929: Huge RAM Usage on OSD recovery
ceph -s: https://pastebin.ubuntu.com/p/3rjd435Sdh/
ceph pg dump: https://pastebin.ubuntu.com/p/THsSd2J33s/
Luis Felipe Domínguez Vega
05:57 PM Bug #47929: Huge RAM Usage on OSD recovery
Can you please provide the output of "ceph -s" and "ceph pg dump"? Neha Ojha
04:29 PM Bug #47929 (New): Huge RAM Usage on OSD recovery
Hi, today my infra provider had a blackout; Ceph then tried to
recover but is in an inconsistent state becaus...
Luis Felipe Domínguez Vega
10:32 PM Bug #47930: scrub/osd-recovery-scrub.sh: TEST_recovery_scrub: wait_background: return 1

Before scrubs were started in the background, not all PGs were in recovery. But somehow in this case the scrubs, p...
David Zafman
10:10 PM Bug #47930 (Resolved): scrub/osd-recovery-scrub.sh: TEST_recovery_scrub: wait_background: return 1
... Neha Ojha
10:13 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
rados/thrash-erasure-code-overwrites/{bluestore-bitmap ceph clusters/{fixed-2 openstack} fast/fast mon_election/conne... Neha Ojha
10:07 PM Bug #47024: rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
/a/teuthology-2020-10-21_07:01:02-rados-master-distro-basic-smithi/5544858 Neha Ojha
04:05 PM Bug #46318: mon_recovery: quorum_status times out
rados/monthrash/{ceph clusters/3-mons mon_election/connectivity msgr-failures/few msgr/async-v1only objectstore/blues... Neha Ojha
01:43 PM Bug #47328 (Fix Under Review): nautilus: ObjectStore/SimpleCloneTest: invalid rm coll
Igor Fedotov
01:19 PM Bug #47328 (In Progress): nautilus: ObjectStore/SimpleCloneTest: invalid rm coll
Igor Fedotov

10/20/2020

10:15 PM Bug #40777: hit assert in AuthMonitor::update_from_paxos
https://github.com/facebook/rocksdb/issues/5558 shows the same issue. Brad Hubbard
12:28 PM Bug #47907 (Can't reproduce): test_mon_mon: ceph mon stat -f json parse error
Kefu Chai
06:19 AM Bug #47907: test_mon_mon: ceph mon stat -f json parse error
passed at https://pulpito.ceph.com/kchai-2020-10-20_04:38:01-rados-master-distro-basic-smithi/... Kefu Chai
12:08 AM Bug #47907 (Can't reproduce): test_mon_mon: ceph mon stat -f json parse error
... Neha Ojha
08:12 AM Bug #44420 (Fix Under Review): cephadm cluster: "ceph ping mon.*" works fine, but "ceph ping mon....
Mykola Golub

10/19/2020

07:23 PM Feature #47732: Issue health warning if a performance issue is occurring especially for ceph-osd ...
Look at swap to make sure memory isn't over provisioned to containers, for example.
Do containers swap or crash if...
David Zafman
07:17 PM Feature #47732: Issue health warning if a performance issue is occurring especially for ceph-osd ...
Include in Orchestrator checks? David Zafman
09:01 AM Backport #47899 (In Progress): nautilus: mon stat prints plain text with -f json
Nathan Cutler
08:34 AM Backport #47899 (Resolved): nautilus: mon stat prints plain text with -f json
https://github.com/ceph/ceph/pull/37706 Nathan Cutler
08:58 AM Backport #47898 (In Progress): octopus: mon stat prints plain text with -f json
Nathan Cutler
08:34 AM Backport #47898 (Resolved): octopus: mon stat prints plain text with -f json
https://github.com/ceph/ceph/pull/37705 Nathan Cutler
06:08 AM Bug #46816 (Pending Backport): mon stat prints plain text with -f json
Kefu Chai
06:01 AM Bug #47024: rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
... Kefu Chai

10/16/2020

02:45 PM Bug #43795 (Resolved): Ceph tools utilizing "global_[pre_]init" no longer process "early" environ...
Nathan Cutler
02:45 PM Backport #43996 (Rejected): mimic: Ceph tools utilizing "global_[pre_]init" no longer process "ea...
mimic EOL Nathan Cutler

10/15/2020

02:10 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
鑫 王 wrote:
> hi Igor,
> Is there any new progress?
Hi!
I haven't managed to reproduce this locally for mast...
Igor Fedotov
10:20 AM Bug #44420: cephadm cluster: "ceph ping mon.*" works fine, but "ceph ping mon.<id>" is broken
Sebastian Wagner wrote:
> might be a cephadm issue.
Indeed, it does seem to happen *only* when the daemon is runn...
Nathan Cutler
08:30 AM Backport #47741: octopus: mon: set session_timeout when adding to session_map
Nathan Cutler wrote:
> @Konstantin: Which master PR are you intending to backport to octopus here?
@Nathan:
look...
Wei-Chung Cheng

10/14/2020

01:57 PM Bug #43887: ceph_test_rados_delete_pools_parallel failure
rados/monthrash/{ceph clusters/3-mons msgr-failures/few msgr/async objectstore/filestore-xfs
rados supported-random-...
Deepika Upadhyay
01:52 PM Bug #24057 (New): cbt fails to copy results to the archive dir
Deepika Upadhyay

10/13/2020

12:02 AM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
OK, fails 1 out of 10 times but seems new; need to look more.... Neha Ojha

10/12/2020

11:45 PM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
Did not reproduce here: https://pulpito.ceph.com/nojha-2020-10-12_19:42:27-rados:monthrash-master-distro-basic-smithi... Neha Ojha
05:08 PM Bug #47838 (In Progress): mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
... Neha Ojha
11:33 PM Bug #24057: cbt fails to copy results to the archive dir
http://qa-proxy.ceph.com/teuthology/yuriw-2020-10-09_19:36:20-rados-wip-yuri-testing-2020-10-09-1112-octopus-distro-b... Deepika Upadhyay
12:40 PM Bug #46816 (Fix Under Review): mon stat prints plain text with -f json
Nathan Cutler
11:35 AM Bug #46816: mon stat prints plain text with -f json
https://github.com/ceph/ceph/pull/37635 Joao Eduardo Luis
09:47 AM Bug #46816 (Triaged): mon stat prints plain text with -f json
Joao Eduardo Luis

10/10/2020

11:30 AM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
/a/kchai-2020-10-10_09:47:31-rados-wip-kefu-testing-2020-10-09-1210-distro-basic-smithi/5512894 Kefu Chai
08:59 AM Backport #47826 (Resolved): octopus: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
https://github.com/ceph/ceph/pull/37853 Nathan Cutler
08:59 AM Backport #47825 (Resolved): nautilus: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
https://github.com/ceph/ceph/pull/37815 Nathan Cutler
08:54 AM Bug #47813 (Fix Under Review): osd op age is 4294967296
Every few seconds on a 14.2.11 OSD I see an op with age close to 2^32:... Dan van der Ster
03:52 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
hi Igor,
Is there any new progress?
Stellar Wang

10/09/2020

03:04 PM Backport #47363 (Resolved): octopus: pgs inconsistent, union_shard_errors=missing
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37048
m...
Nathan Cutler
03:39 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> Would you please collect perf counter dumps for both running benchmark (e.g. ) and on its comp...
Stellar Wang

10/08/2020

09:57 PM Bug #47804 (New): EC backend implementation isn't optimal when handling 4k overwrites
EC backend performs redundant read/write ops when handling partial 4k-aligned overwrite.
E.g. there is 64K object i...
Igor Fedotov
06:54 PM Backport #47363: octopus: pgs inconsistent, union_shard_errors=missing
Mykola Golub wrote:
> https://github.com/ceph/ceph/pull/37048
merged
Yuri Weinstein
06:44 PM Bug #46405 (Pending Backport): osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Neha Ojha
05:54 PM Bug #45190: osd dump times out
/a/yuriw-2020-10-05_22:17:06-rados-wip-yuri7-testing-2020-10-05-1338-octopus-distro-basic-smithi/5500126/teuthology.l... Deepika Upadhyay
05:22 PM Bug #45948: ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
rados/monthrash/{ceph clusters/3-mons msgr-failures/few msgr/async objectstore/filestore-xfs
rados supported-ran...
Deepika Upadhyay
05:13 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
description: rados/singleton/{all/thrash_cache_writeback_proxy_none msgr-failures/few
msgr/async-v2only objectst...
Deepika Upadhyay

10/07/2020

09:34 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
rados/singleton/{all/thrash_cache_writeback_proxy_none mon_election/classic msgr-failures/few msgr/async objectstore/... Neha Ojha
11:44 AM Feature #47775: limit osd_pglog size by memory
I've seen that octopus has a new option osd_target_pg_log_entries_per_osd. This looks good. I would therefore amend m... Dan van der Ster
11:17 AM Feature #47775 (New): limit osd_pglog size by memory
We have an S3 cluster with OSDs running out of memory due to the large amount of ram needed to hold 3000 pglog entrie... Dan van der Ster

10/06/2020

08:37 PM Bug #44352: pool listings are slow after deleting objects
We compacted OSDs from time to time, and it helped for a while. We then moved the .rgw.root pool just to SSD drives (it ... Serg Protsun
03:00 PM Bug #47767: octopus: setting noscrub crashed osd process
Most likely a regression caused by https://github.com/ceph/ceph/pull/36292. Neha Ojha
02:41 PM Bug #47767: octopus: setting noscrub crashed osd process
Sorry about the formatting -- I think it's still legible.
I found other osds with missing object errors also trigg...
Dan van der Ster
02:20 PM Bug #47767 (Resolved): octopus: setting noscrub crashed osd process
We just had a crash of one osd (out of ~1200) moments after we set noscrub and nodeep-scrub on the cluster.
Here...
Dan van der Ster
12:58 PM Feature #47766 (New): crushtool compile support from json crush map
When programmatically manipulating a CRUSH map, JSON is often used; unfortunately, "crushtool" does not support compi... Sébastien Han

10/05/2020

10:00 PM Feature #42659 (Duplicate): add a health_warn when mon_osd_report_timeout <= mon_osd_report_timeout
Neha Ojha
09:56 PM Feature #42659 (Resolved): add a health_warn when mon_osd_report_timeout <= mon_osd_report_timeout
Neha Ojha
10:00 PM Bug #40668 (Resolved): mon_osd_report_timeout should not be allowed to be less than 2x the value ...
Neha Ojha
09:57 PM Bug #40668 (Duplicate): mon_osd_report_timeout should not be allowed to be less than 2x the value...
Neha Ojha
09:31 PM Backport #47741: octopus: mon: set session_timeout when adding to session_map
@Konstantin: Which master PR are you intending to backport to octopus here? Nathan Cutler
02:27 AM Backport #47741 (Duplicate): octopus: mon: set session_timeout when adding to session_map
Konstantin Shalygin
03:46 PM Backport #47748 (In Progress): nautilus: mon: set session_timeout when adding to session_map
Wei-Chung Cheng
01:13 PM Backport #47748 (Resolved): nautilus: mon: set session_timeout when adding to session_map
https://github.com/ceph/ceph/pull/37554 Nathan Cutler
03:46 PM Backport #47747 (In Progress): octopus: mon: set session_timeout when adding to session_map
Wei-Chung Cheng
01:13 PM Backport #47747 (Resolved): octopus: mon: set session_timeout when adding to session_map
https://github.com/ceph/ceph/pull/37553 Nathan Cutler
03:16 PM Documentation #47754 (New): Orchestrator implementation status table is old
https://docs.ceph.com/en/latest/mgr/orchestrator/#current-implementation-status
This table, showing the current im...
Zac Dover
03:11 PM Documentation #47522 (Closed): Document "ceph df detail"
Zac Dover
03:09 PM Documentation #47523 (In Progress): ceph df documentation is outdated
Zac Dover
12:44 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Seems like the leader has been down for a long time: ... Deepika Upadhyay
06:17 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-16_23:57:37-rados-wip-yuri8-testing-2020-09-16-2220-octopus-distro-... Deepika Upadhyay

10/04/2020

05:48 AM Bug #47697 (Pending Backport): mon: set session_timeout when adding to session_map
Kefu Chai

10/03/2020

02:09 AM Bug #37532 (Resolved): mon: expected_num_objects warning triggers on bluestore-only setups
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:05 AM Bug #44815 (Resolved): Pool stats increase after PG merged (PGMap::apply_incremental doesn't subt...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:04 AM Bug #46024 (Resolved): larger osd_scrub_max_preemptions values cause Floating point exception
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:04 AM Bug #46216 (Resolved): mon: log entry with garbage generated by bad memory access
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:03 AM Bug #46705 (Resolved): Negative peer_num_objects crashes osd
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:03 AM Bug #46824 (Resolved): "No such file or directory" when exporting or importing a pool if locator ...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:02 AM Bug #47159 (Resolved): add ability to clean_temps in osdmaptool
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:02 AM Bug #47309 (Resolved): mon/mon-last-epoch-clean.sh failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:36 AM Backport #47346 (Resolved): octopus: mon/mon-last-epoch-clean.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37349
m...
Nathan Cutler
01:36 AM Backport #47251 (Resolved): octopus: add ability to clean_temps in osdmaptool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37348
m...
Nathan Cutler
01:25 AM Backport #47345 (Resolved): nautilus: mon/mon-last-epoch-clean.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37478
m...
Nathan Cutler
01:25 AM Backport #47250 (Resolved): nautilus: add ability to clean_temps in osdmaptool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37477
m...
Nathan Cutler
01:24 AM Backport #46965 (Resolved): nautilus: Pool stats increase after PG merged (PGMap::apply_increment...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37476
m...
Nathan Cutler
01:24 AM Backport #46935 (Resolved): nautilus: "No such file or directory" when exporting or importing a p...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37475
m...
Nathan Cutler
01:24 AM Backport #46738 (Resolved): nautilus: mon: expected_num_objects warning triggers on bluestore-onl...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37474
m...
Nathan Cutler
01:23 AM Backport #46710 (Resolved): nautilus: Negative peer_num_objects crashes osd
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37473
m...
Nathan Cutler
01:23 AM Backport #46262 (Resolved): nautilus: larger osd_scrub_max_preemptions values cause Floating poin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37470
m...
Nathan Cutler
01:23 AM Backport #46461 (Resolved): nautilus: pybind/mgr/balancer: should use "==" and "!=" for comparing...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37471
m...
Nathan Cutler

10/02/2020

10:09 PM Bug #40367 (Can't reproduce): "*** Caught signal (Segmentation fault) **" in upgrade:luminous-x-n...
Neha Ojha
10:07 PM Bug #40081 (Closed): mon: luminous crash attempting to decode maps after nautilus quorum has been...
Doesn't apply to nautilus and future releases. Luminous and mimic are EOL. Neha Ojha
10:02 PM Bug #40029 (Resolved): ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(Cep...
Neha Ojha
10:02 PM Bug #40029 (Rejected): ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(Cep...
This only seems to be a problem with luminous and mimic, which are EOL now. Neha Ojha
09:56 PM Bug #39366 (Can't reproduce): ClsLock.TestRenew failure
Neha Ojha
09:55 PM Bug #38513 (Rejected): luminous: "AsyncReserver.h: 190: FAILED assert(!queue_pointers.count(item)...
Luminous is EOL. Neha Ojha
09:54 PM Bug #38402 (Can't reproduce): ceph-objectstore-tool on down osd w/ not enough in osds
Neha Ojha
09:53 PM Bug #38375 (Need More Info): OSD segmentation fault on rbd create
Seems like we have lost the ceph-post-file due to one of the lab incidents. Neha Ojha
09:46 PM Bug #38064 (Duplicate): librados::OPERATION_FULL_TRY not completely implemented, test LibRadosAio...
Josh Durgin
09:42 PM Bug #35974: Apparent export-diff/import-diff corruption
Looking at this again, it seems like a potential bug when reading from replicas and encountering an EIO - this should... Josh Durgin
09:18 PM Bug #46318: mon_recovery: quorum_status times out
Joao, are you working on a fix for this? Neha Ojha
09:15 PM Bug #36304: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_split_child(PG*)
Haven't seen this in a while. Neha Ojha
09:11 PM Bug #44362 (Can't reproduce): osd: uninitialized memory in sendmsg
Neha Ojha
09:03 PM Bug #46405 (Fix Under Review): osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Neha Ojha
06:30 PM Feature #47732 (New): Issue health warning if a performance issue is occurring especially for cep...

This feature would identify a false network ping warning which might occur with a very busy ceph-osd(s).
The mon...
David Zafman
06:26 PM Backport #47346: octopus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37349
merged
Yuri Weinstein
06:25 PM Backport #47251: octopus: add ability to clean_temps in osdmaptool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37348
merged
Yuri Weinstein
04:30 PM Bug #45191: erasure-code/test-erasure-eio.sh: TEST_ec_single_recovery_error fails
http://qa-proxy.ceph.com/teuthology/yuriw-2020-10-01_17:46:11-rados-wip-yuri5-testing-2020-10-01-0834-octopus-distro-... Deepika Upadhyay

10/01/2020

11:04 PM Backport #46287 (Rejected): nautilus: mon: log entry with garbage generated by bad memory access
Not required for Nautilus. Patrick Donnelly
09:18 PM Bug #47508 (In Progress): Multiple read errors cause repeated entry/exit recovery for each error
David Zafman
06:10 PM Bug #47692: qa/standalone/osd/osd-backfill-stats.sh TEST_backfill_sizeup wait_for_clean timeout

qa/standalone/osd/osd-backfill-stats.sh:213: TEST_backfill_sizeup_out
https://pulpito.ceph.com/dzafman-2020-09...
David Zafman
05:18 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/teuthology-2020-10-01_07:01:02-rados-master-distro-basic-smithi/5486214 Neha Ojha
05:14 PM Bug #47719 (Resolved): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
... Neha Ojha
04:54 PM Backport #47345: nautilus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37478
merged
Yuri Weinstein
04:54 PM Backport #47250: nautilus: add ability to clean_temps in osdmaptool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37477
merged
Yuri Weinstein
04:53 PM Backport #46965: nautilus: Pool stats increase after PG merged (PGMap::apply_incremental doesn't ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37476
merged
Yuri Weinstein
04:53 PM Backport #46935: nautilus: "No such file or directory" when exporting or importing a pool if loca...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37475
merged
Yuri Weinstein
04:52 PM Backport #46738: nautilus: mon: expected_num_objects warning triggers on bluestore-only setups
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37474
merged
Yuri Weinstein
04:51 PM Backport #46710: nautilus: Negative peer_num_objects crashes osd
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37473
merged
Yuri Weinstein
04:51 PM Backport #46262: nautilus: larger osd_scrub_max_preemptions values cause Floating point exception
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37470
merged
Yuri Weinstein
04:38 PM Backport #46461: nautilus: pybind/mgr/balancer: should use "==" and "!=" for comparing strings
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37471
merged
Yuri Weinstein
08:58 AM Bug #47697: mon: set session_timeout when adding to session_map
Not to my knowledge. When it happens, it is mostly transparent to the user -- the peer reopens the socket and attemp... Ilya Dryomov
02:22 AM Bug #47697: mon: set session_timeout when adding to session_map
Were there upstream QA tests that failed because of this? How did you learn of this problem? Patrick Donnelly
07:56 AM Bug #47712 (New): hdd pg's migrating when converting one ssd class osd to dmcrypt
I have PGs of hdd pools remapping when I take out an ssd osd.
change crush reweight of ssd osd 33 to 0.0
[@ce...
none none

09/30/2020

05:58 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
rados/singleton/{all/osd-recovery-incomplete mon_election/connectivity msgr-failures/many msgr/async-v2only objectsto... Neha Ojha
05:56 PM Bug #47654: test_mon_pg: mon fails to join quorum to due election strategy mismatch
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483682 Neha Ojha
05:55 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483631 Neha Ojha
03:41 PM Documentation #46531 (Resolved): The default value of osd_scrub_during_recovery is false since v1...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:41 PM Feature #46663 (Resolved): Add pg count for pools in the `ceph df` command
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:40 PM Bug #46914 (Resolved): mon: stuck osd_pgtemp message forwards
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:39 PM Bug #47180 (Resolved): qa/standalone/mon/mon-handle-forward.sh failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:39 AM Bug #47697 (Fix Under Review): mon: set session_timeout when adding to session_map
Ilya Dryomov
11:30 AM Bug #47697 (Resolved): mon: set session_timeout when adding to session_map
With msgr2, the session is added in Monitor::ms_handle_accept() which is queued by ProtocolV2 at the end of handling ... Ilya Dryomov
07:14 AM Backport #46587 (Resolved): nautilus: The default value of osd_scrub_during_recovery is false sin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37472
m...
Nathan Cutler
07:13 AM Backport #47091 (Resolved): octopus: mon: stuck osd_pgtemp message forwards
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37347
m...
Nathan Cutler
07:12 AM Backport #47258 (Resolved): octopus: Add pg count for pools in the `ceph df` command
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36945
m...
Nathan Cutler
01:06 AM Bug #47692: qa/standalone/osd/osd-backfill-stats.sh TEST_backfill_sizeup wait_for_clean timeout

qa/standalone/osd/osd-recovery-stats.sh TEST_recovery_replicated_out1 wait_for_clean time out
https://pulpito.c...
David Zafman
01:01 AM Bug #47692 (New): qa/standalone/osd/osd-backfill-stats.sh TEST_backfill_sizeup wait_for_clean ti...

wait_for_clean timeout in TEST_backfill_sizeup
https://pulpito.ceph.com/dzafman-2020-09-29_20:13:01-rados-wip-za...
David Zafman
12:57 AM Bug #47691 (New): osd-markdown.sh TEST_markdown_boot, osd not having enough time to boot

qa/standalone/osd/osd-markdown.sh:102: TEST_markdown_boot: ceph tell osd.0 get_latest_osdmap
2020-09-29T22:35:45....
David Zafman

09/29/2020

10:46 PM Bug #46405 (In Progress): osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
David Zafman
08:11 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/teuthology-2020-09-29_07:01:02-rados-master-distro-basic-smithi/5480928 Neha Ojha
06:00 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
This change fixes the odd object names in the subtest, but shouldn't help fix this problem. On my build machi... David Zafman
09:49 PM Backport #47091: octopus: mon: stuck osd_pgtemp message forwards
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37347
merged
Yuri Weinstein
09:47 PM Feature #46663: Add pg count for pools in the `ceph df` command
https://github.com/ceph/ceph/pull/36945 merged Yuri Weinstein
08:39 PM Bug #46603: osd/osd-backfill-space.sh: TEST_ec_backfill_simple: return 1

http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-28_18:47:33-rados-wip-yuri-testing-2020-09-28-1007-octopus-distro...
Deepika Upadhyay
08:31 PM Bug #47153: monitor crash during upgrade due to LogSummary encoding changes between luminous and ...
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/36838 is not a fix for this issue. It was being used to reprodu...
Yuri Weinstein
08:27 PM Bug #47153 (New): monitor crash during upgrade due to LogSummary encoding changes between luminou...
https://github.com/ceph/ceph/pull/36838 is not a fix for this issue. It was being used to reproduce this issue but ha... Neha Ojha
08:25 PM Bug #47153: monitor crash during upgrade due to LogSummary encoding changes between luminous and ...
since the fix is targeting nautilus, I'll go out on a limb and fill in "Affected versions" with a guess. Nathan Cutler
08:24 PM Bug #47153 (Fix Under Review): monitor crash during upgrade due to LogSummary encoding changes be...
Nathan Cutler
04:56 PM Backport #47345 (In Progress): nautilus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler
04:55 PM Backport #47250 (In Progress): nautilus: add ability to clean_temps in osdmaptool
Nathan Cutler
04:38 PM Backport #46965 (In Progress): nautilus: Pool stats increase after PG merged (PGMap::apply_increm...
Nathan Cutler
04:37 PM Backport #46935 (In Progress): nautilus: "No such file or directory" when exporting or importing ...
Nathan Cutler
04:36 PM Backport #46738 (In Progress): nautilus: mon: expected_num_objects warning triggers on bluestore-...
Nathan Cutler
04:36 PM Backport #46710 (In Progress): nautilus: Negative peer_num_objects crashes osd
Nathan Cutler
04:33 PM Backport #46587 (In Progress): nautilus: The default value of osd_scrub_during_recovery is false ...
Nathan Cutler
04:32 PM Backport #46461 (In Progress): nautilus: pybind/mgr/balancer: should use "==" and "!=" for compar...
Nathan Cutler
04:31 PM Backport #46287: nautilus: mon: log entry with garbage generated by bad memory access
don't see how the code change applies to nautilus Nathan Cutler
04:31 PM Backport #46287 (Need More Info): nautilus: mon: log entry with garbage generated by bad memory a...
Nathan Cutler
04:27 PM Backport #46262 (In Progress): nautilus: larger osd_scrub_max_preemptions values cause Floating p...
Nathan Cutler
12:12 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
鑫 王 wrote:
>
> In addition, can you tell me which perf counter fields you care about?
Everything under "bluest...
Igor Fedotov
11:38 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> So a short summary for now is:
> 1) High memory consumption is just temporary and goes away o...
Stellar Wang
11:09 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Would you please collect perf counter dumps for both running benchmark (e.g. in the middle of it) and on its completi... Igor Fedotov
11:08 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
鑫 王 wrote:
>
> *Question:*
> In mempool dump information, bluestore_writing takes up most of the memory. How is t...
Igor Fedotov
11:00 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
So a short summary for now is:
1) High memory consumption is just temporary and goes away on writing benchmark compl...
Igor Fedotov
10:56 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Hi Igor,
*I did the following today.*
# Adjust RocksDB parameters (max_write_buffer_number=4,write_buffer_size=128...
Stellar Wang
03:12 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> And could you please try the same benchmark against a replicated pool? Would this have the same ...
Stellar Wang
02:46 AM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Igor Fedotov wrote:
> Does memory consumption stay that high after the benchmark is completed/terminated?
Answer: Mem...
Stellar Wang
12:32 AM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
rados/singleton/{all/peer mon_election/connectivity msgr-failures/many msgr/async-v2only objectstore/bluestore-bitmap... Neha Ojha
12:30 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/teuthology-2020-09-28_07:01:02-rados-master-distro-basic-smithi/5476775... Neha Ojha

09/28/2020

07:30 PM Backport #47599 (Resolved): octopus: qa/standalone/mon/mon-handle-forward.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36705
m...
Nathan Cutler
07:30 PM Backport #47600 (Resolved): nautilus: qa/standalone/mon/mon-handle-forward.sh failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36704
m...
Nathan Cutler
02:51 PM Backport #47600: nautilus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36704
merged
Yuri Weinstein
06:05 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
And could you please try the same benchmark against a replicated pool? Would this have the same problem? Igor Fedotov
06:04 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Does memory consumption stay that high after the benchmark is completed/terminated?
Igor Fedotov
02:09 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
Hi Igor,
Thank you for your quick feedback. OSD memory still exceeds the set threshold of 2G when I run it again,...
Stellar Wang
12:16 PM Bug #47673: cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
This might be related to https://tracker.ceph.com/issues/46658
Could you please collect a mempool dump from an OSD...
Igor Fedotov
11:47 AM Bug #47673 (New): cephfs 4k randwrite + EC pool(2+1) + single node all OSDs OOM
A 4K random write scenario on a single-node, all-SSD CephFS causes OSD memory to grow indefinitely and ... Stellar Wang
05:08 PM Bug #47654: test_mon_pg: mon fails to join quorum to due election strategy mismatch
/a/teuthology-2020-09-28_07:01:02-rados-master-distro-basic-smithi/5476834 Neha Ojha
01:57 PM Documentation #47523 (Duplicate): ceph df documentation is outdated
Zac Dover
01:57 PM Documentation #47523: ceph df documentation is outdated
See issue #47522.
https://tracker.ceph.com/issues/47522
Zac Dover
01:53 AM Feature #47666: Ceph pool history
I second this, having benefited from ZFS history in my Solaris 10 days.
For years we've relied on shell history fo...
Anthony D'Atri

09/27/2020

10:59 PM Backport #47599: octopus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36705
merged
Yuri Weinstein
10:18 PM Feature #47666 (New): Ceph pool history
Introduce a "ceph pool $pool history" command to obtain a history of the operations that modified pool state, i.e. changing ... Stefan Kooman
04:14 PM Bug #47452 (Resolved): invalid values of crush-failure-domain should not be allowed while creatin...
Kefu Chai

09/26/2020

04:36 PM Bug #47590: osd do not respect scrub schedule
Neha Ojha wrote:
> Can you provide the output of "ceph config dump"?
WHO MASK LEVEL OPTION ...
Petr Bena

09/25/2020

09:21 PM Bug #47590 (Need More Info): osd do not respect scrub schedule
Can you provide the output of "ceph config dump"? Neha Ojha
08:32 PM Bug #47654 (Resolved): test_mon_pg: mon fails to join quorum to due election strategy mismatch
... Neha Ojha
08:10 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/teuthology-2020-09-25_07:01:01-rados-master-distro-basic-smithi/5466817 Neha Ojha
08:09 PM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
/a/teuthology-2020-09-25_07:01:01-rados-master-distro-basic-smithi/5466707 Neha Ojha
02:44 PM Bug #46877: mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
Deepika Upadhyay wrote:
> This time the job died after this warning:
>
> /a/yuriw-2020-08-20_19:48:15-rados-wip-...
Neha Ojha

09/24/2020

08:07 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
http://qa-proxy.ceph.com/teuthology/yuriw-2020-09-23_21:32:26-rados-wip-yuri4-testing-2020-09-23-1206-nautilus-distro... Deepika Upadhyay
02:37 PM Bug #46877: mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
This time the job died after this warning:
/a/yuriw-2020-08-20_19:48:15-rados-wip-yuri-testing-2020-08-17-1723-oc...
Deepika Upadhyay
01:18 PM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
http://pulpito.ceph.com/yuriw-2020-09-23_15:16:58-rados-wip-yuri-testing-2020-09-22-1332-octopus-distro-basic-smithi/... Deepika Upadhyay
09:14 AM Bug #47626 (New): process will crash by invalidate pointer
1: Version: mimic 13.2.9-0.el7.aarch64.rpm
2: coredump gdb info
(gdb) bt
#0 now (this=0x30) at /usr/src/debug/c...
Yi Li
12:42 AM Documentation #47522: Document "ceph df detail"
https://docs.ceph.com/en/latest/man/8/ceph/#df -- man page location
https://docs.ceph.com/en/latest/rados/operatio...
Zac Dover

09/23/2020

03:04 PM Bug #47617 (New): rebuild_mondb: daemon-helper: command failed with exit status 1
/a/yuriw-2020-09-16_23:57:37-rados-wip-yuri8-testing-2020-09-16-2220-octopus-distro-basic-smithi/5441511/teuthology.l... Deepika Upadhyay
11:52 AM Backport #47599 (In Progress): octopus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler
09:08 AM Backport #47599 (Resolved): octopus: qa/standalone/mon/mon-handle-forward.sh failure
https://github.com/ceph/ceph/pull/36705 Nathan Cutler
11:49 AM Backport #47346 (In Progress): octopus: mon/mon-last-epoch-clean.sh failure
Nathan Cutler
11:48 AM Backport #47251 (In Progress): octopus: add ability to clean_temps in osdmaptool
Nathan Cutler
11:47 AM Backport #47091 (In Progress): octopus: mon: stuck osd_pgtemp message forwards
Nathan Cutler
11:15 AM Bug #47290 (Resolved): osdmaps aren't being cleaned up automatically on healthy cluster
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:09 AM Backport #47600 (In Progress): nautilus: qa/standalone/mon/mon-handle-forward.sh failure
Nathan Cutler
09:08 AM Backport #47600 (Resolved): nautilus: qa/standalone/mon/mon-handle-forward.sh failure
https://github.com/ceph/ceph/pull/36704 Nathan Cutler
08:19 AM Backport #47362 (Resolved): nautilus: pgs inconsistent, union_shard_errors=missing
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37051
m...
Nathan Cutler
08:05 AM Backport #47297 (Resolved): octopus: osdmaps aren't being cleaned up automatically on healthy clu...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36981
m...
Nathan Cutler
 
