Activity

From 05/11/2020 to 06/09/2020

06/09/2020

09:34 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
Brad Hubbard
02:58 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35387
merged
Yuri Weinstein
09:02 PM Bug #42716: Pool creation error message is hidden on FileStore-backed pools
That wasn't the initial issue reported.
What happens if you run "ceph osd pool create foo2 2048" instead? (assumin...
Dimitri Savineau
07:38 PM Bug #42716 (Resolved): Pool creation error message is hidden on FileStore-backed pools
closing this as already resolved.... Deepika Upadhyay
02:41 PM Bug #36337: OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
... Neha Ojha
02:41 PM Bug #45956 (New): verify takes forever to finish
rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-thrash/default/{default.yaml thra... Kefu Chai
12:24 PM Bug #45661 (Resolved): valgrind issue: UninitValue in ProtocolV2
In @master@ the PR #35407 has been closed in favor of https://github.com/ceph/ceph/pull/35186.
#35407 still might be...
Radoslaw Zarzynski
06:34 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
Oops, this is a dup of #43887 Brad Hubbard
06:31 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
/a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129541... Brad Hubbard
06:06 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
Note https://tracker.ceph.com/issues/43861 removed this test from master because it was hanging. Brad Hubbard
06:02 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
This is very similar to what is seen in #45946 so they may be related. Brad Hubbard
06:01 AM Bug #45947 (New): ceph_test_rados_watch_notify hang seen in nautilus
/a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129565... Brad Hubbard
05:32 AM Bug #45946 (New): ceph_test_rados_delete_pools_parallel hang seen in octopus
/a/yuriw-2020-05-29_15:51:00-rados-wip-yuri-testing-2020-05-28-2238-octopus-distro-basic-smithi/5103106... Brad Hubbard
04:28 AM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
... Kefu Chai
12:05 AM Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
Seen again:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-basic-smithi/5130114
David Zafman

06/08/2020

11:51 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
Saw this in at least 17 jobs:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-...
David Zafman
11:39 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
This appears to be a rare condition in which the 15-second sleep was not enough. Neha Ojha
09:14 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
... Neha Ojha
09:10 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
rados/multimon/{clusters/21 msgr-failures/few msgr/async-v1only no_pools objectstore/bluestore-comp-zlib rados suppor... Neha Ojha
07:39 PM Bug #45943 (Fix Under Review): Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
07:09 PM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
The heartbeat grace timer does not reset after the cluster network has been stable for multiple days.
Implement a mechanism to...
Sridhar Seshasayee
06:31 PM Backport #45891 (In Progress): luminous: osd: pg stuck in waitactingchange when new acting set do...
Nathan Cutler
06:22 PM Backport #45892 (In Progress): mimic: osd: pg stuck in waitactingchange when new acting set doesn...
Nathan Cutler
12:51 PM Bug #45795 (Fix Under Review): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
Ilya Dryomov
07:01 AM Bug #45916: cls_lock: unlimited shared lock created by libradosstriper api let node crash
add pr: https://github.com/ceph/ceph/pull/35467 Zhenyi Shu
06:50 AM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
_Background: Ceph Luminous is running in our production environment, and a service uses the libradosstriper API to access Ceph._
W...
Zhenyi Shu

06/06/2020

08:45 AM Backport #45357 (Resolved): octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34881
m...
Nathan Cutler
08:31 AM Backport #45884 (In Progress): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler
08:31 AM Backport #45882 (In Progress): octopus: Objecter: don't attempt to read from non-primary on EC pools
Nathan Cutler
08:30 AM Backport #45779 (In Progress): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen...
Nathan Cutler
08:29 AM Backport #45775 (In Progress): octopus: build_incremental_map_msg missing incremental map while s...
Nathan Cutler
08:28 AM Backport #45673 (In Progress): octopus: qa: powercycle: install task runs twice with double unwin...
Nathan Cutler
12:53 AM Bug #44314 (In Progress): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_ou...
David Zafman

06/05/2020

10:52 PM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...

It would be helpful to see the osd logs when this happens. We are expecting the following sequence to occur.
St...
David Zafman
04:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777 Neha Ojha
04:17 PM Bug #45424: api_watch_notify_pp: [ FAILED ] LibRadosWatchNotifyECPP.WatchNotify watch_notify_cx...
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117783 Neha Ojha
04:01 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5118028 Neha Ojha
03:58 PM Bug #44517: osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
... Neha Ojha

06/04/2020

09:15 PM Bug #45868: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Similar... Neha Ojha
09:06 PM Bug #45661 (Fix Under Review): valgrind issue: UninitValue in ProtocolV2
https://github.com/ceph/ceph/pull/35407 Radoslaw Zarzynski
10:07 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
Pin-pointed to a branch of @PrimaryLogPG::do_manifest_flush()@:... Radoslaw Zarzynski
08:36 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
... Radoslaw Zarzynski
06:08 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Ah, that makes sense. It should suffice to simply not populate_obc_watchers if replica. Samuel Just
05:42 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
After more digging, this doesn't appear to be related to notifies being sent to replicas.
The issue seems to be wi...
Ilya Dryomov
12:48 PM Backport #45890 (In Progress): nautilus: osd: pg stuck in waitactingchange when new acting set do...
Nathan Cutler
11:58 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
https://github.com/ceph/ceph/pull/35389 Nathan Cutler
12:44 PM Backport #45883 (In Progress): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler
11:55 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35388 Nathan Cutler
12:44 PM Backport #45780 (In Progress): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (see...
Nathan Cutler
12:43 PM Backport #45776 (In Progress): nautilus: build_incremental_map_msg missing incremental map while ...
Nathan Cutler
11:59 AM Backport #45892 (Rejected): mimic: osd: pg stuck in waitactingchange when new acting set doesn't ...
https://github.com/ceph/ceph/pull/35484 Nathan Cutler
11:59 AM Backport #45891 (Rejected): luminous: osd: pg stuck in waitactingchange when new acting set doesn...
https://github.com/ceph/ceph/pull/35485 Nathan Cutler
11:55 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35445 Nathan Cutler
11:55 AM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
https://github.com/ceph/ceph/pull/35444 Nathan Cutler
07:16 AM Bug #45871 (New): Incorrect (0) number of slow requests in health check
ceph version 14.2.9-899-gc02349c600 (c02349c60052aaa6c7bd0c2270c7f7be16fab632) nautilus (stable)
Our cluster shows...
Eugen Block
12:24 AM Bug #40117 (Duplicate): PG stuck in WaitActingChange
Fixed in https://tracker.ceph.com/issues/41190 Neha Ojha
12:21 AM Bug #41190 (Pending Backport): osd: pg stuck in waitactingchange when new acting set doesn't change
Neha Ojha
12:20 AM Bug #41236 (Resolved): cosbench failures in rados/perf
Neha Ojha
12:18 AM Bug #41550 (Resolved): os/bluestore: fadvise_flag leak in generate_transaction
Neha Ojha
12:17 AM Bug #41677 (Resolved): Cephmon:fix mon crash
Fixed as a part of https://tracker.ceph.com/issues/41680. Neha Ojha
12:14 AM Bug #41913 (Resolved): With auto scaler operating stopping an OSD can lead to COT crashing instea...
Neha Ojha
12:08 AM Bug #45356 (Resolved): nautilus: rados/upgrade/mimic-x-singleton failures due to mon_client_direc...
Neha Ojha

06/03/2020

09:06 PM Bug #45733 (Pending Backport): osd-scrub-repair.sh: SyntaxError: invalid syntax
Neha Ojha
06:12 PM Bug #45733: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35279 merged Yuri Weinstein
08:50 PM Backport #45357: octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
Dan Hill wrote:
> https://github.com/ceph/ceph/pull/34881
merged
Yuri Weinstein
08:34 PM Bug #45868 (Resolved): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
... Neha Ojha
08:30 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-06-02_15:07:59-rados-wip-yuri7-testing-2020-06-01-2256-octopus-distro-basic-smithi/5113082 - octopus Neha Ojha
04:44 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Moving this since it appears to be a problem with the mon_thrasher (or the MONs or monclients).... Brad Hubbard
02:44 PM Bug #45793 (Pending Backport): Objecter: don't attempt to read from non-primary on EC pools
Kefu Chai
01:24 PM Backport #41533: mimic: Move bluefs alloc size initialization log message to log level 1
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30219
m...
Nathan Cutler
12:59 PM Bug #45857 (New): crimson/alien_store: alienstore cannot open_collections
setup: setting debug level 20 for bluestore, filestore and osd and using seastar with seastar_default_allocator + Rel... Deepika Upadhyay
01:50 AM Bug #9984: lttng_probe_unregister hangs on shutdown
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104372
Possibly an instance of thi...
Brad Hubbard

06/02/2020

07:14 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
I see. Watch being a write and notify being a read has always tripped me up, but I guess I looked at it from the side e... Ilya Dryomov
03:28 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Well, osd-side notifies are reads in that they don't result in mutation. I think lingerops in general probably shoul... Samuel Just
10:38 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Samuel Just wrote:
> Did that fire on the replica? At a guess, the problem is that notifies are being sent to repli...
Ilya Dryomov
02:07 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
It probably isn't https://tracker.ceph.com/issues/15391. Samuel Just
02:05 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Did that fire on the replica? At a guess, the problem is that notifies are being sent to replicas, which would be wr... Samuel Just
07:08 PM Bug #45802 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
Casey Bodley
06:19 PM Bug #45802 (Fix Under Review): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
06:17 PM Bug #45802 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
Same root cause as https://tracker.ceph.com/issues/45619.
http://pulpito.ceph.com/teuthology-2020-05-30_03:05:02...
Neha Ojha
07:16 AM Bug #45809 (New): When out a osd, the `MAX AVAIL` doesn't change.
Environment: Luminous 12.2.12
I have a question about the pool's `MAX AVAIL` in `ceph df`.
When I out an OSD, th...
chao wang
06:00 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104057 Brad Hubbard
05:13 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5103952
/a/yuriw-2020-05-30_02:18:17-...
Brad Hubbard

06/01/2020

03:21 PM Bug #45802 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
multiple RGW tests are failing on different branches, with:... Casey Bodley
12:13 AM Bug #45796 (New): Ceph mon's sporadically report slow ops
We have recently upgraded our cluster to 14.2.9 from 10.2.6 and are in the process of a rolling rebuild of many of th... David Hows

05/31/2020

01:20 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Sam, could you please take a look? Ilya Dryomov
01:19 PM Bug #45795 (Resolved): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().e...
I'm running into this assert while trying to exercise krbd with replica reads (particularly balanced reads):... Ilya Dryomov
12:34 PM Bug #45793: Objecter: don't attempt to read from non-primary on EC pools
Marking only for octopus, since replica reads are safe for general use only in octopus. Ilya Dryomov
12:32 PM Bug #45793 (Fix Under Review): Objecter: don't attempt to read from non-primary on EC pools
Ilya Dryomov
12:25 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
Ilya Dryomov

05/29/2020

05:31 PM Backport #45781 (Rejected): mimic: rados/test_envlibrados_for_rocksdb.sh build failure (seen in n...
Nathan Cutler
05:31 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
https://github.com/ceph/ceph/pull/35387 Nathan Cutler
05:31 PM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
https://github.com/ceph/ceph/pull/35443 Nathan Cutler
05:30 PM Backport #45776 (Resolved): nautilus: build_incremental_map_msg missing incremental map while sna...
https://github.com/ceph/ceph/pull/35386 Nathan Cutler
05:30 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
https://github.com/ceph/ceph/pull/35442 Nathan Cutler
05:16 AM Bug #45761 (Need More Info): mon_thrasher: "Error ENXIO: mon unavailable" during sync_force comma...
/a/yuriw-2020-05-28_02:23:45-rados-wip-yuri-master_5.27.20-distro-basic-smithi/5097794... Brad Hubbard
04:11 AM Bug #45619 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
Kefu Chai
03:58 AM Bug #45760 (Resolved): osd-scrub-snaps.sh: TEST_scrub_snaps failed
Neha Ojha

05/28/2020

10:48 PM Bug #45760 (Fix Under Review): osd-scrub-snaps.sh: TEST_scrub_snaps failed
Neha Ojha
09:12 PM Bug #45760 (Resolved): osd-scrub-snaps.sh: TEST_scrub_snaps failed
... Neha Ojha
09:39 PM Bug #45660 (Resolved): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
Neha Ojha
12:42 AM Bug #45660 (Fix Under Review): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
Neha Ojha
08:57 PM Bug #45619 (Fix Under Review): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
01:52 PM Bug #41399 (Resolved): Move bluefs alloc size initialization log message to log level 1
Vikhyat Umrao
01:52 PM Backport #41533 (Resolved): mimic: Move bluefs alloc size initialization log message to log level 1
Vikhyat Umrao
07:17 AM Bug #45606 (Pending Backport): build_incremental_map_msg missing incremental map while snaptrim o...
Kefu Chai
06:38 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
... Kefu Chai
06:08 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
@/a/kchai-2020-05-27_23:43:53-rados-wip-kefu-testing-2020-05-27-2242-distro-basic-smithi/5097299/remote/*/log/valgrin... Kefu Chai
02:10 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088037
/a/yuriw-2020-05-24_19:30:40-...
Brad Hubbard

05/27/2020

10:34 PM Bug #45733 (Fix Under Review): osd-scrub-repair.sh: SyntaxError: invalid syntax
Brad Hubbard
10:29 PM Bug #45733 (Resolved): osd-scrub-repair.sh: SyntaxError: invalid syntax
/a/yuriw-2020-05-23_15:15:01-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5085557... Brad Hubbard
09:21 PM Bug #45660: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
... Neha Ojha
06:59 AM Bug #45660: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
/a/yuriw-2020-05-23_15:15:01-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5085557 Brad Hubbard
09:05 PM Bug #44715: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_...
Lowering severity since we haven't seen it in two weeks. Neha Ojha
08:37 PM Bug #45619 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
http://pulpito.front.sepia.ceph.com/yuvalif-2020-05-19_14:52:46-rgw:verify-fix-amqp-urls-with-vhosts-distro-basic-smi... Neha Ojha
06:35 PM Bug #45619: Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha wrote:
> Seen in the rados suite: /a/nojha-2020-05-21_19:33:40-rados-wip-32601-distro-basic-smithi/5077159...
Neha Ojha
01:58 PM Bug #44981 (Pending Backport): rados/test_envlibrados_for_rocksdb.sh build failure (seen in nauti...
Kefu Chai
08:22 AM Bug #45721 (Resolved): CommandFailedError: Command failed (workunit test rados/test_python.sh) FA...
/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088170... Brad Hubbard
07:02 AM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
/a/yuriw-2020-05-23_15:15:01-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5085549 Brad Hubbard
02:20 AM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
/a/yuriw-2020-05-22_19:55:53-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5083462 Brad Hubbard
06:55 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
/a/yuriw-2020-05-23_15:15:01-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5085545
/a/yuriw-2020-05-23_15:15:01-...
Brad Hubbard

05/26/2020

03:11 PM Bug #45695: librados: significant memory consumption
David Disseldorp wrote:
> I've tested with in-memory logging disabled via the client ceph.conf:
>
> [...]
>
> ...
David Disseldorp
11:33 AM Bug #45706 (New): Memory usage in buffer_anon showing unbounded growth in osds on EC pool. (14.2.9)
Hi,
Re these threads in the mailing list: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/DPBVNJQX...
Sam Skipsey
07:12 AM Bug #45588 (Resolved): test_envlibrados_for_rocksdb.sh fails on master
Kefu Chai
04:44 AM Bug #45702 (Fix Under Review): PGLog::read_log_and_missing: ceph_assert(miter == missing.get_item...
/a/yuriw-2020-05-22_19:55:53-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5083350... Brad Hubbard

05/25/2020

05:01 PM Backport #45677 (In Progress): nautilus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (s...
Nathan Cutler
04:58 PM Backport #45676 (In Progress): octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (se...
Nathan Cutler
02:28 PM Bug #43825 (Resolved): osd stuck down
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:28 PM Bug #44062 (Resolved): LibRadosWatchNotify.WatchNotify failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:27 PM Bug #44439 (Resolved): osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST_repair_...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:27 PM Bug #44518 (Resolved): osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:27 PM Bug #44532 (Resolved): nautilus: FAILED ceph_assert(head.version == 0 || e.version.version > head...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:26 PM Bug #45266 (Resolved): follower monitors can grow beyond memory target
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:51 AM Bug #45698 (New): PrioritizedQueue: messages in normal queue
if(i->second.front().first < i->second.num_tokens())
{
// never go in, if cost equal to num_tokens(), which valu...
liang chen
11:09 AM Backport #44686 (Resolved): nautilus: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clea...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35047
m...
Nathan Cutler
11:09 AM Backport #45224 (Resolved): nautilus: LibRadosWatchNotify.WatchNotify failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35049
m...
Nathan Cutler
11:09 AM Backport #44689 (Resolved): nautilus: osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:69...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35048
m...
Nathan Cutler
11:08 AM Backport #43919 (Resolved): nautilus: osd stuck down
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35024
m...
Nathan Cutler
11:08 AM Backport #44841 (Resolved): nautilus: nautilus: FAILED ceph_assert(head.version == 0 || e.version...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34957
m...
Nathan Cutler
11:06 AM Backport #44490 (Resolved): nautilus: lz4 compressor corrupts data when buffers are unaligned
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35004
m...
Nathan Cutler
11:06 AM Backport #45391 (Resolved): nautilus: follower monitors can grow beyond memory target
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34916
m...
Nathan Cutler
11:06 AM Backport #45359 (Resolved): nautilus: rados: Sharded OpWQ drops suicide_grace after waiting for work
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34882
m...
Nathan Cutler
10:55 AM Bug #45695: librados: significant memory consumption
I've tested with in-memory logging disabled via the client ceph.conf:... David Disseldorp
10:27 AM Bug #45695: librados: significant memory consumption
I should have mentioned that my client ceph.conf is minimal, with only the _mon host_ and _keyring_ options set. David Disseldorp
10:22 AM Bug #45695 (New): librados: significant memory consumption
I did some valgrind massif heap profiling with the following simple librados (octopus 15.2.1) program:... David Disseldorp
02:28 AM Bug #45690 (New): pg_interval_t::check_new_interval is overly generous about guessing when EC PGs...
One EC PG is stuck at peering+down forever; the problem occurs through the following steps:
Suppose the pg's acting set...
ming guo

05/24/2020

10:09 PM Bug #44981: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Nathan Cutler wrote:
>
> New -> In Progress -> Fix Under Review -> Pending Backport
>
> This, I thought, was th...
Brad Hubbard
08:59 PM Bug #44981: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Brad Hubbard wrote:
> Sorry Nathan, could you explain why you changed this from 'In Progress' to 'Fix Under Review'?...
Nathan Cutler
09:04 PM Backport #45677 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen...
https://github.com/ceph/ceph/pull/35237 Nathan Cutler
09:04 PM Backport #45676 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen ...
https://github.com/ceph/ceph/pull/35236 Nathan Cutler
09:03 PM Backport #45673 (Resolved): octopus: qa: powercycle: install task runs twice with double unwind c...
https://github.com/ceph/ceph/pull/35441 Nathan Cutler
07:55 PM Bug #45606 (Fix Under Review): build_incremental_map_msg missing incremental map while snaptrim o...
Nathan Cutler
04:00 PM Bug #22052: ceph-mon: possible Leak in OSDMap::build_simple_optioned
... Kefu Chai

05/23/2020

09:56 PM Bug #45561 (Pending Backport): rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nau...
Brad Hubbard
03:11 PM Bug #24531: Mimic MONs have slow/long running ops
We had this issue yesterday. We had a broken mon cluster which I was able to repair by shutting down all mons, scalin... Daniel Poelzleithner

05/22/2020

06:54 PM Backport #44686: nautilus: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35047
merged
Yuri Weinstein
06:47 PM Backport #45224: nautilus: LibRadosWatchNotify.WatchNotify failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35049
merged
Yuri Weinstein
06:46 PM Backport #44689: nautilus: osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST_rep...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35048
merged
Yuri Weinstein
06:40 PM Backport #43919: nautilus: osd stuck down
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35024
merged
Yuri Weinstein
06:39 PM Backport #44841: nautilus: nautilus: FAILED ceph_assert(head.version == 0 || e.version.version > ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34957
merged
Yuri Weinstein
04:53 PM Bug #45661 (Resolved): valgrind issue: UninitValue in ProtocolV2
... Neha Ojha
04:32 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
Has started appearing more frequently recently - /a/nojha-2020-05-21_19:33:40-rados-wip-32601-distro-basic-smithi/507... Neha Ojha
04:30 PM Bug #45660 (Resolved): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
... Neha Ojha
04:24 PM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
/a/nojha-2020-05-21_19:33:40-rados-wip-32601-distro-basic-smithi/5076944/ Neha Ojha
03:36 AM Bug #45647 (New): "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr...
... Kefu Chai
02:40 PM Bug #45619 (New): Health check failed: Reduced data availability: PG_AVAILABILITY
Seen in the rados suite: /a/nojha-2020-05-21_19:33:40-rados-wip-32601-distro-basic-smithi/5077159/ Neha Ojha
02:35 PM Bug #45619: Health check failed: Reduced data availability: PG_AVAILABILITY
We've been seeing a lot of this in the rgw suite over the last month or two. Casey Bodley
02:37 PM Bug #45298 (Resolved): cram: balancer/misplaced.t fails with 'Error EAGAIN: Some objects (0.00891...
This was a result of d4fbaf7ea959fd945857abd327271a97fb1da631, which only applies to master. Neha Ojha
04:41 AM Bug #45298 (Pending Backport): cram: balancer/misplaced.t fails with 'Error EAGAIN: Some objects ...
Kefu Chai
04:40 AM Feature #43324 (Resolved): Make zlib windowBits configurable for compression
Kefu Chai
04:30 AM Bug #45612 (Pending Backport): qa: powercycle: install task runs twice with double unwind causing...
Kefu Chai
04:03 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
... Kefu Chai
03:59 AM Bug #24613: luminous: rest/test.py fails with expected 200, got 400
/a/nojha-2020-05-21_19:42:29-rados-wip-29089-luminous-distro-basic-smithi/5077334 Brad Hubbard

05/21/2020

09:32 PM Bug #44981: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Sorry Nathan, could you explain why you changed this from 'In Progress' to 'Fix Under Review'? The PR has been review... Brad Hubbard
04:43 PM Bug #44981 (Fix Under Review): rados/test_envlibrados_for_rocksdb.sh build failure (seen in nauti...
Nathan Cutler
05:34 PM Bug #45614 (Resolved): qa/workunits/cephtool/test.sh failures due to dropping obsolete cache tier...
Nathan Cutler
04:41 PM Bug #45614: qa/workunits/cephtool/test.sh failures due to dropping obsolete cache tiering options
Backport will be handled via #45514 Nathan Cutler
02:52 AM Bug #45619: Health check failed: Reduced data availability: PG_AVAILABILITY
It's a new thing. Also, before whitelisting things, we'd be better off figuring out why we should whitelist it. Kefu Chai

05/20/2020

09:13 PM Bug #45606: build_incremental_map_msg missing incremental map while snaptrim or backfilling
Nothing to worry about, this message should just be a dout instead. Neha Ojha
09:08 PM Bug #45619 (Need More Info): Health check failed: Reduced data availability: PG_AVAILABILITY
Is this something that has started appearing recently? If not, probably just needs whitelisting. Neha Ojha
07:34 AM Bug #45619 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
multiple RGW tests are failing on different branches, with:... Yuval Lifshitz
07:29 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
/a/nojha-2020-05-19_23:54:26-rados-wip-cephadm-test-distro-basic-smithi/5070712 Neha Ojha
03:20 PM Backport #44490: nautilus: lz4 compressor corrupts data when buffers are unaligned
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35004
merged
Yuri Weinstein
03:19 PM Backport #45391: nautilus: follower monitors can grow beyond memory target
Sridhar Seshasayee wrote:
> https://github.com/ceph/ceph/pull/34916
merged
Yuri Weinstein
03:19 PM Backport #45359: nautilus: rados: Sharded OpWQ drops suicide_grace after waiting for work
Dan Hill wrote:
> https://github.com/ceph/ceph/pull/34882
merged
Yuri Weinstein
10:42 AM Bug #45611: crimson: centos 8 vstart failure
caught segfault at points
1. run with next option in gdb: ...
Deepika Upadhyay
10:39 AM Bug #45611: crimson: centos 8 vstart failure
How to reproduce:
1. launch a centos 8 container and build vstart with -DWITH_SEASTAR=ON
2. start a vstart base...
Deepika Upadhyay
02:17 AM Bug #44981: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Thanks Nathan. Brad Hubbard
02:16 AM Bug #44981 (In Progress): rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Brad Hubbard
12:34 AM Bug #45615 (Resolved): api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.Watc...
... Neha Ojha
12:26 AM Bug #45614: qa/workunits/cephtool/test.sh failures due to dropping obsolete cache tiering options
/a/nojha-2020-05-19_00:53:41-rados-wip-revert-34894-distro-basic-smithi/5068016 Neha Ojha
12:24 AM Bug #45614 (Resolved): qa/workunits/cephtool/test.sh failures due to dropping obsolete cache tier...
Caused by https://github.com/ceph/ceph/pull/35015 Neha Ojha

05/19/2020

10:30 PM Bug #45612 (Fix Under Review): qa: powercycle: install task runs twice with double unwind causing...
Patrick Donnelly
10:24 PM Bug #45612 (Resolved): qa: powercycle: install task runs twice with double unwind causing fatal e...
Continuation of #45387. My fix was incomplete.
http://pulpito.ceph.com/teuthology-2020-04-25_03:09:02-powercycle-m...
Patrick Donnelly
07:23 PM Bug #45611: crimson: centos 8 vstart failure
caught some memory leaks using core dumps, but they seem to be related to asan/libc... Deepika Upadhyay
02:37 PM Bug #45611 (New): crimson: centos 8 vstart failure
... Deepika Upadhyay
11:44 AM Bug #45606 (Resolved): build_incremental_map_msg missing incremental map while snaptrim or backfi...
Hello,
I'm not sure if this is an issue or not. On one Cluster I see the following Messages, most times when snapt...
Bastian Mäuser
09:34 AM Backport #44370: nautilus: msg/async: the event center is blocked by rdma construct conection for...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34780
m...
Nathan Cutler
02:33 AM Backport #44370 (Resolved): nautilus: msg/async: the event center is blocked by rdma construct co...
Wei-Chung Cheng
02:54 AM Bug #45588: test_envlibrados_for_rocksdb.sh fails on master
http://pulpito.ceph.com/kchai-2020-05-19_02:54:14-rados:singleton-wip-kefu2-testing-2020-05-13-1200-distro-basic-smithi/ Kefu Chai

05/18/2020

03:42 PM Bug #45588: test_envlibrados_for_rocksdb.sh fails on master
https://github.com/facebook/rocksdb/pull/6855 Kefu Chai
03:41 PM Bug #45588 (Resolved): test_envlibrados_for_rocksdb.sh fails on master
... Kefu Chai
02:44 PM Backport #44370: nautilus: msg/async: the event center is blocked by rdma construct conection for...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/34780
merged
Yuri Weinstein

05/15/2020

03:38 PM Backport #44413: nautilus: FTBFS on s390x in openSUSE Build Service due to presence of -O2 in RPM...
c8af73e19ab02617411fe689ff1b98b8f4d096ca did not make v14.2.9, and it will be in v14.2.10. Ken Dreyer
11:24 AM Bug #45561 (In Progress): rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
Brad Hubbard
06:59 AM Bug #45561 (Fix Under Review): rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nau...
Brad Hubbard
06:40 AM Bug #45561 (Resolved): rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
http://qa-proxy.ceph.com/teuthology/bhubbard-2020-05-13_06:50:26-rados-wip-nautilus-badone-testing-2-distro-basic-smi... Brad Hubbard

05/14/2020

12:13 AM Bug #44715 (Need More Info): common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list...
I am not able to reproduce this failure on octopus or on master:
http://pulpito.ceph.com/nojha-2020-05-13_17:20:4...
Neha Ojha

05/13/2020

09:09 PM Bug #45533 (Resolved): cls/queue: fix empty markers when listing entries
Neha Ojha
12:53 PM Bug #45533: cls/queue: fix empty markers when listing entries
already fixed in: https://github.com/ceph/ceph/pull/34788 Yuval Lifshitz
12:51 PM Bug #45533 (Resolved): cls/queue: fix empty markers when listing entries
markers are sometimes empty when listing entries Yuval Lifshitz
09:02 PM Backport #44489 (In Progress): mimic: lz4 compressor corrupts data when buffers are unaligned
Nathan Cutler
03:43 PM Backport #45224 (In Progress): nautilus: LibRadosWatchNotify.WatchNotify failure
Nathan Cutler
03:42 PM Backport #44689 (In Progress): nautilus: osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh...
Nathan Cutler
03:40 PM Backport #44686 (In Progress): nautilus: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_c...
Nathan Cutler
02:49 AM Bug #44981 (Fix Under Review): rados/test_envlibrados_for_rocksdb.sh build failure (seen in nauti...
Brad Hubbard

05/12/2020

06:07 PM Bug #45292: pg autoscaler merging issue
Sorry for the delay. We are working to get a reservation on one of our internal labs so we can recreate the issue and... Brian Wickersham
03:02 PM Bug #37875: osdmaps aren't being cleaned up automatically on healthy cluster
nautilus backport tracked by https://tracker.ceph.com/issues/45402 Nathan Cutler
03:01 PM Bug #37875 (Duplicate): osdmaps aren't being cleaned up automatically on healthy cluster
Nathan Cutler
02:30 PM Backport #43919 (In Progress): nautilus: osd stuck down
Nathan Cutler
02:29 PM Backport #43919: nautilus: osd stuck down
first attempted backport - https://github.com/ceph/ceph/pull/33156 - was closed Nathan Cutler
02:29 PM Backport #43919 (New): nautilus: osd stuck down
Nathan Cutler

05/11/2020

09:54 PM Bug #45356: nautilus: rados/upgrade/mimic-x-singleton failures due to mon_client_directed_command...
https://github.com/ceph/ceph/pull/34884 merged Yuri Weinstein
04:41 PM Backport #44490 (In Progress): nautilus: lz4 compressor corrupts data when buffers are unaligned
Nathan Cutler
02:23 PM Bug #44827 (Resolved): osd: incorrect read bytes stat in SPARSE_READ
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
02:21 PM Bug #45075 (Resolved): scrub/osd-scrub-repair.sh: TEST_auto_repair_bluestore_failed failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:51 PM Backport #45392 (Resolved): octopus: follower monitors can grow beyond memory target
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34917
m...
Nathan Cutler
12:04 PM Bug #44959 (Closed): health warning: pgs not deep-scrubbed in time although it was in time
Aaaha, that was it. Thank you very much!
I've set the @osd deep scrub interval@ under @[osd]@ so the mgr did not g...
Jonas Jelten
11:53 AM Bug #44959: health warning: pgs not deep-scrubbed in time although it was in time
Have you changed the values on the MGR? The mgr checks that, and if the mgr still has the defaults, it will issue warnings.
@c...
Katarzyna Myrek
03:28 AM Bug #45298: cram: balancer/misplaced.t fails with 'Error EAGAIN: Some objects (0.008913) are degr...
/a/yuriw-2020-05-04_17:54:17-rados-wip-yuri5-testing-2020-05-04-1554-nautilus-distro-basic-smithi/5022793 Brad Hubbard
 
