Activity

From 02/07/2021 to 03/08/2021

03/08/2021

05:16 PM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39773
m...
Nathan Cutler
05:14 PM Backport #49532: pacific: osd ok-to-stop too conservative
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39737
m...
Nathan Cutler
05:07 PM Backport #49529 (In Progress): nautilus: "ceph osd crush set|reweight-subtree" commands do not se...
Nathan Cutler
05:06 PM Backport #49530 (In Progress): octopus: "ceph osd crush set|reweight-subtree" commands do not set...
Nathan Cutler
05:05 PM Backport #49528 (Resolved): pacific: "ceph osd crush set|reweight-subtree" commands do not set we...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39736
m...
Nathan Cutler
05:02 PM Backport #49526: pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class '...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39735
m...
Nathan Cutler
05:01 PM Backport #49404: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39597
m...
Nathan Cutler
04:59 PM Backport #49404: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
https://github.com/ceph/ceph/pull/39796
https://github.com/ceph/ceph/pull/39597
(double whammy)
Nathan Cutler
01:41 PM Backport #49640: nautilus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39912 gerald yang
11:44 AM Bug #49409 (Pending Backport): osd run into dead loop and tell slow request when rollback snap wi...
Kefu Chai

03/07/2021

10:02 PM Backport #49377: pacific: building libcrc32
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/39902
ceph-backport.sh versi...
singuliere _
03:58 PM Backport #49482 (Resolved): pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/Manifes...
Loïc Dachary
03:55 PM Backport #49642 (Resolved): pacific: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/40247 Backport Bot
03:55 PM Backport #49641 (Resolved): octopus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39935 Backport Bot
03:55 PM Backport #49640 (Resolved): nautilus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39912 Backport Bot
03:54 PM Bug #48946 (Pending Backport): Disable and re-enable clog_to_monitors could trigger assertion
Kefu Chai

03/06/2021

02:58 PM Backport #49533 (In Progress): octopus: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/39887 Kefu Chai
02:43 PM Backport #49073 (Resolved): nautilus: crash in Objecter and CRUSH map lookup
Kefu Chai
01:16 AM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
This is where we sent the subops... Neha Ojha

03/05/2021

11:10 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
https://tracker.ceph.com/issues/45946 looks very similar Neha Ojha
11:04 PM Bug #49525: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missi...
Ronen, can you check if this is caused by a race between scrub and snap remove? Neha Ojha
10:53 PM Bug #49403 (Duplicate): Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-str...
Neha Ojha
07:15 PM Bug #48298: hitting mon_max_pg_per_osd right after creating OSD, then decreases slowly
Another observation: I have nobackfill set, and I'm currently adding 8 new OSDs.
The first of the newly added OSDs...
Jonas Jelten
05:15 PM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
Myoungwon Oh wrote:
> https://github.com/ceph/ceph/pull/39773
merged
Yuri Weinstein
02:39 AM Bug #47419 (Resolved): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench...
Hopefully Neha Ojha
01:49 AM Backport #49565 (In Progress): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/39844 Neha Ojha

03/04/2021

11:34 PM Bug #47419 (Fix Under Review): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p f...
Sage Weil
11:34 PM Bug #47419 (Duplicate): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo benc...
Sage Weil
04:33 PM Bug #47419: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench 4 write -b...
https://jenkins.ceph.com/job/ceph-pull-requests/70513/consoleFull#10356408840526d21-3511-427d-909c-dd086c0d1034 Neha Ojha
11:21 PM Bug #49614 (Duplicate): src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 writ...
Neha Ojha
11:11 PM Bug #49614: src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 write -b 4096 --...
https://jenkins.ceph.com/job/ceph-pull-requests/70513/consoleFull#-1656021838e840cee4-f4a4-4183-81dd-42855615f2c1 Sage Weil
10:58 PM Bug #49614 (Duplicate): src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 writ...
... Sage Weil
09:14 PM Bug #44631: ceph pg dump error code 124
/ceph/teuthology-archive/pdonnell-2021-03-04_03:51:01-fs-wip-pdonnell-testing-20210303.195715-distro-basic-smithi/593... Patrick Donnelly
05:39 PM Bug #44631: ceph pg dump error code 124
/a/yuriw-2021-03-02_20:59:34-rados-wip-yuri7-testing-2021-03-02-1118-nautilus-distro-basic-smithi/5928174 Neha Ojha
09:08 PM Backport #49532 (Resolved): pacific: osd ok-to-stop too conservative
Sage Weil
06:47 PM Backport #49404 (Resolved): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Sage Weil
06:47 PM Backport #49526 (Resolved): pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
Sage Weil
06:44 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
/a/sage-2021-03-03_16:41:22-rados-wip-sage2-testing-2021-03-03-0744-pacific-distro-basic-smithi/5930113
Sage Weil
04:48 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
We also hit this issue last week on Ceph version 12.2.11.
Cluster configured with a replication factor of 3, issu...
Ross Martyn
01:21 PM Backport #48987: nautilus: ceph osd df tree reporting incorrect SIZE value for rack having an emp...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39126
m...
Nathan Cutler

03/03/2021

10:14 PM Bug #49104: crush weirdness: degraded PGs not marked as such, and choose_total_tries = 50 is too ...
Thanks for the analysis, Neha.
Something that perhaps wasn't clear in comment 2 -- in each case where I print the `...
Dan van der Ster
06:48 PM Bug #49104 (Triaged): crush weirdness: degraded PGs not marked as such, and choose_total_tries = ...
Thanks for the detailed logs!
Firstly, the pg dump output can sometimes be a little laggy, so I am basing my asses...
Neha Ojha
09:53 PM Backport #48987 (Resolved): nautilus: ceph osd df tree reporting incorrect SIZE value for rack ha...
Brad Hubbard
04:05 PM Backport #48987: nautilus: ceph osd df tree reporting incorrect SIZE value for rack having an emp...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39126
merged
Yuri Weinstein
08:47 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
not seen in octopus and pacific so far, but pops up sometimes in nautilus:... Deepika Upadhyay
08:39 PM Bug #49591 (New): no active mgr (MGR_DOWN)" in cluster log
seen in nautilus... Deepika Upadhyay
03:37 PM Bug #49584: Ceph OSD, MDS, MGR daemon does not _only_ bind to specified address when configured t...
After removing the specific public_addr and restarting the MDSes the situation returns to normal and the cluster reco... Stefan Kooman
03:22 PM Bug #49584 (New): Ceph OSD, MDS, MGR daemon does not _only_ bind to specified address when config...
Documentation (https://docs.ceph.com/en/octopus/rados/configuration/network-config-ref/#ceph-daemons) states the foll... Stefan Kooman
11:32 AM Backport #49055 (Resolved): nautilus: pick_a_shard() always select shard 0
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39651
m...
Nathan Cutler
09:40 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Florian Haas wrote:
> With thanks to Paul Emmerich in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
Norman Shen
05:31 AM Bug #48417: unfound EC objects in sepia's LRC after upgrade
https://tracker.ceph.com/issues/48613#note-13 Deepika Upadhyay
12:28 AM Backport #49404 (In Progress): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
David Zafman

03/02/2021

08:21 PM Bug #37808 (New): osd: osdmap cache weak_refs assert during shutdown
/ceph/teuthology-archive/pdonnell-2021-03-02_17:29:53-fs:verify-wip-pdonnell-testing-20210301.234318-distro-basic-smi... Patrick Donnelly
05:27 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
personal ref dir: all greps reside in **/home/ideepika/pg[3.1as0.log** on the teuthology server
job: /a/teuthology-2021...
Deepika Upadhyay
05:24 PM Bug #49572 (Duplicate): MON_DOWN: mon.c fails to join quorum after un-blacklisting mon.a
This is the same as https://tracker.ceph.com/issues/47654... Neha Ojha
04:58 PM Bug #49572 (Duplicate): MON_DOWN: mon.c fails to join quorum after un-blacklisting mon.a
/a/sage-2021-03-01_20:24:37-rados-wip-sage-testing-2021-03-01-1118-distro-basic-smithi/5924612
it looks like the s...
Sage Weil
04:38 AM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
https://github.com/ceph/ceph/pull/39773 Myoungwon Oh

03/01/2021

11:25 PM Bug #49409 (Fix Under Review): osd run into dead loop and tell slow request when rollback snap wi...
Neha Ojha
10:16 PM Backport #49567 (Resolved): nautilus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/40697 Backport Bot
10:15 PM Backport #49566 (Resolved): octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/40756 Backport Bot
10:15 PM Backport #49565 (Resolved): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/39844 Backport Bot
10:10 PM Bug #47719 (Pending Backport): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Brad Hubbard
05:15 PM Backport #49055: nautilus: pick_a_shard() always select shard 0
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39651
merged
Yuri Weinstein
03:43 AM Bug #49543 (New): scrub a pool which size is 1 but found stat mismatch on objects and bytes

the pg has only one primary osd:...
Liu Lan

02/28/2021

11:55 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-smithi/5921418
Sage Weil
11:52 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
I hit another instance of this here: /a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-... Sage Weil
09:20 PM Bug #46318: mon_recovery: quorum_status times out
same symptom... cli command fails to contact mon
/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-121...
Sage Weil
09:37 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Florian Haas wrote:
> With thanks to Paul Emmerich in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
Norman Shen

02/27/2021

08:59 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-27_17:50:29-rados-wip-sage2-testing-2021-02-27-0921-pacific-distro-basic-smithi/5919090
Sage Weil
03:20 PM Backport #49533 (Resolved): octopus: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/39887 Backport Bot
03:20 PM Backport #49532 (Resolved): pacific: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/39737 Backport Bot
03:20 PM Backport #49531 (Resolved): nautilus: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/40676 Backport Bot
03:16 PM Bug #49392 (Pending Backport): osd ok-to-stop too conservative
pacific backport: https://github.com/ceph/ceph/pull/39737
Sage Weil
03:16 PM Backport #49530 (Resolved): octopus: "ceph osd crush set|reweight-subtree" commands do not set we...
https://github.com/ceph/ceph/pull/39919 Backport Bot
03:16 PM Backport #49529 (Resolved): nautilus: "ceph osd crush set|reweight-subtree" commands do not set w...
https://github.com/ceph/ceph/pull/39920 Backport Bot
03:15 PM Backport #49528 (Resolved): pacific: "ceph osd crush set|reweight-subtree" commands do not set we...
https://github.com/ceph/ceph/pull/39736 Backport Bot
03:15 PM Backport #49527 (Resolved): octopus: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
https://github.com/ceph/ceph/pull/40276 Backport Bot
03:15 PM Backport #49526 (Resolved): pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
https://github.com/ceph/ceph/pull/39735 Backport Bot
03:14 PM Bug #48065 (Pending Backport): "ceph osd crush set|reweight-subtree" commands do not set weight o...
pacific backport: https://github.com/ceph/ceph/pull/39736
Sage Weil
03:13 PM Bug #49212 (Pending Backport): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to cl...
backport for pacific: https://github.com/ceph/ceph/pull/39735
Sage Weil
02:40 PM Bug #49525 (Resolved): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 ...
... Sage Weil
02:36 PM Bug #48997: rados/singleton/all/recovery-preemption: defer backfill|defer recovery not found in logs
/a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5916984
Sage Weil
02:26 PM Bug #49521: build failure on centos-8, bad/incorrect use of #ifdef/#elif
N.B. fedora-33 and later have sigdescr_np(); ditto for rhel-9.
Also strsignal(3) is not *MT-SAFE*. (It's also not ...
Kaleb KEITHLEY
02:26 PM Bug #43584: MON_DOWN during mon_join process
/a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5917141
I think this is a g...
Sage Weil
02:17 PM Bug #49524 (Resolved): ceph_test_rados_delete_pools_parallel didn't start
... Sage Weil
02:11 PM Bug #49523 (New): rebuild-mondb doesn't populate mgr commands -> pg dump EINVAL
... Sage Weil

02/26/2021

09:37 PM Bug #49521 (New): build failure on centos-8, bad/incorrect use of #ifdef/#elif
Building 15.2.9 for CentOS Storage SIG el8, I hit this compile error:
cmake ... -DWITH_REENTRANT_STRSIGNAL=ON ......
Kaleb KEITHLEY
03:30 PM Bug #44286: Cache tiering shows unfound objects after OSD reboots
We even hit that bug twice today by rebooting two of our cache servers.
What's interesting is that only hit_set ob...
Jan-Philipp Litza
01:53 PM Feature #49505 (New): Warn about extremely anomalous commit_latencies
In an EC cluster with ~500 HDD OSDs, we suffered a drop in write performance from 30GiB/s down to 3GiB/s due to one si... Dan van der Ster

02/25/2021

10:24 PM Bug #49461 (Duplicate): rados/upgrade/pacific-x/parallel: upgrade incomplete
Neha Ojha
05:19 PM Backport #49397 (In Progress): octopus: rados/dashboard: Health check failed: Telemetry requires ...
Nathan Cutler
05:18 PM Backport #49398 (Resolved): pacific: rados/dashboard: Health check failed: Telemetry requires re-...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39484
m...
Nathan Cutler
05:18 PM Backport #49398 (In Progress): pacific: rados/dashboard: Health check failed: Telemetry requires ...
Nathan Cutler
03:00 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-24_21:26:56-rados-wip-sage-testing-2021-02-24-1457-distro-basic-smithi/5912284
Sage Weil
02:47 PM Bug #49487 (Fix Under Review): osd:scrub skip some pg
Kefu Chai
08:27 AM Bug #49487 (Resolved): osd:scrub skip some pg
ENV: 1 mon, 1 mgr, 1 osd
Create a pool with 8 PGs, then change the value of osd_scrub_min_interval to trigger a reschedule
...
wencong wan
02:37 PM Bug #46318 (Need More Info): mon_recovery: quorum_status times out
having trouble reproducing (after about 150 jobs). adding increased debugging to master with https://github.com/ceph... Sage Weil
01:29 PM Backport #49134: pacific: test_envlibrados_for_rocksdb.sh: EnvLibradosMutipoolTest.DBBulkLoadKeys...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39264
m...
Nathan Cutler
09:23 AM Support #49489 (New): Getting Long heartbeat and slow requests on ceph luminous 12.2.13
1. Current environment is integrated with ceph and openstack
2. It has NVME and SSD disks only
3. We have created fo...
ceph ceph
07:09 AM Bug #49427 (In Progress): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing()....
https://github.com/ceph/ceph/pull/39670 Myoungwon Oh
06:30 AM Backport #49482 (Resolved): pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/Manifes...
https://github.com/ceph/ceph/pull/39773 Backport Bot
06:29 AM Bug #47024 (Duplicate): rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
Kefu Chai
06:29 AM Bug #48915 (Duplicate): api_tier_pp: LibRadosTwoPoolsPP.ManifestFlushDupCount failed
Kefu Chai
06:28 AM Bug #48786 (Pending Backport): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapR...
Kefu Chai
05:29 AM Bug #49460 (Resolved): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
Kefu Chai
03:58 AM Bug #49468 (Resolved): rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.000...
Kefu Chai
02:58 AM Bug #49468: rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.00000000 parent'"
Add more failures. Patrick Donnelly
02:42 AM Bug #49468 (Resolved): rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.000...
... Patrick Donnelly
12:04 AM Bug #49463: qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:rados
rados/singleton/{all/radostool mon_election/classic msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-z... Neha Ojha

02/24/2021

10:07 PM Bug #49463 (Can't reproduce): qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:r...
... Neha Ojha
09:34 PM Bug #49461 (Duplicate): rados/upgrade/pacific-x/parallel: upgrade incomplete
... Neha Ojha
09:26 PM Bug #49460 (Fix Under Review): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
Neha Ojha
08:58 PM Bug #49460 (Resolved): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
... Neha Ojha
09:00 PM Bug #49212 (Fix Under Review): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to cl...
earlier,... Sage Weil
08:48 PM Bug #49212 (In Progress): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class '...
... Sage Weil
10:45 AM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Nokia ceph-users wrote:
> Do you suspect that this is something relevant to 14.2.2 and could be solved with a higher...
Igor Fedotov
04:55 AM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Do you suspect that this is something relevant to 14.2.2 and could be solved with a higher version? Nokia ceph-users
09:16 AM Bug #49448 (New): If OSD types are changed, pools rules can become unresolvable without providing...
When some OSDs in a cluster are of a specific type, such as hdd_aes, and the type is used in a rule, if the type of s... linzhou zhou
05:07 AM Bug #49428 (Triaged): ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create...
Brad Hubbard
04:50 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
TLDR skip to ********* MON.A ************** below.
So this looks like a race. The calls seem to be serialized in t...
Brad Hubbard
12:48 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
Here's the error from the mon log.... Brad Hubbard
05:06 AM Bug #47719 (In Progress): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Brad Hubbard
12:50 AM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
Most likely, the problem is that the object being dirtied is present, but the prior clone is missing pending recovery. Samuel Just

02/23/2021

11:52 PM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
dec_refcount_by_dirty is related to tiering/dedup, which got added fairly recently in https://github.com/ceph/ceph/pu... Neha Ojha
10:36 PM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
/a/bhubbard-2021-02-23_02:25:14-rados-master-distro-basic-smithi/5905669 Brad Hubbard
01:44 AM Bug #49427 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
/a/bhubbard-2021-02-22_23:51:15-rados-master-distro-basic-smithi/5904732
rados/verify/{centos_latest ceph clusters...
Brad Hubbard
11:15 PM Bug #49403: Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-striper)
/a/sage-2021-02-23_06:29:23-rados-wip-sage-testing-2021-02-22-2228-distro-basic-smithi/5906245
Sage Weil
09:06 PM Backport #49055 (In Progress): nautilus: pick_a_shard() always select shard 0
Nathan Cutler
03:40 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
... Deepika Upadhyay
12:23 PM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Nokia ceph-users wrote:
> Hi , Another occurrence
>
> _2021-02-22 09:19:43.010071 mon.cn1 (mon.0) 267937 : cluste...
Nokia ceph-users
12:23 PM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Hi , Another occurrence
_2021-02-22 09:19:43.010071 mon.cn1 (mon.0) 267937 : cluster [INF] osd.146 marked down aft...
Nokia ceph-users
04:34 AM Bug #48065: "ceph osd crush set|reweight-subtree" commands do not set weight on device class subtree
Sage Weil wrote:
> BTW Mykola I would suggest using 'ceph osd crush reweight osd.N' (which works fine already) inste...
Mykola Golub
02:02 AM Bug #49428 (Duplicate): ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool crea...
/a/bhubbard-2021-02-22_23:51:15-rados-master-distro-basic-smithi/5904720... Brad Hubbard
12:42 AM Bug #49069 (Resolved): mds crashes on v15.2.8 -> master upgrade decoding MMgrConfigure
Sage Weil

02/22/2021

08:22 PM Bug #48065: "ceph osd crush set|reweight-subtree" commands do not set weight on device class subtree
BTW Mykola I would suggest using 'ceph osd crush reweight osd.N' (which works fine already) instead of the 'ceph osd ... Sage Weil
08:22 PM Bug #48065 (Fix Under Review): "ceph osd crush set|reweight-subtree" commands do not set weight o...
Sage Weil
07:37 PM Bug #46318 (In Progress): mon_recovery: quorum_status times out
Neha Ojha wrote:
> We are still seeing these.
>
> /a/teuthology-2021-01-18_07:01:01-rados-master-distro-basic-smi...
Sage Weil
09:50 AM Bug #49409 (New): osd run into dead loop and tell slow request when rollback snap with using cach...
xin mycho
08:59 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
If we are happy with https://github.com/ceph/ceph/pull/39601 in theory perhaps we need to extend it to cover the othe... Brad Hubbard
06:31 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Hey Sage, I think you meant /a/sage-2021-02-20_16:46:42-rados-wip-sage2-testing-2021-02-20-0942-distro-basic-smithi/5... Brad Hubbard
02:17 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
First, the relevant test code from src/test/librados/watch_notify.cc.... Brad Hubbard

02/21/2021

04:50 PM Backport #49404 (Resolved): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
https://github.com/ceph/ceph/pull/39597
Backport Bot
04:48 PM Bug #48984 (Pending Backport): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Sage Weil
04:46 PM Bug #49403 (Duplicate): Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-str...
... Sage Weil
04:42 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-20_16:46:42-rados-wip-sage2-testing-2021-02-20-0942-distro-basic-smithi/5899129
Sage Weil
11:48 AM Bug #48998: Scrubbing terminated -- not all pgs were active and clean
rados/singleton/{all/lost-unfound-delete mon_election/classic msgr-failures/none msgr/async-v1only objectstore/bluest... Kefu Chai
03:35 AM Backport #49402 (Resolved): octopus: rados: Health check failed: 1/3 mons down, quorum a,c (MON_D...
https://github.com/ceph/ceph/pull/40138 Backport Bot
03:35 AM Backport #49401 (Resolved): pacific: rados: Health check failed: 1/3 mons down, quorum a,c (MON_D...
https://github.com/ceph/ceph/pull/40137 Backport Bot
03:32 AM Bug #45441 (Pending Backport): rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" ...
Kefu Chai

02/20/2021

07:41 PM Bug #48386 (Resolved): Paxos::restart() and Paxos::shutdown() can race leading to use-after-free ...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
12:33 PM Bug #49395 (Resolved): ceph-test rpm missing gtest dependencies
Kefu Chai
03:47 AM Bug #49395 (Fix Under Review): ceph-test rpm missing gtest dependencies
Patrick Donnelly

02/19/2021

11:59 PM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
/a/teuthology-2021-02-17_03:31:03-rados-pacific-distro-basic-smithi/5889472 Neha Ojha
11:30 PM Backport #49398 (Resolved): pacific: rados/dashboard: Health check failed: Telemetry requires re-...
https://github.com/ceph/ceph/pull/39484 Backport Bot
11:30 PM Backport #49397 (Resolved): octopus: rados/dashboard: Health check failed: Telemetry requires re-...
https://github.com/ceph/ceph/pull/39704 Backport Bot
11:29 PM Bug #49212 (Duplicate): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class 'ss...
Neha Ojha
11:25 PM Bug #48990 (Pending Backport): rados/dashboard: Health check failed: Telemetry requires re-opt-in...
Neha Ojha
11:24 PM Bug #48990: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEMETRY_CHANGED...
pacific backport merged: https://github.com/ceph/ceph/pull/39484 Josh Durgin
11:24 PM Bug #40809: qa: "Failed to send signal 1: None" in rados
Deepika Upadhyay wrote:
> this happens due to dispatch delay.
> Testing with increased values for a test case can ...
Neha Ojha
07:45 AM Bug #40809: qa: "Failed to send signal 1: None" in rados
this happens due to dispatch delay.
Testing with increased values for a test case can lead to this failure:
/ceph/...
Deepika Upadhyay
11:05 PM Bug #44945: Mon High CPU usage when another mon syncing from it
Wout van Heeswijk wrote:
> I think this might be related to #42830. If so it may be resolved with Ceph Nautilus 14.2...
Neha Ojha
11:04 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Will do. Brad Hubbard
10:54 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Brad, can you please take look at this one? Neha Ojha
09:25 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/teuthology-2021-02-17_03:31:03-rados-pacific-distro-basic-smithi/5889235 Neha Ojha
10:49 PM Bug #49359: osd: warning: unused variable
f9f9270d75d3bc6383604addefc2386318ecfc8b was done to fix another warning, definitely not high priority :) Neha Ojha
10:47 PM Bug #45441 (Fix Under Review): rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" ...
Sage Weil
10:44 PM Bug #39039 (Duplicate): mon connection reset, command not resent
let's track this at #45647 Sage Weil
10:33 PM Bug #47003 (Duplicate): ceph_test_rados test error. Reponses out of order due to the connection d...
Neha Ojha
10:29 PM Feature #39339: prioritize backfill of metadata pools, automatically
I think this tracker can be marked resolved since pull request 29181 merged. David Zafman
10:26 PM Bug #48468 (Need More Info): ceph-osd crash before being up again
Hi Clement,
Can you reproduce this with logs?...
Sage Weil
10:19 PM Bug #49393 (Need More Info): Segmentation fault in ceph::logging::Log::entry()
Sage Weil
09:11 PM Bug #49393 (Can't reproduce): Segmentation fault in ceph::logging::Log::entry()
... Neha Ojha
10:16 PM Bug #49395 (Resolved): ceph-test rpm missing gtest dependencies
... Sage Weil
09:28 PM Bug #48841 (Fix Under Review): test_turn_off_module: wait_until_equal timed out
Neha Ojha
08:09 PM Bug #49392 (Resolved): osd ok-to-stop too conservative
Currently 'osd ok-to-stop' is too conservative: if the pg is degraded and is touched by an osd we might stop, it alw... Sage Weil
04:43 PM Backport #49320 (In Progress): octopus: thrash_cache_writeback_proxy_none: FAILED ceph_assert(ver...
https://github.com/ceph/ceph/pull/39578 Neha Ojha
02:51 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
investigating the 2 unfound objects, `when all_unfound_are_queried_or_lost all of
might_have_unfound` all participat...
Deepika Upadhyay
01:53 PM Bug #49104: crush weirdness: degraded PGs not marked as such, and choose_total_tries = 50 is too ...
Neha Ojha wrote:
> Regarding Problem A, will it be possible for you to share osd logs with debug_osd=20 to demonstra...
Dan van der Ster
01:26 PM Backport #49377 (Resolved): pacific: building libcrc32
https://github.com/ceph/ceph/pull/39902 Backport Bot
10:31 AM Bug #49231: MONs unresponsive over extended periods of time
I think I found the reason for this behaviour. I managed to pull extended logs during an incident and saw that the MO... Frank Schilder
01:40 AM Bug #48984 (Fix Under Review): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Neha Ojha

02/18/2021

05:20 PM Bug #49359: osd: warning: unused variable
https://stackoverflow.com/a/50176479 Patrick Donnelly
05:17 PM Bug #49359 (New): osd: warning: unused variable
... Patrick Donnelly
03:30 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
... Sebastian Wagner
03:21 PM Bug #49259 (Resolved): test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Sebastian Wagner
03:21 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
turned out to be caused by https://github.com/ceph/ceph/pull/39530 Sebastian Wagner
10:16 AM Bug #49353 (Need More Info): Random OSDs being marked as down even when there is very less activi...
osd.149 went down at 03:25:26
2021-01-14 03:25:25.974634 mon.cn1 (mon.0) 384654 : cluster [INF] osd.149 marked down ...
Igor Fedotov
09:51 AM Bug #49353 (Need More Info): Random OSDs being marked as down even when there is very less activi...
Hi,
We have recently seen some random OSDs being marked down with the below message on one of our Nautilus cl...
Nokia ceph-users
08:06 AM Backport #48495 (Resolved): nautilus: Paxos::restart() and Paxos::shutdown() can race leading to ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39160
m...
Nathan Cutler

02/17/2021

08:22 PM Bug #48984 (In Progress): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
David Zafman
08:18 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

Proposed fix in https://github.com/ceph/ceph/pull/39535
Needs extensive testing
David Zafman
05:41 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

If a requested scrub runs into a rejected remote reservation, the m_planned_scrub is already reset. This means tha...
David Zafman
07:51 PM Bug #48990: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEMETRY_CHANGED...
https://github.com/ceph/ceph/pull/39484 merged Yuri Weinstein
04:29 PM Backport #49073: nautilus: crash in Objecter and CRUSH map lookup
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/39197
merged
Yuri Weinstein
04:29 PM Backport #48495: nautilus: Paxos::restart() and Paxos::shutdown() can race leading to use-after-f...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/39160
merged
Yuri Weinstein
10:20 AM Backport #49320 (Resolved): octopus: thrash_cache_writeback_proxy_none: FAILED ceph_assert(versio...
https://github.com/ceph/ceph/pull/39578 Backport Bot
10:15 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
http://qa-proxy.ceph.com/teuthology/yuriw-2021-02-16_16:01:09-rados-wip-yuri-testing-2021-02-08-1109-octopus-distro-b... Deepika Upadhyay

02/16/2021

10:51 PM Bug #49259 (Need More Info): test_rados_api tests timeout with cephadm (plus extremely large OSD ...
Brad Hubbard
09:30 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
From IRC:... Neha Ojha
06:00 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Sebastian Wagner wrote:
> sage: this is related to thrashing and only happens within cephadm. non-cephadm is not aff...
Neha Ojha
08:47 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Argh! So it does, my bad. Please ignore comment 22 for now. Brad Hubbard
08:44 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Deepika Upadhyay wrote:
> /ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020...
Neha Ojha
07:51 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Deepika Upadhyay wrote:
> /ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020...
Brad Hubbard
07:14 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
-/ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020-nautilus-distro-basic-gib... Deepika Upadhyay
03:28 PM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
Deepika, -I don't understand why or how the "workaround" addresses the issue here. probably you could file a PR based... Kefu Chai
10:58 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
hey Kefu! Should we use this workaround while the real bug is being fixed?... Deepika Upadhyay
08:39 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
created https://github.com/ceph/ceph/pull/39491 in the hope of working around this. Kefu Chai
04:49 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
filed https://bugzilla.redhat.com/show_bug.cgi?id=1929043 Kefu Chai
04:48 AM Bug #49303 (In Progress): FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on ...
... Kefu Chai
01:23 PM Bug #49190: LibRadosMiscConnectFailure_ConnectFailure_Test: FAILED ceph_assert(p != obs_call_gate...
Bug #40868 is not related Jos Collin

02/15/2021

10:35 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
... Brad Hubbard
03:40 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
sage: this is related to thrashing and only happens within cephadm. non-cephadm is not affected Sebastian Wagner
04:22 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Managed to reproduce this with some manageably large osd logs.
On the first osd, just before the slow ops begin we...
Brad Hubbard

02/13/2021

12:28 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
https://pulpito.ceph.com/swagner-2021-02-11_11:00:52-rados:cephadm-wip-swagner3-testing-2021-02-10-1322-distro-basic-... Sebastian Wagner

02/12/2021

11:00 PM Backport #48986 (Resolved): pacific: ceph osd df tree reporting incorrect SIZE value for rack hav...
Brad Hubbard
10:44 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
This ran for nearly 24 hours.... Brad Hubbard
10:58 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Brad Hubbard wrote:
> https://tracker.ceph.com/issues/39039 ?
>
> Maybe it would be worth a try to see if disabli...
Sebastian Wagner
07:11 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
https://tracker.ceph.com/issues/39039 ?
Maybe it would be worth a try to see if disabling cephx improves the situa...
Brad Hubbard
05:39 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
... Brad Hubbard
03:42 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
The stuck op is a copy of 16:8744f7fc:test-rados-api-smithi091-35842-17::big:head to 16:2b70cbe7:test-rados-api-smith... Brad Hubbard
02:34 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
From swagner-2021-02-11_10:31:33-rados:cephadm-wip-swagner-testing-2021-02-09-1126-distro-basic-smithi/5874513 there'... Brad Hubbard
10:09 PM Bug #49190 (Resolved): LibRadosMiscConnectFailure_ConnectFailure_Test: FAILED ceph_assert(p != ob...
Neha Ojha
10:00 PM Bug #49087 (Resolved): pacific: rados/upgrade/nautilus-x-singleton fails on 20.04
Neha Ojha
04:16 PM Bug #49087: pacific: rados/upgrade/nautilus-x-singleton fails on 20.04
https://github.com/ceph/ceph/pull/39214 merged Yuri Weinstein
06:15 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
https://github.com/ceph/ceph/pull/39179 merged Yuri Weinstein
06:11 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
These two unfound objects are of interest to us. Let's figure out why these are unfound.... Neha Ojha
01:50 PM Feature #49275 (Fix Under Review): [RFE] Add health warning in ceph status for filestore OSDs
Prashant D
01:43 PM Feature #49275 (Resolved): [RFE] Add health warning in ceph status for filestore OSDs
Along with health warn for filestore osds, the health detail should give OSD numbers which are still on filestore to ... Prashant D
09:06 AM Support #49268 (Closed): Blocked IOs up to 30 seconds when host powered down
Hello all,
I am facing an "issue" with my ceph cluster.

I have a small 6-node cluster.
Each node has 2 OSDs ...
Julien Demais

02/11/2021

11:26 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
/a/swagner-2021-02-11_10:31:33-rados:cephadm-wip-swagner-testing-2021-02-09-1126-distro-basic-smithi/5874516 shows th... Neha Ojha
09:21 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
/a/swagner-2021-02-11_10:31:33-rados:cephadm-wip-swagner-testing-2021-02-09-1126-distro-basic-smithi/5874516/
See...
Neha Ojha
08:01 PM Bug #49259 (Resolved): test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
swagner-2021-02-11_10:31:33-rados:cephadm-wip-swagner-testing-2021-02-09-1126-distro-basic-smithi/5874513... Sebastian Wagner
09:16 AM Bug #49231: MONs unresponsive over extended periods of time
Update: I'm now seeing this issue 2 to 3 times a day; it's getting really irritating. Possibly due to the large re... Frank Schilder
04:54 AM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Trying a potential patch to see if I understand the actual root cause here. Brad Hubbard

02/10/2021

08:21 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
... Deepika Upadhyay
04:49 PM Bug #48786 (Fix Under Review): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapR...
Neha Ojha
04:38 PM Bug #48786: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcount2 failed
https://pulpito.ceph.com/swagner-2021-02-10_11:41:39-rados:cephadm-wip-swagner-testing-2021-02-09-1126-distro-basic-s... Sebastian Wagner
03:12 PM Bug #46847: Loss of placement information on OSD reboot
Given the "severity" I'd be really glad if some of the Ceph core devs could have a look at this :) I'm really not tha... Jonas Jelten
11:00 AM Bug #46847: Loss of placement information on OSD reboot
Thanks for getting back on this. Your observations are exactly what I see as well. A note about severity of this bug.... Frank Schilder
10:40 AM Bug #49231 (New): MONs unresponsive over extended periods of time
Version: 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
I'm repeatedly observing that the MONs ...
Frank Schilder

02/09/2021

10:21 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
http://qa-proxy.ceph.com/teuthology/bhubbard-2021-02-09_20:24:03-rados:singleton-nomsgr:all:lazy_omap_stats_output.ya... Brad Hubbard
06:32 AM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
If we look at the output from http://qa-proxy.ceph.com/teuthology/ideepika-2021-01-22_07:01:14-rados-wip-deepika-test... Brad Hubbard
02:43 AM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
From http://qa-proxy.ceph.com/teuthology/bhubbard-2021-02-08_22:46:10-rados:singleton-nomsgr:all:lazy_omap_stats_outp... Brad Hubbard
08:57 PM Bug #47380: mon: slow ops due to osd_failure
Copying my note from https://tracker.ceph.com/issues/43893#note-4
> Looking at this ticket again it's not a no_rep...
Greg Farnum
03:19 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
see teuthology: /home/ideepika/crt.log... Deepika Upadhyay

02/08/2021

11:49 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
I have what looks like three reproducers here, http://pulpito.front.sepia.ceph.com/bhubbard-2021-02-08_22:46:10-rados... Brad Hubbard
11:42 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
In nojha-2021-02-01_21:31:14-rados-wip-39145-distro-basic-smithi/5847125, where I ran the command manually, following... Neha Ojha
04:46 AM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Looking into this further, in the successful case (from a fresh run on master) we see the following output.... Brad Hubbard
10:21 PM Backport #48496 (Resolved): octopus: Paxos::restart() and Paxos::shutdown() can race leading to u...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39161
m...
Nathan Cutler
07:28 PM Backport #48496: octopus: Paxos::restart() and Paxos::shutdown() can race leading to use-after-fr...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/39161
merged
Yuri Weinstein
07:18 PM Bug #48998: Scrubbing terminated -- not all pgs were active and clean
... Deepika Upadhyay
06:41 PM Bug #49064 (Resolved): test_envlibrados_for_rocksdb.sh: EnvLibradosMutipoolTest.DBBulkLoadKeysInR...
Neha Ojha
06:31 PM Bug #49064: test_envlibrados_for_rocksdb.sh: EnvLibradosMutipoolTest.DBBulkLoadKeysInRandomOrder ...
https://github.com/ceph/ceph/pull/39264 merged Yuri Weinstein
06:40 PM Backport #49134 (Resolved): pacific: test_envlibrados_for_rocksdb.sh: EnvLibradosMutipoolTest.DBB...
Neha Ojha
06:29 PM Bug #49069: mds crashes on v15.2.8 -> master upgrade decoding MMgrConfigure
https://github.com/ceph/ceph/pull/39237 merged Yuri Weinstein
10:46 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/ceph/teuthology-archive/yuriw-2021-02-07_16:27:00-rados-wip-yuri8-testing-2021-01-27-1208-octopus-distro-basic-smi...
Deepika Upadhyay
10:40 AM Bug #49212 (Resolved): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class 'ssd...
... Deepika Upadhyay
09:00 AM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
... Deepika Upadhyay

02/07/2021

10:42 PM Bug #48984 (Need More Info): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Brad Hubbard
10:42 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
I haven't been able to reproduce this but the following is a review based on the code.
The last output from src/te...
Brad Hubbard