Activity

From 09/04/2018 to 10/03/2018

10/03/2018

11:25 PM Feature #36310 (New): Add norecover and nobackfill flags for per pool as we have for global cluster
Add norecover and nobackfill flags for per pool as we have for global cluster
This was done for noscrub and nodeep-s...
Vikhyat Umrao
09:35 PM Backport #24806 (Resolved): luminous: rgw workload makes osd memory explode
Nathan Cutler
09:34 PM Bug #23871 (Resolved): luminous->mimic: missing primary copy of xxx, wil try copies on 3, then fu...
Nathan Cutler
09:34 PM Backport #24908 (Resolved): luminous: luminous->mimic: missing primary copy of xxx, wil try copie...
Nathan Cutler
09:32 PM Bug #24588 (Resolved): osd: may get empty info at recovery
Nathan Cutler
09:32 PM Backport #24772 (Resolved): luminous: osd: may get empty info at recovery
Nathan Cutler
09:32 PM Bug #24486 (Resolved): osd: segv in Session::have_backoff
Nathan Cutler
09:32 PM Backport #24495 (Resolved): luminous: osd: segv in Session::have_backoff
Nathan Cutler
09:31 PM Bug #24371 (Resolved): Ceph-osd crash when activate SPDK
Nathan Cutler
09:31 PM Backport #24471 (Resolved): luminous: Ceph-osd crash when activate SPDK
Nathan Cutler
09:20 PM Bug #23916 (Resolved): LibRadosAio.PoolQuotaPP failed
Nathan Cutler
09:20 PM Backport #23924 (Resolved): luminous: LibRadosAio.PoolQuotaPP failed
Nathan Cutler
09:19 PM Bug #23713 (Resolved): High MON cpu usage when cluster is changing
Nathan Cutler
09:19 PM Backport #23912 (Resolved): luminous: mon: High MON cpu usage when cluster is changing
Nathan Cutler
09:18 PM Bug #36285: qa/workunits/cephtool/test.sh test fails setting pg_num to 97
Did all the OSDs actually come on? Note the last line there: specified pg_num 97 is too large (creating 87 new PGs on... Greg Farnum
09:18 PM Bug #22095 (Resolved): ceph status shows wrong number of objects
Nathan Cutler
09:18 PM Backport #23772 (Resolved): luminous: ceph status shows wrong number of objects
Nathan Cutler
09:16 PM Bug #23940 (Resolved): recursive lock of objecter session::lock on cancel
Nathan Cutler
09:16 PM Backport #23986 (Resolved): luminous: recursive lock of objecter session::lock on cancel
Nathan Cutler
09:09 PM Bug #36300: Clients receive "wrong fsid" error when CephX is disabled
I'll take a look. Greg Farnum
02:10 PM Bug #36300 (Resolved): Clients receive "wrong fsid" error when CephX is disabled
Related to the changes introduced here [1]. The following reproducer shows the issue hit by a client application:
...
Jason Dillaman
08:52 PM Feature #22086 (Resolved): ceph-objectstore-tool: Add option "dump-import" to examine an export
Nathan Cutler
08:52 PM Backport #22390 (Rejected): jewel: ceph-objectstore-tool: Add option "dump-import" to examine an ...
Jewel is EOL Nathan Cutler
07:47 PM Bug #36306 (Resolved): monstore tool rebuild does not generate creating_pgs
The rebuild function does not populate creating_pgs's created_pools. This leads to every (existing) pg being (re)crea... Sage Weil
05:25 PM Bug #36305 (New): test_mon_ping fails with [errno 2] error calling ping_monitor
... Neha Ojha
05:08 PM Bug #36304 (Need More Info): FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wa...
... Neha Ojha
02:37 PM Backport #26932 (In Progress): luminous: scrub livelock
Nathan Cutler
01:00 PM Backport #26932 (Need More Info): luminous: scrub livelock
Nathan Cutler
12:19 PM Backport #26932 (In Progress): luminous: scrub livelock
Nathan Cutler
12:17 PM Backport #25145 (In Progress): luminous: Automatically set expected_num_objects for new pools wit...
Nathan Cutler
12:10 PM Backport #23998 (In Progress): luminous: osd/EC: slow/hung ops in multimds suite test
Nathan Cutler
08:24 AM Backport #23926 (Need More Info): luminous: disable bluestore cache caused a rocksdb error
An attempt at this backport is here: https://github.com/ceph/ceph/pull/24325
@smithfarm I think you need to backpo...
Nathan Cutler
08:10 AM Backport #36298 (Resolved): mimic: ceph pg ls creating: EINVAL
https://github.com/ceph/ceph/pull/24601 Nathan Cutler
08:09 AM Backport #36297 (Resolved): luminous: ceph pg ls creating: EINVAL
https://github.com/ceph/ceph/pull/24602 Nathan Cutler
08:09 AM Backport #36296 (Resolved): mimic: [objecter] client socket failure leads to hung connection
https://github.com/ceph/ceph/pull/24600 Nathan Cutler
08:09 AM Backport #36295 (Resolved): luminous: [objecter] client socket failure leads to hung connection
https://github.com/ceph/ceph/pull/24574 Nathan Cutler

10/02/2018

10:46 PM Bug #36183 (Pending Backport): [objecter] client socket failure leads to hung connection
Neha Ojha
10:45 PM Bug #36174 (Pending Backport): ceph pg ls creating: EINVAL
Neha Ojha
10:43 PM Bug #36174: ceph pg ls creating: EINVAL
Dan van der Ster wrote:
> https://github.com/ceph/ceph/pull/24262
merged
Yuri Weinstein
07:32 PM Backport #36292 (Resolved): mimic: pg dout log had backfill=[] and bft= which are the same thing
https://github.com/ceph/ceph/pull/24573 Nathan Cutler
02:55 PM Bug #20439: PG never finishes getting created
Seen again:
/a/dzafman-2018-09-26_22:31:44-rados-wip-zafman-testing-distro-basic-smithi/3074605
David Zafman
02:37 PM Bug #36289 (New): Converting Filestore OSD from leveldb to rocksdb backend on CentOS
This is a continuation of [1] this thread from the ML. The only difference we've found is that I'm using CentOS and t... David Turner
01:28 AM Bug #36260 (Resolved): qa/workunits/mon/test_mon_config_key.py fails
Kefu Chai

10/01/2018

10:51 PM Bug #36170 (Pending Backport): pg dout log had backfill=[] and bft= which are the same thing
Neha Ojha
10:39 PM Bug #36285 (New): qa/workunits/cephtool/test.sh test fails setting pg_num to 97

/a/dzafman-2018-09-26_22:31:44-rados-wip-zafman-testing-distro-basic-smithi/3074562
The portion of test shown ...
David Zafman
09:08 PM Bug #36177 (Fix Under Review): rados rm --force-full is blocked when cluster is in full status
Greg Farnum
08:07 PM Backport #35962 (Resolved): luminous: choose_acting picked want > pool size
Nathan Cutler
02:43 PM Backport #35962: luminous: choose_acting picked want > pool size
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24299
merged
Yuri Weinstein
08:06 PM Backport #36274 (Resolved): luminous: osd/PrimaryLogPG: fix potential pg-log overtrimming
Nathan Cutler
08:04 PM Backport #36274 (Resolved): luminous: osd/PrimaryLogPG: fix potential pg-log overtrimming
https://github.com/ceph/ceph/pull/24308 Nathan Cutler
08:06 PM Backport #36275 (In Progress): mimic: osd/PrimaryLogPG: fix potential pg-log overtrimming
Nathan Cutler
08:04 PM Backport #36275 (Resolved): mimic: osd/PrimaryLogPG: fix potential pg-log overtrimming
https://github.com/ceph/ceph/pull/24309 Nathan Cutler
05:36 PM Bug #36182: osd: hung op "osd.3 22 get_health_metrics reporting 2 slow ops, oldest is osd_op(mds....
Another with full logs (no cores):
/ceph/teuthology-archive/pdonnell-2018-10-01_03:14:44-fs-wip-pdonnell-testing-2...
Patrick Donnelly
04:36 PM Bug #22330: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
Latest instance with logs/cores: /ceph/teuthology-archive/pdonnell-2018-10-01_03:19:12-multimds-wip-pdonnell-testing-... Patrick Donnelly
04:35 PM Bug #36271 (Duplicate): src/common/interval_map.h: 161: FAILED ceph_assert(len > 0)
Patrick Donnelly
04:13 PM Bug #36271 (Duplicate): src/common/interval_map.h: 161: FAILED ceph_assert(len > 0)
... Patrick Donnelly
04:08 PM Bug #36270 (New): The "Many more objects per PG than average" warning does not work well when obj...
See https://bugzilla.redhat.com/show_bug.cgi?id=1633221
It's rare, but some clusters have pools with very differen...
Greg Farnum
02:42 PM Bug #36239: osd/PrimaryLogPG: fix potential pg-log overtrimming
merged https://github.com/ceph/ceph/pull/24308 Yuri Weinstein
02:36 AM Backport #35964 (In Progress): mimic: RADOS: probably missing clone location for async_recovery_t...
https://github.com/ceph/ceph/pull/24345 Prashant D
02:29 AM Backport #35963 (In Progress): mimic: choose_acting picked want > pool size
https://github.com/ceph/ceph/pull/24344 Prashant D

09/30/2018

08:01 AM Bug #36260 (Fix Under Review): qa/workunits/mon/test_mon_config_key.py fails
https://github.com/ceph/ceph/pull/24340 Kefu Chai

09/29/2018

04:06 PM Bug #36260 (Resolved): qa/workunits/mon/test_mon_config_key.py fails
... Kefu Chai

09/28/2018

04:23 PM Bug #36250 (Can't reproduce): ceph-osd process crashing
ceph-osd process crashes in thread msgr-worker. This happens with all OSDs in the cluster, roughly once per day at th... Josh Haft
08:29 AM Backport #24478 (In Progress): luminous: read object attrs failed at EC recovery
Nathan Cutler
08:16 AM Backport #23926 (In Progress): luminous: disable bluestore cache caused a rocksdb error
Nathan Cutler
06:57 AM Backport #35844 (Resolved): luminous: objecter cannot resend split-dropped op when racing with co...
Nathan Cutler
03:58 AM Backport #36131 (Resolved): luminous: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" ...
Kefu Chai

09/27/2018

09:07 PM Backport #35844: luminous: objecter cannot resend split-dropped op when racing with con reset
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24188
merged
Yuri Weinstein
09:04 PM Backport #36131: luminous: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on centos 7.4
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24259
merged
Yuri Weinstein
02:48 PM Bug #36239 (Resolved): osd/PrimaryLogPG: fix potential pg-log overtrimming
https://github.com/ceph/ceph/pull/23317 Neha Ojha
02:31 PM Bug #35813 (Resolved): should remove mentioning of "scrubq" in ceph(8) manpage
Nathan Cutler
02:31 PM Backport #35855 (Resolved): mimic: should remove mentioning of "scrubq" in ceph(8) manpage
Nathan Cutler
02:29 PM Backport #35854 (Resolved): luminous: should remove mentioning of "scrubq" in ceph(8) manpage
Nathan Cutler
10:46 AM Bug #35974: Apparent export-diff/import-diff corruption
Cliff Pajaro wrote:
> [...]
> You mention cache tiering but that is not something we use.
Sorry, I mixed this on...
Jason Dillaman
05:57 AM Bug #35974: Apparent export-diff/import-diff corruption
Cliff Pajaro wrote:
> the issue is not seen with krbd (rbd map)
rbd_balance_snap_reads config option is only fo...
Mykola Golub
06:49 AM Backport #35962 (In Progress): luminous: choose_acting picked want > pool size
https://github.com/ceph/ceph/pull/24299 Prashant D
02:50 AM Bug #36105: OSD hangs during shutdown
https://github.com/ceph/ceph/pull/24296 David Zafman
12:57 AM Bug #35847 (In Progress): wrong cluster_network doesn't cause any errors and ends up using monito...
Working through Greg's comments on my PR. Victor Denisov
12:17 AM Bug #25146 (Fix Under Review): "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x...
ceph/rocksdb: https://github.com/ceph/rocksdb/pull/40 Radoslaw Zarzynski

09/26/2018

10:04 PM Bug #36105 (In Progress): OSD hangs during shutdown
David Zafman
10:03 PM Bug #36105: OSD hangs during shutdown

In all 3 OSDs I found that hung, the last thread to drain is the smallest thread_index (thus the thread that handle...
David Zafman
01:51 AM Bug #36105: OSD hangs during shutdown

I'll look at this more tomorrow but I suspect that the following commit from https://github.com/ceph/ceph/pull/2273...
David Zafman
09:38 PM Bug #36096 (Need More Info): osd: crashing with errors: "write_log_and_missing with: dirty_to" an...
Could you please attach logs to this tracker? Neha Ojha
09:21 PM Bug #35810: FAILED assert(entries.begin()->version > info.last_update)
It may be related to error-pg-log entry bugs in 12.2.2. The latest luminous releases have fixed a few of these. Josh Durgin
09:20 PM Bug #35847 (Fix Under Review): wrong cluster_network doesn't cause any errors and ends up using m...
Neha Ojha
09:19 PM Bug #35974: Apparent export-diff/import-diff corruption
Should disable balanced and localized reads when a cache tier is in use. Neha Ojha
08:57 PM Bug #35974: Apparent export-diff/import-diff corruption
... Cliff Pajaro
06:15 PM Bug #35974: Apparent export-diff/import-diff corruption
Ultimately we have disabled the rbd_balance_snap_reads feature.
To help the developers troubleshoot the problem, h...
Cliff Pajaro
06:06 PM Bug #35974: Apparent export-diff/import-diff corruption
Moving to OSD team in case they have any follow-up -- but it just sounds like your OSDs were inconsistent for some re... Jason Dillaman
06:02 PM Bug #35974: Apparent export-diff/import-diff corruption
@Jason: If rbd_balance_snap_reads is enabled, deep-scrub fixes the issue. Cliff Pajaro
01:03 PM Bug #35974: Apparent export-diff/import-diff corruption
@Cliff: are you saying the deep-scrub fixed the corruption between snapshot0 and snapshot1 -- or are you saying that ... Jason Dillaman
02:45 PM Bug #22052 (Resolved): ceph-mon: possible Leak in OSDMap::build_simple_optioned
Kefu Chai
08:01 AM Bug #22052 (Fix Under Review): ceph-mon: possible Leak in OSDMap::build_simple_optioned
https://github.com/ceph/teuthology/pull/1213 Kefu Chai
05:15 AM Bug #36172: osd: hit suicide timeout
Most likely can't flush filestore output to the hardware. Can you thoroughly check the hardware is in perfect working... Brad Hubbard
04:18 AM Feature #36187 (New): Crush rule ssd-primary should take previous emit result into consideration
http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/
The document entry "PLACING DIFFERENT POOLS ON DI...
Horace Ng
03:20 AM Bug #36166: pg merge can collide with remapped, upmap pgs
https://github.com/ceph/ceph/pull/24184 xie xingguo

09/25/2018

11:17 PM Bug #36186: failed to become clean before timeout expired - pg stuck in clean+premerge+peered
/a/nojha-2018-09-24_16:58:52-rados-master-distro-basic-smithi/3065624/ Neha Ojha
08:16 PM Bug #36186 (Resolved): failed to become clean before timeout expired - pg stuck in clean+premerge...
... Neha Ojha
11:12 PM Bug #36105: OSD hangs during shutdown
/a/nojha-2018-09-24_16:58:52-rados-master-distro-basic-smithi/3065653/ Neha Ojha
02:47 PM Bug #36105: OSD hangs during shutdown
I've reproduced this running the test in my local tree. I'll work on generating a core dump to find out what is stuck. David Zafman
10:29 PM Bug #36174 (In Progress): ceph pg ls creating: EINVAL
Nathan Cutler
08:43 AM Bug #36174: ceph pg ls creating: EINVAL
https://github.com/ceph/ceph/pull/24262 Dan van der Ster
08:41 AM Bug #36174 (Resolved): ceph pg ls creating: EINVAL
... Dan van der Ster
10:17 PM Bug #36182: osd: hung op "osd.3 22 get_health_metrics reporting 2 slow ops, oldest is osd_op(mds....
Logs in /a/pdonnell-2018-09-25_01:23:37-fs-wip-pdonnell-testing-20180924.230702-distro-basic-smithi/3066511/remote/log Neha Ojha
05:11 PM Bug #36182 (Resolved): osd: hung op "osd.3 22 get_health_metrics reporting 2 slow ops, oldest is ...
From: http://pulpito.ceph.com/pdonnell-2018-09-25_01:23:37-fs-wip-pdonnell-testing-20180924.230702-distro-basic-smith... Patrick Donnelly
06:19 PM Bug #36183 (Fix Under Review): [objecter] client socket failure leads to hung connection
*PR*: https://github.com/ceph/ceph/pull/24276 Jason Dillaman
05:50 PM Bug #36183 (Resolved): [objecter] client socket failure leads to hung connection
During an rbd-mirror thrash test run, the process failed to shut down cleanly because it was stuck in a librados rea... Jason Dillaman
02:05 PM Feature #24176: osd: add command to drop OSD cache
Patrick, sorry I completely missed your comment. I opened a PR for it: https://github.com/ceph/ceph/pull/24270 Mohamad Gebai
01:28 PM Bug #22837 (Resolved): discover_all_missing() not always called during activating
Nathan Cutler
01:28 PM Backport #26992 (Resolved): luminous: discover_all_missing() not always called during activating
Nathan Cutler
10:24 AM Bug #36177: rados rm --force-full is blocked when cluster is in full status
https://github.com/ceph/ceph/pull/24264 Honggang Yang
09:57 AM Bug #36177 (Resolved): rados rm --force-full is blocked when cluster is in full status
... Honggang Yang
09:54 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
FWIW, we hit this issue several times; it seems related to our operational work that changes `mon_osd_force_trim_to`... Xiaoxi Chen
07:33 AM Bug #36172 (New): osd: hit suicide timeout
ceph version 0.94.9-9.el7cp
An OSD drive died some days ago, and after a restart today it failed again with the same error:
...
Bernd Hennig
06:07 AM Backport #36132 (In Progress): mimic: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" ...
https://github.com/ceph/ceph/pull/24260 Kefu Chai
06:02 AM Backport #36131 (In Progress): luminous: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPv...
https://github.com/ceph/ceph/pull/24259 Kefu Chai
02:35 AM Bug #35810: FAILED assert(entries.begin()->version > info.last_update)
Neha Ojha wrote:
> Hi Chang. Can you reproduce this bug with higher level of debugging? It is hard to find out what'...
Chang Liu

09/24/2018

09:43 PM Bug #36170: pg dout log had backfill=[] and bft= which are the same thing
https://github.com/ceph/ceph/pull/24256 Neha Ojha
09:25 PM Bug #36170 (Resolved): pg dout log had backfill=[] and bft= which are the same thing

This is confusing to log analysis. I would have preferred to leave bft= added in 2013, but backfill=[] is in mimic...
David Zafman
06:09 PM Bug #36166 (Resolved): pg merge can collide with remapped, upmap pgs
If either source or target pg is remapped or has an upmap it may map to a different set of osds. Sage Weil
06:07 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
/ceph/teuthology-archive/pdonnell-2018-09-23_19:17:54-fs-wip-pdonnell-testing-20180923.160923-distro-basic-smithi/306... Patrick Donnelly
05:44 PM Bug #35847 (In Progress): wrong cluster_network doesn't cause any errors and ends up using monito...
PR: https://github.com/ceph/ceph/pull/24236 Victor Denisov
03:56 PM Bug #36164 (New): cephtool/test fails 'ceph tell mon.a help' with EINTR
... Sage Weil
02:22 PM Bug #36163 (Fix Under Review): mon osdmap cash too small during upgrade to mimic
https://github.com/ceph/ceph/pull/24247 Sage Weil
02:15 PM Bug #36163 (Resolved): mon osdmap cash too small during upgrade to mimic
At least one large cluster upgrading from luminous to mimic had its mons fall over due to heavy load that turned out... Sage Weil
11:01 AM Backport #36150 (Resolved): mimic: output format is invalid of the crush tree json dumper
https://github.com/ceph/ceph/pull/24481 Nathan Cutler
11:01 AM Backport #36149 (Resolved): luminous: output format is invalid of the crush tree json dumper
https://github.com/ceph/ceph/pull/24482 Nathan Cutler
11:00 AM Backport #36132 (Resolved): mimic: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on ...
https://github.com/ceph/ceph/pull/24260 Nathan Cutler
11:00 AM Backport #36131 (Resolved): luminous: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" ...
https://github.com/ceph/ceph/pull/24259 Nathan Cutler
08:50 AM Bug #24373 (Resolved): osd: eternal stuck PG in 'unfound_recovery'
Nathan Cutler
08:50 AM Backport #24501 (Resolved): luminous: osd: eternal stuck PG in 'unfound_recovery'
Nathan Cutler

09/23/2018

03:27 PM Support #36115: After Mimic upgrade OSD's stuck at booting.
My main kernel is Linux 4.14.70-1-lts. I also tried 4.18.8-arch1-1-ARCH. Nothing changed.
I'm sure this problem re...
morphin by
03:12 PM Support #36115: After Mimic upgrade OSD's stuck at booting.
iperf test between 2 nodes: https://paste.ubuntu.com/p/7rRYSSqtyh/
I don't think this is related to network or firew...
morphin by
02:49 PM Support #36115 (New): After Mimic upgrade OSD's stuck at booting.
After the Luminous to Mimic upgrade, when I try to start an OSD, it gets
stuck at "booting". (I edited the hostnames so do n...
morphin by

09/22/2018

03:48 PM Bug #36113 (New): fusestore test umount failed?
... Sage Weil
03:46 PM Bug #21143: bad RESETSESSION between OSDs?
/a/sage-2018-09-22_02:47:58-rados-master-distro-basic-smithi/3053124
seeing more of this!
Sage Weil
03:42 PM Bug #26972 (Fix Under Review): cluster [ERR] Error -2 reading object
https://github.com/ceph/ceph/pull/24225 Sage Weil
03:30 PM Bug #24866 (Resolved): FAILED assert(0 == "past_interval start interval mismatch") in check_past_...
resolved by https://github.com/ceph/ceph/pull/24064 Sage Weil

09/21/2018

11:28 PM Bug #35974: Apparent export-diff/import-diff corruption
I did a deep analysis of the export-diff files created with read-balance set to true and false. When read-balance is... Cliff Pajaro
11:24 PM Bug #22329 (Need More Info): mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
Neha Ojha wrote:
> Patrick, which set of logs have the (Leak_DefinitelyLost, Leak_IndirectlyLost) errors?
Old log...
Patrick Donnelly
09:43 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
Patrick, which set of logs have the (Leak_DefinitelyLost, Leak_IndirectlyLost) errors? Neha Ojha
09:55 PM Bug #36073 (Resolved): failed to recover before timeout expired -- premerge+peered PGs?
Neha Ojha
09:54 PM Bug #35810 (Need More Info): FAILED assert(entries.begin()->version > info.last_update)
Hi Chang. Can you reproduce this bug with higher level of debugging? It is hard to find out what's happening from the... Neha Ojha
09:31 PM Bug #22330: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
Neha Ojha
09:27 PM Bug #24866 (Need More Info): FAILED assert(0 == "past_interval start interval mismatch") in check...
Neha Ojha
01:22 PM Bug #35955: ceph-objectstore-tool past_intervals broken
This is fixed for nautilus since the behavior totally changed with https://github.com/ceph/ceph/pull/23985. The prob... Sage Weil
01:22 PM Bug #35955 (Resolved): ceph-objectstore-tool past_intervals broken
Sage Weil
04:38 AM Bug #23828: ec gen object leaks into different filestore collection just after split
... Kefu Chai
03:50 AM Backport #35854 (In Progress): luminous: should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/24211 Prashant D
03:47 AM Backport #35855 (In Progress): mimic: should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/24210 Prashant D

09/20/2018

10:56 PM Bug #36105: OSD hangs during shutdown
Yes, the kill_daemons failed because after 6 minutes several terminated OSDs still hadn't finished shutting down. I ... David Zafman
10:20 PM Bug #36105: OSD hangs during shutdown
Sage Weil
10:19 PM Bug #36105 (Resolved): OSD hangs during shutdown
... Sage Weil
10:30 PM Bug #25153 (Pending Backport): output format is invalid of the crush tree json dumper
Sage Weil
10:10 PM Backport #26992: luminous: discover_all_missing() not always called during activating
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23817
merged
Yuri Weinstein
09:17 PM Bug #35974: Apparent export-diff/import-diff corruption
@Jason
For the logs sent previously, I performed export-diff between snapshot1 and snapshot2. When I did an rbd exp...
Cliff Pajaro
07:23 PM Subtask #36091 (In Progress): [rbd top] collect client perf stats when query is enabled
Mykola Golub
01:29 PM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/sage-2018-09-19_21:52:06-rados-wip-sage2-testing-2018-09-19-1236-distro-basic-smithi/3044506 Sage Weil
12:29 PM Bug #36096 (Need More Info): osd: crashing with errors: "write_log_and_missing with: dirty_to" an...
Our Ceph cluster running Mimic is crashing OSD nodes with the following error in the log:... Greg Smith
02:20 AM Backport #35844 (In Progress): luminous: objecter cannot resend split-dropped op when racing with...
https://github.com/ceph/ceph/pull/24188 Prashant D

09/19/2018

10:46 PM Subtask #36091 (Resolved): [rbd top] collect client perf stats when query is enabled
The OSD's 'collect_perf_metrics' MgrClient callback should record whether or not the query is enabled/disabled and ma... Jason Dillaman
03:13 PM Bug #35974: Apparent export-diff/import-diff corruption
@Patrick: Interesting find. If it truly is related to just that option, we will have to get a RADOS core team member ... Jason Dillaman
01:27 PM Bug #21143: bad RESETSESSION between OSDs?
... Sage Weil
12:35 PM Backport #35836 (In Progress): mimic: mon: mgr options not parse propertly
https://github.com/ceph/ceph/pull/24176 Prashant D
09:56 AM Bug #35969 (Pending Backport): "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on cent...
Kefu Chai

09/18/2018

10:24 PM Bug #35682 (Resolved): 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
Sage Weil
07:56 PM Bug #36040: mon: Valgrind: mon (InvalidFree, InvalidWrite, InvalidRead)
Also in Mimic: /ceph/teuthology-archive/yuriw-2018-09-13_19:40:54-fs-mimic-distro-basic-smithi/3018437/remote/smithi0... Patrick Donnelly
05:39 PM Feature #24176: osd: add command to drop OSD cache
Mohamad, any update on this? Patrick Donnelly
04:20 PM Bug #36073 (In Progress): failed to recover before timeout expired -- premerge+peered PGs?
https://github.com/ceph/ceph/pull/24064
https://github.com/ceph/ceph/pull/23985
Sage Weil
03:24 PM Bug #36073 (Resolved): failed to recover before timeout expired -- premerge+peered PGs?
Appeared between 93748a325cd8 ("Merge pull request #23944 from ceph/wip-s3a-update-mirror") and 5a3344f0e52c ("Merge ... Ilya Dryomov
03:38 PM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/kchai-2018-09-18_07:16:16-rados-wip-kefu2-testing-2018-09-18-1224-distro-basic-smithi/3037527 Kefu Chai
03:11 PM Bug #22330: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
Running the multimds:basic suite with --filter 'clusters/9-mds.yaml conf/{client.yaml mds.yaml mon.yaml osd.yaml} inl... Neha Ojha
03:09 PM Bug #21931: osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range.first.get_...
Running the multimds:basic suite with --filter 'clusters/9-mds.yaml conf/{client.yaml mds.yaml mon.yaml osd.yaml} inl... Neha Ojha
01:08 AM Bug #35849 (Closed): mimic: test_envlibrados_for_rocksdb.sh: build failed with error: #endif with...
sure. Neha Ojha
12:28 AM Bug #35849: mimic: test_envlibrados_for_rocksdb.sh: build failed with error: #endif without #if
Ah, I see what happened. The github.com/facebook/rocksdb/ was broken on the day these tasks failed. See https://githu... Brad Hubbard

09/17/2018

08:55 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
See also #36040 Patrick Donnelly
08:38 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
Still not seeing anything in RADOS runs AFAIK, but I did notice there might be some disparity in coverage....
>13:...
Greg Farnum
07:35 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
-/ceph/teuthology-archive/pdonnell-2018-09-13_04:59:57-multimds-wip-pdonnell-testing-20180913.024004-distro-basic-smi... Patrick Donnelly
08:54 PM Bug #36040 (New): mon: Valgrind: mon (InvalidFree, InvalidWrite, InvalidRead)
From: /ceph/teuthology-archive/pdonnell-2018-09-13_04:59:57-multimds-wip-pdonnell-testing-20180913.024004-distro-basi... Patrick Donnelly
02:24 PM Bug #35849: mimic: test_envlibrados_for_rocksdb.sh: build failed with error: #endif without #if
Hey Brad,
I had reproduced it here: http://pulpito.ceph.com/nojha-2018-09-07_17:42:05-rados:singleton-mimic-distro...
Neha Ojha
10:28 AM Bug #35849: mimic: test_envlibrados_for_rocksdb.sh: build failed with error: #endif without #if
Hey Neha,
Can you reproduce this?
I tried mimicking the job in a Bionic container and it builds correctly. I al...
Brad Hubbard
07:53 AM Bug #35923 (Resolved): "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
Kefu Chai
06:26 AM Bug #35969 (Fix Under Review): "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on cent...
Kefu Chai
06:25 AM Bug #35969 (In Progress): "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on centos 7.4
https://github.com/ceph/ceph/pull/24124
As suggested by Brad, we can just bump the BuildRequires of gperftools.
Kefu Chai
05:42 AM Bug #24835: osd daemon spontaneous segfault
Soenke,
Could you upload a coredump for each of the different backtraces as well as details of your environment (t...
Brad Hubbard

09/14/2018

08:07 PM Bug #24022 (Resolved): "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
Nathan Cutler
08:06 PM Backport #35941 (Resolved): luminous: "ceph tell osd.x bench" writes resulting JSON to stderr ins...
Nathan Cutler
04:47 PM Backport #35941: luminous: "ceph tell osd.x bench" writes resulting JSON to stderr instead of std...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23680
merged
Yuri Weinstein
08:05 PM Bug #23370 (Resolved): mgrc's ms_handle_reset races with send_pgstats()
Nathan Cutler
08:05 PM Backport #23408 (Resolved): luminous: mgrc's ms_handle_reset races with send_pgstats()
Nathan Cutler
04:46 PM Backport #23408: luminous: mgrc's ms_handle_reset races with send_pgstats()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23791
merged
Yuri Weinstein
08:04 PM Bug #25112 (Resolved): osd,mon: increase mon_max_pg_per_osd to 250
Nathan Cutler
08:04 PM Backport #25177 (Resolved): luminous: osd,mon: increase mon_max_pg_per_osd to 300
Nathan Cutler
04:45 PM Backport #25177: luminous: osd,mon: increase mon_max_pg_per_osd to 300
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23862
merged
Yuri Weinstein
08:03 PM Bug #25175 (Resolved): rados python bindings use prval from stack
Nathan Cutler
08:03 PM Backport #25203 (Resolved): luminous: rados python bindings use prval from stack
Nathan Cutler
04:45 PM Backport #25203: luminous: rados python bindings use prval from stack
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23864
merged
Yuri Weinstein
08:03 PM Bug #25108 (Resolved): object errors found in be_select_auth_object() aren't logged the same
Nathan Cutler
08:02 PM Backport #32106 (Resolved): luminous: object errors found in be_select_auth_object() aren't logge...
Nathan Cutler
04:44 PM Backport #32106: luminous: object errors found in be_select_auth_object() aren't logged the same
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23871
merged
Yuri Weinstein
06:43 PM Bug #35974: Apparent export-diff/import-diff corruption
The exports between 3b and 3c were identical.
All the clients that are mounting the filesystems are currently using ...
Patrick McLean
02:35 PM Bug #35974 (Need More Info): Apparent export-diff/import-diff corruption
@Patrick: were the resulting exports different between run 3b and 3c? The logs indicate that they read the same data ... Jason Dillaman
02:31 PM Bug #35969: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on centos 7.4
Asked on ceph-{maintainers,users,developers} to see if we can drop support for centos 7.4; turns out it's a no-go.... Kefu Chai
11:45 AM Bug #23431: OSD Segmentation fault in thread_name:safe_timer
See #23352
The fix is in 12.2.8
Brad Hubbard
11:42 AM Bug #23431: OSD Segmentation fault in thread_name:safe_timer
Hi,
Same issue with ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)
My OSDs ar...
Kevin Tibi
10:53 AM Bug #35682: 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
Brad Hubbard wrote:
> Working on a teuthology task to do a test build of the examples as well.
@Brad: I already h...
Nathan Cutler
04:28 AM Bug #35682 (In Progress): 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
Brad Hubbard
04:28 AM Bug #35682: 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
https://github.com/ceph/ceph/pull/24098
Working on a teuthology task to do a test build of the examples as well.
Brad Hubbard

09/13/2018

08:51 PM Bug #35974: Apparent export-diff/import-diff corruption
I have attached logs of the export-diffs run with "--debug-rbd=20" and "--debug-rados=20" Patrick McLean
05:58 PM Bug #35974 (Need More Info): Apparent export-diff/import-diff corruption
From the ML:
We utilize Ceph RBDs for our users' storage and need to keep data synchronized across data centres. F...
Jason Dillaman
05:48 PM Backport #35942 (Resolved): mimic: "ceph tell osd.x bench" writes resulting JSON to stderr instea...
Nathan Cutler
03:16 PM Backport #35942: mimic: "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24041
merged
Yuri Weinstein
01:41 PM Bug #27363: 'rbd rm' does not clean tiered pool completly
Moving this to the core team. This appears to be an issue w/ the cache tier. In my test, after removing the image, al... Jason Dillaman
11:51 AM Bug #21965: mon/MonClient.cc: 478: FAILED assert(authenticate_err == 0)
Hi @sage, we encountered this assert after running the same QA case (workloads/rados_api_tests.yaml) 30 times, but ... huang jun
11:34 AM Bug #35969: "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on centos 7.4
This issue resembles #23653. Both of them are related to new memory management APIs. #23653 was related to @aligned_a... Kefu Chai
10:47 AM Bug #35969 (Resolved): "symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm" on centos 7.4
see /a/kchai-2018-09-13_01:57:49-ceph-disk-wip-fix-35906-distro-basic-ovh/3012294... Kefu Chai
08:49 AM Bug #35808: ceph osd ok-to-stop result dosen't match the real situation
xie xingguo wrote:
> I see you are using a pool min_size of 3, so no replica is allowed to be offline and hence the...
frank lin
08:46 AM Bug #35808: ceph osd ok-to-stop result dosen't match the real situation
John Spray wrote:
> It's a little bit odd that the ok-to-stop command said 4 PGs, but you actually had 5 PGs go inco...
frank lin
08:09 AM Documentation #35968: [doc][jewel] sync documentation "OSD Config Reference" default values with ...
... Tomas Petr
08:09 AM Documentation #35968: [doc][jewel] sync documentation "OSD Config Reference" default values with ...
OPTION(osd_map_cache_size, OPT_INT, 200)
OPTION(osd_scrub_during_recovery, OPT_BOOL, false) // Allow new scrubs to...
Tomas Petr
08:08 AM Documentation #35968 (Won't Fix): [doc][jewel] sync documentation "OSD Config Reference" default ...
http://docs.ceph.com/docs/jewel/rados/configuration/osd-config-ref/
Change following default values to the one use...
Tomas Petr
08:05 AM Documentation #35967 (Resolved): [doc] sync documentation "OSD Config Reference" default values w...
for:
http://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/
http://docs.ceph.com/docs/mimic/rados/con...
Tomas Petr
05:52 AM Backport #35964 (Resolved): mimic: RADOS: probably missing clone location for async_recovery_targets
https://github.com/ceph/ceph/pull/24345 Nathan Cutler
05:51 AM Bug #26875 (Resolved): kv: MergeOperator name() returns string, and caller calls c_str() on the t...
Nathan Cutler
05:51 AM Backport #26908 (Resolved): luminous: kv: MergeOperator name() returns string, and caller calls c...
Nathan Cutler
05:49 AM Backport #35963 (Resolved): mimic: choose_acting picked want > pool size
https://github.com/ceph/ceph/pull/24344 Nathan Cutler
05:49 AM Backport #35962 (Resolved): luminous: choose_acting picked want > pool size
https://github.com/ceph/ceph/pull/24299 Nathan Cutler
01:58 AM Bug #35546 (Pending Backport): RADOS: probably missing clone location for async_recovery_targets
Changing status to Pending Backport to get the mimic backport tracker ticket opened for this. Neha Ojha

09/12/2018

09:54 PM Backport #26908: luminous: kv: MergeOperator name() returns string, and caller calls c_str() on t...
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23566
merged
Yuri Weinstein
09:05 PM Bug #35847: wrong cluster_network doesn't cause any errors and ends up using monitor network?
Agreed, we should have a clear error when one of the networks does not work. Josh Durgin
09:03 PM Bug #35924 (Pending Backport): choose_acting picked want > pool size
Josh Durgin
08:07 PM Bug #35542: Backfill and recovery should validate all checksums
Sage Weil wrote:
> I'm unclear what checksum is not being checked. There is only *sometimes* a full object checksum...
Greg Farnum
03:34 PM Bug #35542: Backfill and recovery should validate all checksums
I'm unclear what checksum is not being checked. There is only *sometimes* a full object checksum that we can validat... Sage Weil
04:12 PM Bug #26868 (Resolved): PGLog.cc: saw valgrind issues while accessing complete_to->version
Nathan Cutler
04:12 PM Backport #26910 (Resolved): luminous: PGLog.cc: saw valgrind issues while accessing complete_to->...
Nathan Cutler
03:24 PM Backport #26910: luminous: PGLog.cc: saw valgrind issues while accessing complete_to->version
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23211
merged
Yuri Weinstein
04:11 PM Bug #25198 (Resolved): FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler
04:11 PM Backport #25199 (Resolved): luminous: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler
03:24 PM Backport #25199: luminous: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23211
merged
Yuri Weinstein
04:11 PM Bug #25184 (Resolved): osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler
04:11 PM Backport #25219 (Resolved): luminous: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler
03:24 PM Backport #25219: luminous: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23211
merged
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Yuri Weinstein
04:11 PM Feature #23979 (Resolved): Limit pg log length during recovery/backfill so that we don't run out ...
Nathan Cutler
04:10 PM Backport #24988 (Resolved): luminous: Limit pg log length during recovery/backfill so that we don...
Nathan Cutler
03:24 PM Backport #24988: luminous: Limit pg log length during recovery/backfill so that we don't run out ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23211
merged
Yuri Weinstein
03:58 PM Bug #35955 (Resolved): ceph-objectstore-tool past_intervals broken
... Sage Weil
02:28 PM Bug #23879: test_mon_osdmap_prune.sh fails
... Kefu Chai
02:22 PM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
http://pulpito.ceph.com/joshd-2018-09-12_06:44:56-rados-wip-luminous-cache-autotune-distro-basic-smithi/3010389/
<...
Josh Durgin
02:04 PM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
@sage: indeed :-).
Maybe rename the original use for "outside_quorum" to "outside_election" or something similar t...
Stefan Kooman
01:48 PM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
We could add a new field for monitors that are... not part of the quorum, but I'm not sure what I'd call it if not "o... Sage Weil
01:29 PM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
Yes, `outside quorum` is solely used to track which monitors are outside of the quorum during an election; once the e... Joao Eduardo Luis
12:52 PM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
Let me see if I get this right.
"After a successful election, `outside_quorum` is cleared."
^^ Do I understand ...
Stefan Kooman
12:45 PM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
`outside quorum` does not pertain to down monitors. We may change that if people think it's more obvious, but the mai... Joao Eduardo Luis
11:50 AM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
ceph mon_status -f json | jq '.outside_quorum'
[]
^^ HEALTH_OK
ceph mon_status -f json | jq '.outside_quorum'
...
Stefan Kooman
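Since `outside_quorum` is only populated during an election, one way to list monitors that are currently not in quorum is to compare the monmap against the `quorum` ranks in the same mon_status output. The following is a minimal sketch, assuming the parsed JSON carries a `quorum` list of ranks and a `monmap.mons` list with `rank` and `name` keys; the helper name and the sample data are hypothetical.

```python
from typing import List


def mons_not_in_quorum(status: dict) -> List[str]:
    """Return names of monitors listed in the monmap but absent from quorum.

    Assumes `status` is the parsed output of `ceph mon_status -f json`,
    with a `quorum` list of ranks and `monmap.mons` entries that have
    `rank` and `name` keys (an assumption, not confirmed by this ticket).
    """
    in_quorum = set(status.get("quorum", []))
    return [
        mon["name"]
        for mon in status["monmap"]["mons"]
        if mon["rank"] not in in_quorum
    ]


# Hypothetical sample: three monitors, the one with rank 2 is down.
sample_status = {
    "quorum": [0, 1],
    "outside_quorum": [],
    "monmap": {
        "mons": [
            {"rank": 0, "name": "mon1"},
            {"rank": 1, "name": "mon2"},
            {"rank": 2, "name": "mon3"},
        ]
    },
}

missing = mons_not_in_quorum(sample_status)
print(missing)
```

A monitoring check built this way would not depend on `outside_quorum` at all, which matches the explanation in this ticket that the field is cleared after a successful election.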
10:53 AM Bug #35947: mon_status doesn't populate outside_quorum when some mons are down
The structure in question is the mon_status output, so it would be useful if you could look at the output of the mon_... John Spray
08:05 AM Bug #35947 (New): mon_status doesn't populate outside_quorum when some mons are down
I noticed the "mon_outside_quorum" metric always returns "0", regardless of whether there are mons outside quorum:
cep...
Stefan Kooman
02:02 PM Bug #35923 (Fix Under Review): "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
https://github.com/ceph/ceph/pull/24061 Sage Weil
01:58 PM Bug #35923 (In Progress): "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
This is fall-out from merge vs delete pg resurrection:... Sage Weil
05:22 AM Bug #35682: 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
Thanks John, That's one of the options I'm looking into. Brad Hubbard

09/11/2018

08:05 PM Backport #35942 (In Progress): mimic: "ceph tell osd.x bench" writes resulting JSON to stderr ins...
Nathan Cutler
07:17 PM Backport #35942 (Resolved): mimic: "ceph tell osd.x bench" writes resulting JSON to stderr instea...
https://github.com/ceph/ceph/pull/24041 Nathan Cutler
08:01 PM Backport #35941 (In Progress): luminous: "ceph tell osd.x bench" writes resulting JSON to stderr ...
Nathan Cutler
07:17 PM Backport #35941 (Resolved): luminous: "ceph tell osd.x bench" writes resulting JSON to stderr ins...
https://github.com/ceph/ceph/pull/23680 Nathan Cutler
04:30 PM Bug #24022 (Pending Backport): "ceph tell osd.x bench" writes resulting JSON to stderr instead of...
Nathan Cutler
04:12 PM Bug #35924 (Fix Under Review): choose_acting picked want > pool size
https://github.com/ceph/ceph/pull/24035 Sage Weil
02:24 PM Bug #35924 (Resolved): choose_acting picked want > pool size
... Sage Weil
03:58 PM Bug #20694: osd/ReplicatedBackend.cc: 1417: FAILED assert(get_parent()->get_log().get_log().obje...
Seen in mimic: /a/yuriw-2018-09-10_16:59:58-rados-wip-yuri-testing-2018-09-10-1525-mimic-distro-basic-smithi/3002608/ Neha Ojha
12:57 PM Bug #35923: "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
#10629 has the same backtrace. Kefu Chai
12:55 PM Bug #35923 (Resolved): "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
now, there are two keys to check:... Kefu Chai
12:27 PM Bug #35833 (Resolved): error: 'unique_ptr' in namespace 'std' does not name a type when compiling...
Kefu Chai
12:24 PM Feature #35544 (Resolved): "osd df" should show OSD state
Kefu Chai
12:19 PM Bug #35682: 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
I'm seeing the same thing.
I'm guessing that this is happening because the include of assert.h in buffer.h is pick...
John Spray
12:05 PM Bug #23879: test_mon_osdmap_prune.sh fails
/a/kchai-2018-09-11_09:51:05-rados-wip-kefu-testing-2018-09-10-1219-distro-basic-mira/3005452/teuthology.log
<pr...
Kefu Chai
10:45 AM Bug #35808: ceph osd ok-to-stop result dosen't match the real situation
It's a little bit odd that the ok-to-stop command said 4 PGs, but you actually had 5 PGs go incomplete, but basically... John Spray
09:03 AM Bug #35808: ceph osd ok-to-stop result dosen't match the real situation
I see you are using a pool min_size of 3, so no replica is allowed to be offline, and hence the result is expected? xie xingguo
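The reasoning above can be illustrated with a simplified check. This is only a sketch, assuming a replicated pool with size 3 (the ticket states only min_size) and ignoring everything else the OSD considers; the function name is hypothetical and this is not how ok-to-stop is implemented.

```python
def pg_stays_active(size: int, min_size: int, replicas_offline: int) -> bool:
    """Simplified rule: a PG keeps serving I/O while at least
    min_size replicas remain available."""
    return size - replicas_offline >= min_size


# Assumed pool size of 3 with min_size 3: stopping one replica is not safe.
print(pg_stays_active(size=3, min_size=3, replicas_offline=1))
# With min_size 2, one replica could go offline.
print(pg_stays_active(size=3, min_size=2, replicas_offline=1))
```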

09/10/2018

09:00 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
The test code that needs to be fixed is only present in Mimic and master. David Zafman
03:24 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
https://github.com/ceph/ceph/pull/24013 David Zafman
03:36 PM Backport #35909 (Resolved): mimic: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
https://github.com/ceph/ceph/pull/24017 David Zafman

09/09/2018

08:15 AM Tasks #25186 (In Progress): setup repo for building dependencies like boost, rocksdb, which are n...
https://github.com/ceph/ceph/pull/23995 Kefu Chai

09/08/2018

05:05 PM Bug #24975 (Resolved): valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler
05:05 PM Backport #24992 (Resolved): mimic: valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler
03:33 PM Backport #24992: mimic: valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23744
merged
Yuri Weinstein
06:31 AM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
Adding jewel because we are seeing an "osd-scrub-repair.sh" make check issue in jewel (not sure if it's this same iss... Nathan Cutler
02:56 AM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
It turns out this is just a difference in the iterator for the function throwing the exception.... David Zafman
01:51 AM Bug #35546 (Resolved): RADOS: probably missing clone location for async_recovery_targets
xie xingguo

09/07/2018

11:40 PM Bug #35833 (In Progress): error: 'unique_ptr' in namespace 'std' does not name a type when compil...
https://github.com/ceph/ceph/pull/23992 Brad Hubbard
07:28 AM Bug #35833 (Resolved): error: 'unique_ptr' in namespace 'std' does not name a type when compiling...
We should be able to compile a librados client program, such as examples/librados/hello_world.cc, on a system with li... Brad Hubbard
11:26 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
-https://github.com/ceph/ceph/pull/23991-
David Zafman
11:15 PM Bug #35845 (In Progress): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
David Zafman
06:43 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
This must be caused by differences in the grep command on different distributions. It passes sometimes including o...
David Zafman
04:39 PM Bug #35845 (Resolved): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
... Neha Ojha
11:09 PM Backport #35855 (Resolved): mimic: should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/24210 Patrick Donnelly
11:09 PM Backport #35854 (Resolved): luminous: should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/24211 Patrick Donnelly
09:51 PM Feature #85: osd: pg_num shrink
Yeah, merged now! Sage Weil
07:38 PM Feature #85: osd: pg_num shrink
Sage, were you going to merge https://github.com/ceph/ceph/pull/20469 ? Nathan Cutler
06:55 PM Feature #85 (Resolved): osd: pg_num shrink
\o/ Sage Weil
08:54 PM Bug #24801: PG num_bytes becomes huge
Fix is included in pull request https://github.com/ceph/ceph/pull/22797 David Zafman
06:57 PM Bug #22165 (Resolved): split pg not actually created, gets stuck in state unknown
by commit fdfc5c64 Sage Weil
06:56 PM Bug #26970 (Resolved): src/osd/OSDMap.h: 1065: FAILED assert(__null != pool)
Sage Weil
06:05 PM Bug #35849 (Closed): mimic: test_envlibrados_for_rocksdb.sh: build failed with error: #endif with...
... Neha Ojha
05:28 PM Bug #35847 (Resolved): wrong cluster_network doesn't cause any errors and ends up using monitor n...
1) Set any random valid cluster network, e.g. cluster_network: 17.20.20.0/24
2) Set up the cluster, notice the cluster com...
Vasu Kulkarni
02:10 PM Bug #20694: osd/ReplicatedBackend.cc: 1417: FAILED assert(get_parent()->get_log().get_log().obje...
/a/sage-2018-09-06_16:02:58-rados-wip-sage-testing-2018-09-05-1559-distro-basic-smithi/2985475 Sage Weil
12:37 PM Backport #35067 (In Progress): luminous: deep scrub cannot find the bitrot if the object is cached
-https://github.com/ceph/ceph/pull/23980- Prashant D
11:12 AM Bug #35813 (Pending Backport): should remove mentioning of "scrubq" in ceph(8) manpage
Kefu Chai
10:22 AM Backport #35844 (Resolved): luminous: objecter cannot resend split-dropped op when racing with co...
https://github.com/ceph/ceph/pull/24188 Nathan Cutler
10:22 AM Backport #35843 (Resolved): mimic: objecter cannot resend split-dropped op when racing with con r...
https://github.com/ceph/ceph/pull/24970 Nathan Cutler
10:20 AM Backport #35836 (Resolved): mimic: mon: mgr options not parse propertly
https://github.com/ceph/ceph/pull/24176 Nathan Cutler

09/06/2018

09:21 PM Support #27203: osd down while bucket is deleting
The heartbeat timing out like that means the OSD is overloaded - in particular delete operations for RGW can overwhel... Josh Durgin
02:11 PM Bug #35813 (Fix Under Review): should remove mentioning of "scrubq" in ceph(8) manpage
Kefu Chai
02:09 PM Bug #35813 (Resolved): should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/23959 Kefu Chai
01:59 PM Bug #35076 (Pending Backport): mon: mgr options not parse propertly
Kefu Chai
11:16 AM Bug #27206 (Resolved): rpm: should change ceph-mgr package depency from py-bcrypt to python2-bcrypt
Nathan Cutler
11:15 AM Backport #27212 (Resolved): mimic: rpm: should change ceph-mgr package depency from py-bcrypt to ...
Nathan Cutler
09:57 AM Bug #35810 (Can't reproduce): FAILED assert(entries.begin()->version > info.last_update)
... Chang Liu
09:01 AM Bug #35808 (Rejected): ceph osd ok-to-stop result dosen't match the real situation
The cluster is in healthy status, when I tried to run ceph osd ok-to-stop 0 it returns... frank lin
06:41 AM Backport #25144 (Resolved): mimic: Automatically set expected_num_objects for new pools with >=10...
Nathan Cutler
06:41 AM Feature #22750 (Resolved): libradosstriper conditional compile
w00t! Nathan Cutler
06:40 AM Backport #27213 (Resolved): mimic: libradosstriper conditional compile
Nathan Cutler
06:37 AM Backport #32108 (Resolved): mimic: object errors found in be_select_auth_object() aren't logged t...
Nathan Cutler
06:26 AM Bug #26940 (Resolved): force-create-pg broken
Nathan Cutler
06:26 AM Backport #34532 (Resolved): mimic: force-create-pg broken
Nathan Cutler
06:08 AM Backport #35068 (Resolved): mimic: deep scrub cannot find the bitrot if the object is cached
Nathan Cutler
06:06 AM Backport #26907 (Resolved): mimic: kv: MergeOperator name() returns string, and caller calls c_st...
Nathan Cutler
05:51 AM Backport #26909 (Resolved): mimic: PGLog.cc: saw valgrind issues while accessing complete_to->ver...
Nathan Cutler
05:50 AM Backport #25220 (Resolved): mimic: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler
05:50 AM Backport #25200 (Resolved): mimic: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler
05:50 AM Backport #24989 (Resolved): mimic: Limit pg log length during recovery/backfill so that we don't ...
Nathan Cutler
05:00 AM Bug #25153: output format is invalid of the crush tree json dumper
New commit to solve the review problems: https://github.com/ceph/ceph/pull/23319/commits/fa1056cfc32ce3bf932d7c71f281... Oshyn Song
02:36 AM Bug #27988 (In Progress): Warn if queue of scrubs ready to run exceeds some threshold
https://github.com/ceph/ceph/pull/23848 David Zafman
12:33 AM Bug #22544 (Pending Backport): objecter cannot resend split-dropped op when racing with con reset
Kefu Chai

09/05/2018

09:52 PM Backport #25144: mimic: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23860
merged
Yuri Weinstein
09:50 PM Backport #27213: mimic: libradosstriper conditional compile
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23869
merged
Yuri Weinstein
09:43 PM Backport #26931 (Resolved): mimic: scrub livelock
Sage Weil
09:42 PM Backport #25176 (Resolved): mimic: osd,mon: increase mon_max_pg_per_osd to 300
Sage Weil
09:42 PM Backport #25204 (Resolved): mimic: rados python bindings use prval from stack
Sage Weil
09:39 PM Backport #32108: mimic: object errors found in be_select_auth_object() aren't logged the same
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23870
merged
Reviewed-by: David Zafman <dzafman@redhat.com>
Yuri Weinstein
09:38 PM Backport #34532: mimic: force-create-pg broken
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23872
merged
Yuri Weinstein
09:37 PM Backport #35068: mimic: deep scrub cannot find the bitrot if the object is cached
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23873
merged
Yuri Weinstein
09:32 PM Backport #26909: mimic: PGLog.cc: saw valgrind issues while accessing complete_to->version
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:32 PM Backport #25220: mimic: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:32 PM Backport #25200: mimic: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:32 PM Backport #24989: mimic: Limit pg log length during recovery/backfill so that we don't run out of ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:24 PM Backport #26907: mimic: kv: MergeOperator name() returns string, and caller calls c_str() on the ...
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23865
merged
Yuri Weinstein
10:33 AM Feature #35687 (New): rgw: storing and reading total usage data to construct rgw service monitor ...
There are problems with the current implementation of storing and reading rgw usage data:
1. The usage data will be ac...
Oshyn Song
05:08 AM Bug #35682 (Resolved): 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
... Brad Hubbard
12:58 AM Bug #35546 (Resolved): RADOS: probably missing clone location for async_recovery_targets
https://github.com/ceph/ceph/pull/23895 xie xingguo

09/04/2018

11:33 PM Feature #35545: mon: show warning when running with an even number of mons
https://github.com/ceph/ceph/pull/23922 Paul Emmerich
11:16 PM Feature #35545 (New): mon: show warning when running with an even number of mons
People seem to like configuring clusters with 4 monitors for some reason. I've seen this more than once in the wild.
Paul Emmerich
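A possible reason for such a warning is that monitor quorum needs a strict majority, so an even monitor count adds no extra failure tolerance over the next lower odd count. A minimal sketch of that arithmetic follows; the function name is hypothetical and not taken from the linked pull request.

```python
def failures_tolerated(num_mons: int) -> int:
    """Number of monitors that can fail while a strict majority remains."""
    majority = num_mons // 2 + 1
    return num_mons - majority


# Compare odd and even monitor counts.
for n in (3, 4, 5, 6):
    print(n, "mons tolerate", failures_tolerated(n), "failure(s)")
```

Under this rule, 4 monitors tolerate one failure, the same as 3.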
09:48 PM Feature #35544: "osd df" should show OSD state
Implementation is here: https://github.com/ceph/ceph/pull/23921 Paul Emmerich
09:31 PM Feature #35544 (Resolved): "osd df" should show OSD state
It's mildly irritating that "osd df (tree)" doesn't show the OSD status while "osd tree" does. Paul Emmerich
06:40 PM Bug #35542: Backfill and recovery should validate all checksums
Nope, 12.2.6 was the one that didn't handle checksums properly. So this looks like a real issue, although I think we ... Greg Farnum
06:27 PM Bug #35542: Backfill and recovery should validate all checksums
Oh, this may just be 12.2.5 being broken? In which case we can close. Greg Farnum
06:27 PM Bug #35542 (Won't Fix): Backfill and recovery should validate all checksums
From the thread "Copying without crc check when peering may lack reliability" on ceph-devel, it appears that backfill... Greg Farnum
04:45 PM Feature #19944 (Rejected): [RFE]: add option/support config persistence with ceph tell command
This seems to be addressed by the centralized config introduced in mimic. Joao Eduardo Luis
01:49 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
Another one: ... Sage Weil
12:09 PM Bug #34529 (Resolved): cbt tests in rados qa suite fails
This was a result of http://status.sepia.ceph.com/incident/3676
dmick restarted the VM
David Galloway
11:38 AM Tasks #25186: setup repo for building dependencies like boost, rocksdb, which are not provided by...
for building ceph-libboost, use https://github.com/tchaikov/boost... Kefu Chai
 
