Activity

From 08/13/2018 to 09/11/2018

09/11/2018

08:05 PM Backport #35942 (In Progress): mimic: "ceph tell osd.x bench" writes resulting JSON to stderr ins...
Nathan Cutler
07:17 PM Backport #35942 (Resolved): mimic: "ceph tell osd.x bench" writes resulting JSON to stderr instea...
https://github.com/ceph/ceph/pull/24041 Nathan Cutler
08:01 PM Backport #35941 (In Progress): luminous: "ceph tell osd.x bench" writes resulting JSON to stderr ...
Nathan Cutler
07:17 PM Backport #35941 (Resolved): luminous: "ceph tell osd.x bench" writes resulting JSON to stderr ins...
https://github.com/ceph/ceph/pull/23680 Nathan Cutler
04:30 PM Bug #24022 (Pending Backport): "ceph tell osd.x bench" writes resulting JSON to stderr instead of...
Nathan Cutler
04:12 PM Bug #35924 (Fix Under Review): choose_acting picked want > pool size
https://github.com/ceph/ceph/pull/24035 Sage Weil
02:24 PM Bug #35924 (Resolved): choose_acting picked want > pool size
... Sage Weil
03:58 PM Bug #20694: osd/ReplicatedBackend.cc: 1417: FAILED assert(get_parent()->get_log().get_log().obje...
Seen in mimic: /a/yuriw-2018-09-10_16:59:58-rados-wip-yuri-testing-2018-09-10-1525-mimic-distro-basic-smithi/3002608/ Neha Ojha
12:57 PM Bug #35923: "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
#10629 has the same backtrace. Kefu Chai
12:55 PM Bug #35923 (Resolved): "ceph_assert(values.size() == 2)" in PG::peek_map_epoch()
now, there are two keys to check:... Kefu Chai
12:27 PM Bug #35833 (Resolved): error: 'unique_ptr' in namespace 'std' does not name a type when compiling...
Kefu Chai
12:24 PM Feature #35544 (Resolved): "osd df" should show OSD state
Kefu Chai
12:19 PM Bug #35682: 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
I'm seeing the same thing.
I'm guessing that this is happening because the include of assert.h in buffer.h is pick...
John Spray
12:05 PM Bug #23879: test_mon_osdmap_prune.sh fails
/a/kchai-2018-09-11_09:51:05-rados-wip-kefu-testing-2018-09-10-1219-distro-basic-mira/3005452/teuthology.log
...
Kefu Chai
10:45 AM Bug #35808: ceph osd ok-to-stop result dosen't match the real situation
It's a little bit odd that the ok-to-stop command said 4 PGs, but you actually had 5 PGs go incomplete, but basically... John Spray
09:03 AM Bug #35808: ceph osd ok-to-stop result dosen't match the real situation
I see you are using a pool min_size of 3, so no replica is allowed to be offline, and hence the result is expected? xie xingguo

09/10/2018

09:00 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
The test code that needs to be fixed is only present in Mimic and master. David Zafman
03:24 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
https://github.com/ceph/ceph/pull/24013 David Zafman
03:36 PM Backport #35909 (Resolved): mimic: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
https://github.com/ceph/ceph/pull/24017 David Zafman

09/09/2018

08:15 AM Tasks #25186 (In Progress): setup repo for building dependencies like boost, rocksdb, which are n...
https://github.com/ceph/ceph/pull/23995 Kefu Chai

09/08/2018

05:05 PM Bug #24975 (Resolved): valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler
05:05 PM Backport #24992 (Resolved): mimic: valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler
03:33 PM Backport #24992: mimic: valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23744
merged
Yuri Weinstein
06:31 AM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
Adding jewel because we are seeing an "osd-scrub-repair.sh" make check issue in jewel (not sure if it's this same iss... Nathan Cutler
02:56 AM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
It turns out this is just a difference in the iterator for the function throwing the exception.... David Zafman
01:51 AM Bug #35546 (Resolved): RADOS: probably missing clone location for async_recovery_targets
xie xingguo

09/07/2018

11:40 PM Bug #35833 (In Progress): error: 'unique_ptr' in namespace 'std' does not name a type when compil...
https://github.com/ceph/ceph/pull/23992 Brad Hubbard
07:28 AM Bug #35833 (Resolved): error: 'unique_ptr' in namespace 'std' does not name a type when compiling...
We should be able to compile a librados client program, such as examples/librados/hello_world.cc, on a system with li... Brad Hubbard
11:26 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
-https://github.com/ceph/ceph/pull/23991-
David Zafman
11:15 PM Bug #35845 (In Progress): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
David Zafman
06:43 PM Bug #35845: osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed

This must be caused by differences in the grep command on different distributions. It passes sometimes including o...
David Zafman
04:39 PM Bug #35845 (Resolved): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
... Neha Ojha
11:09 PM Backport #35855 (Resolved): mimic: should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/24210 Patrick Donnelly
11:09 PM Backport #35854 (Resolved): luminous: should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/24211 Patrick Donnelly
09:51 PM Feature #85: osd: pg_num shrink
Yeah, merged now! Sage Weil
07:38 PM Feature #85: osd: pg_num shrink
Sage, were you going to merge https://github.com/ceph/ceph/pull/20469 ? Nathan Cutler
06:55 PM Feature #85 (Resolved): osd: pg_num shrink
\o/ Sage Weil
08:54 PM Bug #24801: PG num_bytes becomes huge
Fix is included in pull request https://github.com/ceph/ceph/pull/22797 David Zafman
06:57 PM Bug #22165 (Resolved): split pg not actually created, gets stuck in state unknown
by commit fdfc5c64 Sage Weil
06:56 PM Bug #26970 (Resolved): src/osd/OSDMap.h: 1065: FAILED assert(__null != pool)
Sage Weil
06:05 PM Bug #35849 (Closed): mimic: test_envlibrados_for_rocksdb.sh: build failed with error: #endif with...
... Neha Ojha
05:28 PM Bug #35847 (Resolved): wrong cluster_network doesn't cause any errors and ends up using monitor n...
1) Set any random valid cluster network, e.g. cluster_network: 17.20.20.0/24
2) Set up the cluster, notice the cluster com...
Vasu Kulkarni
02:10 PM Bug #20694: osd/ReplicatedBackend.cc: 1417: FAILED assert(get_parent()->get_log().get_log().obje...
/a/sage-2018-09-06_16:02:58-rados-wip-sage-testing-2018-09-05-1559-distro-basic-smithi/2985475 Sage Weil
12:37 PM Backport #35067 (In Progress): luminous: deep scrub cannot find the bitrot if the object is cached
-https://github.com/ceph/ceph/pull/23980- Prashant D
11:12 AM Bug #35813 (Pending Backport): should remove mentioning of "scrubq" in ceph(8) manpage
Kefu Chai
10:22 AM Backport #35844 (Resolved): luminous: objecter cannot resend split-dropped op when racing with co...
https://github.com/ceph/ceph/pull/24188 Nathan Cutler
10:22 AM Backport #35843 (Resolved): mimic: objecter cannot resend split-dropped op when racing with con r...
https://github.com/ceph/ceph/pull/24970 Nathan Cutler
10:20 AM Backport #35836 (Resolved): mimic: mon: mgr options not parse propertly
https://github.com/ceph/ceph/pull/24176 Nathan Cutler

09/06/2018

09:21 PM Support #27203: osd down while bucket is deleting
The heartbeat timing out like that means the OSD is overloaded - in particular delete operations for RGW can overwhel... Josh Durgin
02:11 PM Bug #35813 (Fix Under Review): should remove mentioning of "scrubq" in ceph(8) manpage
Kefu Chai
02:09 PM Bug #35813 (Resolved): should remove mentioning of "scrubq" in ceph(8) manpage
https://github.com/ceph/ceph/pull/23959 Kefu Chai
01:59 PM Bug #35076 (Pending Backport): mon: mgr options not parse propertly
Kefu Chai
11:16 AM Bug #27206 (Resolved): rpm: should change ceph-mgr package depency from py-bcrypt to python2-bcrypt
Nathan Cutler
11:15 AM Backport #27212 (Resolved): mimic: rpm: should change ceph-mgr package depency from py-bcrypt to ...
Nathan Cutler
09:57 AM Bug #35810 (Can't reproduce): FAILED assert(entries.begin()->version > info.last_update)
... Chang Liu
09:01 AM Bug #35808 (Rejected): ceph osd ok-to-stop result dosen't match the real situation
The cluster is in healthy status, when I tried to run ceph osd ok-to-stop 0 it returns... frank lin
06:41 AM Backport #25144 (Resolved): mimic: Automatically set expected_num_objects for new pools with >=10...
Nathan Cutler
06:41 AM Feature #22750 (Resolved): libradosstriper conditional compile
w00t! Nathan Cutler
06:40 AM Backport #27213 (Resolved): mimic: libradosstriper conditional compile
Nathan Cutler
06:37 AM Backport #32108 (Resolved): mimic: object errors found in be_select_auth_object() aren't logged t...
Nathan Cutler
06:26 AM Bug #26940 (Resolved): force-create-pg broken
Nathan Cutler
06:26 AM Backport #34532 (Resolved): mimic: force-create-pg broken
Nathan Cutler
06:08 AM Backport #35068 (Resolved): mimic: deep scrub cannot find the bitrot if the object is cached
Nathan Cutler
06:06 AM Backport #26907 (Resolved): mimic: kv: MergeOperator name() returns string, and caller calls c_st...
Nathan Cutler
05:51 AM Backport #26909 (Resolved): mimic: PGLog.cc: saw valgrind issues while accessing complete_to->ver...
Nathan Cutler
05:50 AM Backport #25220 (Resolved): mimic: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler
05:50 AM Backport #25200 (Resolved): mimic: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler
05:50 AM Backport #24989 (Resolved): mimic: Limit pg log length during recovery/backfill so that we don't ...
Nathan Cutler
05:00 AM Bug #25153: output format is invalid of the crush tree json dumper
New commit to solve the review problems: https://github.com/ceph/ceph/pull/23319/commits/fa1056cfc32ce3bf932d7c71f281... Oshyn Song
02:36 AM Bug #27988 (In Progress): Warn if queue of scrubs ready to run exceeds some threshold
https://github.com/ceph/ceph/pull/23848 David Zafman
12:33 AM Bug #22544 (Pending Backport): objecter cannot resend split-dropped op when racing with con reset
Kefu Chai

09/05/2018

09:52 PM Backport #25144: mimic: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23860
merged
Yuri Weinstein
09:50 PM Backport #27213: mimic: libradosstriper conditional compile
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23869
merged
Yuri Weinstein
09:43 PM Backport #26931 (Resolved): mimic: scrub livelock
Sage Weil
09:42 PM Backport #25176 (Resolved): mimic: osd,mon: increase mon_max_pg_per_osd to 300
Sage Weil
09:42 PM Backport #25204 (Resolved): mimic: rados python bindings use prval from stack
Sage Weil
09:39 PM Backport #32108: mimic: object errors found in be_select_auth_object() aren't logged the same
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23870
merged
Reviewed-by: David Zafman <dzafman@redhat.com>
Yuri Weinstein
09:38 PM Backport #34532: mimic: force-create-pg broken
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23872
merged
Yuri Weinstein
09:37 PM Backport #35068: mimic: deep scrub cannot find the bitrot if the object is cached
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23873
merged
Yuri Weinstein
09:32 PM Backport #26909: mimic: PGLog.cc: saw valgrind issues while accessing complete_to->version
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:32 PM Backport #25220: mimic: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:32 PM Backport #25200: mimic: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:32 PM Backport #24989: mimic: Limit pg log length during recovery/backfill so that we don't run out of ...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23403
merged
Yuri Weinstein
09:24 PM Backport #26907: mimic: kv: MergeOperator name() returns string, and caller calls c_str() on the ...
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23865
merged
Yuri Weinstein
10:33 AM Feature #35687 (New): rgw: storing and reading total usage data to construct rgw service monitor ...
There are problems for the current rgw usage data storing and reading implementation:
1. The usage data will be ac...
Oshyn Song
05:08 AM Bug #35682 (Resolved): 34164d55c839acd35bbb1be5279e3e23e3bec1fd broke the librados examples
... Brad Hubbard
12:58 AM Bug #35546 (Resolved): RADOS: probably missing clone location for async_recovery_targets
https://github.com/ceph/ceph/pull/23895 xie xingguo

09/04/2018

11:33 PM Feature #35545: mon: show warning when running with an even number of mons
https://github.com/ceph/ceph/pull/23922 Paul Emmerich
11:16 PM Feature #35545 (New): mon: show warning when running with an even number of mons
People seem to like configuring clusters with 4 monitors for some reason. I've seen this more than once in the wild.
Paul Emmerich
09:48 PM Feature #35544: "osd df" should show OSD state
Implementation is here: https://github.com/ceph/ceph/pull/23921 Paul Emmerich
09:31 PM Feature #35544 (Resolved): "osd df" should show OSD state
It's mildly irritating that "osd df (tree)" doesn't show the OSD status while "osd tree" does. Paul Emmerich
06:40 PM Bug #35542: Backfill and recovery should validate all checksums
Nope, 12.2.6 was the one that didn't handle checksums properly. So this looks like a real issue, although I think we ... Greg Farnum
06:27 PM Bug #35542: Backfill and recovery should validate all checksums
Oh, this may just be 12.2.5 being broken? In which case we can close. Greg Farnum
06:27 PM Bug #35542 (Won't Fix): Backfill and recovery should validate all checksums
From the thread "Copying without crc check when peering may lack reliability" on ceph-devel, it appears that backfill... Greg Farnum
04:45 PM Feature #19944 (Rejected): [RFE]: add option/support config persistence with ceph tell command
This seems to be addressed by the centralized config introduced in mimic. Joao Eduardo Luis
01:49 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
Another one: ... Sage Weil
12:09 PM Bug #34529 (Resolved): cbt tests in rados qa suite fails
This was a result of http://status.sepia.ceph.com/incident/3676
dmick restarted the VM
David Galloway
11:38 AM Tasks #25186: setup repo for building dependencies like boost, rocksdb, which are not provided by...
for building ceph-libboost, use https://github.com/tchaikov/boost... Kefu Chai

09/02/2018

05:07 PM Bug #23352: osd: segfaults under normal operation
Phat Le Ton wrote:
> I've just seen 12.2.8 release, Was your patch included in this release ?
Yes. See https://tr...
Nathan Cutler
04:35 PM Bug #23352: osd: segfaults under normal operation
Brad Hubbard wrote:
> I've created a test package here based on 12.2.7 and including the one line patch above.
>
...
Phat Le Ton
01:30 PM Backport #35068 (In Progress): mimic: deep scrub cannot find the bitrot if the object is cached
Nathan Cutler
01:17 PM Backport #34532 (In Progress): mimic: force-create-pg broken
Nathan Cutler
12:58 PM Backport #32106 (In Progress): luminous: object errors found in be_select_auth_object() aren't lo...
Nathan Cutler
12:44 PM Backport #32108 (In Progress): mimic: object errors found in be_select_auth_object() aren't logge...
Nathan Cutler
12:36 PM Backport #27213 (In Progress): mimic: libradosstriper conditional compile
Nathan Cutler
12:29 PM Feature #22750: libradosstriper conditional compile
https://github.com/ceph/ceph/pull/21983 Nathan Cutler
12:26 PM Backport #27212 (In Progress): mimic: rpm: should change ceph-mgr package depency from py-bcrypt ...
Nathan Cutler
12:22 PM Backport #26910 (In Progress): luminous: PGLog.cc: saw valgrind issues while accessing complete_t...
Nathan Cutler
12:18 PM Backport #26909 (In Progress): mimic: PGLog.cc: saw valgrind issues while accessing complete_to->...
Nathan Cutler
12:08 PM Backport #26908 (In Progress): luminous: kv: MergeOperator name() returns string, and caller call...
Nathan Cutler
12:06 PM Backport #26907 (In Progress): mimic: kv: MergeOperator name() returns string, and caller calls c...
Nathan Cutler
12:04 PM Backport #25203 (In Progress): luminous: rados python bindings use prval from stack
Nathan Cutler
12:03 PM Backport #25204 (In Progress): mimic: rados python bindings use prval from stack
Nathan Cutler
12:00 PM Backport #25177 (In Progress): luminous: osd,mon: increase mon_max_pg_per_osd to 300
Nathan Cutler
11:59 AM Backport #25176 (In Progress): mimic: osd,mon: increase mon_max_pg_per_osd to 300
Nathan Cutler
11:52 AM Backport #25144 (In Progress): mimic: Automatically set expected_num_objects for new pools with >...
Nathan Cutler
11:49 AM Backport #24992 (In Progress): mimic: valgrind-leaks.yaml: expected valgrind issues and found none
Nathan Cutler

09/01/2018

08:49 PM Bug #22544 (Fix Under Review): objecter cannot resend split-dropped op when racing with con reset
https://github.com/ceph/ceph/pull/23850 Sage Weil
08:43 PM Bug #22544: objecter cannot resend split-dropped op when racing with con reset
Here, it happened:... Sage Weil
07:20 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Some steps tried to reproduce the bug:
1. Create a luminous cluster running in Kubernetes using hostNetwork and th...
Dexter John Genterone

08/31/2018

10:08 PM Bug #35076 (Resolved): mon: mgr options not parse propertly
... Sage Weil
05:17 PM Bug #35075 (New): copy-get stuck sending osd_op
... Sage Weil
11:07 AM Backport #35071 (Resolved): mimic: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::p...
https://github.com/ceph/ceph/pull/24918 Nathan Cutler
11:06 AM Backport #35068 (Resolved): mimic: deep scrub cannot find the bitrot if the object is cached
https://github.com/ceph/ceph/pull/23873 Nathan Cutler
11:06 AM Backport #35067 (Resolved): luminous: deep scrub cannot find the bitrot if the object is cached
https://github.com/ceph/ceph/pull/24802 Nathan Cutler
08:53 AM Bug #34541 (Pending Backport): deep scrub cannot find the bitrot if the object is cached
https://github.com/ceph/ceph/pull/23629 Kefu Chai
08:53 AM Bug #34541 (Resolved): deep scrub cannot find the bitrot if the object is cached
quote from https://github.com/ceph/ceph/pull/23629
> Say a object who has data caches, but in a while later, cache...
Kefu Chai

08/30/2018

03:20 PM Backport #34532 (Resolved): mimic: force-create-pg broken
https://github.com/ceph/ceph/pull/23872 Nathan Cutler
01:53 PM Bug #26940 (Pending Backport): force-create-pg broken
Sage Weil
12:06 PM Bug #34529 (Resolved): cbt tests in rados qa suite fails
seems http://drop.ceph.com/qa/cosbench-0.4.2.c3.1.zip is not reachable anymore.... Kefu Chai
05:10 AM Backport #26992 (In Progress): luminous: discover_all_missing() not always called during activating
https://github.com/ceph/ceph/pull/23817 Prashant D

08/29/2018

09:51 PM Bug #25076 (Duplicate): MON crash when upgrading luminous v12.2.7 -> mimic v13.2.0 during ceph-fu...
Sage Weil
09:29 PM Bug #34321 (New): OSD crash because of DBObjectMap.cc: 662: FAILED assert(state.legacy)
Version: 12.2.7
The following crash is observed during normal operation of the cluster, so no particular steps to ...
Maks Kowalik
08:08 PM Bug #27988: Warn if queue of scrubs ready to run exceeds some threshold

I want to fix 3 things here. First, user-submitted scrubs are queued as due to occur immediately, but overdue sc...
David Zafman
05:25 PM Bug #24612 (Pending Backport): FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune...
Sage Weil
03:13 PM Bug #26994 (Resolved): test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest) f...
Kefu Chai

08/28/2018

08:23 PM Bug #24033 (Resolved): rados: not all exceptions accept keyargs
Nathan Cutler
08:22 PM Backport #25178 (Resolved): mimic: rados: not all exceptions accept keyargs
Nathan Cutler
07:53 PM Backport #25178: mimic: rados: not all exceptions accept keyargs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23335
merged
Yuri Weinstein
12:42 PM Bug #33561 (New): PG repair doesn't start on an inconsistent group
Version: 12.2.7
Issue timeline:
1.Deep-scrub discovered inconsistency in one group on a pool with 4 replicas - the ...
Maks Kowalik
12:33 PM Bug #33420 (New): Forced deep-scrub doesn't start
Version: 12.2.7
Issue timeline:
1. Cyclic deep-scrub discovered inconsistency:
2018-08-23 17:21:07.933458 osd....
Maks Kowalik
11:11 AM Backport #32108 (Resolved): mimic: object errors found in be_select_auth_object() aren't logged t...
https://github.com/ceph/ceph/pull/23870 Nathan Cutler
11:11 AM Backport #32106 (Resolved): luminous: object errors found in be_select_auth_object() aren't logge...
https://github.com/ceph/ceph/pull/23871 Nathan Cutler
05:23 AM Bug #27988: Warn if queue of scrubs ready to run exceeds some threshold
Talking with Sage, he believes there is already a warning status if you have scrubs that haven't run for more than 2x... David Turner

08/27/2018

09:21 PM Bug #20775 (Resolved): ceph_test_rados parameter error
Brad Hubbard
07:55 PM Bug #25182: Upmaps forgotten after restarting OSDs
I believe these log messages explain why the upmaps are being removed, but I'll attach the relevant section of the lo... Bryan Stillwell
06:39 PM Bug #25182: Upmaps forgotten after restarting OSDs
Bryan Stillwell wrote:
> What debugging logs would be helpful in figuring this out? I just restarted an OSD on my 1...
Sage Weil
06:07 PM Bug #25182: Upmaps forgotten after restarting OSDs
What debugging logs would be helpful in figuring this out? I just restarted an OSD on my 13.2.1-based cluster and al... Bryan Stillwell
06:44 PM Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair
Created tracker https://tracker.ceph.com/issues/27988 to add warning about too many scrubs pending. David Zafman
04:26 PM Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair
David Turner wrote:
> I came across this again as well and I did some more testing. As it turns out what resolved t...
David Turner
04:26 PM Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair
I came across this again as well and I did some more testing. As it turns out what resolved this issue for me was inc... David Turner
01:33 PM Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair
Hi - we are still experiencing this issue on 12.2.7 (so latest Luminous version)... Jacek S.
06:43 PM Bug #27988 (Rejected): Warn if queue of scrubs ready to run exceeds some threshold

The sched_scrub_pg set could be scanned during a new insert and the number of scrubs that are ready to be run could...
David Zafman
05:18 PM Bug #27985 (Resolved): force-backfill sets forced_recovery instead of forced_backfill in 13.2.1
I've noticed that using force-backfill in Mimic seems to be broken. It sets forced_recovery instead of forced_backfi... Bryan Stillwell
04:17 AM Support #27203: osd down while bucket is deleting
Actually, this issue still upsets me
-2> 2018-08-23 16:14:52.673287 7f3aeb536700 1 heartbeat_map is_healthy 'OS...
伟杰 谭

08/26/2018

12:50 PM Bug #24612: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
https://github.com/ceph/ceph/pull/23742
Currently missing: a reproducer. Reproducing may not be trivial because th...
Joao Eduardo Luis

08/25/2018

08:42 PM Bug #27363 (New): 'rbd rm' does not clean tiered pool completly
mimic (13.2.1)
linux kernel: 4.18.3-1.el7.elrepo.x86_64
ceph osd crush rule create-replicated hddreplrule default...
Fyodor Ustinov
05:26 PM Bug #27362 (New): Wrong erasure pool MAX AVAIL size calculation with technique=reed_sol_r6_op
... Fyodor Ustinov
05:53 AM Bug #24022: "ceph tell osd.x bench" writes resulting JSON to stderr instead of stdout.
luminous backport https://github.com/ceph/ceph/pull/23680 Konstantin Shalygin

08/24/2018

05:14 PM Bug #25084 (Resolved): Attempt to read object that can't be repaired loops forever
David Zafman
05:13 PM Bug #25108 (Pending Backport): object errors found in be_select_auth_object() aren't logged the same
David Zafman
05:12 PM Bug #24801: PG num_bytes becomes huge

So far with assert added to object_stat_sum_t::add() we saw this. Still not sure why the num_bytes is off.
...
David Zafman
12:54 PM Bug #24612 (In Progress): FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
Joao Eduardo Luis
02:00 AM Backport #26931 (In Progress): mimic: scrub livelock
https://github.com/ceph/ceph/pull/23722 Prashant D

08/23/2018

09:22 PM Backport #27213 (Resolved): mimic: libradosstriper conditional compile
https://github.com/ceph/ceph/pull/23869 Nathan Cutler
09:21 PM Backport #27212 (Resolved): mimic: rpm: should change ceph-mgr package depency from py-bcrypt to ...
https://github.com/ceph/ceph/pull/23868 Nathan Cutler
09:20 PM Bug #25057 (Resolved): jewel->luminous: osdmap crc mismatch
Nathan Cutler
09:20 PM Backport #25101 (Resolved): mimic: jewel->luminous: osdmap crc mismatch
Nathan Cutler
11:31 AM Feature #22750 (Pending Backport): libradosstriper conditional compile
Nathan Cutler
11:21 AM Feature #22750 (Resolved): libradosstriper conditional compile
Kefu Chai
11:28 AM Bug #27206 (Pending Backport): rpm: should change ceph-mgr package depency from py-bcrypt to pyth...
https://github.com/ceph/ceph/pull/23648 Kefu Chai
11:27 AM Bug #27206 (Resolved): rpm: should change ceph-mgr package depency from py-bcrypt to python2-bcrypt
Current deplist of ceph-mgr rpm package contains py-bcrypt depency which conflicts with python2-bcrypt needed for pyt... Kefu Chai
11:23 AM Bug #26998 (Resolved): IOPS churn with "osd op queue" = "mclock_opclass" or "mclock_client"
Kefu Chai
08:19 AM Support #27203: osd down while bucket is deleting
The format is ugly, my fault 伟杰 谭
07:59 AM Support #27203 (New): osd down while bucket is deleting
My environment is
[tanweijie@gz-ceph-52-202 ~]$ ceph --version
ceph version 12.2.5 (cad919881333ac92274171586c827e0...
伟杰 谭

08/22/2018

10:20 PM Feature #26975: Rados level IO priority for OSD operations
Do note that
1) "Messages" can already have priority, although its utility at this point is quite limited it's not t...
Greg Farnum
09:32 PM Bug #26880 (Resolved): ceph-base debian package compiled on ubuntu/xenial has unmet runtime depen...
Nathan Cutler
09:31 PM Backport #26881 (Resolved): mimic: ceph-base debian package compiled on ubuntu/xenial has unmet r...
Nathan Cutler
09:19 PM Bug #26971: failed to become clean before timeout expired
Looks like a PG is in active+undersized state. Maybe the balancer screwed up? Greg Farnum
09:14 PM Backport #24359 (Resolved): mimic: osd: leaked Session on osd.7
Nathan Cutler
09:00 PM Bug #24875 (Resolved): OSD: still returning EIO instead of recovering objects on checksum errors
Nathan Cutler
09:00 PM Backport #25226 (Resolved): mimic: OSD: still returning EIO instead of recovering objects on chec...
Nathan Cutler
08:46 PM Backport #25101: mimic: jewel->luminous: osdmap crc mismatch
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23226
merged
Yuri Weinstein
05:37 PM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
Similar failure seen in mimic: /a/yuriw-2018-08-21_23:27:39-rados-wip-yuri5-testing-2018-08-21-2033-mimic-distro-basi... Neha Ojha
03:39 PM Bug #27053 (New): qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
This is for 12.2.8
Run: http://pulpito.ceph.com/yuriw-2018-08-21_16:17:40-rados-luminous-distro-basic-smithi/
Job...
Yuri Weinstein
05:26 PM Bug #27055 (New): mimic: FAILED assert((uint64_t)buf.st_size == expected) in SyntheticWorkloadSta...
... Neha Ojha
08:51 AM Bug #24956: osd: parent process need to restart log service after fork, or ceph-osd will not work...
PR: https://github.com/ceph/ceph/pull/23685 Hsiao-Yin Tseng
06:28 AM Bug #26994 (Fix Under Review): test_module_commands (tasks.mgr.test_module_selftest.TestModuleSel...
https://github.com/ceph/ceph/pull/23681 Kefu Chai
03:45 AM Bug #23352 (Resolved): osd: segfaults under normal operation
The patch is only relevant to the osds. Brad Hubbard
02:56 AM Bug #26998: IOPS churn with "osd op queue" = "mclock_opclass" or "mclock_client"
Kefu Chai
02:14 AM Bug #26998: IOPS churn with "osd op queue" = "mclock_opclass" or "mclock_client"
- https://github.com/ceph/dmclock/pull/58
- https://github.com/ceph/ceph/pull/23643
Kefu Chai
02:13 AM Bug #26998 (Resolved): IOPS churn with "osd op queue" = "mclock_opclass" or "mclock_client"
for more details on this issue, please refer to https://github.com/ceph/dmclock/pull/58 . in short, if "osd op queue"... Kefu Chai

08/21/2018

08:22 PM Bug #25146 (In Progress): "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:para...
Very early fix: https://github.com/rzarzynski/rocksdb/tree/wip-bug-25146.
The case appears more complicated as the...
Radoslaw Zarzynski
07:58 PM Bug #26880: ceph-base debian package compiled on ubuntu/xenial has unmet runtime dependencies
https://github.com/ceph/ceph/pull/23490 merged Yuri Weinstein
07:30 PM Bug #26994: test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest) fails
Something like this will probably fix it... Noah Watkins
06:49 PM Bug #26994: test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest) fails
Here's the culprit: hello isn't packaged so it can't announce its commands.... Noah Watkins
06:45 PM Bug #26994: test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest) fails
The manager logs show all the modules except for `hello` being loaded... Noah Watkins
05:55 PM Bug #26994: test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest) fails
I can't reproduce this... it is as if the monitor has not received a summary of commands from the manager at the ... Noah Watkins
04:39 PM Bug #26994 (Resolved): test_module_commands (tasks.mgr.test_module_selftest.TestModuleSelftest) f...
in https://github.com/ceph/ceph/pull/23558/commits/00223d2364b5a6cc32eb5f83f5a642b5aef2c946 , hello is used for testi... Kefu Chai
04:03 PM Backport #26992 (Resolved): luminous: discover_all_missing() not always called during activating
https://github.com/ceph/ceph/pull/23817 Nathan Cutler
04:01 PM Feature #26975: Rados level IO priority for OSD operations
For "Rados level" I mean librados API at least, and implementation in OSD too. Марк Коренберг
03:59 PM Feature #26975 (New): Rados level IO priority for OSD operations
What I mean:
Suppose busy Ceph cluster.
Every OSD has many IO requests from clients in its queue. Today, all r...
Марк Коренберг
12:56 AM Bug #26972 (Resolved): cluster [ERR] Error -2 reading object

http://qa-proxy.ceph.com/teuthology/dzafman-2018-08-17_08:14:49-rados-wip-zafman-testing4-distro-basic-smithi/29146...
David Zafman
12:42 AM Bug #26971 (Duplicate): failed to become clean before timeout expired

http://qa-proxy.ceph.com/teuthology/dzafman-2018-08-16_17:35:08-rados:thrash-wip-zafman-testing4-distro-basic-smith...
David Zafman
12:32 AM Bug #26970 (Resolved): src/osd/OSDMap.h: 1065: FAILED assert(__null != pool)

http://qa-proxy.ceph.com/teuthology/dzafman-2018-08-16_17:35:08-rados:thrash-wip-zafman-testing4-distro-basic-smith...
David Zafman

08/20/2018

11:19 PM Bug #22837 (Pending Backport): discover_all_missing() not always called during activating

Based on information from http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021512.html I'm marking ...
David Zafman
05:53 PM Feature #24232 (Fix Under Review): Add new command ceph mon status
Nathan Cutler

08/19/2018

03:12 PM Feature #26948 (Resolved): librados: add a way to get a count of omap vals in an iterator
https://github.com/ceph/ceph/pull/23593 Kefu Chai
01:58 PM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/kchai-2018-08-19_13:01:23-rados-wip-kefu-testing-2018-08-19-1812-distro-basic-mira/2925024/ Kefu Chai

08/17/2018

09:10 PM Backport #24359: mimic: osd: leaked Session on osd.7
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22339
merged
Yuri Weinstein
02:27 PM Bug #26958 (Resolved): osd/ReplicatedBackend.cc: 1321: FAILED assert(get_parent()->get_log().get_...
... Sage Weil
09:36 AM Bug #26880 (Pending Backport): ceph-base debian package compiled on ubuntu/xenial has unmet runti...
Nathan Cutler
03:20 AM Feature #26955: os/filestore: Add switch to turn on/off filestore dir splitting
https://github.com/ceph/ceph/pull/23460
1. Refined HashIndex::must_split() to be more readable.
2. Introduced a h...
Zhi Zhang
03:19 AM Feature #26955 (New): os/filestore: Add switch to turn on/off filestore dir splitting
We had done pre-split and increased split multiple, etc, at the beginning of building cluster in order to reduce the ... Zhi Zhang
12:16 AM Bug #25108: object errors found in be_select_auth_object() aren't logged the same
David Zafman

08/16/2018

10:46 PM Backport #26870 (Resolved): mimic: osd: segfaults under normal operation
Brad Hubbard
05:58 PM Bug #24612: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
/a/sage-2018-08-15_15:49:39-rados-wip-sage2-testing-2018-08-15-0731-distro-basic-smithi/2908178
Sage Weil

08/15/2018

11:40 PM Bug #25084 (Fix Under Review): Attempt to read object that can't be repaired loops forever
David Zafman
11:35 PM Backport #25227 (Resolved): luminous: OSD: still returning EIO instead of recovering objects on c...
David Zafman
02:36 PM Feature #26948 (Resolved): librados: add a way to get a count of omap vals in an iterator
We currently have functions like rados_read_op_omap_get_vals2 that hand back an iterator to a userland caller. There ... Jeff Layton

08/14/2018

10:43 PM Bug #24866: FAILED assert(0 == "past_interval start interval mismatch") in check_past_interval_bo...
Generally yes, but I haven't been able to reproduce to test a solution. I take it this has happened to you?
I'm h...
Sage Weil
01:34 PM Bug #24866: FAILED assert(0 == "past_interval start interval mismatch") in check_past_interval_bo...
Guys, is there a way for an OSD to recover from this error? Kuba Stańczak
09:58 PM Bug #26947 (Resolved): ENOENT on collection_move_rename from divergent activate
... Sage Weil
04:56 PM Bug #26940 (Fix Under Review): force-create-pg broken
https://github.com/ceph/ceph/pull/23572 Sage Weil
03:53 PM Bug #26940 (Resolved): force-create-pg broken
This commit -
https://github.com/ceph/ceph/commit/7797ed67d2f9140b7eb9f182b06d04233e1e309c
has introduced regressio...
Sage Weil
04:33 AM Backport #26908 (Need More Info): luminous: kv: MergeOperator name() returns string, and caller c...
Prashant D
04:33 AM Backport #26908 (In Progress): luminous: kv: MergeOperator name() returns string, and caller call...
https://github.com/ceph/ceph/pull/23566 Prashant D

08/13/2018

06:46 PM Backport #26932 (Resolved): luminous: scrub livelock
https://github.com/ceph/ceph/pull/24396 (initial backport)
https://github.com/ceph/ceph/pull/24659 (follow-on fix)
Nathan Cutler
06:46 PM Backport #26931 (Resolved): mimic: scrub livelock
https://github.com/ceph/ceph/pull/23722 Nathan Cutler
06:01 PM Bug #26890 (Pending Backport): scrub livelock
Sage Weil
07:38 AM Bug #20059: miscounting degraded objects
Just adding another reference to #21803 here — this fix was meant to fix that issue as well, which it apparently did ... Florian Haas
03:14 AM Bug #23352: osd: segfaults under normal operation
Brad Hubbard wrote:
> I've created a test package here based on 12.2.7 and including the one line patch above.
>
...
lin zhou
12:59 AM Feature #24232: Add new command ceph mon status
PR: https://github.com/ceph/ceph/pull/23525 Hsiao-Yin Tseng
 
