Activity

From 07/12/2018 to 08/10/2018

08/10/2018

08:13 PM Bug #23352: osd: segfaults under normal operation
https://github.com/ceph/ceph/pull/23459 merged Yuri Weinstein
04:54 AM Bug #12615: Repair of Erasure Coded pool with an unrepairable object causes pg state to lose clea...
In a replicated case in which all copies are bad, a rep_repair_primary_object() can cause loss of clean and ins... David Zafman
04:44 AM Bug #25084: Attempt to read object that can't be repaired loops forever
I don't think we should backport this change. In Luminous and possibly upgraded to Mimic there is a possibility that... David Zafman
12:01 AM Bug #25084: Attempt to read object that can't be repaired loops forever
https://github.com/ceph/ceph/pull/23518 David Zafman
02:54 AM Bug #26875 (Pending Backport): kv: MergeOperator name() returns string, and caller calls c_str() ...
Kefu Chai
02:47 AM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/kchai-2018-08-09_12:29:04-rados-wip-kefu-testing-2018-08-08-1144-distro-basic-smithi/2885459/ Kefu Chai
12:04 AM Bug #19753: Deny reservation if expected backfill size would put us over backfill_full_ratio
https://github.com/ceph/ceph/pull/22797 David Zafman

08/09/2018

11:52 PM Bug #25084 (In Progress): Attempt to read object that can't be repaired loops forever
What I actually ran into is that when do_read() fails because of the CRC mismatch, the recovery repair can pull from ... David Zafman
07:56 PM Backport #24333 (In Progress): luminous: local_reserver double-reservation of backfilled pg
PR: https://github.com/ceph/ceph/pull/23493 Victor Denisov
06:37 PM Feature #21366 (Resolved): tools/ceph-objectstore-tool: split filestore directories offline to ta...
David Zafman
06:37 PM Backport #24845 (Resolved): luminous: tools/ceph-objectstore-tool: split filestore directories of...
David Zafman
02:00 PM Bug #26891 (New): backfill reservation deadlock/stall

on backfill target:
- get backfill request, queue RequestBackfillPrio...
Sage Weil
01:34 PM Bug #26890: scrub livelock
https://github.com/ceph/ceph/pull/23512 Sage Weil
01:32 PM Bug #26890 (Resolved): scrub livelock
- both osds locally reserve a scrub slot
- both osds send a scrub schedule request
- both scrub requests are reject...
Sage Weil
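The sequence in this entry can be illustrated with a toy model. This is only a sketch under assumptions: the `Osd` class, the `run_rounds` helper, the single scrub slot per OSD, and the lockstep timing are all hypothetical simplifications, not Ceph's actual scrub reservation code.

```python
class Osd:
    """Toy OSD holding a fixed number of scrub slots (hypothetical model)."""

    def __init__(self, name, slots=1):
        self.name = name
        self.free_slots = slots

    def try_reserve(self):
        # Used both for the local reservation and for a peer's request.
        if self.free_slots > 0:
            self.free_slots -= 1
            return True
        return False

    def release(self):
        self.free_slots += 1


def run_rounds(rounds, slots=1):
    """Run both OSDs in lockstep and count scrubs that got both reservations."""
    a, b = Osd("osd.a", slots), Osd("osd.b", slots)
    completed = 0
    for _ in range(rounds):
        # Step 1: both OSDs reserve a local slot.
        a_local = a.try_reserve()
        b_local = b.try_reserve()
        # Step 2: each asks the other for a remote slot.
        a_remote = a_local and b.try_reserve()
        b_remote = b_local and a.try_reserve()
        completed += int(a_remote) + int(b_remote)
        # Step 3: everything is released and the next round retries.
        if a_local:
            a.release()
        if b_local:
            b.release()
        if a_remote:
            b.release()
        if b_remote:
            a.release()
    return completed


stalled = run_rounds(5)            # one slot per OSD
progress = run_rounds(5, slots=2)  # two slots per OSD
```

In this model, with one slot per OSD every round ends with both requests rejected, so no scrub ever completes even though both OSDs stay busy retrying. The second call only shows that the stall depends on the slot being held locally before the remote request is made.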
08:03 AM Bug #26880: ceph-base debian package compiled on ubuntu/xenial has unmet runtime dependencies
Full info for ceph-base package:... Piotr Dalek
08:00 AM Bug #26880: ceph-base debian package compiled on ubuntu/xenial has unmet runtime dependencies
Tried on fresh Ubuntu 16.04 vm to build Ceph packages for master branch, resulting .debs still depend on libstdc++6 (... Piotr Dalek
07:57 AM Bug #26880: ceph-base debian package compiled on ubuntu/xenial has unmet runtime dependencies
As per Piotr Dałek, we can reproduce this issue on master even with the fix. Kefu Chai

08/08/2018

09:42 PM Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
I think we need to fix this sooner rather than later. My suggestion is to incorporate enough of the original rocksdb... Sage Weil
09:10 PM Bug #26878 (Closed): `osd destroy` command hangs
NOTABUG. :)
Presumably will have to update the ceph-volume tests but the louder notification PR is well on its way t...
Greg Farnum
06:17 PM Bug #26878: `osd destroy` command hangs
master PR https://github.com/ceph/ceph/pull/23492 Alfredo Deza
12:03 PM Bug #26878 (Closed): `osd destroy` command hangs
Running latest master without a manager daemon makes `osd destroy` commands hang.
ceph version 14.0.0-1906-g637bb2...
Alfredo Deza
06:52 PM Feature #1126 (Rejected): crush: extend rule definition
actually, you can do the above, just set size=3 and you'll get 2 in first rack and 1 in second rack. Sage Weil
06:49 PM Feature #85 (Fix Under Review): osd: pg_num shrink
https://github.com/ceph/ceph/pull/20469 Sage Weil
06:33 PM Feature #84 (In Progress): mon: auto adjust pg_num as pool grows
Sage Weil
05:16 PM Backport #24845: luminous: tools/ceph-objectstore-tool: split filestore directories offline to ta...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23418
merged
Yuri Weinstein
12:50 PM Backport #26881 (In Progress): mimic: ceph-base debian package compiled on ubuntu/xenial has unme...
Kefu Chai
12:44 PM Backport #26881 (Resolved): mimic: ceph-base debian package compiled on ubuntu/xenial has unmet r...
https://github.com/ceph/ceph/pull/23490 Kefu Chai
12:44 PM Bug #26880 (Pending Backport): ceph-base debian package compiled on ubuntu/xenial has unmet runti...
https://github.com/ceph/ceph/pull/22990
https://github.com/ceph/ceph/pull/23432
Kefu Chai
12:30 PM Bug #26880 (Resolved): ceph-base debian package compiled on ubuntu/xenial has unmet runtime depen...
... Kefu Chai
08:34 AM Backport #26839 (In Progress): mimic: librados application's symbol could conflict with the libce...
-https://github.com/ceph/ceph/pull/23484- Prashant D
08:32 AM Backport #26840 (In Progress): luminous: librados application's symbol could conflict with the li...
https://github.com/ceph/ceph/pull/23483 Prashant D
03:35 AM Bug #25209 (Resolved): cls/test_cls_numops.sh aborts
Kefu Chai
01:53 AM Bug #26875 (Fix Under Review): kv: MergeOperator name() returns string, and caller calls c_str() ...
https://github.com/ceph/ceph/pull/23477 Kefu Chai

08/07/2018

11:06 PM Bug #23857: flush (manifest) vs async recovery causes out of order op
/a/yuriw-2018-08-06_20:38:17-rados-wip_master_8_6_2018-distro-basic-smithi/2873966/
the order of events here:
<...
Neha Ojha
10:02 PM Bug #26875 (Resolved): kv: MergeOperator name() returns string, and caller calls c_str() on the t...
On Tue, 7 Aug 2018, Réka Nikolett Kovács wrote:
> Hi,
>
> I am working on a bug finding tool that looks for a ...
Sage Weil
07:26 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
ubuntu@mastercontroller01:~$ ceph -s
cluster:
id: dc00b525-7dca-435a-bfa6-c0b9b216e1f2
health: HEALT...
Dexter John Genterone
07:24 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
attaching new osd log. Dexter John Genterone
06:12 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
We've encountered this again when we were adding a new OSD. Couldn't get the gdb as there was none installed and the ... Dexter John Genterone
06:21 PM Bug #24866: FAILED assert(0 == "past_interval start interval mismatch") in check_past_interval_bo...
attaching OSD log. Dexter John Genterone
06:19 PM Bug #24866: FAILED assert(0 == "past_interval start interval mismatch") in check_past_interval_bo...
We encountered this issue after trying out a patch for https://tracker.ceph.com/issues/21142.
Is it safe to bypas...
Dexter John Genterone
08:50 AM Bug #25108 (Fix Under Review): object errors found in be_select_auth_object() aren't logged the same
https://github.com/ceph/ceph/pull/23376/ Kefu Chai
03:32 AM Bug #26868 (Pending Backport): PGLog.cc: saw valgrind issues while accessing complete_to->version
Neha Ojha
02:02 AM Backport #26871 (In Progress): luminous: osd: segfaults under normal operation
https://github.com/ceph/ceph/pull/23459 Brad Hubbard
01:25 AM Backport #26871 (Resolved): luminous: osd: segfaults under normal operation
https://github.com/ceph/ceph/pull/23459 Brad Hubbard
02:01 AM Backport #26870 (In Progress): mimic: osd: segfaults under normal operation
https://github.com/ceph/ceph/pull/23458 Brad Hubbard
01:24 AM Backport #26870 (Resolved): mimic: osd: segfaults under normal operation
https://github.com/ceph/ceph/pull/23458 Brad Hubbard

08/06/2018

08:30 PM Backport #24495: luminous: osd: segv in Session::have_backoff
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22729
merged
Yuri Weinstein
08:24 PM Backport #24501: luminous: osd: eternal stuck PG in 'unfound_recovery'
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22546
merged
Yuri Weinstein
06:49 PM Bug #24613: luminous: rest/test.py fails with expected 200, got 400
This one looks similar.
/a/yuriw-2018-08-03_19:54:05-rados-wip-yuri-testing-2018-08-03-1639-luminous-distro-basic-...
Neha Ojha
06:46 PM Bug #26868 (Fix Under Review): PGLog.cc: saw valgrind issues while accessing complete_to->version
https://github.com/ceph/ceph/pull/23450 Neha Ojha
06:35 PM Bug #26868 (In Progress): PGLog.cc: saw valgrind issues while accessing complete_to->version
Neha Ojha
06:28 PM Bug #26868 (Resolved): PGLog.cc: saw valgrind issues while accessing complete_to->version
This occurred during a rados run of https://tracker.ceph.com/issues/24988. This failure has not been seen on master o... Neha Ojha
02:52 PM Bug #23352 (Pending Backport): osd: segfaults under normal operation
Kefu Chai
02:51 PM Bug #24875 (Pending Backport): OSD: still returning EIO instead of recovering objects on checksum...
Kefu Chai

08/04/2018

09:58 PM Bug #24174: PrimaryLogPG::try_flush_mark_clean mixplaced ctx release
This was seen in luminous. Could this be related?... Neha Ojha

08/03/2018

11:45 PM Feature #24917: Gracefully deal with upgrades when bluestore skipping of data_digest becomes active

We need to wait to turn off data_digest once all OSDs are running bluestore AND we must disallow a filestore OSD to...
David Zafman
10:42 PM Bug #23492 (Resolved): Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-e...
David Zafman
10:42 PM Backport #24864 (Resolved): luminous: Abort in OSDMap::decode() during qa/standalone/erasure-code...
David Zafman
03:11 PM Backport #24864: luminous: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasu...
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23025
merged
Yuri Weinstein
10:40 PM Feature #24949 (Resolved): luminous: Allow scrub to fix Luminous 12.2.6 corruption of data_digest
David Zafman
10:39 PM Backport #25128 (Resolved): mimic: Allow scrub to fix Luminous 12.2.6 corruption of data_digest
David Zafman
10:39 PM Backport #26841 (Closed): mimic: luminous: Allow scrub to fix Luminous 12.2.6 corruption of data_...
David Zafman
04:02 PM Backport #26841 (Closed): mimic: luminous: Allow scrub to fix Luminous 12.2.6 corruption of data_...
Patrick Donnelly
10:35 PM Backport #25126 (Resolved): mimic: Allow repair of an object with a bad data_digest in object_inf...
David Zafman
10:35 PM Feature #25085 (Resolved): Allow repair of an object with a bad data_digest in object_info on all...
David Zafman
10:18 PM Bug #24875 (In Progress): OSD: still returning EIO instead of recovering objects on checksum errors
David Zafman
05:59 PM Backport #24888: luminous: osd: crash in OpTracker::unregister_inflight_op via OSD::get_health_me...
Radek, can you take a look at backporting this? Josh Durgin
04:02 PM Backport #26840 (Resolved): luminous: librados application's symbol could conflict with the libce...
https://github.com/ceph/ceph/pull/23483 Patrick Donnelly
04:02 PM Backport #26839 (Resolved): mimic: librados application's symbol could conflict with the libceph-...
https://github.com/ceph/ceph/pull/24708 Patrick Donnelly
03:24 PM Backport #23772: luminous: ceph status shows wrong number of objects
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22680
merged
Yuri Weinstein
03:22 PM Backport #24471: luminous: Ceph-osd crash when activate SPDK
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22686
merged
Yuri Weinstein
03:15 PM Backport #24772: luminous: osd: may get empty info at recovery
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22862
merged
Yuri Weinstein
12:57 PM Bug #25154 (Pending Backport): librados application's symbol could conflict with the libceph-common
Kefu Chai
08:10 AM Bug #24835: osd daemon spontaneous segfault
The problem still persists with Mimic 13.2.1 (on the same cluster as above). Errors in ceph::buffer::list appear to h... Soenke Schippmann
12:09 AM Backport #25199 (In Progress): luminous: FAILED assert(trim_to <= info.last_complete) in PGLog::t...
Neha Ojha
12:08 AM Backport #25219 (In Progress): luminous: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Neha Ojha
12:07 AM Backport #25200 (In Progress): mimic: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Neha Ojha
12:07 AM Backport #25220 (In Progress): mimic: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Neha Ojha
12:07 AM Backport #24989 (In Progress): mimic: Limit pg log length during recovery/backfill so that we don...
Neha Ojha
12:06 AM Bug #23352: osd: segfaults under normal operation
I've created a test package here based on 12.2.7 and including the one line patch above.
https://shaman.ceph.com/r...
Brad Hubbard

08/02/2018

11:57 PM Bug #23352 (In Progress): osd: segfaults under normal operation
https://github.com/ceph/ceph/pull/23404 Brad Hubbard
08:56 PM Bug #23352: osd: segfaults under normal operation
Brad - you can just use kjetil@medallia.com Kjetil Joergensen
08:26 AM Bug #23352: osd: segfaults under normal operation
Thanks Kjetil,
I think you are right, we should hold the lock in update_osd_health(). Not sure how we all missed tha...
Brad Hubbard
08:24 PM Bug #22330: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)
/ceph/teuthology-archive/pdonnell-2018-08-02_13:06:29-multimds-wip-pdonnell-testing-20180802.044402-testing-basic-smi... Patrick Donnelly
08:21 PM Bug #21931: osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range.first.get_...
Run with cores/logs: /ceph/teuthology-archive/pdonnell-2018-08-02_13:06:29-multimds-wip-pdonnell-testing-20180802.044... Patrick Donnelly
03:57 PM Bug #25182: Upmaps forgotten after restarting OSDs
Hmm, I wasn't able to reproduce this... Sage Weil
03:30 PM Bug #25182: Upmaps forgotten after restarting OSDs
It is expected that the upmaps may evaporate if the "raw" CRUSH mapping changes. This shouldn't happen for osd up/do... Sage Weil
03:59 AM Bug #24875: OSD: still returning EIO instead of recovering objects on checksum errors
*master PR*: https://github.com/ceph/ceph/pull/23377 Nathan Cutler
03:57 AM Backport #25227 (In Progress): luminous: OSD: still returning EIO instead of recovering objects o...
Nathan Cutler
03:56 AM Backport #25226 (In Progress): mimic: OSD: still returning EIO instead of recovering objects on c...
Nathan Cutler

08/01/2018

11:37 PM Backport #25227 (Resolved): luminous: OSD: still returning EIO instead of recovering objects on c...
https://github.com/ceph/ceph/pull/23379 David Zafman
11:32 PM Backport #25226 (Resolved): mimic: OSD: still returning EIO instead of recovering objects on chec...
https://github.com/ceph/ceph/pull/23378 David Zafman
11:26 PM Bug #25211 (Fix Under Review): bug in PerfCounters
Josh Durgin
12:23 PM Bug #25211: bug in PerfCounters
https://github.com/ceph/ceph/pull/23362 hongpeng lu
12:16 PM Bug #25211 (Resolved): bug in PerfCounters
When PerfCounters::inc() and read_avg() are called at the same time, the result may not be what we want.
show the c...
hongpeng lu
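One general way a concurrent increment and average read can disagree is a torn read, where the reader sees only half of a two-part update. The sketch below assumes the average is kept as a separate running total and count; `AvgCounter`, `inc_steps`, and the order of the two updates are hypothetical and are not taken from Ceph's PerfCounters code. A generator is used to pause the update halfway so the interleaving is deterministic.

```python
class AvgCounter:
    """Hypothetical average counter with a non-atomic two-step update."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def inc_steps(self, amount):
        # Yield after each half of the update so a caller can interleave a read.
        self.count += 1
        yield "count updated"
        self.total += amount
        yield "total updated"

    def read_avg(self):
        return self.total / self.count if self.count else 0.0


counter = AvgCounter()

# First update runs to completion: total == 10, count == 1.
for _ in counter.inc_steps(10):
    pass

# Second update is paused after the count changes but before the total does.
steps = counter.inc_steps(10)
next(steps)
torn = counter.read_avg()        # reads total == 10, count == 2

# Let the update finish and read again.
for _ in steps:
    pass
consistent = counter.read_avg()  # reads total == 20, count == 2
```

Every sample is 10, yet the read taken mid-update reports a lower average than the read taken after the update completes. Reading or writing both fields under one lock, or publishing them as a single unit, removes this window.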
10:58 PM Bug #24875: OSD: still returning EIO instead of recovering objects on checksum errors
David Zafman
10:18 PM Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
another option would be to only partially revert, and keep just the bits that ignore the older deleted log files. Sage Weil
02:03 PM Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
an alternative option is to whip up a tool to rebuild the manifest to remove the dummy File4 with kDeletedLogNumberHa... Kefu Chai
12:45 PM Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
it's a regression in rocksdb. the rocksdb in mimic (eaee6d3beab3429232ceb188377a3f94e844fca7) is f4a857da0b720691effc... Kefu Chai
06:28 AM Bug #25146: "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:parallel-master-di...
I created a vstart.sh cluster using the mimic branch, and ceph-monstore-tool from master is able to open it just fine.
...
Kefu Chai
09:54 PM Feature #24949: luminous: Allow scrub to fix Luminous 12.2.6 corruption of data_digest
mimic "backport" is actually a forward port from luminous Nathan Cutler
05:37 PM Feature #24949 (Pending Backport): luminous: Allow scrub to fix Luminous 12.2.6 corruption of dat...
David Zafman
09:50 PM Backport #25220 (Resolved): mimic: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
https://github.com/ceph/ceph/pull/23403 Nathan Cutler
09:50 PM Backport #25219 (Resolved): luminous: osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
https://github.com/ceph/ceph/pull/23211 Nathan Cutler
09:47 PM Bug #24484 (Resolved): osdc: wrong offset in BufferHead
Nathan Cutler
09:47 PM Backport #24584 (Resolved): luminous: osdc: wrong offset in BufferHead
Nathan Cutler
03:37 PM Backport #24584: luminous: osdc: wrong offset in BufferHead
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22865
merged
Yuri Weinstein
05:56 PM Bug #23352: osd: segfaults under normal operation
For MgrClient::update_osd_health, does the move-assignment compile into updating a pointer to a std::vector, or does ... Kjetil Joergensen
05:41 AM Bug #23352: osd: segfaults under normal operation
Just adding another "me too" on this. I've hit this on Luminous 12.2.7 also under Ubuntu 16.04.4 with 4.15.0-24-gener... Richard Bade
02:08 AM Bug #23352: osd: segfaults under normal operation
... Brad Hubbard
05:43 PM Bug #25108: object errors found in be_select_auth_object() aren't logged the same
Kefu:
my concern is that, we don't reset object_error before moving to another ScrubMap. so once we identify an erro...
David Zafman
05:41 PM Bug #25108 (In Progress): object errors found in be_select_auth_object() aren't logged the same
David Zafman
05:38 PM Feature #25085 (Pending Backport): Allow repair of an object with a bad data_digest in object_inf...
David Zafman
05:38 PM Backport #25127 (Resolved): luminous: Allow repair of an object with a bad data_digest in object_...
David Zafman
03:44 PM Bug #25184 (Pending Backport): osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Neha Ojha
02:07 PM Bug #25209 (Fix Under Review): cls/test_cls_numops.sh aborts
-https://github.com/ceph/ceph/pull/23364-
i think https://github.com/ceph/ceph/pull/23432 is a better fix.
Kefu Chai
05:46 AM Bug #25209: cls/test_cls_numops.sh aborts
i think we should revert https://github.com/ceph/ceph/pull/22990 Kefu Chai
05:44 AM Bug #25209: cls/test_cls_numops.sh aborts
... Kefu Chai
05:28 AM Bug #25209 (Resolved): cls/test_cls_numops.sh aborts
... Kefu Chai
01:06 PM Bug #25181 (Duplicate): /mon/OSDMonitor.cc: 1821: FAILED assert(osdmap_manifest.pinned.empty())
Sage Weil
01:06 PM Bug #24612: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
/a/sage-2018-07-31_21:57:28-rados-wip-sage-testing-2018-07-31-1436-distro-basic-smithi/2844443
/a/sage-2018-07-30_13...
Sage Weil

07/31/2018

10:53 PM Bug #25174: osd: assert failure with FAILED assert(repop_queue.front() == repop) In function 'vo...
Do we have logs for this failure somewhere? Neha Ojha
10:47 PM Backport #25199: luminous: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
This is dependent on a couple of other backports. Assigning it to myself. Neha Ojha
10:45 PM Backport #25199 (Resolved): luminous: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
https://github.com/ceph/ceph/pull/23211 Nathan Cutler
10:45 PM Backport #24068 (Resolved): luminous: osd sends op_reply out of order
Nathan Cutler
07:48 PM Backport #24068: luminous: osd sends op_reply out of order
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/23137
merged
Yuri Weinstein
10:45 PM Backport #25204 (Resolved): mimic: rados python bindings use prval from stack
https://github.com/ceph/ceph/pull/23863 Nathan Cutler
10:45 PM Backport #25203 (Resolved): luminous: rados python bindings use prval from stack
https://github.com/ceph/ceph/pull/23864 Nathan Cutler
10:45 PM Backport #25200 (Resolved): mimic: FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
https://github.com/ceph/ceph/pull/23403 Nathan Cutler
09:24 PM Bug #25198 (Pending Backport): FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
Sage Weil
06:27 PM Bug #25198 (Fix Under Review): FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
https://github.com/ceph/ceph/pull/23354 Neha Ojha
05:48 PM Bug #25198 (Resolved): FAILED assert(trim_to <= info.last_complete) in PGLog::trim()
... Neha Ojha
08:02 PM Bug #23352: osd: segfaults under normal operation
Latest crash just happened here, no messages in the OSD log, but a crash dump is generated and dmesg shows:
[Tue...
Alex Gorbachev
07:25 PM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/sage-2018-07-31_14:52:20-rados:thrash-wip-sage2-testing-2018-07-30-1049-distro-basic-smithi/2843268 Sage Weil
07:08 PM Bug #25175 (Pending Backport): rados python bindings use prval from stack
https://github.com/ceph/ceph/pull/23334 Sage Weil
03:03 PM Bug #25194 (Can't reproduce): Negative stats found by deep-scrub

http://pulpito.ceph.com/dzafman-2018-07-30_12:09:07-rados-wip-zafman-testing-distro-basic-smithi/2839428
log_cha...
David Zafman
02:52 PM Feature #21710: add wildcard for namespaces
Not at all. Douglas Fuller
12:43 AM Feature #21710: add wildcard for namespaces
Hi Douglas, I started in on this and forgot to reassign the ticket! Mind if I take this one? Jesse Williamson
03:57 AM Tasks #25186 (In Progress): setup repo for building dependencies like boost, rocksdb, which are n...
we need to build boost, spdk, dpdk, fio, rocksdb, gperftools, seastar for preparing the build dependencies for each P... Kefu Chai
12:08 AM Bug #25184 (Fix Under Review): osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
Neha Ojha

07/30/2018

11:50 PM Bug #25184 (Resolved): osd/PGLog.cc: use lgeneric_subdout instead of generic_dout
https://github.com/ceph/ceph/pull/23340 Neha Ojha
09:36 PM Bug #25182 (Resolved): Upmaps forgotten after restarting OSDs
Problem:
I have a small cluster at home and I noticed that during the upgrade from 12.2.5 -> 12.2.7 and the upgrade ...
Bryan Stillwell
08:47 PM Backport #25178 (In Progress): mimic: rados: not all exceptions accept keyargs
Nathan Cutler
07:23 PM Backport #25178 (Resolved): mimic: rados: not all exceptions accept keyargs
https://github.com/ceph/ceph/pull/23335 Nathan Cutler
08:17 PM Bug #24485: LibRadosTwoPoolsPP.ManifestUnset failure
/a/sage-2018-07-30_13:46:50-rados-wip-sage3-testing-2018-07-28-1512-distro-basic-smithi/2838971 Sage Weil
08:16 PM Bug #25181 (Duplicate): /mon/OSDMonitor.cc: 1821: FAILED assert(osdmap_manifest.pinned.empty())
... Sage Weil
07:26 PM Bug #25112: osd,mon: increase mon_max_pg_per_osd to 250
Please note that the value has been changed from 300->250 for this tracker. The PR reflects the correct value. Neha Ojha
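For context, here is a rough sketch of how a cluster's per-OSD PG load could be compared against a limit such as 250. The helper name, the pool tuples, and the simple cluster-wide average are assumptions for illustration; the monitor's own accounting may differ.

```python
def pgs_per_osd(pools, num_osds):
    """Average PG replicas per OSD: sum(pg_num * size) / num_osds."""
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    return sum(pg_num * size for pg_num, size in pools) / num_osds


LIMIT = 250  # the value named in this tracker entry

# Hypothetical pools as (pg_num, replica size) pairs.
pools = [(512, 3), (256, 3)]

within_limit = pgs_per_osd(pools, num_osds=10) <= LIMIT
over_limit = pgs_per_osd(pools, num_osds=8) > LIMIT
```

With these made-up pools, ten OSDs stay under the limit while eight OSDs exceed it, which is the kind of boundary the lower default shifts.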
02:23 PM Bug #25112 (Pending Backport): osd,mon: increase mon_max_pg_per_osd to 250
Kefu Chai
07:23 PM Backport #25177 (Resolved): luminous: osd,mon: increase mon_max_pg_per_osd to 300
https://github.com/ceph/ceph/pull/23862 Nathan Cutler
07:23 PM Backport #25176 (Resolved): mimic: osd,mon: increase mon_max_pg_per_osd to 300
https://github.com/ceph/ceph/pull/23861 Nathan Cutler
07:15 PM Bug #25175 (Resolved): rados python bindings use prval from stack
these methods include
- omap_get_vals
- omap_get_keys
- omap-get-vals-by-keys
Sage Weil
06:53 PM Bug #24686 (Resolved): change default filestore_merge_threshold to -10
Nathan Cutler
06:53 PM Backport #24748 (Resolved): luminous: change default filestore_merge_threshold to -10
Nathan Cutler
04:45 PM Backport #24748: luminous: change default filestore_merge_threshold to -10
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22814
merged
Yuri Weinstein
06:51 PM Backport #24083 (Resolved): luminous: rados: not all exceptions accept keyargs
Nathan Cutler
04:43 PM Backport #24083: luminous: rados: not all exceptions accept keyargs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22979
merged
Yuri Weinstein
06:35 PM Bug #25174 (Can't reproduce): osd: assert failure with FAILED assert(repop_queue.front() == repop...
branch: luminous
description: rados:downstream:singleton/{all/ec-lost-unfound.yaml msgr-failures/many.yaml
...
Shylesh Kumar
05:07 PM Bug #25153 (Fix Under Review): output format is invalid of the crush tree json dumper
Patrick Donnelly
11:56 AM Bug #25153: output format is invalid of the crush tree json dumper
Reference the pull request: https://github.com/ceph/ceph/pull/23319 Oshyn Song
11:50 AM Bug #25153 (Resolved): output format is invalid of the crush tree json dumper
The output json string is invalid for "ceph osd crush tree --format=json" command. It contains an array of "nodes" an... Oshyn Song
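A quick way to check whether command output parses as JSON is to feed it to a strict parser. This is a minimal sketch: `is_valid_json` is a hypothetical helper, and the two sample strings are made up for illustration; they are not the actual output of the command named in this entry.

```python
import json


def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical samples: one well-formed object, and two objects run together.
well_formed = '{"nodes": [], "stray": []}'
concatenated = '{"nodes": []}{"stray": []}'

ok = is_valid_json(well_formed)
broken = is_valid_json(concatenated)
```

The same check can be applied to captured command output before handing it to tooling that expects a single JSON document.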
01:42 PM Bug #25155 (Can't reproduce): mon crash from 'ceph osd erasure-code-profile set lrcprofile name=l...
... Sage Weil
12:15 PM Bug #25154: librados application's symbol could conflict with the libceph-common
https://github.com/ceph/ceph/pull/23320 Kefu Chai
12:15 PM Bug #25154 (Resolved): librados application's symbol could conflict with the libceph-common
quoting from Zongyou Yao's mail from ceph-devel ML
> Internally, we have a program using librados C++ api to perio...
Kefu Chai
07:20 AM Bug #24785 (Resolved): mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev=dm-0 and dm-1
Boris Ranto
07:20 AM Backport #25143 (Resolved): luminous: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd ...
Merged. Boris Ranto
07:19 AM Backport #25142 (Resolved): mimic: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev...
Merged. Boris Ranto

07/28/2018

07:55 PM Bug #20798: LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
/a/sage-2018-07-27_22:50:28-rados-wip-sage-testing-2018-07-27-0744-distro-basic-smithi/2826326 Sage Weil
02:46 PM Bug #25146 (Resolved): "rocksdb: Corruption: Can't access /000000.sst" in upgrade:mimic-x:paralle...
This is on the new mimic-x suite https://github.com/ceph/ceph/pull/23292
Run: http://pulpito.ceph.com/yuriw-2018-07-27_2...
Yuri Weinstein
09:12 AM Backport #25143 (In Progress): luminous: mimic selinux denials comm="tp_fstore_op / comm="ceph-o...
Nathan Cutler
09:11 AM Backport #25143 (Resolved): luminous: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd ...
https://github.com/ceph/ceph/pull/23296 Nathan Cutler
09:11 AM Backport #25142 (In Progress): mimic: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd ...
Nathan Cutler
09:10 AM Backport #25142 (Resolved): mimic: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev...
https://github.com/ceph/ceph/pull/23295 Nathan Cutler
09:11 AM Backport #25145 (Resolved): luminous: Automatically set expected_num_objects for new pools with >...
https://github.com/ceph/ceph/pull/24395 Nathan Cutler
09:11 AM Backport #25144 (Resolved): mimic: Automatically set expected_num_objects for new pools with >=10...
https://github.com/ceph/ceph/pull/23860 Nathan Cutler

07/27/2018

11:55 PM Bug #24785: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev=dm-0 and dm-1
Mimic back-port:
https://github.com/ceph/ceph/pull/23295
Luminous back-port:
https://github.com/ceph/ceph/pu...
Boris Ranto
10:18 PM Bug #24785 (Pending Backport): mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev=dm-...
Boris Ranto
07:01 AM Bug #24785 (Fix Under Review): mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev=dm-...
Boris Ranto
07:00 AM Bug #24785: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev=dm-0 and dm-1
The manual testing suggests this should fix this issue:
https://github.com/ceph/ceph/pull/23278
Boris Ranto
03:03 AM Bug #23352: osd: segfaults under normal operation
Dan van der Ster wrote:
> Can we see that state from the coredump somehow? Basically none of our clusters should hav...
Brad Hubbard

07/26/2018

07:35 PM Backport #23670 (Need More Info): luminous: auth: ceph auth add does not sanity-check caps
non-trivial backport. One attempt was already made - https://github.com/ceph/ceph/pull/21361 - but it was implicated ... Nathan Cutler
07:31 PM Backport #23670 (Rejected): luminous: auth: ceph auth add does not sanity-check caps
see discussion in https://github.com/ceph/ceph/pull/21361 Nathan Cutler
07:17 PM Feature #24949: luminous: Allow scrub to fix Luminous 12.2.6 corruption of data_digest
https://github.com/ceph/ceph/pull/23236
Includes backport from master of https://github.com/ceph/ceph/pull/23217
David Zafman
07:15 PM Backport #25128 (Resolved): mimic: Allow scrub to fix Luminous 12.2.6 corruption of data_digest
https://github.com/ceph/ceph/pull/23272
(includes backport of https://tracker.ceph.com/issues/25085 from master)
David Zafman
07:12 PM Backport #25127 (Resolved): luminous: Allow repair of an object with a bad data_digest in object_...
https://github.com/ceph/ceph/pull/23236 David Zafman
07:11 PM Backport #25126 (Resolved): mimic: Allow repair of an object with a bad data_digest in object_inf...
https://github.com/ceph/ceph/pull/23272 David Zafman
05:47 PM Bug #24687 (Pending Backport): Automatically set expected_num_objects for new pools with >=100 PG...
Douglas Fuller
05:17 PM Bug #24687: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Removed pgcalc message while pgcalc updates are considered Douglas Fuller
05:19 PM Cleanup #25124 (New): Add message to consult pgcalc for expected_num_objects
Currently we warn the user when attempting to create a filestore pool that appears to be intended to store a large nu... Douglas Fuller
03:58 PM Bug #25106: Ceph-osd coredumps on launch
Either the patch here (https://github.com/ceph/ceph/pull/22954) doesn't fix the bug, or this is not a duplicate iss...
Michael Jones
03:58 AM Bug #25108: object errors found in be_select_auth_object() aren't logged the same

I ran a subtest of osd-scrub-repair based on pull request https://github.com/ceph/ceph/pull/23217. I also added a ...
David Zafman
03:18 AM Bug #24664: osd: crash in OpTracker::unregister_inflight_op via OSD::get_health_metrics
Need help with the luminous backport, which is needed to fix a failure in upgrade/luminous-x. Nathan Cutler
12:40 AM Bug #25112 (Fix Under Review): osd,mon: increase mon_max_pg_per_osd to 250
https://github.com/ceph/ceph/pull/23251 Neha Ojha
12:19 AM Bug #25112 (Resolved): osd,mon: increase mon_max_pg_per_osd to 250
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1603615 Neha Ojha
12:21 AM Bug #25076: MON crash when upgrading luminous v12.2.7 -> mimic v13.2.0 during ceph-fuse task
It appears the crash can be explained on the same basis as in the case of "bug #24664":https://tracker.ceph.com/issue... Radoslaw Zarzynski

07/25/2018

10:41 PM Bug #25106 (Duplicate): Ceph-osd coredumps on launch
this will be fixed in 13.2.1 Josh Durgin
06:10 PM Bug #25106 (Duplicate): Ceph-osd coredumps on launch
See https://tracker.ceph.com/issues/24993
The problem:
ceph-volume lvm create --bluestore --data /dev/sda <- wor...
Michael Jones
10:39 PM Bug #24667: osd: SIGSEGV in MMgrReport::encode_payload
downgrading due to lack of recurrence Josh Durgin
07:59 PM Bug #25108 (Resolved): object errors found in be_select_auth_object() aren't logged the same

object errors found in be_select_auth_object() aren't logged the same as errors found in be_compare_scrub_objects()...
David Zafman
01:59 PM Backport #25101 (In Progress): mimic: jewel->luminous: osdmap crc mismatch
Nathan Cutler
01:58 PM Backport #25101 (Resolved): mimic: jewel->luminous: osdmap crc mismatch
Nathan Cutler
01:57 PM Backport #25101 (Resolved): mimic: jewel->luminous: osdmap crc mismatch
https://github.com/ceph/ceph/pull/23226 Nathan Cutler
01:58 PM Backport #25100 (Resolved): luminous: jewel->luminous: osdmap crc mismatch
Nathan Cutler
01:57 PM Backport #25100 (In Progress): luminous: jewel->luminous: osdmap crc mismatch
Nathan Cutler
01:56 PM Backport #25100 (Resolved): luminous: jewel->luminous: osdmap crc mismatch
https://github.com/ceph/ceph/pull/23227 Nathan Cutler
12:27 PM Bug #25057: jewel->luminous: osdmap crc mismatch
luminous: https://github.com/ceph/ceph/pull/23227
mimic: https://github.com/ceph/ceph/pull/23226
Sage Weil
12:07 PM Bug #25057: jewel->luminous: osdmap crc mismatch
The problem was that CRUSH_TUNABLES5 was associated with kraken instead of jewel in 0ceb5c0, backported to luminous ... Sage Weil
12:00 PM Bug #25057 (Pending Backport): jewel->luminous: osdmap crc mismatch
https://github.com/ceph/ceph/pull/23220 Sage Weil
09:01 AM Bug #23352: osd: segfaults under normal operation
Dan van der Ster wrote:
> * The OSD health metric changes sure are a juicy candidate to be the root cause -- but we ...
Dan van der Ster
08:11 AM Bug #23352: osd: segfaults under normal operation
Brad Hubbard wrote:
> I was also thinking that, since the OSDHealthMetric related code only triggers when there are ...
Dan van der Ster
02:37 AM Bug #23352: osd: segfaults under normal operation
Thanks Roberto,
Your core, as well as the last uploaded by Alex show the now familiar corruption to the vtable of ...
Brad Hubbard

07/24/2018

08:54 PM Backport #24988: luminous: Limit pg log length during recovery/backfill so that we don't run out ...
https://github.com/ceph/ceph/pull/23211 Neha Ojha
06:34 PM Backport #24988 (In Progress): luminous: Limit pg log length during recovery/backfill so that we ...
Neha Ojha
08:52 PM Feature #25085 (In Progress): Allow repair of an object with a bad data_digest in object_info on ...
David Zafman
08:51 PM Feature #25085: Allow repair of an object with a bad data_digest in object_info on all replicas
https://github.com/ceph/ceph/pull/23217 David Zafman
08:46 PM Feature #25085 (Resolved): Allow repair of an object with a bad data_digest in object_info on all...

We've seen this due to a bug in Luminous 12.2.6, but it may have been seen in other cases.
David Zafman
08:44 PM Bug #25084 (Resolved): Attempt to read object that can't be repaired loops forever

If all replicas of an object are bad, this causes a loop of continuous recovery and calls to rep_repair_primary_objec...
David Zafman
07:18 PM Bug #25057 (In Progress): jewel->luminous: osdmap crc mismatch
Sage Weil
06:41 PM Bug #25057: jewel->luminous: osdmap crc mismatch
/a/teuthology-2018-07-20_04:23:01-upgrade:jewel-x-luminous-distro-basic-smithi/2799173
is an instance where the mo...
Sage Weil
11:07 AM Bug #25076 (Duplicate): MON crash when upgrading luminous v12.2.7 -> mimic v13.2.0 during ceph-fu...
Teuthology log: http://qa-proxy.ceph.com/teuthology/smithfarm-2018-07-24_02:10:24-upgrade:luminous-x-mimic-distro-bas... Nathan Cutler
09:27 AM Bug #23352: osd: segfaults under normal operation
Hi Brad,
We spotted this issue again in one of our clusters, just 2 hours after we upgraded from 12.2.5 -> 12.2.7...
Roberto Valverde
09:10 AM Bug #23352: osd: segfaults under normal operation
Thanks Alex,
I'll check out the core tomorrow and let you know.
I have been working on instrumenting the ceph-o...
Brad Hubbard
02:06 AM Documentation #4640 (Resolved): rados.8 should document import/export
https://github.com/ceph/ceph/pull/23186 Nathan Cutler

07/23/2018

09:52 PM Bug #24909: RBD client IOPS pool stats are incorrect (2x higher; includes IO hints as an op)
Jason Dillaman wrote:
> https://github.com/ceph/ceph/pull/23029
merged
Yuri Weinstein
09:40 PM Bug #21496 (Fix Under Review): doc: Manually editing a CRUSH map, Word 'type' missing.
https://github.com/ceph/ceph/pull/23192 Jos Collin
07:37 PM Bug #21496: doc: Manually editing a CRUSH map, Word 'type' missing.
Jos Collin wrote:
> Remy, Please create a PR.
Done.
Anonymous
05:16 PM Feature #1203: osd: priority or fairness osd operations
https://github.com/ceph/dmclock Patrick Donnelly
04:52 PM Support #24980: Pg Inconsistent - failed to pick suitable auth object
Patrick Donnelly wrote:
> Please seek assistance for these kinds of issues on ceph-users mailing list.
Hi Patrick...
Alon Avrahami
04:46 PM Support #24980 (Rejected): Pg Inconsistent - failed to pick suitable auth object
Please seek assistance for these kinds of issues on ceph-users mailing list. Patrick Donnelly
02:27 PM Bug #23352: osd: segfaults under normal operation
After the upgrade to 12.2.7 I am still seeing crashes on OSDs. Please check and advise if a separate tracker should b... Alex Gorbachev
10:28 AM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
Robert Sander wrote:
> On the production cluster the RBD pool is affected. Do I really need to stop the VMs and do...
Brad Hubbard
09:54 AM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
Brad Hubbard wrote:
> For the data_digest_mismatch_info error with client activity stopped, read the data from thi...
Robert Sander
09:18 AM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
Oops, my mistake, terribly sorry. I gave you the procedure for an omap_digest_mismatch_info error.
For the data_di...
Brad Hubbard
08:10 AM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
Brad Hubbard wrote:
> 1. rados -p [name_of_pool_2] setomapval rbd_data.4048d8238e1f29.00000000000002e6 temporary-k...
Robert Sander
10:09 AM Bug #24835: osd daemon spontaneous segfault
After spending a week trying to get Ubuntu/systemd to allow a core dump to be created, we finally have two different ... Christian Schlittchen

07/22/2018

10:09 PM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
In the case of pg 2.34 above where the only error is "data_digest_mismatch_info" and all the data digests except the ... Brad Hubbard
12:55 PM Bug #25057 (Resolved): jewel->luminous: osdmap crc mismatch
The upgrade/jewel-x runs for 12.2.6 and 12.2.7 threw osdmap crc mismatch errors. Sage Weil

07/21/2018

06:32 PM Backport #25055 (In Progress): mimic: doc: http://docs.ceph.com/docs/mimic/rados/operations/pg-st...
Nathan Cutler
06:26 PM Backport #25055 (Resolved): mimic: doc: http://docs.ceph.com/docs/mimic/rados/operations/pg-states/
https://github.com/ceph/ceph/pull/23163 Nathan Cutler
04:12 PM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
I have the same issue Anton Neubauer
12:08 PM Bug #21496: doc: Manually editing a CRUSH map, Word 'type' missing.
Remy, Please create a PR. Jos Collin
11:56 AM Bug #24923 (Pending Backport): doc: http://docs.ceph.com/docs/mimic/rados/operations/pg-states/
https://github.com/ceph/ceph/pull/21520 Jos Collin

07/20/2018

11:42 PM Bug #24304 (Resolved): MgrStatMonitor decode crash on 12.2.4->12.2.5 upgrade
Josh Durgin
04:43 PM Bug #24785: mimic selinux denials comm="tp_fstore_op / comm="ceph-osd dev=dm-0 and dm-1
Running here: http://pulpito.ceph.com/vasu-2018-07-20_16:43:09-ceph-deploy-mimic-distro-basic-ovh/ Vasu Kulkarni
03:03 PM Bug #25017 (Duplicate): log [ERR] : 1.3 past_intervals [182,196) start interval does not contain ...
Josh Durgin
12:38 PM Bug #25017 (Duplicate): log [ERR] : 1.3 past_intervals [182,196) start interval does not contain ...
... Sage Weil
11:56 AM Bug #24938: luminous: rados listomapkeys & listomapvals don't return data.
This sounds familiar: http://tracker.ceph.com/issues/16211
We used the workaround to set and rm a dummy key/val and ...
Dan van der Ster
08:34 AM Bug #24938: luminous: rados listomapkeys & listomapvals don't return data.
Sorry that was just a single example to keep it short. listomapkeys doesn't return any data for any bucket in this cl... Magnus Grönlund
07:31 AM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
... Robert Sander
01:39 AM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
Can you post the output of 'rados list-inconsistent-obj 2.53 --format=json-pretty' ? Brad Hubbard
02:14 AM Bug #25011 (New): competing scrubs stuck reserving local -> remote
In this run: http://pulpito.ceph.com/yuriw-2018-07-18_20:14:43-rados-mimic-distro-basic-smithi/2794751/
osd.0 and ...
Josh Durgin
01:06 AM Backport #24068 (In Progress): luminous: osd sends op_reply out of order
Nathan Cutler
01:06 AM Backport #25010 (In Progress): mimic: osd sends op_reply out of order
Nathan Cutler
12:59 AM Backport #25010 (Resolved): mimic: osd sends op_reply out of order
https://github.com/ceph/ceph/pull/23136 Nathan Cutler

07/19/2018

09:59 PM Bug #23827: osd sends op_reply out of order
http://pulpito.ceph.com/yuriw-2018-07-18_21:37:13-powercycle-mimic-distro-basic-smithi/2796128/ indicates that this n... Neha Ojha
12:57 PM Bug #24994: active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
I have now added "osd skip data digest = true" as per release notes and restarted all OSDs.
I still have inconsist...
Robert Sander
08:36 AM Bug #24994 (New): active+clean+inconsistent PGs after Upgrade to 12.2.7 and deep scrub
Hi,
a deep scrub revealed 59 active+clean+inconsistent PGs at one customer's cluster and 50 active+clean+inconsist...
Robert Sander
06:11 AM Backport #24989 (Need More Info): mimic: Limit pg log length during recovery/backfill so that we ...
Nathan Cutler
06:10 AM Backport #24988 (Need More Info): luminous: Limit pg log length during recovery/backfill so that ...
Nathan Cutler
06:10 AM Backport #24992 (Resolved): mimic: valgrind-leaks.yaml: expected valgrind issues and found none
https://github.com/ceph/ceph/pull/23744 Nathan Cutler

07/18/2018

09:42 PM Backport #24989: mimic: Limit pg log length during recovery/backfill so that we don't run out of ...
We can hold off on this backport for now. Need to let this bake in master for a while. Neha Ojha
08:00 PM Backport #24989 (Resolved): mimic: Limit pg log length during recovery/backfill so that we don't ...
https://github.com/ceph/ceph/pull/23403 Nathan Cutler
09:42 PM Backport #24988: luminous: Limit pg log length during recovery/backfill so that we don't run out ...
We can hold off on this backport for now. Need to let this bake in master for a while.
Also, this backport is going ...
Neha Ojha
08:00 PM Backport #24988 (Resolved): luminous: Limit pg log length during recovery/backfill so that we don...
https://github.com/ceph/ceph/pull/23211 Nathan Cutler
09:38 PM Bug #24975 (Pending Backport): valgrind-leaks.yaml: expected valgrind issues and found none
This issue has been fixed in master by https://github.com/ceph/ceph/pull/22261
Needs to be backported to mimic.
Neha Ojha
09:14 PM Bug #24935 (Duplicate): SafeTimer? osd killed by kernel for Segmentation fault
This appears to be another instance of #23352. Josh Durgin
09:12 PM Bug #24938: luminous: rados listomapkeys & listomapvals don't return data.
Did you check that this bucket actually has any entries? These commands are tested in our suite. Greg Farnum
08:46 PM Bug #24990 (Resolved): api_watch_notify: LibRadosWatchNotify.Watch3Timeout failed
... Neha Ojha
06:10 PM Feature #23979 (Pending Backport): Limit pg log length during recovery/backfill so that we don't ...
Josh Durgin
04:15 PM Support #24980: Pg Inconsistent - failed to pick suitable auth object
Alon Avrahami wrote:
> Hi,
>
>
> We have ceph cluster installed with Luminous 12.2.2 using bluestore.
> All no...
Alon Avrahami
01:24 PM Support #24980 (Rejected): Pg Inconsistent - failed to pick suitable auth object
Hi,
We have ceph cluster installed with Luminous 12.2.2 using bluestore.
All nodes are Intel servers with 1.6TB...
Alon Avrahami
03:42 PM Backport #24472 (Resolved): mimic: Ceph-osd crash when activate SPDK
Nathan Cutler
02:32 PM Backport #24472: mimic: Ceph-osd crash when activate SPDK
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22684
merged
Yuri Weinstein
03:36 PM Bug #24950 (Resolved): Running osd_skip_data_digest in a mixed cluster is not ideal
Nathan Cutler
03:35 PM Backport #24865 (Resolved): mimic: Abort in OSDMap::decode() during qa/standalone/erasure-code/te...
Nathan Cutler
02:20 PM Backport #24865: mimic: Abort in OSDMap::decode() during qa/standalone/erasure-code/test-erasure-...
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/23024
merged
Yuri Weinstein
03:14 PM Backport #24951 (Resolved): mimic: Running osd_skip_data_digest in a mixed cluster is not ideal
David Zafman
02:24 PM Backport #24951: mimic: Running osd_skip_data_digest in a mixed cluster is not ideal
David Zafman wrote:
> https://github.com/ceph/ceph/pull/23084
merged
Yuri Weinstein
02:22 PM Bug #23965: FAIL: s3tests.functional.test_s3.test_multipart_upload_resend_part with ec cache pools
https://github.com/ceph/ceph/pull/23096 merged Yuri Weinstein
11:20 AM Documentation #20894 (Resolved): rados manpage does not document "cleanup"
https://github.com/ceph/ceph/pull/16777 Nathan Cutler

07/17/2018

10:48 PM Bug #24975 (Resolved): valgrind-leaks.yaml: expected valgrind issues and found none
... Neha Ojha
10:43 PM Bug #24974 (New): Segmentation fault in tcmalloc::ThreadCache::ReleaseToCentralCache()
... Neha Ojha
08:32 PM Backport #24583 (Resolved): mimic: osdc: wrong offset in BufferHead
Nathan Cutler
08:10 PM Backport #24583: mimic: osdc: wrong offset in BufferHead
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22869
merged
Yuri Weinstein
06:21 PM Feature #23979 (Fix Under Review): Limit pg log length during recovery/backfill so that we don't ...
https://github.com/ceph/ceph/pull/23098 Neha Ojha
05:39 PM Bug #24687: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Neha Ojha
01:37 PM Bug #20645 (Closed): bluesfs wal failed to allocate (assert(0 == "allocate failed... wtf"))
Igor Fedotov
09:58 AM Bug #24956 (Resolved): osd: parent process need to restart log service after fork, or ceph-osd wi...
ceph-osd parent process needs to restart the log service after fork, or ceph-osd will not work correctly when the option l... mingshuai wang

07/16/2018

09:18 PM Bug #24950: Running osd_skip_data_digest in a mixed cluster is not ideal
https://github.com/ceph/ceph/pull/23083 David Zafman
09:14 PM Bug #24950 (Resolved): Running osd_skip_data_digest in a mixed cluster is not ideal

If osd_skip_data_digest in a mixed BlueStore/FileStore cluster is dangerous because we lose data_digest integrity ...
David Zafman
09:17 PM Backport #24951 (Resolved): mimic: Running osd_skip_data_digest in a mixed cluster is not ideal
https://github.com/ceph/ceph/pull/23084 David Zafman
09:08 PM Feature #24949 (Resolved): luminous: Allow scrub to fix Luminous 12.2.6 corruption of data_digest

I'm thinking that while osd_distrust_data_digest=true we should automatically ignore data_digest errors and repair ...
David Zafman
07:36 PM Bug #23352: osd: segfaults under normal operation
We actually got one on July 15: Jul 14 23:54:42 roc04r-sc3a080 kernel: [6988357.283555] safe_timer[19917]: segfault a... Alex Gorbachev
03:54 AM Bug #23352: osd: segfaults under normal operation
The latest core uploaded by Dan in comment 66 is slightly different to the others we've seen so far.
Once again th...
Brad Hubbard
02:24 PM Bug #24687: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
https://github.com/ceph/ceph/pull/23072 Douglas Fuller
02:24 PM Bug #24687 (Fix Under Review): Automatically set expected_num_objects for new pools with >=100 PG...
Because a value for expected_num_objects is too difficult to determine automatically, instead we print a suggestion t... Douglas Fuller
11:16 AM Bug #24938 (New): luminous: rados listomapkeys & listomapvals don't return data.
Hi,
rados listomapkeys & rados listomapvals don't return data when running Luminous, tested on 12.2.4 and 12.2.6:
...
Magnus Grönlund
08:52 AM Bug #24935 (Duplicate): SafeTimer? osd killed by kernel for Segmentation fault
My environment :
[root@gz-ceph-52-203 log]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@gz-...
伟杰 谭
12:57 AM Bug #18209: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
Noting the same issue, per ceph-users list post:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-July/028...
David Young

07/15/2018

05:46 AM Documentation #24924 (Resolved): doc: typo in crush-map docs
Each time the OSD starts, it verifies it is in the correct location in the CRUSH map and, if it is not, it moved its... Michael Jones

07/14/2018

09:04 PM Bug #24923 (Resolved): doc: http://docs.ceph.com/docs/mimic/rados/operations/pg-states/
Undersized
The placement group fewer copies than the configured pool replication level.
Missing "has"
Michael Jones
07:57 PM Bug #23871: luminous->mimic: missing primary copy of xxx, wil try copies on 3, then full-object r...
For the luminous regression, this will reproduce the issue:... Sage Weil

07/13/2018

11:02 PM Feature #24917 (New): Gracefully deal with upgrades when bluestore skipping of data_digest become...

Once the data_digest is no longer being used, but is still set from an earlier version, we can get EIO from read bu...
David Zafman
09:26 PM Backport #24083 (In Progress): luminous: rados: not all exceptions accept keyargs
PR: https://github.com/ceph/ceph/pull/22979 Victor Denisov
03:52 PM Bug #24597 (Resolved): FAILED assert(0 == "ERROR: source must exist") in FileStore::_collection_m...
Nathan Cutler
05:09 AM Bug #24597: FAILED assert(0 == "ERROR: source must exist") in FileStore::_collection_move_rename()
Could cephfs trigger this issue? There have been two reports of cephfs_metadata pool crc errors on the users ML this ... Dan van der Ster
03:51 PM Backport #24891 (Resolved): mimic: FAILED assert(0 == "ERROR: source must exist") in FileStore::_...
Nathan Cutler
03:18 PM Backport #24891: mimic: FAILED assert(0 == "ERROR: source must exist") in FileStore::_collection_...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22997
merged
Yuri Weinstein
03:00 PM Bug #24875: OSD: still returning EIO instead of recovering objects on checksum errors
FTR, this crc issue is probably due to an incomplete backport to 12.2.6 of the skip_digest changes for bluestore:
...
Dan van der Ster
01:55 PM Bug #24909 (Fix Under Review): RBD client IOPS pool stats are incorrect (2x higher; includes IO h...
https://github.com/ceph/ceph/pull/23029 Jason Dillaman
01:47 PM Bug #24909 (In Progress): RBD client IOPS pool stats are incorrect (2x higher; includes IO hints ...
Jason Dillaman
01:47 PM Bug #24909 (Resolved): RBD client IOPS pool stats are incorrect (2x higher; includes IO hints as ...
While running performance testing with Ceph metrics gathering statistics on the cluster, I noticed that while my RBD ... Jason Dillaman
12:58 PM Backport #24908 (In Progress): luminous: luminous->mimic: missing primary copy of xxx, wil try co...
Nathan Cutler
12:57 PM Backport #24908 (Resolved): luminous: luminous->mimic: missing primary copy of xxx, wil try copie...
https://github.com/ceph/ceph/pull/23028 Nathan Cutler
12:26 PM Backport #24890 (Resolved): luminous: FAILED assert(0 == "ERROR: source must exist") in FileStore...
Nathan Cutler
12:26 PM Bug #23871: luminous->mimic: missing primary copy of xxx, wil try copies on 3, then full-object r...
original fix is fe5038c7f9577327f82913b4565712c53903ee48
luminous backport https://github.com/ceph/ceph/pull/23028
Sage Weil
12:06 PM Bug #23871 (Pending Backport): luminous->mimic: missing primary copy of xxx, wil try copies on 3,...
Sage Weil
11:31 AM Backport #24888 (Need More Info): luminous: osd: crash in OpTracker::unregister_inflight_op via O...
non-trivial backport. There are two conflicts. The first conflict can be resolved by cherry-picking 17a192ba5cdbe2129... Nathan Cutler
11:23 AM Backport #24889 (In Progress): mimic: osd: crash in OpTracker::unregister_inflight_op via OSD::ge...
Nathan Cutler
11:22 AM Backport #24864 (In Progress): luminous: Abort in OSDMap::decode() during qa/standalone/erasure-c...
Nathan Cutler
11:20 AM Backport #24865 (In Progress): mimic: Abort in OSDMap::decode() during qa/standalone/erasure-code...
Nathan Cutler

07/12/2018

11:56 PM Bug #24801 (In Progress): PG num_bytes becomes huge
David Zafman
07:38 PM Bug #24600 (Resolved): ValueError: too many values to unpack due to lack of subdir
Nathan Cutler
07:38 PM Backport #24617 (Resolved): mimic: ValueError: too many values to unpack due to lack of subdir
Nathan Cutler
04:36 PM Backport #24617: mimic: ValueError: too many values to unpack due to lack of subdir
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22888
merged
Yuri Weinstein
02:05 PM Bug #24875: OSD: still returning EIO instead of recovering objects on checksum errors
Is this the relevant fix? https://github.com/ceph/ceph/commit/4667280f8afe6cd68dfffea61d7530581f3dd0eb
Alessandro'...
Dan van der Ster
12:27 PM Backport #24890 (In Progress): luminous: FAILED assert(0 == "ERROR: source must exist") in FileSt...
Nathan Cutler
10:18 AM Backport #24890 (Resolved): luminous: FAILED assert(0 == "ERROR: source must exist") in FileStore...
https://github.com/ceph/ceph/pull/22976 Nathan Cutler
11:03 AM Backport #24891 (In Progress): mimic: FAILED assert(0 == "ERROR: source must exist") in FileStore...
Nathan Cutler
10:18 AM Backport #24891 (Resolved): mimic: FAILED assert(0 == "ERROR: source must exist") in FileStore::_...
https://github.com/ceph/ceph/pull/22997 Nathan Cutler
10:50 AM Bug #24150 (Resolved): LibRadosMiscPool.PoolCreationRace segv
Nathan Cutler
10:50 AM Backport #24204 (Resolved): mimic: LibRadosMiscPool.PoolCreationRace segv
Nathan Cutler
12:06 AM Backport #24204: mimic: LibRadosMiscPool.PoolCreationRace segv
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22291
merged
Yuri Weinstein
10:50 AM Bug #24321 (Resolved): assert manager.get_num_active_clean() == pg_num on rados/singleton/all/max...
Nathan Cutler
10:49 AM Backport #24329 (Resolved): mimic: assert manager.get_num_active_clean() == pg_num on rados/singl...
Nathan Cutler
12:05 AM Backport #24329: mimic: assert manager.get_num_active_clean() == pg_num on rados/singleton/all/ma...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22492
merged
Yuri Weinstein
10:48 AM Backport #24747 (Resolved): mimic: change default filestore_merge_threshold to -10
Nathan Cutler
12:03 AM Backport #24747: mimic: change default filestore_merge_threshold to -10
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22813
merged
Yuri Weinstein
10:48 AM Bug #24365 (Resolved): cosbench stuck at booting cosbench driver
Nathan Cutler
10:47 AM Backport #24473 (Resolved): mimic: cosbench stuck at booting cosbench driver
Nathan Cutler
12:03 AM Backport #24473: mimic: cosbench stuck at booting cosbench driver
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22887
merged
Yuri Weinstein
10:46 AM Bug #24487 (Resolved): osd: choose_acting loop
Nathan Cutler
10:46 AM Backport #24618 (Resolved): mimic: osd: choose_acting loop
Nathan Cutler
12:02 AM Backport #24618: mimic: osd: choose_acting loop
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22889
merged
Yuri Weinstein
10:46 AM Bug #24349 (Resolved): osd: stray osds in async_recovery_targets cause out of order ops
Nathan Cutler
10:46 AM Backport #24383 (Resolved): mimic: osd: stray osds in async_recovery_targets cause out of order ops
Nathan Cutler
12:02 AM Backport #24383: mimic: osd: stray osds in async_recovery_targets cause out of order ops
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22889
merged
Yuri Weinstein
10:45 AM Backport #24805 (Resolved): mimic: rgw workload makes osd memory explode
Nathan Cutler
12:00 AM Backport #24805: mimic: rgw workload makes osd memory explode
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/22960
merged
Yuri Weinstein
10:36 AM Backport #24771 (Resolved): mimic: osd: may get empty info at recovery
Nathan Cutler
10:18 AM Backport #24889 (Resolved): mimic: osd: crash in OpTracker::unregister_inflight_op via OSD::get_h...
https://github.com/ceph/ceph/pull/23026 Nathan Cutler
10:18 AM Backport #24888 (Rejected): luminous: osd: crash in OpTracker::unregister_inflight_op via OSD::ge...
Nathan Cutler
03:03 AM Bug #24664 (Pending Backport): osd: crash in OpTracker::unregister_inflight_op via OSD::get_healt...
Sage Weil
03:01 AM Bug #24597 (Pending Backport): FAILED assert(0 == "ERROR: source must exist") in FileStore::_coll...
Sage Weil
 
