Activity

From 01/04/2018 to 02/02/2018

02/02/2018

11:08 PM Backport #22707: luminous: ceph_objectstore_tool: no flush before collection_empty() calls; Objec...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19967
merged
Yuri Weinstein
11:01 PM Backport #22389: luminous: ceph-objectstore-tool: Add option "dump-import" to examine an export
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19487
merged
Yuri Weinstein
11:00 PM Backport #22399: luminous: Manager daemon x is unresponsive. No standby daemons available
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/19501
merged
Yuri Weinstein
09:15 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
Frank Li wrote:
> I've updated all the ceph-mon with the RPMs from the patch repo, they came up fine, and I've resta...
Frank Li
09:14 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
I've updated all the ceph-mon with the RPMs from the patch repo, they came up fine, and I've restarted the OSDs, but ... Frank Li
08:29 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
Just for future operational reference, is there any way to revert the monitor map to a previous state in the case of ... Frank Li
06:22 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
Please note the crash happened on the monitors, not the OSDs; the OSDs all stayed up, but all the monitors crashed. Frank Li
06:21 PM Bug #22847: ceph osd force-create-pg cause all ceph-mon to crash and unable to come up again
-4> 2018-01-31 22:47:22.942381 7fc641d0b700 1 -- 10.102.52.37:6789/0 <== mon.0 10.102.52.37:6789/0 0 ==== log(1 ... Frank Li
06:09 PM Bug #22847 (Fix Under Review): ceph osd force-create-pg cause all ceph-mon to crash and unable to...
https://github.com/ceph/ceph/pull/20267 Sage Weil
05:46 PM Bug #22847 (Need More Info): ceph osd force-create-pg cause all ceph-mon to crash and unable to c...
Can you attach the entire osd log for the crashed osd? (In particular, we need to see what assertion failed.) Thanks! Sage Weil
07:32 PM Bug #22902 (Resolved): src/osd/PG.cc: 6455: FAILED assert(0 == "we got a bad state machine event")

http://pulpito.ceph.com/dzafman-2018-02-01_09:46:36-rados-wip-zafman-testing-distro-basic-smithi/2138315
I think...
David Zafman
07:23 PM Bug #22834 (Resolved): Primary ends up in peer_info which isn't supposed to be there
David Zafman
09:48 AM Bug #22257 (Resolved): mon: mgrmaps not trimmed
Nathan Cutler
09:48 AM Backport #22258 (Resolved): mon: mgrmaps not trimmed
Nathan Cutler
09:47 AM Backport #22402 (Resolved): luminous: osd: replica read can trigger cache promotion
Nathan Cutler
08:05 AM Backport #22807 (Resolved): luminous: "osd pool stats" shows recovery information bugly
Nathan Cutler
07:54 AM Bug #22715 (Resolved): log entries weirdly zeroed out after 'osd pg-temp' command
Nathan Cutler
07:54 AM Backport #22744 (Resolved): luminous: log entries weirdly zeroed out after 'osd pg-temp' command
Nathan Cutler
05:46 AM Documentation #22843: [doc][luminous] the configuration guide still contains osd_op_threads and d...
For downstream Red Hat products, you should use the Red Hat bugzilla to report bugs. This is the upstream bug tracker... Nathan Cutler
05:15 AM Backport #22013 (In Progress): jewel: osd/ReplicatedPG.cc: recover_replicas: object added to miss...
Nathan Cutler
12:17 AM Bug #22882: Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
When I saw the test running for 4 hours my first thought was that the cluster was unhealthy -- but all OSDs were up a... Jason Dillaman

02/01/2018

11:26 PM Bug #22117 (Resolved): crushtool decompile prints bogus when osd < max_osd_id are missing
Nathan Cutler
11:25 PM Backport #22199 (Resolved): crushtool decompile prints bogus when osd < max_osd_id are missing
Nathan Cutler
11:24 PM Bug #22113 (Resolved): osd: pg limit on replica test failure
Nathan Cutler
11:24 PM Backport #22176 (Resolved): luminous: osd: pg limit on replica test failure
Nathan Cutler
11:24 PM Bug #21907 (Resolved): On pg repair the primary is not favored as was intended
Nathan Cutler
11:23 PM Backport #22213 (Resolved): luminous: On pg repair the primary is not favored as was intended
Nathan Cutler
11:10 PM Backport #22258: mon: mgrmaps not trimmed
Kefu Chai wrote:
> mgrmonitor does not trim old mgrmaps. these can accumulate forever.
>
> https://github.com/ce...
Yuri Weinstein
11:08 PM Backport #22402: luminous: osd: replica read can trigger cache promotion
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/19499
merged
Yuri Weinstein
11:04 PM Bug #22673 (Resolved): osd checks out-of-date osdmap for DESTROYED flag on start
Nathan Cutler
11:03 PM Backport #22761 (Resolved): luminous: osd checks out-of-date osdmap for DESTROYED flag on start
Nathan Cutler
11:01 PM Backport #22761: luminous: osd checks out-of-date osdmap for DESTROYED flag on start
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/20068
merged
Yuri Weinstein
11:00 PM Backport #22807: luminous: "osd pool stats" shows recovery information bugly
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/20150
merged
Yuri Weinstein
10:59 PM Bug #22419 (Resolved): Pool Compression type option doesn't apply to new OSD's
Nathan Cutler
10:59 PM Backport #22502 (Resolved): luminous: Pool Compression type option doesn't apply to new OSD's
Nathan Cutler
09:04 PM Backport #22502: luminous: Pool Compression type option doesn't apply to new OSD's
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/20106
merged
Yuri Weinstein
10:43 PM Bug #22887 (Duplicate): osd/ECBackend.cc: 2202: FAILED assert((offset + length) <= (range.first.g...
... Patrick Donnelly
10:29 PM Bug #22882: Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
It's not quite that simple; ops on a failed OSD or closed session get moved into the homeless_session and at a quick ... Greg Farnum
06:52 PM Bug #22882: Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
http://qa-proxy.ceph.com/teuthology/jdillaman-2018-02-01_08:21:33-rbd-wip-jd-testing-luminous-distro-basic-smithi/213... Jason Dillaman
06:51 PM Bug #22882 (Resolved): Objecter deadlocked on op budget while holding rwlock in ms_handle_reset()
... Jason Dillaman
09:06 PM Bug #22715: log entries weirdly zeroed out after 'osd pg-temp' command
merged https://github.com/ceph/ceph/pull/20042 Yuri Weinstein
06:09 PM Bug #22881 (Resolved): scrub interaction with HEAD boundaries and snapmapper repair is broken
symptom:... Sage Weil
11:45 AM Bug #22842: (luminous) ceph-disk prepare of simple filestore failed with 'Unable to set partition...
John Spray wrote:
> I would suspect that something is strange about the disk (non-GPT partition table perhaps?), and...
Enrico Labedzki
11:11 AM Bug #22842: (luminous) ceph-disk prepare of simple filestore failed with 'Unable to set partition...
I would suspect that something is strange about the disk (non-GPT partition table perhaps?), and you're getting less-... John Spray
11:43 AM Backport #22449: jewel: Visibility for snap trim queue length
presumably non-trivial backport; assigning to the developer Nathan Cutler
11:40 AM Feature #22448 (Pending Backport): Visibility for snap trim queue length
Nathan Cutler
10:49 AM Backport #22866 (Resolved): jewel: ceph osd df json output validation reported invalid numbers (-...
https://github.com/ceph/ceph/pull/20344 Nathan Cutler
08:12 AM Bug #21750 (Resolved): scrub stat mismatch on bytes
The code is gone. xie xingguo
05:42 AM Bug #22848: Pull the cable,5mins later,Put back to the cable,pg stuck a long time ulitl to resta...
Why is the pg status always peering? I am sure that the monitors and OSDs are both OK.
Those pg state machines should wo...
Yong Wang
05:32 AM Bug #22848 (New): Pull the cable,5mins later,Put back to the cable,pg stuck a long time ulitl to...
Hi all,
We have a 3-node ceph cluster, version 10.2.10.
It is a newly installed environment with professional rpms from downlo...
Yong Wang
04:29 AM Bug #22847 (Resolved): ceph osd force-create-pg cause all ceph-mon to crash and unable to come up...
During the course of troubleshooting an osd issue, I ran this command:
ceph osd force-create-pg 1.ace11d67
then al...
Frank Li

01/31/2018

10:39 PM Bug #22656: scrub mismatch on bytes (cache pools)
We just aren't assigning that much priority to cache tiering. Greg Farnum
10:27 PM Bug #22752 (Fix Under Review): snapmapper inconsistency, crash on luminous
Greg Farnum
03:32 PM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
https://github.com/ceph/ceph/pull/20204 Kefu Chai
01:53 PM Documentation #22843 (Won't Fix): [doc][luminous] the configuration guide still contains osd_op_t...
The configuration guide for RHCS 3 still mentions osd_op_threads, which is no longer part of the RHCS 3 code.
...
Tomas Petr
01:51 PM Bug #22842 (New): (luminous) ceph-disk prepare of simple filestore failed with 'Unable to set par...
Hi,
I can't create a simple filestore with the help of ceph-disk under Ubuntu trusty; please have a look at this...
<p...
Enrico Labedzki
12:50 PM Bug #21496: doc: Manually editing a CRUSH map, Word 'type' missing.
Yes, I believe so. Anonymous
12:48 PM Bug #21496: doc: Manually editing a CRUSH map, Word 'type' missing.
Sorry. Is it fine now?
Kallepalli Mounika Smitha
12:44 PM Bug #21496: doc: Manually editing a CRUSH map, Word 'type' missing.
No, the line should be:... Anonymous
12:34 PM Bug #21496: doc: Manually editing a CRUSH map, Word 'type' missing.
I made the changes. Please review and tell me whether they are correct. Sorry if I am wrong. Kallepalli Mounika Smitha
12:03 PM Bug #22142 (Resolved): mon doesn't send health status after paxos service is inactive temporarily
John Spray
12:03 PM Backport #22421 (Resolved): mon doesn't send health status after paxos service is inactive tempor...
John Spray
04:10 AM Bug #22837 (Resolved): discover_all_missing() not always called during activating

Sometimes discover_all_missing() isn't called so we don't get a complete picture of misplaced objects. This makes ...
David Zafman
12:44 AM Backport #22164: luminous: cluster [ERR] Unhandled exception from module 'balancer' while running...
Prashant D wrote:
> https://github.com/ceph/ceph/pull/19023
merged
Yuri Weinstein
12:44 AM Backport #22167: luminous: Various odd clog messages for mons
Prashant D wrote:
> https://github.com/ceph/ceph/pull/19031
merged
Yuri Weinstein
12:43 AM Backport #22199: crushtool decompile prints bogus when osd < max_osd_id are missing
Jan Fajerski wrote:
> https://github.com/ceph/ceph/pull/19039
merged
Yuri Weinstein
12:41 AM Backport #22176: luminous: osd: pg limit on replica test failure
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19059
merged
Yuri Weinstein
12:40 AM Backport #22213: luminous: On pg repair the primary is not favored as was intended
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/19083
merged
Yuri Weinstein
12:13 AM Bug #22834: Primary ends up in peer_info which isn't supposed to be there

Workaround
https://github.com/ceph/ceph/pull/20189
David Zafman

01/30/2018

11:43 PM Bug #22834 (Resolved): Primary ends up in peer_info which isn't supposed to be there

rados/singleton/{all/lost-unfound.yaml msgr-failures/few.yaml msgr/random.yaml objectstore/bluestore-bitmap.yaml ra...
David Zafman
04:01 PM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
Will backport https://github.com/ceph/ceph/pull/18614 to luminous; it helps to make this status more visible to users. Kefu Chai

01/29/2018

09:26 PM Bug #22656: scrub mismatch on bytes (cache pools)
/a/sage-2018-01-29_18:07:24-rados-wip-sage-testing-2018-01-29-0927-distro-basic-smithi/2122957
description: rados/th...
Sage Weil
08:01 PM Bug #22743: "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi
I don't think a bug in a hammer binary during an upgrade test to jewel is an urgent problem at this point? Greg Farnum
03:15 PM Bug #22201: PG removal with ceph-objectstore-tool segfaulting
We're getting close to converting the OSDs in this cluster to Bluestore. If you would like any tests to be run on th... David Turner
02:56 PM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
simpler fix: https://github.com/ceph/ceph/pull/20169 Sage Weil
02:38 PM Bug #22440: New pgs per osd hard limit can cause peering issues on existing clusters
First, perhaps this will help to make these issues more visible: https://github.com/ceph/ceph/pull/20167
Second, i...
Dan van der Ster
10:23 AM Bug #20086 (Fix Under Review): LibRadosLockECPP.LockSharedDurPP gets EEXIST
https://github.com/ceph/ceph/pull/20161 Kefu Chai
07:28 AM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
/a/kchai-2018-01-28_09:53:35-rados-wip-kefu-testing-2018-01-27-2356-distro-basic-mira/2120659... Kefu Chai
01:15 AM Bug #21471 (Resolved): mon osd feature checks for osdmap flags and require-osd-release fail if 0 ...
Brad Hubbard

01/28/2018

11:59 PM Backport #22807 (In Progress): luminous: "osd pool stats" shows recovery information bugly
https://github.com/ceph/ceph/pull/20150 Prashant D
12:31 AM Backport #22818 (In Progress): jewel: repair_test fails due to race with osd start
Nathan Cutler

01/27/2018

08:35 AM Backport #21872 (In Progress): jewel: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
Nathan Cutler
06:49 AM Bug #22662 (Pending Backport): ceph osd df json output validation reported invalid numbers (-nan)...
Nathan Cutler

01/26/2018

06:01 PM Backport #22818 (Resolved): jewel: repair_test fails due to race with osd start
https://github.com/ceph/ceph/pull/20146 Nathan Cutler
05:54 PM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
+1 for null, which is an English word and hence far more comprehensible than "NaN", which is what I would call "Progr... Nathan Cutler
05:42 PM Bug #21577 (Resolved): ceph-monstore-tool --readable mode doesn't understand FSMap, MgrMap
Nathan Cutler
05:41 PM Backport #21636 (Resolved): luminous: ceph-monstore-tool --readable mode doesn't understand FSMap...
Nathan Cutler
05:21 PM Bug #20705 (Pending Backport): repair_test fails due to race with osd start
Seen in Jewel so marking for backport
http://qa-proxy.ceph.com/teuthology/dzafman-2018-01-25_13:41:04-rados-wip-za...
David Zafman
05:16 PM Backport #21872: jewel: ObjectStore/StoreTest.FiemapHoles/3 fails with kstore
This backport is needed as seen in:
http://qa-proxy.ceph.com/teuthology/dzafman-2018-01-25_13:41:04-rados-wip-zafm...
David Zafman
11:55 AM Bug #18239 (New): nan in ceph osd df again
Chang Liu
11:54 AM Bug #18239 (Duplicate): nan in ceph osd df again
duplicate of #22662 Chang Liu
08:00 AM Backport #22808 (Rejected): jewel: "osd pool stats" shows recovery information bugly
Nathan Cutler
08:00 AM Backport #22807 (Resolved): luminous: "osd pool stats" shows recovery information bugly
https://github.com/ceph/ceph/pull/20150 Nathan Cutler
07:30 AM Bug #22727 (Pending Backport): "osd pool stats" shows recovery information bugly
Kefu Chai

01/25/2018

07:59 PM Bug #20243 (Resolved): Improve size scrub error handling and ignore system attrs in xattr checking
David Zafman
07:59 PM Backport #21051 (Resolved): luminous: Improve size scrub error handling and ignore system attrs i...
David Zafman
07:58 PM Bug #21382 (Resolved): Erasure code recovery should send additional reads if necessary
David Zafman
07:56 PM Bug #22145 (Resolved): PG stuck in recovery_unfound
David Zafman
07:56 PM Bug #20059 (Resolved): miscounting degraded objects
David Zafman
07:55 PM Backport #22724 (Resolved): luminous: miscounting degraded objects
David Zafman
07:33 PM Backport #22724 (Fix Under Review): luminous: miscounting degraded objects
Included in https://github.com/ceph/ceph/pull/20055 David Zafman
07:55 PM Backport #22387 (Resolved): luminous: PG stuck in recovery_unfound
David Zafman
07:35 PM Backport #22387 (Fix Under Review): luminous: PG stuck in recovery_unfound
David Zafman
07:54 PM Backport #21653 (Resolved): luminous: Erasure code recovery should send additional reads if neces...
David Zafman
07:53 PM Backport #22069 (Resolved): luminous: osd/ReplicatedPG.cc: recover_replicas: object added to miss...
David Zafman
04:18 PM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Chang Liu wrote:
> Enrico Labedzki wrote:
> > Chang Liu wrote:
> > > Enrico Labedzki wrote:
> > > > Chang Liu wro...
Enrico Labedzki
03:45 PM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Enrico Labedzki wrote:
> Chang Liu wrote:
> > Enrico Labedzki wrote:
> > > Chang Liu wrote:
> > > > Sage Weil wro...
Chang Liu
09:40 AM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Chang Liu wrote:
> Enrico Labedzki wrote:
> > Chang Liu wrote:
> > > Sage Weil wrote:
> > > > 1. it's not valid j...
Enrico Labedzki
09:30 AM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Enrico Labedzki wrote:
> Chang Liu wrote:
> > Sage Weil wrote:
> > > 1. it's not valid json.. Formatter shouldn't ...
Chang Liu
08:52 AM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Chang Liu wrote:
> Sage Weil wrote:
> > 1. it's not valid json.. Formatter shouldn't allow it
> > 2. we should hav...
Enrico Labedzki
06:36 AM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Sage Weil wrote:
> 1. it's not valid json.. Formatter shouldn't allow it
> 2. we should have a valid value (or 0) t...
Chang Liu
04:02 AM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)

This bug has been fixed by https://github.com/ceph/ceph/pull/13531. We should backport it to Jewel.
Chang Liu
04:08 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi

As Josh said it seems easier to trigger in Jewel. This is based on my attempt to reproduce in master.
All 50 ma...
David Zafman
02:22 AM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Looking through the logs more with David, we found this sequence of events in 1946610:
1) osd.5 gets a write to ob...
Josh Durgin
12:45 PM Bug #22266: mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
Master PR for second round of backporting: https://github.com/ceph/ceph/pull/19780
Luminous backport PR: https://g...
Nathan Cutler
08:44 AM Bug #22656: scrub mismatch on bytes (cache pools)
Happened here as well: http://pulpito.ceph.com/smithfarm-2018-01-24_19:46:55-rados-wip-smithfarm-testing-distro-basic... Nathan Cutler
04:24 AM Backport #22794 (In Progress): jewel: heartbeat peers need to be updated when a new OSD added int...
https://github.com/ceph/ceph/pull/20108 Kefu Chai
04:14 AM Backport #22794 (Resolved): jewel: heartbeat peers need to be updated when a new OSD added into a...
https://github.com/ceph/ceph/pull/20108 Kefu Chai
04:13 AM Backport #22793 (Rejected): osd: sends messages to marked-down peers
I wanted to backport the fix of #18004, not this one. Kefu Chai
04:12 AM Backport #22793 (Rejected): osd: sends messages to marked-down peers
the async osdmap updates introduce a new problem:
- handle_osd_map map X marks down osd Y
- pg thread uses map X-...
Kefu Chai

01/24/2018

09:18 PM Backport #21636: luminous: ceph-monstore-tool --readable mode doesn't understand FSMap, MgrMap
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18754
merged
Yuri Weinstein
09:10 PM Bug #22329: mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
New one:
/ceph/teuthology-archive/yuriw-2018-01-23_20:26:59-multimds-wip-yuri-testing-2018-01-22-1653-luminous-tes...
Patrick Donnelly
07:56 PM Backport #22502: luminous: Pool Compression type option doesn't apply to new OSD's
Master commit was reverted - redoing the backport. Nathan Cutler
06:12 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
Disregard my previous comment; different error message for the same assert was unfortunately buried in the logs. Sorr... Joao Eduardo Luis
06:04 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
FWIW, I am currently reproducing this quite reliably on my dev env, on a quite outdated version of master (cbe78ae629... Joao Eduardo Luis
01:55 PM Bug #21407 (Resolved): backoff causes out of order op
Nathan Cutler
01:54 PM Backport #21794 (Resolved): luminous: backoff causes out of order op
Nathan Cutler
11:23 AM Backport #22450 (In Progress): luminous: Visibility for snap trim queue length
https://github.com/ceph/ceph/pull/20098 Piotr Dalek

01/23/2018

11:57 PM Bug #21566 (Resolved): OSDService::recovery_need_sleep read+updated without locking
Nathan Cutler
11:57 PM Backport #21697 (Resolved): luminous: OSDService::recovery_need_sleep read+updated without locking
Nathan Cutler
11:06 PM Backport #21697: luminous: OSDService::recovery_need_sleep read+updated without locking
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18753
merged
Yuri Weinstein
11:56 PM Backport #21785 (Resolved): luminous: OSDMap cache assert on shutdown
Nathan Cutler
11:07 PM Backport #21785: luminous: OSDMap cache assert on shutdown
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18749
merged
Yuri Weinstein
11:55 PM Bug #21845 (Resolved): Objecter::_send_op unnecessarily constructs costly hobject_t
Nathan Cutler
11:55 PM Backport #21921 (Resolved): luminous: Objecter::_send_op unnecessarily constructs costly hobject_t
Nathan Cutler
11:09 PM Backport #21921: luminous: Objecter::_send_op unnecessarily constructs costly hobject_t
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18745
merged
Yuri Weinstein
11:54 PM Backport #21922 (Resolved): luminous: Objecter::C_ObjectOperation_sparse_read throws/catches exce...
Nathan Cutler
11:10 PM Backport #21922: luminous: Objecter::C_ObjectOperation_sparse_read throws/catches exceptions on -...
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18744
merged
Yuri Weinstein
11:25 PM Bug #21818 (Resolved): ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic/1 (filestore) ...
Nathan Cutler
11:25 PM Backport #21924 (Resolved): luminous: ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic...
Nathan Cutler
11:10 PM Backport #21924: luminous: ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic/1 (filesto...
Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18742
merged
Yuri Weinstein
08:30 PM Backport #22423 (Closed): luminous: osd: initial minimal efforts to clean up PG interface
I was able to cleanly backport http://tracker.ceph.com/issues/22069 without this large change. David Zafman
11:01 AM Bug #22351: Couldn't init storage provider (RADOS)
No, I set it to Luminous based on the request by theanalyst in https://github.com/ceph/ceph/pull/20023. I'm fine with... Brad Hubbard
10:24 AM Bug #22351: Couldn't init storage provider (RADOS)
@Brad Assigning to you and leaving the backport field on "luminous" (but feel free to zero it out if it's enough to m... Nathan Cutler
10:14 AM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
@David I can only guess that this is not reproducible in master and that's why it requires a luminous-only fix. Could... Nathan Cutler
10:01 AM Backport #22761 (In Progress): luminous: osd checks out-of-date osdmap for DESTROYED flag on start
Nathan Cutler
09:40 AM Backport #22761 (Resolved): luminous: osd checks out-of-date osdmap for DESTROYED flag on start
https://github.com/ceph/ceph/pull/20068 Nathan Cutler
07:48 AM Bug #22673 (Pending Backport): osd checks out-of-date osdmap for DESTROYED flag on start
Kefu Chai
06:38 AM Bug #22727: "osd pool stats" shows recovery information bugly
Need to backport it to jewel and luminous, but it dates back at least to 9.2.0. See also http://lists.ceph.com/piperm... Kefu Chai
06:32 AM Bug #22727 (Fix Under Review): "osd pool stats" shows recovery information bugly
Kefu Chai

01/22/2018

11:50 PM Bug #22419 (Pending Backport): Pool Compression type option doesn't apply to new OSD's
Sage Weil
08:12 AM Bug #22419 (Fix Under Review): Pool Compression type option doesn't apply to new OSD's
https://github.com/ceph/ceph/pull/20044 Kefu Chai
11:46 PM Bug #22711 (Resolved): qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect...
Sage Weil
12:53 PM Bug #22711 (Fix Under Review): qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands:...
https://github.com/ceph/ceph/pull/20046 Kefu Chai
11:06 AM Bug #22711: qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test...
the weirdness of this issue is that some PGs are mapped to a single OSD:... Kefu Chai
03:13 AM Bug #22711: qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test...
the curr_object_copies_rate value in PGMap.cc dump_object_stat_sum is .5, which is counteracting the 2x replication f... Sage Weil
07:04 PM Bug #22752: snapmapper inconsistency, crash on luminous
https://github.com/ceph/ceph/pull/20040 Sage Weil
07:03 PM Bug #22752 (Resolved): snapmapper inconsistency, crash on luminous
from Stefan Priebe on ceph-devel ML:... Sage Weil
06:47 PM Backport #22387 (In Progress): luminous: PG stuck in recovery_unfound

Included with another dependent backport as https://github.com/ceph/ceph/pull/20055
David Zafman
12:40 PM Backport #22387 (Need More Info): luminous: PG stuck in recovery_unfound
Non-trivial backport Nathan Cutler
02:27 PM Feature #22750 (Fix Under Review): libradosstriper conditional compile
-https://github.com/ceph/ceph/pull/18197- Nathan Cutler
01:21 PM Feature #22750 (Resolved): libradosstriper conditional compile
Currently libradosstriper is a hard dependency of the rados CLI tool.
Please add a "WITH_LIBRADOSSTRIPER" compile-...
Nathan Cutler
02:16 PM Bug #22746 (Fix Under Review): osd/common: ceph-osd process is terminated by the logrotate task
John Spray
11:51 AM Bug #22746 (Resolved): osd/common: ceph-osd process is terminated by the logrotate task
1. Steps to reproduce:
(1) step 1:
Open terminal 1 and
prepare the command: "killall -q -1 ceph-mon ...
huanwen ren
12:59 PM Support #22749 (Closed): dmClock OP classification
Why does the dmClock algorithm in Ceph attribute recovery's read and write OPs to osd_op_queue_mclock_osd_sub, so that whe... 何 伟俊
12:41 PM Backport #22724 (Need More Info): luminous: miscounting degraded objects
Nathan Cutler
12:41 PM Backport #22724: luminous: miscounting degraded objects
David, while you're doing this one, can you include https://tracker.ceph.com/issues/22387 as well? Nathan Cutler
12:23 PM Support #22680 (Resolved): mons segmentation faults New 12.2.2 cluster
Nathan Cutler
03:04 AM Bug #22715 (Pending Backport): log entries weirdly zeroed out after 'osd pg-temp' command
Kefu Chai
03:04 AM Backport #22744 (In Progress): luminous: log entries weirdly zeroed out after 'osd pg-temp' command
https://github.com/ceph/ceph/pull/20042 Kefu Chai
03:03 AM Backport #22744 (Resolved): luminous: log entries weirdly zeroed out after 'osd pg-temp' command
https://github.com/ceph/ceph/pull/20042 Kefu Chai

01/21/2018

08:29 PM Bug #22715 (Resolved): log entries weirdly zeroed out after 'osd pg-temp' command
Sage Weil
06:56 PM Bug #22743 (New): "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-sm...
Run: http://pulpito.ceph.com/teuthology-2018-01-19_01:15:02-upgrade:hammer-x-jewel-distro-basic-smithi/
Job: 2088826...
Yuri Weinstein

01/20/2018

11:18 PM Bug #22351 (In Progress): Couldn't init storage provider (RADOS)
Reopening this and reassigning it to RADOS as there are a couple of changes we can make to logging to make this easie... Brad Hubbard

01/19/2018

04:16 PM Support #20108: PGs are not remapped correctly when one host fails
Hi,
Thank you for your answer!
I've seen that page before, but which tunable are you suggesting for the problem...
Laszlo Budai
09:59 AM Bug #22233 (Fix Under Review): prime_pg_temp breaks on uncreated pgs
https://github.com/ceph/ceph/pull/20025 Kefu Chai
09:08 AM Bug #22711: qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test...
... Chang Liu
02:51 AM Support #22553: ceph-object-tool can not remove metadata pool's object
It is not something wrong with the disk; it can be reproduced. peng zhang

01/18/2018

10:57 PM Support #20108: PGs are not remapped correctly when one host fails
http://docs.ceph.com/docs/master/rados/operations/crush-map/?highlight=tunables#tunables Greg Farnum
07:02 PM Bug #22351 (Closed): Couldn't init storage provider (RADOS)
Yehuda Sadeh
10:47 AM Bug #22351: Couldn't init storage provider (RADOS)
Brad Hubbard wrote:
>
> (6*1024)*3 = 18432, thus 18432/47 ~ 392 PGs per OSD. You omitted the size of the pools.
...
Nikos Kormpakis
03:21 AM Bug #22351: Couldn't init storage provider (RADOS)
https://ceph.com/pgcalc/ should be used as a guide/starting point. Brad Hubbard
03:07 PM Bug #22727: "osd pool stats" shows recovery information bugly
https://github.com/ceph/ceph/pull/20009 Chang Liu
05:18 AM Bug #22727 (In Progress): "osd pool stats" shows recovery information bugly
Chang Liu
03:16 AM Bug #22727 (Resolved): "osd pool stats" shows recovery information bugly
... Chang Liu
03:51 AM Bug #22715 (Fix Under Review): log entries weirdly zeroed out after 'osd pg-temp' command
https://github.com/ceph/ceph/pull/19998 Sage Weil

01/17/2018

10:28 PM Bug #22351: Couldn't init storage provider (RADOS)
Nikos Kormpakis wrote:
> But I still cannot understand why I'm hitting this error.
> Regarding my cluster, I have t...
Brad Hubbard
01:15 PM Bug #22351: Couldn't init storage provider (RADOS)
Brad Hubbard wrote:
> I'm able to reproduce something like what you are seeing, the messages are a little different....
Nikos Kormpakis
03:30 AM Bug #22351: Couldn't init storage provider (RADOS)
I'm able to reproduce something like what you are seeing; the messages are a little different.
What I see is this....
Brad Hubbard
12:12 AM Bug #22351: Couldn't init storage provider (RADOS)
It turns out what we need is the hexadecimal int representation of '-34' from the ltrace output.
$ c++filt </tmp/l...
Brad Hubbard
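For reference, the hexadecimal value Brad is after can be computed directly: -34 is -ERANGE, which in 32-bit two's complement appears as 0xffffffde in ltrace output. A small illustrative sketch (not part of the original comment):

```python
import errno

def to_hex32(n: int) -> str:
    """32-bit two's-complement hex representation of a (negative) errno."""
    return hex(n & 0xFFFFFFFF)

# -34 is -ERANGE ("Numerical result out of range"), consistent with the
# "Couldn't init storage provider (RADOS)" failure being traced here.
print(to_hex32(-34))        # 0xffffffde
print(errno.errorcode[34])  # ERANGE
```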
10:26 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
Ryan Anstey wrote:
> I'm working on fixing all my inconsistent pgs but I'm having issues with rados get... hopefully...
Brian Andrus
09:07 PM Bug #22656: scrub mismatch on bytes (cache pools)
/a/sage-2018-01-17_14:40:55-rados-wip-sage-testing-2018-01-16-2156-distro-basic-smithi/2082959
description: rados/...
Sage Weil
07:54 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
David Zafman
07:48 PM Bug #20059: miscounting degraded objects
https://github.com/ceph/ceph/pull/19850 David Zafman
07:36 PM Bug #21387 (Can't reproduce): mark_unfound_lost hangs
Multiple fixes to mark_all_unfound_lost() have fixed this. Possibly the most important master branch commit is 689bff... David Zafman
06:00 PM Bug #22668 (Fix Under Review): osd/ExtentCache.h: 371: FAILED assert(tid == 0)
https://github.com/ceph/ceph/pull/19989 Sage Weil
05:10 PM Backport #22724 (Resolved): luminous: miscounting degraded objects
on bigbang,... David Zafman
04:39 PM Bug #22673 (Fix Under Review): osd checks out-of-date osdmap for DESTROYED flag on start
note: you can work around this by waiting a bit until some osd maps trim from the monitor.
https://github.com/ceph...
Sage Weil
02:54 PM Bug #22673: osd checks out-of-date osdmap for DESTROYED flag on start
It looks like the _preboot destroyed check should go after we catch up on maps. Sage Weil
02:53 PM Bug #22673: osd checks out-of-date osdmap for DESTROYED flag on start
This is a real bug, should be straightforward to fix. Thanks for the report! Sage Weil
02:59 PM Bug #22544: objecter cannot resend split-dropped op when racing with con reset
Hmm, I'm not sure what the best fix is. Do you see a good path to fixing this with ms_handle_connect()? Sage Weil
02:57 PM Bug #22659 (In Progress): During the cache tiering configuration, ceph-mon daemon getting crashed...
This will need to be backported to luminous and jewel once merged. Joao Eduardo Luis
09:36 AM Bug #22659: During the cache tiering configuration, ceph-mon daemon getting crashed after setting...
https://github.com/ceph/ceph/pull/19983 Jing Li
02:55 PM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
1. It's not valid JSON; the Formatter shouldn't allow it.
2. We should have a valid value (or 0) to use.
Sage Weil
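Until the Formatter is fixed, a monitoring script can defend itself by rewriting the invalid -nan/inf tokens before parsing. A minimal sketch, not part of any fix in this ticket (the regex and the substitute value 0 are assumptions):

```python
import json
import re

# Match bare nan/inf tokens in value position; JSON strings are not
# touched because the lookbehind requires a delimiter, not a quote.
_BAD_TOKEN = re.compile(r'(?<=[:,\[\s])-?(?:nan|inf(?:inity)?)(?=[,\]\}\s])',
                        re.IGNORECASE)

def parse_ceph_json(raw):
    """Parse `ceph osd df -f json` output, replacing invalid
    -nan/inf tokens with 0 so json.loads() accepts it."""
    return json.loads(_BAD_TOKEN.sub('0', raw))

# A fragment resembling the reported broken output:
broken = '{"nodes": [{"id": 0, "utilization": -nan, "var": -nan}]}'
data = parse_ceph_json(broken)
```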
02:52 PM Bug #22661 (Triaged): Segmentation fault occurs when the following CLI is executed
Joao Eduardo Luis
02:51 PM Bug #22672 (Triaged): OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty ...
Joao Eduardo Luis
02:28 PM Bug #22597 (Fix Under Review): "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgra...
https://github.com/ceph/ceph/pull/19987 Kefu Chai
01:32 PM Bug #22233 (In Progress): prime_pg_temp breaks on uncreated pgs
Kefu Chai
11:24 AM Support #22664: some random OSD are down (with a Abort signal on exception) after replace/rebuild...
Hi Greg,
can you point me to the link? As far as we have seen, all ulimits are set 10 times higher than needed on all nodes....
Enrico Labedzki

01/16/2018

09:49 PM Bug #22715 (Resolved): log entries weirdly zeroed out after 'osd pg-temp' command
... Sage Weil
07:59 PM Bug #20059 (Pending Backport): miscounting degraded objects
David Zafman
07:10 PM Bug #22711 (Resolved): qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect...
... Sage Weil
07:09 PM Bug #22677 (Resolved): rados/test_rados_tool.sh failure
Sage Weil
04:16 PM Bug #22351: Couldn't init storage provider (RADOS)
Hello,
we're facing the same issue on a Luminous cluster.
Some info about the cluster:
Version: ceph version 1...
Nikos Kormpakis
03:08 PM Bug #20874: osd/PGLog.h: 1386: FAILED assert(miter == missing.get_items().end() || (miter->second...
/a/sage-2018-01-16_03:08:54-rados-wip-sage2-testing-2018-01-15-1257-distro-basic-smithi/2077982... Sage Weil
01:33 PM Backport #22707 (In Progress): luminous: ceph_objectstore_tool: no flush before collection_empty(...
Nathan Cutler
01:30 PM Backport #22707 (Resolved): luminous: ceph_objectstore_tool: no flush before collection_empty() c...
https://github.com/ceph/ceph/pull/19967 Nathan Cutler
01:21 PM Bug #22409 (Pending Backport): ceph_objectstore_tool: no flush before collection_empty() calls; O...
Sage Weil
12:53 PM Support #20108: PGs are not remapped correctly when one host fails
Hello,
I'm sorry I've missed your message. Can you please give me some clues about the "newer crush tunables" that...
Laszlo Budai
12:48 PM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
/a/sage-2018-01-15_18:49:16-rados-wip-sage-testing-2018-01-14-1341-distro-basic-smithi/2076047 Sage Weil
12:48 PM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
/a/sage-2018-01-15_18:49:16-rados-wip-sage-testing-2018-01-14-1341-distro-basic-smithi/2075822 Sage Weil
11:10 AM Support #22680: mons segmentation faults New 12.2.2 cluster
Thanks! We had jemalloc in LD_PRELOAD since Infernalis, so i didn't think about that. I removed this from sysconfig, ... Kenneth Waegeman

01/15/2018

07:26 PM Feature #22442: ceph daemon mon.id mon_status -> ceph daemon mon.id status
Joao, did mon_status just precede the other status commands, or was there a reason for them to be different? Greg Farnum
07:22 PM Bug #22486: ceph shows wrong MAX AVAIL with hybrid (chooseleaf firstn 1, chooseleaf firstn -1) CR...
Well, the hybrid ruleset isn't giving you as much host isolation as you're probably thinking, since it can select an ... Greg Farnum
07:11 PM Support #22664 (Closed): some random OSD are down (with a Abort signal on exception) after replac...
It's failing to create a new thread. You probably need to bump the ulimit; this is discussed in the documentation. :) Greg Farnum
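A thread-creation failure like this usually means the process limit (RLIMIT_NPROC) or open-file limit is too low for the OSD. A hedged sketch of how to inspect the limits in play (the unit name `ceph-osd@0` and the values in the drop-in are illustrative; the authoritative knob depends on how the OSD is launched):

```shell
# Limits of the current shell (threads count against "max user processes").
ulimit -u   # max user processes / threads
ulimit -n   # max open files

# For systemd-managed OSDs the unit's limits apply instead; inspect them
# (the unit name ceph-osd@0 is illustrative):
systemctl show ceph-osd@0 --property=LimitNPROC,LimitNOFILE 2>/dev/null || true

# Raising them would be done with a drop-in, e.g.
# /etc/systemd/system/ceph-osd@.service.d/limits.conf:
#   [Service]
#   LimitNPROC=1048576
#   LimitNOFILE=1048576
```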
07:08 PM Support #22680: mons segmentation faults New 12.2.2 cluster
This is buried in the depths of RocksDB doing IO, so the only causes I know of/can think of are
1) you've found an u...
Greg Farnum
10:39 AM Support #22680 (Resolved): mons segmentation faults New 12.2.2 cluster

Hi all,
I installed a new Luminous 12.2.2 cluster. The monitors were up at first, but quickly started failing, s...
Kenneth Waegeman
05:48 PM Backport #22387: luminous: PG stuck in recovery_unfound
Include commit 64047e1 "osd: Don't start recovery for missing until active pg state set" from https://github.com/ceph... David Zafman
11:00 AM Support #22531: OSD flapping under repair/scrub after recieve inconsistent PG LFNIndex.cc: 439: F...
Josh Durgin wrote:
> Can you provide a directory listing for pg 1.f? It seems a file that does not obey the internal...
Jan Michlik
06:12 AM Bug #22351: Couldn't init storage provider (RADOS)
Brad Hubbard wrote:
> If this is a RADOS function returning ERANGE (34) then it should be possible to find it by att...
Amine Liu
05:05 AM Bug #22351: Couldn't init storage provider (RADOS)
If this is a RADOS function returning ERANGE (34) then it should be possible to find it by attempting to start the ra... Brad Hubbard
03:26 AM Bug #20059 (Fix Under Review): miscounting degraded objects
David Zafman
02:56 AM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
/a//kchai-2018-01-11_06:11:31-rados-wip-kefu-testing-2018-01-11-1036-distro-basic-mira/2058373/remote/mira002/log/cep... Kefu Chai

01/14/2018

10:46 PM Bug #22672: OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty clone_snap...
To (relatively) stabilise the frequently crashing OSDs, we've added an early -ENOENT return to PrimaryLogPG::find_obj... David Disseldorp
04:37 PM Bug #22677: rados/test_rados_tool.sh failure
https://github.com/ceph/ceph/pull/19946 Sage Weil

01/13/2018

03:54 PM Bug #22677 (Resolved): rados/test_rados_tool.sh failure
... Sage Weil

01/12/2018

10:43 PM Bug #22438 (Resolved): mon: leak in lttng dlopen / __tracepoints__init
Patrick Donnelly
06:29 AM Bug #22438: mon: leak in lttng dlopen / __tracepoints__init
https://github.com/ceph/teuthology/pull/1144 Kefu Chai
10:23 PM Bug #22672: OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty clone_snap...
That looks like a good way to investigate. We've seen a few reports of issues with cache tier snapshots since that re... Greg Farnum
02:54 PM Bug #22672: OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty clone_snap...
to detect this case during scrub, I'm currently testing the following change:
-https://github.com/ddiss/ceph/commit/...
David Disseldorp
12:55 AM Bug #22672 (Triaged): OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty ...
Environment is a Luminous cache-tiered deployment with some of the hot-tier OSDs converted to bluestore. The remainin... David Disseldorp
07:38 PM Bug #22063: "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == version)" inr...
Also in http://qa-proxy.ceph.com/teuthology/teuthology-2017-11-17_18:17:24-rados-jewel-distro-basic-smithi/1857527/te... David Zafman
07:36 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Yuri Weinstein wrote:
> Also in http://qa-proxy.ceph.com/teuthology/teuthology-2017-11-17_18:17:24-rados-jewel-distr...
David Zafman
07:18 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
As 17815 has to do with when scrub is allowed to start, it wouldn't be related to this bug. David Zafman
01:03 PM Bug #22673 (Resolved): osd checks out-of-date osdmap for DESTROYED flag on start
When trying an in-place migration of a filestore to bluestore OSD, we encountered a situation where ceph-osd would re... J Mozdzen
07:45 AM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
I am rerunning the failed test at http://pulpito.ceph.com/kchai-2018-01-12_07:44:06-multimds-wip-pdonnell-testing-201... Kefu Chai
07:29 AM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
I agree it's a bug in the OSD, but I don't think the OSD should return -ENOENT in this case. As Sage pointed out, it should c... Kefu Chai
01:15 AM Bug #22351: Couldn't init storage provider (RADOS)
Abhishek Lekshmanan wrote:
> can you tell us the ceph pg num and pgp num setting in ceph.conf (or rather paste the c...
Amine Liu

01/11/2018

09:43 PM Bug #22668 (Resolved): osd/ExtentCache.h: 371: FAILED assert(tid == 0)
... Sage Weil
06:52 PM Bug #22351: Couldn't init storage provider (RADOS)
can you tell us the ceph pg num and pgp num setting in ceph.conf (or rather paste the ceph.conf, redacting sensitive ... Abhishek Lekshmanan
04:05 PM Bug #22561: PG stuck during recovery, requires OSD restart
OSD 32 was running and actively serving client IO. Paul Emmerich
02:39 PM Support #22664 (Closed): some random OSD are down (with a Abort signal on exception) after replac...
Hello,
currently we are facing a strange behavior where some OSDs randomly go down with an Abort signal,...
Enrico Labedzki
12:57 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Recovery from non-starting OSDs in this case is as follows. Run the OSD with debug:... Zdenek Janda
10:55 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Also, several OSDs (as you can see in the ceph osd tree output) are getting dumped out of the crush map. After putting th... Michal Cila
10:44 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
More info on affected PG... Zdenek Janda
10:39 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
I have succeeded in identifying faulty PG:... Zdenek Janda
10:17 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
Adding the last 10000 lines of strace of the OSD affected by the bug.
The ABRT signal is generated right after ...
Zdenek Janda
09:45 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
also adding our current ceph -s/ceph osd tree state:... Josef Zelenka
09:44 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
We are also affected by this bug. We are running luminous 12.2.2 on ubuntu 16.04, a 3-node cluster, 8 HDDs per node, bl... Josef Zelenka
10:30 AM Bug #22662 (Resolved): ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Hi,
we have a monitoring script which parses the 'ceph osd df -f json' output, but from time to time it will happe...
Enrico Labedzki
08:36 AM Bug #22661 (Triaged): Segmentation fault occurs when the following CLI is executed
Observation:
--------------
It is observed that when a user executes the CLI without providing the value of osd-u...
Debashis Mondal
07:34 AM Bug #22659 (In Progress): During the cache tiering configuration, ceph-mon daemon getting crashed...
Observation:
--------------
Before setting the value of "hit_set_count" Ceph health was OK but after configuring th...
Debashis Mondal
02:54 AM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
The OSD should reply -ENOENT in that case; this should be an OSD bug. Zheng Yan

01/10/2018

11:38 PM Bug #22351: Couldn't init storage provider (RADOS)
Related to the "ERROR: failed to initialize watch: (34) Numerical result out of range", it looks like a class path issue. Th... Javier M. Mellid
11:38 PM Backport #22658 (In Progress): filestore: randomize split threshold
Josh Durgin
10:39 PM Backport #22658 (Resolved): filestore: randomize split threshold
https://github.com/ceph/ceph/pull/19906 Josh Durgin
10:16 PM Feature #15835 (Pending Backport): filestore: randomize split threshold
Josh Durgin
10:03 PM Support #22531: OSD flapping under repair/scrub after recieve inconsistent PG LFNIndex.cc: 439: F...
Can you provide a directory listing for pg 1.f? It seems a file that does not obey the internal naming rules of files... Josh Durgin
09:48 PM Bug #22561: PG stuck during recovery, requires OSD restart
Was OSD 32 running at the time? It sounds like correct behavior if OSD 32 was not reachable. It might have been marke... Josh Durgin
09:44 PM Support #22566: Some osd remain 100% CPU after upgrade jewel => luminous (v12.2.2) and some work
This is likely the one-time startup cost of accounting for a bug in omap, where the osd has to scan the whole omap ... Josh Durgin
09:39 PM Bug #22597: "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgrade test
IIRC we didn't have the ceph user in hammer - need to account for that in the suite if we want to keep running it at ... Josh Durgin
09:36 PM Bug #22641 (Resolved): uninit condition in PrimaryLogPG::process_copy_chunk_manifest
Josh Durgin
09:22 PM Bug #22641: uninit condition in PrimaryLogPG::process_copy_chunk_manifest
myoungwon oh wrote:
> https://github.com/ceph/ceph/pull/19874
merged
Yuri Weinstein
09:22 PM Bug #22656 (New): scrub mismatch on bytes (cache pools)
... Sage Weil
09:21 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
/a/yuriw-2018-01-09_21:50:35-rados-wip-yuri2-testing-2018-01-09-1813-distro-basic-smithi/2050823
another one.
<...
Sage Weil
09:01 PM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
/a/yuriw-2018-01-09_21:50:35-rados-wip-yuri2-testing-2018-01-09-1813-distro-basic-smithi/2050802
Sage Weil
03:34 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
https://github.com/ceph/ceph/pull/19759 Kefu Chai
03:33 PM Bug #22539 (Pending Backport): bluestore: New OSD - Caught signal - bstore_kv_sync
Kefu Chai
02:56 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
That would be an fs bug, sure.
However, shouldn't the OSD avoid asserting when an object doesn't exist?
Patrick Donnelly
02:48 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
I think the problem here is that the object doesn't exist but we're doing omap_setkeys on it.. which doesn't implicit... Sage Weil
08:57 AM Bug #22438 (Fix Under Review): mon: leak in lttng dlopen / __tracepoints__init
https://github.com/ceph/teuthology/pull/1143 Kefu Chai
08:16 AM Bug #22525 (Fix Under Review): auth: ceph auth add does not sanity-check caps
Jos Collin

01/09/2018

10:39 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Actually, I may have seen an instance of the failure in a run that did not include 17815, so please don't take what I... Nathan Cutler
05:49 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
Not 100% sure if that's the same issue but we have a customer who faces an assert in SnapMapper::get_snaps()
2018-01...
Igor Fedotov
04:02 PM Bug #22641: uninit condition in PrimaryLogPG::process_copy_chunk_manifest
https://github.com/ceph/ceph/pull/19874 Myoungwon Oh
02:43 PM Bug #22641 (Resolved): uninit condition in PrimaryLogPG::process_copy_chunk_manifest
... Sage Weil
03:54 PM Bug #22278: FreeBSD fails to build with WITH_SPDK=ON
Patch merged in DPDK. Waiting for SPDK to pick up the latest DPDK. Kefu Chai
03:49 PM Support #22520 (Closed): nearfull threshold is not cleared when osd really is not nearfull.
You need to change this in the osd map, not the config. "ceph osd set-nearfull-ratio" or something similar. Greg Farnum
02:59 PM Bug #22409 (Resolved): ceph_objectstore_tool: no flush before collection_empty() calls; ObjectSto...
Kefu Chai
01:52 AM Bug #22351: Couldn't init storage provider (RADOS)
Orit Wasserman wrote:
> what is your pool configuration?
all default, just a default pool 'rbd'.
Amine Liu

01/08/2018

11:54 PM Bug #22624 (Duplicate): filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No suc...
... Patrick Donnelly
12:35 PM Bug #22409 (Fix Under Review): ceph_objectstore_tool: no flush before collection_empty() calls; O...
Igor Fedotov
12:35 PM Bug #22409: ceph_objectstore_tool: no flush before collection_empty() calls; ObjectStore/StoreTes...
https://github.com/ceph/ceph/pull/19764 Igor Fedotov
08:21 AM Bug #22409: ceph_objectstore_tool: no flush before collection_empty() calls; ObjectStore/StoreTes...
Sage, I am taking this ticket from you, as it's simple enough and it won't cause too much duplication of effort.
...
Kefu Chai
07:22 AM Bug #22415 (Duplicate): 'pg dump' fails after mon rebuild
Kefu Chai

01/06/2018

01:29 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
For DTS this should be fixed in the 7.1 release. Brad Hubbard
12:35 AM Bug #20439: PG never finishes getting created
Same thing in http://pulpito.ceph.com/yuriw-2018-01-04_20:43:14-rados-wip-yuri4-testing-2018-01-04-1750-distro-basic-... Josh Durgin

01/05/2018

03:57 PM Bug #22597 (Resolved): "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgrade test
http://pulpito.ceph.com/kchai-2018-01-05_15:34:38-upgrade-wip-kefu-testing-2018-01-04-1836-distro-basic-mira/
<pre...
Kefu Chai
09:51 AM Bug #22525: auth: ceph auth add does not sanity-check caps
-https://github.com/ceph/ceph/pull/19794- Jing Li

01/04/2018

07:13 PM Bug #22351 (Need More Info): Couldn't init storage provider (RADOS)
what is your pool configuration? Orit Wasserman
02:42 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
So my OSDs had the default Bluestore layout the first time around, i.e. a 100MB DB/WAL (xfs) partition followed by th... Jon Heese
07:06 AM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
Jon Heese wrote:
> Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As...
Hua Liu
02:35 PM Bug #22266 (Pending Backport): mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
Kefu Chai
02:32 PM Bug #22266 (Resolved): mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
Sage Weil
01:23 PM Bug #22266 (Fix Under Review): mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
http://tracker.ceph.com/issues/22266 Kefu Chai
01:52 PM Support #22566 (New): Some osd remain 100% CPU after upgrade jewel => luminous (v12.2.2) and some...
I have some OSDs that remain at 100% CPU at startup without any debug info in the logs:... David Casier
07:12 AM Support #22422: Block fsid does not match our fsid
See http://tracker.ceph.com/issues/22354 Hua Liu
01:07 AM Bug #22561 (New): PG stuck during recovery, requires OSD restart
We are sometimes encountering issues with PGs getting stuck in recovery.
For example, we ran some stress tests wit...
Paul Emmerich
 
