Activity
From 12/25/2017 to 01/23/2018
01/23/2018
- 11:57 PM Bug #21566 (Resolved): OSDService::recovery_need_sleep read+updated without locking
- 11:57 PM Backport #21697 (Resolved): luminous: OSDService::recovery_need_sleep read+updated without locking
- 11:06 PM Backport #21697: luminous: OSDService::recovery_need_sleep read+updated without locking
- Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18753
merged
- 11:56 PM Backport #21785 (Resolved): luminous: OSDMap cache assert on shutdown
- 11:07 PM Backport #21785: luminous: OSDMap cache assert on shutdown
- Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18749
merged
- 11:55 PM Bug #21845 (Resolved): Objecter::_send_op unnecessarily constructs costly hobject_t
- 11:55 PM Backport #21921 (Resolved): luminous: Objecter::_send_op unnecessarily constructs costly hobject_t
- 11:09 PM Backport #21921: luminous: Objecter::_send_op unnecessarily constructs costly hobject_t
- Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18745
merged
- 11:54 PM Backport #21922 (Resolved): luminous: Objecter::C_ObjectOperation_sparse_read throws/catches exce...
- 11:10 PM Backport #21922: luminous: Objecter::C_ObjectOperation_sparse_read throws/catches exceptions on -...
- Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18744
merged
- 11:25 PM Bug #21818 (Resolved): ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic/1 (filestore) ...
- 11:25 PM Backport #21924 (Resolved): luminous: ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic...
- 11:10 PM Backport #21924: luminous: ceph_test_objectstore fails ObjectStore/StoreTest.Synthetic/1 (filesto...
- Shinobu Kinjo wrote:
> https://github.com/ceph/ceph/pull/18742
merged
- 08:30 PM Backport #22423 (Closed): luminous: osd: initial minimal efforts to clean up PG interface
- I was able to cleanly backport http://tracker.ceph.com/issues/22069 without this large change.
- 11:01 AM Bug #22351: Couldn't init storage provider (RADOS)
- No, I set it to Luminous based on the request by theanalyst in https://github.com/ceph/ceph/pull/20023. I'm fine with...
- 10:24 AM Bug #22351: Couldn't init storage provider (RADOS)
- @Brad Assigning to you and leaving the backport field on "luminous" (but feel free to zero it out if it's enough to m...
- 10:14 AM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- @David I can only guess that this is not reproducible in master and that's why it requires a luminous-only fix. Could...
- 10:01 AM Backport #22761 (In Progress): luminous: osd checks out-of-date osdmap for DESTROYED flag on start
- 09:40 AM Backport #22761 (Resolved): luminous: osd checks out-of-date osdmap for DESTROYED flag on start
- https://github.com/ceph/ceph/pull/20068
- 07:48 AM Bug #22673 (Pending Backport): osd checks out-of-date osdmap for DESTROYED flag on start
- 06:38 AM Bug #22727: "osd pool stats" shows recovery information bugly
- need to backport it to jewel and luminous. but it at least dates back to 9.2.0. see also http://lists.ceph.com/piperm...
- 06:32 AM Bug #22727 (Fix Under Review): "osd pool stats" shows recovery information bugly
01/22/2018
- 11:50 PM Bug #22419 (Pending Backport): Pool Compression type option doesn't apply to new OSD's
- 08:12 AM Bug #22419 (Fix Under Review): Pool Compression type option doesn't apply to new OSD's
- https://github.com/ceph/ceph/pull/20044
- 11:46 PM Bug #22711 (Resolved): qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect...
- 12:53 PM Bug #22711 (Fix Under Review): qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands:...
- https://github.com/ceph/ceph/pull/20046
- 11:06 AM Bug #22711: qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test...
- the weirdness of this issue is that some PGs are mapped to a single OSD:...
- 03:13 AM Bug #22711: qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test...
- the curr_object_copies_rate value in PGMap.cc dump_object_stat_sum is .5, which is counteracting the 2x replication f...
- 07:04 PM Bug #22752: snapmapper inconsistency, crash on luminous
- https://github.com/ceph/ceph/pull/20040
- 07:03 PM Bug #22752 (Resolved): snapmapper inconsistency, crash on luminous
- from Stefan Priebe on ceph-devel ML:...
- 06:47 PM Backport #22387 (In Progress): luminous: PG stuck in recovery_unfound
Included with another dependent backport as https://github.com/ceph/ceph/pull/20055
- 12:40 PM Backport #22387 (Need More Info): luminous: PG stuck in recovery_unfound
- Non-trivial backport
- 02:27 PM Feature #22750 (Fix Under Review): libradosstriper conditional compile
- -https://github.com/ceph/ceph/pull/18197-
- 01:21 PM Feature #22750 (Resolved): libradosstriper conditional compile
- Currently libradosstriper is a hard dependency of the rados CLI tool.
Please add a "WITH_LIBRADOSSTRIPER" compile-...
- 02:16 PM Bug #22746 (Fix Under Review): osd/common: ceph-osd process is terminated by the logratote task
- 11:51 AM Bug #22746 (Resolved): osd/common: ceph-osd process is terminated by the logratote task
- 1. Construct the scene:
(1) step 1:
Open the terminal_1, and
Prepare the cmd: "killall -q -1 ceph-mon ...
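(For context on the command above: "-1" is SIGHUP, which is what the stock ceph logrotate postrotate script sends to the running daemons, roughly as shown below; the exact daemon list varies by release, so treat this as an approximation, not the exact line used here.)
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw || true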
- 12:59 PM Support #22749 (Closed): dmClock OP classification
- Why does dmClock algorithm in CEPH attribute recovery's read and write OP to osd_op_queue_mclock_osd_sub, so that whe...
- 12:41 PM Backport #22724 (Need More Info): luminous: miscounting degraded objects
- 12:41 PM Backport #22724: luminous: miscounting degraded objects
- David, while you're doing this one, can you include https://tracker.ceph.com/issues/22387 as well?
- 12:23 PM Support #22680 (Resolved): mons segmentation faults New 12.2.2 cluster
- 03:04 AM Bug #22715 (Pending Backport): log entries weirdly zeroed out after 'osd pg-temp' command
- 03:04 AM Backport #22744 (In Progress): luminous: log entries weirdly zeroed out after 'osd pg-temp' command
- https://github.com/ceph/ceph/pull/20042
- 03:03 AM Backport #22744 (Resolved): luminous: log entries weirdly zeroed out after 'osd pg-temp' command
- https://github.com/ceph/ceph/pull/20042
01/21/2018
- 08:29 PM Bug #22715 (Resolved): log entries weirdly zeroed out after 'osd pg-temp' command
- 06:56 PM Bug #22743 (New): "RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-sm...
- Run: http://pulpito.ceph.com/teuthology-2018-01-19_01:15:02-upgrade:hammer-x-jewel-distro-basic-smithi/
Job: 2088826...
01/20/2018
- 11:18 PM Bug #22351 (In Progress): Couldn't init storage provider (RADOS)
- Reopening this and reassigning it to RADOS as there are a couple of changes we can make to logging to make this easie...
01/19/2018
- 04:16 PM Support #20108: PGs are not remapped correctly when one host fails
- Hi,
Thank you for your answer!
I've seen that page before, but which tunable are you suggesting for the problem...
- 09:59 AM Bug #22233 (Fix Under Review): prime_pg_temp breaks on uncreated pgs
- https://github.com/ceph/ceph/pull/20025
- 09:08 AM Bug #22711: qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect_false test...
- ...
- 02:51 AM Support #22553: ceph-object-tool can not remove metadata pool's object
- It is not something wrong with the disk; it can be reproduced.
01/18/2018
- 10:57 PM Support #20108: PGs are not remapped correctly when one host fails
- http://docs.ceph.com/docs/master/rados/operations/crush-map/?highlight=tunables#tunables
- 07:02 PM Bug #22351 (Closed): Couldn't init storage provider (RADOS)
- 10:47 AM Bug #22351: Couldn't init storage provider (RADOS)
- Brad Hubbard wrote:
>
> (6*1024)*3 = 18432, thus 18432/47 ~ 392 PGs per OSD. You omitted the size of the pools.
...
- 03:21 AM Bug #22351: Couldn't init storage provider (RADOS)
- https://ceph.com/pgcalc/ should be used as a guide/starting point.
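(For reference, the per-OSD figure quoted above follows from plain shell arithmetic, reading 6*1024 as the total pg_num across pools and 3 as the replica size, exactly as in the comment:)
$ echo $(( 6 * 1024 * 3 / 47 ))
392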
- 03:07 PM Bug #22727: "osd pool stats" shows recovery information bugly
- https://github.com/ceph/ceph/pull/20009
- 05:18 AM Bug #22727 (In Progress): "osd pool stats" shows recovery information bugly
- 03:16 AM Bug #22727 (Resolved): "osd pool stats" shows recovery information bugly
- ...
- 03:51 AM Bug #22715 (Fix Under Review): log entries weirdly zeroed out after 'osd pg-temp' command
- https://github.com/ceph/ceph/pull/19998
01/17/2018
- 10:28 PM Bug #22351: Couldn't init storage provider (RADOS)
- Nikos Kormpakis wrote:
> But I still cannot understand why I'm hitting this error.
> Regarding my cluster, I have t...
- 01:15 PM Bug #22351: Couldn't init storage provider (RADOS)
- Brad Hubbard wrote:
> I'm able to reproduce something like what you are seeing, the messages are a little different....
- 03:30 AM Bug #22351: Couldn't init storage provider (RADOS)
- I'm able to reproduce something like what you are seeing, the messages are a little different.
What I see is this....
- 12:12 AM Bug #22351: Couldn't init storage provider (RADOS)
- It turns out what we need is the hexadecimal int representation of '-34' from the ltrace output.
$ c++filt </tmp/l...
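(A rough sketch of that kind of hunt; the invocation and paths below are placeholders, not the exact commands used here:)
$ ltrace -C -f -o /path/to/ltrace.out radosgw ...   # -C demangles C++ symbols, -f follows forked children
$ grep -n 0xffffffde /path/to/ltrace.out            # 0xffffffde is -34 (ERANGE) as a 32-bit value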
- 10:26 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- Ryan Anstey wrote:
> I'm working on fixing all my inconsistent pgs but I'm having issues with rados get... hopefully...
- 09:07 PM Bug #22656: scrub mismatch on bytes (cache pools)
- /a/sage-2018-01-17_14:40:55-rados-wip-sage-testing-2018-01-16-2156-distro-basic-smithi/2082959
description: rados/...
- 07:54 PM Bug #21833: Multiple asserts caused by DNE pgs left behind after lots of OSD restarts
- 07:48 PM Bug #20059: miscounting degraded objects
- https://github.com/ceph/ceph/pull/19850
- 07:36 PM Bug #21387 (Can't reproduce): mark_unfound_lost hangs
- Multiple fixes to mark_all_unfound_lost() has fixed this. Possibly the most important master branch commit is 689bff...
- 06:00 PM Bug #22668 (Fix Under Review): osd/ExtentCache.h: 371: FAILED assert(tid == 0)
- https://github.com/ceph/ceph/pull/19989
- 05:10 PM Backport #22724 (Resolved): luminous: miscounting degraded objects
- on bigbang,...
- 04:39 PM Bug #22673 (Fix Under Review): osd checks out-of-date osdmap for DESTROYED flag on start
- note: you can work around this by waiting a bit until some osd maps trim from the monitor.
https://github.com/ceph...
- 02:54 PM Bug #22673: osd checks out-of-date osdmap for DESTROYED flag on start
- It looks like the _preboot destroyed check should go after we catch up on maps.
- 02:53 PM Bug #22673: osd checks out-of-date osdmap for DESTROYED flag on start
- This is a real bug, should be straightforward to fix. Thanks for the report!
- 02:59 PM Bug #22544: objecter cannot resend split-dropped op when racing with con reset
- Hmm, I'm not sure what the best fix is. Do you see a good path to fixing this with ms_handle_connect()?
- 02:57 PM Bug #22659 (In Progress): During the cache tiering configuration ,ceph-mon daemon getting crashed...
- This will need to be backported to luminous and jewel once merged.
- 09:36 AM Bug #22659: During the cache tiering configuration ,ceph-mon daemon getting crashed after setting...
- https://github.com/ceph/ceph/pull/19983
- 02:55 PM Bug #22662: ceph osd df json output validation reported invalid numbers (-nan) (jewel)
- 1. it's not valid json.. Formatter shouldn't allow it
2. we should have a valid value (or 0) to use
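(To illustrate the first point, a strict JSON parser rejects the -nan token outright; python is used below purely as a convenient parser and the field name is made up:)
$ echo '{"utilization": -nan}' | python -c 'import json,sys; json.load(sys.stdin)'
# fails with an "Expecting value" error, because -nan is not a valid JSON token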
- 02:52 PM Bug #22661 (Triaged): Segmentation fault occurs when the following CLI is executed
- 02:51 PM Bug #22672 (Triaged): OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty ...
- 02:28 PM Bug #22597 (Fix Under Review): "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgra...
- https://github.com/ceph/ceph/pull/19987
- 01:32 PM Bug #22233 (In Progress): prime_pg_temp breaks on uncreated pgs
- 11:24 AM Support #22664: some random OSD are down (with a Abort signal on exception) after replace/rebuild...
- Hi Greg,
can you point me to the link, as far we have seen yet, all ulimit 10 times higher as needed on all nodes....
01/16/2018
- 09:49 PM Bug #22715 (Resolved): log entries weirdly zeroed out after 'osd pg-temp' command
- ...
- 07:59 PM Bug #20059 (Pending Backport): miscounting degraded objects
- 07:10 PM Bug #22711 (Resolved): qa/workunits/cephtool/test.sh fails with test_mon_cephdf_commands: expect...
- ...
- 07:09 PM Bug #22677 (Resolved): rados/test_rados_tool.sh failure
- 04:16 PM Bug #22351: Couldn't init storage provider (RADOS)
- Hello,
we're facing the same issue on a Luminous cluster.
Some info about the cluster:
Version: ceph version 1...
- 03:08 PM Bug #20874: osd/PGLog.h: 1386: FAILED assert(miter == missing.get_items().end() || (miter->second...
- /a/sage-2018-01-16_03:08:54-rados-wip-sage2-testing-2018-01-15-1257-distro-basic-smithi/2077982...
- 01:33 PM Backport #22707 (In Progress): luminous: ceph_objectstore_tool: no flush before collection_empty(...
- 01:30 PM Backport #22707 (Resolved): luminous: ceph_objectstore_tool: no flush before collection_empty() c...
- https://github.com/ceph/ceph/pull/19967
- 01:21 PM Bug #22409 (Pending Backport): ceph_objectstore_tool: no flush before collection_empty() calls; O...
- 12:53 PM Support #20108: PGs are not remapped correctly when one host fails
- Hello,
I'm sorry I've missed your message. Can you please give me some clues about the "newer crush tunables" that... - 12:48 PM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
- /a/sage-2018-01-15_18:49:16-rados-wip-sage-testing-2018-01-14-1341-distro-basic-smithi/2076047
- 12:48 PM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
- /a/sage-2018-01-15_18:49:16-rados-wip-sage-testing-2018-01-14-1341-distro-basic-smithi/2075822
- 11:10 AM Support #22680: mons segmentation faults New 12.2.2 cluster
- Thanks! We had jemalloc in LD_PRELOAD since Infernalis, so i didn't think about that. I removed this from sysconfig, ...
01/15/2018
- 07:26 PM Feature #22442: ceph daemon mon.id mon_status -> ceph daemon mon.id status
- Joao, did mon_status just precede the other status commands, or was there a reason for them to be different?
- 07:22 PM Bug #22486: ceph shows wrong MAX AVAIL with hybrid (chooseleaf firstn 1, chooseleaf firstn -1) CR...
- Well, the hybrid ruleset isn't giving you as much host isolation as you're probably thinking, since it can select an ...
- 07:11 PM Support #22664 (Closed): some random OSD are down (with a Abort signal on exception) after replac...
- It's failing to create a new thread. You probably need to bump the ulimit; this is discussed in the documentation. :)
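(A sketch of what to check and raise; the paths and limit values below are examples, not taken from this report:)
$ grep -E 'processes|open files' /proc/$(pidof -s ceph-osd)/limits   # limits of a running osd
$ ulimit -u                                                          # max user processes for the current shell
# raise it persistently for the ceph user, e.g. in /etc/security/limits.d/ceph.conf:
#   ceph soft nproc 1048576
#   ceph hard nproc 1048576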
- 07:08 PM Support #22680: mons segmentation faults New 12.2.2 cluster
- This is buried in the depths of RocksDB doing IO, so the only causes I know of/can think of are
1) you've found an u...
- 10:39 AM Support #22680 (Resolved): mons segmentation faults New 12.2.2 cluster
Hi all,
I installed a new Luminous 12.2.2 cluster. The monitors were up at first, but quickly started failing, s...
- 05:48 PM Backport #22387: luminous: PG stuck in recovery_unfound
- Include commit 64047e1 "osd: Don't start recovery for missing until active pg state set" from https://github.com/ceph...
- 11:00 AM Support #22531: OSD flapping under repair/scrub after recieve inconsistent PG LFNIndex.cc: 439: F...
- Josh Durgin wrote:
> Can you provide a directory listing for pg 1.f? It seems a file that does not obey the internal...
- 06:12 AM Bug #22351: Couldn't init storage provider (RADOS)
- Brad Hubbard wrote:
> If this is a RADOS function returning ERANGE (34) then it should be possible to find it by att...
- 05:05 AM Bug #22351: Couldn't init storage provider (RADOS)
- If this is a RADOS function returning ERANGE (34) then it should be possible to find it by attempting to start the ra...
- 03:26 AM Bug #20059 (Fix Under Review): miscounting degraded objects
- 02:56 AM Bug #22668: osd/ExtentCache.h: 371: FAILED assert(tid == 0)
- /a//kchai-2018-01-11_06:11:31-rados-wip-kefu-testing-2018-01-11-1036-distro-basic-mira/2058373/remote/mira002/log/cep...
01/14/2018
- 10:46 PM Bug #22672: OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty clone_snap...
- To (relatively) stabilise the frequently crashing OSDs, we've added an early -ENOENT return to PrimaryLogPG::find_obj...
- 04:37 PM Bug #22677: rados/test_rados_tool.sh failure
- https://github.com/ceph/ceph/pull/19946
01/13/2018
01/12/2018
- 10:43 PM Bug #22438 (Resolved): mon: leak in lttng dlopen / __tracepoints__init
- 06:29 AM Bug #22438: mon: leak in lttng dlopen / __tracepoints__init
- https://github.com/ceph/teuthology/pull/1144
- 10:23 PM Bug #22672: OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty clone_snap...
- That looks like a good way to investigate. We've seen a few reports of issues with cache tier snapshots since that re...
- 02:54 PM Bug #22672: OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty clone_snap...
- to detect this case during scrub, I'm currently testing the following change:
-https://github.com/ddiss/ceph/commit/...
- 12:55 AM Bug #22672 (Triaged): OSDs frequently segfault in PrimaryLogPG::find_object_context() with empty ...
- Environment is a Luminous cache-tiered deployment with some of the hot-tier OSDs converted to bluestore. The remainin...
- 07:38 PM Bug #22063: "RadosModel.h: 1703: FAILED assert(!version || comp->get_version64() == version)" inr...
- Also in http://qa-proxy.ceph.com/teuthology/teuthology-2017-11-17_18:17:24-rados-jewel-distro-basic-smithi/1857527/te...
- 07:36 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
- Yuri Weinstein wrote:
> Also in http://qa-proxy.ceph.com/teuthology/teuthology-2017-11-17_18:17:24-rados-jewel-distr...
- 07:18 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
- As 17815 has to do with when scrub is allowed to start, it wouldn't be related to this bug.
- 01:03 PM Bug #22673 (Resolved): osd checks out-of-date osdmap for DESTROYED flag on start
- When trying an in-place migration of a filestore to bluestore OSD, we encountered a situation where ceph-osd would re...
- 07:45 AM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
- i am rerunning the failed test at http://pulpito.ceph.com/kchai-2018-01-12_07:44:06-multimds-wip-pdonnell-testing-201...
- 07:29 AM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
- i agree it's a bug in osd. but i don't think osd should return -ENOENT in this case. as Sage pointed out, it should c...
- 01:15 AM Bug #22351: Couldn't init storage provider (RADOS)
- Abhishek Lekshmanan wrote:
> can you tell us the ceph pg num and pgp num setting in ceph.conf (or rather paste teh c...
01/11/2018
- 09:43 PM Bug #22668 (Resolved): osd/ExtentCache.h: 371: FAILED assert(tid == 0)
- ...
- 06:52 PM Bug #22351: Couldn't init storage provider (RADOS)
- can you tell us the ceph pg num and pgp num setting in ceph.conf (or rather paste teh ceph.conf retracting sensitive ...
- 04:05 PM Bug #22561: PG stuck during recovery, requires OSD restart
- OSD 32 was running and actively serving client IO.
- 02:39 PM Support #22664 (Closed): some random OSD are down (with a Abort signal on exception) after replac...
- Hello,
currently we are facing a strange behavior, where some OSDs randomly go down with an Abort signal,...
- 12:57 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- Recovery from non starting OSDs in this case is as following. Run OSD with debug:...
- 10:55 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- Also several osds (as you can see the ceph osd tree output) are getting dumped out of the crush map. After putting th...
- 10:44 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- More info on affected PG...
- 10:39 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- I have succeeded in identifying faulty PG:...
- 10:17 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- Adding last 10000 lines of strace of OSD affected by the bug.
The ABRT signal is generated right after ...
- 09:45 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- also adding our current ceph -s/ceph osd tree state:...
- 09:44 AM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- we are also affected by this bug. we are running luminous 12.2.2 on ubuntu 16.04, 3 node cluster, 8 HDDs per node, bl...
- 10:30 AM Bug #22662 (Resolved): ceph osd df json output validation reported invalid numbers (-nan) (jewel)
- Hi,
we have a monitoring script which parses the 'ceph osd df -f json' output, but from time to time it will happe...
- 08:36 AM Bug #22661 (Triaged): Segmentation fault occurs when the following CLI is executed
- Observation:
--------------
It is observed that when a user executes the CLI without providing the value of osd-u...
- 07:34 AM Bug #22659 (In Progress): During the cache tiering configuration ,ceph-mon daemon getting crashed...
- Observation:
--------------
Before setting the value of "hit_set_count" Ceph health was OK but after configuring th...
- 02:54 AM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
- OSD should reply -ENOENT for that case. should be OSD bug
01/10/2018
- 11:38 PM Bug #22351: Couldn't init storage provider (RADOS)
- Related to the ERROR: failed to initialize watch: (34) Numerical result out of range, it looks a class path issue. Th...
- 11:38 PM Backport #22658 (In Progress): filestore: randomize split threshold
- 10:39 PM Backport #22658 (Resolved): filestore: randomize split threshold
- https://github.com/ceph/ceph/pull/19906
- 10:16 PM Feature #15835 (Pending Backport): filestore: randomize split threshold
- 10:03 PM Support #22531: OSD flapping under repair/scrub after recieve inconsistent PG LFNIndex.cc: 439: F...
Can you provide a directory listing for pg 1.f? It seems a file that does not obey the internal naming rules of files...
- 09:48 PM Bug #22561: PG stuck during recovery, requires OSD restart
- 09:48 PM Bug #22561: PG stuck during recovery, requires OSD restart
- Was OSD 32 running at the time? It sounds like correct behavior if OSD 32 was not reachable. It might have been marke...
- 09:44 PM Support #22566: Some osd remain 100% CPU after upgrade jewel => luminous (v12.2.2) and some work
- This is likely the one-time startup cost of accounting for a bug in omap, where the osd has to scan the whole omap ...
- 09:39 PM Bug #22597: "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgrade test
- IIRC we didn't have the ceph user in hammer - need to account for that in the suite if we want to keep running it at ...
- 09:36 PM Bug #22641 (Resolved): uninit condition in PrimaryLogPG::process_copy_chunk_manifest
- 09:22 PM Bug #22641: uninit condition in PrimaryLogPG::process_copy_chunk_manifest
- myoungwon oh wrote:
> https://github.com/ceph/ceph/pull/19874
merged
- 09:22 PM Bug #22656 (New): scrub mismatch on bytes (cache pools)
- ...
- 09:21 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
- /a/yuriw-2018-01-09_21:50:35-rados-wip-yuri2-testing-2018-01-09-1813-distro-basic-smithi/2050823
another one.
<...
- 09:01 PM Bug #20086: LibRadosLockECPP.LockSharedDurPP gets EEXIST
- /a/yuriw-2018-01-09_21:50:35-rados-wip-yuri2-testing-2018-01-09-1813-distro-basic-smithi/2050802
- 03:34 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- https://github.com/ceph/ceph/pull/19759
- 03:33 PM Bug #22539 (Pending Backport): bluestore: New OSD - Caught signal - bstore_kv_sync
- 02:56 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
- That would be an fs bug, sure.
However, shouldn't the OSD not assert due to an object not existing?
- 02:48 PM Bug #22624: filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No such file or di...
- I think the problem here is that the object doesn't exist but we're doing omap_setkeys on it.. which doesn't implicit...
- 08:57 AM Bug #22438 (Fix Under Review): mon: leak in lttng dlopen / __tracepoints__init
- https://github.com/ceph/teuthology/pull/1143
- 08:16 AM Bug #22525 (Fix Under Review): auth: ceph auth add does not sanity-check caps
01/09/2018
- 10:39 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
Actually, I may have seen an instance of the failure in a run that did not include 17815, so please don't take what I...
- 05:49 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
- 05:49 PM Bug #21557: osd.6 found snap mapper error on pg 2.0 oid 2:0e781f33:::smithi14431805-379 ... :187 ...
- Not 100% sure if that's the same issue but we have a customer who faces an assert in SnapMapper::get_snaps()
2018-01...
- 04:02 PM Bug #22641: uninit condition in PrimaryLogPG::process_copy_chunk_manifest
- https://github.com/ceph/ceph/pull/19874
- 02:43 PM Bug #22641 (Resolved): uninit condition in PrimaryLogPG::process_copy_chunk_manifest
- ...
- 03:54 PM Bug #22278: FreeBSD fails to build with WITH_SPDK=ON
- patch merged in DPDK. waiting for SPDK to pick up the latest DPDK.
- 03:49 PM Support #22520 (Closed): nearfull threshold is not cleared when osd really is not nearfull.
- You need to change this in the osd map, not the config. "ceph osd set-nearfull-ratio" or something similar.
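(For example; the ratio value below is illustrative only:)
$ ceph osd set-nearfull-ratio 0.90
$ ceph osd dump | grep ratio   # verify full_ratio / backfillfull_ratio / nearfull_ratio in the osdmap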
- 02:59 PM Bug #22409 (Resolved): ceph_objectstore_tool: no flush before collection_empty() calls; ObjectSto...
- 01:52 AM Bug #22351: Couldn't init storage provider (RADOS)
- Orit Wasserman wrote:
> what is your pool configuration?
all default, just a default pool 'rbd'.
01/08/2018
- 11:54 PM Bug #22624 (Duplicate): filestore: 3180: FAILED assert(0 == "unexpected error"): error (2) No suc...
- ...
- 12:35 PM Bug #22409 (Fix Under Review): ceph_objectstore_tool: no flush before collection_empty() calls; O...
- 12:35 PM Bug #22409: ceph_objectstore_tool: no flush before collection_empty() calls; ObjectStore/StoreTes...
- https://github.com/ceph/ceph/pull/19764
- 08:21 AM Bug #22409: ceph_objectstore_tool: no flush before collection_empty() calls; ObjectStore/StoreTes...
- sage, i am taking this ticket from you. as it's simple enough and it won't cause too much duplication of efforts.
...
- 07:22 AM Bug #22415 (Duplicate): 'pg dump' fails after mon rebuild
01/06/2018
- 01:29 AM Bug #22220: osd/ReplicatedPG.h:1667:14: internal compiler error: in force_type_die, at dwarf2out....
- For DTS this should be fixed in the 7.1 release.
- 12:35 AM Bug #20439: PG never finishes getting created
- Same thing in http://pulpito.ceph.com/yuriw-2018-01-04_20:43:14-rados-wip-yuri4-testing-2018-01-04-1750-distro-basic-...
01/05/2018
- 03:57 PM Bug #22597 (Resolved): "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgrade test
- http://pulpito.ceph.com/kchai-2018-01-05_15:34:38-upgrade-wip-kefu-testing-2018-01-04-1836-distro-basic-mira/
<pre...
- 09:51 AM Bug #22525: auth: ceph auth add does not sanity-check caps
- -https://github.com/ceph/ceph/pull/19794-
01/04/2018
- 07:13 PM Bug #22351 (Need More Info): Couldn't init storage provider (RADOS)
- what is your pool configuration?
- 02:42 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
- So my OSDs had the default Bluestore layout the first time around, i.e. a 100MB DB/WAL (xfs) partition followed by th...
- 07:06 AM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
- Jon Heese wrote:
> Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As...
- 02:35 PM Bug #22266 (Pending Backport): mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
- 02:32 PM Bug #22266 (Resolved): mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
- 01:23 PM Bug #22266 (Fix Under Review): mgr/PyModuleRegistry.cc: 139: FAILED assert(map.epoch > 0)
- http://tracker.ceph.com/issues/22266
- 01:52 PM Support #22566 (New): Some osd remain 100% CPU after upgrade jewel => luminous (v12.2.2) and some...
- h1. I have some OSDs that remain at 100% startup without any debug info in the logs :...
- 07:12 AM Support #22422: Block fsid does not match our fsid
- See, [[http://tracker.ceph.com/issues/22354]]
- 01:07 AM Bug #22561 (New): PG stuck during recovery, requires OSD restart
- We are sometimes encountering issues with PGs getting stuck in recovery.
For example, we ran some stress tests wit...
01/03/2018
- 09:28 PM Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi
So Nathan seems to have narrowed it down to https://github.com/ceph/ceph/pull/17815 - can you look at this when you'r...
- 09:23 PM Support #22422: Block fsid does not match our fsid
- 09:23 PM Support #22422: Block fsid does not match our fsid
It looks like you may have had a partial prepare there in the past - if you're sure it's the right disk, wipe it with...
- 09:22 PM Bug #22438 (Resolved): mon: leak in lttng dlopen / __tracepoints__init
- 09:22 PM Bug #22438 (Resolved): mon: leak in lttng dlopen / __tracepoints__init
- 09:17 PM Support #22466 (Closed): PG failing to map to any OSDs
- 09:08 PM Support #22553: ceph-object-tool can not remove metadata pool's object
- Is there possibly something wrong with that disk?
- 03:28 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
- Jon Heese wrote:
> Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As...
- 01:41 AM Bug #22346 (Resolved): OSD_ORPHAN issues after jewel->luminous upgrade, but orphaned osds not in ...
- Not for me.
$ crushtool -d crushmap.bad -o crushmap.bad.txt
$ crushtool -d crushmap.good -o crushmap.good.txt
$ ...
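(A typical next step, not part of the truncated comment above, would be to diff the decompiled maps; the filenames follow from the crushtool commands:)
$ diff -u crushmap.good.txt crushmap.bad.txt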
01/02/2018
- 09:03 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Alright, that fixed it!
It also fixed the heavy IO issue as well as the rather large amount of consumption I was s...
- 06:20 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Sorry for the spam.
That broke it good!!!... - 06:15 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Was able to out them all:...
- 06:14 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- I can't mark the OSDs out....
- 03:42 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Hard to say exactly, but I would not be surprised to see any manner of odd behaviors with a huge map like that--we ha...
- 04:28 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
- 04:28 PM Bug #22354: v12.2.2 unable to create bluestore osd using ceph-disk
- Unfortunately, `ceph-disk zap /dev/sde` does not wipe enough of the disk to avoid this issue. As I mentioned above, ...
- 01:01 PM Support #22553 (New): ceph-object-tool can not remove metadata pool's object
- i put an object to the rbd pool
rados -p rbd put qinli.sh
then stop osd and remove it
[root@lab71 ~]# ceph-objec...
12/31/2017
- 11:13 PM Bug #21388: inconsistent pg but repair does nothing reporting head data_digest != data_digest fro...
- I'm working on fixing all my inconsistent pgs but I'm having issues with rados get... hopefully I'm just doing the co...
12/30/2017
- 02:30 AM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- I had no idea the ID would impact the map calculations that way (makes sense now)!!! Very good to know! And those I...
12/29/2017
- 10:34 PM Bug #22539 (In Progress): bluestore: New OSD - Caught signal - bstore_kv_sync
- Brian, note that one reason why this triggered is that your osdmap is huge... because you have some osds with very la...
12/28/2017
- 11:02 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- I'm a bit lost hence trying to re-arrange things:
Let's handle the crash first.
IMO it's caused by throttle value...
12/27/2017
- 04:46 AM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- A chunk from the mon log:
https://pastebin.com/MA1BStEc
Some screenshots of the IO:
https://imgur.com/a/BOKWc
...
- 04:29 AM Bug #22544 (Resolved): objecter cannot resend split-dropped op when racing with con reset
- @
if (split && con && con->has_features(CEPH_FEATUREMASK_RESEND_ON_SPLIT)) {
return RECALC_OP_TARGET_NEED_RES...
12/26/2017
- 11:03 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- UI Lag seems to be related to heavy load to the OS SSD from the monitor services. The monitor service does a lot of I...
- 10:51 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Edit, UI is lagging again. But its odd. SOME things lag, but GLXGears isn't. IO blocking of some sort? Adding mor...
- 10:47 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
Confirmed the line was there. Added the extra debug line, but this time when I started it, it came right online (almo...
- 09:18 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Given the object names in action it looks like that's osd map update or something that triggers the issue. Not the us...
- 06:41 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Additional note, there is no data on the cluster other than the built in pools. So there is very little information ...
- 05:46 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- This will make only the fourth OSD in the cluster. Would that impact the overflowed value? What can I do to capture...
- 01:22 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- As a workaround one can try to set (temporarily until initial rebuild completes?) bluestore_throttle_bytes = 0 at the...
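(A sketch of that workaround; the [osd] section placement and the admin-socket injection below are assumptions, and the setting should be reverted once the rebuild completes:)
# in ceph.conf on the affected node
[osd]
bluestore_throttle_bytes = 0
# or at runtime via the admin socket
$ ceph daemon osd.<id> config set bluestore_throttle_bytes 0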
- 01:07 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- 32-bit value in throttle_bytes is overflowed - see:
2017-12-25 13:18:06.783304 7f37a7a2a700 10 bluestore(/var/lib/ce...
12/25/2017
- 09:27 PM Bug #22539: bluestore: New OSD - Caught signal - bstore_kv_sync
- Added to ceph.conf:
debug bluestore = 20
debug osd = 20
Waited for crash, captured log, but it's too large even c...
- 08:46 PM Bug #22539 (Resolved): bluestore: New OSD - Caught signal - bstore_kv_sync
- After rebuilding a demo cluster, OSD on one node can no longer be created.
Looking through the log I see this error...
- 03:24 AM Support #22466: PG failing to map to any OSDs
- When the osds outside of the default root are deleted, the problem is solved.