Activity

From 11/21/2018 to 12/20/2018

12/20/2018

11:08 PM Bug #37714: test_dump_pgstate_history: Can't find expected values in history object, failing
This one is pretty reproducible on master with the filter 'all/admin_socket_output.yaml rados.yaml supported-random-d... Neha Ojha
07:28 PM Bug #37264 (In Progress): scrub warning check incorrectly uses mon scrub interval
David Zafman
05:09 PM Bug #23145: OSD crashes during recovery of EC pg
I've seen this on 12.2.5 and 12.2.10. Unfortunately, I can't offer any further log files :/
Just to confirm that t...
Paul Emmerich
04:12 PM Bug #37511 (Fix Under Review): merge target placeholder may get wrong PastIntervals from source
https://github.com/ceph/ceph/pull/25652 Sage Weil
11:47 AM Backport #37688 (Need More Info): mimic: Command failed on smithi191 with status 1: '\n sudo yum ...
Nathan Cutler
11:47 AM Backport #37687 (Need More Info): luminous: Command failed on smithi191 with status 1: '\n sudo y...
Nathan Cutler
10:50 AM Bug #37720: Ceph-osd is halt when enable SPDK
Please review the correction https://github.com/ceph/ceph/pull/25646 Anonymous
10:11 AM Bug #37720: Ceph-osd is halt when enable SPDK
I'm working on the issue. Anonymous
10:11 AM Bug #37720 (Resolved): Ceph-osd is halt when enable SPDK
When setting up a development Ceph cluster with SPDK enabled, ceph-osd was observed to halt on the aarch64 platform and assert on x86 p... Anonymous
05:47 AM Bug #36405: unittest_seastar_messenger failure on ARM
from dmesg:... Kefu Chai
01:20 AM Bug #37718 (Rejected): ceph-osdomap-tool crashes

Rebuilding the binary fixed the problem. It looked like a library incompatibility because safe_to_start_threads sh...
David Zafman
12:38 AM Bug #37718 (Rejected): ceph-osdomap-tool crashes

$ ../qa/run-standalone.sh "osd-scrub-snaps.sh TEST_scrub_snaps"
...
../qa/standalone/scrub/osd-scrub-snaps.sh:100...
David Zafman

12/19/2018

10:53 PM Bug #37705 (Closed): list-inconsistent-pg fails with EINVAL
Neha Ojha
10:07 PM Bug #37583 (Resolved): mix luminous + master mons break ceph cli
Greg Farnum
06:18 PM Backport #37688: mimic: Command failed on smithi191 with status 1: '\n sudo yum -y install ceph-r...
Let's hold off on this backport, since the original issue has not been resolved yet. Neha Ojha
06:18 PM Backport #37687: luminous: Command failed on smithi191 with status 1: '\n sudo yum -y install cep...
Let's hold off on this backport, since the original issue has not been resolved yet. Neha Ojha
06:06 PM Bug #37654: FAILED ceph_assert(info.history.same_interval_since != 0) in PG::start_peering_interv...
/a/nojha-2018-12-19_01:41:09-rados-master-distro-basic-smithi/3375485/ Neha Ojha
06:03 PM Bug #37716 (New): failed to recover before timeout expired due to pgs going into backfill_toofull
... Neha Ojha
05:38 PM Bug #37673 (Won't Fix): latency between "initiated" and "queued_for_pg"
After testing locally, I think this is expected behavior.
test settings:
* docker container osd-host.alpha, 172...
Kefu Chai
05:36 PM Bug #37714 (Resolved): test_dump_pgstate_history: Can't find expected values in history object, f...
... Neha Ojha
11:35 AM Bug #37706 (Fix Under Review): list-inconsistent-pg fails with EINVAL
https://github.com/ceph/ceph/pull/25632 Yehuda Sadeh
02:55 AM Bug #37706: list-inconsistent-pg fails with EINVAL

I was suspicious of https://github.com/ceph/ceph/pull/23298, so I reverted all 74 commits and the problem wouldn't ...
David Zafman
05:52 AM Bug #20874: osd/PGLog.h: 1386: FAILED assert(miter == missing.get_items().end() || (miter->second...
Hit this again:
http://pulpito.ceph.com/xxg-2018-12-19_01:25:39-rados:thrash-wip-no-upmap-for-merge-distro-basic-s...
xie xingguo

12/18/2018

07:10 PM Bug #37706: list-inconsistent-pg fails with EINVAL
... David Zafman
07:04 PM Bug #37706 (Resolved): list-inconsistent-pg fails with EINVAL

Seen in run-standalone.sh runs of osd-scrub-snaps.sh and osd-scrub-repair.sh:...
David Zafman
06:56 PM Bug #37705 (Closed): list-inconsistent-pg fails with EINVAL
David Zafman
11:26 AM Backport #37698 (In Progress): mimic: osd_memory_target: failed assert when options mismatch
Nathan Cutler
11:10 AM Backport #37698 (Resolved): mimic: osd_memory_target: failed assert when options mismatch
https://github.com/ceph/ceph/pull/25605 Nathan Cutler
11:25 AM Backport #37697 (In Progress): luminous: osd_memory_target: failed assert when options mismatch
Nathan Cutler
11:10 AM Backport #37697 (Resolved): luminous: osd_memory_target: failed assert when options mismatch
https://github.com/ceph/ceph/pull/25604 Nathan Cutler
11:23 AM Backport #37686 (In Progress): mimic: list-inconsistent-obj output truncated, causing osd-scrub-r...
Nathan Cutler
11:09 AM Backport #37686 (Resolved): mimic: list-inconsistent-obj output truncated, causing osd-scrub-repa...
https://github.com/ceph/ceph/pull/25603 Nathan Cutler
11:09 AM Backport #37690 (Resolved): luminous: ceph-objectstore-tool: Add HashInfo to object dump output
https://github.com/ceph/ceph/pull/25722 Nathan Cutler
11:09 AM Backport #37689 (Resolved): mimic: ceph-objectstore-tool: Add HashInfo to object dump output
https://github.com/ceph/ceph/pull/25721 Nathan Cutler
11:09 AM Backport #37688 (Resolved): mimic: Command failed on smithi191 with status 1: '\n sudo yum -y ins...
https://github.com/ceph/ceph/pull/26201 Nathan Cutler
11:09 AM Backport #37687 (Rejected): luminous: Command failed on smithi191 with status 1: '\n sudo yum -y ...
Nathan Cutler
03:58 AM Bug #37511: merge target placeholder may get wrong PastIntervals from source
/a/sage-2018-12-17_17:34:16-rados-wip-sage2-testing-2018-12-17-0911-distro-basic-smithi/3372061 Sage Weil
02:01 AM Bug #37679 (Fix Under Review): osd: pull object from the shard who missing it
FAILED assert(get_parent()->get_log().get_log().objects.count(soid) && (get_parent()->get_log().get_log().objects.fin... Zengran Zhang
12:39 AM Bug #37507 (Pending Backport): osd_memory_target: failed assert when options mismatch
xie xingguo

12/17/2018

10:28 PM Feature #37597 (Pending Backport): ceph-objectstore-tool: Add HashInfo to object dump output
David Zafman
10:27 PM Bug #37653 (Pending Backport): list-inconsistent-obj output truncated, causing osd-scrub-repair.s...
https://github.com/ceph/ceph/pull/25548 David Zafman
09:34 PM Bug #37656: FileStore::_do_transaction() crashed with error 17 (merge collection vs osd restart)
/a/dzafman-2018-12-14_11:02:20-rados-wip-zafman-testing-distro-basic-smithi/3362534 Josh Durgin
07:54 PM Bug #36497: FAILED ceph_assert(can_write == WriteStatus::NOWRITE) in ProtocolV1::replace()
/a/dzafman-2018-12-14_11:02:20-rados-wip-zafman-testing-distro-basic-smithi/3362409 Josh Durgin
07:53 PM Bug #20694: osd/ReplicatedBackend.cc: 1417: FAILED assert(get_parent()->get_log().get_log().obje...
/a/dzafman-2018-12-14_11:02:20-rados-wip-zafman-testing-distro-basic-smithi/3362388 Josh Durgin
04:52 PM Bug #36686: osd: pg log hard limit can cause crash during upgrade
Oliver Freyermuth wrote:
> Let me extend that question with:
> What's the clean upgrade path for those on 12.2.8 or...
Neha Ojha
04:27 PM Bug #36686: osd: pg log hard limit can cause crash during upgrade
Let me extend that question with:
What's the clean upgrade path for those on 12.2.8 or 12.2.10 (and wanting to upgra...
Oliver Freyermuth
04:23 PM Bug #36686: osd: pg log hard limit can cause crash during upgrade
Nathan Cutler wrote:
> Alexander Morozov wrote:
> > Any ETA for the fix?
>
> Did you mean ETA for 12.2.10? Lumin...
Alexander Morozov
11:43 AM Bug #36686: osd: pg log hard limit can cause crash during upgrade
Alexander Morozov wrote:
> Any ETA for the fix?
Did you mean ETA for 12.2.10? Luminous v12.2.10 was released on N...
Nathan Cutler
06:56 AM Bug #36686: osd: pg log hard limit can cause crash during upgrade
Nathan Cutler wrote:
> Neha, 12.2.9 has already been cut, so we'll need to expedite 12.2.10 to push the revert out t...
Alexander Morozov
03:48 PM Bug #37673 (Won't Fix): latency between "initiated" and "queued_for_pg"
* 6 osd cluster
* separated cluster network on eth0, and public network on eth1.
* a rados client accessing from pu...
Kefu Chai
02:59 PM Bug #37671 (Resolved): race between split and pg create
... Sage Weil
02:44 PM Bug #37507: osd_memory_target: failed assert when options mismatch
merged https://github.com/ceph/ceph/pull/25421 Yuri Weinstein
01:23 PM Bug #36515: config options: 'services' field is empty for many config options
Some of the config option 'services' fields have been addressed by https://github.com/ceph/ceph/pull/25456 Tatjana Dehler
01:10 PM Bug #25211 (Resolved): bug in PerfCounters
Kefu Chai
01:08 PM Bug #36709 (Closed): OSD stuck while flushing rocksdb WAL
Igor Fedotov
12:50 PM Bug #36709: OSD stuck while flushing rocksdb WAL
Thanks for your answers, that was helpful info.
It looks like an aacraid module v.1.2.1.50877 issue. IO requests stucke...
Aleksei Zakharov

12/14/2018

01:06 PM Bug #37665 (Fix Under Review): ceph-objectstore-tool export from luminous, import to master clear...
turns out the upgrade suite already turns import/export tool tests off... let's just do the same.
https://github.com...
Sage Weil
12:57 PM Bug #37665: ceph-objectstore-tool export from luminous, import to master clears same_interval_since
I'm thinking we should make ceph-objectstore-tool refuse to use an export from an older major release (without, say, ... Sage Weil
12:56 PM Bug #37665 (Resolved): ceph-objectstore-tool export from luminous, import to master clears same_i...
on luminous exporting osd.3, pg last seen as... Sage Weil
07:25 AM Bug #25174: osd: assert failure with FAILED assert(repop_queue.front() == repop) In function 'vo...
seen again here: http://qa-proxy.ceph.com/teuthology/yuriw-2018-12-12_21:15:36-kcephfs-wip-yuri5-testing-2018-12-12-1... Venky Shankar
04:14 AM Cleanup #37662 (In Progress): Review-RADOS suite
Master tracker for associated work arising from the RADOS teuthology suite review.
Attach related trackers for as...
Brad Hubbard

12/13/2018

10:26 PM Bug #37656 (Triaged): FileStore::_do_transaction() crashed with error 17 (merge collection vs osd...
... Neha Ojha
10:02 PM Bug #37654 (Resolved): FAILED ceph_assert(info.history.same_interval_since != 0) in PG::start_pee...
... Neha Ojha
09:49 PM Bug #37653: list-inconsistent-obj output truncated, causing osd-scrub-repair.sh failure

The commit 873655062de03fbeda7053eaf34eab5a7644e1d1 from https://github.com/ceph/ceph/pull/24229 exposed a bug in...
David Zafman
09:36 PM Bug #37653 (Resolved): list-inconsistent-obj output truncated, causing osd-scrub-repair.sh failure

This bug causes a diff to be detected because of missing entries. It would have been nice if the decode failure w...
David Zafman
03:44 PM Feature #21073: mgr: ceph/rgw: show hostnames and ports in ceph -s status output
The port info is in servicemap under frontend_config, though I agree it is specific enough and probably doesn't warran... Abhishek Lekshmanan
03:19 PM Feature #21073: mgr: ceph/rgw: show hostnames and ports in ceph -s status output
https://github.com/ceph/ceph/pull/25540
This patch will show the service's id, but not the port. For the rgw examp...
Joao Eduardo Luis
01:11 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
Tested on a 5-node cluster with 20 OSDs and 14 3-replica pools.
Here's the log file (level 20) of OSD 18, which is...
Jonas Jelten
12:07 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
please please let us edit issues and comments...
-I made a mistake in the above post: *please ignore* the @ceph os...
Jonas Jelten
11:13 AM Bug #37439: Degraded PG does not discover remapped data on originating OSD
Easy steps to reproduce seem to be:
* Have a healthy cluster
* @ceph osd set pause # make sure no writes me...
Jonas Jelten
12:56 PM Bug #20798: LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
/a/sage-2018-12-12_23:36:13-rados-wip-sage2-testing-2018-12-12-1435-distro-basic-smithi/3335654
Sage Weil
12:31 PM Bug #37640 (Resolved): Can not rollback from 12.2.1 to 12.2.0 for CEPH_MON_FEATURE_INCOMPAT_LUMINOUS
Resolving as requested. Lenz Grimmer
10:20 AM Bug #37640: Can not rollback from 12.2.1 to 12.2.0 for CEPH_MON_FEATURE_INCOMPAT_LUMINOUS
liuzhong chen wrote:
> I cannot find the ceph-mon project in the issue, so I added it in the ceph-mgr column. If it is wron...
liuzhong chen
03:13 AM Bug #37640: Can not rollback from 12.2.1 to 12.2.0 for CEPH_MON_FEATURE_INCOMPAT_LUMINOUS
I cannot find the ceph-mon project in the issue, so I added it in the ceph-mgr column. If it is wrong, please move it to the right p... liuzhong chen
03:11 AM Bug #37640 (Resolved): Can not rollback from 12.2.1 to 12.2.0 for CEPH_MON_FEATURE_INCOMPAT_LUMINOUS
As 12.2.1 and higher versions have the mon feature CEPH_MON_FEATURE_INCOMPAT_LUMINOUS, we cannot roll back from 12.2.1 to 12... liuzhong chen
09:25 AM Bug #36725: luminous: Apparent Memory Leak in OSD
I made dumps while tuning the osd_memory_target value. Perhaps this data will be useful in the future.... Konstantin Shalygin
09:11 AM Bug #36709: OSD stuck while flushing rocksdb WAL
iostat -xtd 1 output most of the time when the problem occurs:... Aleksei Zakharov
04:12 AM Bug #37618 (Pending Backport): Command failed on smithi191 with status 1: '\n sudo yum -y install...
Kefu Chai

12/12/2018

10:29 PM Bug #36709: OSD stuck while flushing rocksdb WAL
The backtrace that's attached shows the kv_sync_thread waiting for I/O to complete from the block device:... Josh Durgin
01:40 PM Bug #36709: OSD stuck while flushing rocksdb WAL
I've finally reproduced this behavior (I hope).
Our staging cluster:
3 nodes with 22 OSDs on SSDs,
Kernel 4.1...
Aleksei Zakharov
10:19 PM Bug #37326: Daily inconsistent objects
The ceph-users list may be able to help debug this faster - it could be many things in the hw/sw stack. Josh Durgin
10:13 PM Bug #37593 (Fix Under Review): ec pool lost data due to snap clone
Neha Ojha
10:06 PM Bug #36725 (Closed): luminous: Apparent Memory Leak in OSD
Greg Farnum
09:03 PM Backport #37341 (Resolved): luminous: doc: Add bluestore memory autotuning docs
Nathan Cutler
09:01 PM Backport #37341: luminous: doc: Add bluestore memory autotuning docs
Note this is a follow-up on https://github.com/ceph/ceph/pull/24065 Nathan Cutler
08:30 PM Backport #37343 (In Progress): luminous: Prioritize user specified scrubs
Nathan Cutler
08:29 PM Bug #37583: mix luminous + master mons break ceph cli
https://github.com/ceph/ceph/pull/25470 Sage Weil
08:25 PM Backport #37342 (In Progress): mimic: Prioritize user specified scrubs
Nathan Cutler
05:20 PM Bug #37264: scrub warning check incorrectly uses mon scrub interval
https://github.com/ceph/ceph/pull/25112 David Zafman
05:15 PM Feature #37597: ceph-objectstore-tool: Add HashInfo to object dump output
https://github.com/ceph/ceph/pull/25483 David Zafman
02:43 PM Bug #36040: mon: Valgrind: mon (InvalidFree, InvalidWrite, InvalidRead)
seen again here: http://qa-proxy.ceph.com/teuthology/yuriw-2018-12-10_20:44:09-fs-wip-yuri4-testing-2018-12-10-1710-m... Venky Shankar
01:52 PM Backport #36729 (In Progress): mimic: Add support for osd_delete_sleep configuration value
Nathan Cutler
03:33 AM Bug #37618 (Fix Under Review): Command failed on smithi191 with status 1: '\n sudo yum -y install...
change on teuthology side
- https://github.com/ceph/teuthology/pull/1244
change on ceph side
- https://githu...
Kefu Chai
03:32 AM Bug #37618 (Resolved): Command failed on smithi191 with status 1: '\n sudo yum -y install ceph-ra...
librados2 and librbd1 are installed as a dependency of qemu-kvm.
qemu-kvm is installed by ceph-cm-ansible, see [1].
...
Kefu Chai

12/11/2018

11:07 PM Bug #36725: luminous: Apparent Memory Leak in OSD
Konstantin: thanks for pointing that out. That looks like the issue. Both OSD servers have 8GB RAM total, each run... John Jaser
03:55 PM Feature #37597 (Resolved): ceph-objectstore-tool: Add HashInfo to object dump output
David Zafman
03:08 PM Bug #37507: osd_memory_target: failed assert when options mismatch
Hi Mark,
You got it: 1105322466 boots, and 1105322465 crashes with the above trace.
Cheers, Dan
Dan van der Ster
12:18 PM Bug #37593 (Resolved): ec pool lost data due to snap clone
the wrong process is posted in https://github.com/ceph/ceph/pull/25490 Zengran Zhang
07:56 AM Bug #37452 (Resolved): FAILED ceph_assert(prealloc_left == (int64_t)need)
thanks Igor! Kefu Chai
02:03 AM Bug #24615 (Resolved): error message for 'unable to find any IP address' not shown
Victor Denisov
02:02 AM Bug #24615: error message for 'unable to find any IP address' not shown
Thanks Francois, I'll close the ticket. Victor Denisov
01:22 AM Bug #24615: error message for 'unable to find any IP address' not shown
Hi Victor Denisov,
First, really sorry for my late answer (I was a little busy).
In fact, I have tested again w...
Francois Lafont

12/10/2018

11:18 PM Bug #37507: osd_memory_target: failed assert when options mismatch
Hi Folks,
I'm guessing this is related to https://github.com/ceph/ceph/pull/25421 Basically a stupid uint64_t bug...
Mark Nelson
10:37 PM Bug #37507: osd_memory_target: failed assert when options mismatch
Thoughts, Mark? Greg Farnum
10:13 PM Feature #37500: ceph status/health hang when they could give helpful hints
Hmm, perhaps we could fall back to outputting other commands when connections to the monitor seem to be hanging, as t... Greg Farnum
06:18 PM Bug #37583 (Fix Under Review): mix luminous + master mons break ceph cli
Sage Weil

12/09/2018

06:19 PM Bug #37583 (Resolved): mix luminous + master mons break ceph cli
both luminous and ceph cli fail intermittently, depending on which mon they connect to.... Sage Weil
06:07 PM Bug #37582 (New): luminous: ceph -s client gets all mgrmaps
... Sage Weil
04:59 PM Bug #36748: ms_deliver_verify_authorizer no AuthAuthorizeHandler found for protocol 0
/a/kchai-2018-12-09_00:37:50-rados-wip-kefu2-testing-2018-12-09-0002-distro-basic-smithi/3318960 Kefu Chai
04:48 AM Bug #36725: luminous: Apparent Memory Leak in OSD
John, are you aware of the new 12.2.9 options osd_memory_target and bluestore_cache_autotune?
You should try ...
Konstantin Shalygin

12/08/2018

08:30 PM Bug #37542: nvme partitions aren't mapped back to device
Hrm, it looks like the code in question is... Sage Weil
02:51 PM Bug #36725: luminous: Apparent Memory Leak in OSD
I have the same problem. birong huang
01:44 PM Bug #37507: osd_memory_target: failed assert when options mismatch
... Konstantin Shalygin
03:46 AM Bug #20491 (New): objecter leaked OSDMap in handle_osd_map
... Kefu Chai

12/07/2018

09:00 AM Bug #24601: FAILED assert(is_up(osd)) in OSDMap::get_inst(int)
https://github.com/ceph/ceph/pull/25437 Zengran Zhang
03:36 AM Bug #37542 (Resolved): nvme partitions aren't mapped back to device
... Sage Weil
02:32 AM Bug #17257: ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
... Kefu Chai

12/06/2018

11:22 PM Bug #37525 (Resolved): unprime_split_children may discard query
Sage Weil
06:26 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
See also the ceph-devel mailing list thread "Degraded PG does not discover remapped data on originating OSD". Greg Farnum
12:03 PM Bug #37532: mon: expected_num_objects warning triggers on bluestore-only setups
Joao Eduardo Luis
11:49 AM Bug #37532: mon: expected_num_objects warning triggers on bluestore-only setups
I don't think it's wise to simply remove the code because filestore is no longer the default. We need to consider exi... Joao Eduardo Luis

12/05/2018

11:28 PM Bug #37532: mon: expected_num_objects warning triggers on bluestore-only setups
https://github.com/ceph/ceph/pull/25417 Paul Emmerich
11:11 PM Bug #37532 (Resolved): mon: expected_num_objects warning triggers on bluestore-only setups
Follow up for the mailing list thread http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031711.html
...
Paul Emmerich

12/04/2018

07:59 PM Bug #37512 (Resolved): ready_to_merge message lost
Sage Weil
04:10 AM Bug #37512: ready_to_merge message lost
https://github.com/ceph/ceph/pull/25388 Sage Weil
04:09 AM Bug #37512 (Resolved): ready_to_merge message lost
/a/sage-2018-12-03_17:39:26-rados-wip-sage2-testing-2018-12-03-0942-distro-basic-smithi/3304262... Sage Weil
07:58 PM Bug #37525 (Fix Under Review): unprime_split_children may discard query
https://github.com/ceph/ceph/pull/25399 Sage Weil
07:52 PM Bug #37525 (Resolved): unprime_split_children may discard query
... Sage Weil
09:48 AM Bug #23031: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
https://github.com/ceph/ceph/pull/25219 Zengran Zhang
03:49 AM Bug #37511: merge target placeholder may get wrong PastIntervals from source
I think the fix is to just bite the bullet and put PastIntervals at decrement time in the pg_info_t, along with the o... Sage Weil
03:49 AM Bug #37511 (Resolved): merge target placeholder may get wrong PastIntervals from source
... Sage Weil
03:25 AM Bug #37509: require past_interval bounds mismatch due to osd oldest_map
I don't think the superblock.oldest_map should be a factor in this calculation. I suspect it is in there to deal wit... Sage Weil
03:24 AM Bug #37509 (Can't reproduce): require past_interval bounds mismatch due to osd oldest_map
... Sage Weil

12/03/2018

11:33 PM Backport #37496 (In Progress): mimic: OSD mkfs might assert when working agains bluestore disk th...
https://github.com/ceph/ceph/pull/25385 Igor Fedotov
04:02 PM Bug #37507 (Resolved): osd_memory_target: failed assert when options mismatch
We tried setting osd_memory_target to 1GB and this results in the following assertion early after startup:... Dan van der Ster

12/02/2018

07:30 PM Bug #36725: luminous: Apparent Memory Leak in OSD
Upgraded one OSD server to 12.2.10: Same symptom observed. See attached. Two OSD daemons use up all physical memory... John Jaser

12/01/2018

07:14 PM Feature #37500 (New): ceph status/health hang when they could give helpful hints
Today I had an incident with my Ceph cluster that took down my infrastructure.
I am running Ceph(FS) 13.2.2 on Lin...
Niklas Hambuechen
06:42 AM Backport #37496 (Resolved): mimic: OSD mkfs might assert when working agains bluestore disk that ...
https://github.com/ceph/ceph/pull/25385 Nathan Cutler

11/30/2018

07:56 PM Bug #37404: OSD mkfs might assert when working agains bluestore disk that already has a superblock
Finally merged within
https://github.com/ceph/ceph/pull/25308
Igor Fedotov
07:20 PM Bug #37404 (Pending Backport): OSD mkfs might assert when working agains bluestore disk that alre...
Sage Weil
08:51 AM Bug #37452: FAILED ceph_assert(prealloc_left == (int64_t)need)
Kefu Chai wrote:
> but in the meantime, can we have a more user-friendly error message in this case? i can hardly t...
Igor Fedotov
08:11 AM Bug #37452 (New): FAILED ceph_assert(prealloc_left == (int64_t)need)
I'd like to keep this open as a usability issue. Kefu Chai
08:09 AM Bug #37452 (Rejected): FAILED ceph_assert(prealloc_left == (int64_t)need)
Kefu Chai
02:51 AM Bug #37452: FAILED ceph_assert(prealloc_left == (int64_t)need)
Igor, thanks for looking into it.
so before the osd crashed, we had allocated 8.79 G out of 10G, and the free spac...
Kefu Chai
07:33 AM Bug #24909 (Resolved): RBD client IOPS pool stats are incorrect (2x higher; includes IO hints as ...
Nathan Cutler
07:33 AM Backport #36556 (Resolved): luminous: RBD client IOPS pool stats are incorrect (2x higher; includ...
Nathan Cutler
06:24 AM Bug #24587 (Resolved): librados api aio tests race condition
Nathan Cutler
06:22 AM Backport #36646 (Resolved): luminous: librados api aio tests race condition
Nathan Cutler
06:21 AM Bug #36602 (Resolved): osd: race condition opening heartbeat connection
Nathan Cutler
06:21 AM Backport #36636 (Resolved): luminous: osd: race condition opening heartbeat connection
Nathan Cutler
06:18 AM Bug #36406 (Resolved): Cache-tier forward mode hang in luminous (again)
Nathan Cutler
06:18 AM Backport #36657 (Resolved): luminous: Cache-tier forward mode hang in luminous (again)
Nathan Cutler

11/29/2018

10:10 AM Bug #37452: FAILED ceph_assert(prealloc_left == (int64_t)need)
The most probable root cause for the issue is the lack of free space on the BlueStore main device. It's 10GB by default a... Igor Fedotov
05:46 AM Bug #37452 (Resolved): FAILED ceph_assert(prealloc_left == (int64_t)need)
... Kefu Chai
09:58 AM Bug #37439: Degraded PG does not discover remapped data on originating OSD
In the second scenario, the cluster was completely healthy before new disks were added. My guess is that non-remapped... Jonas Jelten
09:13 AM Backport #36321 (Resolved): luminous: Add support for osd_delete_sleep configuration value
Nathan Cutler
01:09 AM Backport #36321: luminous: Add support for osd_delete_sleep configuration value
Vikhyat Umrao wrote:
> https://github.com/ceph/ceph/pull/24501
merged
Yuri Weinstein
09:10 AM Backport #36630 (Resolved): luminous: potential deadlock in PG::_scan_snaps when repairing snap m...
Nathan Cutler
01:05 AM Backport #36630: luminous: potential deadlock in PG::_scan_snaps when repairing snap mapper
https://github.com/ceph/ceph/pull/24833 merged Yuri Weinstein
06:21 AM Bug #36177 (Resolved): rados rm --force-full is blocked when cluster is in full status
Nathan Cutler
06:20 AM Backport #36436 (Resolved): luminous: rados rm --force-full is blocked when cluster is in full st...
Nathan Cutler
01:03 AM Backport #36436: luminous: rados rm --force-full is blocked when cluster is in full status
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25018
merged
Yuri Weinstein
01:17 AM Backport #36556: luminous: RBD client IOPS pool stats are incorrect (2x higher; includes IO hints...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25025
merged
Yuri Weinstein
01:16 AM Backport #36646: luminous: librados api aio tests race condition
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/25028
merged
Yuri Weinstein
01:14 AM Backport #36636: luminous: osd: race condition opening heartbeat connection
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/25035
merged
Yuri Weinstein
01:14 AM Backport #36657: luminous: Cache-tier forward mode hang in luminous (again)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25074
merged
Yuri Weinstein

11/28/2018

10:06 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
The first scenario definitely looks like an issue; perhaps we are improperly filtering for out rather than down durin... Greg Farnum
02:07 PM Bug #37439: Degraded PG does not discover remapped data on originating OSD
As I can't edit the post...
To clarify: By *missing* I mean the parts of the erasure coded object so the object ...
Jonas Jelten
02:00 PM Bug #37439 (Resolved): Degraded PG does not discover remapped data on originating OSD
There seems to be an issue that an OSD is not queried for *missing objects* that were *remapped*, but the OSD for thi... Jonas Jelten
05:22 PM Backport #37437: mimic: crushtool: add --reclassify operation to convert legacy crush maps to use...
h3. original description
The functionality has been added to master (nautilus) [1]. It would be nice to backport t...
Nathan Cutler
04:03 PM Backport #37437: mimic: crushtool: add --reclassify operation to convert legacy crush maps to use...
PR: https://github.com/ceph/ceph/pull/25306 Mykola Golub
01:39 PM Backport #37437 (Resolved): mimic: crushtool: add --reclassify operation to convert legacy crush ...
https://github.com/ceph/ceph/pull/25306 Mykola Golub
05:21 PM Backport #37438: luminous: crushtool: add --reclassify operation to convert legacy crush maps to ...
h3. original description
The functionality has been added to master (nautilus) [1]. It would be nice to backport t...
Nathan Cutler
04:02 PM Backport #37438: luminous: crushtool: add --reclassify operation to convert legacy crush maps to ...
PR: https://github.com/ceph/ceph/pull/25307 Mykola Golub
01:41 PM Backport #37438 (Resolved): luminous: crushtool: add --reclassify operation to convert legacy cru...
https://github.com/ceph/ceph/pull/25307 Mykola Golub
05:20 PM Bug #37443 (Resolved): crushtool: add --reclassify operation to convert legacy crush maps to use ...
The functionality has been added to master (nautilus) [1]. It would be nice to backport this.
[1] https://github.c...
Nathan Cutler
05:09 AM Bug #36732 (Resolved): tools/rados: fix segmentation fault
Kefu Chai

11/27/2018

08:40 PM Backport #36321 (In Progress): luminous: Add support for osd_delete_sleep configuration value
Nathan Cutler
08:39 PM Backport #36321: luminous: Add support for osd_delete_sleep configuration value
h3. original description
[RFE] Introduce an option or flag to throttle the pg deletion process
https://bugzilla.r...
Nathan Cutler
07:45 PM Bug #36250: ceph-osd process crashing
I believe this issue was due to a malfunctioning ceph-fuse client, although I don't have data to back that up as it w... Josh Haft
06:02 PM Fix #37410 (Duplicate): change default osd_objectstore to bluestore
duplicate of #36494 Douglas Fuller
05:53 PM Fix #37410 (Fix Under Review): change default osd_objectstore to bluestore
https://github.com/ceph/ceph/pull/25288 Douglas Fuller
05:38 PM Fix #37410 (Duplicate): change default osd_objectstore to bluestore
This way, the mon and associated tools know what the default actually is on the cluster. Douglas Fuller
06:01 PM Bug #36494: Change osd_objectstore default to bluestore
Can you set this for backport to mimic and luminous? Douglas Fuller
03:30 PM Backport #37341 (In Progress): luminous: doc: Add bluestore memory autotuning docs
Josh Durgin
03:26 PM Backport #37340 (In Progress): mimic: doc: Add bluestore memory autotuning docs
Josh Durgin
02:27 PM Bug #36525: osd-scrub-snaps.sh failure
/a/kchai-2018-11-27_11:44:27-rados-wip-kefu2-testing-2018-11-27-1724-distro-basic-smithi/3285226/teuthology.log Kefu Chai
11:45 AM Bug #37404 (Fix Under Review): OSD mkfs might assert when working agains bluestore disk that alre...
https://github.com/ceph/ceph/pull/25281/files Igor Fedotov
11:04 AM Bug #37404 (In Progress): OSD mkfs might assert when working agains bluestore disk that already h...
Igor Fedotov
11:01 AM Bug #37404 (Resolved): OSD mkfs might assert when working agains bluestore disk that already has ...
One might hit an assert on a collection's release, which happens
after the store is destroyed. For now it is observable in some qa...
Igor Fedotov

11/26/2018

11:49 PM Bug #24612 (Resolved): FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
Nathan Cutler
11:49 PM Backport #35071 (Resolved): mimic: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::p...
Nathan Cutler
08:56 PM Backport #35071: mimic: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24918
merged
Yuri Weinstein
11:48 PM Bug #22544 (Resolved): objecter cannot resend split-dropped op when racing with con reset
Nathan Cutler
11:48 PM Backport #35843 (Resolved): mimic: objecter cannot resend split-dropped op when racing with con r...
Nathan Cutler
08:55 PM Backport #35843: mimic: objecter cannot resend split-dropped op when racing with con reset
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24970
merged
Yuri Weinstein
11:48 PM Bug #36358 (Resolved): Interactive mode CLI prints no output since Mimic
Nathan Cutler
11:47 PM Backport #36432 (Resolved): mimic: Interactive mode CLI prints no output since Mimic
Nathan Cutler
08:54 PM Backport #36432: mimic: Interactive mode CLI prints no output since Mimic
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24971
merged
Yuri Weinstein
11:47 PM Backport #36433 (Resolved): mimic: monstore tool rebuild does not generate creating_pgs
Nathan Cutler
08:54 PM Backport #36433: mimic: monstore tool rebuild does not generate creating_pgs
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25016
merged
Yuri Weinstein
11:46 PM Backport #36435 (Resolved): mimic: rados rm --force-full is blocked when cluster is in full status
Nathan Cutler
08:53 PM Backport #36435: mimic: rados rm --force-full is blocked when cluster is in full status
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25017
merged
Yuri Weinstein
11:45 PM Backport #36505 (Resolved): mimic: mon osdmap cash too small during upgrade to mimic
Nathan Cutler
08:53 PM Backport #36505: mimic: mon osdmap cash too small during upgrade to mimic
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25019
merged
Yuri Weinstein
11:44 PM Backport #36557 (Resolved): mimic: RBD client IOPS pool stats are incorrect (2x higher; includes ...
Nathan Cutler
08:52 PM Backport #36557: mimic: RBD client IOPS pool stats are incorrect (2x higher; includes IO hints as...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25024
merged
Yuri Weinstein
11:44 PM Backport #36637 (Resolved): mimic: osd: race condition opening heartbeat connection
Nathan Cutler
08:51 PM Backport #36637: mimic: osd: race condition opening heartbeat connection
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/25026
merged
Yuri Weinstein
11:43 PM Backport #36647 (Resolved): mimic: librados api aio tests race condition
Nathan Cutler
08:51 PM Backport #36647: mimic: librados api aio tests race condition
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/25027
merged
Yuri Weinstein
11:40 PM Backport #36658 (Resolved): mimic: Cache-tier forward mode hang in luminous (again)
Nathan Cutler
08:48 PM Backport #36658: mimic: Cache-tier forward mode hang in luminous (again)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25075
merged
Yuri Weinstein
08:45 PM Bug #37393 (Resolved): mimic: osd-backfill-stats.sh fails in rados/standalone/osd.yaml
Run: http://pulpito.front.sepia.ceph.com/yuriw-2018-11-21_22:16:20-rados-wip-yuri5-testing-2018-11-21-1510-mimic-dist... Yuri Weinstein

11/25/2018

09:56 AM Bug #37326: Daily inconsistent objects
Does anyone have any idea? Greg Smith

11/23/2018

04:52 PM Bug #22597: "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails in upgrade test
The problematic chown was introduced in mimic, so backporting only that far back.
See https://github.com/ceph/ceph...
Nathan Cutler
02:34 AM Backport #37288 (In Progress): mimic: "sudo chown -R ceph:ceph /var/lib/ceph/osd/ceph-0'" fails i...
https://github.com/ceph/ceph/pull/25227 Prashant D

11/22/2018

05:19 PM Backport #37273 (Resolved): mimic: debian: packaging need to reflect move of /etc/bash_completion...
Nathan Cutler
04:46 PM Backport #37273: mimic: debian: packaging need to reflect move of /etc/bash_completion.d/radosgw-...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25115
merged
Yuri Weinstein
07:32 AM Bug #36767: OSD: unrecoverable heartbeat connections
see also: https://tracker.ceph.com/issues/36175 Yan Jun

11/21/2018

08:25 AM Backport #37340 (Need More Info): mimic: doc: Add bluestore memory autotuning docs
Nathan Cutler
07:19 AM Bug #37326: Daily inconsistent objects
It happens on different disks, even on different host nodes. Greg Smith
06:40 AM Bug #24676: FreeBSD/Linux integration - monitor map with wrong sa_family
Hello,
Just tested this and received the same "NetHandler create_socket couldn't create socket (97) Address family...
Richard Gallamore
 
