Activity

From 07/11/2019 to 08/09/2019

08/09/2019

03:03 PM Bug #41167: arm64: unexpected aio return value: does not match length
https://github.com/ceph/ceph/pull/29370 Kefu Chai
02:23 PM Bug #41188 (Resolved): incorrect RW_IO_MAX
0x7fff0000, not 0x7ffff000 Sage Weil
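For context, both constants are INT_MAX rounded down to a page boundary; which value you get depends on the page size assumed. A minimal sketch of that arithmetic (illustration only, not the actual patch):

    #include <climits>
    #include <cstdio>

    int main() {
        // INT_MAX rounded down to a page boundary, for two common page sizes.
        const unsigned page_4k  = 4096;    // typical x86-64 page size
        const unsigned page_64k = 65536;   // common on arm64 kernels
        printf("0x%x\n", INT_MAX & ~(page_4k  - 1));   // prints 0x7ffff000
        printf("0x%x\n", INT_MAX & ~(page_64k - 1));   // prints 0x7fff0000
        return 0;
    }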
04:01 AM Bug #38559: 50-100% iops lost due to bluefs_preextend_wal_files = false
luminous: https://github.com/ceph/ceph/pull/29564 Konstantin Shalygin

08/08/2019

08:52 PM Bug #41037 (Pending Backport): Containerized cluster failure due to osd_memory_target not being s...
nautilus backport: https://github.com/ceph/ceph/pull/29562 Sage Weil
02:34 PM Bug #41167 (Duplicate): arm64: unexpected aio return value: does not match length
... Sage Weil

08/07/2019

06:38 PM Feature #40704: BlueStore tool to check fragmentation
Luminous backport: https://github.com/ceph/ceph/pull/29539 Neha Ojha
06:38 PM Feature #40704 (Pending Backport): BlueStore tool to check fragmentation
Neha Ojha

08/06/2019

04:16 PM Bug #41037 (Fix Under Review): Containerized cluster failure due to osd_memory_target not being s...
https://github.com/ceph/ceph/pull/29511 Sage Weil
11:33 AM Documentation #39522 (Pending Backport): fix and improve doc regarding manual bluestore cache set...
Jan Fajerski
12:14 AM Feature #40704 (Fix Under Review): BlueStore tool to check fragmentation
Neha Ojha

08/05/2019

04:05 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
@ben that's probably a semi-reasonable assumption in a lot of cases, though I've noticed that the kernel doesn't alwa... Mark Nelson
02:09 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
My guess would be: if the cgroup limit is X, then 0.95 X - 1/2 GB should be fine for osd_memory_target; that would give th... Ben England
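A quick worked version of the rule of thumb above (0.95 x cgroup limit minus 1/2 GB); the helper name and the example limit are illustrative, not taken from Ceph:

    #include <cstdint>
    #include <cstdio>

    // Hypothetical helper: derive an osd_memory_target from a cgroup memory
    // limit using the "0.95 * X - 0.5 GiB" suggestion above.
    static uint64_t suggested_osd_memory_target(uint64_t cgroup_limit_bytes) {
        const uint64_t half_gib = 512ull << 20;
        uint64_t target = static_cast<uint64_t>(cgroup_limit_bytes * 0.95);
        return target > half_gib ? target - half_gib : target;
    }

    int main() {
        uint64_t limit = 8ull << 30;  // e.g. an 8 GiB pod limit
        printf("%llu\n", (unsigned long long)suggested_osd_memory_target(limit));
        // 8 GiB * 0.95 - 0.5 GiB is roughly 7.1 GiB
    }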
12:11 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
Robin,
not sure about the next Mimic/Luminous releases, but maybe later. Let this patch bake for a bit in Nautilus ...
Igor Fedotov

08/02/2019

08:17 PM Backport #40281: nautilus: 50-100% iops lost due to bluefs_preextend_wal_files = false
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28573
merged
Yuri Weinstein
08:14 PM Backport #40837: nautilus: Set concurrent max_background_compactions in rocksdb to 2
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29162
merged
Yuri Weinstein
03:49 AM Feature #41053 (New): bluestore/rocksdb: aarch64 optimized crc32c instructions support
Currently, the rocksdb engine in the newest nautilus (14.2.2) and master branch doesn't support aarch64 optimized crc32c i... Zhiwei Dai
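For reference, the "aarch64 optimized crc32c instructions" in question are the ARMv8 CRC32 extension; a minimal sketch of the intrinsics (compile on aarch64 with something like -march=armv8-a+crc; this only illustrates the instructions, not the RocksDB integration the ticket asks for):

    #include <arm_acle.h>   // ARMv8 __crc32c* intrinsics
    #include <cstddef>
    #include <cstdint>

    // Hardware-accelerated CRC32C over a buffer, one byte at a time for brevity;
    // a real implementation would use the wider __crc32cd form on 8-byte chunks.
    uint32_t crc32c_armv8(uint32_t crc, const uint8_t *data, size_t len) {
        crc = ~crc;
        for (size_t i = 0; i < len; ++i)
            crc = __crc32cb(crc, data[i]);
        return ~crc;
    }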

08/01/2019

08:36 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
Joe Talerico reproduced this and found the POD_LIMIT was getting set, but not the system-wide limit, so the current O... Josh Durgin
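For anyone reproducing this, the limit the OSD needs to respect is the cgroup memory limit visible inside the container; a minimal sketch of reading it (cgroup v1 path shown, and the "effectively unlimited" cutoff is an arbitrary illustration):

    #include <cstdint>
    #include <fstream>
    #include <iostream>

    int main() {
        // cgroup v1 memory limit as seen from inside the container.
        std::ifstream f("/sys/fs/cgroup/memory/memory.limit_in_bytes");
        uint64_t limit = 0;
        if (f >> limit && limit < (1ull << 60)) {
            std::cout << "cgroup memory limit: " << limit << " bytes\n";
        } else {
            std::cout << "no effective cgroup memory limit\n";
        }
    }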
07:17 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
Neha Ojha wrote:
> Can you enable debug_osd=10 and see what this line (https://github.com/ceph/ceph/commit/fc3bdad87...
Neha Ojha
05:44 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
from Joe T on his system:
# ceph version
ceph version 14.2.2-218-g734b519 (734b5199dc45d3d36c8d8d066d6249cc304d0e0e...
Ben England
05:38 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
version you asked for:
ceph-base-14.2.2-0.el7.x86_64
from the Ceph container image ceph/ceph:v14.2.2-20190722
Ben England
05:33 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
Nathan/Igor: any chance of a Luminous backport for v12.2.13, and Mimic as well? Robin Johnson

07/31/2019

09:25 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
Can you enable debug_osd=10 and see what this line (https://github.com/ceph/ceph/commit/fc3bdad87597066a813a3734b2a79... Neha Ojha
09:07 PM Bug #41037: Containerized cluster failure due to osd_memory_target not being set to ratio of cgro...
Which version of Ceph are you running? Neha Ojha
07:21 PM Bug #41037 (Resolved): Containerized cluster failure due to osd_memory_target not being set to ra...
Under heavy I/O workload (generated to multiple postgres databases, backed by Ceph RBD, via the pgbench utility), w... Dustin Black
01:54 PM Backport #40757 (Resolved): nautilus: stupid allocator might return extents with length = 0
Igor Fedotov
01:53 PM Bug #36482 (Resolved): High amount of Read I/O on BlueFS/DB when listing omap keys
Igor Fedotov
01:53 PM Backport #40632 (Resolved): nautilus: High amount of Read I/O on BlueFS/DB when listing omap keys
Igor Fedotov
01:52 PM Bug #40480 (Resolved): pool compression options not consistently applied
Igor Fedotov
01:52 PM Backport #40536 (Resolved): nautilus: pool compression options not consistently applied
Igor Fedotov
01:52 PM Bug #40623 (Resolved): massive allocator dumps when unable to allocate space for bluefs
Igor Fedotov
01:51 PM Backport #40675 (Resolved): nautilus: massive allocator dumps when unable to allocate space for b...
Igor Fedotov

07/30/2019

10:28 PM Backport #40675: nautilus: massive allocator dumps when unable to allocate space for bluefs
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/28891
merged
Yuri Weinstein
10:28 PM Backport #40536: nautilus: pool compression options not consistently applied
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28892
merged
Reviewed-by: Sage Weil <sage@redhat.com>
Yuri Weinstein
10:26 PM Backport #40632: nautilus: High amount of Read I/O on BlueFS/DB when listing omap keys
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28963
merged
Yuri Weinstein
10:25 PM Backport #40757: nautilus: stupid allocator might return extents with length = 0
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29023
merged
Yuri Weinstein
05:37 PM Bug #41014 (Fix Under Review): make bluefs_alloc_size default to bluestore_min_alloc_size
Neha Ojha
05:13 PM Bug #41014 (Duplicate): make bluefs_alloc_size default to bluestore_min_alloc_size
Originally 1M was chosen as the bluefs_alloc_size since metadata is stored in rocksdb, which persists it in large chu... Neha Ojha
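The change proposed in the title boils down to a defaulting rule; a hypothetical sketch of that rule (the parameter names mirror the config options, but the logic is only an illustration of the idea, not the actual patch):

    #include <cstdint>

    // Illustration only: if bluefs_alloc_size is left unset (0), fall back to
    // the BlueStore min_alloc_size instead of the historical 1M default.
    uint64_t effective_bluefs_alloc_size(uint64_t bluefs_alloc_size,
                                         uint64_t bluestore_min_alloc_size) {
        return bluefs_alloc_size ? bluefs_alloc_size : bluestore_min_alloc_size;
    }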
02:42 PM Bug #41009: osd_memory_target isn't applied in runtime.
Sridhar was planning on working on this after the mon_memory_target (https://github.com/ceph/ceph/pull/28227) - I had... Josh Durgin
01:46 PM Bug #41009 (Resolved): osd_memory_target isn't applied in runtime.
Looks like this PR (https://github.com/ceph/ceph/pull/27381) completely removed (intentionally?) the ability to adjus... Igor Fedotov
02:37 AM Bug #40938: Some osd processes restart automatically after adding osd
Detailed log information 伟 宋

07/25/2019

09:55 PM Bug #38795 (Resolved): fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOverwrite
Igor Fedotov
09:54 PM Backport #39638 (Resolved): luminous: fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.Blo...
Igor Fedotov
09:51 PM Backport #39638: luminous: fsck on mkfs breaks ObjectStore/StoreTestSpecificAUSize.BlobReuseOnOve...
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/27056
merged
Yuri Weinstein
09:54 PM Bug #40080 (Resolved): Bitmap allocator return duplicate entries which cause interval_set assert
Igor Fedotov
09:53 PM Backport #40422 (Resolved): luminous: Bitmap allocator return duplicate entries which cause inter...
Igor Fedotov
09:47 PM Backport #40422: luminous: Bitmap allocator return duplicate entries which cause interval_set assert
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/28644
merged
Yuri Weinstein
09:52 PM Backport #40534 (Resolved): luminous: pool compression options not consistently applied
Igor Fedotov
09:45 PM Backport #40534: luminous: pool compression options not consistently applied
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/28895
merged
Yuri Weinstein
02:52 PM Cleanup #40918 (Resolved): os/bluestore: There is an unused parameter in _get_deferred_op()
Sage Weil
02:14 PM Cleanup #40918: os/bluestore: There is an unused parameter in _get_deferred_op()
https://github.com/ceph/ceph/pull/29320 Sage Weil
02:13 PM Cleanup #40918 (Fix Under Review): os/bluestore: There is an unused parameter in _get_deferred_op()
http://tracker.ceph.com/issues/40918 Sage Weil
02:17 PM Bug #40306 (Resolved): Pool dont show their true size after add more osd - Max Available 1TB
resolved by https://github.com/ceph/ceph/pull/28978, fixed in 14.2.2 Sage Weil
02:15 PM Bug #40492: man page for ceph-kvstore-tool missing command
Sage Weil
02:11 PM Bug #38745 (In Progress): spillover that doesn't make sense
Sage Weil
02:10 PM Bug #40520 (Need More Info): snap_mapper record resurrected: trim_object: Can not trim 3:205afc9d...
I think this was from something like
teuthology-suite rados/thrash --subset 1/99 --filter snaps
Sage Weil
12:07 PM Bug #40741: Mass OSD failure, unable to restart
Here is a summary of what we've discovered while troubleshooting this issue.
1) OSDs were dying due to suicide time...
Igor Fedotov
08:23 AM Bug #40938: Some osd processes restart automatically after adding osd
ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable) 伟 宋
08:22 AM Bug #40938 (Need More Info): Some osd processes restart automatically after adding osd
<31>2019-07-25T00:34:09.203757+08:00 osd006 snmpd[3548]: message repeated 35 times: [ error on subcontainer 'ia_addr'... 伟 宋
07:47 AM Backport #40758 (Resolved): mimic: stupid allocator might return extents with length = 0
Igor Fedotov

07/24/2019

11:02 PM Backport #40758: mimic: stupid allocator might return extents with length = 0
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/29024
merged
Yuri Weinstein
08:59 AM Cleanup #40918 (Resolved): os/bluestore: There is an unused parameter in _get_deferred_op()
os/bluestore: There is an unused parameter Onode *o in BlueStore::_get_deferred_op(); it can probably be removed. shuguang wang

07/22/2019

10:43 AM Backport #40837 (In Progress): nautilus: Set concurrent max_background_compactions in rocksdb to 2
Nathan Cutler
08:20 AM Backport #40837 (Resolved): nautilus: Set concurrent max_background_compactions in rocksdb to 2
https://github.com/ceph/ceph/pull/29162 Nathan Cutler

07/19/2019

06:10 PM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
I'm seeing this on 14.2.2. Disk seems healthy.
The OSD in question suffered from https://tracker.ceph.com/issues/4...
Paul Emmerich
04:28 PM Bug #39618: Runaway memory usage on Bluestore OSD
I'm no longer running nautilus (or trying to run it), so I can't get any additional information. I was just reportin... Richard Hesse

07/18/2019

11:52 PM Bug #39618: Runaway memory usage on Bluestore OSD
Hi Richard,
Sorry for the long latency on this reply! Setting the osd_memory_target won't do anything if you disa...
Mark Nelson
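To expand on the point above: osd_memory_target only matters while the BlueStore cache autotuner is enabled; with autotuning off, the fixed cache-size options govern memory use instead. A simplified decision sketch (not the actual Ceph code; the option names are the standard ones):

    #include <cstdint>

    // Simplified sketch of the relationship described above.
    uint64_t effective_cache_budget(bool bluestore_cache_autotune,
                                    uint64_t osd_memory_target,
                                    uint64_t bluestore_cache_size) {
        // Autotuning sizes the caches to stay under osd_memory_target;
        // with it disabled, the fixed cache size wins and the target is ignored.
        return bluestore_cache_autotune ? osd_memory_target : bluestore_cache_size;
    }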
08:33 PM Backport #40535 (Resolved): mimic: pool compression options not consistently applied
Igor Fedotov
07:50 PM Backport #40535: mimic: pool compression options not consistently applied
Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/28894
merged
Yuri Weinstein

07/17/2019

05:13 PM Bug #23463: src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
Would you share a file listing for the root folder of osd-0 and any other working OSD? Igor Fedotov
05:05 PM Bug #23463: src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
Christian - please tell us more about the drive layout of these OSDs. Is this just a single-drive config? Igor Fedotov
03:56 PM Bug #23463: src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
Full log file with osd debug = 20 is attached.
The metadata ceph gathered:...
Christian Wahl
03:14 PM Bug #23463: src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
Nevermind - I missed the note that you can't reproduce it... Igor Fedotov
03:12 PM Bug #23463: src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
Christian - could you please set debug bluestore to 20, restart the osd, and collect the log? Igor Fedotov
02:56 PM Bug #23463: src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
I encountered this bug in ceph version 13.2.6 mimic (stable) and it pulled down 6 out of 8 deployed OSDs; however, I w... Christian Wahl
01:21 PM Bug #40741: Mass OSD failure, unable to restart
Igor Fedotov
01:15 AM Backport #40424 (Resolved): nautilus: Bitmap allocator return duplicate entries which cause inter...
Sage Weil

07/16/2019

03:50 PM Bug #40769 (Pending Backport): Set concurrent max_background_compactions in rocksdb to 2
https://github.com/ceph/ceph/pull/29027#issuecomment-511863023 Neha Ojha

07/13/2019

03:45 AM Bug #40741: Mass OSD failure, unable to restart
Brett Chancellor wrote:
> 1. Info below
> 2. Attached last 50k lines of logs with debug_bluefs set to 20/20
> 3. C...
Igor Fedotov

07/12/2019

10:35 PM Bug #40741: Mass OSD failure, unable to restart
1. Info below
2. Attached last 50k lines of logs with debug_bluefs set to 20/20
3. Can you share the syntax for cep...
Brett Chancellor
07:59 PM Bug #40741: Mass OSD failure, unable to restart
Let's set osd.44 aside for now. For 35 & 110, please answer/do the following.
1) Check corresponding disk activity f...
Igor Fedotov
06:25 PM Bug #40741: Mass OSD failure, unable to restart
LVM.
The bigger issue right now isn't the failing SSDs, it's the HDDs that are constantly rebooting ...
Brett Chancellor
04:30 PM Bug #40741: Mass OSD failure, unable to restart
What's behind your DB volumes - LVM or plain partition/device? Igor Fedotov
04:28 PM Bug #40741: Mass OSD failure, unable to restart
This one doesn't have enough space either: 0xc00000 bytes on the ssd, 0x28c8000 bytes at the main device. See:
2019-07-11...
Igor Fedotov
04:05 PM Bug #40741: Mass OSD failure, unable to restart
Thanks for looking into this, Igor. That was one of the many failed SSD volumes, chosen at random. Here is some info from ... Brett Chancellor
12:42 PM Bug #40741: Mass OSD failure, unable to restart
Here is my analysis of what I've seen in your logs so far:
1) After initial issue(s) that trigger OSDs to restart ...
Igor Fedotov
05:42 PM Bug #40769 (Resolved): Set concurrent max_background_compactions in rocksdb to 2
https://github.com/ceph/ceph/pull/29027#issue-297158998 explains why this change makes sense. Neha Ojha
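For reference, the knob in question corresponds directly to RocksDB's own option; a minimal sketch at the RocksDB API level (Ceph itself passes this through its bluestore_rocksdb_options string, mentioned here only as context):

    #include <rocksdb/options.h>

    rocksdb::Options make_options() {
        rocksdb::Options opts;
        // Allow two background compactions to run concurrently instead of one,
        // so a single long compaction does not stall everything behind it.
        opts.max_background_compactions = 2;
        return opts;
    }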
03:55 PM Backport #40756: luminous: stupid allocator might return extents with length = 0
https://github.com/ceph/ceph/pull/29025 Igor Fedotov
03:51 PM Backport #40756 (In Progress): luminous: stupid allocator might return extents with length = 0
Igor Fedotov
03:03 PM Backport #40756 (Resolved): luminous: stupid allocator might return extents with length = 0
https://github.com/ceph/ceph/pull/29025 Nathan Cutler
03:52 PM Backport #40758: mimic: stupid allocator might return extents with length = 0
https://github.com/ceph/ceph/pull/29024 Igor Fedotov
03:36 PM Backport #40758 (In Progress): mimic: stupid allocator might return extents with length = 0
Igor Fedotov
03:04 PM Backport #40758 (Resolved): mimic: stupid allocator might return extents with length = 0
https://github.com/ceph/ceph/pull/29024 Nathan Cutler
03:20 PM Backport #40757: nautilus: stupid allocator might return extents with length = 0
https://github.com/ceph/ceph/pull/29023 Igor Fedotov
03:11 PM Backport #40757 (In Progress): nautilus: stupid allocator might return extents with length = 0
Igor Fedotov
03:04 PM Backport #40757 (Resolved): nautilus: stupid allocator might return extents with length = 0
https://github.com/ceph/ceph/pull/29023 Nathan Cutler
02:00 PM Bug #40703 (Pending Backport): stupid allocator might return extents with length = 0
Sage Weil

07/11/2019

07:26 PM Bug #40741 (Triaged): Mass OSD failure, unable to restart
Cluster: 14.2.1
OSDs: 250 spinners in default root, 63 SSDs in ssd root
History: 5 days ago, this cluster began l...
Brett Chancellor
11:38 AM Backport #40423 (Resolved): mimic: Bitmap allocator return duplicate entries which cause interval...
Igor Fedotov
08:53 AM Backport #40280 (Resolved): mimic: 50-100% iops lost due to bluefs_preextend_wal_files = false
Nathan Cutler
 
