Activity

From 04/20/2021 to 05/19/2021

05/19/2021

11:25 PM Bug #50017 (Fix Under Review): OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: u...
Igor Fedotov
10:19 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
based on https://github.com/ceph/ceph/pull/41369#issuecomment-844520075 Neha Ojha

05/18/2021

03:54 PM Bug #48216: Spanning blobs list might have zombie blobs that aren't of use any more
Konstantin Shalygin wrote:
> > Can't verify right now, but I presume you're getting zombies for pools 1 & 7:
> 1 is...
Igor Fedotov
03:52 PM Bug #50844 (Triaged): ceph_assert(r == 0) in BlueFS::_rewrite_log_and_layout_sync()
Given the following line, I presume we're just out of space on the WAL volume (which has only 512 MB):
-1 bluefs _a...
Igor Fedotov
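A rough sketch of how the BlueFS device sizes could be inspected in a case like this; the OSD id and data path are placeholders, not taken from the ticket:
  # stop the OSD first, then let ceph-bluestore-tool report the sizes BlueFS sees per device
  systemctl stop ceph-osd@N
  ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-N
  # if the underlying WAL/DB partition has been grown, BlueFS can be told to use the extra space
  ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-N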
12:58 PM Bug #42928 (Closed): ceph-bluestore-tool bluefs-bdev-new-db does not update lv tags
Complete bluefs volume migration is now implemented at the ceph-volume level. See https://github.com/ceph/ceph/pull/39580... Igor Fedotov
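A hedged sketch of the ceph-volume based workflow referenced above; the OSD id, fsid, and vg/lv names are placeholders, and the exact options should be checked against the linked PR and the ceph-volume docs for your release:
  # with the OSD stopped: attach a new DB volume to an existing OSD, keeping the LVM tags consistent
  ceph-volume lvm new-db --osd-id N --osd-fsid <fsid> --target vgname/db-lv
  # or move the existing DB to a different (e.g. larger) LV
  ceph-volume lvm migrate --osd-id N --osd-fsid <fsid> --from db --target vgname/new-db-lv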
06:51 AM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
A small correction: in some cases this compaction didn't help either, so I had to reinstall the OSD. Ist Gab
06:50 AM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
Had a similar issue with RBD pool removal; the only thing that helped me, as Igor suggested, was to stop the OSD and run ceph-k... Ist Gab
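The truncated command above presumably refers to ceph-kvstore-tool; a minimal sketch of the offline compaction being described, with the OSD id as a placeholder:
  systemctl stop ceph-osd@N
  # compact the OSD's RocksDB through the bluestore-kv backend
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-N compact
  systemctl start ceph-osd@N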

05/17/2021

09:16 PM Bug #50656 (Fix Under Review): bluefs _allocate unable to allocate, though enough free
Neha Ojha
01:23 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Igor Fedotov wrote:
> Dan van der Ster wrote:
> > Igor, I was checking what is different between pacific's avl and ...
Igor Fedotov
12:40 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Dan van der Ster wrote:
> Igor, I was checking what is different between pacific's avl and octopus, and found it is ...
Igor Fedotov
11:55 AM Bug #50656: bluefs _allocate unable to allocate, though enough free
Igor, I was checking what is different between pacific's avl and octopus, and found it is *only* this: https://tracke... Dan van der Ster
08:32 PM Bug #50844: ceph_assert(r == 0) in BlueFS::_rewrite_log_and_layout_sync()
Hi Igor, this seems new in master. So far just one occurrence but I am assigning this to you for your thoughts. Neha Ojha
04:36 PM Bug #50844 (Triaged): ceph_assert(r == 0) in BlueFS::_rewrite_log_and_layout_sync()
... Neha Ojha

05/14/2021

11:28 PM Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag
Jonas Jelten wrote:
> Yes, this de-zombie-blobs the OSDs. So now I have an upgrade path by (automatically) stopping a...
Igor Fedotov
05:19 PM Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag
Yes, this de-zombie-blobs the OSDs. So now I have an upgrade path by (automatically) stopping an OSD, running bluestor... Jonas Jelten
08:17 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Quote from a ceph-users thread: after upgrading to 16.2.3/16.2.4 and adding a few HDDs, OSDs started to fail 1 by 1... Neha Ojha
05:45 PM Backport #50405: octopus: Increase default value of bluestore_cache_trim_max_skip_pinned
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40919
merged
Yuri Weinstein
07:22 AM Bug #48216: Spanning blobs list might have zombie blobs that aren't of use any more
> Can't verify right now, but I presume you're getting zombies for pools 1 & 7:
1 is replicated RBD pool, and 7 is E...
Konstantin Shalygin

05/13/2021

11:12 PM Bug #48216: Spanning blobs list might have zombie blobs that aren't of use any more
Konstantin Shalygin wrote:
> I have a case: Luminous 12.2.13 -> Nautilus 14.2.20 upgrade:
>
> host 0-5: redeploye...
Igor Fedotov
11:02 PM Bug #48216: Spanning blobs list might have zombie blobs that aren't of use any more
Konstantin Shalygin wrote:
>
> In the logs I noticed a trend that almost all errors come from prefix "rbd_data.6", "s...
Igor Fedotov
12:33 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Jan-Philipp Litza wrote:
> Thanks for looking into this so quickly!
>
> So in #47883 it says that hybrid is the d...
Igor Fedotov

05/12/2021

10:34 PM Bug #50788 (Duplicate): crash in BlueStore::Onode::put()
... Neha Ojha
03:05 PM Backport #50782 (Resolved): pacific: AvlAllocator.cc: 60: FAILED ceph_assert(size != 0)
https://github.com/ceph/ceph/pull/41753 Backport Bot
03:05 PM Backport #50781 (Resolved): octopus: AvlAllocator.cc: 60: FAILED ceph_assert(size != 0)
https://github.com/ceph/ceph/pull/41612 Backport Bot
03:05 PM Backport #50780 (Resolved): nautilus: AvlAllocator.cc: 60: FAILED ceph_assert(size != 0)
https://github.com/ceph/ceph/pull/41750 Backport Bot
03:02 PM Bug #50555 (Pending Backport): AvlAllocator.cc: 60: FAILED ceph_assert(size != 0)
Kefu Chai

05/11/2021

08:53 AM Backport #50403 (Resolved): nautilus: Increase default value of bluestore_cache_trim_max_skip_pinned
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40920
m...
Loïc Dachary
08:28 AM Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag
Checked on the next host by myself - "--command repair" fixes OSDs before the Nautilus auto fsck, and also CAN repair already ... Konstantin Shalygin
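For reference, a sketch of the repair invocation being discussed; the OSD id is a placeholder and the OSD must be stopped first:
  systemctl stop ceph-osd@N
  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-N --command repair
  systemctl start ceph-osd@N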
07:47 AM Backport #50402 (Resolved): pacific: Increase default value of bluestore_cache_trim_max_skip_pinned
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40918
m...
Loïc Dachary
02:14 AM Bug #50739 (Resolved): crash: void BlueStore::_txc_add_transaction(BlueStore::TransContext*, Obje...

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=5c573d1b3e5166fa32aaa6f1...
Yaarit Hatuka

05/09/2021

05:44 AM Support #46781 (Closed): how to keep data security in bluestore when server power down ?
This sort of question is best handled by writing to the ceph-users@ceph.io mailing list. :) Greg Farnum

05/08/2021

06:34 PM Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag
Jonas, looks like my https://tracker.ceph.com/issues/48216#note-3 ?
Does this cluster have an EC meta pool for RBD?
Also,...
Konstantin Shalygin
06:21 PM Bug #48216: Spanning blobs list might have zombie blobs that aren't of use any more
I have a case: Luminous 12.2.13 -> Nautilus 14.2.20 upgrade:
host 0-5: redeployed in the last 60 days: ceph-disk -> ce...
Konstantin Shalygin

05/07/2021

08:04 PM Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag
FTR, Igor replied on the ML:
> I think the root cause is related to the high amount of repairs made
> during the ...
Dan van der Ster
07:55 PM Backport #50403: nautilus: Increase default value of bluestore_cache_trim_max_skip_pinned
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40920
merged
Yuri Weinstein

05/06/2021

07:29 AM Bug #50656: bluefs _allocate unable to allocate, though enough free
Thanks for looking into this so quickly!
So in #47883 it says that hybrid is the default allocator since 14.2.11, ...
Jan-Philipp Litza

05/05/2021

07:25 PM Backport #50402: pacific: Increase default value of bluestore_cache_trim_max_skip_pinned
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40918
merged
Yuri Weinstein
05:38 PM Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag
Ok, some new information, tested on 15.2.11 :D
It seems that the OSDs are shredded with the @ceph-osd@ boot-time f...
Jonas Jelten
04:14 PM Bug #50656 (Triaged): bluefs _allocate unable to allocate, though enough free
OK, I managed to reproduce the issue via allocator's log replay - looks to some degree similar to https://tracker.cep... Igor Fedotov
03:07 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Oh well, the OSDs ran roughly an hour with the hybrid allocator before crashing again, and again in a cascading manne... Jan-Philipp Litza
02:44 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Jan-Philipp Litza wrote:
> > Could you please collect free blocks dump via 'ceph daemon OSD.N bluestore allocator du...
Igor Fedotov
02:17 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
> Could you please collect free blocks dump via 'ceph daemon OSD.N bluestore allocator dump block'?
Isn't that wha...
Jan-Philipp Litza
01:49 PM Bug #50656: bluefs _allocate unable to allocate, though enough free
Could you please collect free blocks dump via 'ceph daemon OSD.N bluestore allocator dump block'?
Afterwards it wo...
Igor Fedotov
12:05 PM Bug #50656 (Resolved): bluefs _allocate unable to allocate, though enough free
Yesterday evening, 4 OSDs on SSDs on 2 hosts crashed almost simultaneously and didn't come back up (crashed again rig... Jan-Philipp Litza

05/03/2021

12:20 PM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
Rainer Stumbaum wrote:
> So,
> the snaptrim of the 128 PGs in that cephfs_data pool took about two hours now.
>
...
Igor Fedotov
12:15 PM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
Given all the symptoms I'm pretty sure the root cause for the issue is a "degraded" state of RocksDB after bulk data ... Igor Fedotov

05/02/2021

08:27 PM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
So,
the snaptrim of the 128 PGs in that cephfs_data pool took about two hours now.
No OSDs and no clients were ha...
Rainer Stumbaum
05:57 PM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
Hi,
this is what I did a few days ago, from a recommendation, on all the OSD servers:...
Rainer Stumbaum

04/30/2021

09:34 PM Bug #50511 (Need More Info): osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
osd.0 and osd.7 show different crashes. Would it be possible for you to capture a coredump or osd logs with debug_osd... Neha Ojha
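A hedged sketch of how the requested debug logging could be enabled on the affected OSDs before the next reproduction attempt; the ids and levels are illustrative:
  # raise logging for the affected OSDs (revert to the previous levels afterwards)
  ceph tell osd.0 config set debug_osd 20
  ceph tell osd.0 config set debug_bluestore 20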
11:06 AM Bug #46490: osds crashing during deep-scrub
Igor Fedotov wrote:
> Maximilian Stinsky wrote:
> > Igor Fedotov wrote:
> > > Maximilian Stinsky wrote:
> > > > H...
Maximilian Stinsky

04/29/2021

06:33 PM Bug #50555 (Fix Under Review): AvlAllocator.cc: 60: FAILED ceph_assert(size != 0)
Looks like 'expand' improperly marked out-of-bound blocks as unallocated due to the bug fixed by https://github.com/c... Igor Fedotov
02:59 PM Bug #50571 (Resolved): deadlock in fileWriter
Kefu Chai
10:34 AM Bug #50571 (Fix Under Review): deadlock in fileWriter
Kefu Chai
06:21 AM Bug #50571: deadlock in fileWriter
Swapping steps 4 and 5 can avoid deadlock problems. CONGMIN YIN
12:15 AM Bug #50571 (Resolved): deadlock in fileWriter
quote from https://github.com/ceph/ceph/pull/34109
> The new locking scheme can lead to a deadlock if two threads ...
Kefu Chai
01:00 PM Bug #50578: BlueFS::FileWriter::lock aborts when trying to `--mkfs`
Presumably a duplicate of #50571 Igor Fedotov
10:34 AM Bug #50578 (Duplicate): BlueFS::FileWriter::lock aborts when trying to `--mkfs`
with vstart.sh on CentOS 8 ... Deepika Upadhyay

04/28/2021

11:38 AM Support #50309 (Resolved): bluestore_min_alloc_size_hdd = 4096
This has now been backported to the next nautilus/octopus releases: #50549 #50550 Dan van der Ster
11:15 AM Support #50309: bluestore_min_alloc_size_hdd = 4096
Thanks. Greg Smith
08:33 AM Support #50309: bluestore_min_alloc_size_hdd = 4096
min_alloc_size is printed in hex at debug_bluestore level 10 when the superblock is opened:... Dan van der Ster
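A sketch of checking an existing OSD's min_alloc_size this way; the OSD id and log path are placeholders and may differ per deployment:
  # bump bluestore logging, restart so the superblock is opened again, then grep the startup log
  ceph config set osd.N debug_bluestore 10/10
  systemctl restart ceph-osd@N
  grep -i min_alloc_size /var/log/ceph/ceph-osd.N.log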
08:02 AM Support #50309: bluestore_min_alloc_size_hdd = 4096
Can someone help? Greg Smith
11:36 AM Bug #50550 (Resolved): octopus: os/bluestore: be more verbose in _open_super_meta by default
Dan van der Ster
09:02 AM Bug #50550 (Fix Under Review): octopus: os/bluestore: be more verbose in _open_super_meta by default
Dan van der Ster
08:56 AM Bug #50550 (Resolved): octopus: os/bluestore: be more verbose in _open_super_meta by default
backport https://github.com/ceph/ceph/pull/30838/commits/4087f82aea674df4c7b485bf804f3a9c98ae3741 only Dan van der Ster
11:35 AM Bug #50549 (Resolved): nautilus: os/bluestore: be more verbose in _open_super_meta by default
Dan van der Ster
09:01 AM Bug #50549 (Fix Under Review): nautilus: os/bluestore: be more verbose in _open_super_meta by def...
Dan van der Ster
08:56 AM Bug #50549 (Resolved): nautilus: os/bluestore: be more verbose in _open_super_meta by default
backport https://github.com/ceph/ceph/pull/30838/commits/4087f82aea674df4c7b485bf804f3a9c98ae3741 only Dan van der Ster
10:08 AM Bug #50555 (Resolved): AvlAllocator.cc: 60: FAILED ceph_assert(size != 0)
This started happening for an existing OSD after an upgrade from 15.2.2 octopus to 16.2.0 Pacific, but it turns out t... Hector Martin

04/27/2021

07:59 AM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
Hi,
I am able to reproduce this behaviour:
- Create lots of snapshots on CephFS on an active filesystem
- rmdir lo...
Rainer Stumbaum

04/26/2021

06:03 PM Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
... Patrick Donnelly

04/25/2021

06:10 PM Bug #50511 (Need More Info): osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs
Hi,
we use http://manpages.ubuntu.com/manpages/artful/man1/cephfs-snap.1.html to create hourly, daily, weekly and mo...
Rainer Stumbaum

04/24/2021

01:12 PM Bug #46490: osds crashing during deep-scrub
Maximilian Stinsky wrote:
> Igor Fedotov wrote:
> > Maximilian Stinsky wrote:
> > > Hi,
> > >
> > > we tried up...
Igor Fedotov

04/23/2021

09:38 AM Bug #46490: osds crashing during deep-scrub
Igor Fedotov wrote:
> Maximilian Stinsky wrote:
> > Hi,
> >
> > we tried upgrading our cluster to version 14.2.1...
Maximilian Stinsky

04/21/2021

12:37 PM Bug #46490: osds crashing during deep-scrub
Maximilian Stinsky wrote:
> Hi,
>
> we tried upgrading our cluster to version 14.2.18 but still have the random s...
Igor Fedotov
12:31 PM Bug #46490: osds crashing during deep-scrub
Hi,
we tried upgrading our cluster to version 14.2.18 but still have the random scrub errors on the EC pool every... Maximilian Stinsky
Maximilian Stinsky
09:10 AM Bug #45765 (Resolved): BlueStore::_collection_list causes huge latency growth pg deletion
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
09:01 AM Backport #49966 (Resolved): nautilus: BlueStore::_collection_list causes huge latency growth pg d...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40393
m...
Loïc Dachary
06:25 AM Support #50309: bluestore_min_alloc_size_hdd = 4096
We encountered a problem: we have a mix of old and new disks, and we have no ability to identify ... Greg Smith
 
