Activity

From 10/10/2018 to 11/08/2018

11/08/2018

11:04 PM Bug #36606 (Resolved): osd: checksum failure during upgrade test
Igor Fedotov
11:04 PM Bug #36606: osd: checksum failure during upgrade test
Sage, no, it's specific to Nautilus for now. We need it when/if we backport BlueFS migrate stuff. Igor Fedotov
10:28 PM Bug #36606 (Pending Backport): osd: checksum failure during upgrade test
Igor, we should backport this, right? Sage Weil
10:29 PM Bug #36625 (Pending Backport): _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
Sage Weil
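
The extents in this title use offset~length notation. The sketch below assumes the unbracketed extent prints both fields in hex and the bracketed one is decimal; the entry does not state this. Under that reading, both describe the same 4 KiB range starting at 64 KiB. The helper names are hypothetical and are not taken from the Ceph source.

```python
def parse_extent(text, base):
    """Parse an 'offset~length' string into (offset, length) in the given base."""
    offset_str, length_str = text.split("~")
    return int(offset_str, base), int(length_str, base)


def overlaps(a, b):
    """Return True if the half-open ranges [off, off + len) intersect."""
    a_off, a_len = a
    b_off, b_len = b
    return a_off < b_off + b_len and b_off < a_off + a_len


# Assumption: the first extent is hex for both offset and length.
new_write = parse_extent("0x10000~1000", 16)
# Assumption: the bracketed extent is plain decimal.
inflight = parse_extent("65536~4096", 10)

print(new_write, inflight, overlaps(new_write, inflight))
```

Ranges that merely touch (one ends where the next begins) do not count as overlapping in this sketch.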
01:56 PM Backport #26943 (In Progress): luminous: os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_by...
Jonathan Brielmaier
09:53 AM Backport #36638 (In Progress): luminous: rename does not old ref to replacement onode at old name
Jonathan Brielmaier

11/06/2018

03:37 PM Bug #36606: osd: checksum failure during upgrade test
https://github.com/ceph/ceph/pull/24948 Igor Fedotov
01:45 PM Bug #36606 (Fix Under Review): osd: checksum failure during upgrade test
Igor Fedotov
01:28 PM Bug #36606 (In Progress): osd: checksum failure during upgrade test
Igor Fedotov

11/05/2018

10:27 PM Bug #36526 (Resolved): segv in BlueStore::OldExtent::create
Nathan Cutler
10:26 PM Backport #36591 (Resolved): luminous: segv in BlueStore::OldExtent::create
Nathan Cutler
10:08 PM Backport #36591: luminous: segv in BlueStore::OldExtent::create
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24746
merged
Yuri Weinstein

11/02/2018

04:46 PM Bug #36606: osd: checksum failure during upgrade test
Here's my analysis:
reproducer: https://tracker.ceph.com/issues/36606#note-9
commit before https://github.com/c...
Neha Ojha

10/31/2018

07:49 PM Backport #36592 (Resolved): mimic: segv in BlueStore::OldExtent::create
Nathan Cutler
12:27 AM Bug #36606: osd: checksum failure during upgrade test
The following seem to be the relevant pieces for one osd leading to the failure:... Neha Ojha

10/30/2018

10:50 PM Bug #36606: osd: checksum failure during upgrade test
Yes, the mkfs succeeds. That part of the logs is also present in the successful run of this test.... Neha Ojha
10:03 PM Bug #36606: osd: checksum failure during upgrade test
The --no-mon-config splats are normal. qa/tasks/ceph.py tries first with --no-mon-config and, if it fails, does the m... Sage Weil
07:46 PM Backport #36592: mimic: segv in BlueStore::OldExtent::create
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24745
merged
Yuri Weinstein
05:15 PM Backport #36641 (Rejected): mimic: Unable to recover from ENOSPC in BlueFS
Patrick Donnelly
05:15 PM Backport #36640 (Rejected): luminous: Unable to recover from ENOSPC in BlueFS
Patrick Donnelly
05:14 PM Backport #36639 (Resolved): mimic: rename does not old ref to replacement onode at old name
https://github.com/ceph/ceph/pull/25313 Patrick Donnelly
05:14 PM Backport #36638 (Resolved): luminous: rename does not old ref to replacement onode at old name
https://github.com/ceph/ceph/pull/24989 Patrick Donnelly
04:09 PM Bug #22534: Debian's bluestore *rocksdb* does not support neither fast CRC nor compression
Partly broken:... Марк Коренберг
03:25 PM Bug #22534: Debian's bluestore *rocksdb* does not support neither fast CRC nor compression
Is this still broken? Sage Weil
02:49 PM Bug #36567: Segmentation fault in BlueStore::Blob::discard_unallocated
Not that good ;-) It always happens when we trigger a heavy backfill or recovery. But I don't want to pull that many ... Stefan Priebe
02:37 PM Bug #36567: Segmentation fault in BlueStore::Blob::discard_unallocated
Stefan Priebe wrote:
> Yes, so my question is whether all of those may just be a result of the race mentioned here: ht...
Sage Weil
02:44 PM Bug #36268 (Pending Backport): Unable to recover from ENOSPC in BlueFS
also https://github.com/ceph/ceph/pull/23103 Sage Weil
02:41 PM Bug #36625 (In Progress): _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
Sage Weil
07:00 AM Bug #36625: _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
https://github.com/ceph/ceph/pull/24820 Honggang Yang
06:55 AM Bug #36625 (Resolved): _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
h1. description... Honggang Yang
02:40 PM Bug #36422 (Duplicate): ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
Sage Weil
10:05 AM Bug #36284: Bluestore might be hanging OSD
My problem was fixed by:
https://github.com/ceph/ceph/commit/f755bed3e438d2e7d5ed0df30b8d5bebf2d0cf5a
I expect th...
Adam Kupczyk

10/29/2018

11:55 PM Bug #36606: osd: checksum failure during upgrade test
/a/nojha-2018-10-29_19:19:04-fs:upgrade-master-distro-basic-smithi/3201377/ Neha Ojha
06:21 PM Bug #36606: osd: checksum failure during upgrade test
Igor Fedotov wrote:
> I think mkfs doesn't run properly for bluestore since --no-mon-config param isn't recognized f...
Patrick Donnelly
08:55 AM Bug #36606: osd: checksum failure during upgrade test
I think mkfs doesn't run properly for bluestore since the --no-mon-config param isn't recognized (for an unknown reason):
...
Igor Fedotov
08:10 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
https://github.com/ceph/ceph/pull/24647 merged Yuri Weinstein
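
The title ties the recurring value to the checksum of an all-zero block. The sketch below is one way to check a claim like that. It assumes CRC-32C (Castagnoli), a 4 KiB block, a seed of 0xFFFFFFFF and no final XOR. None of these parameters is stated in the entry, so the printed value is not asserted here. The function names are hypothetical.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c_update(crc, data):
    """Feed bytes into a raw CRC-32C register (no init or final XOR applied)."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc


def crc32c_standard(data):
    """Standard CRC-32C: seed 0xFFFFFFFF and a final XOR with 0xFFFFFFFF."""
    return crc32c_update(0xFFFFFFFF, data) ^ 0xFFFFFFFF


# Sanity check against the published CRC-32C check value for "123456789".
assert crc32c_standard(b"123456789") == 0xE3069283

# Assumed parameters: 4 KiB of zeros, seed 0xFFFFFFFF, no final XOR.
zero_block = bytes(4096)
raw = crc32c_update(0xFFFFFFFF, zero_block)
print(hex(raw))
```

If the printed value differs from the one in the title, the block size or seed convention assumed here is the first thing to revisit.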
06:47 PM Bug #36604 (Rejected): osd-bluefs-volume-ops.sh test hangs

I ran cmake and make again, then rebuilt ceph-bluestore-tool, and this problem went away.
David Zafman
02:16 PM Bug #36541 (Pending Backport): rename does not old ref to replacement onode at old name
Sage Weil

10/28/2018

12:14 AM Bug #36606: osd: checksum failure during upgrade test
Does not affect filestore. Only upgrade tests (fs:upgrade) with bluestore (replicated or EC). Patrick Donnelly

10/27/2018

11:58 PM Bug #36606: osd: checksum failure during upgrade test
Note that this could be caused by a recent merge into luminous. Patrick Donnelly
10:31 PM Bug #36606 (Resolved): osd: checksum failure during upgrade test
... Patrick Donnelly

10/26/2018

10:47 PM Bug #36604: osd-bluefs-volume-ops.sh test hangs
David,
did you run make install for the new code base? It looks like the script runs a legacy ceph-bluestore-tool.
Igor Fedotov
08:20 PM Bug #36604 (Rejected): osd-bluefs-volume-ops.sh test hangs

I ran this in my build tree as follows:...
David Zafman

10/25/2018

08:05 PM Bug #36284: Bluestore might be hanging OSD
Observation: when deferred_aggressive==false, kv_sync_thread goes to sleep with deferred_done_queue nonempty.
Someti...
Adam Kupczyk
01:10 PM Bug #36284: Bluestore might be hanging OSD
I have been working on a problem that seems to be related.
Using FIO with rados ioengine stops.
This seems to be ...
Adam Kupczyk

10/24/2018

08:02 PM Backport #36591 (In Progress): luminous: segv in BlueStore::OldExtent::create
Nathan Cutler
07:56 PM Backport #36591 (Resolved): luminous: segv in BlueStore::OldExtent::create
https://github.com/ceph/ceph/pull/24746 Nathan Cutler
07:59 PM Backport #36592 (In Progress): mimic: segv in BlueStore::OldExtent::create
Nathan Cutler
07:56 PM Backport #36592 (Resolved): mimic: segv in BlueStore::OldExtent::create
https://github.com/ceph/ceph/pull/24745 Nathan Cutler
03:34 PM Bug #36526 (Pending Backport): segv in BlueStore::OldExtent::create
Sage Weil
12:20 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Odd, I got the same error, Nick.
> libceph: get_reply osd4 tid 1850429 data 1835008 > preallocated 262144, skippi...
Mark Lopez

10/23/2018

04:48 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
I think I may be seeing this on actual client requests as well as scrubs. Since upgrading to Mimic and these scrub err... Nick Fisk
12:09 PM Bug #36567: Segmentation fault in BlueStore::Blob::discard_unallocated
Yes, so my question is whether all of those may just be a result of the race mentioned here: https://github.com/ceph/ce... Stefan Priebe
12:02 PM Bug #36567: Segmentation fault in BlueStore::Blob::discard_unallocated
The second log is similar to
http://tracker.ceph.com/issues/36526
Igor Fedotov
12:00 PM Bug #36567: Segmentation fault in BlueStore::Blob::discard_unallocated
But I'm also seeing those:... Stefan Priebe
11:56 AM Bug #36567 (Duplicate): Segmentation fault in BlueStore::Blob::discard_unallocated
Hello,
I'm observing regular crashes / segmentation faults of BlueStore OSDs in Ceph 12.2.8.
Trace as follows:
...
Stefan Priebe
12:01 PM Bug #36526: segv in BlueStore::OldExtent::create
Is this the same? https://tracker.ceph.com/issues/36567 Stefan Priebe
05:42 AM Bug #36099 (Resolved): ObjectStore/StoreTest.BluestoreRepairTest/2 fails with os/bluestore/BlueSt...
Nathan Cutler
05:32 AM Bug #36099 (Pending Backport): ObjectStore/StoreTest.BluestoreRepairTest/2 fails with os/bluestor...
Nathan Cutler
05:39 AM Backport #36145 (In Progress): luminous: fsck: cid is improperly matched to oid
Nathan Cutler
05:34 AM Backport #36146 (Resolved): mimic: fsck: cid is improperly matched to oid
Nathan Cutler
05:33 AM Backport #36551 (Resolved): mimic: ObjectStore/StoreTest.BluestoreRepairTest/2 fails with os/blue...
Nathan Cutler
05:32 AM Backport #36551 (Resolved): mimic: ObjectStore/StoreTest.BluestoreRepairTest/2 fails with os/blue...
https://github.com/ceph/ceph/pull/24480 Nathan Cutler

10/22/2018

07:39 PM Bug #36526 (Fix Under Review): segv in BlueStore::OldExtent::create
https://github.com/ceph/ceph/pull/24701 Sage Weil
07:29 PM Bug #36526: segv in BlueStore::OldExtent::create
... Sage Weil
06:24 PM Bug #25006 (Need More Info): bad csum during upgrade test
Looking at the log, I don't see any useful clues as to what might have gone wrong. No intervening writes, etc. Sage Weil
04:45 PM Bug #36422: ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
Looks similar to
http://tracker.ceph.com/issues/20236
Igor Fedotov
04:32 PM Feature #36231 (Resolved): cli options for ceph journal migration to different ssd/nvme
This has been implemented for BlueStore. And I haven't heard of any plans to support the same for FileStore. Hence ma... Igor Fedotov
03:37 PM Backport #36146: mimic: fsck: cid is improperly matched to oid
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24480
merged
Yuri Weinstein
09:50 AM Bug #36541: rename does not old ref to replacement onode at old name
But get_onode can't do onode::flush, so a later read (stat/getattr) still gets the foo info. Or I missed some... jianpeng ma
09:00 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
We are hitting this bug as well. In our cluster it occurred 14 times in the last 50 days.
This is our setup:
* 3 ...
Gaudenz Steinlin

10/20/2018

11:46 PM Bug #23206: ceph-osd daemon crashes - *** Caught signal (Aborted) **
Rams rams, could you please share your stack trace and log output preceding the assertion? Igor Fedotov
09:37 PM Bug #23206: ceph-osd daemon crashes - *** Caught signal (Aborted) **
We can confirm we are experiencing the same issue on version 12.2.7 and currently have some random OSDs that went off... Rams C
08:17 PM Bug #36541: rename does not old ref to replacement onode at old name
https://github.com/ceph/ceph/pull/24686 Sage Weil
08:15 PM Bug #36541: rename does not old ref to replacement onode at old name
Fix is to note_modified_object() in rename on the new replacement foo onode at the old name, so that it doesn't go aw... Sage Weil
08:14 PM Bug #36541 (Resolved): rename does not old ref to replacement onode at old name
- rename from foo to bar
- foo onode is moved to bar in onode_map
- keys removed at position foo as part of txc
- ...
Sage Weil

10/18/2018

10:19 PM Bug #36526 (Resolved): segv in BlueStore::OldExtent::create
... Sage Weil

10/17/2018

07:49 PM Feature #36231: cli options for ceph journal migration to different ssd/nvme
https://github.com/ceph/ceph/pull/23103 Igor Fedotov
07:49 PM Feature #36231 (Fix Under Review): cli options for ceph journal migration to different ssd/nvme
Igor Fedotov
07:21 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
There might be a work-around/fix for this: compacting the database
I did this:...
Wido den Hollander
01:43 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
One thing to add is that a few days ago, on 15-10-2018 at 18:13, multiple OSDs in this cluster were showing these messages:... Wido den Hollander
12:33 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
and it's BlueStore::get_omap_iterator() and/or its subsequent usage which triggered these long massive reads.
Igor Fedotov
12:31 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
Just one thing to add - reads from BlueFS are performed in a sequential manner using pretty ineffective block sizes (... Igor Fedotov
12:29 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
Igor Fedotov
12:22 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
To add to this, I am also able to reproduce it on osd.246 in this cluster:... Wido den Hollander
11:54 AM Bug #36482 (Resolved): High amount of Read I/O on BlueFS/DB when listing omap keys
I don't know how best to describe this issue, but I've been observing various issues with Luminous 12.2.4 ~ 12.2.... Wido den Hollander
06:31 PM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
And can you specify which kernel issue/bug you are talking about? You mentioned a 4.9+ kernel problem. Do you have an... Jan Pekař

10/16/2018

08:40 PM Bug #36455: BlueStore: ENODATA not fully handled
The code appears identical in master.
For this particular case, especially during scrub, we know our local copy is...
Lars Marowsky-Brée
11:19 AM Bug #36455 (Resolved): BlueStore: ENODATA not fully handled
We have a drive model experiencing weak writes, which manifest themselves as failed reads later; the drive notices th... Lars Marowsky-Brée

10/14/2018

01:03 PM Bug #36422 (Duplicate): ObjectStore/StoreTestSpecificAUSize.Many4KWritesTest/2 failure
... Sage Weil

10/10/2018

07:57 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
To clarify the behaviour I see on iostat... The disk %util of a hung disk goes to 100%, while the average queue lengt... Gavin Baker
09:06 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
Reporting back: increasing min_free_kbytes does not appear to have helped. Swap usage is only a couple of MB out of ... Nick Fisk
 
