Activity
From 01/23/2019 to 02/21/2019
02/21/2019
- 09:52 PM Backport #38142: luminous: os/bluestore: fixup access a destroy cond cause deadlock or undefine b...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26261
merged
- 03:49 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
- (original reporter here)
I have the following customisation in ceph.conf:...
- 01:00 PM Bug #25077: Occasional assertion in ObjectStore/StoreTest.HashCollisionTest/2
- We *think* we have hit this same issue "in the field" on a Luminous 12.2.8 cluster:
2019-02-20 18:42:45.261357 7fd...
02/20/2019
- 11:15 AM Bug #38395 (Fix Under Review): luminous: write following remove might access previous onode
- 10:35 AM Bug #38395: luminous: write following remove might access previous onode
- 10:25 AM Bug #38395 (Resolved): luminous: write following remove might access previous onode
- So the sequence is as follows:
T1:
remove A
T2:
touch A
write A
In Luminous there is a chance that A is rem...
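The race above can be sketched with a toy cache (illustrative Python, not the actual BlueStore code; all names here are hypothetical): if remove fails to invalidate the cached entry, the later touch/write in T2 gets the destroyed onode back instead of a fresh one.

```python
# Illustrative sketch (not Ceph code) of the stale-cache hazard: T1 removes
# object A, but the cache entry outlives the removal, so T2's touch/write
# resurrects the destroyed onode instead of creating a fresh one.

class Onode:
    def __init__(self, name):
        self.name = name
        self.exists = True
        self.data = b""

class NaiveCache:
    """Cache that fails to invalidate on remove -- the bug pattern."""
    def __init__(self):
        self._map = {}

    def get_or_create(self, name):
        if name not in self._map:
            self._map[name] = Onode(name)
        return self._map[name]

    def remove(self, name):
        self._map[name].exists = False  # BUG: entry stays in the map

cache = NaiveCache()
a = cache.get_or_create("A")
cache.remove("A")                # T1: remove A
b = cache.get_or_create("A")     # T2: touch A
b.data = b"new"                  # T2: write A
print(b is a)                    # True  -- we got the destroyed onode back
print(b.exists)                  # False -- the write lands on a dead object
```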
02/18/2019
- 01:41 PM Bug #38363 (Need More Info): Failure in assert when calling: ceph-volume lvm prepare --bluestore ...
- I run Ubuntu 18.04 and ceph version 13.2.4-1bionic from this repo: https://download.ceph.com/debian-mimic.
Whe...
02/16/2019
- 11:00 AM Backport #37990 (Resolved): mimic: Compression not working, and when applied OSD disks are failin...
02/15/2019
- 10:38 PM Bug #37839: Compression not working, and when applied OSD disks are failing randomly
- merged https://github.com/ceph/ceph/pull/26342
https://github.com/ceph/ceph/pull/26544
- 08:29 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
- - have any options been customized?
- what version is this? 14.0.1-2.fc30 is a random dev checkpoint commit from ma...
- 08:12 PM Bug #38329: OSD crashes in get_str_map while creating with ceph-volume
- Changing this back to the Ceph tracker; as far as I can see, this is not a crash in ceph-volume or specific to ceph-volume
- 12:57 PM Bug #38329 (Resolved): OSD crashes in get_str_map while creating with ceph-volume
- see https://bugzilla.redhat.com/show_bug.cgi?id=1661583...
- 03:10 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- We have not experienced any further crashes in over a week (compared to multiple crashes per hour before), so it look...
02/13/2019
- 12:39 PM Bug #38230 (Resolved): segv in onode lookup
- https://github.com/ceph/ceph/pull/26391
02/12/2019
- 03:35 PM Bug #38272: "no available blob id" assertion might occur
- onode dump shortly before the assertion:
2019-02-12 18:23:47.546 7fca6fab1b40 0 bluestore(bluestore.test_temp_dir) ...
- 03:29 PM Bug #38272: "no available blob id" assertion might occur
- Backtrace from UT:
-1> 2019-02-12 18:23:48.346 7fca6fab1b40 -1 /home/if/ceph/src/os/bluestore/BlueStore.cc: In fun...
- 03:26 PM Bug #38272: "no available blob id" assertion might occur
- Stack trace from the customer log:
2019-02-06 00:04:25.934977 7ff3e3bca700 -1 /home/abuild/rpmbuild/BUILD/ceph-12.2....
- 03:25 PM Bug #38272 (Resolved): "no available blob id" assertion might occur
- We observed this on-site, but unfortunately the OSDs were removed and are unavailable for inspection.
However I managed to...
- 03:01 PM Backport #38143 (Resolved): mimic: os/bluestore: fixup access a destroy cond cause deadlock or un...
- 12:00 AM Backport #38143: mimic: os/bluestore: fixup access a destroy cond cause deadlock or undefine beha...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26260
merged
- 02:53 PM Backport #38188 (In Progress): luminous: deep fsck fails on inspecting very large onodes
- 02:50 PM Backport #38187 (Resolved): mimic: deep fsck fails on inspecting very large onodes
02/11/2019
- 09:08 PM Backport #38187: mimic: deep fsck fails on inspecting very large onodes
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26291
merged
- 04:34 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- From the first log, this looks like #36541. My guess is the crashes you were seeing after were continued problems fr...
- 10:06 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- I still seem to be experiencing these errors, albeit at a much reduced rate since upgrading to 13.2.3. I could wake u...
02/09/2019
- 05:14 PM Bug #38250 (Rejected): assert failure crash prevents ceph-osd from running
- One of my OSDs keeps crashing shortly after startup, which is preventing it from joining the cluster. The core issue...
02/08/2019
- 03:40 PM Bug #38230: segv in onode lookup
- ...
02/07/2019
- 10:41 PM Bug #38230: segv in onode lookup
- I'm guessing this is the same heap corruption we've been seeing, but logging it anyway
- 10:40 PM Bug #38230 (Resolved): segv in onode lookup
- ...
02/06/2019
- 03:56 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- After two days of running fine I had set the bluestore and bluefs log level back to default, so I don't know how helpf...
- 03:27 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- Lawrence, would you share the log for current crashes please?
Existing failures with fsck are expected as the patc...
- 01:24 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- We have since patched our ceph with https://github.com/ceph/ceph/pull/24686 which fixes Issue #36541. Since then the ...
- 02:16 PM Bug #38176: Unable to recover from ENOSPC in BlueFS, WAL
- Fixed link to bug replication script.
https://drive.google.com/file/d/10Lvcf6_Lj2c2sydcfU170lbb-IQClvH-
- 09:05 AM Bug #37360 (Resolved): bluefs-bdev-expand aborts
- 08:54 AM Backport #38188: luminous: deep fsck fails on inspecting very large onodes
- No need for that additional cherry-pick, just add new option using the method applicable for luminous
02/05/2019
- 11:56 PM Backport #38188 (Need More Info): luminous: deep fsck fails on inspecting very large onodes
- We need to cherry-pick additional commits to get this backport PR: Option::TYPE_SIZE and Option::FLAG_RUNTIME not de...
- 05:02 PM Backport #38188 (Resolved): luminous: deep fsck fails on inspecting very large onodes
- https://github.com/ceph/ceph/pull/26387
- 11:31 PM Backport #38187 (In Progress): mimic: deep fsck fails on inspecting very large onodes
- https://github.com/ceph/ceph/pull/26291
- 05:01 PM Backport #38187 (Resolved): mimic: deep fsck fails on inspecting very large onodes
- https://github.com/ceph/ceph/pull/26291
- 09:39 PM Backport #37494 (Resolved): mimic: bluefs-bdev-expand aborts
- 09:17 PM Backport #37494: mimic: bluefs-bdev-expand aborts
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25348
merged
- 11:43 AM Bug #38176 (Fix Under Review): Unable to recover from ENOSPC in BlueFS, WAL
- 11:28 AM Bug #38176 (Won't Fix): Unable to recover from ENOSPC in BlueFS, WAL
- It is possible to insert so much OMAP data into objects that it will overflow storage and cause ENOSPC when rocksdb t...
02/04/2019
- 08:55 PM Bug #38065 (Pending Backport): deep fsck fails on inspecting very large onodes
- 08:51 PM Bug #38065: deep fsck fails on inspecting very large onodes
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/26170
merged
- 11:21 AM Backport #38161 (Rejected): mimic: KernelDevice exclusive lock broken
- 11:21 AM Backport #38160 (Rejected): luminous: KernelDevice exclusive lock broken
- https://github.com/ceph/ceph/pull/34514
- 03:32 AM Backport #38142 (In Progress): luminous: os/bluestore: fixup access a destroy cond cause deadlock...
- https://github.com/ceph/ceph/pull/26261
- 03:30 AM Backport #38143 (In Progress): mimic: os/bluestore: fixup access a destroy cond cause deadlock or...
- https://github.com/ceph/ceph/pull/26260
02/03/2019
02/02/2019
- 10:59 AM Bug #38049: random osds failing in thread_name:bstore_kv_final
- The fsck on a failed osd gives the following:...
02/01/2019
- 05:39 PM Bug #38150: KernelDevice exclusive lock broken
- https://github.com/ceph/ceph/pull/26245
- 05:37 PM Bug #38150 (Resolved): KernelDevice exclusive lock broken
- fcntl locks go away when *any* fd on the file is closed.
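A minimal demonstration of this POSIX behavior (illustrative Python using the stdlib fcntl wrapper, not Ceph code; the file path is a throwaway temp file, and this assumes a Linux/POSIX host):

```python
import fcntl
import os
import tempfile

# Demo of the POSIX record-lock pitfall: closing *any* descriptor a process
# holds on a file drops all of that process's fcntl locks on the file, even
# locks taken through a different descriptor.

fd, path = tempfile.mkstemp()
os.close(fd)

fd_lock = os.open(path, os.O_RDWR)     # descriptor used to take the lock
fd_other = os.open(path, os.O_RDONLY)  # unrelated descriptor, same file

fcntl.lockf(fd_lock, fcntl.LOCK_EX)    # exclusive advisory record lock

def locked_by_someone_else():
    """Fork a child and probe whether the file is still locked."""
    pid = os.fork()
    if pid == 0:  # child: try a non-blocking exclusive lock of its own
        probe = os.open(path, os.O_RDWR)
        try:
            fcntl.lockf(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
            os._exit(0)   # acquired -> parent's lock is gone
        except OSError:
            os._exit(1)   # refused -> parent still holds the lock
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1

before_close = locked_by_someone_else()
os.close(fd_other)                   # close the *unrelated* descriptor...
after_close = locked_by_someone_else()
print(before_close, after_close)     # True False: the lock silently vanished
os.remove(path)
```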
This can cause corruption when running ceph-osd in contai...
- 02:12 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- @Sage - could you please confirm this is a duplicate of #36541
- 02:10 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- Log extract for competing transactions (T1 - rename A->B, T2 - write A, T3 - remove B):...
- 02:05 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- Looks like a duplicate of #36541
- 01:54 PM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
- I might have been bitten by the same issue. The OSD in question has its main data on a spinning drive and its data...
- 09:18 AM Backport #38143 (Resolved): mimic: os/bluestore: fixup access a destroy cond cause deadlock or un...
- https://github.com/ceph/ceph/pull/26260
- 09:18 AM Backport #38142 (Resolved): luminous: os/bluestore: fixup access a destroy cond cause deadlock or...
- https://github.com/ceph/ceph/pull/26261
01/31/2019
- 03:28 PM Bug #20557 (Closed): segmentation fault with rocksdb|BlueStore and jemalloc
- 03:26 PM Bug #25098: Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)`
- This is about the OSD's behavior in a malfunctioning environment (disappearing block.db symlink). We need to terminate the OSD b...
- 03:26 PM Bug #24561 (Resolved): if disableWAL is set, submit_transacton_sync will met error.
- 03:24 PM Bug #24906 (Closed): fio with bluestore crushed
- Closing this for now based on Adam's theory. Please feel free to reopen if the issue persists.
- 03:19 PM Bug #27222 (Can't reproduce): FAILED assert(available >= allocated) in void AllocatorLevel02<T>::...
- 03:10 PM Bug #37733 (Pending Backport): os/bluestore: fixup access a destroy cond cause deadlock or undefi...
- 02:58 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- Just to record intermediate analysis results:
the failure sequence looks like the following - compressed blob was ov... - 02:21 PM Bug #38049: random osds failing in thread_name:bstore_kv_final
- Lawrence, can you please run fsck for some of the already failed OSDs.
01/30/2019
- 06:20 PM Bug #36541 (Resolved): rename does not old ref to replacement onode at old name
- 06:19 PM Backport #36639 (Resolved): mimic: rename does not old ref to replacement onode at old name
- 05:06 PM Backport #36639: mimic: rename does not old ref to replacement onode at old name
- Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/25313
merged
01/28/2019
- 03:28 PM Bug #38065 (Fix Under Review): deep fsck fails on inspecting very large onodes
- 03:27 PM Bug #38065: deep fsck fails on inspecting very large onodes
- https://github.com/ceph/ceph/pull/26170
- 03:16 PM Bug #38065: deep fsck fails on inspecting very large onodes
- Looks like aio_queue_t::submit_batch times out due to a long list of blocks to read. We need to cap this for fsck and, p...
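The capping idea can be sketched as follows (illustrative Python; MAX_BATCH, the extent list, and the callback are hypothetical and do not reflect the actual aio_queue_t interface):

```python
# Illustrative sketch of the proposed fix: instead of submitting one huge
# read list for a multi-gigabyte onode, split it into bounded batches so no
# single submit call can exceed the I/O timeout. Values are made up.

MAX_BATCH = 16  # hypothetical cap on extents per submitted batch

def submit_capped(extents, submit_batch):
    """Submit a long extent list as a series of bounded batches."""
    for i in range(0, len(extents), MAX_BATCH):
        submit_batch(extents[i:i + MAX_BATCH])

batches = []
submit_capped(list(range(40)), batches.append)
print([len(b) for b in batches])  # [16, 16, 8]
```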
- 02:21 PM Bug #38065 (Resolved): deep fsck fails on inspecting very large onodes
- Steps to reproduce (100%):
Put a 3GB object into a replicated pool via rados, then do a deep fsck either on mount or via cep...
- 02:50 PM Backport #37824 (Resolved): mimic: BlueStore: ENODATA not fully handled
01/25/2019
- 04:03 PM Backport #37824: mimic: BlueStore: ENODATA not fully handled
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25854
merged
- 03:24 PM Bug #38049 (Resolved): random osds failing in thread_name:bstore_kv_final
- Since upgrading to mimic 13.2.4 single osds are randomly failing with the following assert:...