Activity
From 11/06/2018 to 12/05/2018
12/05/2018
- 03:51 PM Bug #22534: Debian's bluestore *rocksdb* does not support neither fast CRC nor compression
- fix on rocksdb side:
- https://github.com/ceph/rocksdb/pull/41
12/04/2018
- 12:10 PM Bug #20236 (Can't reproduce): bluestore: ObjectStore/StoreTestSpecificAUSize.Many4KWritesNoCSumTe...
- 12:09 PM Bug #20236: bluestore: ObjectStore/StoreTestSpecificAUSize.Many4KWritesNoCSumTest/2 failure
- Sage Weil wrote:
> I haven't seen this in a while.. have you?
Me too. Just gave this another try for both master ...
12/03/2018
- 11:32 PM Backport #37495 (In Progress): luminous: bluefs-bdev-expand aborts
- 11:17 PM Backport #37495: luminous: bluefs-bdev-expand aborts
- https://github.com/ceph/ceph/pull/25384
- 08:36 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- Igor Fedotov wrote:
> May be benchmark this drive using FIO?
> And try to simulate the use pattern: mixed read + w...
- 08:35 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- Igor Fedotov wrote:
> BTW - do these drives/controllers have write caching enabled? May be try to disable if so? AFA...
12/01/2018
- 06:38 AM Backport #37494 (In Progress): mimic: bluefs-bdev-expand aborts
- 06:37 AM Backport #37494 (Resolved): mimic: bluefs-bdev-expand aborts
- https://github.com/ceph/ceph/pull/25348
- 06:37 AM Backport #37495 (Resolved): luminous: bluefs-bdev-expand aborts
- https://github.com/ceph/ceph/pull/25384
- 06:37 AM Bug #37360 (Pending Backport): bluefs-bdev-expand aborts
11/30/2018
- 06:49 PM Bug #37360: bluefs-bdev-expand aborts
- mimic fix (which is completely different from Nautilus one as we don't backport main device expansion feature): https...
- 01:30 PM Bug #20236: bluestore: ObjectStore/StoreTestSpecificAUSize.Many4KWritesNoCSumTest/2 failure
- I haven't seen this in a while.. have you?
- 01:29 PM Bug #26896 (Can't reproduce): store_test.cc: FAILED ObjectStore/StoreTest.Rename/2
11/29/2018
- 08:22 PM Bug #23463 (Can't reproduce): src/os/bluestore/StupidAllocator.cc: 336: FAILED assert(rm.empty())
- 08:21 PM Bug #25006 (Can't reproduce): bad csum during upgrade test
- http://pulpito.ceph.com/sage-2018-11-29_15:08:26-upgrade:luminous-x-mimic-distro-basic-smithi/
- 07:30 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- BTW - do these drives/controllers have write caching enabled? May be try to disable if so? AFAIR there were some talk...
- 07:19 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- May be benchmark this drive using FIO?
And try to simulate the use pattern: mixed read + write + fdatasync.
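The pattern suggested in the comment above (mixed read + write + fdatasync) can be roughly approximated without fio. The following is a minimal sketch, not a replacement for a real fio run: the function name `time_fdatasync`, the scratch file name, and the block size are all hypothetical, and the reads will mostly be served from the page cache rather than the drive.

```python
import os
import tempfile
import time


def time_fdatasync(path, rounds=5, block_size=4096):
    """Write a block, read one back, then time fdatasync.

    Returns a list with the fdatasync duration (seconds) of each round.
    """
    timings = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for i in range(rounds):
            # Write one block of random data at the next offset.
            os.pwrite(fd, os.urandom(block_size), i * block_size)
            # Read the first block back (likely a page cache hit).
            os.pread(fd, block_size, 0)
            # Time only the flush, which is the call the bug report is about.
            start = time.monotonic()
            os.fdatasync(fd)
            timings.append(time.monotonic() - start)
    finally:
        os.close(fd)
    return timings


if __name__ == "__main__":
    # Hypothetical target: a scratch file in a temp directory, not an OSD device.
    with tempfile.TemporaryDirectory() as tmp:
        probe = os.path.join(tmp, "probe.bin")
        for n, secs in enumerate(time_fdatasync(probe)):
            print(f"round {n}: fdatasync took {secs:.6f}s")
```

To exercise a specific drive, point the path at a file on a filesystem backed by that drive; flush times far above the rest would be the thing to look for.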
- 05:25 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- [2041833.966145] INFO: task bstore_kv_sync:79243 blocked for more than 120 seconds.
[2041833.966148] "echo 0 > /proc...
- 05:24 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- I also less frequently get these dmesg errors. Not sure if they are related.
[2041833.966150] bstore_kv_sync D ff...
- 05:03 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- No not SMR, these drives are Seagate Exos 10TB Enterprise sata drives. We are seeing this behavior on multiple types ...
- 03:31 PM Bug #36364: Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- The code is just timing fdatasync(2), so the problem is almost certainly going to be below ceph (kernel or hardware)
...
- 03:29 PM Bug #36364 (Need More Info): Bluestore OSD IO Hangs near Flush (flush in 90.330556)
- This flush time is suspiciously close to 90s (flush in 90.330556)...
These aren't SMR drives, right?
- 03:54 PM Bug #36268 (Fix Under Review): Unable to recover from ENOSPC in BlueFS
- https://github.com/ceph/ceph/pull/25132
- 03:45 PM Bug #23120 (Can't reproduce): OSDs continously crash during recovery
- 03:44 PM Bug #25207 (Can't reproduce): ceph-volume lvm create gives segmentation fault
- 03:38 PM Bug #36284 (Duplicate): Bluestore might be hanging OSD
- 03:35 PM Bug #36303 (Duplicate): luminous: 12.2.8 - FAILED assert(0 == "put on missing extent (nothing bef...
- 03:34 PM Bug #36331: FAILED ObjectStore/StoreTestSpecificAUSize.SyntheticMatrixNoCsum/2 (zeros)
- ...
- 03:26 PM Bug #36455: BlueStore: ENODATA not fully handled
- 03:19 PM Bug #36567 (Duplicate): Segmentation fault in BlueStore::Blob::discard_unallocated
- 03:18 PM Bug #37090 (Can't reproduce): BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- I have a feeling this is caused by http://tracker.ceph.com/issues/36526, the fix for which is in 12.2.10.
- 03:15 PM Bug #37282 (Need More Info): rocksdb: submit_transaction_sync error: Corruption: block checksum m...
- 02:58 PM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
- Somewhat similar issue, may be useful as recovery guidance:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018...
- 03:11 PM Bug #25001 (Can't reproduce): Crashing OSDs after going from 12.2.5 -> 12.2.6 -> 13.2.0
- I believe this is related to the SharedBlob refcounting bugs. See 7031addfe6fcd070df8c4c7b175f374bda77a671 and ff883...
- 03:06 PM Bug #25050 (Need More Info): osd: OSD Failed to Start In function 'int BlueStore::_do_alloc_write
- 02:55 PM Bug #37360 (Fix Under Review): bluefs-bdev-expand aborts
- 02:55 PM Bug #37360: bluefs-bdev-expand aborts
- https://github.com/ceph/ceph/pull/25308
- 09:11 AM Bug #32731 (Resolved): fsck: cid is improperly matched to oid
- 09:11 AM Backport #36145 (Resolved): luminous: fsck: cid is improperly matched to oid
- 01:07 AM Backport #36145: luminous: fsck: cid is improperly matched to oid
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24705
merged
- 09:09 AM Backport #36638 (Resolved): luminous: rename does not old ref to replacement onode at old name
- 01:04 AM Backport #36638: luminous: rename does not old ref to replacement onode at old name
- Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/24989
merged
- 06:14 AM Bug #24439 (Resolved): os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_bytes >= b->length) ...
- 06:14 AM Backport #26943 (Resolved): luminous: os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_bytes...
- 01:04 AM Backport #26943: luminous: os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_bytes >= b->leng...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/24992
merged
- 04:28 AM Backport #36639 (In Progress): mimic: rename does not old ref to replacement onode at old name
- https://github.com/ceph/ceph/pull/25313
- 01:08 AM Bug #22464: Bluestore: many checksum errors, always 0x6706be76 (which matches a zero block)
- https://github.com/ceph/ceph/pull/24649 merged
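The title of Bug #22464 says the recurring checksum matches a zero block. A minimal sketch of how such a comparison value could be computed is given here, using a plain bitwise CRC-32C (Castagnoli). The 4096-byte block size, the 0xFFFFFFFF seed, and the absence of a final XOR are assumptions about how the checksum is taken, not something this log states, so the printed value should be checked against the tracker rather than taken as confirmed.

```python
def crc32c(data, init=0xFFFFFFFF, xorout=0xFFFFFFFF):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.

    The defaults give the standard CRC-32C; pass xorout=0 for a raw
    (non-finalized) value.
    """
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when the bit was set.
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ xorout


if __name__ == "__main__":
    # Assumption: one 4 KiB checksum chunk that contains only zero bytes.
    zero_block = bytes(4096)
    # Assumption: seed 0xFFFFFFFF and no final XOR.
    print(hex(crc32c(zero_block, xorout=0)))
```

If a stored checksum equals the value computed for an all-zero block, the data that was read back was most likely zeros rather than the expected content.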
11/26/2018
- 11:41 PM Backport #36754 (Resolved): mimic: _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
- 08:49 PM Backport #36754: mimic: _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25062
merged
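The two ranges in the title of Backport #36754 look different but appear to describe the same extent. The sketch below assumes the notation is offset~length, with the first range printed in hexadecimal and the bracketed one in decimal; that reading and the helper names are assumptions made for illustration.

```python
def parse_hex_extent(text):
    """Parse 'offset~length' where both fields are hexadecimal."""
    offset, length = text.split("~")
    return int(offset, 16), int(length, 16)


def parse_dec_extent(text):
    """Parse '[offset~length]' where both fields are decimal."""
    offset, length = text.strip("[]").split("~")
    return int(offset), int(length)


def overlaps(a, b):
    """Return True when two (offset, length) extents share at least one byte."""
    a_start, a_len = a
    b_start, b_len = b
    return a_start < b_start + b_len and b_start < a_start + a_len


if __name__ == "__main__":
    new_io = parse_hex_extent("0x10000~1000")
    inflight = parse_dec_extent("[65536~4096]")
    print(new_io, inflight, overlaps(new_io, inflight))
```

Under that reading, 0x10000 is 65536 and 0x1000 is 4096, so the new I/O covers exactly the range that is already in flight.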
11/22/2018
- 11:25 AM Bug #37360: bluefs-bdev-expand aborts
- Got it. Thanks, Mark!
So as I said before main device resize isn't supported at the moment.
Will probably start a...
- 11:10 AM Bug #37360: bluefs-bdev-expand aborts
- I decided to enlarge OSD backing store device to be able to store more data on this OSD without re-creating it.
Se...
- 10:17 AM Bug #37360: bluefs-bdev-expand aborts
- Actually there are 2 aspects for this ticket:
1) the tool improperly handles OSD deployments that lack DB and/or WAL...
- 09:34 AM Bug #37360 (In Progress): bluefs-bdev-expand aborts
- 09:04 AM Bug #37360: bluefs-bdev-expand aborts
- Problem is still triggered every time.
- 09:04 AM Bug #37360: bluefs-bdev-expand aborts
- ...
- 09:03 AM Bug #37360: bluefs-bdev-expand aborts
- ...
- 08:46 AM Bug #37360: bluefs-bdev-expand aborts
- Wondering if the bluefs-bdev-sizes command works fine? What about fsck?
11/21/2018
- 09:35 PM Bug #37360 (Resolved): bluefs-bdev-expand aborts
- root@node1:~# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-16
infering bluefs devices from b...
11/16/2018
- 05:31 PM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
- I have checked the kernel log and smartctl and do not see any errors.
- 09:48 AM Bug #37282: rocksdb: submit_transaction_sync error: Corruption: block checksum mismatch code = 2
- First, I suggest checking the disk drive behind the DB volume for physical errors.
- 05:28 AM Bug #37282 (Need More Info): rocksdb: submit_transaction_sync error: Corruption: block checksum m...
- I have an OSD that will not start. It keeps crashing. Not sure where to go from here. Unfortunately, it happened ri...
11/14/2018
- 09:13 PM Bug #37090: BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Kjetil Joergensen wrote:
> Kjetil Joergensen wrote:
> > Ok - I think you can close this one. This is in all likelih...
- 08:56 PM Bug #37090: BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Kjetil Joergensen wrote:
> Ok - I think you can close this one. This is in all likelihood a hardware error of some s...
- 08:41 PM Bug #37090: BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Ok - I think you can close this one. This is in all likelihood a hardware error of some sort, on the same machine I h...
- 06:11 PM Bug #37090: BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Log posted with ceph-upload-file: fbc90b08-887d-40b9-99b9-0a843465a313
Console output below...
- 09:47 AM Bug #37090: BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Could you please run fsck on this OSD with "debug bluestore" set to 20 and share the log?
11/13/2018
- 07:49 PM Bug #37090: BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Part of the osd log, should include the first crash and maybe a couple of the subsequent ones, to make it fit within t...
- 07:27 PM Bug #37090 (Can't reproduce): BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")
- Possibly a duplicate of #36303
What is slightly interesting, after setting the osd out and migrating off of it, it...
- 06:38 PM Backport #36641 (Need More Info): mimic: Unable to recover from ENOSPC in BlueFS
- Igor writes in the parent issue: "In fact previously mentioned PR is just a workaround to be able to manually fix the...
- 06:37 PM Backport #36640 (Need More Info): luminous: Unable to recover from ENOSPC in BlueFS
- Igor writes in the parent issue: "In fact previously mentioned PR is just a workaround to be able to manually fix the...
- 10:36 AM Bug #36268 (In Progress): Unable to recover from ENOSPC in BlueFS
- In fact previously mentioned PR is just a workaround to be able to manually fix the issue.
Working on the actual sol...
11/12/2018
- 06:16 PM Backport #36755 (In Progress): luminous: _aio_log_start inflight overlap of 0x10000~1000 with [65...
- 04:26 PM Backport #36754 (In Progress): mimic: _aio_log_start inflight overlap of 0x10000~1000 with [65536...
11/10/2018
- 08:54 AM Backport #36755 (Rejected): luminous: _aio_log_start inflight overlap of 0x10000~1000 with [65536...
- https://github.com/ceph/ceph/pull/25064
- 08:54 AM Backport #36754 (Resolved): mimic: _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
- https://github.com/ceph/ceph/pull/25062
11/08/2018
- 11:04 PM Bug #36606 (Resolved): osd: checksum failure during upgrade test
- 11:04 PM Bug #36606: osd: checksum failure during upgrade test
- Sage, no, it's specific to Nautilus for now. We need it when/if we backport BlueFS migrate stuff.
- 10:28 PM Bug #36606 (Pending Backport): osd: checksum failure during upgrade test
- Igor, we should backport this, right?
- 10:29 PM Bug #36625 (Pending Backport): _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096]
- 01:56 PM Backport #26943 (In Progress): luminous: os/bluestore/BlueStore.cc: 1025: FAILED assert(buffer_by...
- 09:53 AM Backport #36638 (In Progress): luminous: rename does not old ref to replacement onode at old name
11/06/2018
- 03:37 PM Bug #36606: osd: checksum failure during upgrade test
- https://github.com/ceph/ceph/pull/24948
- 01:45 PM Bug #36606 (Fix Under Review): osd: checksum failure during upgrade test
- 01:28 PM Bug #36606 (In Progress): osd: checksum failure during upgrade test