Activity
From 10/15/2019 to 11/13/2019
11/13/2019
- 11:47 PM Bug #42284 (Duplicate): fastbmap_allocator_impl.h: FAILED ceph_assert(available >= allocated) in ...
- haha - we've finally got it locally...
https://tracker.ceph.com/issues/42223
- 09:11 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Marcin,
003808.sst has the same bluefs log pattern inside at e.g. offset 5687664. So it seems it has been overwritt...
- 08:27 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Igor,
Well done!
Would it also explain problem I had with 003808.sst(30MB) where all others were 16MB as they w...
- 07:48 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Hopefully here is the final analysis for the root cause.
a) analysis for 002375.sst dump from comment #94 reveals ...
- 05:39 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Rafal - great, thank you. Now I can tell that data extents in this file which are supposed to be RocksDB data belong...
- 11:25 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Yes, it is available; I didn't redeploy the failed OSDs...
- 07:12 PM Bug #20236: bluestore: ObjectStore/StoreTestSpecificAUSize.Many4KWritesNoCSumTest/2 failure
- /a/yuriw-2019-11-08_20:50:44-rados-wip-yuri3-testing-2019-11-08-1221-nautilus-distro-basic-smithi/4485359/
- 06:17 PM Bug #20236: bluestore: ObjectStore/StoreTestSpecificAUSize.Many4KWritesNoCSumTest/2 failure
- This appeared in nautilus
/a/yuriw-2019-11-12_15:30:19-rados-wip-yuri5-testing-2019-11-11-1520-nautilus-distro-bas...
- 05:25 PM Bug #42683: OSD Segmentation fault
- @Antonio - yes, please go ahead.
- 04:17 PM Bug #42683: OSD Segmentation fault
- @Igor if you don't object I would scratch the OSD to keep testing the system.
11/12/2019
- 05:00 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Rafal, is osd.35 still available? If so could you please run bluestore-tool:
CEPH_ARGS="--debug-bluefs 10 --log-fil...
- 09:22 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Rafal Wadolowski wrote:
> I just upgraded the version, so they were created with the 14.2.3 version.
> So do you suggest t...
- 09:17 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I just upgraded the version, so they were created with the 14.2.3 version.
So do you suggest to redeploy cluster?
- 09:15 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Thanks, Rafal! One clarification though - have you redeployed all the OSDs before running Ceph with the latest RocksD...
- 08:55 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Igor the version with upgraded rocksdb has broken two OSDs so far. Logs from one of them:...
- 03:15 PM Bug #42166: crash when LRU trimming
- Just to note, osd log contains multiple odd checksum verification failures from RocksDB, e.g.
2019-10-02T11:44:22....
- 12:15 AM Bug #42712 (Resolved): ObjectStore/StoreTest.ColSplitTest0/2 hangs
11/11/2019
- 08:48 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Krzysztof - what I'd like to do for broken OSDs is:
a) (simple case) export bluefs log using ceph-bluestore-tool's ...
- 08:23 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I'm actually not sure - I see that the cluster has been upgraded to a new 14.2.4 build with upgraded rocksdb, but osd-...
- 06:59 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Rafal, @Krzysztof - am I right that data for OSDs from comment #87 aren't available any more as you deployed a new c...
- 03:03 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I deployed the version with new rocksdb and started tests. I will come up with results :)
- 01:41 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Hi Marcin,
there is bluefs_buffered_io parameter (true by default) which controls buffered vs. direct IO mode. And y...
11/10/2019
- 12:02 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Dev guys,
I haven't checked the source code for handling KV store but is it possible that some parts use buffered ...
11/08/2019
- 11:34 PM Bug #42712 (Fix Under Review): ObjectStore/StoreTest.ColSplitTest0/2 hangs
- 11:28 PM Bug #42712: ObjectStore/StoreTest.ColSplitTest0/2 hangs
- I think the problem is ff71ad472e94e14f392c618b6eb5e8608afec94f
Here's the log:...
- 11:13 PM Bug #42712: ObjectStore/StoreTest.ColSplitTest0/2 hangs
- /a/sage-2019-11-08_19:27:06-rados-master-distro-basic-smithi/4484836
- 05:43 PM Bug #42712 (Resolved): ObjectStore/StoreTest.ColSplitTest0/2 hangs
- ...
- 02:54 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Entire db/000960.sst for osd.50 can be downloaded from https://cf2.cloudferro.com:8080/swift/v1/AUTH_36d73de269134544...
- 01:32 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Entire db/000960.sst can be downloaded from https://cf2.cloudferro.com:8080/swift/v1/AUTH_36d73de2691345449a04f9b9f95...
- 01:25 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Another crash, attached osd log, and excerpt below:...
- 12:32 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- If anyone has a spare cluster to try a custom build, here is nautilus v14.2.4 with RocksDB updated to 6.4.6.
Worth checki...
- 09:52 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- And this time it still happens during compaction, doesn't it?
- 09:50 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Krzysztof - thanks for the info. Please share more when available.
Please also export bluefs for this OSD and shar...
- 09:20 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Attached is a log from OSD that has crashed with Igor's patch applied, the interesting part is below:...
- 11:04 AM Bug #42683: OSD Segmentation fault
- @Igor here is a link to the file:
https://drive.google.com/open?id=1sd3507O58wyb0a1iGt4fjWjQGkoOgnwH
Thanks aga...
- 10:35 AM Bug #42683: OSD Segmentation fault
- Here it is.
The proper command is:
ceph-bluestore-tool --path <osd-path> --out-dir <destination dir> --command bl...
- 10:33 AM Bug #42683: OSD Segmentation fault
- @Igor I don't know how exactly I should do this; if you can give some instructions or point me to some docs I would...
- 10:26 AM Bug #42683: OSD Segmentation fault
- @Antonio - would you be able to export bluefs for the broken OSD to regular file system and then share content of db/...
- 10:24 AM Bug #42683 (Duplicate): OSD Segmentation fault
- Duplicate of https://tracker.ceph.com/issues/42223
- 10:23 AM Bug #42683: OSD Segmentation fault
- @Antonio, thanks a lot.
So the same pattern:
-49> 2019-11-06 13:08:33.333 7f30b5b93700 3 rocksdb: [db/db_impl_c...
- 10:15 AM Bug #42683: OSD Segmentation fault
- @Igor You can find attached the very first error portion of the log. I am afraid at that moment the debug level was n...
- 10:00 AM Bug #42683: OSD Segmentation fault
- @Antonio, could you please share the log for this first crash
- 08:25 AM Bug #42683: OSD Segmentation fault
- After the 1Gb tuning the cluster was all fine. Then we started our stress test, that is fio with increasing numbe...
11/07/2019
- 11:36 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Another option to try (once the current tests are done): set rocksdb_enable_rmrange = false
- 03:17 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias Fischer wrote:
> @Igor: I installed 14.2.4 with the patch on 2 Clusters that already had failed OSDs. As soon...
- 02:24 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- @Igor: I installed 14.2.4 with the patch on 2 Clusters that already had failed OSDs. As soon as new OSDs fail i will ...
- 08:17 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Igor Fedotov wrote:
> Can anybody try to make a custom build for Nautilus 14.2.3 or. 4 using this patch, update a cl...
- 05:28 PM Bug #42683: OSD Segmentation fault
- well, please disregard my words about recovery referenced in the log - they are present after regular shutdown as wel...
- 05:17 PM Bug #42683: OSD Segmentation fault
- @Antonio - IMO it doesn't make much sense to fix these specific parameters - who knows what else has been broken... H...
- 04:40 PM Bug #42683: OSD Segmentation fault
- No, any assertions before this. What we were doing was to test the lower limit at which the OSD could operate, given ...
- 04:28 PM Bug #42683: OSD Segmentation fault
- So brief log analysis:
Freelist init shows some garbage in DB for its key records
>> -9> 2019-11-07 17:08:33.20...
- 04:23 PM Bug #42683: OSD Segmentation fault
- I attached the portion of the log produced after the restart with an increased debug level.
T...
- 03:52 PM Bug #42683: OSD Segmentation fault
- @Antonio - thanks for the info, but I need more...
First of all please preserve all the available logs for this sp...
- 03:32 PM Bug #42683: OSD Segmentation fault
- Dear Igor,
I attached the log portion you required.
Thanks
Antonio
- 03:19 PM Bug #42683: OSD Segmentation fault
- @Antonio - could you please provide the whole log for the failure?
- 02:11 PM Bug #42683 (Duplicate): OSD Segmentation fault
- Dear support,
I have a small ceph cluster installed with nautilus 14.2.4 meant for a test for future larger deployme...
11/06/2019
- 08:47 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- As a quick update, I've purged and re-deployed our cluster with 14.2.2. I've been running same stress tests for the l...
- 05:25 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Mine aren't crashing anymore... But I managed to reduce the load
- 02:54 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Can anybody try to make a custom build for Nautilus 14.2.3 or. 4 using this patch, update a cluster and collect logs ...
- 07:29 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Did anybody find a way to reproduce the issue quickly? Right now it still occurs randomly.
- 09:13 AM Bug #41215 (Duplicate): os/bluestore: do not set osd_memory_target default from cgroup limit
- duplicate of #41200
11/05/2019
- 07:09 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Yeah, from this ticket point of view 14.2.2 is the best choice. Not sure how good it is in general though - it makes ...
- 06:30 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I'm currently considering biting the bullet and redeploying one of our clusters with one of earlier nautilus releases...
- 05:42 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I had the cluster upgraded to 14.2.4, and after that problems started under heavy load.
I don't have any other clust...
- 04:36 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- There were two changes in rocksdb default configuration.
First was version upgrade from 5.17.2 to 6.1.20 (https://g...
- 01:30 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Krzysztof and I are working on the same clusters
- 01:18 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Krzysztof - are you referring to the same clusters as Rafal above?
- 01:01 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- We have 2 clusters that were deployed with 14.2.3 and have failed OSDs during stress testing. We have 1 more cluster ...
- 12:42 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- General questions to everybody experiencing the issue.
1) Have you run Nautilus before v14.2.3? Have you ever seen ...
- 12:30 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Rafal, thanks for your information.
I don't think switching to stupid allocator will help. IMO this problem isn't re...
- 07:57 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Igor,
We are facing this same issue. Clean install 14.2.3.
Right now we have over 10 OSDs down....
- 12:39 PM Backport #41810 (Need More Info): nautilus: osd_memory_target isn't applied in runtime.
- non-trivial backport assigned to developer
- 12:10 AM Bug #42605: KernelDevice.cc: 688: FAILED assert(off % block_size == 0)
- 黄 维 wrote:
> Igor Fedotov wrote:
> > Looks like bluefs replay tries to read an out-of-bound extent (#3 while just 2...
- 12:08 AM Bug #42605: KernelDevice.cc: 688: FAILED assert(off % block_size == 0)
- Igor Fedotov wrote:
> Looks like bluefs replay tries to read an out-of-bound extent (#3 while just 2 are present for...
11/04/2019
- 09:51 PM Documentation #39522 (Resolved): fix and improve doc regarding manual bluestore cache settings.
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 05:56 PM Backport #41290 (Resolved): nautilus: fix and improve doc regarding manual bluestore cache settings.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31259
m...
- 05:46 PM Bug #39569 (Duplicate): os/bluestore: fix for FreeBSD iocb structure #27458
- 04:41 PM Backport #41460 (In Progress): nautilus: incorrect RW_IO_MAX
- 01:56 PM Bug #42605: KernelDevice.cc: 688: FAILED assert(off % block_size == 0)
- Looks like bluefs replay tries to read an out-of-bound extent (#3 while just 2 are present for log file (aka ino 1) i...
- 01:53 AM Bug #42605 (Closed): KernelDevice.cc: 688: FAILED assert(off % block_size == 0)
- OSD start failed after server power down
ceph version: v12.2.9
stack info:
/clove/vm/zstor/ceph/rpmbuild/BUILD...
11/02/2019
- 04:57 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I ran bluefs-export twice on the same OSD and the data MD5 hashes are identical for both copies.
- 04:31 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Same here. Although, I have another OSD with same issues.
- 03:23 PM Backport #41289: luminous: fix and improve doc regarding manual bluestore cache settings.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31257
m...
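The double-export check described in the 04:57 PM comment above (export bluefs twice, then compare MD5 hashes) can be scripted. A minimal sketch that only does the comparison step; the export itself, and the directory contents, are assumed to already exist:

```python
import hashlib
import os

def md5_of_tree(root):
    """Map relative file path -> MD5 hex digest for every file under root."""
    digests = {}
    for dirpath, _, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digests[os.path.relpath(path, root)] = hashlib.md5(f.read()).hexdigest()
    return digests

def exports_match(dir_a, dir_b):
    """True if both export directories contain byte-identical files."""
    return md5_of_tree(dir_a) == md5_of_tree(dir_b)
```

Running this against two successive bluefs-export output directories reproduces the "identical MD5 for both copies" observation.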
10/31/2019
- 02:36 PM Bug #39529 (Need More Info): bluestore transaction apparently lost on osd restart
- 02:24 PM Bug #40938: Some osd processes restart automatically after adding osd
- Have you observed this on a more recent release (e.g. 12.2.12)?
- 02:24 PM Bug #40938 (Need More Info): Some osd processes restart automatically after adding osd
- Do you have a coredump from this crash?
- 02:24 PM Bug #42166: crash when LRU trimming
- I'm afraid not.
- 02:19 PM Bug #42166 (Need More Info): crash when LRU trimming
- Jeff do you happen to still have a coredump from this?
- 02:14 PM Bug #42345 (Closed): OSD: When object compression ratio is high(but less than “bluestore_compress...
- Sounds like there wasn't a bug here, just some confusion about config meaning.
- 01:45 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- 1) for all OSDs (3) on the affected Servers (2) "bluestore_reads_with_retries": 0
2) no occurrence for the pattern on...
- 11:46 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, Marcin, Dmitry - could you please do the following:
1) For all OSDs (or at least ones that co-locate the fai...
- 11:25 AM Backport #41288 (Resolved): mimic: fix and improve doc regarding manual bluestore cache settings.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31258
m...
- 10:58 AM Backport #41289 (Resolved): luminous: fix and improve doc regarding manual bluestore cache settings.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/31257
m...
10/30/2019
- 03:33 PM Backport #41290 (In Progress): nautilus: fix and improve doc regarding manual bluestore cache set...
- Updated automatically by ceph-backport.sh version 15.0.0.6270
- 03:32 PM Backport #41288 (In Progress): mimic: fix and improve doc regarding manual bluestore cache settings.
- Updated automatically by ceph-backport.sh version 15.0.0.6270
- 03:29 PM Backport #41289 (In Progress): luminous: fix and improve doc regarding manual bluestore cache set...
- Updated automatically by ceph-backport.sh version 15.0.0.6270
- 03:16 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Both of my corruptions happened under heavy IOPS load, on older osds created in previous versions of ceph and migrated t...
- 02:41 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias,
Perhaps the below helps to avoid corruption:
I also had https://tracker.ceph.com/issues/20381 in logs:
...
- 01:09 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Thanks!
The same as in Marcin's sst
- 12:42 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Igor Fedotov wrote:
> Tobias, would you be able to export bluefs for osd.1 or .4 and inspect or share sst file that ...
- 12:26 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- the proper command is:
ceph-bluestore-tool --path <osd-path> --out-dir <destination dir> --command bluefs-export
...
- 12:11 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Igor Fedotov wrote:
> Tobias, would you be able to export bluefs for osd.1 or .4 and inspect or share sst file that ...
- 11:40 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- "So I checked what's present at offsets: 5687664 and 5687664+3875 - a bunch of zeroes at both locations. Generally th...
- 11:20 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I took only one snapshot when it happened and recreated the volume. Sorry :(
I think it's important to mention few...
- 10:59 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, would you be able to export bluefs for osd.1 or .4 and inspect or share sst file that showed the checksum mis...
- 10:58 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Marcin, would you be able to export bluefs for osd.101 once again and do binary comparison for both previous and new ...
- 10:52 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- hmm... so some observations for now:
1) ALL block checksum mismatch errors I can see have the same 'got' checksum ...
- 10:02 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Igor Fedotov wrote:
> Marcin W wrote:
> > In my case, all OSDs were created just now with 14.2.4
>
> So you're s...
- 09:54 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- bug only affects OSDs redeployed while running Luminous...
we have a cluster that was built with luminous and bluest...
- 09:48 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Marcin W wrote:
> In my case, all OSDs were created just now with 14.2.4
So you're saying you have failing OSDs t...
- 09:45 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- In my case, all OSDs were created just now with 14.2.4
- 09:44 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, and you did ALL OSD filestore->bluestore migration while running Luminous, didn't you?
> I can confirm th...
- 09:43 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I've had to split 003808.sst with rar due to the 100KB upload limit
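An alternative to rar for an attachment size cap like the one mentioned above: split the file into fixed-size chunks and reassemble on the other side. A minimal sketch assuming a 100 KB limit; the file names are hypothetical:

```python
import os

CHUNK = 100 * 1024  # assumed 100 KB upload limit

def split_file(path, chunk=CHUNK):
    """Write path.000, path.001, ... each at most `chunk` bytes; return the part names."""
    parts = []
    with open(path, "rb") as src:
        for i, data in enumerate(iter(lambda: src.read(chunk), b"")):
            part = f"{path}.{i:03d}"
            with open(part, "wb") as dst:
                dst.write(data)
            parts.append(part)
    return parts

def join_files(parts, out_path):
    """Concatenate the parts back into a single file."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                dst.write(src.read())
```

Joining the parts in order reproduces the original file byte for byte.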
- 09:37 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I can confirm that the Bug only affects OSDs on Clusters created prior to Luminous
- 09:31 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
We started our Clusters with different Versions - but all prior Mimic and for Clusters prior to Luminous with Filesto...
- 09:31 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- 09:31 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- It's MiB. DB partition size aims at 5% of block device.
osd.403 where it was coming from has been already demolish...
- 09:15 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, Marcin, with what Ceph version were your clusters (broken OSDs specifically) created?
Have you observed the ... - 08:52 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Marcin, thanks for info.
Some questions, please
>DB partition size is 437888MB
do you mean KB not MB ?
You...
10/29/2019
- 02:41 PM Bug #38745: spillover that doesn't make sense
- Due to spillover, I'm trying to optimize RocksDB options based on the data partition size (roughly 9TB in the example below). ...
- 01:46 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
DB partition size is 437888MB and according to my calculations RocksDB size including all 5 levels[256, 1536, 9216, 5...
- 01:25 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- 01:25 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I have the same issue with 2 OSDs so far. The initial error was:...
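The RocksDB level sizing from the 01:46 PM comment above can be sanity-checked in a few lines. A sketch assuming max_bytes_for_level_base = 256 MB and a level size multiplier of 6, which is what the quoted series [256, 1536, 9216, ...] implies; these parameters are inferred, not read from any cluster:

```python
# Per-level max sizes of the LSM tree, assuming base = 256 MB and
# multiplier = 6 (inferred from the series quoted in the comment above).
def rocksdb_level_sizes_mb(base_mb=256, multiplier=6, num_levels=5):
    return [base_mb * multiplier ** i for i in range(num_levels)]

levels = rocksdb_level_sizes_mb()
total_mb = sum(levels)             # sums to 398080 MB across 5 levels
db_partition_mb = 437888           # partition size from the comment above
fits = total_mb <= db_partition_mb
```

Under these assumptions the five levels total 398080 MB, which still fits within the 437888 MB DB partition quoted above.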
10/28/2019
- 03:26 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- ceph-kvstore-tool crashes on both OSDs - see files. I also attached the /var/lib/ceph/crash folder of both servers co...
- 03:13 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Well, nothing interesting in dmesgs...
Could you please run:
ceph-kvstore-tool bluestore-kv <path_to_osd> list "b...
- 03:05 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Could not find any other assertions in old logs - but we don't keep so many old logs. I attached all logs I have for ...
- 02:52 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- here is the dmesg output of both servers containing the broken OSDs. On a quick look I could not find anything. And hones...
- 02:45 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Also could you please check earlier logs of the broken OSDs for other assertions, e.g. one I shared in
comment #16 ...
- 02:41 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, could you please check for H/W errors using dmesg for nodes with broken OSDs?
- 02:39 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- And preliminary intermediate analysis is as follows (from last to first):
1) All available OSD logs have the follo...
- 02:39 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Thanks a lot!
- 02:27 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- here you go
- 02:16 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, great! Thanks a lot.
May I have the logs for these two OSDs where they first hit the issue?
- 02:06 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Igor, I have two failed OSDs. I will preserve them until the end of the week in case you need them.
- 01:52 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Sage, I assume you meant to assign this to Igor.
- 01:29 PM Bug #42223 (In Progress): ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED cep...
- 01:26 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- BTW I'm seeing 4 different clusters hitting this assert on 14.2.4 (via telemetry).
- 03:18 PM Bug #42495 (Resolved): ENOENT on just-created object
10/27/2019
- 02:58 PM Bug #42495: ENOENT on just-created object
- This seems to be easy to reproduce:
/a/sage-2019-10-26_16:12:32-rados-wip-sage2-testing-2019-10-25-1224-distro-bas...
- 02:20 PM Bug #42490 (Resolved): mimic->master: fsck error: #2:717a0223:::608.00000000:head# has omap that ...
10/26/2019
- 10:28 PM Bug #42495: ENOENT on just-created object
- I'm getting a bit different failure report when running ceph_test_objectstore against upstream/master:
[==========...
- 12:10 AM Bug #42495: ENOENT on just-created object
- This other failed test in the same run was also a bad ENOENT (seen by ceph-objectstore-tool fuse), but no logs to go ...
- 12:08 AM Bug #42495 (Resolved): ENOENT on just-created object
- ...
- 04:18 AM Bug #42345: OSD: When object compression ratio is high(but less than “bluestore_compression_requi...
- see also https://github.com/ceph/ceph/pull/31047
10/25/2019
- 04:03 PM Bug #42490 (Fix Under Review): mimic->master: fsck error: #2:717a0223:::608.00000000:head# has om...
- 02:41 PM Bug #42490: mimic->master: fsck error: #2:717a0223:::608.00000000:head# has omap that is not per-...
I think this happens after "fast" repair which fixes(sets) per_pool_omap flag in db but bypasses all the omaps (for t...
- 02:09 PM Bug #42490 (Resolved): mimic->master: fsck error: #2:717a0223:::608.00000000:head# has omap that ...
- 02:09 PM Bug #42490 (Resolved): mimic->master: fsck error: #2:717a0223:::608.00000000:head# has omap that ...
- ...
- 02:08 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Exactly. ceph-bluestore-tool didn't run successfully; it crashed somehow. Unfortunately I already replaced the OSD. B...
- 01:18 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Tobias, thanks a lot! Looking into this, please preserve OSD for a while if possible.
Could you please clarify the...
- 01:14 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Hi Igor,
new logs: startup of broken OSD & ceph-bluestore-tool repair of broken OSD. ceph-bluestore-tool did run s...
- 09:37 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- To collect a log for ceph-bluestore-tool one can use something like this:
CEPH_ARGS="--log-file=output.log --debug-blu...
- 12:38 AM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- How can I enable debug bluestore for this OSD?
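The CEPH_ARGS pattern quoted in the 09:37 AM reply above answers this question; wrapping it in a small helper makes it reusable. A sketch that only builds the command string; the helper name, OSD path, and debug level default are assumptions, not project API:

```python
import shlex

def bluefs_export_cmd(osd_path, out_dir,
                      log_file="output.log", debug_bluefs=10):
    """Build a ceph-bluestore-tool bluefs-export invocation with debug
    logging routed to a file via CEPH_ARGS, per the pattern quoted above."""
    env = "CEPH_ARGS=" + shlex.quote(
        f"--log-file={log_file} --debug-bluefs {debug_bluefs}")
    cmd = ["ceph-bluestore-tool", "--path", osd_path,
           "--out-dir", out_dir, "--command", "bluefs-export"]
    return env + " " + " ".join(shlex.quote(c) for c in cmd)
```

For example, `bluefs_export_cmd("/var/lib/ceph/osd/ceph-0", "/tmp/osd0-bluefs")` yields a shell line matching the one shown in the comments above.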
10/24/2019
- 09:26 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- Dmitry - could you please provide more details as per my comment #7?
- 05:40 PM Bug #42223: ceph-14.2.4/src/os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(avail...
- I've also got this in 14.2.4
10/23/2019
- 11:26 AM Backport #40757 (Resolved): nautilus: stupid allocator might return extents with length = 0
- 11:10 AM Backport #40757 (Need More Info): nautilus: stupid allocator might return extents with length = 0
- 11:09 AM Backport #40757 (In Progress): nautilus: stupid allocator might return extents with length = 0
10/22/2019
- 03:00 PM Backport #42428 (Rejected): mimic: High amount of Read I/O on BlueFS/DB when listing omap keys
- no mimic backport at this time
- 10:36 AM Backport #42428 (Rejected): mimic: High amount of Read I/O on BlueFS/DB when listing omap keys
- 03:00 PM Bug #36482 (Resolved): High amount of Read I/O on BlueFS/DB when listing omap keys
- After discussing in the rados team standup, we've decided not to backport to mimic/luminous at this time. Nautilus f...
- 10:08 AM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- Upon discussion with @Igor, the backports of this issue will require
* https://github.com/ceph/ceph/pull/27627
* ...
- 12:37 PM Bug #42345: OSD: When object compression ratio is high(but less than “bluestore_compression_requi...
- also please note that compression makes sense for writes that are at least 2x as large as bluestore_min_alloc_s...
- 12:26 PM Bug #42345 (Need More Info): OSD: When object compression ratio is high(but less than “bluestore_...
- would you please set debug bluestore to 20, repeat the test and share the log? Thanks!
10/21/2019
- 07:07 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- There's several clusters on luminous that can't be upgraded just yet, but will upgrade what we can. I'm just trying t...
- 06:27 PM Bug #36482 (Pending Backport): High amount of Read I/O on BlueFS/DB when listing omap keys
- 06:26 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
- Gerben Meijer wrote:
> How long would this need to "bake"? Running into this frequently (several times per day). Is ... - 12:18 PM Bug #36482: High amount of Read I/O on BlueFS/DB when listing omap keys
How long would this need to "bake"? Running into this frequently (several times per day). Is this going to make it in...
- 01:08 PM Backport #42041 (In Progress): nautilus: bluestore objectstore_blackhole=true violates read-after-write
- 01:08 PM Backport #42041 (In Progress): nautilus: bluestore objectstore_blackhole=true violates read-after...
- 09:54 AM Backport #42041: nautilus: bluestore objectstore_blackhole=true violates read-after-write
- Sage writes in parent issue:
note that for backport, we only want one commit, 6c2a8e472dc71b962d7de008e30631f125b1...
10/17/2019
- 08:14 AM Bug #38559 (Resolved): 50-100% iops lost due to bluefs_preextend_wal_files = false
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:12 AM Bug #40769 (Resolved): Set concurrent max_background_compactions in rocksdb to 2
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:08 AM Backport #41710 (Resolved): mimic: Set concurrent max_background_compactions in rocksdb to 2
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30150
m...
- 07:05 AM Bug #42345: OSD: When object compression ratio is high(but less than “bluestore_compression_requi...
- Fengzhe Han wrote:
> Version: nautilus
>
> I set value “.98” to “bluestore_compression_required_ratio”. Then put ...
- 07:03 AM Bug #42345 (Closed): OSD: When object compression ratio is high(but less than “bluestore_compress...
- Version: nautilus
I set value “.98” to “bluestore_compression_required_ratio”. Then put some objects into the clus... - 06:21 AM Backport #41510 (Resolved): luminous: 50-100% iops lost due to bluefs_preextend_wal_files = false
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/29564
m...
10/16/2019
- 11:22 PM Backport #41710: mimic: Set concurrent max_background_compactions in rocksdb to 2
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30150
merged
- 11:07 PM Backport #41510: luminous: 50-100% iops lost due to bluefs_preextend_wal_files = false
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/29564
merged
10/15/2019
- 07:55 PM Backport #41339 (Resolved): mimic: os/bluestore/BlueFS: use 64K alloc_size on the shared device
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30219
m...
- 07:46 PM Backport #41339: mimic: os/bluestore/BlueFS: use 64K alloc_size on the shared device
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/30219
merged
- 08:38 AM Bug #42297 (Rejected): ceph-bluestore-tool repair osd error
- 02:57 AM Bug #42297: ceph-bluestore-tool repair osd error
- Igor Fedotov wrote:
> I think that's not a bug.
> First time the tool showed some errors in DB due to legacy stats...