Bug #65828
radosgw process killed with "Out of memory" while executing query "select * from s3object limit 1" on a 12GB parquet file
Description
(copied from https://bugzilla.redhat.com/show_bug.cgi?id=2275323)
Description of problem:
radosgw process killed with "Out of memory" while executing query "select * from s3object limit 1" on a 12GB parquet file
[cephuser@ceph-hmaheswa-reef-x220k9-node6 ~]$ time aws s3api --endpoint-url http://10.0.211.33:80 select-object-content --bucket bkt1 --key file12GBparquet --expression-type 'SQL' --input-serialization '{"Parquet": {}, "CompressionType": "NONE"}' --output-serialization '{"CSV": {}}' --expression "select * from s3object limit 1;" /dev/stdout
("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
real 0m5.769s
user 0m0.477s
sys 0m0.110s
Journalctl logs snippet on rgw node:
Out of memory: Killed process 970456 (radosgw) total-vm:7666032kB, anon-rss:2285168kB, file-rss:0kB, shmem-rss:0kB, UID:167 pgtables:7108kB oom_score_adj:0
ceph-fe41f8f0-8d0d-11ee-aee8-fa163ec880af@rgw.rgw.all.ceph-hmaheswa-reef-x220k9-node5.nkuffe.service: A process of this unit has been killed by the OOM killer.
Actual results:
radosgw process killed with "Out of memory" while trying to query just one row on a low-end cluster.
Expected results:
The query should execute fine on a low-end cluster as well.
Additional info:
The 11.95 GB parquet file was downloaded from:
https://www.kaggle.com/datasets/aaronweymouth/nyc-rideshare-raw-data?select=rideshare_data.parquet
======
From the result we can observe that very little memory is free; other processes (like ceph-osd) are also consuming a significant amount of memory, and I guess the radosgw process, which is requesting even more memory, is the one being killed.
So I used another RGW node IP as the endpoint-url, one where no other Ceph daemon is running. The query then executed fine, with radosgw's memory utilization at 84%. The top output captured while the query was executing can be seen below.
====
--- Additional comment from gal salomon on 2024-04-14 10:58:19 UTC ---
These findings imply that there isn't anything wrong with radosgw's behavior when processing a Parquet object;
it depends on machine sizing and workload.
This specific 12GB parquet file contains only 6 row-groups (for 365M rows!),
so `select *` (extracting all columns) "forces" the reader to load a great amount of data.
====
Updated by Casey Bodley 10 days ago
- Priority changed from Normal to High
- Tags set to s3select