Bug #61821
RGW crashes on s3select query with wrong syntax
Description
ceph 16.2.13
Config of this RGW instance:
[client]
rgw enable apis = s3, admin
rgw dynamic resharding = false
rgw enable gc threads = false
rgw enable lc threads = false
rgw remote addr param = HTTP_X_FORWARDED_FOR
rgw thread pool size = 2048
[client.rgw.a]
admin socket = /var/run/ceph/ceph-client.rgw.a.asok
host = xxxx
keyring = /var/lib/ceph/radosgw/rgw.a.keyring
rgw enable static website = true
rgw frontends = civetweb num_threads=2048 port=0.0.0.0:7480 enable_keep_alive=yes
CSV file in the S3 bucket:
name1,20,1996,address no 134
name2,10,1913,address no 1434
name3,15,1945,address no 133
name4,14,1953,address no 14343
name5,13,1964,address no 1556
name6654,12,1973,address no 1657
name77,23,1991,address no 13434
query:
aws s3api {connection-config} select-object-content --bucket select --expression-type 'SQL' \
--input-serialization '{"CSV": {}, "CompressionType": "NONE"}' \
--output-serialization '{"CSV": {}}' \
--key addresses.csv \
--expression "select count(*) from s3object;" output.json
"stdin" can be used instead of "s3object", but the result is the same.
The problem does not occur on every request. Sometimes it takes about 10-20 consecutive requests to trigger it.
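A loop along these lines could automate the repeated requests. It is only a sketch: the helper names `repeat_until_failure` and `run_select` and the `RGW_ENDPOINT` variable are hypothetical, and `--endpoint-url` stands in for the `{connection-config}` placeholder used in the query above.

```shell
# Run a command up to N times; stop and report at the first failure.
repeat_until_failure() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if ! "$@"; then
            echo "failed on attempt $attempt"
            return 1
        fi
        attempt=$((attempt + 1))
    done
    echo "no failure in $max_attempts attempts"
    return 0
}

# Hypothetical wrapper around the query from this report.
# RGW_ENDPOINT is an assumed variable; adjust to your own connection config.
run_select() {
    aws s3api --endpoint-url "${RGW_ENDPOINT:-http://localhost:7480}" \
        select-object-content --bucket select --expression-type 'SQL' \
        --input-serialization '{"CSV": {}, "CompressionType": "NONE"}' \
        --output-serialization '{"CSV": {}}' \
        --key addresses.csv \
        --expression "select count(*) from s3object;" output.json
}
```

Calling `repeat_until_failure 20 run_select` would then issue up to 20 consecutive requests and stop at the first one that the client reports as failed.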
One of two problems then occurs.
First problem: the RGW admin socket stops working:
ceph daemon client.rgw.a config show
Couldn't parse JSON ERROR: (22) Invalid argument
invalid json
: Expecting value: line 1 column 1 (char 0)
admin_socket: Expecting value: line 1 column 1 (char 0)
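The "Expecting value: line 1 column 1 (char 0)" message suggests the socket returned an empty reply. A rough way to detect this state from a script might look like the following; the function name `asok_responds` is hypothetical, and the check is only a heuristic (it tests that the reply starts with `{`, not that it is valid JSON).

```shell
# Heuristic check: succeed only if the given command exits cleanly and prints
# something starting with "{", as a JSON object reply from the admin socket would.
asok_responds() {
    reply=$("$@" 2>/dev/null) || return 1
    case "$reply" in
        "{"*) return 0 ;;
        *)    return 1 ;;
    esac
}
```

For example, `asok_responds ceph daemon client.rgw.a config show || echo "admin socket not responding"` could be run after each batch of select requests.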
Second possible problem: RGW crashes with the following trace:
-1> 2023-06-27T10:15:40.699+0300 7fdaf5dba700 2 req 154783551899988622 0.007000133s s3:get_obj executing
0> 2023-06-27T10:15:40.713+0300 7fdaf5dba700 -1 *** Caught signal (Aborted) **
in thread 7fdaf5dba700 thread_name:civetweb-worker
ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb219a7b20]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7fdb2099c09b]
5: /lib64/libstdc++.so.6(+0x9653c) [0x7fdb209a253c]
6: /lib64/libstdc++.so.6(+0x96597) [0x7fdb209a2597]
7: /lib64/libstdc++.so.6(+0x967f8) [0x7fdb209a27f8]
8: /lib64/libtcmalloc.so.4(+0x19fa4) [0x7fdb2af9ffa4]
9: (tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7fdb2afc1c96]
10: (std::vector<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*, std::allocator<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*> >::_M_default_append(unsigned long)+0xa2) [0x7fdb2c9b8262]
11: (s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >& boost::spirit::classic::impl::get_definition<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t>, boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >(boost::spirit::classic::grammar<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t> > const*)+0x2dc) [0x7fdb2c9b932c]
12: (s3selectEngine::s3select::parse_query(char const*)+0xcd) [0x7fdb2c9b94dd]
13: (RGWSelectObj_ObjStore_S3::run_s3select(char const*, char const*, unsigned long)+0x62f) [0x7fdb2c98d0bf]
14: (RGWSelectObj_ObjStore_S3::send_response_data(ceph::buffer::v15_2_0::list&, long, long)+0x9c) [0x7fdb2c98d96c]
15: (RGWRados::get_obj_iterate_cb(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)+0x13c) [0x7fdb2c8ee75c]
16: /lib64/libradosgw.so.2(+0x891cef) [0x7fdb2c8eecef]
17: (RGWRados::iterate_obj(DoutPrefixProvider const*, RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*, optional_yield)+0x37f) [0x7fdb2c900d6f]
18: (RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)+0x2c1) [0x7fdb2c9056e1]
19: (RGWGetObj::execute(optional_yield)+0x76a) [0x7fdb2c89a66a]
20: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0xb3d) [0x7fdb2c513abd]
21: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)+0x2795) [0x7fdb2c517865]
22: (RGWCivetWebFrontend::process(mg_connection*)+0x2df) [0x7fdb2c479d7f]
23: /lib64/libradosgw.so.2(+0x58d6d6) [0x7fdb2c5ea6d6]
24: /lib64/libradosgw.so.2(+0x58f347) [0x7fdb2c5ec347]
25: /lib64/libradosgw.so.2(+0x58f808) [0x7fdb2c5ec808]
26: /lib64/libpthread.so.0(+0x814a) [0x7fdb2199d14a]
27: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
140578370840320 / civetweb-worker
140578379233024 / civetweb-worker
140578387625728 / civetweb-worker
140578396018432 / civetweb-worker
140578404411136 / civetweb-worker
140578437981952 / rgw_user_st_syn
140578521908992 / safe_timer
140578538694400 / ms_dispatch
140578563872512 / io_context_pool
140578572265216 / rgw_dt_lg_renew
140578857617152 / safe_timer
140578874402560 / ms_dispatch
140578882795264 / ceph_timer
140578899580672 / io_context_pool
140578960455424 / admin_socket
140578968848128 / service
140578977240832 / msgr-worker-2
140578985633536 / msgr-worker-1
140578994026240 / msgr-worker-0
140579002418944 / safe_timer
140579332601216 / radosgw
max_recent 10000
max_new 10000
log_file /var/log/ceph/ceph-client.rgw.a.log
--- end dump of recent events ---
I also have another question. Ceph has config parameters for disabling some subsets of the APIs, such as notifications, websites, STS, etc. Why is there no config option to disable the select API? I think this API is rather "problem-rich", and it would be great to have the ability to disable it. For now, we will try to patch the source code to disable it in our build.
Updated by Casey Bodley 10 months ago
- Assignee set to Gal Salomon
- Tags set to s3select
Updated by Gal Salomon 10 months ago
I built 16.2.13.
It's quite an old version.
The statement "select count(*) from s3object;" is the wrong syntax for that version;
stdin must be used (instead of s3object).
I ran that SQL statement many times (>100), with no crash.
According to the crash stack, it happens in the parsing stage.