Bug #61821
RGW crashes on s3select query with wrong syntax
% Done: 0%
Source: Community (dev)
Tags: s3select
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
ceph 16.2.13
Config of this RGW instance:
[client]
rgw enable apis = s3, admin
rgw dynamic resharding = false
rgw enable gc threads = false
rgw enable lc threads = false
rgw remote addr param = HTTP_X_FORWARDED_FOR
rgw thread pool size = 2048
[client.rgw.a]
admin socket = /var/run/ceph/ceph-client.rgw.a.asok
host = xxxx
keyring = /var/lib/ceph/radosgw/rgw.a.keyring
rgw enable static website = true
rgw frontends = civetweb num_threads=2048 port=0.0.0.0:7480 enable_keep_alive=yes
CSV file in the S3 bucket:
name1,20,1996,address no 134
name2,10,1913,address no 1434
name3,15,1945,address no 133
name4,14,1953,address no 14343
name5,13,1964,address no 1556
name6654,12,1973,address no 1657
name77,23,1991,address no 13434
query:
aws s3api {connection-config} select-object-content --bucket select --expression-type 'SQL' \
--input-serialization '{"CSV": {}, "CompressionType": "NONE"}' \
--output-serialization '{"CSV": {}}' \
--key addresses.csv \
--expression "select count(*) from s3object;" output.json
"stdin" can be used instead of "s3object", but the result is the same.
The problem does not occur on every request. Sometimes it takes about 10-20 consecutive requests to trigger it.
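Because the failure only shows up after several consecutive requests, a small retry loop can help with reproduction. The following is a minimal sketch, not part of the original report: `repeat_until_fail` is a hypothetical helper name, and it assumes the command being repeated exits with a non-zero status when a request fails.

```shell
# Hypothetical helper (not from the report): run a command up to N times
# and stop at the first attempt that exits with a non-zero status.
repeat_until_fail() {
  max="$1"
  shift
  i=1
  while [ "$i" -le "$max" ]; do
    if ! "$@"; then
      echo "attempt $i of $max failed" >&2
      return 1
    fi
    i=$((i + 1))
  done
  echo "all $max attempts succeeded"
}

# Example usage, with {connection-config} left as the report's placeholder
# and the remaining arguments taken from the query above:
#   repeat_until_fail 20 aws s3api {connection-config} select-object-content <arguments from the query above>
```

A loop like this only notices failures that surface as a failed request. The admin socket problem would still need to be checked separately with `ceph daemon client.rgw.a config show`.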
One of two problems then occurs.
First problem: the RGW admin socket stops working:
ceph daemon client.rgw.a config show
Couldn't parse JSON ERROR: (22) Invalid argument
invalid json
: Expecting value: line 1 column 1 (char 0)
admin_socket: Expecting value: line 1 column 1 (char 0)
Second possible problem: RGW crashes with the following trace:
-1> 2023-06-27T10:15:40.699+0300 7fdaf5dba700 2 req 154783551899988622 0.007000133s s3:get_obj executing
0> 2023-06-27T10:15:40.713+0300 7fdaf5dba700 -1 *** Caught signal (Aborted) **
in thread 7fdaf5dba700 thread_name:civetweb-worker
ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb219a7b20]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7fdb2099c09b]
5: /lib64/libstdc++.so.6(+0x9653c) [0x7fdb209a253c]
6: /lib64/libstdc++.so.6(+0x96597) [0x7fdb209a2597]
7: /lib64/libstdc++.so.6(+0x967f8) [0x7fdb209a27f8]
8: /lib64/libtcmalloc.so.4(+0x19fa4) [0x7fdb2af9ffa4]
9: (tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7fdb2afc1c96]
10: (std::vector<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*, std::allocator<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*> >::_M_default_append(unsigned long)+0xa2) [0x7fdb2c9b8262]
11: (s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >& boost::spirit::classic::impl::get_definition<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t>, boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >(boost::spirit::classic::grammar<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t> > const*)+0x2dc) [0x7fdb2c9b932c]
12: (s3selectEngine::s3select::parse_query(char const*)+0xcd) [0x7fdb2c9b94dd]
13: (RGWSelectObj_ObjStore_S3::run_s3select(char const*, char const*, unsigned long)+0x62f) [0x7fdb2c98d0bf]
14: (RGWSelectObj_ObjStore_S3::send_response_data(ceph::buffer::v15_2_0::list&, long, long)+0x9c) [0x7fdb2c98d96c]
15: (RGWRados::get_obj_iterate_cb(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)+0x13c) [0x7fdb2c8ee75c]
16: /lib64/libradosgw.so.2(+0x891cef) [0x7fdb2c8eecef]
17: (RGWRados::iterate_obj(DoutPrefixProvider const*, RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*, optional_yield)+0x37f) [0x7fdb2c900d6f]
18: (RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)+0x2c1) [0x7fdb2c9056e1]
19: (RGWGetObj::execute(optional_yield)+0x76a) [0x7fdb2c89a66a]
20: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0xb3d) [0x7fdb2c513abd]
21: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)+0x2795) [0x7fdb2c517865]
22: (RGWCivetWebFrontend::process(mg_connection*)+0x2df) [0x7fdb2c479d7f]
23: /lib64/libradosgw.so.2(+0x58d6d6) [0x7fdb2c5ea6d6]
24: /lib64/libradosgw.so.2(+0x58f347) [0x7fdb2c5ec347]
25: /lib64/libradosgw.so.2(+0x58f808) [0x7fdb2c5ec808]
26: /lib64/libpthread.so.0(+0x814a) [0x7fdb2199d14a]
27: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
140578370840320 / civetweb-worker
140578379233024 / civetweb-worker
140578387625728 / civetweb-worker
140578396018432 / civetweb-worker
140578404411136 / civetweb-worker
140578437981952 / rgw_user_st_syn
140578521908992 / safe_timer
140578538694400 / ms_dispatch
140578563872512 / io_context_pool
140578572265216 / rgw_dt_lg_renew
140578857617152 / safe_timer
140578874402560 / ms_dispatch
140578882795264 / ceph_timer
140578899580672 / io_context_pool
140578960455424 / admin_socket
140578968848128 / service
140578977240832 / msgr-worker-2
140578985633536 / msgr-worker-1
140578994026240 / msgr-worker-0
140579002418944 / safe_timer
140579332601216 / radosgw
max_recent 10000
max_new 10000
log_file /var/log/ceph/ceph-client.rgw.a.log
--- end dump of recent events ---
I also have another question. Ceph has config parameters for disabling some subsets of the APIs, such as notifications, websites, and sts. Why is there no config option to disable the select API? This API seems so "problem-rich" that it would be great to have the ability to disable it. For now, we will try to patch the source code to disable it in our build.