Bug #61821

open

RGW crashes on s3select query with wrong syntax

Added by Aleksandr Rudenko 11 months ago. Updated 11 months ago.

Status: New
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Community (dev)
Tags: s3select
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph 16.2.13

Config of this RGW instance:

[client]
    rgw enable apis = s3, admin
    rgw dynamic resharding = false
    rgw enable gc threads = false
    rgw enable lc threads = false
    rgw remote addr param = HTTP_X_FORWARDED_FOR
    rgw thread pool size = 2048

[client.rgw.a]
    admin socket = /var/run/ceph/ceph-client.rgw.a.asok
    host = xxxx
    keyring = /var/lib/ceph/radosgw/rgw.a.keyring
    rgw enable static website = true
    rgw frontends = civetweb num_threads=2048 port=0.0.0.0:7480 enable_keep_alive=yes

CSV file in the S3 bucket (addresses.csv):

name1,20,1996,address no 134
name2,10,1913,address no 1434
name3,15,1945,address no 133
name4,14,1953,address no 14343
name5,13,1964,address no 1556
name6654,12,1973,address no 1657
name77,23,1991,address no 13434

query:

aws s3api {connection-config} select-object-content --bucket select --expression-type 'SQL' \
--input-serialization '{"CSV": {}, "CompressionType": "NONE"}' \
--output-serialization '{"CSV": {}}' \
--key addresses.csv \
--expression "select count(*) from s3object;" output.json

"stdin" can be used instead of "s3object", with the same result.

The problem does not occur on every request. Sometimes it takes about 10-20 consecutive requests to trigger it.
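Since the failure only shows up after several consecutive requests, a small driver script can make reproduction less tedious. The following is a minimal sketch, not part of the original report: the function names `build_select_command` and `repeat_query` are hypothetical, and the connection arguments (the `{connection-config}` placeholder above, for example `--endpoint-url`) are assumed to be passed on the command line.

```python
import subprocess
import sys


def build_select_command(connection_args, bucket="select", key="addresses.csv",
                         expression="select count(*) from s3object;",
                         outfile="output.json"):
    """Return the argv list for the aws CLI call shown in the report."""
    return [
        "aws", "s3api", *connection_args,
        "select-object-content",
        "--bucket", bucket,
        "--expression-type", "SQL",
        "--input-serialization", '{"CSV": {}, "CompressionType": "NONE"}',
        "--output-serialization", '{"CSV": {}}',
        "--key", key,
        "--expression", expression,
        outfile,
    ]


def repeat_query(attempts, run=subprocess.run, connection_args=()):
    """Run the query up to `attempts` times.

    Returns the 1-based index of the first attempt whose exit code is
    non-zero, or None if every attempt succeeded.
    """
    cmd = build_select_command(list(connection_args))
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return attempt
    return None


if __name__ == "__main__":
    # Everything after the script name is treated as connection arguments
    # (an assumption; adjust to your deployment).
    failed_at = repeat_query(20, connection_args=sys.argv[1:])
    if failed_at is None:
        print("all 20 requests returned exit code 0")
    else:
        print(f"request {failed_at} failed")
```

Note that a non-zero exit code from the client only catches the case where RGW stops answering. The admin socket symptom described below may appear while requests still succeed, so it has to be checked separately.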

One of two problems then occurs.

First possible problem:
the admin socket of RGW stops working:

ceph daemon client.rgw.a config show
Couldn't parse JSON ERROR: (22) Invalid argument
invalid json
: Expecting value: line 1 column 1 (char 0)
admin_socket: Expecting value: line 1 column 1 (char 0)
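The "Expecting value: line 1 column 1 (char 0)" message is what a JSON parser reports for empty input, which suggests the admin socket returned nothing. A minimal sketch of an automated check for this state follows. The helper names `admin_socket_healthy` and `check_admin_socket` are hypothetical, and the check assumes that a working `config show` prints a JSON object.

```python
import json
import subprocess


def admin_socket_healthy(output):
    """True if `output` parses as a JSON object, False otherwise."""
    try:
        return isinstance(json.loads(output), dict)
    except (json.JSONDecodeError, TypeError):
        return False


def check_admin_socket(daemon="client.rgw.a", run=subprocess.run):
    """Run `ceph daemon <daemon> config show` and validate its output."""
    result = run(["ceph", "daemon", daemon, "config", "show"],
                 capture_output=True, text=True)
    return result.returncode == 0 and admin_socket_healthy(result.stdout)


if __name__ == "__main__":
    print("admin socket OK" if check_admin_socket() else "admin socket broken")
```

Running this check after each select request would show at which request the admin socket stops responding.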

Second possible problem: RGW crashes with the following trace:

    -1> 2023-06-27T10:15:40.699+0300 7fdaf5dba700  2 req 154783551899988622 0.007000133s s3:get_obj executing
     0> 2023-06-27T10:15:40.713+0300 7fdaf5dba700 -1 *** Caught signal (Aborted) **
 in thread 7fdaf5dba700 thread_name:civetweb-worker

 ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb219a7b20]
 2: gsignal()
 3: abort()
 4: /lib64/libstdc++.so.6(+0x9009b) [0x7fdb2099c09b]
 5: /lib64/libstdc++.so.6(+0x9653c) [0x7fdb209a253c]
 6: /lib64/libstdc++.so.6(+0x96597) [0x7fdb209a2597]
 7: /lib64/libstdc++.so.6(+0x967f8) [0x7fdb209a27f8]
 8: /lib64/libtcmalloc.so.4(+0x19fa4) [0x7fdb2af9ffa4]
 9: (tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7fdb2afc1c96]
 10: (std::vector<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*, std::allocator<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*> >::_M_default_append(unsigned long)+0xa2) [0x7fdb2c9b8262]
 11: (s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >& boost::spirit::classic::impl::get_definition<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t>, boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >(boost::spirit::classic::grammar<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t> > const*)+0x2dc) [0x7fdb2c9b932c]
 12: (s3selectEngine::s3select::parse_query(char const*)+0xcd) [0x7fdb2c9b94dd]
 13: (RGWSelectObj_ObjStore_S3::run_s3select(char const*, char const*, unsigned long)+0x62f) [0x7fdb2c98d0bf]
 14: (RGWSelectObj_ObjStore_S3::send_response_data(ceph::buffer::v15_2_0::list&, long, long)+0x9c) [0x7fdb2c98d96c]
 15: (RGWRados::get_obj_iterate_cb(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)+0x13c) [0x7fdb2c8ee75c]
 16: /lib64/libradosgw.so.2(+0x891cef) [0x7fdb2c8eecef]
 17: (RGWRados::iterate_obj(DoutPrefixProvider const*, RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*, optional_yield)+0x37f) [0x7fdb2c900d6f]
 18: (RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)+0x2c1) [0x7fdb2c9056e1]
 19: (RGWGetObj::execute(optional_yield)+0x76a) [0x7fdb2c89a66a]
 20: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0xb3d) [0x7fdb2c513abd]
 21: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)+0x2795) [0x7fdb2c517865]
 22: (RGWCivetWebFrontend::process(mg_connection*)+0x2df) [0x7fdb2c479d7f]
 23: /lib64/libradosgw.so.2(+0x58d6d6) [0x7fdb2c5ea6d6]
 24: /lib64/libradosgw.so.2(+0x58f347) [0x7fdb2c5ec347]
 25: /lib64/libradosgw.so.2(+0x58f808) [0x7fdb2c5ec808]
 26: /lib64/libpthread.so.0(+0x814a) [0x7fdb2199d14a]
 27: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 rbd_pwl
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 immutable_obj_cache
   0/ 5 client
   1/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 0 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 1 reserver
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 rgw_sync
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 fuse
   2/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
   1/ 5 prioritycache
   0/ 5 test
   0/ 5 cephfs_mirror
   0/ 5 cephsqlite
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
  140578370840320 / civetweb-worker
  140578379233024 / civetweb-worker
  140578387625728 / civetweb-worker
  140578396018432 / civetweb-worker
  140578404411136 / civetweb-worker
  140578437981952 / rgw_user_st_syn
  140578521908992 / safe_timer
  140578538694400 / ms_dispatch
  140578563872512 / io_context_pool
  140578572265216 / rgw_dt_lg_renew
  140578857617152 / safe_timer
  140578874402560 / ms_dispatch
  140578882795264 / ceph_timer
  140578899580672 / io_context_pool
  140578960455424 / admin_socket
  140578968848128 / service
  140578977240832 / msgr-worker-2
  140578985633536 / msgr-worker-1
  140578994026240 / msgr-worker-0
  140579002418944 / safe_timer
  140579332601216 / radosgw
  max_recent     10000
  max_new        10000
  log_file /var/log/ceph/ceph-client.rgw.a.log
--- end dump of recent events ---

I also have another question. Ceph has config parameters for disabling some subsets of the APIs, such as notifications, websites, STS, etc. Why is there no config option to disable the select API? This API seems quite "problem-rich", and it would be great to have the ability to disable it. For now, we will try to patch the source code to disable it in our build.
