Bug #61821

open

RGW crashes on s3select query with wrong syntax

Added by Aleksandr Rudenko 10 months ago. Updated 10 months ago.

Status: New
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Community (dev)
Tags: s3select
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph 16.2.13

Config of this RGW instance:

[client]
    rgw enable apis = s3, admin
    rgw dynamic resharding = false
    rgw enable gc threads = false
    rgw enable lc threads = false
    rgw remote addr param = HTTP_X_FORWARDED_FOR
    rgw thread pool size = 2048

[client.rgw.a]
    admin socket = /var/run/ceph/ceph-client.rgw.a.asok
    host = xxxx
    keyring = /var/lib/ceph/radosgw/rgw.a.keyring
    rgw enable static website = true
    rgw frontends = civetweb num_threads=2048 port=0.0.0.0:7480 enable_keep_alive=yes

CSV file in the S3 bucket:

name1,20,1996,address no 134
name2,10,1913,address no 1434
name3,15,1945,address no 133
name4,14,1953,address no 14343
name5,13,1964,address no 1556
name6654,12,1973,address no 1657
name77,23,1991,address no 13434

query:

aws s3api {connection-config} select-object-content --bucket select --expression-type 'SQL' \
--input-serialization '{"CSV": {}, "CompressionType": "NONE"}' \
--output-serialization '{"CSV": {}}' \
--key addresses.csv \
--expression "select count(*) from s3object;" output.json

"stdin" can be used instead of "s3object", but the result is the same.

The problem does not occur on every request. Sometimes it takes about 10-20 consecutive requests to trigger it.
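
Since the failure only shows up after several consecutive requests, a small loop can make the reproduction less manual. The following is a minimal sketch, not part of the original report: the `ENDPOINT_URL`, `MAX_ATTEMPTS` and `RUN_REPRO` variables and the function names are assumptions made for illustration, and `--endpoint-url` stands in for the `{connection-config}` placeholder above (port 7480 is taken from the frontend config; the host is a guess).

```shell
#!/bin/sh
# Sketch: repeat the select-object-content call until it fails
# or the attempt budget is spent.
# ENDPOINT_URL, MAX_ATTEMPTS, RUN_REPRO and the function names are
# illustrative assumptions, not part of the original report.

ENDPOINT_URL="${ENDPOINT_URL:-http://127.0.0.1:7480}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-20}"

# One query with the same arguments as in the report.
run_query() {
    aws s3api --endpoint-url "$ENDPOINT_URL" select-object-content \
        --bucket select --expression-type 'SQL' \
        --input-serialization '{"CSV": {}, "CompressionType": "NONE"}' \
        --output-serialization '{"CSV": {}}' \
        --key addresses.csv \
        --expression "select count(*) from s3object;" output.json
}

# Run a command up to $1 times. On the first failure, print the
# attempt number and return 1; otherwise return 0.
repeat_until_failure() {
    _max="$1"
    shift
    _i=1
    while [ "$_i" -le "$_max" ]; do
        if ! "$@"; then
            echo "attempt $_i failed"
            return 1
        fi
        _i=$((_i + 1))
    done
    echo "no failure in $_max attempts"
    return 0
}

# Only contact the gateway when explicitly asked to.
if [ "${RUN_REPRO:-0}" = "1" ]; then
    repeat_until_failure "$MAX_ATTEMPTS" run_query
fi
```

Running it as `RUN_REPRO=1 MAX_ATTEMPTS=50 sh repro.sh` (the file name is arbitrary) would stop at the first request that returns a non-zero exit code, which is roughly the point at which to check the admin socket and the RGW log.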

One of two problems then occurs.

First problem: the RGW admin socket stops working:

ceph daemon client.rgw.a config show
Couldn't parse JSON ERROR: (22) Invalid argument
invalid json
: Expecting value: line 1 column 1 (char 0)
admin_socket: Expecting value: line 1 column 1 (char 0)
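
To detect this state from a script, one option is a crude health check on the admin socket output. This is only a sketch under assumptions: the function names are made up for illustration, and the check (command exits 0 and its output starts with `{`) is a heuristic based on `config show` normally printing a JSON object, not a full JSON validation.

```shell
#!/bin/sh
# Sketch: heuristic check that a command returns something that looks
# like a JSON object. Function names are illustrative assumptions.

# Returns 0 when the given command exits 0 and its stdout starts with '{'.
looks_like_json_object() {
    _out="$("$@" 2>/dev/null)" || return 1
    case "$_out" in
        "{"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Applies the check to the admin socket command from the report.
admin_socket_ok() {
    looks_like_json_object ceph daemon client.rgw.a config show
}
```

Calling `admin_socket_ok || echo "admin socket is not answering"` after each query would flag the first request after which the socket stops responding.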

Second possible problem: RGW crashes with the following trace:

    -1> 2023-06-27T10:15:40.699+0300 7fdaf5dba700  2 req 154783551899988622 0.007000133s s3:get_obj executing
     0> 2023-06-27T10:15:40.713+0300 7fdaf5dba700 -1 *** Caught signal (Aborted) **
 in thread 7fdaf5dba700 thread_name:civetweb-worker

 ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12b20) [0x7fdb219a7b20]
 2: gsignal()
 3: abort()
 4: /lib64/libstdc++.so.6(+0x9009b) [0x7fdb2099c09b]
 5: /lib64/libstdc++.so.6(+0x9653c) [0x7fdb209a253c]
 6: /lib64/libstdc++.so.6(+0x96597) [0x7fdb209a2597]
 7: /lib64/libstdc++.so.6(+0x967f8) [0x7fdb209a27f8]
 8: /lib64/libtcmalloc.so.4(+0x19fa4) [0x7fdb2af9ffa4]
 9: (tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7fdb2afc1c96]
 10: (std::vector<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*, std::allocator<s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >*> >::_M_default_append(unsigned long)+0xa2) [0x7fdb2c9b8262]
 11: (s3selectEngine::s3select::definition<boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >& boost::spirit::classic::impl::get_definition<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t>, boost::spirit::classic::scanner<char const*, boost::spirit::classic::scanner_policies<boost::spirit::classic::skipper_iteration_policy<boost::spirit::classic::iteration_policy>, boost::spirit::classic::match_policy, boost::spirit::classic::action_policy> > >(boost::spirit::classic::grammar<s3selectEngine::s3select, boost::spirit::classic::parser_context<boost::spirit::classic::nil_t> > const*)+0x2dc) [0x7fdb2c9b932c]
 12: (s3selectEngine::s3select::parse_query(char const*)+0xcd) [0x7fdb2c9b94dd]
 13: (RGWSelectObj_ObjStore_S3::run_s3select(char const*, char const*, unsigned long)+0x62f) [0x7fdb2c98d0bf]
 14: (RGWSelectObj_ObjStore_S3::send_response_data(ceph::buffer::v15_2_0::list&, long, long)+0x9c) [0x7fdb2c98d96c]
 15: (RGWRados::get_obj_iterate_cb(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*)+0x13c) [0x7fdb2c8ee75c]
 16: /lib64/libradosgw.so.2(+0x891cef) [0x7fdb2c8eecef]
 17: (RGWRados::iterate_obj(DoutPrefixProvider const*, RGWObjectCtx&, RGWBucketInfo const&, rgw_obj const&, long, long, unsigned long, int (*)(DoutPrefixProvider const*, rgw_raw_obj const&, long, long, long, bool, RGWObjState*, void*), void*, optional_yield)+0x37f) [0x7fdb2c900d6f]
 18: (RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)+0x2c1) [0x7fdb2c9056e1]
 19: (RGWGetObj::execute(optional_yield)+0x76a) [0x7fdb2c89a66a]
 20: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0xb3d) [0x7fdb2c513abd]
 21: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)+0x2795) [0x7fdb2c517865]
 22: (RGWCivetWebFrontend::process(mg_connection*)+0x2df) [0x7fdb2c479d7f]
 23: /lib64/libradosgw.so.2(+0x58d6d6) [0x7fdb2c5ea6d6]
 24: /lib64/libradosgw.so.2(+0x58f347) [0x7fdb2c5ec347]
 25: /lib64/libradosgw.so.2(+0x58f808) [0x7fdb2c5ec808]
 26: /lib64/libpthread.so.0(+0x814a) [0x7fdb2199d14a]
 27: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 rbd_pwl
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 immutable_obj_cache
   0/ 5 client
   1/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 0 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 1 reserver
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 rgw_sync
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 fuse
   2/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
   1/ 5 prioritycache
   0/ 5 test
   0/ 5 cephfs_mirror
   0/ 5 cephsqlite
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
  140578370840320 / civetweb-worker
  140578379233024 / civetweb-worker
  140578387625728 / civetweb-worker
  140578396018432 / civetweb-worker
  140578404411136 / civetweb-worker
  140578437981952 / rgw_user_st_syn
  140578521908992 / safe_timer
  140578538694400 / ms_dispatch
  140578563872512 / io_context_pool
  140578572265216 / rgw_dt_lg_renew
  140578857617152 / safe_timer
  140578874402560 / ms_dispatch
  140578882795264 / ceph_timer
  140578899580672 / io_context_pool
  140578960455424 / admin_socket
  140578968848128 / service
  140578977240832 / msgr-worker-2
  140578985633536 / msgr-worker-1
  140578994026240 / msgr-worker-0
  140579002418944 / safe_timer
  140579332601216 / radosgw
  max_recent     10000
  max_new        10000
  log_file /var/log/ceph/ceph-client.rgw.a.log
--- end dump of recent events ---

I also have another question. Ceph has config parameters for disabling some subsets of APIs, such as notifications, websites, STS, etc. Why is there no config option to disable the select API? I think this API is quite "problem-rich", and it would be great to have the ability to disable it. For now, we will try to patch the source code to disable it in our build.

Actions #1

Updated by Casey Bodley 10 months ago

  • Assignee set to Gal Salomon
  • Tags set to s3select
Actions #2

Updated by Gal Salomon 10 months ago

I built 16.2.13.

It is quite an old version.
The statement "select count(*) from s3object;" is the wrong syntax for that version;
you must use stdin (instead of s3object).

I ran that SQL statement many times (>100) with no crash.

According to the crash stack, it happens in the parsing stage.
