Bug #20542

closed

rgw: not initialized pointer cause rgw crash with ec data pool

Added by Aleksei Gutikov almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: High
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: jewel kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In RGWPutObjProcessor_Atomic::complete_writing_data(),
when pending_data_bl.length() > 0 and next_part_ofs == data_ofs,
the uninitialized void *handle leads to an invalid librados::AioCompletion::pc pointer,
which crashes rgw.

For EC pools, alignment = osd_pool_erasure_code_stripe_unit * data_chunk_count.
(I am not sure this is correct, but it is what I have observed.)

So, for example, with
rgw_obj_stripe_size=4M
rgw_max_chunk_size=4M
osd_pool_erasure_code_stripe_unit=4k
data_chunk_count=3 (k=3, m=2)
the alignment is 12k, and RGWRados::get_max_chunk_size(const rgw_pool& pool, uint64_t *max_chunk_size) returns max_chunk_size = 4M-4k, which is 4M rounded down to a multiple of 12k.
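The numbers in this example can be reproduced with a short sketch. It takes the observed rule that alignment is the stripe unit times the data chunk count, and additionally assumes that the max chunk size is rgw_max_chunk_size rounded down to a multiple of that alignment. The helper names ec_alignment and round_down are made up for illustration and are not RGW functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not the actual RGW code.

// Observed rule from the report: alignment = stripe unit * data chunk count.
constexpr uint64_t ec_alignment(uint64_t stripe_unit, uint64_t data_chunks) {
  return stripe_unit * data_chunks;
}

// Assumption: the chunk size is rounded down to a multiple of the alignment.
// An alignment of 0 is treated as "no alignment constraint".
constexpr uint64_t round_down(uint64_t size, uint64_t alignment) {
  return alignment == 0 ? size : size - (size % alignment);
}

constexpr uint64_t KiB = 1024;
constexpr uint64_t MiB = 1024 * KiB;

// Values from the report: stripe unit 4k, k=3, rgw_max_chunk_size=4M.
static_assert(ec_alignment(4 * KiB, 3) == 12 * KiB,
              "alignment should be 12k");
static_assert(round_down(4 * MiB, ec_alignment(4 * KiB, 3)) == 4 * MiB - 4 * KiB,
              "max chunk size should be 4M-4k");
```

With these inputs the chunk size (4M-4k) no longer divides the stripe size (4M) evenly, which is consistent with leftover data remaining in pending_data_bl at the end of the upload.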

Then, uploading a non-multipart object of size 16M to S3
while rgw_obj_stripe_size=4M
leads (somehow) to entering the while (pending_data_bl.length()) { loop
in int RGWPutObjProcessor_Atomic::complete_writing_data():2710.

There, the uninitialized void *handle leads to an invalid librados::AioCompletion::pc pointer,
which crashes rgw.

1: (()+0x1fcee2) [0x55f98c2e0ee2]
2: (()+0x11390) [0x7f43341c8390]
3: (Mutex::Lock(bool)+0xd) [0x7f432b5d1a2d]
4: (librados::AioCompletion::wait_for_safe()+0x15) [0x7f4334425995]
5: (RGWRados::aio_wait(void*)+0x11) [0x55f98c3efd81]
6: (RGWPutObjProcessor_Aio::wait_pending_front()+0x4c) [0x55f98c3fc51c]
7: (RGWPutObjProcessor_Aio::drain_pending()+0x20) [0x55f98c3fc5a0]
8: (RGWPutObjProcessor_Atomic::complete_writing_data()+0x30d) [0x55f98c40f6bd]
9: (RGWPutObjProcessor_Atomic::do_complete(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::list> > >&, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, char const, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)+0x67) [0x55f98c440037]
10: (RGWPutObjProcessor::complete(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::list> > >&, std::chrono::time_point<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, char const*, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)+0x22) [0x55f98c3eea52]
11: (RGWPutObj::execute()+0x22a9) [0x55f98c3be659]
12: (rgw_process_authenticated(RGWHandler_REST, RGWOp*&, RGWRequest*, req_state*, bool)+0x165) [0x55f98c3e8eb5]
13: (process_request(RGWRados*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*)+0x1abc) [0x55f98c3ead5c]
14: (RGWCivetWebFrontend::process(mg_connection*)+0x371) [0x55f98c299711]
15: (()+0x1ee2a9) [0x55f98c2d22a9]
16: (()+0x1efc79) [0x55f98c2d3c79]
17: (()+0x76ba) [0x7f43341be6ba]
18: (clone()+0x6d) [0x7f4329c453dd]
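
The failure mode in this report, a handle that is queued and waited on without ever being set, can be guarded against by initializing the handle and queueing it only when the write step actually produced one. The following is a minimal sketch of that defensive pattern and not the fix applied upstream. FakeCompletion, write_chunk, and flush_pending are hypothetical stand-ins, not librados or RGW names.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for an AIO completion; not the librados type.
struct FakeCompletion {
  bool waited = false;
};

// Hypothetical write step: it sets *handle only when it actually issues an
// AIO. When no AIO is issued, *handle is left untouched, which is exactly
// the case where an uninitialized caller-side handle would hold garbage.
inline int write_chunk(bool issue_aio, FakeCompletion& c, void** handle) {
  if (issue_aio) {
    *handle = &c;
  }
  return 0;
}

// Returns the number of completions waited on, or a negative error code.
inline int flush_pending(bool issue_aio, FakeCompletion& c) {
  void* handle = nullptr;  // initialized, so "no AIO issued" is detectable
  std::list<void*> pending;

  int r = write_chunk(issue_aio, c, &handle);
  if (r < 0) {
    return r;
  }
  if (handle != nullptr) {  // queue only handles that were really set
    pending.push_back(handle);
  }

  int waited = 0;
  for (void* h : pending) {
    static_cast<FakeCompletion*>(h)->waited = true;  // stands in for a wait
    ++waited;
  }
  return waited;
}
```

Calling flush_pending(false, c) waits on nothing, while flush_pending(true, c) waits on exactly one completion. Without the nullptr initialization and the check, the first case would push an indeterminate pointer into the pending list, matching the crash in wait_for_safe() shown in the backtrace.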


Related issues 2 (0 open, 2 closed)

Copied to rgw - Backport #20712: jewel: rgw: not initialized pointer cause rgw crash with ec data pool (Resolved, Nathan Cutler)
Copied to rgw - Backport #20713: kraken: rgw: not initialized pointer cause rgw crash with ec data pool (Rejected)
