Project

General

Profile

Actions

Bug #5931

closed

radosgw crashes when deleting object

Added by Artem Salpagarov over 10 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
bobtail, cuttlefish
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I use 3hub mac app (S3 client) for managing files in rados. Rados is installed on two physical Ubuntu presice machines, there are 4 osds, 3 mons. This deletion operation is being run by a user who have read/write rights to this particular bucket (user was added to a bucket by S3 commands).

Whenever I try to delete a file from a bucket, rados process crashes. After restarting radosgw, the file appears to be deleted. Here is crash log:

2013-08-11 12:06:04.443239 7f18737ee700 -1 ** Caught signal (Aborted) *
in thread 7f18737ee700

ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
1: radosgw() [0x4f1d7a]
2: (()+0xfcb0) [0x7f18bc8e3cb0]
3: (gsignal()+0x35) [0x7f18bb2ec425]
4: (abort()+0x17b) [0x7f18bb2efb8b]
5: (_gnu_cxx::_verbose_terminate_handler()+0x11d) [0x7f18bbc3e69d]
6: (()+0xb5846) [0x7f18bbc3c846]
7: (()+0xb5873) [0x7f18bbc3c873]
8: (()+0xb596e) [0x7f18bbc3c96e]
9: (std::__throw_logic_error(char const*)+0x57) [0x7f18bbbe9787]
10: (char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag)+0xa9) [0x7f18bbc27509]
11: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x43) [0x7f18bbc275e3]
12: (RGWDeleteMultiObj_ObjStore_S3::send_partial_response(std::pair<std::string, int>&)+0x4b) [0x4c6afb]
13: (RGWDeleteMultiObj::execute()+0x346) [0x5d86f6]
14: (RGWProcess::handle_request(RGWRequest*)+0x406) [0x4ec656]
15: (RGWProcess::RGWWQ::_process(RGWRequest*)+0x36) [0x4ee2f6]
16: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x50b246]
17: (ThreadPool::WorkThread::entry()+0x10) [0x50d080]
18: (()+0x7e9a) [0x7f18bc8dbe9a]
19: (clone()+0x6d) [0x7f18bb3a9ccd]
Actions #1

Updated by Yehuda Sadeh over 10 years ago

  • Assignee set to Yehuda Sadeh

Do you have log for the entire request? Looks like an issue with multi-delete.

Actions #2

Updated by Yehuda Sadeh over 10 years ago

  • Status changed from New to 12
Actions #3

Updated by Ian Colle over 10 years ago

  • Status changed from 12 to In Progress
  • Priority changed from Normal to Urgent
  • Backport set to cuttlefish
Actions #4

Updated by Yehuda Sadeh over 10 years ago

  • Backport changed from cuttlefish to bobtail, cuttlefish
Actions #5

Updated by Artem Salpagarov over 10 years ago

Yehuda Sadeh wrote:

Do you have log for the entire request? Looks like an issue with multi-delete.

I hope that's what you need:

   -89> 2013-08-12 23:31:57.159751 7f57067fc700  1 ====== starting new request req=0x1606d40 =====
   -88> 2013-08-12 23:31:57.159791 7f57067fc700  2 req 20:0.000040::::initializing
   -87> 2013-08-12 23:31:57.159800 7f57067fc700 10 host=sambo-s.s3.busy.me rgw_dns_name=s3.busy.me
   -86> 2013-08-12 23:31:57.159825 7f57067fc700 10 s->object=<NULL> s->bucket=sambo-s
   -85> 2013-08-12 23:31:57.159831 7f57067fc700 20 FCGI_ROLE=RESPONDER
   -84> 2013-08-12 23:31:57.159832 7f57067fc700 20 SCRIPT_URL=/
   -83> 2013-08-12 23:31:57.159833 7f57067fc700 20 SCRIPT_URI=http://sambo-s.s3.busy.me/
   -82> 2013-08-12 23:31:57.159834 7f57067fc700 20 HTTP_AUTHORIZATION=AWS SKWZFPLQH0XJLUP4CL9C:DuLuupkC2ztDTa1j8HcmpTXzAwU=
   -81> 2013-08-12 23:31:57.159835 7f57067fc700 20 HTTP_HOST=sambo-s.s3.busy.me
   -80> 2013-08-12 23:31:57.159836 7f57067fc700 20 HTTP_USER_AGENT=3Hub/1.0.9 (Mac OS X)
   -79> 2013-08-12 23:31:57.159836 7f57067fc700 20 CONTENT_LENGTH=113
   -78> 2013-08-12 23:31:57.159837 7f57067fc700 20 HTTP_DATE=Mon, 12 Aug 2013 21:31:57 GMT
   -77> 2013-08-12 23:31:57.159838 7f57067fc700 20 HTTP_CONTENT_MD5=w5b6Y+2W3gRNBPEW/J9+GA==
   -76> 2013-08-12 23:31:57.159839 7f57067fc700 20 HTTP_CONNECTION=close
   -75> 2013-08-12 23:31:57.159839 7f57067fc700 20 PATH=/usr/local/bin:/usr/bin:/bin
   -74> 2013-08-12 23:31:57.159840 7f57067fc700 20 SERVER_SIGNATURE=
   -73> 2013-08-12 23:31:57.159841 7f57067fc700 20 SERVER_SOFTWARE=Apache/2.2.22 (Ubuntu)
   -72> 2013-08-12 23:31:57.159842 7f57067fc700 20 SERVER_NAME=sambo-s.s3.busy.me
   -71> 2013-08-12 23:31:57.159843 7f57067fc700 20 SERVER_ADDR=188.138.34.140
   -70> 2013-08-12 23:31:57.159844 7f57067fc700 20 SERVER_PORT=80
   -69> 2013-08-12 23:31:57.159845 7f57067fc700 20 REMOTE_ADDR=109.252.55.178
   -68> 2013-08-12 23:31:57.159845 7f57067fc700 20 DOCUMENT_ROOT=/var/www
   -67> 2013-08-12 23:31:57.159846 7f57067fc700 20 SERVER_ADMIN=admin@busify.com
   -66> 2013-08-12 23:31:57.159847 7f57067fc700 20 SCRIPT_FILENAME=/var/www/s3gw.fcgi
   -65> 2013-08-12 23:31:57.159848 7f57067fc700 20 REMOTE_PORT=57435
   -64> 2013-08-12 23:31:57.159849 7f57067fc700 20 GATEWAY_INTERFACE=CGI/1.1
   -63> 2013-08-12 23:31:57.159850 7f57067fc700 20 SERVER_PROTOCOL=HTTP/1.1
   -62> 2013-08-12 23:31:57.159850 7f57067fc700 20 REQUEST_METHOD=POST
   -61> 2013-08-12 23:31:57.159851 7f57067fc700 20 QUERY_STRING=page=¶ms=&delete
   -60> 2013-08-12 23:31:57.159852 7f57067fc700 20 REQUEST_URI=/?delete
   -59> 2013-08-12 23:31:57.159853 7f57067fc700 20 SCRIPT_NAME=/
   -58> 2013-08-12 23:31:57.159855 7f57067fc700  2 req 20:0.000104:s3:POST /sambo-s/::getting op
   -57> 2013-08-12 23:31:57.159858 7f57067fc700  2 req 20:0.000107:s3:POST /sambo-s/:multi_object_delete:authorizing
   -56> 2013-08-12 23:31:57.159872 7f57067fc700 20 get_obj_state: rctx=0x7f56c0002bb0 obj=.users:SKWZFPLQH0XJLUP4CL9C state=0x7f56c0006348 s->prefetch_data=0
   -55> 2013-08-12 23:31:57.159881 7f57067fc700 10 moving .users+SKWZFPLQH0XJLUP4CL9C to cache LRU end
   -54> 2013-08-12 23:31:57.159884 7f57067fc700 10 cache get: name=.users+SKWZFPLQH0XJLUP4CL9C : hit
   -53> 2013-08-12 23:31:57.159890 7f57067fc700 20 get_obj_state: s->obj_tag was set empty
   -52> 2013-08-12 23:31:57.159895 7f57067fc700 10 moving .users+SKWZFPLQH0XJLUP4CL9C to cache LRU end
   -51> 2013-08-12 23:31:57.159896 7f57067fc700 10 cache get: name=.users+SKWZFPLQH0XJLUP4CL9C : hit
   -50> 2013-08-12 23:31:57.159925 7f57067fc700 20 get_obj_state: rctx=0x7f56c0002bb0 obj=.users.uid:sambo-uploads state=0x7f56c000cd08 s->prefetch_data=0
   -49> 2013-08-12 23:31:57.159931 7f57067fc700 10 moving .users.uid+sambo-uploads to cache LRU end
   -48> 2013-08-12 23:31:57.159933 7f57067fc700 10 cache get: name=.users.uid+sambo-uploads : hit
   -47> 2013-08-12 23:31:57.159936 7f57067fc700 20 get_obj_state: s->obj_tag was set empty
   -46> 2013-08-12 23:31:57.159940 7f57067fc700 10 moving .users.uid+sambo-uploads to cache LRU end
   -45> 2013-08-12 23:31:57.159941 7f57067fc700 10 cache get: name=.users.uid+sambo-uploads : hit
   -44> 2013-08-12 23:31:57.159980 7f57067fc700 10 get_canon_resource(): dest=/sambo-s/?delete
   -43> 2013-08-12 23:31:57.159985 7f57067fc700 10 auth_hdr:
POST
w5b6Y+2W3gRNBPEW/J9+GA==

Mon, 12 Aug 2013 21:31:57 GMT
/sambo-s/?delete
   -42> 2013-08-12 23:31:57.160052 7f57067fc700 15 b64=DuLuupkC2ztDTa1j8HcmpTXzAwU=
   -41> 2013-08-12 23:31:57.160055 7f57067fc700 15 auth_sign=DuLuupkC2ztDTa1j8HcmpTXzAwU=
   -40> 2013-08-12 23:31:57.160057 7f57067fc700 15 compare=0
   -39> 2013-08-12 23:31:57.160059 7f57067fc700  2 req 20:0.000308:s3:POST /sambo-s/:multi_object_delete:reading permissions
   -38> 2013-08-12 23:31:57.160070 7f57067fc700 20 get_obj_state: rctx=0x7f56c0001290 obj=.rgw:sambo-s state=0x7f56c000b498 s->prefetch_data=0
   -37> 2013-08-12 23:31:57.160076 7f57067fc700 10 moving .rgw+sambo-s to cache LRU end
   -36> 2013-08-12 23:31:57.160078 7f57067fc700 10 cache get: name=.rgw+sambo-s : hit
   -35> 2013-08-12 23:31:57.160084 7f57067fc700 20 get_obj_state: s->obj_tag was set empty
   -34> 2013-08-12 23:31:57.160088 7f57067fc700 10 moving .rgw+sambo-s to cache LRU end
   -33> 2013-08-12 23:31:57.160089 7f57067fc700 10 cache get: name=.rgw+sambo-s : hit
   -32> 2013-08-12 23:31:57.160097 7f57067fc700 20 rgw_get_bucket_info: bucket=sambo-s(@.rgw.buckets[7624.2]) owner me
   -31> 2013-08-12 23:31:57.160107 7f57067fc700 20 get_obj_state: rctx=0x7f56c0001290 obj=sambo-s: state=0x7f56c000b498 s->prefetch_data=0
   -30> 2013-08-12 23:31:57.160126 7f57067fc700 15 Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>me</ID><DisplayName>Artem</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>me</ID><DisplayName>Artem</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>sambo-ios</ID><DisplayName>Sambo iOS App</DisplayName></Grantee><Permission>READ</Permission></Grant><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>sambo-ios</ID><DisplayName>Sambo iOS App</DisplayName></Grantee><Permission>READ_ACP</Permission></Grant><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>sambo-uploads</ID><DisplayName>Sambo uploads</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
   -29> 2013-08-12 23:31:57.160139 7f57067fc700  2 req 20:0.000388:s3:POST /sambo-s/:multi_object_delete:reading the cors attr
   -28> 2013-08-12 23:31:57.160142 7f57067fc700 10 Going to read cors from attrs
   -27> 2013-08-12 23:31:57.160148 7f57067fc700 20 get_obj_state: rctx=0x7f56c0001290 obj=sambo-s: state=0x7f56c000b498 s->prefetch_data=0
   -26> 2013-08-12 23:31:57.160151 7f57067fc700  2 Warning: There is no content for CORS xattr, cors may not be set yet
   -25> 2013-08-12 23:31:57.160153 7f57067fc700  2 req 20:0.000402:s3:POST /sambo-s/:multi_object_delete:verifying op permissions
   -24> 2013-08-12 23:31:57.160156 7f57067fc700  5 Searching permissions for uid=sambo-uploads mask=50
   -23> 2013-08-12 23:31:57.160158 7f57067fc700  5 Found permission: 15
   -22> 2013-08-12 23:31:57.160159 7f57067fc700  5 Searching permissions for group=1 mask=50
   -21> 2013-08-12 23:31:57.160161 7f57067fc700  5 Permissions for group not found
   -20> 2013-08-12 23:31:57.160162 7f57067fc700  5 Getting permissions id=sambo-uploads owner=me perm=2
   -19> 2013-08-12 23:31:57.160163 7f57067fc700 10  uid=sambo-uploads requested perm (type)=2, policy perm=2, user_perm_mask=2, acl perm=2
   -18> 2013-08-12 23:31:57.160165 7f57067fc700  2 req 20:0.000414:s3:POST /sambo-s/:multi_object_delete:verifying op params
   -17> 2013-08-12 23:31:57.160167 7f57067fc700  2 req 20:0.000416:s3:POST /sambo-s/:multi_object_delete:executing
   -16> 2013-08-12 23:31:57.160323 7f57067fc700 20 get_obj_state: rctx=0x7f56c0001290 obj=sambo-s:3/demo.mp4 state=0x7f56c0003388 s->prefetch_data=0
   -15> 2013-08-12 23:31:57.160352 7f57067fc700  1 -- 10.1.2.1:0/1053771 --> 10.1.1.1:6801/2157 -- osd_op(client.15196.0:144 7624.2_3/demo.mp4 [getxattrs,stat] 10.3d68e473 e82) v4 -- ?+0 0x7f56c0004130 con 0x7f5728002270
   -14> 2013-08-12 23:31:57.172930 7f57403b3700  1 -- 10.1.2.1:0/1053771 <== osd.0 10.1.1.1:6801/2157 46 ==== osd_op_reply(144 7624.2_3/demo.mp4 [getxattrs,stat] ondisk = 0) v4 ==== 158+0+1114 (2626042734 0 100609773) 0x7f5710000be0 con 0x7f5728002270
   -13> 2013-08-12 23:31:57.173048 7f57067fc700 10 manifest: total_size = 6342788
   -12> 2013-08-12 23:31:57.173060 7f57067fc700 10 manifest: ofs=0 loc=sambo-s:3/demo.mp4
   -11> 2013-08-12 23:31:57.173063 7f57067fc700 10 manifest: ofs=524288 loc=sambo-s:_shadow__Iy-HYIrhbPZLTvWhOqVix8fd-Plge-B_1
   -10> 2013-08-12 23:31:57.173064 7f57067fc700 10 manifest: ofs=4718592 loc=sambo-s:_shadow__Iy-HYIrhbPZLTvWhOqVix8fd-Plge-B_2
    -9> 2013-08-12 23:31:57.173065 7f57067fc700 20 get_obj_state: setting s->obj_tag to _Qq5tYkDQZ_xsPwfa5xBC1-tYnZAbopn
    -8> 2013-08-12 23:31:57.173067 7f57067fc700 10 obj contains manifest
    -7> 2013-08-12 23:31:57.173119 7f57067fc700  0 setting object tag=_6W4zC-Y2oVsOKcikO5POLwf3z0JxUqO
    -6> 2013-08-12 23:31:57.173190 7f57067fc700  1 -- 10.1.2.1:0/1053771 --> 10.1.2.1:6800/6695 -- osd_op(client.15196.0:145 .dir.7624.2 [call rgw.bucket_prepare_op] 10.8abdf842 e82) v4 -- ?+0 0x7f56c000dc70 con 0x160a520
    -5> 2013-08-12 23:31:57.207824 7f57403b3700  1 -- 10.1.2.1:0/1053771 <== osd.2 10.1.2.1:6800/6695 35 ==== osd_op_reply(145 .dir.7624.2 [call] ack = 0) v4 ==== 110+0+0 (3252951521 0 0) 0x7f57240009a0 con 0x160a520
    -4> 2013-08-12 23:31:57.207943 7f57067fc700  1 -- 10.1.2.1:0/1053771 --> 10.1.1.1:6801/2157 -- osd_op(client.15196.0:146 7624.2_3/demo.mp4 [cmpxattr user.rgw.idtag (33) op 1 mode 1,setxattr user.rgw.idtag (33),setxattr user.rgw.shadow_name (43),call refcount.put] 10.3d68e473 e82) v4 -- ?+0 0x7f56c000d940 con 0x7f5728002270
    -3> 2013-08-12 23:31:57.273423 7f57403b3700  1 -- 10.1.2.1:0/1053771 <== osd.0 10.1.1.1:6801/2157 47 ==== osd_op_reply(146 7624.2_3/demo.mp4 [cmpxattr (33) op 1 mode 1,setxattr (33),setxattr (43),call] ack = 0) v4 ==== 242+0+0 (3535126808 0 0) 0x7f5710000be0 con 0x7f5728002270
    -2> 2013-08-12 23:31:57.273562 7f57067fc700  1 -- 10.1.2.1:0/1053771 --> 10.1.2.1:6800/6695 -- osd_op(client.15196.0:147 .dir.7624.2 [call rgw.bucket_complete_op] 10.8abdf842 e82) v4 -- ?+0 0x7f56c000d830 con 0x160a520
    -1> 2013-08-12 23:31:57.273632 7f57067fc700  1 -- 10.1.2.1:0/1053771 --> 10.1.2.1:6800/6695 -- osd_op(client.15196.0:148 gc.16 [call rgw.gc_set_entry] 5.c2ff39f3 e82) v4 -- ?+0 0x7f56c0009f30 con 0x160a520
     0> 2013-08-12 23:31:57.275374 7f57067fc700 -1 *** Caught signal (Aborted) **
 in thread 7f57067fc700

 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
 1: radosgw() [0x4f1d7a]
 2: (()+0xfcb0) [0x7f5748552cb0]
 3: (gsignal()+0x35) [0x7f5746f5b425]
 4: (abort()+0x17b) [0x7f5746f5eb8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f57478ad69d]
 6: (()+0xb5846) [0x7f57478ab846]
 7: (()+0xb5873) [0x7f57478ab873]
 8: (()+0xb596e) [0x7f57478ab96e]
 9: (std::__throw_logic_error(char const*)+0x57) [0x7f5747858787]
 10: (char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag)+0xa9) [0x7f5747896509]
 11: (std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)+0x43) [0x7f57478965e3]
 12: (RGWDeleteMultiObj_ObjStore_S3::send_partial_response(std::pair<std::string, int>&)+0x4b) [0x4c6afb]
 13: (RGWDeleteMultiObj::execute()+0x346) [0x5d86f6]
 14: (RGWProcess::handle_request(RGWRequest*)+0x406) [0x4ec656]
 15: (RGWProcess::RGWWQ::_process(RGWRequest*)+0x36) [0x4ee2f6]
 16: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x50b246]
 17: (ThreadPool::WorkThread::entry()+0x10) [0x50d080]
 18: (()+0x7e9a) [0x7f574854ae9a]
 19: (clone()+0x6d) [0x7f5747018ccd]
Actions #6

Updated by Yehuda Sadeh over 10 years ago

  • Status changed from In Progress to Resolved
Actions #7

Updated by Artem Salpagarov over 10 years ago

I just wanted to confirm that the issue is gone. Thanks for a quick fix!

Actions #8

Updated by Moritz Krinke over 10 years ago

Stumbled upon the same problem using bobtail binaries for ubuntu precise (from http://ceph.com/debian-bobtail repo), any idea on when new binaries will be available?
Thanks!

Actions #9

Updated by ramon makkelie over 10 years ago

we have exactly the same problem
and i also upgraded to emperor because i thought this will fix the issue
but the issue still persists.

any help would be appreciated here

Actions #10

Updated by Artem Salpagarov over 10 years ago

I believe you need to open another bug with link to this one to get any feedback.

Actions

Also available in: Atom PDF