Bug #56029


rgw crash when use swift api

Added by jielei zhou almost 2 years ago. Updated 7 months ago.

Status:
Resolved
Priority:
Normal
Target version:
-
% Done:
100%

Source:
Tags:
backport_processed
Backport:
pacific
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi

My ceph version is 17.2.0, and rgw is the same version.
When I use the swift api, if I set X-Container-Meta-Web-Index: on a container and then access it, rgw crashes.
This is the log:

#swift post -m 'web-index:index.html' okamura-static-web
#curl http://***/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/ -v
*   Trying ***:80...
* TCP_NODELAY set
* Connected to dev-swift-vip.i1.dev.v6.internal-gmo (***) port 80 (#0)
> GET /swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/ HTTP/1.1
> Host: dev-swift-vip.i1.dev.v6.internal-gmo
> User-Agent: curl/7.68.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host dev-swift-vip.i1.dev.v6.internal-gmo left intact
curl: (52) Empty reply from server

The first request returns an empty reply.
The next request gets connection refused:

#curl http://*********** -v
*   Trying *****:80...
* TCP_NODELAY set
* connect to 172.22.35.94 port 80 failed: Connection refused
* Failed to connect to dev-swift-vip.i1.dev.v6.internal-gmo port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to dev-swift-vip.i1.dev.v6.internal-gmo port 80: Connection refused
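
For anyone who wants to script the reproduction, below is a minimal sketch of the same two steps using Python's requests library. The endpoint, account hash and token are placeholders rather than the real values from this cluster, and the anonymous GET mirrors the curl request above; this is an illustration, not part of the original report.

import requests

# Placeholders: substitute your own rgw Swift endpoint, account and auth token.
BASE = "http://<rgw-host>/swift/v1/AUTH_<account>"
CONTAINER = "okamura-static-web"
TOKEN = "<token from swift auth or keystone>"

# Step 1: set the static-website index on the container
# (equivalent to: swift post -m 'web-index:index.html' okamura-static-web)
requests.post(
    f"{BASE}/{CONTAINER}",
    headers={"X-Auth-Token": TOKEN, "X-Container-Meta-Web-Index": "index.html"},
)

# Step 2: list the container anonymously; on the affected rgw (17.2.0) the
# worker segfaults, so the connection is dropped instead of returning a reply.
try:
    r = requests.get(f"{BASE}/{CONTAINER}/", timeout=10)
    print(r.status_code, len(r.content))
except requests.exceptions.ConnectionError as exc:
    print("connection dropped / refused - rgw likely crashed:", exc)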

In the ceph rgw log, we can see that the thread crashed:

-60> 2022-05-20T00:21:19.739+0000 7fe264a1c700  5 lifecycle: schedule life cycle next start time: Sat May 21 00:00:00 2022
-59> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 HTTP_ACCEPT=*/*
-58> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 HTTP_HOST=***********
-57> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 HTTP_USER_AGENT=curl/7.68.0
-56> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 HTTP_VERSION=1.1
-55> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 REMOTE_ADDR=***********
-54> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 REQUEST_METHOD=GET
-53> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 REQUEST_URI=/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/
-52> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 SCRIPT_URI=/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/
-51> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 SERVER_PORT=80
-50> 2022-05-20T00:21:20.294+0000 7fe2471e1700 1 ====== starting new request req=0x7fe2983db650 =====
-49> 2022-05-20T00:21:20.294+0000 7fe2471e1700 2 req 14058840826806468654 0.000000000s initializing for trans_id = tx00000c31b093ebb8f002e-006286df00-d605-dev-c3j1
-48> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s rgw api priority: s3=8 s3website=7
-47> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s host=dev-swift-vip.i1.dev.v6.internal-gmo
-46> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0
-45> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s final domain/bucket subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0 s->info.domain= s->info.request_uri=/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/
-44> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s ver=v1 first=okamura-static-web req=
-43> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s handler=28RGWHandler_REST_Bucket_SWIFT
-42> 2022-05-20T00:21:20.294+0000 7fe2471e1700 2 req 14058840826806468654 0.000000000s getting op 0
-41> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s get_system_obj_state: rctx=0x7fe2983da680 obj=dev-c3j1.rgw.log:script.prerequest. state=0x560db724e9a0 s->prefetch_data=0
-40> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s cache get: name=dev-c3j1.rgw.log++script.prerequest. : hit (negative entry)
-39> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s swift:list_bucket scheduling with throttler client=3 cost=1
-38> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s swift:list_bucket op=28RGWListBucket_ObjStore_SWIFT
-37> 2022-05-20T00:21:20.294+0000 7fe2471e1700 2 req 14058840826806468654 0.000000000s swift:list_bucket verifying requester
-36> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::DefaultStrategy: trying rgw::auth::swift::TempURLEngine
-35> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::TempURLEngine denied with reason=-13
-34> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::DefaultStrategy: trying rgw::auth::swift::SignedTokenEngine
-33> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::SignedTokenEngine denied with reason=-1
-32> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::DefaultStrategy: trying rgw::auth::keystone::TokenEngine
-31> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::keystone::TokenEngine denied with reason=-13
-30> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::DefaultStrategy: trying rgw::auth::swift::SwiftAnonymousEngine
-29> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket rgw::auth::swift::SwiftAnonymousEngine granted access
-28> 2022-05-20T00:21:20.294+0000 7fe2471e1700 2 req 14058840826806468654 0.000000000s swift:list_bucket normalizing buckets and tenants
-27> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s s->object=<NULL> s->bucket=9c4a3213fc124afcb369c3b8246cc543/okamura-static-web
-26> 2022-05-20T00:21:20.294+0000 7fe2471e1700 2 req 14058840826806468654 0.000000000s swift:list_bucket init permissions
-25> 2022-05-20T00:21:20.294+0000 7fe2471e1700 20 req 14058840826806468654 0.000000000s swift:list_bucket get_system_obj_state: rctx=0x7fe2983da060 obj=dev-c3j1.rgw.meta:root:9c4a3213fc124afcb369c3b8246cc543/okamura-static-web state=0x560db724ee20 s->prefetch_data=0
-24> 2022-05-20T00:21:20.294+0000 7fe2471e1700 10 req 14058840826806468654 0.000000000s swift:list_bucket cache get: name=dev-c3j1.rgw.meta+root+9c4a3213fc124afcb369c3b8246cc543/okamura-static-web : miss
-23> 2022-05-20T00:21:20.294+0000 7fe2411d5700 10 req 14058840826806468654 0.000000000s swift:list_bucket cache put: name=dev-c3j1.rgw.meta+root+9c4a3213fc124afcb369c3b8246cc543/okamura-static-web info.flags=0x16
-22> 2022-05-20T00:21:20.294+0000 7fe2411d5700 10 req 14058840826806468654 0.000000000s swift:list_bucket adding dev-c3j1.rgw.meta+root+9c4a3213fc124afcb369c3b8246cc543/okamura-static-web to cache LRU end
-21> 2022-05-20T00:21:20.294+0000 7fe2411d5700 10 req 14058840826806468654 0.000000000s swift:list_bucket updating xattr: name=ceph.objclass.version bl.length()=42
-20> 2022-05-20T00:21:20.294+0000 7fe2411d5700 20 req 14058840826806468654 0.000000000s swift:list_bucket get_system_obj_state: s->obj_tag was set empty
-19> 2022-05-20T00:21:20.294+0000 7fe2411d5700 10 req 14058840826806468654 0.000000000s swift:list_bucket cache get: name=dev-c3j1.rgw.meta+root+9c4a3213fc124afcb369c3b8246cc543/okamura-static-web : type miss (requested=0x11, cached=0x16)
-18> 2022-05-20T00:21:20.294+0000 7fe2411d5700 20 req 14058840826806468654 0.000000000s swift:list_bucket rados->read ofs=0 len=0
-17> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s swift:list_bucket rados_obj.operate() r=0 bl.length=302
-16> 2022-05-20T00:21:20.294+0000 7fe2409d4700 10 req 14058840826806468654 0.000000000s swift:list_bucket cache put: name=dev-c3j1.rgw.meta+root+9c4a3213fc124afcb369c3b8246cc543/okamura-static-web info.flags=0x11
-15> 2022-05-20T00:21:20.294+0000 7fe2409d4700 10 req 14058840826806468654 0.000000000s swift:list_bucket moving dev-c3j1.rgw.meta+root+9c4a3213fc124afcb369c3b8246cc543/okamura-static-web to cache LRU end
-14> 2022-05-20T00:21:20.294+0000 7fe2409d4700 15 req 14058840826806468654 0.000000000s swift:list_bucket decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>9c4a3213fc124afcb369c3b8246cc543$9c4a3213fc124afcb369c3b8246cc543</ID><DisplayName>demo</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>9c4a3213fc124afcb369c3b8246cc543$9c4a3213fc124afcb369c3b8246cc543</ID><DisplayName>demo</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
-13> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s swift:list_bucket get_system_obj_state: rctx=0x7fe2983d9ac8 obj=dev-c3j1.rgw.meta:users.uid:9c4a3213fc124afcb369c3b8246cc543$9c4a3213fc124afcb369c3b8246cc543 state=0x560db724f060 s->prefetch_data=0
-12> 2022-05-20T00:21:20.294+0000 7fe2409d4700 10 req 14058840826806468654 0.000000000s swift:list_bucket cache get: name=dev-c3j1.rgw.meta+users.uid+9c4a3213fc124afcb369c3b8246cc543$9c4a3213fc124afcb369c3b8246cc543 : hit (requested=0x16, cached=0x17)
-11> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s swift:list_bucket get_system_obj_state: s->obj_tag was set empty
-10> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s swift:list_bucket Read xattr: user.rgw.idtag
-9> 2022-05-20T00:21:20.294+0000 7fe2409d4700 10 req 14058840826806468654 0.000000000s swift:list_bucket cache get: name=dev-c3j1.rgw.meta+users.uid+9c4a3213fc124afcb369c3b8246cc543$9c4a3213fc124afcb369c3b8246cc543 : hit (requested=0x13, cached=0x17)
-8> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s swift:list_bucket RGWSI_User_RADOS::read_user_info(): anonymous user
-7> 2022-05-20T00:21:20.294+0000 7fe2409d4700 2 req 14058840826806468654 0.000000000s swift:list_bucket recalculating target
-6> 2022-05-20T00:21:20.294+0000 7fe2409d4700 10 req 14058840826806468654 0.000000000s Starting retarget
-5> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s get_obj_state: rctx=0x7fe2983da9e0 obj=okamura-static-web:index.html state=0x560db6f02de8 s->prefetch_data=1
-4> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s WARNING: blocking librados call
-3> 2022-05-20T00:21:20.294+0000 7fe2409d4700 10 req 14058840826806468654 0.000000000s manifest: total_size = 10
-2> 2022-05-20T00:21:20.294+0000 7fe2409d4700 20 req 14058840826806468654 0.000000000s get_obj_state: setting s->obj_tag to 807b4a00-6279-4f24-8971-b1175e2f49c1.54526.10995817613273309006
-1> 2022-05-20T00:21:20.294+0000 7fe2409d4700 2 req 14058840826806468654 0.000000000s swift:get_obj reading permissions
0> 2022-05-20T00:21:20.298+0000 7fe2409d4700 -1 *** Caught signal (Segmentation fault) **
in thread 7fe2409d4700 thread_name:radosgw

ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable)
1: /lib64/libpthread.so.0(+0x12ce0) [0x7fe294232ce0]
2: (rgw_bucket::rgw_bucket(rgw_bucket const&)+0x23) [0x7fe296f6f4a3]
3: (rgw::sal::RadosObject::set_atomic(RGWObjectCtx*) const+0x55) [0x7fe297473f25]
4: (rgw_build_object_policies(DoutPrefixProvider const*, rgw::sal::Store*, req_state*, bool, optional_yield)+0x1cb) [0x7fe2972e317b]
5: (RGWHandler::do_read_permissions(RGWOp*, bool, optional_yield)+0x56) [0x7fe29730f0c6]
6: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, rgw::sal::Store*, bool)+0x3c4) [0x7fe296f49fb4]
7: (process_request(rgw::sal::Store*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >, std::shared_ptr<RateLimiter>, int*)+0x2616) [0x7fe296f4d696]
8: /lib64/libradosgw.so.2(+0x65a00a) [0x7fe296eba00a]
9: /lib64/libradosgw.so.2(+0x65b6b1) [0x7fe296ebb6b1]
10: /lib64/libradosgw.so.2(+0x65b82c) [0x7fe296ebb82c]
11: make_fcontext()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
20/20 rgw
1/ 5 rgw_sync
1/ 5 rgw_datacache
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
0/ 5 seastore
0/ 5 seastore_onode
0/ 5 seastore_odata
0/ 5 seastore_omap
0/ 5 seastore_tm
0/ 5 seastore_cleaner
0/ 5 seastore_lba
0/ 5 seastore_cache
0/ 5 seastore_journal
0/ 5 seastore_device
0/ 5 alienstore
1/ 5 mclock
2/-2 (syslog threshold)
99/99 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
7fe159005700 / safe_timer
7fe15a007700 / ms_dispatch
7fe15a808700 / ceph_timer
7fe15b80a700 / io_context_pool
7fe2409d4700 / radosgw
7fe2411d5700 / radosgw
7fe2471e1700 / radosgw
7fe24b1e9700 / radosgw
7fe2511f5700 / radosgw
7fe2519f6700 / radosgw
7fe2521f7700 / radosgw
7fe2529f8700 / radosgw
7fe2531f9700 / radosgw
7fe2539fa700 / radosgw
7fe2541fb700 / radosgw
7fe2549fc700 / radosgw
7fe2551fd700 / radosgw
7fe2559fe700 / radosgw
7fe2561ff700 / radosgw
7fe256a00700 / radosgw
7fe257201700 / radosgw
7fe259205700 / radosgw
7fe25e20f700 / notif-worker0
7fe25f211700 / rgw_reshard
7fe25fa12700 / rgw_user_st_syn
7fe260213700 / rgw_buck_st_syn
7fe260a14700 / lifecycle_thr_2
7fe262a18700 / lifecycle_thr_1
7fe264a1c700 / lifecycle_thr_0
7fe267221700 / http_manager
7fe268223700 / sync-log-trim
7fe268a24700 / http_manager
7fe26ba2a700 / http_manager
7fe26ca2c700 / rgw_obj_expirer
7fe26d22d700 / rgw_gc
7fe26e22f700 / safe_timer
7fe26f231700 / ms_dispatch
7fe270a34700 / io_context_pool
7fe271235700 / rgw_dt_lg_renew
7fe282257700 / safe_timer
7fe283259700 / ms_dispatch
7fe283a5a700 / ceph_timer
7fe284a5c700 / io_context_pool
7fe28625f700 / kmip worker
7fe286a60700 / http_manager
7fe288263700 / admin_socket
7fe288a64700 / service
7fe289265700 / msgr-worker-2
7fe289a66700 / msgr-worker-1
7fe28a267700 / msgr-worker-0
7fe2984265c0 / radosgw
max_recent 10000
max_new 10000
log_file /var/log/ceph/ceph-client.rgw.*izjtte.log
--- end dump of recent events ---


Related issues 1 (0 open, 1 closed)

Copied to rgw - Backport #56185: pacific: rgw crash when use swift api (Resolved, Cory Snyder)
#1

Updated by Casey Bodley almost 2 years ago

  • Assignee set to Daniel Gryniewicz
#2

Updated by Casey Bodley almost 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 46719
#3

Updated by Daniel Gryniewicz almost 2 years ago

jielei zhou, there should be a fix in the linked PR. Can you test it if you get a chance?
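
(A quick way to check, sketched under the same placeholder assumptions as the snippet in the description: with the fix applied, the anonymous GET on a container that has web-index set should come back with a normal HTTP response instead of a dropped connection.)

import requests

BASE = "http://<rgw-host>/swift/v1/AUTH_<account>"  # placeholder endpoint
# With the fix, we expect an HTTP response (the index document or a listing)
# rather than "Empty reply from server" from a segfaulted rgw worker.
r = requests.get(f"{BASE}/okamura-static-web/", timeout=10)
print(r.status_code)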

#4

Updated by Daniel Gryniewicz almost 2 years ago

  • Backport set to pacific
#5

Updated by Daniel Gryniewicz almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport
#6

Updated by Backport Bot almost 2 years ago

#7

Updated by jielei zhou almost 2 years ago

Hi Daniel,
We have run a test on it and it looks fine. I think this bug is solved. Thank you very much.

#8

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
#9

Updated by Konstantin Shalygin 7 months ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100
