Project

General

Profile

Actions

Bug #27048

open

s3aws: RequestError: send request failed after upgrade to mimic

Added by Dieter Roels over 5 years ago. Updated over 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We have a multisite setup running luminous which is primarily used as the storage for our docker registry. Recently we upgraded a test cluster to mimic. The upgrade was successful, everything is healthy, but the docker registry no longer works. With s3cmd everything is ok, but when the docker registry tries to access its s3 backend it gives a unexpected EOF:

time="2018-08-22T09:24:27.900996761Z" level=error msg="response completed with error" auth.user.name=admin err.code="manifest unknown" err.detail="s3aws: RequestError: send request failed\ncaused by: Get https://<loadbalancerIP>/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link: EOF" err.message="manifest unknown" go.version=go1.7.6

We see the request on the loadbalancer, we see the request coming in on the radosgw server itself with tcpdump, but we do not see the request at all in radosgw logs, even with "debug rgw = 20". We are running civetweb with ssl, so we cannot see what civetweb/radosgw is returning from the tcpdumps, but when we run strace on the radosgw process we see the connection from the loadbalancer. So radosgw receives the request, but logs nothing.

Any tips on how we can debug this issue further?

Actions #1

Updated by Dieter Roels over 5 years ago

I found this other issue that looks like the same problem: http://tracker.ceph.com/issues/23489

That would explain why s3cmd works (relative paths) and docker registry does not (absolute paths). It made sense, so I tested by adding the loadbalancer dns names to the "hostnames" of the zonegroup. This broke the multisite sync. (failed to retrieve sync info: (13) Permission denied)

After also adding the dns names of all the radosgw instances to "hostnames" of the zonegroup the sync works again, but the requests from docker registry are still being aborted and still nothing in the radosgw logs...

Actions #2

Updated by Dieter Roels over 5 years ago

In the meantime we were able to do further testing and manually crafted a GET request on the exact same object, with absolute path. This works. So it is not an issue with absolute versus relative paths imo. Afaik it is only docker with s3aws go library that gets its connections closed immediately.

2018-08-23 10:31:10.645 7ffb4b47b700 20 CONTENT_TYPE=application/x-compressed-tar
2018-08-23 10:31:10.645 7ffb4b47b700 20 HTTP_ACCEPT=*/*
2018-08-23 10:31:10.645 7ffb4b47b700 20 HTTP_AUTHORIZATION=AWS 7Z2GY04RZZAGF0B39URY:YQJwrJqgKMz4C3mOaLajaBR841o=
2018-08-23 10:31:10.645 7ffb4b47b700 20 HTTP_DATE=Thu, 23 Aug 2018 08:31:10 +0000
2018-08-23 10:31:10.645 7ffb4b47b700 20 HTTP_HOST=s3.lb-vip.com
2018-08-23 10:31:10.645 7ffb4b47b700 20 HTTP_USER_AGENT=curl/7.29.0
2018-08-23 10:31:10.645 7ffb4b47b700 20 HTTP_VERSION=1.1
2018-08-23 10:31:10.645 7ffb4b47b700 20 REMOTE_ADDR=10.5.10.181
2018-08-23 10:31:10.645 7ffb4b47b700 20 REQUEST_METHOD=GET
2018-08-23 10:31:10.645 7ffb4b47b700 20 REQUEST_URI=/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link
2018-08-23 10:31:10.645 7ffb4b47b700 20 SCRIPT_URI=/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link
2018-08-23 10:31:10.645 7ffb4b47b700 20 SERVER_PORT=443
2018-08-23 10:31:10.645 7ffb4b47b700 20 SERVER_PORT_SECURE=443
2018-08-23 10:31:10.645 7ffb4b47b700 1 ====== starting new request req=0x7ffb4b472850 =====
2018-08-23 10:31:10.645 7ffb4b47b700 2 req 13419:0.000030::GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link::initializing for trans_id = tx00000000000000000346b-005b7e70ce-1a4fa4-bccl-tda
2018-08-23 10:31:10.645 7ffb4b47b700 10 rgw api priority: s3=5 s3website=4
2018-08-23 10:31:10.645 7ffb4b47b700 10 host=s3.lb-vip.com
2018-08-23 10:31:10.645 7ffb4b47b700 20 subdomain= domain=s3.lb-vip.com in_hosted_domain=1 in_hosted_domain_s3website=0
2018-08-23 10:31:10.645 7ffb4b47b700 20 final domain/bucket subdomain= domain=s3.lb-vip.com in_hosted_domain=1 in_hosted_domain_s3website=0 s->info.domain=s3.lb-vip.com s->info.request_uri=/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link
2018-08-23 10:31:10.645 7ffb4b47b700 20 get_handler handler=22RGWHandler_REST_Obj_S3
2018-08-23 10:31:10.645 7ffb4b47b700 10 handler=22RGWHandler_REST_Obj_S3
2018-08-23 10:31:10.645 7ffb4b47b700 2 req 13419:0.000115:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link::getting op 0
2018-08-23 10:31:10.645 7ffb4b47b700 10 op=21RGWGetObj_ObjStore_S3
2018-08-23 10:31:10.645 7ffb4b47b700 2 req 13419:0.000159:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:verifying requester
2018-08-23 10:31:10.645 7ffb4b47b700 20 rgw::auth::StrategyRegistry::s3_main_strategy_t: trying rgw::auth::s3::AWSAuthStrategy
2018-08-23 10:31:10.645 7ffb4b47b700 20 rgw::auth::s3::AWSAuthStrategy: trying rgw::auth::s3::S3AnonymousEngine
2018-08-23 10:31:10.645 7ffb4b47b700 20 rgw::auth::s3::S3AnonymousEngine denied with reason=-1
2018-08-23 10:31:10.645 7ffb4b47b700 20 rgw::auth::s3::AWSAuthStrategy: trying rgw::auth::s3::LocalEngine
2018-08-23 10:31:10.645 7ffb4b47b700 10 get_canon_resource(): dest=/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link
2018-08-23 10:31:10.645 7ffb4b47b700 10 string_to_sign:
GET

application/x-compressed-tar
Thu, 23 Aug 2018 08:31:10 +0000
/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link
2018-08-23 10:31:10.645 7ffb4b47b700 15 string_to_sign=GET

application/x-compressed-tar
Thu, 23 Aug 2018 08:31:10 +0000
/tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link
2018-08-23 10:31:10.645 7ffb4b47b700 15 server signature=YQJwrJqgKMz4C3mOaLajaBR841o=
2018-08-23 10:31:10.645 7ffb4b47b700 15 client signature=YQJwrJqgKMz4C3mOaLajaBR841o=
2018-08-23 10:31:10.645 7ffb4b47b700 15 compare=0
2018-08-23 10:31:10.645 7ffb4b47b700 20 rgw::auth::s3::LocalEngine granted access
2018-08-23 10:31:10.645 7ffb4b47b700 20 rgw::auth::s3::AWSAuthStrategy granted access
2018-08-23 10:31:10.645 7ffb4b47b700 2 req 13419:0.000363:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:normalizing buckets and tenants
2018-08-23 10:31:10.645 7ffb4b47b700 10 s->object=docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link s->bucket=tda-registry
2018-08-23 10:31:10.645 7ffb4b47b700 2 req 13419:0.000376:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:init permissions
2018-08-23 10:31:10.645 7ffb4b47b700 15 decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/&quot;&gt;&lt;Owner&gt;&lt;ID&gt;testuser&lt;/ID&gt;&lt;DisplayName&gt;User for testing s3 access</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance&quot; xsi:type="CanonicalUser"><ID>testuser</ID><DisplayName>User for testing s3 access</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2018-08-23 10:31:10.646 7ffb4b47b700 2 req 13419:0.000517:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:recalculating target
2018-08-23 10:31:10.646 7ffb4b47b700 2 req 13419:0.000521:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:reading permissions
2018-08-23 10:31:10.646 7ffb4b47b700 20 get_obj_state: rctx=0x7ffb4b472010 obj=tda-registry:docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link state=0x55a25774f8a0 s->prefetch_data=1
2018-08-23 10:31:10.648 7ffb4b47b700 10 manifest: total_size = 71
2018-08-23 10:31:10.648 7ffb4b47b700 20 get_obj_state: setting s->obj_tag to 127d514d-4028-4b64-a66b-2040e9372548.1575093.323123
2018-08-23 10:31:10.648 7ffb4b47b700 15 decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/&quot;&gt;&lt;Owner&gt;&lt;ID&gt;testuser&lt;/ID&gt;&lt;DisplayName&gt;User for testing s3 access</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance&quot; xsi:type="CanonicalUser"><ID>testuser</ID><DisplayName>User for testing s3 access</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.002803:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:init op
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.002811:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:verifying op mask
2018-08-23 10:31:10.648 7ffb4b47b700 20 required_mask= 1 user.op_mask=7
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.002814:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:verifying op permissions
2018-08-23 10:31:10.648 7ffb4b47b700 20 -- Getting permissions begin with perm_mask=49
2018-08-23 10:31:10.648 7ffb4b47b700 5 Searching permissions for identity=rgw::auth::SysReqApplier > rgw::auth::LocalApplier(acct_user=testuser, acct_name=User for testing s3 access, subuser=, perm_mask=15, is_admin=0) mask=49
2018-08-23 10:31:10.648 7ffb4b47b700 5 Searching permissions for uid=testuser
2018-08-23 10:31:10.648 7ffb4b47b700 5 Found permission: 15
2018-08-23 10:31:10.648 7ffb4b47b700 5 Searching permissions for group=1 mask=49
2018-08-23 10:31:10.648 7ffb4b47b700 5 Permissions for group not found
2018-08-23 10:31:10.648 7ffb4b47b700 5 Searching permissions for group=2 mask=49
2018-08-23 10:31:10.648 7ffb4b47b700 5 Permissions for group not found
2018-08-23 10:31:10.648 7ffb4b47b700 5 -
Getting permissions done for identity=rgw::auth::SysReqApplier > rgw::auth::LocalApplier(acct_user=testuser, acct_name=User for testing s3 access, subuser=, perm_mask=15, is_admin=0), owner=testuser, perm=1
2018-08-23 10:31:10.648 7ffb4b47b700 10 identity=rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=testuser, acct_name=User for testing s3 access, subuser=, perm_mask=15, is_admin=0) requested perm (type)=1, policy perm=1, user_perm_mask=15, acl perm=1
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.002854:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:verifying op params
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.002856:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:pre-executing
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.002859:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:executing
2018-08-23 10:31:10.648 7ffb4b47b700 20 get_obj_state: rctx=0x7ffb4b472010 obj=tda-registry:docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link state=0x55a25774f8a0 s
>prefetch_data=1
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.acl
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.content_type
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.etag
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.idtag
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.manifest
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.pg_ver
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.source_zone
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.tail_tag
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.x-amz-acl
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.x-amz-content-sha256
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.x-amz-date
2018-08-23 10:31:10.648 7ffb4b47b700 20 Read xattr: user.rgw.x-amz-storage-class
2018-08-23 10:31:10.648 7ffb4b47b700 15 Encryption mode:
2018-08-23 10:31:10.648 7ffb4b47b700 20 get_obj_state: rctx=0x7ffb4b472010 obj=tda-registry:docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link state=0x55a25774f8a0 s->prefetch_data=1
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.003120:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:completing
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.003148:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:op status=0
2018-08-23 10:31:10.648 7ffb4b47b700 2 req 13419:0.003151:s3:GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link:get_obj:http status=200
2018-08-23 10:31:10.648 7ffb4b47b700 1 ====== req done req=0x7ffb4b472850 op status=0 http_status=200 ======
2018-08-23 10:31:10.648 7ffb4b47b700 1 civetweb: 0x55a256fde9d8: 10.5.10.181 - - [23/Aug/2018:10:31:10 +0200] "GET /tda-registry/docker/registry/v2/repositories/pnu/kubectl-download/_manifests/tags/v1.11.2/current/link HTTP/1.1" 200 367 - curl/7.29.0

Actions #3

Updated by Dieter Roels over 5 years ago

To debug this further we need t oenable debug logging in civetweb but it seems there is no way of doing this? debug civetweb = 20 in the config does not change anything, and it seems a lot of the "rgw frontends" options are overwritten/overruled as well... How can we get the embedded civetweb to log why it is dropping these connections?

Actions #4

Updated by Dieter Roels over 5 years ago

So, I was totally wrong in comment #2. It is an absoluteURI issue. My curl did not do an absoluteURI obviously...

It turns out there is a bug in civetweb, and there is already a PR for this: https://github.com/ceph/civetweb/pull/27 (tough I believe it is best fixed in civetweb upstream)

So I'm really hoping this makes it to mimic 13.2.2 since we would like to upgrade and use the radosgw almost exclusively for docker registry....

Actions #5

Updated by Dieter Roels over 5 years ago

Turns out this is already fixed in upstream civetweb: https://github.com/civetweb/civetweb/pull/540/files

I hope this can be included in the next mimic release so ceph can handle absoluteURI's. Maybe we can add a test well since this is not the first time we have an issue with absoluteURI if I recall correctly.

Actions

Also available in: Atom PDF