Bug #38373

closed

multisite: rgw_data_sync_status json decode failure breaks automated datalog trimming

Added by Casey Bodley about 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source:
Tags: multisite
Backport: luminous mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The automated datalog trimming logic queries rgw_data_sync_status from peer zones. The request succeeds with 200 OK, but the JSON response fails to decode with -EINVAL:

2019-02-18 17:46:35.664 7f3ca89c3f00 10 data trim: fetching sync status for zone a42a6a8d-fdcb-469e-a159-09c27f61deae
2019-02-18 17:46:35.664 7f3ca89c3f00 20 data trim: query sync status from b1f3d387-87df-4532-9681-e6d52c388cf7                                               
2019-02-18 17:46:35.664 7f3b7dffb700 20 reqs_thread_entry: start                                                                                             
2019-02-18 17:46:35.664 7f3ca89c3f00 20 cr:s=0x55d7e2189640:op=0x55d7e23fb880:21RGWReadRESTResourceCRI20rgw_data_sync_statusE: operate()                     
2019-02-18 17:46:35.664 7f3ca89c3f00 20 cr:s=0x55d7e22517f0:op=0x55d7e23fafa0:13DataLogTrimCR: operate()                                                     
2019-02-18 17:46:35.664 7f3ca89c3f00 20 run: stack=0x55d7e22517f0 is_blocked_by_stack()=0 is_sleeping=0 waiting_for_child()=1                                
2019-02-18 17:46:35.664 7f3ca89c3f00 20 cr:s=0x55d7e2189640:op=0x55d7e23fb880:21RGWReadRESTResourceCRI20rgw_data_sync_statusE: operate()                     
2019-02-18 17:46:35.664 7f3ca89c3f00 20 > HTTP_DATE -> Mon, 18 Feb 2019 22:46:35 +0000                                                                       
2019-02-18 17:46:35.664 7f3ca89c3f00 10 get_canon_resource(): dest=/admin/log/                                                                               
2019-02-18 17:46:35.664 7f3ca89c3f00 10 generated canonical header: GET                                                                                      

Mon, 18 Feb 2019 22:46:35 +0000                                                                                                                              
/admin/log/                                                                                                                                                  
2019-02-18 17:46:35.665 7f3ca89c3f00 15 generated auth header: AWS DiPt4V7WWvy2njL1z6aC:72RVsmf1bwcRjgsv+RqpYjAdAq0=                                         
2019-02-18 17:46:35.665 7f3ca89c3f00 20 sending request to http://localhost:8001/admin/log/?type=data&status&source-zone=a42a6a8d-fdcb-469e-a159-09c27f61deae&rgwx-zonegroup=cd63f686-0a7d-40c1-8837-893d57e6316a
2019-02-18 17:46:35.665 7f3ca89c3f00 20 register_request mgr=0x7ffc5259a450 req_data->id=0, curl_handle=0x55d7e23fc130                                       
2019-02-18 17:46:35.665 7f3ca89c3f00 20 run: stack=0x55d7e2189640 is io blocked
2019-02-18 17:46:35.665 7f3b7dffb700 20 link_request req_data=0x55d7e23fed10 req_data->id=0, curl_handle=0x55d7e23fc130                                      
2019-02-18 17:46:35.668 7f3b7dffb700 10 receive_http_header
2019-02-18 17:46:35.668 7f3b7dffb700 10 received header:HTTP/1.1 200 OK
2019-02-18 17:46:35.668 7f3b7dffb700 10 receive_http_header
2019-02-18 17:46:35.668 7f3b7dffb700 10 received header:x-amz-request-id: tx000000000000000000249-005c6b35cb-101b-na-2                                       
2019-02-18 17:46:35.668 7f3b7dffb700 10 receive_http_header
2019-02-18 17:46:35.668 7f3b7dffb700 10 received header:Content-Length: 240
2019-02-18 17:46:35.668 7f3b7dffb700 10 receive_http_header
2019-02-18 17:46:35.668 7f3b7dffb700 10 received header:Date: Mon, 18 Feb 2019 22:46:35 GMT                                                                  
2019-02-18 17:46:35.668 7f3b7dffb700 10 receive_http_header
2019-02-18 17:46:35.668 7f3b7dffb700 10 received header:Connection: Keep-Alive
2019-02-18 17:46:35.668 7f3b7dffb700 10 receive_http_header
2019-02-18 17:46:35.668 7f3b7dffb700 10 received header:
2019-02-18 17:46:35.668 7f3ca89c3f00 20 cr:s=0x55d7e2189640:op=0x55d7e23fb880:21RGWReadRESTResourceCRI20rgw_data_sync_statusE: operate()                     
2019-02-18 17:46:35.670 7f3ca89c3f00 20 cr:s=0x55d7e2189640:op=0x55d7e23fb880:21RGWReadRESTResourceCRI20rgw_data_sync_statusE: operate() returned r=-22      
2019-02-18 17:46:35.670 7f3ca89c3f00 15 stack 0x55d7e2189640 end
2019-02-18 17:46:35.670 7f3ca89c3f00 20 stack->operate() returned ret=-22
2019-02-18 17:46:35.670 7f3ca89c3f00 20 run: stack=0x55d7e2189640 is done
2019-02-18 17:46:35.670 7f3ca89c3f00 20 cr:s=0x55d7e22517f0:op=0x55d7e23fafa0:13DataLogTrimCR: operate()                                                     
2019-02-18 17:46:35.670 7f3ca89c3f00 20 cr:s=0x55d7e22517f0:op=0x55d7e23fafa0:13DataLogTrimCR: operate()                                                     
2019-02-18 17:46:35.670 7f3ca89c3f00  4 data trim: failed to fetch sync status from all peers
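
The r=-22 and ret=-22 values in the log above are the negated errno value for EINVAL, which is how the decode failure surfaces. As a rough aid for scanning similar logs, the following Python sketch pulls negative return codes out of log lines and maps them to errno symbols. The helper names (errno_name, failed_returns) and the regular expression are illustrative assumptions, not part of Ceph, and the pattern only covers the two "returned r=" / "returned ret=" forms seen in this excerpt.

```python
import errno
import re

# Matches results such as "operate() returned r=-22" or "returned ret=-22".
# Assumption: only these two spellings appear, as in the excerpt above.
RETURN_RE = re.compile(r"returned r(?:et)?=(-?\d+)")


def errno_name(ret):
    """Translate a negative return code into its errno symbol."""
    if ret >= 0:
        return "OK"
    return errno.errorcode.get(-ret, "UNKNOWN")


def failed_returns(lines):
    """Yield (code, errno symbol) for every negative return code in the lines."""
    for line in lines:
        match = RETURN_RE.search(line)
        if match is None:
            continue
        code = int(match.group(1))
        if code < 0:
            yield code, errno_name(code)


if __name__ == "__main__":
    sample = [
        "2019-02-18 17:46:35.670 7f3ca89c3f00 20 stack->operate() returned ret=-22",
        "2019-02-18 17:46:35.670 7f3ca89c3f00 20 run: stack=0x55d7e2189640 is done",
    ]
    for code, name in failed_returns(sample):
        print(code, name)
```

Lines without a return code, such as the "is done" line, are skipped, so the sketch can be fed a whole log file line by line.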

https://github.com/ceph/ceph/pull/26494


Related issues 2 (0 open, 2 closed)

Copied to rgw - Backport #38412: luminous: multisite: rgw_data_sync_status json decode failure breaks automated datalog trimming - Resolved - Casey Bodley
Copied to rgw - Backport #38413: mimic: multisite: rgw_data_sync_status json decode failure breaks automated datalog trimming - Resolved - Prashant D
#1

Updated by Casey Bodley about 5 years ago

  • Description updated (diff)
  • Status changed from In Progress to Fix Under Review
#2

Updated by Casey Bodley about 5 years ago

  • Status changed from Fix Under Review to Pending Backport
#3

Updated by Casey Bodley about 5 years ago

  • Copied to Backport #38412: luminous: multisite: rgw_data_sync_status json decode failure breaks automated datalog trimming added
#4

Updated by Casey Bodley about 5 years ago

  • Copied to Backport #38413: mimic: multisite: rgw_data_sync_status json decode failure breaks automated datalog trimming added
#5

Updated by Nathan Cutler almost 5 years ago

  • Status changed from Pending Backport to Resolved