Bug #61588

closed

radosgw-admin: System Attributes on Objects can cause object stat to dump invalid JSON

Added by Tom Coldrick 11 months ago. Updated 9 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While writing a Python script around radosgw-admin object stat, I found that I occasionally received UnicodeDecodeError exceptions when loading the output as JSON, such as:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 4802: invalid continuation byte

Tracking down the offending bytes, I noticed that they are always in the attrs section, in the values of user.rgw.pg_ver and user.rgw.source_zone. Looking through the source, these appear to be encoded uint64_t and uint32_t values respectively, which is perfectly reasonable for the underlying data.

The trouble is that when radosgw-admin dumps these attributes it assumes that they're all stringly typed, calling ceph::buffer::list::to_str(), which seems to whack all the contained bytes into a string. As the underlying data are integer types, there's no guarantee that this will form a valid UTF-8 encoded string, and even when it does the output is not really what we want.
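The failure mode described above can be reproduced outside Ceph. The sketch below is only an illustration: it assumes the integers are stored as fixed-width little-endian bytes (an assumption about the encoding, not something stated in this report), and the helper name raw_bytes_as_text is made up for the example.

```python
import struct


def raw_bytes_as_text(value: int, fmt: str) -> str:
    """Pack an integer with the given struct format and read the bytes as UTF-8.

    This mimics treating an encoded integer as if it were a string.
    """
    return struct.pack(fmt, value).decode("utf-8")


def decode_error_reason(value: int, fmt: str):
    """Return the UnicodeDecodeError reason, or None if decoding succeeds."""
    try:
        raw_bytes_as_text(value, fmt)
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# 0xe4 opens a three-byte UTF-8 sequence, but the next byte (0x00) is not a
# continuation byte, so decoding fails.
print(decode_error_reason(0xE4, "<Q"))

# A small value decodes without error, but the result is a letter padded with
# NUL characters rather than the number that was stored.
print(repr(raw_bytes_as_text(0x41, "<I")))
```

The second case is the "even when it does" situation: the bytes happen to be valid UTF-8, yet the dumped value says nothing useful about the integer.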

I think the easiest solution here is to add special handling for these attrs alongside other system attributes that are already treated differently, such as etag or delete_at. I'm happy to make that change. I think any fix is going to require special handling like this, as there isn't a way to inspect the serialised type of an arbitrary bufferlist at runtime.
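To make the idea of per-attribute special handling concrete, here is a rough Python sketch. It is not the actual radosgw-admin change, which would live in the C++ source; the table INTEGER_ATTRS, the function render_attr, and the little-endian fixed-width layout are all assumptions made for illustration.

```python
import json
import struct

# Hypothetical lookup table: attribute name -> struct format of its value.
# The widths follow the types named in this report; the little-endian byte
# order is an assumption.
INTEGER_ATTRS = {
    "user.rgw.pg_ver": "<Q",       # uint64_t
    "user.rgw.source_zone": "<I",  # uint32_t
}


def render_attr(name: str, raw: bytes):
    """Return a JSON-safe value for one attribute.

    Known integer attributes are unpacked as numbers; everything else falls
    back to text, with undecodable bytes replaced instead of raising.
    """
    fmt = INTEGER_ATTRS.get(name)
    if fmt is not None and len(raw) == struct.calcsize(fmt):
        return struct.unpack(fmt, raw)[0]
    return raw.decode("utf-8", errors="replace")


attrs = {
    "user.rgw.pg_ver": struct.pack("<Q", 228),
    "user.rgw.source_zone": struct.pack("<I", 7),
    "user.rgw.etag": b"abc123",
}
print(json.dumps({k: render_attr(k, v) for k, v in attrs.items()}))
```

The explicit table reflects the point made above: since the serialised type cannot be inspected at runtime, each integer-valued attribute has to be listed by name.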

Actions #1

Updated by Casey Bodley 11 months ago

  • Status changed from New to Triaged
  • Assignee set to Tom Coldrick

Tom Coldrick wrote:

I think the easiest solution here is to add special handling for these attrs alongside other system attributes that are already treated differently, such as etag or delete_at. I'm happy to make that change. I think any fix is going to require special handling like this, as there isn't a way to inspect the serialised type of an arbitrary bufferlist at runtime.

agreed, that sounds good

Actions #2

Updated by Casey Bodley 11 months ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 52007
Actions #3

Updated by Casey Bodley 9 months ago

  • Status changed from Fix Under Review to Resolved