Bug #21401
closed
rgw: Missing error handling when gen_rand_alphanumeric is failing
Description
The function gen_rand_alphanumeric() tries to read some randomness from /dev/urandom and convert it into a string. The read operation may fail (e.g. with "Too many open files"), in which case a negative error code is returned:
int gen_rand_alphanumeric(CephContext *cct, char *dest, int size) /* size should be the required string size + 1 */
{
  int ret = get_random_bytes(dest, size);
  if (ret < 0) {
    lderr(cct) << "cannot get random bytes: " << cpp_strerror(-ret) << dendl;
    return ret;
  }
  ...
}
The consuming function append_rand_alpha(), however, does not check the return code; it uses the uninitialized buffer buf and appends it to the result string:
static inline void append_rand_alpha(CephContext *cct, const string& src, string& dest, int len)
{
  dest = src;
  char buf[len + 1];
  gen_rand_alphanumeric(cct, buf, len);
  dest.append("_");
  dest.append(buf);
}
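A minimal sketch of the obvious fix is to check the return code and propagate the error instead of appending the never-filled buffer. The names below (fake_gen_rand_alphanumeric, append_rand_alpha_checked) are illustrative stand-ins, not the actual Ceph patch:

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for gen_rand_alphanumeric(): fills dest with size-1 characters
// plus a trailing NUL, or returns a negative errno on failure (the real
// function reads from /dev/urandom via get_random_bytes()).
static int fake_gen_rand_alphanumeric(char *dest, int size, bool fail) {
  if (fail)
    return -EMFILE;  // "Too many open files", as observed in this report
  std::memset(dest, 'a', size - 1);
  dest[size - 1] = '\0';
  return 0;
}

// Sketch of a checked append_rand_alpha(): on failure, dest is left
// untouched and the error is returned to the caller instead of silently
// appending uninitialized memory.
static int append_rand_alpha_checked(const std::string &src, std::string &dest,
                                     int len, bool fail) {
  std::vector<char> buf(len + 1);
  int r = fake_gen_rand_alphanumeric(buf.data(), len + 1, fail);
  if (r < 0)
    return r;  // propagate the error; caller must handle it
  dest = src;
  dest.append("_");
  dest.append(buf.data());
  return 0;
}
```

Making the function return int of course means every caller has to be touched as well, which is why the eventual rework in master went further (see the backport notes below in the thread).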
As a result, when this happens while an object is being copied, we see its tag and prefix fields containing garbage instead of the expected 24-character string. In particular, the prefix field seems to always contain just ".P_", leading to collisions in tail object names and, in the long run, to data loss: a second object's tail objects will now overwrite those of the first object.
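The observed ".P_" prefix is consistent with the failure path: the buffer is never filled, and if its first byte happens to be NUL, append_rand_alpha() appends an empty C string. A toy model of the broken path (not Ceph code; the ".P" source string is assumed for illustration, and the buffer is zeroed rather than truly uninitialized):

```cpp
#include <string>

// Model of the broken path: gen_rand_alphanumeric() has failed, so buf is
// never filled. In the real bug the stack memory is uninitialized; it is
// zeroed here to model the observed outcome where the prefix is just ".P_".
static std::string broken_append_rand_alpha(const std::string &src) {
  char buf[25] = {0};  // gen_rand_alphanumeric() failed: buf stays empty
  std::string dest = src;
  dest.append("_");
  dest.append(buf);    // appends nothing, since buf[0] == '\0'
  return dest;
}
```

Every affected object thus ends up with the identical prefix, and two different objects derive identical tail object names from it, so the second PUT overwrites the first object's tail.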
Originally found in v0.94.10, but the code still looks the same in master.
Updated by Jens Harbott over 6 years ago
In order to reproduce, just make /dev/urandom inaccessible for some time, e.g. run (on the rgw node):

mv /dev/urandom /dev/blah; sleep 60; mv /dev/blah /dev/urandom
During these 60 seconds now run on a client node:
$ dd if=/dev/urandom of=test01 count=520 bs=1024
520+0 records in
520+0 records out
532480 bytes (532 kB) copied, 0.044453 s, 12.0 MB/s
$ dd if=/dev/urandom of=test02 count=520 bs=1024
520+0 records in
520+0 records out
532480 bytes (532 kB) copied, 0.0425381 s, 12.5 MB/s
$ s3cmd put test01 s3://test1/atest01
test01 -> s3://test1/atest01  [1 of 1]
 532480 of 532480   100% in    0s     3.57 MB/s  done
$ s3cmd put test02 s3://test1/atest02
test02 -> s3://test1/atest02  [1 of 1]
 532480 of 532480   100% in    0s     3.91 MB/s  done
Now the first object is corrupted, as its tail object has been overwritten by the second PUT.
$ s3cmd get s3://test1/atest01 atest01
s3://test1/atest01 -> atest01  [1 of 1]
 532480 of 532480   100% in    0s    24.54 MB/s  done
WARNING: MD5 signatures do not match: computed=949048077122cc83f883a72b690db3cb, received="c9212fd208c64173a7b4b1e057f5b752"
One will also get truncated objects when one of these objects is removed; see http://tracker.ceph.com/issues/20107 and http://tracker.ceph.com/issues/20166 for related issues.
Updated by Casey Bodley over 6 years ago
- Status changed from New to 12
- Assignee set to Casey Bodley
Updated by Casey Bodley over 6 years ago
- Status changed from 12 to Fix Under Review
Updated by Abhishek Lekshmanan over 6 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to Luminous
Jewel backports may need a separate fix.
Updated by Abhishek Lekshmanan over 6 years ago
- Copied to Backport #21851: luminous: rgw: Missing error handling when gen_rand_alphanumeric is failing added
Updated by Nathan Cutler over 6 years ago
- Backport changed from Luminous to luminous
Updated by Nathan Cutler over 6 years ago
- Related to Bug #22006: RGWCrashError: RGW will crash when generating random bucket name and object name during loadgen process added
Updated by Nathan Cutler over 6 years ago
Note that the luminous backport is non-trivial because random number generation has been reworked in master.
Updated by Casey Bodley almost 6 years ago
- Related to Bug #22225: rgw:socket leak in s3 multi part upload added
Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".