Bug #18412
multisite: use after free in RGWCloneMetaLogCoroutine::state_read_shard_status()
Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
<kind>InvalidWrite</kind>
<what>Invalid write of size 4</what>
<stack>
  <frame> <ip>0x3B9976</ip> <obj>/usr/bin/radosgw</obj> <fn>finish</fn> <dir>/usr/src/debug/ceph-11.1.0-6257-gd4000dd/src/rgw</dir> <file>rgw_metadata.cc</file> <line>205</line> </frame>
  <frame> <ip>0x3B9976</ip> <obj>/usr/bin/radosgw</obj> <fn>_mdlog_info_completion(void*, void*)</fn> <dir>/usr/src/debug/ceph-11.1.0-6257-gd4000dd/src/rgw</dir> <file>rgw_metadata.cc</file> <line>228</line> </frame>
  ...
<auxwhat>Address 0x731263c8 is 1,464 bytes inside a block of size 1,528 free'd</auxwhat>
<stack>
  <frame> <ip>0x98B418D</ip> <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj> <fn>operator delete(void*)</fn> <dir>/builddir/build/BUILD/valgrind-3.11.0/coregrind/m_replacemalloc</dir> <file>vg_replace_malloc.c</file> <line>576</line> </frame>
  <frame> <ip>0x5760E2</ip> <obj>/usr/bin/radosgw</obj> <fn>RGWCloneMetaLogCoroutine::~RGWCloneMetaLogCoroutine()</fn> <dir>/usr/src/debug/ceph-11.1.0-6257-gd4000dd/src/rgw</dir> <file>rgw_sync.cc</file> <line>1220</line> </frame>
  ...
<auxwhat>Block was alloc'd at</auxwhat>
<stack>
  <frame> <ip>0x98B3203</ip> <obj>/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so</obj> <fn>operator new(unsigned long)</fn> <dir>/builddir/build/BUILD/valgrind-3.11.0/coregrind/m_replacemalloc</dir> <file>vg_replace_malloc.c</file> <line>334</line> </frame>
  <frame> <ip>0x5810F1</ip> <obj>/usr/bin/radosgw</obj> <fn>RGWMetaSyncShardCR::incremental_sync()</fn> <dir>/usr/src/debug/ceph-11.1.0-6257-gd4000dd/src/rgw</dir> <file>rgw_sync.cc</file> <line>1562</line> </frame>
This coroutine was interrupted by shutdown, according to http://qa-proxy.ceph.com/teuthology/sage-2016-12-26_18:32:29-rgw-wip-sage-testing---basic-smithi/668021/remote/smithi031/log/rgw.client.1.log.gz:
2016-12-26 22:38:40.701980 34a62700 20 cr:s=0x181698a0:op=0x73125e10:24RGWCloneMetaLogCoroutine: operate()
2016-12-26 22:38:40.702042 34a62700 20 meta sync: operate: shard_id=24: reading shard status
2016-12-26 22:38:40.703443 34a62700 20 run: stack=0x181698a0 is io blocked
2016-12-26 22:38:40.703515 34a62700 20 cr:s=0x22f864c0:op=0x72e5cf30:18RGWMetaSyncShardCR: operate()
2016-12-26 22:38:40.703573 34a62700 20 meta sync: incremental_sync:1571: shard_id=61 mdlog_marker= sync_marker.marker=
2016-12-26 22:38:40.703645 34a62700 20 meta sync: incremental_sync:1603: shard_id=61 mdlog_marker= max_marker= sync_marker.marker= period_marker=
2016-12-26 22:38:40.703763 34a62700 20 run: stack=0x22f864c0 is io blocked
2016-12-26 22:38:40.711414 34a62700 0 ERROR: failed to clone shard, completion_mgr.get_next() returned ret=-125
2016-12-26 22:38:40.712671 34a62700 5 run(): was stopped, exiting
2016-12-26 22:38:40.713967 34a62700 20 clearing stack on run() exit: stack=0x18131230 nref=1
2016-12-26 22:38:40.715969 34a62700 20 clearing stack on run() exit: stack=0x18137780 nref=3
2016-12-26 22:38:40.716024 34a62700 20 clearing stack on run() exit: stack=0x18137d30 nref=3
2016-12-26 22:38:40.716064 34a62700 20 clearing stack on run() exit: stack=0x18138b10 nref=3
2016-12-26 22:38:40.716102 34a62700 20 clearing stack on run() exit: stack=0x18139890 nref=3
RGWCloneMetaLogCoroutine::state_read_shard_status() calls RGWMetadataLog::get_info_async() and passes pointers to some of its member variables. RGWMetadataLogInfoCompletion stores these pointers and dereferences them on completion, but nothing appears to hold a reference on RGWCloneMetaLogCoroutine to make this safe. If the coroutine is destroyed before the completion runs, as happened on this shutdown, the completion writes into freed memory.
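For illustration only, here is a minimal sketch of one general way to avoid this class of bug: the completion is shared between the requester and the I/O layer, and the requester cancels it before it goes away. This is not taken from the Ceph source or from the merged fix. ShardInfo, InfoCompletion, Requester and late_completion_is_dropped are hypothetical names invented for this example, and std::shared_ptr stands in for whatever reference counting the real code uses.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the data an async shard-status read delivers.
struct ShardInfo {
  int marker = 0;
};

// Hypothetical completion, jointly owned by the requester and the I/O layer.
class InfoCompletion {
 public:
  using Callback = std::function<void(int ret, const ShardInfo&)>;
  explicit InfoCompletion(Callback cb) : callback_(std::move(cb)) {}

  // Called by the requester before it is destroyed; later finish() calls
  // become no-ops instead of touching freed memory.
  void cancel() {
    std::lock_guard<std::mutex> lock(mutex_);
    callback_ = nullptr;
  }

  // Called by the I/O layer when the request finishes.
  // Returns true if the callback ran, false if it had been cancelled.
  bool finish(int ret, const ShardInfo& info) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!callback_) return false;
    callback_(ret, info);
    return true;
  }

 private:
  std::mutex mutex_;
  Callback callback_;
};

// Hypothetical requester, playing the role of the coroutine in this report.
class Requester {
 public:
  Requester() = default;
  Requester(const Requester&) = delete;
  Requester& operator=(const Requester&) = delete;
  ~Requester() {
    if (completion_) completion_->cancel();
  }

  // Starts a request and returns the handle the I/O layer would keep.
  std::shared_ptr<InfoCompletion> start() {
    completion_ = std::make_shared<InfoCompletion>(
        [this](int r, const ShardInfo& info) {
          ret = r;
          marker = info.marker;
        });
    return completion_;
  }

  int ret = -1;
  int marker = 0;

 private:
  std::shared_ptr<InfoCompletion> completion_;
};

// Usage: the requester is destroyed while the request is still pending.
inline bool late_completion_is_dropped() {
  ShardInfo info;
  info.marker = 7;
  std::shared_ptr<InfoCompletion> pending;
  {
    Requester r;
    pending = r.start();
  }  // r is destroyed here; its destructor cancels the completion
  const bool ran = pending->finish(0, info);
  assert(!ran);  // the callback must not run against the destroyed requester
  return !ran;
}
```

In this sketch the mutex is held while the callback runs, so a cancel() from another thread waits for an in-flight callback to return before the requester's memory is released. The trade-off is that the callback must not call cancel() on its own completion, which would deadlock.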
Updated by Casey Bodley over 7 years ago
- Subject changed from multisite: invalid read in RGWCloneMetaLogCoroutine::state_read_shard_status() to multisite: use after free in RGWCloneMetaLogCoroutine::state_read_shard_status()
Updated by Casey Bodley over 7 years ago
merged to master in https://github.com/ceph/ceph/pull/12605
backported to kraken in https://github.com/ceph/ceph/pull/12949
Updated by Abhishek Lekshmanan over 7 years ago
- Status changed from New to Pending Backport
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #18613: kraken: multisite: use after free in RGWCloneMetaLogCoroutine::state_read_shard_status() added
Updated by Nathan Cutler about 7 years ago
- Status changed from Pending Backport to Resolved