Bug #38470

radosgw-admin can't delete bucket and radosgw-admin gc process not working (or very slow)

Added by hoan nv 7 months ago. Updated 6 months ago.

Status:
Verified
Priority:
Normal
Assignee:
Target version:
-
Start date:
02/25/2019
Due date:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Hi all.

I have a Ceph cluster. After deleting a bucket with 100M objects, radosgw-admin gc list reports 1419659 objects.

I ran radosgw-admin gc process, but it does not reduce the number of objects.

radosgw-admin bucket rm --bucket=sample_bucket --purge-objects --debug-ms=1

2019-02-25 15:14:00.688 7fd8e7c02700  1 -- 172.24.9.51:0/3200411148 <== osd.125 172.24.8.54:6826/3486989 2 ==== osd_op_reply(108 17a4ce99-009e-40f2-a2d2-2afc218ebd9b.1427945.187_tienhv.jpg [stat,cmpxattr (32) op 1 mode 1,setxattr (14)] v25102'85540 uv85540 ondisk = 0) v8 ==== 287+0+0 (2785563410 0 0) 0x7fd8dc00f320 con 0x2b85560
2019-02-25 15:14:00.688 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:109 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.guard_bucket_resharding,call rgw.bucket_unlink_instance] snapc 0=[] ondisk+write+known_if_redirected e25102) v8 -- 0x2b8efb0 con 0
2019-02-25 15:14:00.688 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 16 ==== osd_op_reply(109 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call,call] v25102'4663291 uv4662993 ondisk = 0) v8 ==== 241+0+0 (2028355245 0 0) 0x7fd8e0014c10 con 0x7fd8e000c080
2019-02-25 15:14:00.688 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:110 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.guard_bucket_resharding,call rgw.bucket_read_olh_log] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x2b87bc0 con 0
2019-02-25 15:14:00.688 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 17 ==== osd_op_reply(110 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call,call] v0'0 uv4662993 ondisk = 0) v8 ==== 241+0+11 (1961857954 0 1993775135) 0x7fd8e00049e0 con 0x7fd8e000c080
2019-02-25 15:14:00.688 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:111 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.bi_get] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x2b7af10 con 0
2019-02-25 15:14:00.692 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 18 ==== osd_op_reply(111 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call] v0'0 uv4662993 ondisk = 0) v8 ==== 199+0+148 (1347967863 0 1998127041) 0x7fd8e00049e0 con 0x7fd8e000c080
2019-02-25 15:14:00.692 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.53:6853/2229354 -- osd_op(unknown.0.0:112 18.9aa 18:5597dd5f:::17a4ce99-009e-40f2-a2d2-2afc218ebd9b.1427945.187_vtv.jpg:head [getxattrs,stat] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x2b7af10 con 0
2019-02-25 15:14:01.984 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.85 172.24.8.53:6853/2229354 1 ==== osd_op_reply(112 17a4ce99-009e-40f2-a2d2-2afc218ebd9b.1427945.187_vtv.jpg [getxattrs,stat] v0'0 uv271611 ondisk = 0) v8 ==== 242+0+13659717 (2678204839 0 740394784) 0x7fd8e0014c10 con 0x2b87bc0
2019-02-25 15:14:02.300 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.53:6853/2229354 -- osd_op(unknown.0.0:113 18.9aa 18:5597dd5f:::17a4ce99-009e-40f2-a2d2-2afc218ebd9b.1427945.187_vtv.jpg:head [stat,cmpxattr user.rgw.olh.idtag (32) op 1 mode 1,setxattr user.rgw.olh.pending.000000005c73a3cai8mixbj151q569dn (14)] snapc 0=[] ondisk+write+known_if_redirected e25102) v8 -- 0x5ace970 con 0
2019-02-25 15:14:02.580 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.85 172.24.8.53:6853/2229354 2 ==== osd_op_reply(113 17a4ce99-009e-40f2-a2d2-2afc218ebd9b.1427945.187_vtv.jpg [stat,cmpxattr (32) op 1 mode 1,setxattr (14)] v25102'271612 uv271612 ondisk = 0) v8 ==== 284+0+0 (1268605200 0 0) 0x7fd8e0013cb0 con 0x2b87bc0
2019-02-25 15:14:02.580 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:114 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.guard_bucket_resharding,call rgw.bucket_unlink_instance] snapc 0=[] ondisk+write+known_if_redirected e25102) v8 -- 0x5ace600 con 0
2019-02-25 15:14:02.584 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 19 ==== osd_op_reply(114 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call,call] v25102'4663292 uv4662993 ondisk = 0) v8 ==== 241+0+0 (710074718 0 0) 0x7fd8e0013cb0 con 0x7fd8e000c080
2019-02-25 15:14:02.584 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:115 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.guard_bucket_resharding,call rgw.bucket_read_olh_log] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x5acdf70 con 0
2019-02-25 15:14:02.584 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 20 ==== osd_op_reply(115 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call,call] v0'0 uv4662993 ondisk = 0) v8 ==== 241+0+11 (1576923740 0 1993775135) 0x7fd8e0013cb0 con 0x7fd8e000c080
2019-02-25 15:14:02.652 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:116 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.bucket_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x2b7af10 con 0
2019-02-25 15:14:02.668 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 21 ==== osd_op_reply(116 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call] v0'0 uv4662993 ondisk = 0) v8 ==== 199+0+446 (1331265696 0 4111972205) 0x7fd8e0013cb0 con 0x7fd8e000c080
2019-02-25 15:14:02.668 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.21:6801/1844268 -- osd_op(unknown.0.0:117 16.61 16:86e31656:::.dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0:head [call rgw.bi_get] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x2b7af10 con 0
2019-02-25 15:14:02.672 7fd8e8c04700  1 -- 172.24.9.51:0/3200411148 <== osd.29 172.24.8.21:6801/1844268 22 ==== osd_op_reply(117 .dir.17a4ce99-009e-40f2-a2d2-2afc218ebd9b.216060181.1.0 [call] v0'0 uv4662993 ondisk = 0) v8 ==== 199+0+148 (1347967863 0 1998127041) 0x7fd8e0013cb0 con 0x7fd8e000c080
2019-02-25 15:14:02.672 7fd8fa50d700  1 -- 172.24.9.51:0/3200411148 --> 172.24.8.53:6853/2229354 -- osd_op(unknown.0.0:118 18.9aa 18:5597dd5f:::17a4ce99-009e-40f2-a2d2-2afc218ebd9b.1427945.187_vtv.jpg:head [getxattrs,stat] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x2b7af10 con 0
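A rough way to quantify the backlog is to save the gc listing and count entry tags. This is only a sketch: the "tag" field name is an assumption based on Mimic-era JSON output, so check it against a real dump first (on the cluster the dump would come from radosgw-admin gc list --include-all > gc.json).

```shell
# Sketch: count gc entries in a saved listing. The sample below stands in
# for real output of: radosgw-admin gc list --include-all > gc.json
cat > gc.json <<'EOF'
[
  { "tag": "sample-tag-0001", "chain": { "objs": [ { "key": "a" } ] } },
  { "tag": "sample-tag-0002", "chain": { "objs": [] } }
]
EOF
entries=$(grep -c '"tag"' gc.json)   # one "tag" per gc entry
echo "gc entries: $entries"
```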

radosgw-admin gc process --include-all --debug-ms=1

1) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.128 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1149 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20aa8d0 con 0
2019-02-25 15:14:48.128 7f16a7a32700  1 -- 172.24.9.51:0/438005962 <== osd.4 172.24.8.21:6806/1858502 1042 ==== osd_op_reply(1149 gc.12 [call] v0'0 uv1093746 ondisk = 0) v8 ==== 149+0+6138 (3003091281 0 101532591) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.128 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1150 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20ab2e0 con 0
2019-02-25 15:14:48.128 7f16a7a32700  1 -- 172.24.9.51:0/438005962 <== osd.4 172.24.8.21:6806/1858502 1043 ==== osd_op_reply(1150 gc.12 [call] v0'0 uv1093746 ondisk = 0) v8 ==== 149+0+6138 (3003091281 0 101532591) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.128 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1151 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20aa8d0 con 0
2019-02-25 15:14:48.132 7f16a7a32700  1 -- 172.24.9.51:0/438005962 <== osd.4 172.24.8.21:6806/1858502 1044 ==== osd_op_reply(1151 gc.12 [call] v0'0 uv1093746 ondisk = 0) v8 ==== 149+0+6138 (3003091281 0 101532591) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.132 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1152 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20ab2e0 con 0
2019-02-25 15:14:48.132 7f16a7a32700  1 -- 172.24.9.51:0/438005962 <== osd.4 172.24.8.21:6806/1858502 1045 ==== osd_op_reply(1152 gc.12 [call] v0'0 uv1093746 ondisk = 0) v8 ==== 149+0+6138 (3003091281 0 101532591) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.132 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1153 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20aa8d0 con 0
2019-02-25 15:14:48.132 7f16a7a32700  1 -- 172.24.9.51:0/438005962 <== osd.4 172.24.8.21:6806/1858502 1046 ==== osd_op_reply(1153 gc.12 [call] v0'0 uv1093746 ondisk = 0) v8 ==== 149+0+6138 (3003091281 0 101532591) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.132 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1154 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20ab2e0 con 0
2019-02-25 15:14:48.136 7f16a7a32700  1 -- 172.24.9.51:0/438005962 <== osd.4 172.24.8.21:6806/1858502 1047 ==== osd_op_reply(1154 gc.12 [call] v0'0 uv1093746 ondisk = 0) v8 ==== 149+0+6138 (3003091281 0 101532591) 0x7f16a002a840 con 0x1ff0fb0
2019-02-25 15:14:48.136 7f16b9b3c700  1 -- 172.24.9.51:0/438005962 --> 172.24.8.21:6806/1858502 -- osd_op(unknown.0.0:1155 15.d5 15:ab569f81:gc::gc.12:head [call rgw.gc_list] snapc 0=[] ondisk+read+known_if_redirected e25102) v8 -- 0x20aa8d0 con 0
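One detail worth noting in the trace above: every rgw.gc_list reply for gc.12 comes back with the same version (uv1093746) and payload checksum, which suggests gc process keeps re-reading the same page instead of advancing. A sketch for spotting that pattern in a saved --debug-ms=1 log (the trimmed sample lines below stand in for the real log file):

```shell
# Sketch: count repeated reply versions (uvNNN) in a saved debug-ms log;
# one uv value repeating many times means gc_list returns the same page.
cat > rgw.log <<'EOF'
osd_op_reply(1149 gc.12 [call] v0'0 uv1093746 ondisk = 0)
osd_op_reply(1150 gc.12 [call] v0'0 uv1093746 ondisk = 0)
osd_op_reply(1151 gc.12 [call] v0'0 uv1093746 ondisk = 0)
EOF
grep -o 'uv[0-9]*' rgw.log | sort | uniq -c | sort -rn
```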

I think the two problems are related.

Thanks.


Related issues

Related to rgw - Bug #38134: rgw: `radosgw-admin bucket rm ... --purge-objects` can hang... Resolved 01/31/2019
Related to rgw - Bug #38454: rgw: gc entries with zero-length chains are not cleaned up Pending Backport 02/22/2019

History

#1 Updated by Abhishek Lekshmanan 7 months ago

Can you post the radosgw-admin gc list --include-all output? Also note another fix in this area: https://tracker.ceph.com/issues/38454

#2 Updated by Casey Bodley 7 months ago

  • Related to Bug #38134: rgw: `radosgw-admin bucket rm ... --purge-objects` can hang... added

#3 Updated by Casey Bodley 7 months ago

The bucket rm issue looks like the one we fixed in http://tracker.ceph.com/issues/38134

#4 Updated by hoan nv 7 months ago

The radosgw-admin gc list --include-all output is empty; the command just returns 0.
No gc log entries are deleted.

#5 Updated by Casey Bodley 6 months ago

  • Related to Bug #38454: rgw: gc entries with zero-length chains are not cleaned up added

#6 Updated by Casey Bodley 6 months ago

  • Status changed from New to Verified
  • Assignee set to Eric Ivancich

It looks like http://tracker.ceph.com/issues/38454 was causing infinite loops in gc process, so I think that's what you're seeing here.
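A toy model of that failure mode, going by the #38454 description (entries with zero-length chains are listed but never removed) — purely for illustration, not Ceph code:

```shell
# Toy model: each gc pass deletes an entry's tail objects and removes the
# entry, but (buggy) leaves zero-length entries in place, so repeated
# passes over the queue never drain it.
queue="2 0 3 0"              # chain lengths of queued gc entries
for pass in 1 2 3; do
  next=""
  for len in $queue; do
    if [ "$len" -gt 0 ]; then
      :                      # delete $len tail objects, drop the entry
    else
      next="$next $len"      # bug: empty-chain entry survives the pass
    fi
  done
  queue=$next
  echo "pass $pass left:$queue"
done
```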

#7 Updated by Eric Ivancich 6 months ago

@hoan nv:

Once the backport of http://tracker.ceph.com/issues/38713 is complete, would you kindly test it? This seems likely to be a duplicate.

#8 Updated by hoan nv 6 months ago

Eric Ivancich wrote:

@hoan nv:

Once the backport of http://tracker.ceph.com/issues/38713 is complete, would you kindly test it? This seems likely to be a duplicate.

Yes. I am planning to update Ceph.

Thanks.

#9 Updated by hoan nv 6 months ago

I updated my cluster to 13.2.5.

Issue https://tracker.ceph.com/issues/38134 is not the same as my issue.

Delete bucket log:

2019-03-31 22:34:36.797 7fd2164d0300 10 RGWRados::cls_bucket_list_ordered: got _multipart_multipart.mp4.2~yFVvAfSuyxKFdpkSXI2fTBKJJM5ip_2.1[]
2019-03-31 22:34:36.797 7fd2164d0300 10 RGWRados::cls_bucket_list_ordered: got _multipart_multipart.mp4.2~yFVvAfSuyxKFdpkSXI2fTBKJJM5ip_2.meta[]
2019-03-31 22:34:36.817 7fd2164d0300  0 WARNING : aborted 17 incomplete multipart uploads
2019-03-31 22:34:36.817 7fd2164d0300 -1 ERROR: unable to remove bucket(2009) Unknown error 2009
2019-03-31 22:34:36.817 7fd2164d0300 20 remove_watcher() i=0
2019-03-31 22:34:36.817 7fd2164d0300  2 removed watcher, disabling cache
