Bug #51429


radosgw-admin bi list fails with Input/Output error

Added by Tejas C almost 3 years ago. Updated over 1 year ago.

Status: Resolved
Priority: High
Target version:
% Done: 0%
Source:
Tags:
Backport: octopus, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version 17.0.0-5278-g79eb0c85 (79eb0c853ca1ee491410e0c6c6796675a7449ef9) quincy (dev)

Steps:
- On a versioned bucket, start IO on objects 1 to 300k from both sites at the same time (the same object names from both directions; see the sketch below).
- Around 250k objects were written from the primary and 225k from the secondary.
- A bi list from either site fails with an I/O error, and a subsequent bucket reshard fails the same way.
- I don't know whether this makes a difference, but rgw_max_objs_per_shard is set to 500 on the primary and 300 on the secondary.
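
A rough reproduction sketch (not the exact tooling used; the endpoint URLs, the "test1" credentials profile and the payload file are placeholders):

```
# Write the same object names into the versioned bucket "ver1" from both sites
# concurrently. Adjust endpoints and credentials to the actual multisite setup.
for i in $(seq 1 300000); do
    aws --profile test1 --endpoint-url http://primary-rgw:8080 \
        s3api put-object --bucket ver1 --key "obj-$i" --body ./payload.bin
done
# Run the same loop at the same time against the secondary zone, e.g. with
# --endpoint-url http://secondary-rgw:8080 (optionally iterating 300000..1).
```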

/]# radosgw-admin bi list --bucket ver1 >o1
ERROR: bi_list(): (5) Input/output error

/]# radosgw-admin bucket stats --bucket ver1
{
    "bucket": "ver1",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "1785a4fa-f7d6-4081-8c72-74f9cc441d3a",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": ""
    },
    "id": "b76458b4-55bc-41b4-8610-b4aa5df49661.25021.1",
    "marker": "b76458b4-55bc-41b4-8610-b4aa5df49661.25021.1",
    "index_type": "Normal",
    "owner": "test1",
    "ver": "0#86569,1#86697,2#86661,3#86303,4#86693,5#86707,6#86397,7#86947,8#86609,9#86757,10#86708",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "0.000000",
    "creation_time": "2021-06-28T14:27:35.268133Z",
    "max_marker": "0#00000086568.173215.11,1#00000086696.173471.11,2#00000086660.173399.11,3#00000086302.172683.11,4#00000086692.173463.11,5#00000086706.173491.11,6#00000086396.172871.11,7#00000086946.173971.11,8#00000086608.173295.11,9#00000086756.173591.11,10#00000086707.173493.11",
    "usage": {
        "rgw.main": {
            "size": 5510454152,
            "size_actual": 5855453184,
            "size_utilized": 5510454152,
            "size_kb": 5381303,
            "size_kb_actual": 5718216,
            "size_kb_utilized": 5381303,
            "num_objects": 476518
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}
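
For reference, the per-shard index objects that bi list reads can be derived from the "marker" and "num_shards" values above; a minimal sketch, assuming the index pool name that shows up in the debug log below (blr.rgw.buckets.index):

```
# Count the omap keys in each bucket index shard object.
# Index shard objects are named ".dir.<bucket marker>.<shard id>".
POOL=blr.rgw.buckets.index                            # assumption: this zone's index pool
MARKER=b76458b4-55bc-41b4-8610-b4aa5df49661.25021.1   # "marker" from bucket stats
NUM_SHARDS=11                                         # "num_shards" from bucket stats
for i in $(seq 0 $((NUM_SHARDS - 1))); do
    echo "shard $i: $(rados -p "$POOL" listomapkeys ".dir.${MARKER}.${i}" | wc -l) keys"
done
```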

Command with debug rgw:

/]# radosgw-admin bi list --bucket ver1 >o1 --debug_rgw=20
2021-06-29T13:14:27.795+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.799+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=46
2021-06-29T13:14:27.799+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.799+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=105
2021-06-29T13:14:27.799+0000 7f4dbb042340 20 realm india 6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7
2021-06-29T13:14:27.799+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.800+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=105
2021-06-29T13:14:27.800+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.801+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=10
2021-06-29T13:14:27.801+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.801+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=1093
2021-06-29T13:14:27.801+0000 7f4dbb042340 20 current period 253b44c3-aa4d-441e-bbb9-0e6c9e6f73a6
2021-06-29T13:14:27.801+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.803+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.803+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.803+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=46
2021-06-29T13:14:27.803+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.804+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=105
2021-06-29T13:14:27.804+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.805+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.253b44c3-aa4d-441e-bbb9-0e6c9e6f73a6.2
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got zonegroups_names.south
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got realms_names.india
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got period_config.6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got realms.6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.253b44c3-aa4d-441e-bbb9-0e6c9e6f73a6.latest_epoch
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got realms.6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7.control
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got default.realm
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7:staging
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got zone_info.b76458b4-55bc-41b4-8610-b4aa5df49661
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got default.zone.6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got zone_names.blr
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got default.zonegroup.6d34d4f8-671c-4cf0-a6b7-48e1fa21fde7
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.253b44c3-aa4d-441e-bbb9-0e6c9e6f73a6.1
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.bf3e33fa-e935-4a68-8234-2912c6719db1.1
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got zonegroup_info.1785a4fa-f7d6-4081-8c72-74f9cc441d3a
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.253b44c3-aa4d-441e-bbb9-0e6c9e6f73a6.3
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.bf3e33fa-e935-4a68-8234-2912c6719db1.latest_epoch
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 RGWRados::pool_iterate: got periods.253b44c3-aa4d-441e-bbb9-0e6c9e6f73a6.4
2021-06-29T13:14:27.844+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.845+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.845+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.845+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.845+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.846+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.846+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.846+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=46
2021-06-29T13:14:27.846+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.846+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=105
2021-06-29T13:14:27.846+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=46
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=610
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 period zonegroup init ret 0
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 period zonegroup name south
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 using current period zonegroup south
2021-06-29T13:14:27.847+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.848+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=46
2021-06-29T13:14:27.848+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.848+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=105
2021-06-29T13:14:27.848+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.848+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=46
2021-06-29T13:14:27.848+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.849+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=829
2021-06-29T13:14:27.849+0000 7f4dbb042340 10 Cannot find current period zone using local zone
2021-06-29T13:14:27.849+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.849+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=829
2021-06-29T13:14:27.849+0000 7f4dbb042340 20 zone blr found
2021-06-29T13:14:27.849+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 init(): pipe_map (zone=b6caae11-7796-4265-8133-374bf1992f5a bucket=*): adding potential pipe: {id=all,s={b=*,z=},d={b=*,z=}}
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 reflect(): flow manager (bucket=): adding source pipe: {s={b=*,z=b76458b4-55bc-41b4-8610-b4aa5df49661,az=0},d={b=*,z=b6caae11-7796-4265-8133-374bf1992f5a,az=0}}
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 reflect(): flow manager (bucket=): adding dest pipe: {s={b=*,z=b6caae11-7796-4265-8133-374bf1992f5a,az=0},d={b=*,z=b76458b4-55bc-41b4-8610-b4aa5df49661,az=0}}
2021-06-29T13:14:27.850+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 rados_obj.operate() r=-2 bl.length=0
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 init(): pipe_map (zone=b76458b4-55bc-41b4-8610-b4aa5df49661 bucket=*): adding potential pipe: {id=all,s={b=*,z=},d={b=*,z=}}
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 reflect(): flow manager (bucket=): adding source pipe: {s={b=*,z=b6caae11-7796-4265-8133-374bf1992f5a,az=0},d={b=*,z=b76458b4-55bc-41b4-8610-b4aa5df49661,az=0}}
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 reflect(): flow manager (bucket=): adding dest pipe: {s={b=*,z=b76458b4-55bc-41b4-8610-b4aa5df49661,az=0},d={b=*,z=b6caae11-7796-4265-8133-374bf1992f5a,az=0}}
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 reflect(): flow manager (bucket=): adding source pipe: {s={b=*,z=b6caae11-7796-4265-8133-374bf1992f5a,az=0},d={b=*,z=b76458b4-55bc-41b4-8610-b4aa5df49661,az=0}}
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 reflect(): flow manager (bucket=): adding dest pipe: {s={b=*,z=b76458b4-55bc-41b4-8610-b4aa5df49661,az=0},d={b=*,z=b6caae11-7796-4265-8133-374bf1992f5a,az=0}}
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 started sync module instance, tier type =
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 generating connection object for zone pune id b6caae11-7796-4265-8133-374bf1992f5a
2021-06-29T13:14:27.851+0000 7f4dbb042340 20 started zone id=b76458b4-55bc-41b4-8610-b4aa5df49661 (name=blr) with tier type =
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=0
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=1
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=2
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=3
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=4
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=5
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=6
2021-06-29T13:14:27.900+0000 7f4dbb042340 20 add_watcher() i=7
2021-06-29T13:14:27.900+0000 7f4dbb042340 2 all 8 watchers are set, enabling cache
2021-06-29T13:14:27.901+0000 7f4d857e2700 2 rgw data changes log: RGWDataChangesLog::ChangesRenewThread: start
2021-06-29T13:14:27.902+0000 7f4dbb042340 20 check_secure_mon_conn(): auth registy supported: methods=[2,1] modes=[2,1]
2021-06-29T13:14:27.902+0000 7f4dbb042340 20 check_secure_mon_conn(): method 1 is insecure
2021-06-29T13:14:27.910+0000 7f4dbb042340 5 note: GC not initialized
2021-06-29T13:14:27.913+0000 7f4cf3fff700 20 reqs_thread_entry: start
2021-06-29T13:14:27.913+0000 7f4cf1ffb700 20 reqs_thread_entry: start
2021-06-29T13:14:27.913+0000 7f4dbb042340 20 init_complete bucket index max shards: 11
2021-06-29T13:14:27.914+0000 7f4dbb042340 10 rgw notify: Started notification manager with: 1 workers
2021-06-29T13:14:27.914+0000 7f4cdffff700 20 rgw notify: INFO: next queues processing will happen at: Tue Jun 29 13:14:58 2021

2021-06-29T13:14:27.914+0000 7f4cdf7fe700 20 reqs_thread_entry: start
2021-06-29T13:14:27.914+0000 7f4dbb042340 20 get_system_obj_state: rctx=0x7fffc5ef8430 obj=blr.rgw.meta:root:ver1 state=0x55768941cec0 s->prefetch_data=0
2021-06-29T13:14:27.914+0000 7f4dbb042340 20 get_system_obj_state: s->obj_tag was set empty
2021-06-29T13:14:27.914+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.915+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=170
2021-06-29T13:14:27.915+0000 7f4dbb042340 20 get_system_obj_state: rctx=0x7fffc5ef8430 obj=blr.rgw.meta:root:.bucket.meta.ver1:b76458b4-55bc-41b4-8610-b4aa5df49661.25021.1 state=0x5576893f29f0 s->prefetch_data=0
2021-06-29T13:14:27.915+0000 7f4dbb042340 20 get_system_obj_state: s->obj_tag was set empty
2021-06-29T13:14:27.915+0000 7f4dbb042340 20 Read xattr: user.rgw.acl
2021-06-29T13:14:27.915+0000 7f4dbb042340 20 Read xattr: user.rgw.idtag
2021-06-29T13:14:27.915+0000 7f4dbb042340 20 rados->read ofs=0 len=0
2021-06-29T13:14:27.916+0000 7f4dbb042340 20 rados_obj.operate() r=0 bl.length=355
2021-06-29T13:14:27.916+0000 7f4dbb042340 20 couldn't put binfo cache entry, might have raced with data changes
2021-06-29T13:14:27.916+0000 7f4dbb042340 20 bucket index object: blr.rgw.buckets.index:.dir.b76458b4-55bc-41b4-8610-b4aa5df49661.25021.1.0
ERROR: bi_list(): (5) Input/output error
2021-06-29T13:14:32.468+0000 7f4dbb042340 20 remove_watcher() i=0
2021-06-29T13:14:32.468+0000 7f4dbb042340 2 removed watcher, disabling cache
2021-06-29T13:14:32.470+0000 7f4dbb042340 20 remove_watcher() i=1
2021-06-29T13:14:32.471+0000 7f4dbb042340 20 remove_watcher() i=2
2021-06-29T13:14:32.473+0000 7f4dbb042340 20 remove_watcher() i=3
2021-06-29T13:14:32.475+0000 7f4dbb042340 20 remove_watcher() i=4
2021-06-29T13:14:32.476+0000 7f4dbb042340 20 remove_watcher() i=5
2021-06-29T13:14:32.478+0000 7f4dbb042340 20 remove_watcher() i=6
2021-06-29T13:14:32.480+0000 7f4dbb042340 20 remove_watcher() i=7
2021-06-29T13:14:32.481+0000 7f4d857e2700 2 rgw data changes log: RGWDataChangesLog::ChangesRenewThread: start


Related issues (4): 0 open, 4 closed

Related to rgw - Bug #53856: rgw: fix bucket index list minor calculation bug (Resolved, J. Eric Ivancich)

Has duplicate rgw - Bug #51732: rgw: `radosgw-admin bi list ...` can result in an I/O Error (Duplicate, J. Eric Ivancich)

Copied to rgw - Backport #52009: pacific: radosgw-admin bi list fails with Input/Output error (Resolved, Dan van der Ster)
Copied to rgw - Backport #52010: octopus: radosgw-admin bi list fails with Input/Output error (Resolved, Dan van der Ster)
Actions #1

Updated by Casey Bodley almost 3 years ago

  • Tags set to multisite-reshard
Actions #2

Updated by J. Eric Ivancich over 2 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to J. Eric Ivancich
  • Tags deleted (multisite-reshard)
  • Backport set to octopus, pacific
  • Pull request ID set to 42404
Actions #3

Updated by J. Eric Ivancich over 2 years ago

  • Has duplicate Bug #51732: rgw: `radosgw-admin bi list ...` can result in an I/O Error added
Actions #4

Updated by J. Eric Ivancich over 2 years ago

  • Priority changed from Normal to High
Actions #5

Updated by Casey Bodley over 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Backport Bot over 2 years ago

  • Copied to Backport #52009: pacific: radosgw-admin bi list fails with Input/Output error added
Actions #7

Updated by Backport Bot over 2 years ago

  • Copied to Backport #52010: octopus: radosgw-admin bi list fails with Input/Output error added
Actions #8

Updated by Benoît Knecht over 2 years ago

I'm running Ceph Nautilus 14.2.22, and I have a bucket that needs to be resharded (dynamic resharding is enabled):

```
[root@gp3-c2mon-01 ~]# radosgw-admin reshard list
[
    {
        "time": "2021-08-26 08:02:37.065125Z",
        "tenant": "test",
        "bucket_name": "backup",
        "bucket_id": "48b537f5-42a9-474b-af75-63f7ff290e4f.82983516.2",
        "new_instance_id": "",
        "old_num_shards": 256,
        "new_num_shards": 663
    }
]
```

However, it hasn't been resharded for several days now, and two of the three RadosGW instances keep logging

```
RGWReshardLock::lock found lock on reshard.0000000006 to be held by another RGW process; skipping for now
```

so it looks like the first RadosGW is stuck. Looking into the logs, I found these on all three RadosGW instances:

```
Aug 26 05:03:49 mon-01 radosgw6481: 2021-08-26 05:03:49.610 7fc27ddb9700 -1 WARNING: The bucket info cache is inconsistent. This is a failure that should be debugged. I am a nice machine, so I will try to recover.
Aug 26 05:03:49 mon-01 radosgw6481: 2021-08-26 05:03:49.610 7fc27ddb9700 -1 WARNING: The OSD has the same version I have. Something may have gone squirrelly. An administrator may have forced a change; otherwise there is a problem somewhere.
Aug 26 05:13:49 mon-01 radosgw6481: 2021-08-26 05:13:49.635 7fc2de67a700 -1 ERROR: bi_list(): (5) Input/output error
Aug 26 05:23:49 mon-01 radosgw6481: 2021-08-26 05:23:49.769 7fc2de67a700 -1 ERROR: bi_list(): (5) Input/output error
Aug 26 05:33:49 mon-01 radosgw6481: 2021-08-26 05:33:49.696 7fc2de67a700 -1 ERROR: bi_list(): (5) Input/output error
Aug 26 05:43:50 mon-01 radosgw6481: 2021-08-26 05:43:50.174 7fc2de67a700 -1 ERROR: bi_list(): (5) Input/output error
Aug 26 06:03:50 mon-01 radosgw6481: 2021-08-26 06:03:50.188 7fc2de67a700 -1 ERROR: bi_list(): (5) Input/output error
```

To me, it looks like I'm hitting this bug, and that it's what is preventing resharding, since resharding needs `bi_list()` to succeed (and hopefully it's also what's causing the inconsistent cache).

I would like to try and reproduce this issue on another cluster (with a smaller bucket), but I'm not sure what kind of entries are causing this in the first place. Any advice? I should mention that versioning is enabled on the bucket that is exhibiting this issue, in case it makes a difference.

Also, given that Nautilus is EOL and won't be getting a fix for this issue, is there any workaround I could try?
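
For what it's worth, the reshard queue itself can at least be inspected, and the stuck entry dropped, with the commands below (only a diagnostic sketch, not a fix for the underlying bi_list() error; whether cancelling helps at all depends on the bug):

```
# Inspect and, if necessary, clear the stuck reshard entry so the other RadosGW
# instances stop skipping the lock. Bucket/tenant are the ones from `reshard list`.
radosgw-admin reshard status --bucket backup --tenant test
radosgw-admin reshard cancel --bucket backup --tenant test
radosgw-admin reshard list
```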

Actions #9

Updated by Vidushi Mishra over 2 years ago

Issue not seen on ceph version 17.0.0-8051-g15b54dc9

Actions #10

Updated by Justin Goetz over 2 years ago

We're seeing the same issue in Octopus, on 15.2.14:

2021-12-13T00:37:25.551+0000 7faba2f4d700 -1 WARNING: The bucket info cache is inconsistent. This is a failure that should be debugged. I am a nice machine, so I will try to recover.
2021-12-13T00:37:25.551+0000 7fab8a71c700 -1 WARNING: The OSD has the same version I have. Something may have gone squirrelly. An administrator may have forced a change; otherwise there is a problem somewhere.
7fac0dffb700 -1 ERROR: bi_list(): (5) Input/output error

My resharding processes have been stuck for quite a while now as well, I assume due to the bi_list bug.

Has anyone been able to find a workaround on Octopus by chance?

Actions #11

Updated by Gilles Mocellin over 2 years ago

We also have this problem with Octopus 15.2.15.
No resharding has happened since then, and the LARGE OMAP objects warnings will not go away.

It really does seem that bi_list() is the culprit.

Here, I can easily trigger it with:

root -> radosgw-admin bi list --uid XXX --tenant XXX --bucket mybigbucket >/dev/null
ERROR: bi_list(): (5) Input/output error
Actions #12

Updated by J. Eric Ivancich over 2 years ago

  • Related to Bug #53856: rgw: fix bucket index list minor calculation bug added
Actions #13

Updated by Pawel Stefanski about 2 years ago

Tried the Pacific backport found here: https://github.com/ceph/ceph/pull/44815
but it still fails with "ERROR: bi_list(): (5) Input/output error"

2022-02-04T09:05:23.856+0000 7fa88f263200 20  bucket index oid: default.rgw.buckets.index:.dir.aab60bea-af65-4343-a5e8-ac14aa8ade17.6370.1
ERROR: bi_list(): (5) Input/output error

and

rados -p default.rgw.buckets.index listomapkeys .dir.aab60bea-af65-4343-a5e8-ac14aa8ade17.6370.1 | hexdump
0000000 0000 0000 0000 c104 0a2d
000000a

ceph version 16.2.7-314-g57584c702b7
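
A rough way to check whether an index object holds such keys from the command line (a heuristic sketch only; a key that itself contains a newline byte would throw off the line-based count):

```
# Count omap keys in the bucket index object that contain non-printable bytes,
# which is presumably what bi list fails to decode. Object name as in the log above.
rados -p default.rgw.buckets.index listomapkeys \
    .dir.aab60bea-af65-4343-a5e8-ac14aa8ade17.6370.1 \
  | LC_ALL=C grep -a -c '[^[:print:]]'
```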

Actions #14

Updated by J. Eric Ivancich about 2 years ago

Pawel Stefanski wrote:

Tried the Pacific backport found here: https://github.com/ceph/ceph/pull/44815
but it still fails with "ERROR: bi_list(): (5) Input/output error"

Thanks for reporting this!! Are you able to tell me what entries you put in the bucket and how many shards there are so I can reproduce?

Actions #15

Updated by Justin Goetz about 2 years ago

We attempted to apply the backport as well (although on Octopus, 15.2.14), and the issue still appeared to be present.

During the test, we observed some interesting key points:

  • We've only been able to replicate this error on our versioned (immutable) buckets.
  • For our test, we created 4 buckets with versioning enabled, and began to fill them with a large volume of data quickly.
  • When the buckets grew and RGW began to dynamically reshard them, the bi_list() error began, and resharding hung indefinitely.
  • Repeating the above test with non-versioned buckets does not appear to invoke the bi_list() error.

J. Eric Ivancich wrote:

Pawel Stefanski wrote:

Tried the Pacific backport found here: https://github.com/ceph/ceph/pull/44815
but it still fails with "ERROR: bi_list(): (5) Input/output error"

Thanks for reporting this!! Are you able to tell me what entries you put in the bucket and how many shards there are so I can reproduce?

For our tests we were generating random data; however, we've also seen this behavior in the real world with Veeam object data. All of our buckets appear to have started out with the default 11 shards. The bi_list() errors began showing up in the log when resharding started (we found this bug when our reshard processes were hanging, and we narrowed the cause down to the bi_list() errors).

Please do let me know if there is more-specific data you'd need that might help pin this down further, or if we can be of any assistance with testing further. Thanks!

Actions #16

Updated by Pawel Stefanski about 2 years ago

J. Eric Ivancich wrote:

Pawel Stefanski wrote:

Tried the Pacific backport found here: https://github.com/ceph/ceph/pull/44815
but it still fails with "ERROR: bi_list(): (5) Input/output error"

Thanks for reporting this!! Are you able to tell me what entries you put in the bucket and how many shards there are so I can reproduce?

Unfortunately I can't say what's inside; most probably just some ordinary PDF files. So I'm very surprised about those keys. I've tried to rebuild the bucket index, but after 24 hours I lost hope.

The metadata seems simple: there is a default placement rule and no sharding on this bucket.

radosgw-admin metadata get bucket.instance:401284785f899fc2adcb4785f899fc2adcb/aklla:aab60bea-af65-4343-a5e8-ac14aa8ade17.6370.1 | grep shard
"num_shards": 0,
"bi_shard_hash_type": 0,
"reshard_status": 0,

Actions #17

Updated by Dan van der Ster about 2 years ago

After updating to 15.2.15 we now have at least one bucket which we cannot list (the user is trying to rclone sync the whole bucket, but it gets stuck as I described in https://tracker.ceph.com/issues/51462#note-7).
While debugging I also noticed I get "ERROR: bi_list(): (5) Input/output error" like in this ticket.
I've compiled the related PRs for octopus here: https://github.com/ceph/ceph/pull/45088
If it looks complete and safe to the dev we'll try it on our cluster.

Actions #18

Updated by Dan van der Ster about 2 years ago

Dan van der Ster wrote:

After updating to 15.2.15 we now have at least one bucket which we cannot list (the user is trying to rclone sync the whole bucket, but it gets stuck as I described in https://tracker.ceph.com/issues/51462#note-7).
While debugging I also noticed I get "ERROR: bi_list(): (5) Input/output error" like in this ticket.
I've compiled the related PRs for octopus here: https://github.com/ceph/ceph/pull/45088
If it looks complete and safe to the dev we'll try it on our cluster.

https://github.com/ceph/ceph/pull/45088 fixes the listing issues on our octopus cluster, yay!
The pacific equivalent is at https://github.com/ceph/ceph/pull/45087, if someone would like to try.

Actions #19

Updated by Pawel Stefanski about 2 years ago

Dan van der Ster wrote:

Dan van der Ster wrote:

After updating to 15.2.15 we now have at least one bucket which we cannot list (the user is trying to rclone sync the whole bucket, but it gets stuck as I described in https://tracker.ceph.com/issues/51462#note-7).
While debugging I also noticed I get "ERROR: bi_list(): (5) Input/output error" like in this ticket.
I've compiled the related PRs for octopus here: https://github.com/ceph/ceph/pull/45088
If it looks complete and safe to the dev we'll try it on our cluster.

https://github.com/ceph/ceph/pull/45088 fixes the listing issues on our octopus cluster, yay!
The pacific equivalent is at https://github.com/ceph/ceph/pull/45087, if someone would like to try.

Thank you for this PR; I tried it on our Pacific case, but unfortunately it doesn't work.

Log:
https://paste.opendev.org/show/bau4ghMIz7VqpM93ceaB/

So this case is not covered yet.

Actions #20

Updated by Pawel Stefanski about 2 years ago

Pawel Stefanski wrote:

Dan van der Ster wrote:
The pacific equivalent is at https://github.com/ceph/ceph/pull/45087, if someone would like to try.

Thank you for this PR; I tried it on our Pacific case, but unfortunately it doesn't work.

Log:
https://paste.opendev.org/show/bau4ghMIz7VqpM93ceaB/

So this case is not covered yet.

Found this in OSD log

2022-03-16T12:37:37.992+0000 7fd7be90f700 0 <cls> /build/ceph-16.2.7/src/cls/rgw/cls_rgw.cc:2520: ERROR: list_plain_entries(): failed to decode buffer
2022-03-16T12:37:37.992+0000 7fd7be90f700 0 <cls> /build/ceph-16.2.7/src/cls/rgw/cls_rgw.cc:2789: ERROR: rgw_bi_list_op(): list_plain_entries returned ret=-5

Should I use a patched OSD as well?

Actions #21

Updated by Dan van der Ster about 2 years ago

Pawel Stefanski wrote:

Pawel Stefanski wrote:

Dan van der Ster wrote:
The pacific equivalent is at https://github.com/ceph/ceph/pull/45087, if someone would like to try.

Thank you for this PR; I tried it on our Pacific case, but unfortunately it doesn't work.

Log:
https://paste.opendev.org/show/bau4ghMIz7VqpM93ceaB/

So this case is not covered yet.

Found this in OSD log

2022-03-16T12:37:37.992+0000 7fd7be90f700 0 <cls> /build/ceph-16.2.7/src/cls/rgw/cls_rgw.cc:2520: ERROR: list_plain_entries(): failed to decode buffer
2022-03-16T12:37:37.992+0000 7fd7be90f700 0 <cls> /build/ceph-16.2.7/src/cls/rgw/cls_rgw.cc:2789: ERROR: rgw_bi_list_op(): list_plain_entries returned ret=-5

Should I use a patched OSD as well?

Yes, in Ceph Octopus and later the bucket listing is mostly handled by the OSD (in cls_rgw).
So you need to upgrade both OSDs and RGWs to test this.
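
A quick way to confirm both daemon types are actually running the patched build before re-testing (a sketch; assumes admin access to the cluster):

```
# Show the ceph version reported by each daemon type, then spot-check one OSD.
ceph versions
ceph tell osd.0 version
```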

Actions #22

Updated by Pawel Stefanski about 2 years ago

Dan van der Ster wrote:

Pawel Stefanski wrote:

Should I use a patched OSD as well?

Yes, in Ceph Octopus and later the bucket listing is mostly handled by the OSD (in cls_rgw).
So you need to upgrade both OSDs and RGWs to test this.

Got it, this actually does look like the octopus concept: a distributed "brain" spread around the whole creature's body. :-)

Actions #23

Updated by Pawel Stefanski about 2 years ago

With the OSDs patched with everything from https://github.com/ceph/ceph/pull/45087, it is still an Input/Output error.

OSD log:

2022-03-16T17:13:23.913+0000 7f6ce74e9700  0 <cls> /usr/src/ceph/src/cls/rgw/cls_rgw.cc:2597: ERROR: list_plain_entries_help: failed to decode buffer for plain bucket index entry "\u0000\u0000\u0000\u0000\nw<A2><CC>-" 
2022-03-16T17:13:23.913+0000 7f6ce74e9700  0 <cls> /usr/src/ceph/src/cls/rgw/cls_rgw.cc:2934: ERROR: rgw_bi_list_op: list_plain_entries (low) returned ret=-5, marker="", filter="", max=1000
2022-03-16T17:13:36.101+0000 7f6ce74e9700  0 <cls> /usr/src/ceph/src/cls/rgw/cls_rgw.cc:2597: ERROR: list_plain_entries_help: failed to decode buffer for plain bucket index entry "\u0000\u0000\u0000\u0000\nw<A2><CC>-" 
2022-03-16T17:13:36.101+0000 7f6ce74e9700  0 <cls> /usr/src/ceph/src/cls/rgw/cls_rgw.cc:2934: ERROR: rgw_bi_list_op: list_plain_entries (low) returned ret=-5, marker="", filter="", max=1000

Actions #24

Updated by J. Eric Ivancich about 2 years ago

Dan van der Ster wrote:

Pawel Stefanski wrote:

Should I use a patched OSD as well?

Yes, in Ceph Octopus and later the bucket listing is mostly handled by the OSD (in cls_rgw).
So you need to upgrade both OSDs and RGWs to test this.

So the filtering was moved to the OSD, to reduce the number of entries sent from the OSDs to the RGW for processing (i.e., reducing network traffic).

Eric

Actions #25

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #26

Updated by Konstantin Shalygin over 1 year ago

  • Status changed from Pending Backport to Resolved
  • Tags deleted (backport_processed)
