Bug #48696
OSD asserts because the number of aios is truncated
Status:
Resolved
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Description
- 1.anomalies
The OSD asserts after it reboots, as in the following log:
2020-12-15 10:32:19.008476 7fec4fc5dec0 5 bdev(0x5567f772afc0 /var/lib/ceph/osd/ceph-5/block) aio_write 0x1891e7c000~1000 aio 0x55686d70b690
2020-12-15 10:32:19.008477 7fec4fc5dec0 20 bdev(0x5567f772afc0 /var/lib/ceph/osd/ceph-5/block) aio_submit ioc 0x5567f7c02058 pending 133794 running 0
2020-12-15 10:32:19.009711 7fec4fc5dec0 -1 *** Caught signal (Segmentation fault) ** in thread 7fec4fc5dec0 thread_name:ceph-osd
- 2.reason
After the OSD reboots, it calls the function _deferred_replay, and that path then calls submit_batch. The type of the aios_size parameter is uint16_t, so if the number of aios is greater than 65535, the value is truncated and the OSD then asserts. The signature is:
int submit_batch(aio_iter begin, aio_iter end, uint16_t aios_size, void *priv, int *retries);
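The truncation can be illustrated in isolation. The following is a minimal sketch, not Ceph code: `count_seen_by_callee` and `narrowed` are hypothetical helpers that only mimic passing a larger count into a `uint16_t` parameter such as `aios_size`. The value 133794 is the `pending` count from the log above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a callee whose count parameter is uint16_t,
// like the aios_size parameter of submit_batch. Not the real function.
constexpr std::uint16_t count_seen_by_callee(std::uint16_t aios_size) {
  return aios_size;
}

// Hypothetical helper: makes explicit the narrowing conversion that
// happens when a wider count is passed to a uint16_t parameter.
// Unsigned narrowing keeps the value modulo 65536.
constexpr std::uint16_t narrowed(std::size_t pending) {
  return count_seen_by_callee(static_cast<std::uint16_t>(pending));
}

// 65535 is the largest count that survives unchanged.
static_assert(narrowed(65535) == 65535, "fits in uint16_t");
// One more wraps around to zero.
static_assert(narrowed(65536) == 0, "wraps to zero");
// The pending count from the log: 133794 - 2 * 65536 = 2722.
static_assert(narrowed(133794) == 2722, "pending count from the log");
```

Under these assumptions, a callee handed 133794 pending aios would see a count of only 2722, which no longer matches the range described by the begin and end iterators.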
- 3.steps to reproduce:
1) Change the OSD config as below and restart the OSDs:
bluestore_throttle_bytes = 67108864000
bluestore_throttle_deferred_bytes = 134217728000
bluestore_deferred_batch_ops = 64000000
bluestore_max_deferred_txc = 32000000
2) Create a pool on the OSDs.
3) Run fio against the pool for 30 seconds, then kill the OSD with `kill -9`.
4) Restart the OSDs and you will see the anomaly described in section 1.
Updated by Kefu Chai over 3 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 38709
Updated by Kefu Chai over 3 years ago
- Status changed from Fix Under Review to Resolved