Bug #52884
osd: optimize pg peering latency when add new osd that need backfill
Description
Reproduce:
(1) The Ceph cluster is not serving any client IO.
(2) The only operation performed is "ceph osd in osd.14" (adding a new OSD to the cluster).
Reason:
(1) The new OSD is empty; it holds no data.
(2) The new OSD therefore requires a full copy from the primary OSD.
(3) If the primary OSD's PG log entry count < osd_min_pg_log_entries (3000), /// this is the key point
the primary OSD concludes that the new OSD can recover via the PG log.
(4) The primary OSD sends its PG log to the new OSD.
(5) The new OSD receives the PG log and can handle it in one of two ways:
/// backfill: claim the PG log directly
/// recovery: loop over the PG log entries and merge them
(6) The recovery way (loop over the PG log and merge) has an average handling latency of 109 ms.
The backfill way (claim the PG log directly) has an average handling latency of 14 ms.
(7) So we should use backfill instead of recovery when adding a new OSD.
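The two latencies reported in step (6) can be compared with a little arithmetic. This is only a worked calculation on the 109 ms and 14 ms figures from the report; the variable names are illustrative and nothing here is measured or taken from Ceph code.

```python
# Average handling latencies quoted in the report (milliseconds).
recovery_ms = 109.0   # recovery way: loop over the PG log and merge
backfill_ms = 14.0    # backfill way: claim the PG log directly

# Ratio and absolute difference between the two paths.
speedup = recovery_ms / backfill_ms
saved_ms = recovery_ms - backfill_ms

print(f"backfill handling is about {speedup:.1f}x faster, "
      f"saving {saved_ms:.0f} ms per handled PG log")
```

On these numbers the backfill way is roughly 7.8 times faster, a difference of 95 ms each time a PG log is handled.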
Signed-off-by: Jianwei Zhang <jianwei1216@qq.com>
History
#1 Updated by jianwei zhang over 2 years ago
#2 Updated by Neha Ojha over 2 years ago
- Status changed from New to Fix Under Review
#3 Updated by Loïc Dachary over 2 years ago
- Target version deleted (v15.2.15)
#4 Updated by jianwei zhang almost 2 years ago
https://github.com/ceph/ceph/pull/46281
Added the code for the master branch.
#5 Updated by jianwei zhang almost 2 years ago
osd: optimize pg peering latency when add new osd that need backfill
set last_backfill to MIN when creating pg
This happens when a newly created PG (caused by adding or removing an OSD)
has an empty pglog, (tail, head) = (0,0),
that is still continuous with the authoritative log (0, 3000).
In this case, we use backfill instead of recovery.
If the authoritative log is (6000, 9000), that is, tail > 3000,
then the newly created PG (0,0) naturally goes straight to the backfill path,
because the two logs do not intersect.
If an OSD is offline for only a short time,
the PG on it is still continuous with the authoritative log.
In this case, the recovery path is still taken, not backfill.
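The three cases above can be sketched as a small decision function. This is a toy model, not Ceph code: LogBounds, choose_path and the fix_applied flag are hypothetical names, logs are reduced to integer (tail, head) pairs, and "continuous" is simplified to mean that the peer's head reaches the authoritative tail. The log (0, 2900) for the briefly offline OSD is an assumed example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogBounds:
    """Hypothetical stand-in for a pglog, reduced to its (tail, head) versions."""
    tail: int  # oldest version the log still covers
    head: int  # newest version in the log


def choose_path(peer: LogBounds, auth: LogBounds, fix_applied: bool = True) -> str:
    """Return "backfill" or "recovery" for a peer, under the toy rules below."""
    # No overlap with the authoritative log: recovery by log is impossible.
    if peer.head < auth.tail:
        return "backfill"
    # Proposed change: a newly created PG with an empty log (0,0) goes to
    # backfill even though it is still continuous with the authoritative log.
    if fix_applied and peer == LogBounds(0, 0):
        return "backfill"
    # Otherwise the logs are continuous and the peer can catch up by log.
    return "recovery"


new_pg = LogBounds(0, 0)
print(choose_path(new_pg, LogBounds(0, 3000), fix_applied=False))  # recovery
print(choose_path(new_pg, LogBounds(0, 3000)))                     # backfill
print(choose_path(new_pg, LogBounds(6000, 9000)))                  # backfill
print(choose_path(LogBounds(0, 2900), LogBounds(0, 3000)))         # recovery
```

Only the first two calls differ in outcome, which is the behaviour change the commit describes: an empty, newly created PG stops taking the recovery path, while a briefly offline OSD keeps it.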
There are 2 benefits:
1. Backfill has lower latency than recovery when processing the pglog during peering.
2. During choose_acting, the backfill OSD is not considered, so
when OSDs are continuously added/deleted, the number of acting set changes can be reduced.
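The second benefit can be illustrated with a toy model, assuming that an OSD whose last_backfill is MIN has backfilled nothing yet and is therefore treated as a backfill target instead of an acting candidate. The constants and function names below are hypothetical, and the real choose_acting logic weighs considerably more than this.

```python
# Hypothetical markers for the two extremes of last_backfill.
LAST_BACKFILL_MIN = "MIN"  # nothing backfilled yet (set when the PG is created)
LAST_BACKFILL_MAX = "MAX"  # backfill complete, the OSD holds a full copy


def acting_candidates(last_backfill_by_osd: dict[int, str]) -> list[int]:
    """OSDs with a complete copy; only these are considered for acting here."""
    return sorted(osd for osd, lb in last_backfill_by_osd.items()
                  if lb == LAST_BACKFILL_MAX)


def backfill_targets(last_backfill_by_osd: dict[int, str]) -> list[int]:
    """OSDs that still need backfill and are left out of the acting choice."""
    return sorted(osd for osd, lb in last_backfill_by_osd.items()
                  if lb != LAST_BACKFILL_MAX)


# osd.14 was just added; its PG was created with last_backfill = MIN.
state = {3: LAST_BACKFILL_MAX, 7: LAST_BACKFILL_MAX, 14: LAST_BACKFILL_MIN}
print(acting_candidates(state))  # [3, 7]
print(backfill_targets(state))   # [14]
```

In this model, adding or removing further empty OSDs changes only the backfill target list, so the acting candidates stay the same.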
Fix : https://tracker.ceph.com/issues/52884
Signed-off-by: Jianwei Zhang <jianwei1216@qq.com>
#6 Updated by Radoslaw Zarzynski almost 2 years ago
- Pull request ID changed from 43482 to 46281