Bug #65806
open
IO hangs when issuing balanced/localized reads to replica crimson osds while the pg is still peering
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Description
The IO request in the following log never completes. It is waiting for the pg to become active, but the current implementation only triggers the blocker on primary osds, so a request waiting on a replica is never released.
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]): same_interval_since: 974
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]): same_interval_since: 974
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]) start
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]).0: entering await_map stage
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]).0: entered await_map stage, waiting for map
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]).0: map epoch got 974, entering wait_for_active
DEBUG 2024-05-04 08:00:51,983 [shard 0:main] osd - pg_epoch 974 pg[3.0( v 956'503 lc 955'501 (0'0,956'503] local-lis/les=958/959 n=7 ec=15/15 lis/c=958/946 les/c/f=959/947/0 sis=974) [3,2,1] r=2 lpr=974 pi=[946,974)/1 crt=956'503 mlcod 0'0 unknown NOTIFY ClientRequest::with_pg_process: client_request(id=7763, detail=m=[osd_op(client.4240.0:7755 3.0 3:03eb76b7:::scephqa02.cpp.bjat.qianxin-inc.cn279848-2:head {sparse-read 0~3445732, omap-get-vals-by-keys in=4b, omap-get-keys in=12b, omap-get-vals in=16b, omap-get-header, getxattrs} snapc 0={} RETRY=5 ondisk+retry+read+localize_reads+known_if_redirected+supports_pool_eio e974)]).0: entered wait_for_active stage, waiting for active
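The hang described above can be illustrated with a small toy model. This is only a sketch and not Crimson code: `ToyPG`, `on_activated`, `completed_reads`, and the `wake_on_replica` flag are hypothetical names made up for illustration, and the real implementation is asynchronous rather than callback-based. A request parks itself until the pg is active; if the wake-up only runs on the primary, a read parked on a replica never completes.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Toy model only: none of these types exist in Crimson.
struct ToyPG {
    bool is_primary;
    bool active = false;
    // Requests parked in the (modelled) wait_for_active stage.
    std::vector<std::function<void()>> waiters;

    explicit ToyPG(bool primary) : is_primary(primary) {}

    // Run `op` immediately if the pg is active, otherwise park it.
    void wait_for_active(std::function<void()> op) {
        if (active) {
            op();
        } else {
            waiters.push_back(std::move(op));
        }
    }

    // Called when peering finishes. With wake_on_replica == false this
    // models the behaviour reported here: only a primary wakes its waiters.
    void on_activated(bool wake_on_replica) {
        active = true;
        if (!is_primary && !wake_on_replica) {
            return;  // parked requests stay parked forever
        }
        std::vector<std::function<void()>> pending = std::move(waiters);
        waiters.clear();
        for (auto& op : pending) {
            op();
        }
    }
};

// Park one read before activation, activate the pg, and report how many
// parked reads completed.
int completed_reads(bool is_primary, bool wake_on_replica) {
    ToyPG pg(is_primary);
    int done = 0;
    pg.wait_for_active([&done] { ++done; });
    pg.on_activated(wake_on_replica);
    return done;
}
```

In this model, `completed_reads(false, false)` leaves the replica read parked, while `completed_reads(true, false)` and `completed_reads(false, true)` both complete it. The second of these is one possible direction for a fix, not necessarily the one adopted upstream.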