Bug #50466

closed

_delete_some additional unexpected onode list

Added by Dan van der Ster about 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
nautilus, octopus, pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After updating to 14.2.19 and then moving some PGs around, we have a few warnings related to the new efficient PG removal code, e.g.

/var/log/ceph/ceph-osd.792.log:2021-04-14 20:34:34.353 7fb2439d4700  0
osd.792 pg_epoch: 40906 pg[10.14b2s0( v 40734'290069
(33782'287000,40734'290069] lb MIN (bitwise) local-lis/les=33990/33991
n=36272 ec=4951/4937 lis/c 33990/33716 les/c/f 33991/33747/0
40813/40813/37166) [933,626,260,804,503,491]p933(0) r=-1 lpr=40813
DELETING pi=[33716,40813)/4 crt=40734'290069 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[0#10:4d280000::::head#]

/var/log/ceph/ceph-osd.851.log:2021-04-14 18:40:13.312 7fd87bded700  0
osd.851 pg_epoch: 40671 pg[10.133fs5( v 40662'288967
(33782'285900,40662'288967] lb MIN (bitwise) local-lis/les=33786/33787
n=13 ec=4947/4937 lis/c 40498/33714 les/c/f 40499/33747/0
40670/40670/33432) [859,199,913,329,439,79]p859(0) r=-1 lpr=40670
DELETING pi=[33714,40670)/4 crt=40662'288967 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[5#10:fcc80000::::head#]

/var/log/ceph/ceph-osd.851.log:2021-04-14 20:58:14.393 7fd87adeb700  0
osd.851 pg_epoch: 40906 pg[10.2e8s3( v 40610'288991
(33782'285900,40610'288991] lb MIN (bitwise) local-lis/les=33786/33787
n=161220 ec=4937/4937 lis/c 39826/33716 les/c/f 39827/33747/0
40617/40617/39225) [717,933,727,792,607,129]p717(0) r=-1 lpr=40617
DELETING pi=[33716,40617)/3 crt=40610'288991 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[3#10:17400000::::head#]

/var/log/ceph/ceph-osd.883.log:2021-04-14 18:55:16.822 7f78c485d700  0
osd.883 pg_epoch: 40857 pg[7.d4( v 40804'9911289
(35835'9908201,40804'9911289] lb MIN (bitwise)
local-lis/les=40782/40783 n=195 ec=2063/1989 lis/c 40782/40782 les/c/f
40783/40844/0 40781/40845/40845) [877,870,894] r=-1 lpr=40845 DELETING
pi=[40782,40845)/1 crt=40804'9911289 lcod 40804'9911288 unknown NOTIFY
mbc={}] _delete_some additional unexpected onode list (new onodes has
appeared since PG removal started[#7:2b000000::::head#]

This can also lead to flapping if the OSD is particularly slow, e.g.

2021-04-21 15:23:37.003 7f51c273c700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f51a3e81700' had timed out after 15
2021-04-21 15:23:41.595 7f51a3e81700  0
bluestore(/var/lib/ceph/osd/ceph-941) log_latency_fn slow operation
observed for _collection_list, latency = 67.7234s, lat = 67s cid
=10.14aes4_head start GHMAX end GHMAX max 30
2021-04-21 15:23:41.595 7f51a3e81700  0 osd.941 pg_epoch: 43004
pg[10.14aes4( v 42754'296580 (40999'293500,42754'296580] lb MIN
(bitwise) local-lis/les=41331/41332 n=159058 ec=4951/4937 lis/c
41331/41331 les/c/f 41332/42758/0 41330/42759/33461)
[171,903,106,27,395,773]p171(0) r=-1 lpr=42759 DELETING
pi=[41331,42759)/1 crt=42754'296580 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[4#10:75280000::::head#]
2021-04-21 15:23:50.061 7f51a3e81700  0
bluestore(/var/lib/ceph/osd/ceph-941) log_latency slow operation
observed for submit_transact, latency = 8.46584s
2021-04-21 15:23:50.062 7f51a3e81700  1 heartbeat_map reset_timeout
'OSD::osd_op_tp thread 0x7f51a3e81700' had timed out after 15
2021-04-21 15:23:50.115 7f51b6ca1700  0
bluestore(/var/lib/ceph/osd/ceph-941) log_latency_fn slow operation
observed for _txc_committed_kv, latency = 8.51916s, txc =
0x5573928a7340
2021-04-21 15:23:50.473 7f51b2498700  0 log_channel(cluster) log [WRN]
: Monitor daemon marked osd.941 down, but it is still running

A workaround to avoid the flapping is to increase osd_heartbeat_grace.
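For reference, a sketch of that workaround using the runtime config interface (the value 60 is illustrative; the appropriate grace depends on how slow the affected OSDs are):

```shell
# Raise the heartbeat grace period so slow OSDs are not marked down
# by the monitors while a large PG removal is in flight.
# 60 is an example value, not a recommendation.
ceph config set osd osd_heartbeat_grace 60

# Verify the running value:
ceph config get osd osd_heartbeat_grace
```

Note this trades faster failure detection for tolerance of the slow collection listing, so it is best reverted once the removals finish.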

bluefs_buffered_io is enabled on the above cluster. I suspect it is not helping much in this case because the collection list is not fully cached by the end of a PG removal.


Related issues 4 (1 open, 3 closed)

Related to bluestore - Bug #50297: long osd online compaction: mon wrongly mark osd down (New)
Copied to RADOS - Backport #50704: nautilus: _delete_some additional unexpected onode list (Resolved, Neha Ojha)
Copied to RADOS - Backport #50705: octopus: _delete_some additional unexpected onode list (Resolved, Cory Snyder)
Copied to RADOS - Backport #50706: pacific: _delete_some additional unexpected onode list (Resolved, Neha Ojha)
Actions #1

Updated by Dan van der Ster about 3 years ago

osd log with debugging:

ceph-post-file: 09094430-abdb-4248-812c-47b7babae06c

Actions #2

Updated by Konstantin Shalygin about 3 years ago

  • Source set to Community (user)
  • Affected Versions v14.2.19, v14.2.20 added
Actions #3

Updated by Neha Ojha about 3 years ago

I think #178:ec000000::::head# is just a pgmeta object, which is skipped until all other objects are removed. Note how _delete_some only reports 3 objects while there are 4 in the list below. I'm not entirely sure why we wait until the end, but I don't think this behavior is new (based on a0f48db5cbfc48761eeac0daa259fecbaa573969). Maybe this is not really an unexpected situation, Igor? How this can lead to flapping is not clear to me, but we did change a delete sleep value with the original patch.

2021-04-15 10:30:03.657 7fc550eb9700 20 osd.0 pg_epoch: 364385 pg[178.37( v 330198'282 (0'0,330198'282] lb MIN (bitwise) local-lis/les=364264/364265 n=3 ec=167514/167514 lis/c 364381/364264 les/c/f 364382/364265/0 364380/364384/360316) [128,16,34] r=-1 lpr=364384 DELETING pi=[364264,364384)/1 crt=330198'282 unknown NOTIFY mbc={}] _delete_some [#178:ec000000::::head#,#178:ec213ad1:::5001080715@cnsdev.9822087.000000000000007a:head#,#178:ec54a3f5:::5001080720@cnsdev.9822317.0000000000000074:head#,#178:ec84aff7:::5001080716@cnsdev.9822117.0000000000000042:head#]
2021-04-15 10:30:03.657 7fc550eb9700 20 osd.0 pg_epoch: 364385 pg[178.37( v 330198'282 (0'0,330198'282] lb MIN (bitwise) local-lis/les=364264/364265 n=3 ec=167514/167514 lis/c 364381/364264 les/c/f 364382/364265/0 364380/364384/360316) [128,16,34] r=-1 lpr=364384 DELETING pi=[364264,364384)/1 crt=330198'282 unknown NOTIFY mbc={}] _delete_some deleting 3 objects
...
2021-04-15 10:30:04.768 7fc550eb9700 20 osd.0 pg_epoch: 364385 pg[178.37( v 330198'282 (0'0,330198'282] lb MIN (bitwise) local-lis/les=364264/364265 n=3 ec=167514/167514 lis/c 364381/364264 les/c/f 364382/364265/0 364380/364384/360316) [128,16,34] r=-1 lpr=364384 DELETING pi=[364264,364384)/1 crt=330198'282 unknown NOTIFY mbc={}] _delete_some []

2021-04-15 10:30:04.768 7fc550eb9700  0 osd.0 pg_epoch: 364385 pg[178.37( v 330198'282 (0'0,330198'282] lb MIN (bitwise) local-lis/les=364264/364265 n=3 ec=167514/167514 lis/c 364381/364264 les/c/f 364382/364265/0 364380/364384/360316) [128,16,34] r=-1 lpr=364384 DELETING pi=[364264,364384)/1 crt=330198'282 unknown NOTIFY mbc={}] _delete_some additional unexpected onode list (new onodes has appeared since PG removal started[#178:ec000000::::head#]
..
2021-04-15 10:30:04.770 7fc550eb9700 10 bluestore(/var/lib/ceph/osd/ceph-0) _remove_collection oid #178:ec000000::::head#
2021-04-15 10:30:04.770 7fc550eb9700 10 bluestore(/var/lib/ceph/osd/ceph-0) _remove_collection 178.37_head = 0
Actions #5

Updated by Konstantin Shalygin about 3 years ago

  • Backport set to nautilus, octopus, pacific
Actions #6

Updated by Konstantin Shalygin about 3 years ago

  • Pull request ID set to 40993
Actions #7

Updated by Konstantin Shalygin about 3 years ago

Actually, when PG objects are deleted with sleep 1 (the default), the NVMe is not loaded; but when the pg header is deleted, there is a huge spike in client op latency and a possibility of being marked down by the mon.

2021-04-23 11:37:23.300 7f4254dc1700  0 osd.656 pg_epoch: 272109 pg[17.68a( v 270607'95012848 (270592'95009764,270607'95012848] lb MIN (bitwise) local-lis/les=269692/269693 n=3487424 ec=51990/51987 lis/c 271642/271615 les/c/f 271643/271616/0 271644/272093/270608) [775,702,708] r=-1 lpr=272093 DELETING pi=[269692,272093)/3 crt=270607'95012848 lcod 0'0 unknown NOTIFY mbc={}] _delete_some additional unexpected onode list (new onodes has appeared since PG removal started[#17:51600000::::head#]
2021-04-23 11:37:30.609 7f42545c0700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 5.8606s
2021-04-23 11:37:30.609 7f4254dc1700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.30805s
2021-04-23 11:37:30.609 7f42525bc700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.18567s
2021-04-23 11:37:30.609 7f4257dc7700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.19751s
2021-04-23 11:37:30.609 7f4256dc5700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.07916s
2021-04-23 11:37:30.609 7f42535be700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.16178s
2021-04-23 11:37:30.609 7f4258dc9700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 6.67811s
2021-04-23 11:37:30.609 7f42585c8700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 5.14165s
2021-04-23 11:37:30.609 7f42555c2700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.05395s
2021-04-23 11:37:30.609 7f4252dbd700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 7.12029s
2021-04-23 11:37:30.609 7f4255dc3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 6.81781s
2021-04-23 11:37:30.613 7f42575c6700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency slow operation observed for submit_transact, latency = 6.6591s
2021-04-23 11:37:30.613 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.31097s, txc = 0x55e3533c7200
2021-04-23 11:37:30.613 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.12324s, txc = 0x55e339263800
2021-04-23 11:37:30.613 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.08212s, txc = 0x55e27435ad00
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.20212s, txc = 0x55e2ab441b00
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.90095s, txc = 0x55e3660b8700
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 5.18201s, txc = 0x55e370724c00
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.09432s, txc = 0x55e35ee3f600
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.23788s, txc = 0x55e33e3a2100
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 7.22604s, txc = 0x55e3c9bd5e00
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 6.71846s, txc = 0x55e366104d00
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 6.85729s, txc = 0x55e2b468fc00
2021-04-23 11:37:30.649 7f426baa3700  0 bluestore(/var/lib/ceph/osd/ceph-656) log_latency_fn slow operation observed for _txc_committed_kv, latency = 6.69883s, txc = 0x55e2764cad00
2021-04-23 11:38:08.953 7f426c2a4700  4 rocksdb: [db/db_impl_write.cc:1436] reusing log 376455 from recycle list
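The per-batch sleep mentioned above is governed by the osd_delete_sleep family of options; on flash OSDs the SSD variant applies, which matches the "sleep 1 (default)" observation. A sketch of inspecting and raising it (the value 2 is illustrative, and raising it slows PG removal in exchange for lower latency impact):

```shell
# Show the sleep inserted between _delete_some batches on SSD/NVMe OSDs:
ceph config get osd osd_delete_sleep_ssd

# Example: throttle removals harder to smooth out the latency spikes.
# 2 seconds is an illustrative value, not a recommendation.
ceph config set osd osd_delete_sleep_ssd 2
```

Note that per the observations in this comment, the sleep does not help with the final pg-header deletion itself, where the expensive collection listing happens.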
Actions #8

Updated by Dan van der Ster about 3 years ago

Konstantin Shalygin wrote:

Actually, when PG objects are deleted with sleep 1 (the default), the NVMe is not loaded; but when the pg header is deleted, there is a huge spike in client op latency and a possibility of being marked down by the mon.

[...]

That observation is consistent with the root cause being this collection list: https://github.com/ceph/ceph/blob/v14.2.20/src/osd/PG.cc#L7150-L7156

Actions #9

Updated by Konstantin Shalygin almost 3 years ago

  • Status changed from New to Pending Backport
Actions #10

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #50704: nautilus: _delete_some additional unexpected onode list added
Actions #11

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #50705: octopus: _delete_some additional unexpected onode list added
Actions #12

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #50706: pacific: _delete_some additional unexpected onode list added
Actions #13

Updated by Konstantin Shalygin almost 3 years ago

  • Related to Bug #50297: long osd online compaction: mon wrongly mark osd down added
Actions #14

Updated by Loïc Dachary almost 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
