Bug #24342
Monitor's routed_requests leak
Description
Recently, we found that our non-leader monitors hold a large number of routed requests that have never been recycled, as shown in the following gdb output.
(gdb) #4 0x000000000054bc1f in main (argc=<optimized out>, argv=<optimized out>) at ceph_mon.cc:761
761 msgr->wait();
(gdb) $1 = std::map with 4453 elements = {[6367] = 0x7f3cdda9c040, [6368] = 0x7f3cdda9c180, [6369] = 0x7f3cdda9c2c0, [6370] = 0x7f3cdda9c400, [6371] = 0x7f3cdda9c540, [6372] = 0x7f3cdda9c680, [6373] = 0x7f3cdda9c7c0, [6374] = 0x7f3cdda9c900, [6375] = 0x7f3cdda9ca40, [6376] = 0x7f3cdda9cb80, [6377] = 0x7f3cdda9ccc0, [6378] = 0x7f3cdda9ce00, [6379] = 0x7f3cdda9cf40, [6380] = 0x7f3cdda9d080, [6381] = 0x7f3cdda9d1c0, [6382] = 0x7f3cdda9d300, [6383] = 0x7f3cdda9d440, [6384] = 0x7f3cdda9d580, [6385] = 0x7f3cdda9d6c0, [6386] = 0x7f3cdda9d800, [6387] = 0x7f3cdda9d940, [6388] = 0x7f3cdda9da80, [6389] = 0x7f3cdda9dbc0, [6390] = 0x7f3cdda9dd00, [6391] = 0x7f3cdda9de40, [6392] = 0x7f3cdda9df80, [6393] = 0x7f3cdda9e0c0, [6394] = 0x7f3cdda9e200, [6395] = 0x7f3cdda9e340, [6396] = 0x7f3cdda9e480, [6397] = 0x7f3cdda9e5c0, [6398] = 0x7f3cdda9e700, [6399] = 0x7f3cdda9e840, [6400] = 0x7f3cdda9e980, [6401] = 0x7f3cdda9eac0, [6402] = 0x7f3cdda9ec00, [6403] = 0x7f3cdda9ed40, [6404] = 0x7f3cdda9ee80, [6405] = 0x7f3cdda9efc0, [6406] = 0x7f3cdda9f100, [6407] = 0x7f3cdda9f240, [6408] = 0x7f3cdda9f380, [6409] = 0x7f3cdda9f4c0, [6410] = 0x7f3cdda9f600, [6411] = 0x7f3cdda9f740, [6412] = 0x7f3cdda9f880, [6413] = 0x7f3cdda9f9c0, [6414] = 0x7f3cdda9fb00, [6415] = 0x7f3cdda9fc40, [6416] = 0x7f3cdda9fd80, [6417] = 0x7f3cdda9fec0, [6418] = 0x7f3ce33c4140, [6419] = 0x7f3ce33c4280, [6420] = 0x7f3ce33c43c0, [6421] = 0x7f3ce33c4500, [6422] = 0x7f3ce33c4640, [6423] = 0x7f3ce33c4780, [6424] = 0x7f3ce33c48c0, [6425] = 0x7f3ce33c4a00, [6426] = 0x7f3ce33c4b40, [6427] = 0x7f3ce33c4c80, [6428] = 0x7f3ce33c4dc0, [6429] = 0x7f3ce33c4f00, [6430] = 0x7f3ce33c5040, [6431] = 0x7f3ce33c5180, [6432] = 0x7f3ce33c52c0, [6433] = 0x7f3ce33c5400, [6434] = 0x7f3ce33c5540, [6435] = 0x7f3ce33c5680, [6436] = 0x7f3ce33c57c0, [6437] = 0x7f3ce33c5900, [6438] = 0x7f3ce33c5a40, [6439] = 0x7f3ce33c5b80, [6440] = 0x7f3ce33c5cc0, [6441] = 0x7f3ce33c5e00, [6442] = 0x7f3ce33c5f40, [6443] = 0x7f3ce33c6080, [6444] = 0x7f3ce33c61c0, [6445] = 0x7f3ce33c6300, [6446] = 0x7f3ce33c6440, [6447] = 0x7f3ce33c6580, [6448] = 0x7f3ce33c66c0, [6449] = 0x7f3ce33c6800, [6450] = 0x7f3ce33c6940, [6451] = 0x7f3ce33c6a80, [6452] = 0x7f3ce33c6bc0, [6453] = 0x7f3ce33c6d00, [6454] = 0x7f3ce33c6e40, [6456] = 0x7f3ce33c70c0, [6457] = 0x7f3ce33c7200, [6458] = 0x7f3ce33c7340, [6459] = 0x7f3ce33c7480, [6461] = 0x7f3ce33c7700, [6462] = 0x7f3ce33c7840, [6463] = 0x7f3ce33c7980, [6464] = 0x7f3ce33c7ac0, [6465] = 0x7f3ce33c7c00, [6466] = 0x7f3ce33c7d40, [6467] = 0x7f3ce33c7e80, [6468] = 0x7f3ce33c7fc0...}
After further debugging, we believe the cause is as follows:
- 1. A single OSD can send multiple pgtemp requests within one second;
- 2. Non-leader monitors forward all of these requests to the leader;
- 3. The leader replies only to the first of these forwarded requests, because the others are requesting the same osdmap.
As a result, only the first routed pgtemp request is recycled when the Paxos procedure finishes; the others remain in the memory of the non-leader monitors.
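The three steps above can be illustrated with a small, self-contained C++ model. This is only a sketch of the suspected mechanism, not the actual Ceph Monitor code: the names Peon, RoutedRequest, leader_replies and simulate_leak are hypothetical, and the model assumes that an entry is released only when a reply for its routing id comes back from the leader.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified model of the suspected leak; NOT the real Ceph Monitor code.
// Every name below is a hypothetical stand-in chosen for illustration.
struct RoutedRequest {
    uint64_t tid;         // routing id assigned by the non-leader monitor
    std::string payload;  // stands in for the forwarded message
};

class Peon {
public:
    // Forward a request to the leader and remember it until a reply arrives.
    uint64_t forward(const std::string& payload) {
        uint64_t tid = ++last_tid_;
        routed_requests_[tid] = RoutedRequest{tid, payload};
        return tid;
    }

    // A reply from the leader releases the matching entry.
    void handle_reply(uint64_t tid) { routed_requests_.erase(tid); }

    // Number of routed requests still held in memory.
    std::size_t pending() const { return routed_requests_.size(); }

private:
    uint64_t last_tid_ = 0;
    std::map<uint64_t, RoutedRequest> routed_requests_;
};

// Assumed leader behaviour: answer only the first of a batch of requests
// that all ask for the same osdmap, and stay silent on the rest.
std::vector<uint64_t> leader_replies(const std::vector<uint64_t>& batch) {
    if (batch.empty()) return {};
    return {batch.front()};
}

// Forward `duplicates` identical pgtemp requests, deliver the leader's
// replies, and return how many entries are left behind on the peon.
std::size_t simulate_leak(int duplicates) {
    Peon peon;
    std::vector<uint64_t> batch;
    for (int i = 0; i < duplicates; ++i) {
        batch.push_back(peon.forward("pgtemp"));
    }
    for (uint64_t tid : leader_replies(batch)) {
        peon.handle_reply(tid);
    }
    return peon.pending();
}
```

In this model, every batch of N equivalent requests leaves N - 1 entries behind, so the map grows without bound as long as OSDs keep sending duplicate pgtemp requests.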
History
#1 Updated by Xuehan Xu almost 6 years ago
#2 Updated by Xuehan Xu almost 6 years ago
Sorry, it seems that the latest version doesn't have this problem. Really sorry. Please close this.
#3 Updated by Joao Eduardo Luis almost 6 years ago
- Status changed from New to Closed
#4 Updated by Joao Eduardo Luis almost 6 years ago
- Project changed from Ceph to RADOS
- Category set to Correctness/Safety
- Component(RADOS) Monitor added
#5 Updated by Xuehan Xu almost 6 years ago
It seems that this problem has been fixed by https://github.com/ceph/ceph/commit/39e06ef8f070e136e54452bdea3f6105cd79bb73
#6 Updated by Greg Farnum almost 6 years ago
What version are you running? The MRoute handling is all pretty old; though we've certainly discovered a number of leaks recently where some messages weren't being marked as no_reply that I (hope) are more plausible issues than bad pgtemp handling. :)
#7 Updated by Xuehan Xu almost 6 years ago
Greg Farnum wrote:
What version are you running? The MRoute handling is all pretty old; though we've certainly discovered a number of leaks recently where some messages weren't being marked as no_reply that I (hope) are more plausible issues than bad pgtemp handling. :)
We are using 0.94.5, a pretty old version in which osdmap incrementals are sent from the leader to the peons, which forward them to the OSDs.