Bug #23511

closed

forwarded osd_failure leak in mon

Added by Kefu Chai about 6 years ago. Updated over 4 years ago.

Status: Can't reproduce
Priority: High
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

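See http://pulpito.ceph.com/kchai-2018-03-29_13:20:02-rados-wip-slow-mon-ops-kefu-distro-basic-smithi/2334154/

The peon (mon.b) still reports a forwarded osd_failure as a slow op 33 seconds after forwarding it to the leader: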

2018-03-29 13:31:14.846 7f888bf9f700  0 mon.b@1(peon) e1 DEBUG SLOW OPS{
    "description": "osd_failure(failed immediate osd.1 172.21.15.179:6801/13388 for 21sec e21 v21)",
    "initiated_at": "2018-03-29 13:30:41.306549",
    "age": 33.542503,
    "duration": 33.542534,
    "type_data": {
        "events": [
            {
                "time": "2018-03-29 13:30:41.306549",
                "event": "initiated" 
            },
            {
                "time": "2018-03-29 13:30:41.306549",
                "event": "header_read" 
            },
            {
                "time": "2018-03-29 13:30:41.306549",
                "event": "throttled" 
            },
            {
                "time": "2018-03-29 13:30:41.306554",
                "event": "all_read" 
            },
            {
                "time": "2018-03-29 13:30:41.306748",
                "event": "dispatched" 
            },
            {
                "time": "2018-03-29 13:30:41.306751",
                "event": "mon:_ms_dispatch" 
            },
            {
                "time": "2018-03-29 13:30:41.306761",
                "event": "mon:dispatch_op" 
            },
            {
                "time": "2018-03-29 13:30:41.306762",
                "event": "psvc:dispatch" 
            },
            {
                "time": "2018-03-29 13:30:41.306790",
                "event": "osdmap:preprocess_query" 
            },
            {
                "time": "2018-03-29 13:30:41.306802",
                "event": "osdmap:preprocess_failure" 
            },
            {
                "time": "2018-03-29 13:30:41.306815",
                "event": "forward_request_leader" 
            },
            {
                "time": "2018-03-29 13:30:41.306849",
                "event": "forwarded" 
            }
        ],
        "info": {
            "seq": 1276,
            "src_is_mon": false,
            "source": "osd.2 172.21.15.179:6809/13391",
            "forwarded_to_leader": true
        }
    }
}
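
A dump like the one above can be checked mechanically. The following is a minimal sketch, not part of Ceph: it assumes the dump has already been extracted from the log line as plain JSON, and `summarize_slow_op`, `looks_leaked`, and the 30-second idle threshold are hypothetical names and values chosen for illustration. The embedded dump is a trimmed copy that keeps only the first and last events.

```python
import json
from datetime import datetime

# Trimmed copy of the slow-op dump above (first and last events only).
SLOW_OP = json.loads("""
{
    "description": "osd_failure(failed immediate osd.1 172.21.15.179:6801/13388 for 21sec e21 v21)",
    "initiated_at": "2018-03-29 13:30:41.306549",
    "age": 33.542503,
    "type_data": {
        "events": [
            {"time": "2018-03-29 13:30:41.306549", "event": "initiated"},
            {"time": "2018-03-29 13:30:41.306849", "event": "forwarded"}
        ],
        "info": {"forwarded_to_leader": true}
    }
}
""")

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def summarize_slow_op(op):
    """Return (last_event, seconds_until_last_event, seconds_idle_afterwards)."""
    events = op["type_data"]["events"]
    start = datetime.strptime(op["initiated_at"], TIME_FORMAT)
    last = events[-1]
    last_time = datetime.strptime(last["time"], TIME_FORMAT)
    to_last = (last_time - start).total_seconds()
    idle = op["age"] - to_last
    return last["event"], to_last, idle


def looks_leaked(op, idle_threshold=30.0):
    """Hypothetical heuristic: the op was forwarded to the leader, recorded
    no event after "forwarded", and has been idle longer than the threshold."""
    last_event, _, idle = summarize_slow_op(op)
    forwarded = op["type_data"]["info"].get("forwarded_to_leader", False)
    return forwarded and last_event == "forwarded" and idle > idle_threshold


if __name__ == "__main__":
    print(summarize_slow_op(SLOW_OP))
    print(looks_leaked(SLOW_OP))
```

For this dump the op reaches `forwarded` about 0.3 ms after it was initiated and then records nothing for the remainder of its 33.5-second age, which is the pattern the heuristic flags.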

The leader did answer the forwarded request with a no-reply:

2018-03-29 13:30:41.301 7f55b7e89700  1 -- 172.21.15.179:6789/0 <== mon.1 172.21.15.179:6790/0 303 ==== forward(osd_failure(failed immediate osd.1 172.21.15.179:6801/13388 for 21sec e21 v21) v3 caps allow * tid 142 con_features 2305244844817580027) v3 ==== 251+0+0 (551290306 0 0) 0x55947aa94c00 con 0x55947a06c1c0
...
2018-03-29 13:30:41.305 7f55b7e89700 10 mon.a@0(leader) e1 no_reply to osd.2 172.21.15.179:6809/13391 via 172.21.15.179:6790/0 for request osd_failure(failed immediate osd.1 172.21.15.179:6801/13388 for 21sec e21 v21) v3
2018-03-29 13:30:41.305 7f55b7e89700  1 -- 172.21.15.179:6789/0 --> 172.21.15.179:6790/0 -- route(no-reply tid 142) v3 -- ?+0 0x55947a23df80 con 0x55947a06c1c0

and the peon received that no-reply:

2018-03-29 13:30:41.305 7f888979a700  1 -- 172.21.15.179:6790/0 <== mon.0 172.21.15.179:6789/0 374 ==== route(no-reply tid 142) v3 ==== 69+0+0 (540655743 0 0) 0x556a4d8deac0 con 0x556a4d70c640
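
The messenger lines quoted above can be correlated by transaction id to confirm the round trip. This is a rough sketch under assumptions: the regular expressions are fitted to these specific log lines and are not a general parser for Ceph logs, and `trace_tids` and the stage names are hypothetical.

```python
import re

# The four log lines quoted above, copied verbatim.
LOG_LINES = [
    "2018-03-29 13:30:41.301 7f55b7e89700  1 -- 172.21.15.179:6789/0 <== mon.1 172.21.15.179:6790/0 303 ==== forward(osd_failure(failed immediate osd.1 172.21.15.179:6801/13388 for 21sec e21 v21) v3 caps allow * tid 142 con_features 2305244844817580027) v3 ==== 251+0+0 (551290306 0 0) 0x55947aa94c00 con 0x55947a06c1c0",
    "2018-03-29 13:30:41.305 7f55b7e89700 10 mon.a@0(leader) e1 no_reply to osd.2 172.21.15.179:6809/13391 via 172.21.15.179:6790/0 for request osd_failure(failed immediate osd.1 172.21.15.179:6801/13388 for 21sec e21 v21) v3",
    "2018-03-29 13:30:41.305 7f55b7e89700  1 -- 172.21.15.179:6789/0 --> 172.21.15.179:6790/0 -- route(no-reply tid 142) v3 -- ?+0 0x55947a23df80 con 0x55947a06c1c0",
    "2018-03-29 13:30:41.305 7f888979a700  1 -- 172.21.15.179:6790/0 <== mon.0 172.21.15.179:6789/0 374 ==== route(no-reply tid 142) v3 ==== 69+0+0 (540655743 0 0) 0x556a4d8deac0 con 0x556a4d70c640",
]

# "<==" marks a received message, "-->" a sent one (as seen in the lines above).
FORWARD_RE = re.compile(r"<== .*forward\(.*\btid (\d+)\b")
NO_REPLY_RE = re.compile(r"(-->|<==) .*route\(no-reply tid (\d+)\)")


def trace_tids(lines):
    """Map each forwarded tid to the ordered list of stages seen for it."""
    trace = {}
    for line in lines:
        match = FORWARD_RE.search(line)
        if match:
            trace.setdefault(int(match.group(1)), []).append("forward_received")
            continue
        match = NO_REPLY_RE.search(line)
        if match:
            stage = "no_reply_sent" if match.group(1) == "-->" else "no_reply_received"
            trace.setdefault(int(match.group(2)), []).append(stage)
    return trace


if __name__ == "__main__":
    for tid, stages in trace_tids(LOG_LINES).items():
        print(tid, stages)
```

For these lines, tid 142 shows all three stages, so the messenger-level exchange completed even though the peon kept tracking the op.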
#2

Updated by Kefu Chai about 6 years ago

  • Subject changed from SLOW OPS in mon to forwarded osd_failure leak in mon
#3

Updated by Greg Farnum about 6 years ago

Kefu, did your latest no_reply() PR resolve this?

#5

Updated by Josh Durgin about 6 years ago

  • Priority changed from Normal to High
#6

Updated by Greg Farnum over 4 years ago

  • Status changed from New to Can't reproduce

I don't think we've seen this again, and we may have made even more no_reply fixes since then.
