Bug #37882

closed

msg/async: wrong source ip inferred

Added by Sage Weil over 5 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In /a/sage-2019-01-11_15:59:52-rados-wip-sage2-testing-2019-01-10-1820-distro-basic-smithi/3448145, this leads to the OSD not being marked up and then, eventually, to
'Scrubbing terminated -- not all pgs were active and clean.'.

The OSD connects from 172.21.15.79:52058:

2019-01-12 11:09:18.034 7f326a7b8700 10 -- v1:172.21.15.95:6790/0 >>  conn(0x564b29854800 legacy=0x564b298b7600 :-1 s=STATE_NONE l=0).accept sd=37 listen_addr v1:172.21.15.95:6790/0 peer_addr v1:172.21.15.79:52058/0
2019-01-12 11:09:18.034 7f326afb9700 20 -- v1:172.21.15.95:6790/0 >>  conn(0x564b29854800 legacy=0x564b298b7600 :-1 s=STATE_ACCEPTING l=0).process
2019-01-12 11:09:18.035 7f326afb9700 20 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :-1 s=START_ACCEPT pgs=0 cs=0 l=0).read_event
2019-01-12 11:09:18.035 7f326afb9700 20 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :-1 s=START_ACCEPT pgs=0 cs=0 l=0).send_server_banner
2019-01-12 11:09:18.035 7f326afb9700  1 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).send_server_banner sd=37 legacy v1:172.21.15.95:6790/0 socket_addr v1:172.21.15.95:6790/0 target_addr v1:172.21.15.79:52058/0
2019-01-12 11:09:18.035 7f326afb9700 10 -- v1:172.21.15.95:6790/0 >>  conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send sent bytes 281 remaining bytes 0
2019-01-12 11:09:18.035 7f326afb9700 20 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).handle_server_banner_write r=0
2019-01-12 11:09:18.035 7f326afb9700 10 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).handle_server_banner_write write banner and addr done: -
2019-01-12 11:09:18.035 7f326afb9700 20 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).wait_client_banner
2019-01-12 11:09:18.035 7f326afb9700 20 -- v1:172.21.15.95:6790/0 >>  conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=145
2019-01-12 11:09:18.035 7f326afb9700 20 -- v1:172.21.15.95:6790/0 >>  conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=STATE_CONNECTION_ESTABLISHED l=0).process
2019-01-12 11:09:18.035 7f326afb9700 20 -- v1:172.21.15.95:6790/0 >>  conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=STATE_CONNECTION_ESTABLISHED l=0).read continue len=145
2019-01-12 11:09:18.035 7f326afb9700 20 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner r=0
2019-01-12 11:09:18.035 7f326afb9700 10 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner accept peer addr is v1:0.0.0.0:6809/34208
2019-01-12 11:09:18.035 7f326afb9700  0 --1- v1:172.21.15.95:6790/0 >> - 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner accept peer addr is really v1:172.21.15.95:6809/34208 (socket is v1:172.21.15.95:6790/0)
2019-01-12 11:09:18.035 7f326afb9700 20 --1- v1:172.21.15.95:6790/0 >> v1:172.21.15.95:6809/34208 0x564b298b7600 conn(0x564b29854800 legacy=0x564b298b7600 :6790 s=ACCEPTING pgs=0 cs=0 l=0).wait_connect_message

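But at the end (the "accept peer addr is really" line), we infer that the peer's IP is 172.21.15.95, which is our own address, instead of 172.21.15.79.

This is a mix-up of socket_addr (our own socket address) and target_addr (the address of the peer we are connected to).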
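To illustrate the mix-up, here is a minimal Python sketch, not the actual Ceph code or the actual fix. The `Addr` class and `infer_peer_addr` function are hypothetical names made up for this example; only the addresses are taken from the log above. It assumes the acceptor fills in a blank (0.0.0.0) advertised IP from a socket-level address while keeping the advertised port and nonce, and shows how passing socket_addr instead of target_addr reproduces the wrong result.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Addr:
    """Hypothetical stand-in for a v1 entity address (ip:port/nonce)."""
    ip: str
    port: int
    nonce: int = 0

    def is_blank_ip(self) -> bool:
        # 0.0.0.0 (or ::) means the peer does not know its own IP yet.
        return ipaddress.ip_address(self.ip).is_unspecified

    def __str__(self) -> str:
        return f"v1:{self.ip}:{self.port}/{self.nonce}"


def infer_peer_addr(advertised: Addr, peer_socket_addr: Addr) -> Addr:
    """Fill in a blank advertised IP from the address seen on the socket.

    Only the IP is borrowed; the advertised port and nonce are kept.
    """
    if advertised.is_blank_ip():
        return replace(advertised, ip=peer_socket_addr.ip)
    return advertised


# Values from the log excerpt in this report.
socket_addr = Addr("172.21.15.95", 6790)       # our own end of the socket
target_addr = Addr("172.21.15.79", 52058)      # the peer's end of the socket
advertised = Addr("0.0.0.0", 6809, 34208)      # what the peer sent in its banner

buggy = infer_peer_addr(advertised, socket_addr)   # mix-up: uses our own address
fixed = infer_peer_addr(advertised, target_addr)   # intended: uses the peer's address

print("buggy:", buggy)
print("fixed:", fixed)
```

With the log's values, the buggy call yields v1:172.21.15.95:6809/34208 (matching the "is really" line), while using target_addr yields v1:172.21.15.79:6809/34208.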

Actions #1

Updated by Sage Weil over 5 years ago

  • Status changed from 12 to Fix Under Review
Actions #2

Updated by Sage Weil over 5 years ago

  • Status changed from Fix Under Review to Resolved
Actions #3

Updated by Greg Farnum about 5 years ago

  • Project changed from RADOS to Messengers