Bug #42666

closed

mgropen from mgr comes from unknown.$id instead of mgr.$id

Added by Sage Weil over 4 years ago. Updated over 4 years ago.

Status: Duplicate
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This works fine when the mgr is first restarted



but later on the type for the mgrc messenger switches over to client:
2019-11-06 21:05:00.990 7ff546261700  1 -- 172.21.2.202:0/3859088844 --> [v2:172.21.2.202:6838/4012971,v1:172.21.2.202:6839/4012971] -- mgropen(unknown.reesi002) v3 -- 0xdedb000 con 0xd437180
...
2019-11-06 21:05:00.990 7ff552a7a700  1 -- [v2:172.21.2.202:6838/4012971,v1:172.21.2.202:6839/4012971] <== client.690094970 172.21.2.202:0/3859088844 1 ==== mgropen(client.reesi002) v3 ==== 58776+0+0 (crc 0 0 0) 0x136d9500 con 0x1480f200
2019-11-06 21:05:00.990 7ff552a7a700  4 mgr.server handle_open from 0x1480f200  client,reesi002
...
2019-11-06 21:05:00.990 7ff548265700  1 -- 172.21.2.202:0/3859088844 --> [v2:172.21.2.202:6838/4012971,v1:172.21.2.202:6839/4012971] -- mgrreport(unknown.reesi002 +8-0 packed 70) v7 -- 0xf8a3740 con 0xd437180
2019-11-06 21:05:00.990 7ff552a7a700  1 -- [v2:172.21.2.202:6838/4012971,v1:172.21.2.202:6839/4012971] <== client.690094970 172.21.2.202:0/3859088844 2 ==== mgrreport(client.reesi002 +8-0 packed 70) v7 ==== 613+0+0 (crc 0 0 0) 0x14664a40 con 0x1480f200
2019-11-06 21:05:00.990 7ff552a7a700  4 mgr.server handle_report from 0x1480f200 client,reesi002
2019-11-06 21:05:00.990 7ff552a7a700  4 mgr.server handle_report rejecting report from non-daemon client reesi002


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #42566: mgr commands fail when using non-client auth (Resolved)

Actions #1

Updated by Sage Weil over 4 years ago

I suspect this is caused by the rados or libcephfs module reusing the rados instance?

../src/client/Client.cc:  messenger->set_myname(entity_name_t::CLIENT(whoami.v));
../src/librados/RadosClient.cc:  messenger->set_myname(entity_name_t::CLIENT(monclient.get_global_id()));

Actions #2

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New
Actions #3

Updated by Sage Weil over 4 years ago

  • Status changed from New to In Progress
  • Assignee set to Sage Weil
Actions #4

Updated by Sage Weil over 4 years ago

from reesi001, with a fresh mgr restart,

2020-01-21 19:11:39.070 7eff2243f700  4 mgrc reconnect Starting new session with [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392]
2020-01-21 19:11:39.070 7eff2243f700  1 --2- 172.21.2.201:0/3739522204 >> [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] conn(0x236e400 0x6f48b00 unknown :-1 s=NONE pgs=0 cs=0 l=1 rx=0 tx=0).connect
2020-01-21 19:11:39.070 7eff2243f700  1 -- 172.21.2.201:0/3739522204 --> [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] -- mgropen(unknown.reesi001) v3 -- 0xa241500 con 0x236e400
2020-01-21 19:11:39.070 7eff4d143700  1 --2- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] >>  conn(0x14d5f600 0x10185b80 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).accept
2020-01-21 19:11:39.070 7eff4d143700  1 --2- 172.21.2.201:0/3739522204 >> [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] conn(0x236e400 0x6f48b00 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=1 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2020-01-21 19:11:39.070 7eff4c942700  1 --2- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] >>  conn(0x14d5f600 0x10185b80 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2020-01-21 19:11:39.070 7eff4d143700 10 monclient: get_auth_request con 0x236e400 auth_method 0
2020-01-21 19:11:39.070 7eff4c942700 10 mgr.server ms_handle_authentication ms_handle_authentication new session 0x6242b40 con 0x14d5f600 entity mgr.reesi001 addr 
2020-01-21 19:11:39.070 7eff3aaf1700  1 -- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] <== osd.61 v2:172.21.2.205:6825/97735 369 ==== pg_stats(119 pgs tid 0 v 0) v2 ==== 89349+0+0 (crc 0 0 0) 0x835ea80 con 0xc75b200
2020-01-21 19:11:39.070 7eff4c942700  1 --2- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] >> 172.21.2.201:0/3739522204 conn(0x14d5f600 0x10185b80 crc :-1 s=READY pgs=928 cs=0 l=1 rx=0 tx=0).ready entity=client.727685673 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2020-01-21 19:11:39.070 7eff4d143700  1 --2- 172.21.2.201:0/3739522204 >> [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] conn(0x236e400 0x6f48b00 crc :-1 s=READY pgs=2030 cs=0 l=1 rx=0 tx=0).ready entity=mgr.727688283 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2020-01-21 19:11:39.070 7eff27c4a700 10 client.727685673.objecter ms_handle_connect 0x236e400
2020-01-21 19:11:39.070 7eff3aaf1700  1 -- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] <== client.727685673 172.21.2.201:0/3739522204 1 ==== mgropen(client.reesi001) v3 ==== 59509+0+0 (crc 0 0 0) 0xa241b00 con 0x14d5f600
2020-01-21 19:11:39.070 7eff3aaf1700  4 mgr.server handle_open from 0x14d5f600  client,reesi001
2020-01-21 19:11:39.070 7eff3aaf1700  1 -- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] --> 172.21.2.201:0/3739522204 -- mgrconfigure(period=5, threshold=5) v3 -- 0xdd483a0 con 0x14d5f600
2020-01-21 19:11:39.070 7eff27c4a700  1 -- 172.21.2.201:0/3739522204 <== mgr.727688283 v2:172.21.2.201:6806/8392 1 ==== mgrconfigure(period=5, threshold=5) v3 ==== 12+0+0 (crc 0 0 0) 0xdd47860 con 0x236e400
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc handle_mgr_configure mgrconfigure(period=5, threshold=5) v3
2020-01-21 19:11:39.070 7eff27c4a700  4 mgrc handle_mgr_configure stats_period=5
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter-0x22a3b60.op_active
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter-0x22a3b60.op_r
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter-0x22a3b60.op_rmw
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter-0x22a3b60.op_w
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter.op_active
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter.op_r
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter.op_rmw
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator()  declare objecter.op_w
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc operator() sending 8 counters (of possible 402), 8 new, 0 removed
2020-01-21 19:11:39.070 7eff27c4a700 20 mgrc _send_report encoded 70 bytes
2020-01-21 19:11:39.070 7eff27c4a700  1 -- 172.21.2.201:0/3739522204 --> [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] -- mgrreport(unknown.reesi001 +8-0 packed 70) v7 -- 0x861f6c0 con 0x236e400
2020-01-21 19:11:39.070 7eff3aaf1700  1 -- [v2:172.21.2.201:6806/8392,v1:172.21.2.201:6807/8392] <== client.727685673 172.21.2.201:0/3739522204 2 ==== mgrreport(client.reesi001 +8-0 packed 70) v7 ==== 613+0+0 (crc 0 0 0) 0xf530000 con 0x14d5f600
2020-01-21 19:11:39.070 7eff3aaf1700  4 mgr.server handle_report from 0x14d5f600 client,reesi001
2020-01-21 19:11:39.070 7eff3aaf1700  4 mgr.server handle_report rejecting report from non-daemon client reesi001

version is "ceph version 14.2.5-2-gdaf0990 (daf0990c19c89267ea10c40b9c7a48bea49a3d1b) nautilus (stable)": 1

Actions #5

Updated by Sage Weil over 4 years ago

compare to vstart, on that same version,

2020-01-21 13:11:15.416 7f87fd5da700  1 -- 10.3.64.23:0/1885833 --> [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] -- mgropen(unknown.x) v3 -- 0x5558385e6300 con 0x555838639f80
2020-01-21 13:11:15.416 7f8802de5700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >> 10.3.64.23:0/2003603290 conn(0x555838639b00 0x555838660580 crc :-1 s=READY pgs=2 cs=0 l=1 rx=0 tx=0).ready entity=client.4110 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2020-01-21 13:11:15.416 7f87e2838700  1 -- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] <== client.4110 10.3.64.23:0/2003603290 1 ==== command(tid 0: {"prefix": "get_command_descriptions"}) v1 ==== 62+0+0 (crc 0 0 0) 0x55583431d0e0 con 0x555838639b00
2020-01-21 13:11:15.416 7f8800de1700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >> 10.3.64.23:0/1885833 conn(0x55583863a400 0x555838661080 crc :-1 s=READY pgs=3 cs=0 l=1 rx=0 tx=0).ready entity=mgr.4108 client_cookie=9b56117e71d734f2 server_cookie=0 in_seq=0 out_seq=0
2020-01-21 13:11:15.416 7f8802de5700  1 --2- 10.3.64.23:0/1885833 >> [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] conn(0x555838639f80 0x555838660b00 crc :-1 s=READY pgs=2 cs=0 l=1 rx=0 tx=0).ready entity=mgr.4108 client_cookie=9b56117e71d734f2 server_cookie=0 in_seq=0 out_seq=0
2020-01-21 13:11:15.416 7f87e2838700  4 mgr.server _handle_command decoded 1
2020-01-21 13:11:15.416 7f87e2838700  4 mgr.server _handle_command prefix=get_command_descriptions
2020-01-21 13:11:15.416 7f87e2838700 10 mgr.server _handle_command reading commands from python modules
2020-01-21 13:11:15.418 7f8802de5700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >>  conn(0x55583863a880 0x555838661600 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).accept
2020-01-21 13:11:15.418 7f8800de1700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >>  conn(0x55583863a880 0x555838661600 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2020-01-21 13:11:15.419 7f8800de1700 10 mgr.server ms_handle_authentication ms_handle_authentication new session 0x555838652c60 con 0x55583863a880 entity mon. addr 
2020-01-21 13:11:15.419 7f8800de1700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >>  conn(0x55583863a880 0x555838661600 secure :-1 s=AUTH_ACCEPTING_SIGN pgs=0 cs=0 l=1 rx=0x55583869f140 tx=0x5558385fea00).handle_read_frame_epilogue_main read frame epilogue bytes=32
2020-01-21 13:11:15.419 7f8800de1700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >>  conn(0x55583863a880 0x555838661600 secure :-1 s=SESSION_ACCEPTING pgs=0 cs=0 l=1 rx=0x55583869f140 tx=0x5558385fea00).handle_read_frame_epilogue_main read frame epilogue bytes=32
2020-01-21 13:11:15.419 7f8800de1700  1 --2- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] >> 10.3.64.23:0/1885168 conn(0x55583863a880 0x555838661600 secure :-1 s=READY pgs=1 cs=0 l=1 rx=0x55583869f140 tx=0x5558385fea00).ready entity=mon.0 client_cookie=e5135dd0a09e8ce1 server_cookie=0 in_seq=0 out_seq=0
2020-01-21 13:11:15.421 7f87e2838700  4 mgr.server reply reply success
2020-01-21 13:11:15.421 7f87e2838700  1 -- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] --> 10.3.64.23:0/2003603290 -- command_reply(tid 0: 0 ) v1 -- 0x55583431de00 con 0x555838639b00
2020-01-21 13:11:15.421 7f87e2838700  1 -- [v2:10.3.64.23:6820/1885833,v1:10.3.64.23:6821/1885833] <== mgr.4108 10.3.64.23:0/1885833 1 ==== mgropen(mgr.x) v3 ==== 60681+0+0 (crc 0 0 0) 0x5558385e6900 con 0x55583863a400
2020-01-21 13:11:15.421 7f87e2838700  4 mgr.server handle_open from 0x55583863a400  mgr,x

Actions #6

Updated by Sage Weil over 4 years ago

  • Status changed from In Progress to Duplicate

The problem is actually the same as #42566: the second/additional mgrc instance is sending the mgropen based on the !cct->is_client condition instead of the msgr->is_client condition. I was confused by the logs because I wasn't expecting multiple mgrc instances (one for the mgr itself, another for the libcephfs client).
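
The mismatch described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `EntityType`, `ProcessContext`, `Messenger`, and both `opens_as_daemon_*` functions are hypothetical names invented for illustration and are not Ceph's real classes or APIs. It only shows why a check against the process-wide context and a check against the messenger can disagree for an embedded client running inside the mgr process.

```cpp
// Hypothetical model of the decision described in this issue.
// None of these types or functions are Ceph's real code.
enum class EntityType { Client, Mgr };

// Stands in for the process-wide context (cct): the mgr daemon runs as mgr.$id.
struct ProcessContext {
  EntityType type;
};

// Stands in for a single messenger. An embedded libcephfs or librados
// client names its messenger as a client, even inside the mgr process.
struct Messenger {
  EntityType type;
};

// The condition the issue describes as wrong: consult the process context.
constexpr bool opens_as_daemon_by_context(const ProcessContext& cct) {
  return cct.type != EntityType::Client;
}

// The condition the issue describes as intended: consult the messenger.
constexpr bool opens_as_daemon_by_messenger(const Messenger& msgr) {
  return msgr.type != EntityType::Client;
}

constexpr ProcessContext mgr_process{EntityType::Mgr};
constexpr Messenger mgr_own_messenger{EntityType::Mgr};
constexpr Messenger embedded_client_messenger{EntityType::Client};

// The mgr's own mgrc instance: both conditions agree.
static_assert(opens_as_daemon_by_context(mgr_process), "context says daemon");
static_assert(opens_as_daemon_by_messenger(mgr_own_messenger), "messenger says daemon");

// The embedded client's mgrc instance: the context still says "daemon",
// but the messenger says "client", so the two conditions disagree.
static_assert(opens_as_daemon_by_context(mgr_process), "context says daemon");
static_assert(!opens_as_daemon_by_messenger(embedded_client_messenger), "messenger says client");
```

In this model the second mgrc instance announces itself as a daemon while its messenger identifies as a client, which is consistent with the "rejecting report from non-daemon client" lines in the logs above.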

Actions #7

Updated by Sage Weil over 4 years ago

  • Related to Bug #42566: mgr commands fail when using non-client auth added