Bug #36300

closed

Clients receive "wrong fsid" error when CephX is disabled

Added by Jason Dillaman over 5 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Related to the changes introduced here [1]. The following reproducer shows the issue hit by a client application:

import rados
import json

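# Connect to the cluster (CephX is disabled on the affected cluster).
cluster = rados.Rados(conffile='')
cluster.connect()

# Send a simple monitor command; mon_command returns
# (return code, output buffer, status string).
cmd = {'prefix': 'osd dump', 'format': 'json'}
rc, buf_s, out = cluster.mon_command(json.dumps(cmd), b'')

print("rc={}, buf={}, out={}".format(rc, buf_s, out))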

It appears that the initial monmap is no longer pushed to clients when CephX is disabled.

[1] https://github.com/ceph/ceph/pull/23741


Related issues 1 (1 open, 0 closed)

Related to RADOS - Bug #56487: Error EPERM: problem getting command descriptions from mon, when execute "ceph -s". (New)

Actions #1

Updated by Greg Farnum over 5 years ago

  • Assignee set to Greg Farnum

I'll take a look.

Actions #2

Updated by Greg Farnum over 5 years ago

  • Status changed from New to 4
Actions #3

Updated by Greg Farnum over 5 years ago

  • Status changed from 4 to 7
Actions #4

Updated by Sage Weil over 5 years ago

  • Status changed from 7 to Resolved
Actions #5

Updated by liqun zhang almost 2 years ago

Mon Jul 4 15:31:19 CST 2022
2022-07-04T15:31:20.219+0800 7f8595551700 10 monclient: get_monmap_and_config
2022-07-04T15:31:20.219+0800 7f8595551700 10 monclient: build_initial_monmap
2022-07-04T15:31:20.219+0800 7f8595551700 10 monclient: monmap:
epoch 0
fsid 7a9e6d1a-fb5c-11ec-bf4c-0e03b1ba142b
last_changed 2022-07-04T15:31:20.220530+0800
created 2022-07-04T15:31:20.220530+0800
min_mon_release 0 (unknown)
0: v1:192.168.38.16:6789/0 mon.a
1: v1:192.168.38.17:6789/0 mon.b
2: v1:192.168.38.18:6789/0 mon.c

2022-07-04T15:31:20.221+0800 7f8595551700 10 monclient: init
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient: _reopen_session rank -1
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient: _add_conns ranks=[1,2,0]
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient(hunting): picked mon.b con 0x7f859006c640 addr v1:192.168.38.17:6789/0
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient(hunting): picked mon.c con 0x7f859006f3b0 addr v1:192.168.38.18:6789/0
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient(hunting): picked mon.a con 0x7f859007cbd0 addr v1:192.168.38.16:6789/0
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient(hunting): _renew_subs
2022-07-04T15:31:20.222+0800 7f8595551700 10 monclient(hunting): authenticate will time out at 2022-07-04T15:36:20.223898+0800
2022-07-04T15:31:20.352+0800 7f858dffb700 10 monclient(hunting): _init_auth method 1
2022-07-04T15:31:20.352+0800 7f858dffb700 10 monclient(hunting): _init_auth creating new auth
2022-07-04T15:31:20.352+0800 7f858dffb700 10 monclient(hunting): my global_id is 152691
2022-07-04T15:31:20.353+0800 7f858dffb700 10 monclient: _finish_hunting 0
2022-07-04T15:31:20.353+0800 7f858dffb700 1 monclient: found mon.a
2022-07-04T15:31:20.353+0800 7f858dffb700 20 monclient: _un_backoff reopen_interval_multipler now 1
2022-07-04T15:31:20.353+0800 7f858dffb700 10 monclient: _send_mon_message to mon.a at v1:192.168.38.16:6789/0
2022-07-04T15:31:20.353+0800 7f858dffb700 10 monclient: _finish_auth 0
2022-07-04T15:31:20.353+0800 7f858dffb700 20 monclient: _check_auth_rotating not needed by client.admin
2022-07-04T15:31:20.353+0800 7f858dffb700 10 monclient: handle_monmap mon_map magic: 0 v1
2022-07-04T15:31:20.353+0800 7f858dffb700 10 monclient: got monmap 3 from mon.a (according to old e3)
2022-07-04T15:31:20.353+0800 7f858dffb700 10 monclient: dump:
epoch 3
fsid 7a9e6d1a-fb5c-11ec-bf4c-0e03b1ba142b
last_changed 2022-07-04T13:45:31.648208+0800
created 2022-07-04T13:45:16.451723+0800
min_mon_release 15 (octopus)
0: v1:192.168.38.16:6789/0 mon.a
1: v1:192.168.38.17:6789/0 mon.b
2: v1:192.168.38.18:6789/0 mon.c

2022-07-04T15:31:20.354+0800 7f8595551700 5 monclient: authenticate success, global_id 152691
2022-07-04T15:31:20.354+0800 7f8595551700 20 monclient: get_monmap_and_config waiting for monmap|config
2022-07-04T15:31:20.359+0800 7f858dffb700 10 monclient: handle_config config(0 keys) v1
2022-07-04T15:31:20.359+0800 7f858dffb700 10 monclient: handle_monmap mon_map magic: 0 v1
2022-07-04T15:31:20.359+0800 7f858dffb700 10 monclient: got monmap 3 from mon.a (according to old e3)
2022-07-04T15:31:20.359+0800 7f858dffb700 10 monclient: dump:
epoch 3
fsid 7a9e6d1a-fb5c-11ec-bf4c-0e03b1ba142b
last_changed 2022-07-04T13:45:31.648208+0800
created 2022-07-04T13:45:16.451723+0800
min_mon_release 15 (octopus)
0: v1:192.168.38.16:6789/0 mon.a
1: v1:192.168.38.17:6789/0 mon.b
2: v1:192.168.38.18:6789/0 mon.c

2022-07-04T15:31:20.359+0800 7f8595551700 10 monclient: get_monmap_and_config success
2022-07-04T15:31:20.359+0800 7f8595551700 10 monclient: shutdown
2022-07-04T15:31:20.359+0800 7f8595551700 20 monclient: shutdown discarding 0 pending message(s)
2022-07-04T15:31:20.359+0800 7f857ffff700 4 set_mon_vals no callback set
2022-07-04T15:31:20.369+0800 7f8595551700 10 monclient: build_initial_monmap
2022-07-04T15:31:20.369+0800 7f8595551700 10 monclient: monmap:
epoch 0
fsid 00000000-0000-0000-0000-000000000000
last_changed 0.000000
created 0.000000
min_mon_release 0 (unknown)
0: v1:192.168.38.16:6789/0 mon.noname-a
1: v1:192.168.38.17:6789/0 mon.noname-b
2: v1:192.168.38.18:6789/0 mon.noname-c

2022-07-04T15:31:20.369+0800 7f8595551700 1 librados: starting msgr at
2022-07-04T15:31:20.370+0800 7f8595551700 1 librados: starting objecter
2022-07-04T15:31:20.371+0800 7f8595551700 1 librados: setting wanted keys
2022-07-04T15:31:20.371+0800 7f8595551700 1 librados: calling monclient init
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient: init
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient: _reopen_session rank -1
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient: _add_conns ranks=[1,2,0]
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient(hunting): picked mon.noname-b con 0x7f859006c640 addr v1:192.168.38.17:6789/0
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient(hunting): picked mon.noname-c con 0x7f859006f3b0 addr v1:192.168.38.18:6789/0
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient(hunting): picked mon.noname-a con 0x7f859007cbd0 addr v1:192.168.38.16:6789/0
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient(hunting): _renew_subs
2022-07-04T15:31:20.371+0800 7f8595551700 10 monclient(hunting): authenticate will time out at 2022-07-04T15:36:20.372880+0800
2022-07-04T15:31:20.452+0800 7f857ffff700 10 monclient(hunting): _init_auth method 1
2022-07-04T15:31:20.452+0800 7f857ffff700 10 monclient(hunting): _init_auth creating new auth
2022-07-04T15:31:20.452+0800 7f857ffff700 10 monclient(hunting): my global_id is 152697
2022-07-04T15:31:20.452+0800 7f857ffff700 10 monclient: _finish_hunting 0
2022-07-04T15:31:20.452+0800 7f857ffff700 1 monclient: found mon.noname-a
2022-07-04T15:31:20.452+0800 7f857ffff700 20 monclient: _un_backoff reopen_interval_multipler now 1
2022-07-04T15:31:20.452+0800 7f857ffff700 10 monclient: _send_mon_message to mon.noname-a at v1:192.168.38.16:6789/0
2022-07-04T15:31:20.452+0800 7f857ffff700 10 monclient: _finish_auth 0
2022-07-04T15:31:20.452+0800 7f857ffff700 20 monclient: _check_auth_rotating not needed by client.admin
2022-07-04T15:31:20.455+0800 7f8595551700 5 monclient: authenticate success, global_id 152697
2022-07-04T15:31:20.455+0800 7f8595551700 10 monclient: _renew_subs
2022-07-04T15:31:20.455+0800 7f8595551700 10 monclient: _send_mon_message to mon.noname-a at v1:192.168.38.16:6789/0
2022-07-04T15:31:20.455+0800 7f8595551700 10 monclient: _renew_subs
2022-07-04T15:31:20.455+0800 7f8595551700 10 monclient: _send_mon_message to mon.noname-a at v1:192.168.38.16:6789/0
2022-07-04T15:31:20.455+0800 7f8595551700 1 librados: init done
2022-07-04T15:31:20.457+0800 7f8595551700 10 monclient: start_mon_command cmd=[{"prefix": "get_command_descriptions"}]
2022-07-04T15:31:20.457+0800 7f8595551700 10 monclient: _send_command 1 [{"prefix": "get_command_descriptions"}]
2022-07-04T15:31:20.457+0800 7f8595551700 10 monclient: _send_mon_message to mon.noname-a at v1:192.168.38.16:6789/0
2022-07-04T15:31:20.460+0800 7f857ffff700 10 monclient: handle_monmap mon_map magic: 0 v1
2022-07-04T15:31:20.463+0800 7f857ffff700 10 monclient: got monmap 3 from mon.noname-a (according to old e3)
2022-07-04T15:31:20.463+0800 7f857ffff700 10 monclient: dump:
epoch 3
fsid 7a9e6d1a-fb5c-11ec-bf4c-0e03b1ba142b
last_changed 2022-07-04T13:45:31.648208+0800
created 2022-07-04T13:45:16.451723+0800
min_mon_release 15 (octopus)
0: v1:192.168.38.16:6789/0 mon.a
1: v1:192.168.38.17:6789/0 mon.b
2: v1:192.168.38.18:6789/0 mon.c

2022-07-04T15:31:20.488+0800 7f857ffff700 10 monclient: handle_config config(0 keys) v1
2022-07-04T15:31:20.488+0800 7f857ffff700 10 monclient: handle_monmap mon_map magic: 0 v1
2022-07-04T15:31:20.488+0800 7f857ffff700 10 monclient: got monmap 3 from mon.a (according to old e3)
2022-07-04T15:31:20.488+0800 7f857ffff700 10 monclient: dump:
epoch 3
fsid 7a9e6d1a-fb5c-11ec-bf4c-0e03b1ba142b
last_changed 2022-07-04T13:45:31.648208+0800
created 2022-07-04T13:45:16.451723+0800
min_mon_release 15 (octopus)
0: v1:192.168.38.16:6789/0 mon.a
1: v1:192.168.38.17:6789/0 mon.b
2: v1:192.168.38.18:6789/0 mon.c

2022-07-04T15:31:20.489+0800 7f857e7fc700 4 set_mon_vals no callback set
2022-07-04T15:31:20.523+0800 7f857ffff700 10 monclient: handle_mon_command_ack 1 [{"prefix": "get_command_descriptions"}]
2022-07-04T15:31:20.523+0800 7f857ffff700 10 monclient: _finish_command 1 = -1 wrong fsid
Error EPERM: problem getting command descriptions from mon.
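
A minimal sketch, not part of the original report, for scanning a monclient debug log like the one above. It collects every "fsid" line from the monmap dumps and every non-zero "_finish_command" result, which makes the all-zero fsid in the second build_initial_monmap dump easy to spot next to the "wrong fsid" failure. The function name scan_monclient_log and the regular expressions are assumptions for illustration; the sample lines are copied from the log above.

```python
import re

ZERO_FSID = "00000000-0000-0000-0000-000000000000"
# Matches monmap dump lines such as "fsid 7a9e6d1a-...".
FSID_RE = re.compile(r"^fsid\s+([0-9a-f-]{36})\s*$")
# Matches lines such as "_finish_command 1 = -1 wrong fsid".
FINISH_RE = re.compile(r"_finish_command\s+(\d+)\s+=\s+(-?\d+)\s*(.*)$")


def scan_monclient_log(text):
    """Return the fsids seen and any failed commands found in a log (hypothetical helper)."""
    fsids = []
    failures = []
    for raw in text.splitlines():
        line = raw.strip()
        m = FSID_RE.match(line)
        if m:
            fsids.append(m.group(1))
            continue
        m = FINISH_RE.search(line)
        if m and int(m.group(2)) != 0:
            # (command tid, return code, message)
            failures.append((int(m.group(1)), int(m.group(2)), m.group(3).strip()))
    return {
        "fsids": fsids,
        "zero_fsid_seen": ZERO_FSID in fsids,
        "failures": failures,
    }


# Sample lines taken from the log in this comment.
SAMPLE = """\
epoch 0
fsid 00000000-0000-0000-0000-000000000000
epoch 3
fsid 7a9e6d1a-fb5c-11ec-bf4c-0e03b1ba142b
2022-07-04T15:31:20.523+0800 7f857ffff700 10 monclient: _finish_command 1 = -1 wrong fsid
"""

result = scan_monclient_log(SAMPLE)
print(result)
```

A log in which zero_fsid_seen is true and failures contains a "wrong fsid" entry matches the pattern shown here; the sketch only reports what the log contains and does not diagnose the cause.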

Actions #6

Updated by liqun zhang almost 2 years ago

Version 15.2.13.
With cephx disabled, running "ceph -s" once per second reproduces this error with high probability. The log is above.

Actions #7

Updated by liqun zhang almost 2 years ago

#!/usr/bin/bash
# Run "ceph -s" once per second and append its output to /tmp/o.log.
while true
do
    date >> /tmp/o.log
    ceph -s >> /tmp/o.log 2>&1
    sleep 1
    echo '' >> /tmp/o.log
    echo '---------------------------' >> /tmp/o.log
    echo '' >> /tmp/o.log
done

This is the reproduction script.
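
For anyone who prefers a bounded loop that stops at the first failure, here is a rough Python sketch of the same idea. It is not from the original comment. The helper name run_until_failure and the failure test are assumptions: the sketch treats a nonzero exit status or the text "Error EPERM" in the output as a failure, and it has not been confirmed here that the ceph CLI exits nonzero in this case.

```python
import subprocess
import time


def run_until_failure(cmd, attempts=10, delay=1.0, marker="Error EPERM"):
    """Run cmd repeatedly; return (attempt, output) for the first failure, else None.

    Hypothetical helper: a run counts as failed if it exits nonzero or if
    `marker` appears in its combined stdout/stderr.
    """
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        if proc.returncode != 0 or marker in proc.stdout:
            return attempt, proc.stdout
        time.sleep(delay)
    return None


# Example (requires a reachable cluster, so it is left commented out):
# hit = run_until_failure(["ceph", "-s"], attempts=600, delay=1.0)
# print("no failure seen" if hit is None else "failed on attempt %d:\n%s" % hit)
```

Stopping at the first failure keeps the captured output short, which may make it easier to attach to a new ticket.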

Actions #8

Updated by Greg Farnum almost 2 years ago

Can you make a new ticket with your details and link to this one? We may have recreated a similar issue but the details will be quite different and somebody else will need to work on it.

Actions #9

Updated by Radoslaw Zarzynski almost 2 years ago

  • Related to Bug #56487: Error EPERM: problem getting command descriptions from mon, when execute "ceph -s". added