Bug #55850
libceph socket closed
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 4 - irritation
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Crash signature (v1):
Crash signature (v2):
Description
We are having this issue on all of our FUSE clients; it looks like Ceph is randomly closing its socket to our mds0. There is not much info in the Ceph logs about this, and I am unable to find a root cause.
ceph version 14.2.10-392-gb3a13b81cb (b3a13b81cb4dfddec1cd59e7bab1e3e9984c8dd8) nautilus (stable)
dmesg logs
[Tue May 31 20:15:37 2022] libceph: mds0 (1)10.50.1.248:6800 socket closed (con state OPEN)
[Tue May 31 20:15:37 2022] libceph: mds0 (1)10.50.1.248:6800 socket closed (con state OPEN)
[Tue May 31 20:15:37 2022] libceph: mds0 (1)10.50.1.248:6800 socket closed (con state OPEN)
[Tue May 31 20:15:38 2022] libceph: wrong peer, want (1)10.50.1.248:6800/-1186287667, got (1)10.50.1.248:6800/-1535805463
[Tue May 31 20:15:38 2022] libceph: mds0 (1)10.50.1.248:6800 wrong peer at address
[Tue May 31 20:15:38 2022] libceph: wrong peer, want (1)10.50.1.248:6800/-1186287667, got (1)10.50.1.248:6800/-1535805463
[Tue May 31 20:15:38 2022] libceph: mds0 (1)10.50.1.248:6800 wrong peer at address
[Tue May 31 20:15:38 2022] libceph: wrong peer, want (1)10.50.1.248:6800/-1186287667, got (1)10.50.1.248:6800/-1535805463
[Tue May 31 20:15:38 2022] libceph: mds0 (1)10.50.1.248:6800 wrong peer at address
[Tue May 31 20:15:45 2022] ceph: mds0 reconnect start
[Tue May 31 20:15:45 2022] ceph: mds0 reconnect start
[Tue May 31 20:15:45 2022] ceph: mds0 reconnect start
[Tue May 31 20:15:45 2022] ceph: mds0 reconnect success
[Tue May 31 20:15:45 2022] ceph: mds0 reconnect success
[Tue May 31 20:15:45 2022] ceph: mds0 reconnect success
[Tue May 31 20:16:06 2022] ceph: mds0 recovery completed
[Tue May 31 20:16:06 2022] ceph: mds0 recovery completed
[Tue May 31 20:16:06 2022] ceph: mds0 recovery completed
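For reference, a minimal sketch of what we could capture the next time this happens (mds.ceph-node1 is only a placeholder for whichever MDS is active, and the session listing has to be run on that MDS's host via its admin socket):
ceph -s
ceph fs status
ceph health detail
dmesg -T | grep -E 'libceph|ceph:'
# on the active MDS host, with the real daemon id in place of mds.ceph-node1:
ceph daemon mds.ceph-node1 session ls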
Ceph config file
# DeepSea default configuration. Changes in this file will be overwritten on
# package update. Include custom configuration fragments in
# /srv/salt/ceph/configuration/files/ceph.conf.d/[global,osd,mon,mgr,mds,client].conf
[global]
fsid = b799274f-d309-4616-8320-a05dd147c602
mon_initial_members = ceph-node4, ceph-node1, ceph-node2, ceph-node3
mon_host = 10.50.1.250, 10.50.1.248, 10.50.1.249, 10.50.1.247
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 10.50.1.0/24
cluster_network = 10.50.4.0/24
# enable old ceph health format in the json output. This fixes the
# ceph_exporter. This option will only stay until the prometheus plugin takes
# over
mon_health_preluminous_compat = true
mon health preluminous compat warning = false
rbd default features = 3
[mon]
mgr initial modules = dashboard
[mds]
mds_cache_memory_limit = 8589934592
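If it would help, this is roughly the extra MDS-side debug logging we could enable temporarily to get more detail around the disconnects (the levels are a guess and debug_mds in particular is verbose, so we would revert it once an event is captured):
ceph config set mds debug_ms 1
ceph config set mds debug_mds 10
# revert after capturing a disconnect:
ceph config rm mds debug_ms
ceph config rm mds debug_mds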
Thank you!