Bug #45399


NFS Ganesha : Error searching service specs for all nodes after nfs orch apply nfs...(Cephadm)

Added by Selyan Ferry almost 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
Normal
Category:
cephadm
Target version:
-
% Done:

0%

Source:
Tags:
NFS
Backport:
octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Environment:

- 3 hypervisors running CentOS 8.1 (hyp00, hyp01, hyp02)
- 19 OSDs
- cluster upgraded a month ago from Nautilus to the new orchestrator (cephadm adopt...)
- after the upgrade, all services run a 15.2.1 container
- NFS Ganesha config loaded in RADOS, shares created with the Ceph dashboard (URL set to rados://nfs-ganesha/ganesha..) -- a quick way to verify this is sketched below
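For reference, the presence of the Ganesha configuration objects in that pool/namespace can be checked with the rados CLI (this check is an illustration, not part of the original report):

[root@admin ~]# rados -p nfs-ganesha -N ganesha ls

This should list the Ganesha config and export objects referenced by the rados:// URL above.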

When creating a new NFS (Ganesha) service with Cephadm: "ceph orch apply nfs ganesha nfs-ganesha ganesha"
with

- nfs-ganesha : the dedicated pool
- ganesha : the dedicated namespace for this new service

(an equivalent spec-file form is sketched below)
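For comparison, the same service could presumably also be declared through a spec file applied with "ceph orch apply -i" (a sketch assuming the Octopus NFS spec format; the file name and layout are illustrative, not part of the original report):

[root@admin ~]# cat nfs-ganesha.yaml
service_type: nfs
service_id: ganesha
placement:
  count: 1
spec:
  pool: nfs-ganesha
  namespace: ganesha
[root@admin ~]# ceph orch apply -i nfs-ganesha.yaml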

The result is:
---------------

[root@admin ~]# ceph orch apply nfs ganesha nfs-ganesha ganesha
Scheduled nfs update...

[root@admin ~]# ceph orch ls
NAME         RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                             IMAGE ID      
mds.ISOs         2/2  8m ago     8d   count:2    ns01.int.intra:5000/ceph/ceph:v15.2.1  bc83a388465f  
mds.cephfs       3/3  8m ago     4w   count:3    ns01.int.intra:5000/ceph/ceph:v15.2.1  bc83a388465f  
mgr              3/3  8m ago     4w   count:3    ns01.int.intra:5000/ceph/ceph:v15.2.1  bc83a388465f  
mon              3/0  8m ago     -    <no spec>  ns01.int.intra:5000/ceph/ceph:v15.2.1  bc83a388465f  
nfs.ganesha      0/1  -          -    count:1    <unknown>                              <unknown>   
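At this point no nfs daemon is actually running (0/1). A quick way to confirm on the daemon side (illustrative, not part of the original report):

[root@admin ~]# ceph orch ps | grep nfs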

And the following error appears in debug mode:

2020-05-06T13:42:07.199969+0200 mgr.hyp00 [INF] Saving service nfs.ganesha spec with placement count:1
2020-05-06T13:42:07.233173+0200 mgr.hyp00 [DBG] _kick_serve_loop
...
2020-05-06T13:42:07.238257+0200 mgr.hyp00 [DBG] Applying service nfs.ganesha spec
2020-05-06T13:42:07.238443+0200 mgr.hyp00 [DBG] place 1 over all hosts: [HostPlacementSpec(hostname='hyp00.int.intra', network='', name=''), HostPlacementSpec(hostname='hyp02.int.intra', network='', name=''), HostPlacementSpec(hostname='hyp01.int.intra', network='', name='')]
2020-05-06T13:42:07.238608+0200 mgr.hyp00 [DBG] Combine hosts with existing daemons [] + new hosts [HostPlacementSpec(hostname='hyp01.int.intra', network='', name='')]
2020-05-06T13:42:07.238723+0200 mgr.hyp00 [DBG] hosts with daemons: set()
2020-05-06T13:42:07.238842+0200 mgr.hyp00 [INF] Saving service nfs.ganesha spec with placement count:1
2020-05-06T13:42:07.257828+0200 mgr.hyp00 [DBG] Placing nfs.ganesha.hyp01 on host hyp01.int.intra
2020-05-06T13:42:07.258444+0200 mgr.hyp00 [DBG] SpecStore: find spec for nfs.ganesha.hyp01 returned: []
2020-05-06T13:42:07.258980+0200 mgr.hyp00 [WRN] Failed to apply nfs.ganesha spec NFSServiceSpec({'placement': PlacementSpec(count=1), 'service_type': 'nfs', 'service_id': 'ganesha', 'unmanaged': False, 'pool': 'nfs-ganesha', 'namespace': 'ganesha'}): Cannot find service spec nfs.ganesha.hyp01
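(For reference, cephadm debug messages like these can be collected with the commands documented for cephadm troubleshooting; shown here as an illustration, not part of the original report:)

[root@admin ~]# ceph config set mgr mgr/cephadm/log_to_cluster_level debug
[root@admin ~]# ceph log last cephadm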

When I add 3 daemons explicitly on the three nodes, the cluster forks three NFS containers (so the nfs.ganesha service did find the service specs for the dedicated nodes), but the same error appears in the logs ("Cannot find service spec nfs.ganesha.hyp00.hyp00", "nfs.ganesha.hyp01.hyp01" and "nfs.ganesha.hyp02.hyp02").
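The explicit placement was presumably applied with something along these lines (the exact syntax is an assumption, not quoted from the original report):

[root@admin ~]# ceph orch apply nfs ganesha nfs-ganesha ganesha --placement="hyp00.int.intra hyp01.int.intra hyp02.int.intra"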

What's wrong? Is this a documentation misunderstanding or a bug?
