Bug #45819

cephadm: Possible error in deploying-nfs-ganesha docs

Added by Zac Dover almost 4 years ago. Updated over 3 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
documentation
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Simon Sutter sent an email to ceph-users that included this:

https://docs.ceph.com/docs/master/cephadm/install/#deploying-nfs-ganesha

Sorry, always the wrong button...

So I ran the command:

ceph orch apply nfs cephnfs cephfs.backuptest.data
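
(For reference: the Octopus-era documentation describes this command as taking a service id, a RADOS pool, and an optional namespace, roughly:

ceph orch apply nfs <svc_id> <pool> <namespace>

so here "cephnfs" is the service id and "cephfs.backuptest.data" is being used as the pool that holds ganesha's configuration objects.)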

And there is now a non-working container:

ceph orch ps:
nfs.cephnfs.testnode1            testnode1  error          6m ago     71m  <unknown>  docker.io/ceph/ceph:v15              <unknown>     <unknown>

journalctl tells me this:

Jun 02 15:17:45 testnode1 systemd[1]: Starting Ceph nfs.cephnfs.testnode1 for 915cdf28-8f66-11ea-bb83-ac1f6b4cd516...
Jun 02 15:17:45 testnode1 podman[63413]: Error: no container with name or ID ceph-915cdf28-8f66-11ea-bb83-ac1f6b4cd516-nfs.cephnfs.testnode1 found: no such container
Jun 02 15:17:45 testnode1 systemd[1]: Started Ceph nfs.cephnfs.testnode1 for 915cdf28-8f66-11ea-bb83-ac1f6b4cd516.
Jun 02 15:17:45 testnode1 podman[63434]: 2020-06-02 15:17:45.867685349 +0200 CEST m=+0.080338785 container create 7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.196760186 +0200 CEST m=+0.409413617 container init 7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11e>
Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.211149759 +0200 CEST m=+0.423803191 container start 7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.21122888 +0200 CEST m=+0.423882373 container attach 7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
Jun 02 15:17:46 testnode1 bash[63432]: rados_connect: -13
Jun 02 15:17:46 testnode1 bash[63432]: Can't connect to cluster: -13
Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.300445833 +0200 CEST m=+0.513099326 container died 7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11e>
Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.391730251 +0200 CEST m=+0.604383723 container remove 7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.496154808 +0200 CEST m=+0.085374929 container create aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.81399203 +0200 CEST m=+0.403212198 container init aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11ea>
Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.828546918 +0200 CEST m=+0.417767036 container start aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.828661425 +0200 CEST m=+0.417881609 container attach aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:46 testnode1 bash[63432]: 02/06/2020 13:17:46 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 3.2
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] main :NFS STARTUP :WARN :No export entries found in configuration file !!!
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] lower_my_caps :NFS STARTUP :EVENT :CAP_SYS_RESOURCE was successfully removed for proper quota management in FSAL
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] lower_my_caps :NFS STARTUP :EVENT :currenty set capabilities are: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill>
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] gsh_dbus_pkginit :DBUS :CRIT :dbus_bus_get failed (Failed to connect to socket /run/dbus/system_bus_socket: No such file or d>
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] nfs_Init_svc :DISP :CRIT :Cannot acquire credentials for principal nfs
Jun 02 15:17:48 testnode1 bash[63432]: 02/06/2020 13:17:48 : epoch 5ed6517a : testnode1 : ganesha.nfsd-1[main] __Register_program :DISP :MAJ :Cannot register NFS V3 on UDP
Jun 02 15:17:48 testnode1 podman[63531]: 2020-06-02 15:17:48.976740123 +0200 CEST m=+2.565960305 container died aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11e>
Jun 02 15:17:49 testnode1 podman[63531]: 2020-06-02 15:17:49.051611802 +0200 CEST m=+2.640831963 container remove aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:49 testnode1 systemd[1]: ceph-915cdf28-8f66-11ea-bb83-ac1f6b4cd516@nfs.cephnfs.testnode1.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jun 02 15:17:49 testnode1 podman[64059]: 2020-06-02 15:17:49.158593029 +0200 CEST m=+0.082356027 container create d55e6b6797ad129c6de503c0f6fac6b127fcf978ae44b636a54b0b93ad5010b2 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:49 testnode1 podman[64059]: 2020-06-02 15:17:49.481433978 +0200 CEST m=+0.405196986 container init d55e6b6797ad129c6de503c0f6fac6b127fcf978ae44b636a54b0b93ad5010b2 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11e>
Jun 02 15:17:49 testnode1 podman[64059]: 2020-06-02 15:17:49.495210618 +0200 CEST m=+0.418973669 container start d55e6b6797ad129c6de503c0f6fac6b127fcf978ae44b636a54b0b93ad5010b2 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
Jun 02 15:17:49 testnode1 podman[64059]: 2020-06-02 15:17:49.495293786 +0200 CEST m=+0.419056865 container attach d55e6b6797ad129c6de503c0f6fac6b127fcf978ae44b636a54b0b93ad5010b2 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:49 testnode1 bash[64057]: rados_connect: -13
Jun 02 15:17:49 testnode1 bash[64057]: Can't connect to cluster: -13
Jun 02 15:17:49 testnode1 podman[64059]: 2020-06-02 15:17:49.55130226 +0200 CEST m=+0.475065324 container died d55e6b6797ad129c6de503c0f6fac6b127fcf978ae44b636a54b0b93ad5010b2 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11ea>
Jun 02 15:17:49 testnode1 podman[64059]: 2020-06-02 15:17:49.633569478 +0200 CEST m=+0.557332529 container remove d55e6b6797ad129c6de503c0f6fac6b127fcf978ae44b636a54b0b93ad5010b2 (image=docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
Jun 02 15:17:49 testnode1 systemd[1]: ceph-915cdf28-8f66-11ea-bb83-ac1f6b4cd516@nfs.cephnfs.testnode1.service: Failed with result 'exit-code'.

I can see that one container fails to connect to the cluster, but where can I find out why?
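
For context: "rados_connect: -13" is librados returning -EACCES (errno 13, permission denied), which usually points at missing or insufficient cephx credentials for the daemon's client entity. A minimal sketch of where one might look (the auth entity name below is a guess based on cephadm's usual <type>.<svc_id>.<host> daemon naming):

$ ceph auth ls | grep nfs
$ ceph auth get client.nfs.cephnfs.testnode1
$ cephadm logs --name nfs.cephnfs.testnode1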

History

#1 Updated by Neha Ojha almost 4 years ago

  • Project changed from RADOS to Orchestrator
  • Category deleted (Documentation)

#2 Updated by Michael Fritch almost 4 years ago

This is likely due to the NFSv3 config, which requires a running rpcbind service.

$ systemctl status rpcbind

This PR removed the rpcbind dependency by configuring Ganesha as NFSv4-only:
https://github.com/ceph/ceph/pull/34382

It was backported to Octopus here:
https://github.com/ceph/ceph/pull/34554
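
For reference, the NFSv4-only change amounts to ganesha configuration along these lines (a sketch of the relevant NFS_CORE_PARAM settings from the PR above, not the exact file cephadm generates):

NFS_CORE_PARAM {
        Enable_NLM = false;
        Enable_RQUOTA = false;
        Protocols = 4;
}

With NLM and RQUOTA disabled and only protocol 4 enabled, ganesha no longer tries to register with rpcbind, which avoids the "Cannot register NFS V3 on UDP" failure seen in the log above.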

#3 Updated by Sebastian Wagner almost 4 years ago

  • Subject changed from Possible error in deploying-nfs-ganesha docs to cephadm: Possible error in deploying-nfs-ganesha docs
  • Description updated (diff)

#4 Updated by Sebastian Wagner over 3 years ago

  • Status changed from New to Can't reproduce
