Bug #57287
orch/cephadm: 'mount -t nfs 10.0.31.109:/foouser /mnt/foo' failing in nfs-rgw-ingress tests
0%
Description
seeing nfs-ingress tests failing with
Command failed on smithi109 with status 32: "sudo TESTDIR=/home/ubuntu/cephtest bash -ex -c 'mount -t nfs 10.0.31.109:/foouser /mnt/foo'"
It seems the nfs daemons cephadm is creating are failing to come up at all. I grabbed the logs from the nfs daemon but there wasn't much I could take away from it other than "free(): invalid pointer" looks like an issue.
-- Logs begin at Sun 2022-08-21 12:17:46 UTC, end at Wed 2022-08-24 17:00:40 UTC. -- Aug 24 16:54:19 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] rados_kv_traverse :CLIENT ID :EVENT :Failed to lst kv ret=-2 Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] rados_cluster_read_clids :CLIENT ID :EVENT :Failed to traverse recovery db: -2 Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] rados_cluster_end_grace :CLIENT ID :EVENT :Failed to remove rec-0000000000000001:nfs.foo.0: -2 Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] main :NFS STARTUP :WARN :No export entries found in configuration file !!! Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] lower_my_caps :NFS STARTUP :EVENT :CAP_SYS_RESOURCE was successfully removed for proper quota management in > Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] lower_my_caps :NFS STARTUP :EVENT :currently set capabilities are: cap_chown,cap_dac_override,cap_fowner,cap> Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_pkginit :DBUS :CRIT :dbus_bus_get failed (Failed to connect to socket /run/dbus/system_bus_socket: > Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Init_svc :DISP :CRIT :Cannot acquire credentials for principal nfs Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin thread initialized Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT :Callback creds directory (/var/run/ganesha) already exists Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] find_keytab_entry :NFS CB :WARN :Configuration file does not specify default realm while getting default rea> Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gssd_refresh_krb5_machine_credential :NFS CB :CRIT :ERROR: gssd_refresh_krb5_machine_credential: no usable k> Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN :gssd_refresh_krb5_machine_credential failed (-1765328160:2) Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :Starting delayed executor. Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :gsh_dbusthread was started successfully Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[dbus] gsh_dbus_thread :DBUS :CRIT :DBUS not initialized, service thread exiting Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[dbus] gsh_dbus_thread :DBUS :EVENT :shutdown Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :admin thread was started successfully Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :reaper thread was started successfully Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :General fridge was started successfully Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start :NFS STARTUP :EVENT :------------------------------------------------- Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start :NFS STARTUP :EVENT : NFS SERVER INITIALIZED Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start :NFS STARTUP :EVENT :------------------------------------------------- Aug 24 16:54:38 smithi170 bash[47617]: 24/08/2022 16:54:38 : epoch 630657bc : smithi170 : ganesha.nfsd-7[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload Aug 24 16:54:38 smithi170 bash[47617]: free(): invalid pointer Aug 24 16:54:41 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:54:41 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:54:51 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 1. Aug 24 16:54:51 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:54:51 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:55:01 smithi170 bash[48180]: 24/08/2022 16:55:01 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:55:01 smithi170 bash[48180]: 24/08/2022 16:55:01 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:55:07 smithi170 bash[48180]: 24/08/2022 16:55:07 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:55:07 smithi170 bash[48180]: free(): invalid pointer Aug 24 16:55:08 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:55:08 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:55:18 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 2. Aug 24 16:55:18 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:55:18 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:55:26 smithi170 bash[48803]: 24/08/2022 16:55:26 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:55:26 smithi170 bash[48803]: 24/08/2022 16:55:26 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:55:32 smithi170 bash[48803]: 24/08/2022 16:55:32 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:55:32 smithi170 bash[48803]: free(): invalid pointer Aug 24 16:55:33 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:55:33 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:55:43 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 3. Aug 24 16:55:43 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:55:43 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:55:51 smithi170 bash[49834]: 24/08/2022 16:55:51 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:55:51 smithi170 bash[49834]: 24/08/2022 16:55:51 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:55:57 smithi170 bash[49834]: 24/08/2022 16:55:57 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:55:57 smithi170 bash[49834]: free(): invalid pointer Aug 24 16:55:58 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:55:58 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:56:08 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 4. Aug 24 16:56:08 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:56:08 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:56:15 smithi170 bash[50002]: 24/08/2022 16:56:15 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:56:15 smithi170 bash[50002]: 24/08/2022 16:56:15 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:56:22 smithi170 bash[50002]: 24/08/2022 16:56:22 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:56:22 smithi170 bash[50002]: free(): invalid pointer Aug 24 16:56:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:56:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:56:33 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 5. Aug 24 16:56:33 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:56:33 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:56:40 smithi170 bash[50165]: 24/08/2022 16:56:40 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:56:40 smithi170 bash[50165]: 24/08/2022 16:56:40 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:56:46 smithi170 bash[50165]: 24/08/2022 16:56:46 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:56:46 smithi170 bash[50165]: free(): invalid pointer Aug 24 16:56:47 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:56:47 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:56:57 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 6. Aug 24 16:56:57 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:56:57 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8 Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper. Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized. Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 Aug 24 16:57:05 smithi170 bash[50803]: 24/08/2022 16:57:05 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend Aug 24 16:57:05 smithi170 bash[50803]: 24/08/2022 16:57:05 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0) Aug 24 16:57:11 smithi170 bash[50803]: 24/08/2022 16:57:11 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE Aug 24 16:57:11 smithi170 bash[50803]: free(): invalid pointer Aug 24 16:57:12 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a Aug 24 16:57:12 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:57:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 7. Aug 24 16:57:22 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c. Aug 24 16:57:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Start request repeated too quickly. Aug 24 16:57:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'. Aug 24 16:57:22 smithi170 systemd[1]: Failed to start Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
All these runs were either just main branch or main branch + https://github.com/ceph/ceph/pull/47674 which shouldn't affect this
https://pulpito.ceph.com/adking-2022-08-24_13:29:10-orch:cephadm-wip-adk-main-testing-2022-08-24-0725-distro-default-smithi/6988943
https://pulpito.ceph.com/adking-2022-08-24_13:29:10-orch:cephadm-wip-adk-main-testing-2022-08-24-0725-distro-default-smithi/6989005
https://pulpito.ceph.com/adking-2022-08-23_23:33:25-orch:cephadm-wip-adk-main-testing-2022-08-23-1319-distro-default-smithi/6988230
https://pulpito.ceph.com/adking-2022-08-23_23:33:25-orch:cephadm-wip-adk-main-testing-2022-08-23-1319-distro-default-smithi/6988234
https://pulpito.ceph.com/adking-2022-08-23_23:33:25-orch:cephadm-wip-adk-main-testing-2022-08-23-1319-distro-default-smithi/6988235
History
#1 Updated by Adam King over 1 year ago
- Tags set to test-failure
#2 Updated by Adam King over 1 year ago
in case it's useful, the ganesha.conf for the failed nfs daemon
# This file is generated by cephadm. NFS_CORE_PARAM { Enable_NLM = false; Enable_RQUOTA = false; Protocols = 4; NFS_Port = 12049; } NFSv4 { Delegations = false; RecoveryBackend = 'rados_cluster'; Minor_Versions = 1, 2; } RADOS_KV { UserId = "nfs.foo.0.0.smithi170.ototfp"; nodeid = "nfs.foo.0"; pool = ".nfs"; namespace = "foo"; } RADOS_URLS { UserId = "nfs.foo.0.0.smithi170.ototfp"; watch_url = "rados://.nfs/foo/conf-nfs.foo"; } RGW { cluster = "ceph"; name = "client.nfs.foo.0.0.smithi170.ototfp-rgw"; } %url rados://.nfs/foo/conf-nfs.foo
and the docker run command used to start the nfs daemon
/usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ganesha.nfsd --init --name ceph-cccdef92-23cb-11ed-8431-001a4aab830c-nfs-foo-0-0-smithi170-ototfp -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:34443e18785693193f48af244f994b89243b90554b6de6b2b11ebb66bbbc6ff7 -e NODE_NAME=smithi170 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_CONF=/etc/ceph/ceph.conf -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/config:/etc/ceph/ceph.conf:z -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/keyring:/etc/ceph/keyring:z -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/etc/ganesha:/etc/ganesha:z -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/keyring.rgw:/var/lib/ceph/radosgw/ceph-nfs.foo.0.0.smithi170.ototfp-rgw/keyring:z quay.ceph.io/ceph-ci/ceph@sha256:34443e18785693193f48af244f994b89243b90554b6de6b2b11ebb66bbbc6ff7 -F -L STDERR -N NIV_EVENT
#3 Updated by Adam King over 1 year ago
- Subject changed from orch/cephadm: 'mount -t nfs 10.0.31.109:/foouser /mnt/foo' failing in nfs-ingress tests to orch/cephadm: 'mount -t nfs 10.0.31.109:/foouser /mnt/foo' failing in nfs-rgw-ingress tests
#4 Updated by Ramana Raja over 1 year ago
I see that the Ganesha version is 4.0.8
Looking at the ganesha log, it seems like the ganesha server hits "free(): invalid pointer" after it received a SIGHUP. When mgr/nfs module create a NFS RGW export, it sends a SIGHUP to ganesha by notifying a RADOS object [1].
You can fetch more detailed NFS-Ganesha logs by following the example to change log level in https://docs.ceph.com/en/latest/mgr/nfs/#set-customized-nfs-ganesha-configuration . Maybe a NFS-Ganesha/RGW expert would need that to debug further?
[1] https://github.com/ceph/ceph/blob/main/src/pybind/mgr/nfs/export.py#L670
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/nfs/export.py#L125
#5 Updated by Ramana Raja over 1 year ago
This could be related to the issue reported in librados testing, https://tracker.ceph.com/issues/57163
#6 Updated by Laura Flores over 1 year ago
Tracker #57163 was resolved by compiling with c++20. Perhaps that could also help in this case?
#7 Updated by Adam King over 1 year ago
- Status changed from New to Closed
closing as this stopped happening a long time ago. There does happen to be another similar looking issue right now (yet to open tracker, will soon), but it's not the same as this one