Project

General

Profile

Actions

Bug #57287

closed

orch/cephadm: 'mount -t nfs 10.0.31.109:/foouser /mnt/foo' failing in nfs-rgw-ingress tests

Added by Adam King over 1 year ago. Updated over 1 year ago.

Status:
Closed
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

seeing nfs-ingress tests failing with

Command failed on smithi109 with status 32: "sudo TESTDIR=/home/ubuntu/cephtest bash -ex -c 'mount -t nfs 10.0.31.109:/foouser /mnt/foo'" 

It seems the nfs daemons cephadm is creating are failing to come up at all. I grabbed the logs from the nfs daemon but there wasn't much I could take away from it other than "free(): invalid pointer" looks like an issue.

-- Logs begin at Sun 2022-08-21 12:17:46 UTC, end at Wed 2022-08-24 17:00:40 UTC. --
Aug 24 16:54:19 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:54:20 smithi170 bash[47617]: 24/08/2022 16:54:20 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] rados_kv_traverse :CLIENT ID :EVENT :Failed to lst kv ret=-2
Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] rados_cluster_read_clids :CLIENT ID :EVENT :Failed to traverse recovery db: -2
Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:54:21 smithi170 bash[47617]: 24/08/2022 16:54:21 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] rados_cluster_end_grace :CLIENT ID :EVENT :Failed to remove rec-0000000000000001:nfs.foo.0: -2
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] main :NFS STARTUP :WARN :No export entries found in configuration file !!!
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] lower_my_caps :NFS STARTUP :EVENT :CAP_SYS_RESOURCE was successfully removed for proper quota management in >
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] lower_my_caps :NFS STARTUP :EVENT :currently set capabilities are: cap_chown,cap_dac_override,cap_fowner,cap>
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_pkginit :DBUS :CRIT :dbus_bus_get failed (Failed to connect to socket /run/dbus/system_bus_socket: >
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Init_svc :DISP :CRIT :Cannot acquire credentials for principal nfs
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gsh_dbus_register_path :DBUS :CRIT :dbus_connection_register_object_path called with no DBUS connection
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin thread initialized
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT :Callback creds directory (/var/run/ganesha) already exists
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] find_keytab_entry :NFS CB :WARN :Configuration file does not specify default realm while getting default rea>
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] gssd_refresh_krb5_machine_credential :NFS CB :CRIT :ERROR: gssd_refresh_krb5_machine_credential: no usable k>
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN :gssd_refresh_krb5_machine_credential failed (-1765328160:2)
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :Starting delayed executor.
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :gsh_dbusthread was started successfully
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[dbus] gsh_dbus_thread :DBUS :CRIT :DBUS not initialized, service thread exiting
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[dbus] gsh_dbus_thread :DBUS :EVENT :shutdown
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :admin thread was started successfully
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :reaper thread was started successfully
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_Start_threads :THREAD :EVENT :General fridge was started successfully
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start :NFS STARTUP :EVENT :             NFS SERVER INITIALIZED
Aug 24 16:54:24 smithi170 bash[47617]: 24/08/2022 16:54:24 : epoch 630657bc : smithi170 : ganesha.nfsd-7[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
Aug 24 16:54:38 smithi170 bash[47617]: 24/08/2022 16:54:38 : epoch 630657bc : smithi170 : ganesha.nfsd-7[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 24 16:54:38 smithi170 bash[47617]: free(): invalid pointer
Aug 24 16:54:41 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:54:41 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:54:51 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 1.
Aug 24 16:54:51 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:54:51 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:54:55 smithi170 bash[48180]: 24/08/2022 16:54:55 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:55:01 smithi170 bash[48180]: 24/08/2022 16:55:01 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:55:01 smithi170 bash[48180]: 24/08/2022 16:55:01 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:55:07 smithi170 bash[48180]: 24/08/2022 16:55:07 : epoch 630657df : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:55:07 smithi170 bash[48180]: free(): invalid pointer
Aug 24 16:55:08 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:55:08 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:55:18 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 2.
Aug 24 16:55:18 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:55:18 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:55:20 smithi170 bash[48803]: 24/08/2022 16:55:20 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:55:26 smithi170 bash[48803]: 24/08/2022 16:55:26 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:55:26 smithi170 bash[48803]: 24/08/2022 16:55:26 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:55:32 smithi170 bash[48803]: 24/08/2022 16:55:32 : epoch 630657f8 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:55:32 smithi170 bash[48803]: free(): invalid pointer
Aug 24 16:55:33 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:55:33 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:55:43 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 3.
Aug 24 16:55:43 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:55:43 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:55:45 smithi170 bash[49834]: 24/08/2022 16:55:45 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:55:51 smithi170 bash[49834]: 24/08/2022 16:55:51 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:55:51 smithi170 bash[49834]: 24/08/2022 16:55:51 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:55:57 smithi170 bash[49834]: 24/08/2022 16:55:57 : epoch 63065811 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:55:57 smithi170 bash[49834]: free(): invalid pointer
Aug 24 16:55:58 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:55:58 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:56:08 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 4.
Aug 24 16:56:08 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:56:08 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:56:09 smithi170 bash[50002]: 24/08/2022 16:56:09 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:56:15 smithi170 bash[50002]: 24/08/2022 16:56:15 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:56:15 smithi170 bash[50002]: 24/08/2022 16:56:15 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:56:22 smithi170 bash[50002]: 24/08/2022 16:56:22 : epoch 63065829 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:56:22 smithi170 bash[50002]: free(): invalid pointer
Aug 24 16:56:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:56:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:56:33 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 5.
Aug 24 16:56:33 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:56:33 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:56:34 smithi170 bash[50165]: 24/08/2022 16:56:34 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:56:40 smithi170 bash[50165]: 24/08/2022 16:56:40 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:56:40 smithi170 bash[50165]: 24/08/2022 16:56:40 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:56:46 smithi170 bash[50165]: 24/08/2022 16:56:46 : epoch 63065842 : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:56:46 smithi170 bash[50165]: free(): invalid pointer
Aug 24 16:56:47 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:56:47 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:56:57 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 6.
Aug 24 16:56:57 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:56:57 smithi170 systemd[1]: Started Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] init_logging :LOG :NULL :LOG: Setting log level for all components to NIV_EVENT
Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 4.0.8
Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Aug 24 16:56:59 smithi170 bash[50803]: 24/08/2022 16:56:59 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Aug 24 16:57:05 smithi170 bash[50803]: 24/08/2022 16:57:05 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Aug 24 16:57:05 smithi170 bash[50803]: 24/08/2022 16:57:05 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Aug 24 16:57:11 smithi170 bash[50803]: 24/08/2022 16:57:11 : epoch 6306585b : smithi170 : ganesha.nfsd-7[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Aug 24 16:57:11 smithi170 bash[50803]: free(): invalid pointer
Aug 24 16:57:12 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Main process exited, code=exited, status=134/n/a
Aug 24 16:57:12 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:57:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Scheduled restart job, restart counter is at 7.
Aug 24 16:57:22 smithi170 systemd[1]: Stopped Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.
Aug 24 16:57:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Start request repeated too quickly.
Aug 24 16:57:22 smithi170 systemd[1]: ceph-cccdef92-23cb-11ed-8431-001a4aab830c@nfs.foo.0.0.smithi170.ototfp.service: Failed with result 'exit-code'.
Aug 24 16:57:22 smithi170 systemd[1]: Failed to start Ceph nfs.foo.0.0.smithi170.ototfp for cccdef92-23cb-11ed-8431-001a4aab830c.

All these runs were either just main branch or main branch + https://github.com/ceph/ceph/pull/47674 which shouldn't affect this

https://pulpito.ceph.com/adking-2022-08-24_13:29:10-orch:cephadm-wip-adk-main-testing-2022-08-24-0725-distro-default-smithi/6988943
https://pulpito.ceph.com/adking-2022-08-24_13:29:10-orch:cephadm-wip-adk-main-testing-2022-08-24-0725-distro-default-smithi/6989005
https://pulpito.ceph.com/adking-2022-08-23_23:33:25-orch:cephadm-wip-adk-main-testing-2022-08-23-1319-distro-default-smithi/6988230
https://pulpito.ceph.com/adking-2022-08-23_23:33:25-orch:cephadm-wip-adk-main-testing-2022-08-23-1319-distro-default-smithi/6988234
https://pulpito.ceph.com/adking-2022-08-23_23:33:25-orch:cephadm-wip-adk-main-testing-2022-08-23-1319-distro-default-smithi/6988235

Actions #1

Updated by Adam King over 1 year ago

  • Translation missing: en.field_tag_list set to test-failure
Actions #2

Updated by Adam King over 1 year ago

in case it's useful, the ganesha.conf for the failed nfs daemon

# This file is generated by cephadm.
NFS_CORE_PARAM {
        Enable_NLM = false;
        Enable_RQUOTA = false;
        Protocols = 4;
        NFS_Port = 12049;
}

NFSv4 {
        Delegations = false;
        RecoveryBackend = 'rados_cluster';
        Minor_Versions = 1, 2;
}

RADOS_KV {
        UserId = "nfs.foo.0.0.smithi170.ototfp";
        nodeid = "nfs.foo.0";
        pool = ".nfs";
        namespace = "foo";
}

RADOS_URLS {
        UserId = "nfs.foo.0.0.smithi170.ototfp";
        watch_url = "rados://.nfs/foo/conf-nfs.foo";
}

RGW {
        cluster = "ceph";
        name = "client.nfs.foo.0.0.smithi170.ototfp-rgw";
}

%url    rados://.nfs/foo/conf-nfs.foo

and the docker run command used to start the nfs daemon

/usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ganesha.nfsd --init --name ceph-cccdef92-23cb-11ed-8431-001a4aab830c-nfs-foo-0-0-smithi170-ototfp -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:34443e18785693193f48af244f994b89243b90554b6de6b2b11ebb66bbbc6ff7 -e NODE_NAME=smithi170 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_CONF=/etc/ceph/ceph.conf -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/config:/etc/ceph/ceph.conf:z -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/keyring:/etc/ceph/keyring:z -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/etc/ganesha:/etc/ganesha:z -v /var/lib/ceph/cccdef92-23cb-11ed-8431-001a4aab830c/nfs.foo.0.0.smithi170.ototfp/keyring.rgw:/var/lib/ceph/radosgw/ceph-nfs.foo.0.0.smithi170.ototfp-rgw/keyring:z quay.ceph.io/ceph-ci/ceph@sha256:34443e18785693193f48af244f994b89243b90554b6de6b2b11ebb66bbbc6ff7 -F -L STDERR -N NIV_EVENT

Actions #3

Updated by Adam King over 1 year ago

  • Subject changed from orch/cephadm: 'mount -t nfs 10.0.31.109:/foouser /mnt/foo' failing in nfs-ingress tests to orch/cephadm: 'mount -t nfs 10.0.31.109:/foouser /mnt/foo' failing in nfs-rgw-ingress tests
Actions #4

Updated by Ramana Raja over 1 year ago

I see that the Ganesha version is 4.0.8

Looking at the ganesha log, it seems like the ganesha server hits "free(): invalid pointer" after it received a SIGHUP. When mgr/nfs module create a NFS RGW export, it sends a SIGHUP to ganesha by notifying a RADOS object [1].

You can fetch more detailed NFS-Ganesha logs by following the example to change log level in https://docs.ceph.com/en/latest/mgr/nfs/#set-customized-nfs-ganesha-configuration . Maybe a NFS-Ganesha/RGW expert would need that to debug further?

[1] https://github.com/ceph/ceph/blob/main/src/pybind/mgr/nfs/export.py#L670
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/nfs/export.py#L125

Actions #5

Updated by Ramana Raja over 1 year ago

This could be related to the issue reported in librados testing, https://tracker.ceph.com/issues/57163

Actions #6

Updated by Laura Flores over 1 year ago

Tracker #57163 was resolved by compiling with c++20. Perhaps that could also help in this case?

Actions #7

Updated by Adam King over 1 year ago

  • Status changed from New to Closed

closing as this stopped happening a long time ago. There does happen to be another similar looking issue right now (yet to open tracker, will soon), but it's not the same as this one

Actions

Also available in: Atom PDF