Bug #64502

open

pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main

Added by Patrick Donnelly 2 months ago. Updated 11 days ago.

Status:
New
Priority:
Urgent
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

0%

Source:
Development
Tags:
Backport:
squid,reef,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client, ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Every ceph-fuse mount for quincy fails to unmount in the reef->main upgrade runs:

https://pulpito.ceph.com/pdonnell-2024-02-19_18:28:45-fs:upgrade:mds_upgrade_sequence-wip-batrick-testing-20240215.160715-distro-default-smithi/

2024-02-19T19:17:36.535 INFO:tasks.cephfs.fuse_mount:Running fusermount -u on ubuntu@smithi060.front.sepia.ceph.com...
2024-02-19T19:17:36.535 INFO:teuthology.orchestra.run:Running command with timeout 300
2024-02-19T19:17:36.535 DEBUG:teuthology.orchestra.run.smithi060:> sudo fusermount -u /home/ubuntu/cephtest/mnt.0
2024-02-19T19:17:36.562 INFO:teuthology.orchestra.run:waiting for 300

From: /teuthology/pdonnell-2024-02-19_18:28:45-fs:upgrade:mds_upgrade_sequence-wip-batrick-testing-20240215.160715-distro-default-smithi/7566635/teuthology.log

2024-02-19T19:17:36.799+0000 7f9fa7fff640 20 client.14548 tick
2024-02-19T19:17:36.799+0000 7f9fa7fff640 20 client.14548 collect_and_send_metrics
2024-02-19T19:17:36.799+0000 7f9fa7fff640 20 client.14548 collect_and_send_global_metrics
2024-02-19T19:17:36.799+0000 7f9fa7fff640  1 -- 192.168.0.1:0/854663557 --> [v2:172.21.15.60:6826/3594652577,v1:172.21.15.60:6827/3594652577] -- client_metrics [client_metric_type: READ_LATENCY latency: 5.996942, avg_latency: 0.000330, sq_sum: 86627012816408144, count=17901][client_metric_type: WRITE_LATENCY latency: 23.710221, avg_latency: 0.000407, sq_sum: 1890169673992666112, count=56281][client_metric_type: METADATA_LATENCY latency: 238.430933, avg_latency: 0.005247, sq_sum: 13600282437617256448, count=45341][client_metric_type: CAP_INFO cap_hits: 831286 cap_misses: 14792 num_caps: 0][client_metric_type: DENTRY_LEASE dlease_hits: 67 dlease_misses: 154700 num_dentries: 0][client_metric_type: OPENED_FILES opened_files: 0 total_inodes: 1][client_metric_type: PINNED_ICAPS pinned_icaps: 1 total_inodes: 1][client_metric_type: OPENED_INODES opened_inodes: 0 total_inodes: 1][client_metric_type: READ_IO_SIZES total_ops: 22272 total_size: 3731108728][client_metric_type: WRITE_IO_SIZES total_ops: 56281 total_size: 4270138133] v1 -- 0x7f9fa000b9e0 con 0x5637e76f0e80
2024-02-19T19:17:36.799+0000 7f9fa7fff640 20 client.14548 trim_cache size 1 max 16384
2024-02-19T19:17:36.799+0000 7f9fa7fff640 20 client.14548 upkeep thread waiting interval 1.000000000s

...

2024-02-19T20:23:30.865+0000 7f9fc8e36480  2 client.14548 unmounting

From: /teuthology/pdonnell-2024-02-19_18:28:45-fs:upgrade:mds_upgrade_sequence-wip-batrick-testing-20240215.160715-distro-default-smithi/7566635/remote/smithi060/log/ceph-client.0.log.gz

During teardown of the cluster the unmount eventually proceeds, but it's not clear what was blocking it. I think something was holding the RWRef, preventing unmount from proceeding.
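For context on that guess, the client gates unmount behind a reader/writer-style reference over the mount state. A minimal sketch of that pattern (made-up names, not the actual Client code) looks like this:

// Illustrative sketch only (not the actual Ceph Client code; names are made
// up): a reader/writer reference gate similar in spirit to the RWRef that
// protects the client mount state. Unmount acts as the "writer" and must
// wait for every outstanding "reader" reference to be dropped.
#include <condition_variable>
#include <mutex>

class MountStateGate {
  std::mutex m;
  std::condition_variable cv;
  int readers = 0;          // in-flight client operations holding a read ref
  bool unmounting = false;  // set once unmount has started

public:
  // Taken at the start of a client operation; refused once unmount begins.
  bool get_read_ref() {
    std::lock_guard<std::mutex> l(m);
    if (unmounting)
      return false;
    ++readers;
    return true;
  }

  void put_read_ref() {
    std::lock_guard<std::mutex> l(m);
    if (--readers == 0)
      cv.notify_all();
  }

  // Called by unmount: blocks until all read refs are released. An operation
  // that takes a ref and never drops it would stall unmount right here.
  void begin_unmount_and_wait() {
    std::unique_lock<std::mutex> l(m);
    unmounting = true;
    cv.wait(l, [this] { return readers == 0; });
  }
};

If some code path takes a read ref and never drops it, the writer side (unmount) blocks indefinitely, which would match the hang seen here.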


Related issues (2 open, 1 closed)

Related to CephFS - Bug #64503: client: log message when unmount call is received (Pending Backport) - Patrick Donnelly
Related to CephFS - Bug #64440: mds: reversed encoding of MDSMap max_xattr_size/bal_rank_mask v18.2.1 <-> main (Pending Backport) - Patrick Donnelly
Related to CephFS - Bug #61265: qa: tasks.cephfs.fuse_mount:process failed to terminate after unmount (Duplicate) - Milind Changire

Actions #1

Updated by Patrick Donnelly 2 months ago

  • Related to Bug #64503: client: log message when unmount call is received added
Actions #2

Updated by Patrick Donnelly 2 months ago

  • Assignee set to Patrick Donnelly
Actions #3

Updated by Patrick Donnelly 2 months ago

  • Assignee deleted (Patrick Donnelly)
  • Target version set to v19.1.0
  • Source set to Development
  • Backport set to squid,reef,quincy
Actions #4

Updated by Patrick Donnelly 2 months ago

  • Related to Bug #64440: mds: reversed encoding of MDSMap max_xattr_size/bal_rank_mask v18.2.1 <-> main added
Actions #5

Updated by Patrick Donnelly 2 months ago

Unassigning myself to return to other high priority tasks.

This issue is only revealed by the fix for #64440, which turns on ceph-fuse mounts for fs:upgrade:mds_upgrade_sequence.

Actions #6

Updated by Patrick Donnelly 2 months ago

  • Related to deleted (Bug #64440: mds: reversed encoding of MDSMap max_xattr_size/bal_rank_mask v18.2.1 <-> main)
Actions #7

Updated by Patrick Donnelly 2 months ago

  • Related to Bug #64440: mds: reversed encoding of MDSMap max_xattr_size/bal_rank_mask v18.2.1 <-> main added
Actions #8

Updated by Patrick Donnelly 2 months ago

Also affects pacific and reef v18.2.0 (possibly v18.2.1 too):

https://pulpito.ceph.com/yuriw-2024-02-21_23:06:32-fs-wip-yuri10-testing-2024-02-21-1249-reef-distro-default-smithi/

See fs:upgrade:mds_upgrade_sequence tests failing with:

"reached maximum tries (51) after waiting for 300 seconds "

Actions #9

Updated by Patrick Donnelly 2 months ago

  • Subject changed from client: quincy ceph-fuse fails to unmount after upgrade to main to pacific/quincy/v18.2.0: client: ceph-fuse fails to unmount after upgrade to main
Actions #10

Updated by Venky Shankar 2 months ago

  • Category set to Correctness/Safety
  • Assignee set to Venky Shankar
  • Target version changed from v19.1.0 to v20.0.0
Actions #11

Updated by Venky Shankar 2 months ago

I am working on this. The plan is to gdb the ceph-fuse process after fusermount. Will update by EOD today.

Actions #12

Updated by Venky Shankar 2 months ago

oh well, infra issues now :/

2024-02-26T07:01:58.420 INFO:teuthology.orchestra.run.smithi057.stderr:/bin/podman: stderr Error: initializing source docker://quay.io/ceph/ceph:v18.2.1: Requesting bearer token: invalid status code from registry 502 (Bad Gateway)
Actions #13

Updated by Venky Shankar 2 months ago

Trying my luck with this today - hopefully no infra issues show up.

Actions #14

Updated by Venky Shankar about 2 months ago

Started running into

ceph: stderr Error: OCI runtime error: crun: bpf create ``: Invalid argument

again in

/a/vshankar-2024-02-28_06:39:32-fs:upgrade-main-testing-default-smithi/7575094

Checked with Ilya and it looks like the run just picked up the kernel bits while they were getting redeployed. Will have to run the test again when the latest build trigger finishes.

Actions #15

Updated by Venky Shankar about 2 months ago

Venky Shankar wrote:

Started running into

ceph: stderr Error: OCI runtime error: crun: bpf create ``: Invalid argument

again in

/a/vshankar-2024-02-28_06:39:32-fs:upgrade-main-testing-default-smithi/7575094

Checked with Ilya and it looks like the run just picked up the kernel bits while they were getting redeployed. Will have to run the test again when the latest build trigger finishes.

The kclient-related issue is fixed now. Tried a sample run yesterday:

- https://pulpito.ceph.com/vshankar-2024-02-28_15:21:56-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/

32/32 jobs failed. Onto this today!

Actions #16

Updated by Venky Shankar about 2 months ago

Finally, with the infra and kclient issues set aside, I was able to gdb the ceph-fuse process and add a breakpoint in Client::_unmount(). To my surprise, the function was never invoked when fusermount was run (the gdb setup had `set follow-fork-mode child`, and I did try this locally before doing it on the smithi machines). This is weird. I'll see what else I can come up with.

Actions #17

Updated by Venky Shankar about 2 months ago

Another debug session, this time with `set detach-on-fork on`, which is supposed to let gdb debug both parent and child processes. Had a breakpoint in Client::_unmount:

[New LWP 82130]
[New LWP 82141]
[New LWP 82143]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fada1e9c39a in __futex_abstimed_wait_common () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install ceph-fuse-17.2.7-560.g931b9f17.el9.x86_64
(gdb) set detach-on-fork on
(gdb) b Client::_unmount
Breakpoint 1 at 0x563ac1a8fa00
(gdb) c
Continuing.

fusermount was invoked at this timestamp:

2024-03-04T16:50:24.033 INFO:tasks.cephfs.fuse_mount:Running fusermount -u on ubuntu@smithi003.front.sepia.ceph.com...
2024-03-04T16:50:24.033 DEBUG:teuthology.orchestra.run.smithi003:> sudo fusermount -u /home/ubuntu/cephtest/mnt.0

However, the breakpoint was only triggered during cluster teardown. I noted the timestamp at that point.

Teardown:

2024-03-04T16:55:23.199 ERROR:tasks.cephfs.fuse_mount:process failed to terminate after unmount. This probably indicates a bug within ceph-fuse.
2024-03-04T16:55:23.199 ERROR:teuthology.run_tasks:Manager failed: ceph-fuse
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/run_tasks.py", line 154, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/home/teuthworker/src/github.com_vshankar_ceph_9822b9f8f96d88d1e7441f32961ead3c521c675e/qa/tasks/ceph_fuse.py", line 183, in task
    mount.umount_wait()
  File "/home/teuthworker/src/github.com_vshankar_ceph_9822b9f8f96d88d1e7441f32961ead3c521c675e/qa/tasks/cephfs/fuse_mount.py", line 403, in umount_wait
    run.wait([self.fuse_daemon], timeout)
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/orchestra/run.py", line 473, in wait
    check_time()
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/contextutil.py", line 134, in __call__
    raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: reached maximum tries (51) after waiting for 300 seconds

gdb breakpoint:

[Detaching after vfork from child process 128806]
[Detaching after vfork from child process 128807]
[Detaching after vfork from child process 129019]
[Detaching after vfork from child process 129043]
[Thread 0x7fad81600640 (LWP 82141) exited]
[Thread 0x7fad6d600640 (LWP 82143) exited]
[Thread 0x7fad63400640 (LWP 82130) exited]
[Thread 0x7fad79600640 (LWP 82083) exited]
[Thread 0x7fad7aa00640 (LWP 82082) exited]
[Thread 0x7fad6ea00640 (LWP 82081) exited]
[Thread 0x7fad6e000640 (LWP 82079) exited]
[Thread 0x7fad78c00640 (LWP 82078) exited]
[Thread 0x7fad63e00640 (LWP 82025) exited]
[Thread 0x7fad6cc00640 (LWP 82014) exited]

Thread 1 "ceph-fuse" hit Breakpoint 1, 0x0000563ac1a8fa00 in Client::_unmount(bool) ()
(gdb) n
Single stepping until exit from function _ZN6Client8_unmountEb,
which has no line number information.
[Thread 0x7fad82a00640 (LWP 76392) exited]
[Thread 0x7fad8b400640 (LWP 76385) exited]
0x0000563ac1a2fbc3 in main ()
(gdb) bt
#0  0x0000563ac1a2fbc3 in main ()
(gdb) q
A debugging session is active.

    Inferior 1 [process 76351] will be detached.

Quit anyway? (y or n) y
Detaching from program: /usr/bin/ceph-fuse, process 76351
[Inferior 1 (process 76351) detached]
[vshankar@smithi003 ~]$ date
Mon Mar  4 04:57:52 PM UTC 2024
Actions #18

Updated by Venky Shankar about 2 months ago

Continuing on this today - fusermount(1) is basically invoking umount2(2). Will try to see what's going on.
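
For reference, a hedged sketch of what `fusermount -u <mountpoint>` boils down to (the real tool additionally does privilege checks, /etc/mtab handling, and supports a -z lazy unmount) is:

// Hedged sketch of roughly what fusermount -u does at its core: it reduces
// to a umount2(2) call on the mountpoint.
#include <cstdio>
#include <sys/mount.h>

int main(int argc, char **argv) {
  if (argc < 2) {
    std::fprintf(stderr, "usage: %s <mountpoint>\n", argv[0]);
    return 1;
  }
  // UMOUNT_NOFOLLOW refuses to follow a symlink at the mountpoint.
  if (umount2(argv[1], UMOUNT_NOFOLLOW) != 0) {
    std::perror("umount2");
    return 1;
  }
  return 0;
}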

Actions #19

Updated by Venky Shankar about 2 months ago

This is not as bad as it looks. The ceph-fuse process seems to be exiting as usual - it's somewhere in the qa world where things are timing out.

Actions #20

Updated by Venky Shankar about 2 months ago

Venky Shankar wrote:

This is not as bad as it looks. The ceph-fuse process seems to be exiting as usual - it's somewhere in the qa world where things are timing out.

I take that back. None of the MDSs are active after the upgrade to squid (down:dne).

Actions #21

Updated by Venky Shankar about 2 months ago

Venky Shankar wrote:

Venky Shankar wrote:

This is not as bad as it looks. The ceph-fuse process seems to be exiting as usual - it's somewhere in the qa world where things are timing out.

I take that back. None of the MDSs are active after the upgrade to squid (down:dne).

Scrap that. The test branch had a change[*] that caused the MDS to assert. But I do have some findings to share. I introduced a custom qa change to skip the fusermount and add a long enough sleep to allow debugging the fuse mount.

-        log.info('Unmounting ceph-fuse clients...')
-
-        for info in mounted_by_me.values():
-            # Conditional because an inner context might have umounted it
-            mount = info["mount"]
-            if mount.is_mounted():
-                mount.umount_wait()
-        for remote in remotes:
-            FuseMount.cleanup_stale_netnses_and_bridge(remote)
+        log.info('Not unmounting ceph-fuse clients (hack)...')
+        time.sleep(3600)
+
+        #for info in mounted_by_me.values():
+        #    # Conditional because an inner context might have umounted it
+        #    mount = info["mount"]
+        #    if mount.is_mounted():
+        #        mount.umount_wait()
+        #for remote in remotes:
+        #    FuseMount.cleanup_stale_netnses_and_bridge(remote)

Sample run here with main branch: https://pulpito.ceph.com/vshankar-2024-03-11_05:30:10-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7590762/ which failed, but more importantly I was able to ssh to the node which had the fuse mount.

[vshankar@smithi132 ~]$ findmnt
...
...
└─/home/ubuntu/cephtest/mnt.0                           ceph-fuse     fuse.ceph-fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other

[vshankar@smithi132 ~]$ sudo fusermount -u /home/ubuntu/cephtest/mnt.0
[vshankar@smithi132 ~]$ findmnt

<ceph-fuse mount gone>

... and likewise with the other ceph-fuse mount on smithi187. Client logs also confirm the same.

[*]: https://github.com/ceph/ceph/pull/55616

Actions #22

Updated by Patrick Donnelly about 2 months ago

Venky Shankar wrote:

Venky Shankar wrote:

Venky Shankar wrote:

This is not as bad as it looks. The ceph-fuse process seems to be exiting as usual - it's somewhere in the qa world where things are timing out.

I take that back. None of the MDSs are active after the upgrade to squid (down:dne).

Scrap that. The test branch had a change[*] that caused the MDS to assert. But I do have some findings to share. I introduced a custom qa change to skip the fusermount and add a long enough sleep to allow debugging the fuse mount.

[...]

Sample run here with main branch: https://pulpito.ceph.com/vshankar-2024-03-11_05:30:10-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7590762/ which failed, but more importantly I was able to ssh to the node which had the fuse mount.

[...]

... and likewise with the other ceph-fuse mount on smithi187. Client logs also confirm the same.

[*]: https://github.com/ceph/ceph/pull/55616

This is the worst kind of bug. I don't have a great suggestion beyond stating that the obvious yet highly improbable explanation is that the QA suite executing `fusermount -u ...` is clearly not actually running the command. It doesn't make sense but something odd is going on. Perhaps it's a user permission (fusermount uid != ceph-fuse uid) issue.

Actions #23

Updated by Venky Shankar about 2 months ago

Patrick Donnelly wrote:

Venky Shankar wrote:

Venky Shankar wrote:

Venky Shankar wrote:

This is not as bad as it looks. The ceph-fuse process seems to be exiting as usual - it's somewhere in the qa world where things are timing out.

I take that back. None of the MDSs are active after the upgrade to squid (down:dne).

Scrap that. The test branch had a change[*] that caused the MDS to assert. But I do have some findings to share. I introduced a custom qa change to skip the fusermount and add a long enough sleep to allow debugging the fuse mount.

[...]

Sample run here with main branch: https://pulpito.ceph.com/vshankar-2024-03-11_05:30:10-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7590762/ which failed, but more importantly I was able to ssh to the node which had the fuse mount.

[...]

... and likewise with the other ceph-fuse mount on smithi187. Client logs also confirm the same.

[*]: https://github.com/ceph/ceph/pull/55616

This is the worst kind of bug. I don't have a great suggestion beyond stating that the obvious yet highly improbable explanation is that the QA suite executing `fusermount -u ...` is clearly not actually running the command. It doesn't make sense but something odd is going on. Perhaps it's a user permission (fusermount uid != ceph-fuse uid) issue.

I strace'd the fusermount call: the process exited normally (exit code 0) and the umount2(2) call returned 0 (success).

Actions #25

Updated by Venky Shankar 29 days ago

This issue is likely related to the libfuse library in centos.9. To confirm this, I'll run the tests with centos.8. Will report in a day or two.

Actions #27

Updated by Venky Shankar 26 days ago

  • Related to Bug #61265: qa: tasks.cephfs.fuse_mount:process failed to terminate after unmount added
Actions #29

Updated by Laura Flores 17 days ago

/a/yuriw-2024-04-09_14:58:21-upgrade-wip-yuri4-testing-2024-04-08-1432-distro-default-smithi/7648857

Actions #30

Updated by Venky Shankar 15 days ago · Edited

I have a custom quincy branch (patched with debug in ceph-fuse/fuse_ll). That should give us enough debug to see what's happening.

EDIT: https://shaman.ceph.com/builds/ceph/wip-64502-quincy-2/

Actions #31

Updated by Venky Shankar 15 days ago

diff --git a/src/ceph_fuse.cc b/src/ceph_fuse.cc
index 3fa5346b463..061d9f576e4 100644
--- a/src/ceph_fuse.cc
+++ b/src/ceph_fuse.cc
@@ -331,14 +331,19 @@ int main(int argc, const char **argv, const char *envp[]) {
     cerr << "ceph-fuse[" << getpid() << "]: starting fuse" << std::endl;
     tester.init(cfuse, client);
     tester.create("tester");
+    cout << "ceph_fuse: starting fuse loop" << std::endl;
     r = cfuse->loop();
+    cout << "ceph_fuse: fuse loop returned - joining" << std::endl;
     tester.join(&tester_rp);
+    cout << "ceph_fuse: fuse loop returned - joined" << std::endl;
     tester_r = static_cast<int>(reinterpret_cast<uint64_t>(tester_rp));
     cerr << "ceph-fuse[" << getpid() << "]: fuse finished with error " << r
         << " and tester_r " << tester_r <<std::endl;

   out_client_unmount:
+    cout << "calling unmount" << std::endl;
     client->unmount();
+    cout << "calling finalize" << std::endl;
     cfuse->finalize();
   out_shutdown:
     icp.stop();
diff --git a/src/client/fuse_ll.cc b/src/client/fuse_ll.cc
index 109aa25094d..765fdd7ffc7 100644
--- a/src/client/fuse_ll.cc
+++ b/src/client/fuse_ll.cc
@@ -1612,21 +1612,31 @@ int CephFuse::Handle::loop()
   auto fuse_multithreaded = client->cct->_conf.get_val<bool>(
     "fuse_multithreaded");
   if (fuse_multithreaded) {
+    derr << "multithreaded fuse" << dendl;
 #if FUSE_VERSION >= FUSE_MAKE_VERSION(3, 1)
     {
       struct fuse_loop_config conf = {
         clone_fd: opts.clone_fd,
         max_idle_threads: opts.max_idle_threads
       };
-      return fuse_session_loop_mt(se, &conf);
+      int r = fuse_session_loop_mt(se, &conf);
+      derr << "fuse_session_loop_mt 1 -- done" << dendl;
+      return r;
     }
 #elif FUSE_VERSION >= FUSE_MAKE_VERSION(3, 0)
-    return fuse_session_loop_mt(se, opts.clone_fd);
+    int r = fuse_session_loop_mt(se, opts.clone_fd);
+    derr << "fuse_session_loop_mt 2 -- done" << dendl;
+    return r;
 #else
-    return fuse_session_loop_mt(se);
+    int r = fuse_session_loop_mt(se);
+    derr << "fuse_session_loop_mt 3 -- done" << dendl;
+    return r;
 #endif
   } else {
-    return fuse_session_loop(se);
+    derr << "singlethreaded fuse" << dendl;
+    int r = fuse_session_loop(se);
+    derr << "fuse_session_loop_mt 4 -- done" << dendl;
+    return r;
   }
 }

And I patched up the yaml to use the custom quincy build as the upgrade target. Here is the run: https://pulpito.ceph.com/vshankar-2024-04-12_13:03:29-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7653889/

Interestingly:

vshankar@teuthology:/a/vshankar-2024-04-12_13:03:29-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7653889$ egrep "ceph_fuse\:|fusermount|unmount|finalize|fuse_session_loop_mt" teuthology.log
2024-04-12T13:38:54.194 INFO:tasks.ceph_fuse:Running ceph_fuse task...
2024-04-12T13:38:54.195 INFO:tasks.ceph_fuse:config is {'client.0': {}, 'client.1': {}}
2024-04-12T13:38:54.195 INFO:tasks.ceph_fuse:client.0 config is {}
2024-04-12T13:38:54.195 INFO:tasks.ceph_fuse:client.1 config is {}
2024-04-12T13:38:54.522 INFO:tasks.ceph_fuse:Mounting ceph-fuse clients...
2024-04-12T13:39:01.752 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi060.stdout:ceph_fuse: starting fuse loop
2024-04-12T13:39:10.578 INFO:tasks.cephfs.fuse_mount.ceph-fuse.1.smithi110.stdout:ceph_fuse: starting fuse loop
2024-04-12T13:56:40.623 INFO:tasks.ceph_fuse:Unmounting ceph-fuse clients...
2024-04-12T13:56:45.440 INFO:tasks.cephfs.fuse_mount:Running fusermount -u on ubuntu@smithi060.front.sepia.ceph.com...
2024-04-12T13:56:45.440 DEBUG:teuthology.orchestra.run.smithi060:> sudo fusermount -u /home/ubuntu/cephtest/mnt.0
2024-04-12T14:01:44.607 ERROR:tasks.cephfs.fuse_mount:process failed to terminate after unmount. This probably indicates a bug within ceph-fuse.
2024-04-12T14:02:51.770 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi060.stdout:ceph_fuse: fuse loop returned - joining
2024-04-12T14:02:51.770 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi060.stdout:ceph_fuse: fuse loop returned - joined
2024-04-12T14:02:51.771 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi060.stderr:2024-04-12T14:02:51.768+0000 7fd910abf300 -1 fuse_session_loop_mt 3 -- done
2024-04-12T14:02:51.771 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi060.stdout:calling unmount
2024-04-12T14:03:21.770 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.smithi060.stdout:calling finalize
vshankar@teuthology:/a/vshankar-2024-04-12_13:03:29-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7653889$

fusermount had no effect at all on the ceph-fuse daemon. One would expect fuse_session_loop_mt() to unblock and fuse_ll.cc::loop() to return. It's only when the cluster teardown was under way that the relevant debug logs are seen.

I'm going to run this again with fuse_debug client option enabled - that adds a -d argument to the daemon and that should give us debugging logs from libfuse (at least the config option help says so).

Actions #32

Updated by Patrick Donnelly 15 days ago

Venky Shankar wrote in #note-31:

[...]

And patched up the yaml to use the custom quincy build to upgrade to. Here is the run: https://pulpito.ceph.com/vshankar-2024-04-12_13:03:29-fs:upgrade:mds_upgrade_sequence-main-testing-default-smithi/7653889/

Interestingly

[...]

fusermount had no effect at all on the ceph-fuse daemon. One would expect fuse_session_loop_mt() to unblock and fuse_ll.cc::loop() to return. It's only when the cluster teardown was under way that the relevant debug logs are seen.

I'm going to run this again with fuse_debug client option enabled - that adds a -d argument to the daemon and that should give us debugging logs from libfuse (at least the config option help says so).

Another avenue to pursue is to gdb attach the ceph-fuse process and see what the fuse dispatcher threads are doing. It's possible there is a bug in the Client preventing one or more of those threads from letting the fuse loop function return; I suspect libfuse wants to join all of those dispatcher threads before returning from its main loop. I looked at the client log but didn't see anything which would indicate why cluster teardown would prompt the libfuse loop to finally terminate. There are a lot of connection refused messages for reconnects to the mons. That's about it.
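
To make that suspicion concrete, here is a simplified model (not libfuse source, just an illustration of the suspected join-before-return behaviour) of a multithreaded dispatch loop:

// Illustrative model only (this is not libfuse code): a multithreaded
// dispatch loop that spawns worker threads and joins them all before
// returning. If any worker is stuck inside a request handler, loop_mt()
// never returns even after the session has been asked to exit.
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

struct Session {
  std::atomic<bool> exited{false};
  std::function<void()> read_and_dispatch_one;  // blocks on /dev/fuse, handles one request
};

static void worker(Session &se) {
  while (!se.exited.load())
    se.read_and_dispatch_one();  // a wedged handler keeps this thread alive forever
}

int loop_mt(Session &se, int nthreads) {
  std::vector<std::thread> workers;
  for (int i = 0; i < nthreads; ++i)
    workers.emplace_back(worker, std::ref(se));
  for (auto &t : workers)
    t.join();  // the loop only returns once every dispatcher thread has exited
  return 0;
}

Under that model, a single wedged dispatcher thread would keep fuse_session_loop_mt() from returning until teardown breaks whatever the thread is blocked on.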

Actions #33

Updated by Venky Shankar 11 days ago

OK. So this bug has upgrades written all over it - that seemed obvious given that this is an upgrade task, but we were trying to see what's going on with fuse. With some test iterations, it's now clear that the cluster upgrade is somehow causing this blowup. I patched mds_upgrade_sequence to skip the upgrade and the umount hang isn't seen. So, reconnects are involved, and I think this has to do with more than just the MDS (read: monitors).

Actions #34

Updated by Venky Shankar 11 days ago

Venky Shankar wrote in #note-33:

OK. So this bug has upgrades written all over it - that seemed obvious given that this is an upgrade task, but we were trying to see what's going on with fuse. With some test iterations, it's now clear that the cluster upgrade is somehow causing this blowup. I patched mds_upgrade_sequence to skip the upgrade and the umount hang isn't seen. So, reconnects are involved, and I think this has to do with more than just the MDS (read: monitors).

Oh, one more thing - in a different test iteration I patched up the upgrade yaml to not run the workload-task (which runs fsstress) but let the cluster upgrade. So, a simple cluster upgrade without any workload is enough to trigger this umount hang.
