https://tracker.ceph.com/
2022-06-02T09:21:21Z
Ceph
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217043
2022-06-02T09:21:21Z
Xiubo Li
xiubli@redhat.com
<ul><li><strong>Description</strong> updated</li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217044
2022-06-02T09:25:44Z
Xiubo Li
xiubli@redhat.com
<p>Xiubo Li wrote:</p>
<blockquote>
<p><a class="external" href="https://pulpito.ceph.com/vshankar-2022-05-31_02:47:51-fs-wip-vshankar-fscrypt-20220530-091336-testing-default-smithi/6853737/">https://pulpito.ceph.com/vshankar-2022-05-31_02:47:51-fs-wip-vshankar-fscrypt-20220530-091336-testing-default-smithi/6853737/</a></p>
<p>When mounting a <strong><em>ceph-fuse</em></strong> it failed:</p>
<p>[...]</p>
<p>From the <strong><em>mds.0</em></strong> logs:</p>
<p>[...]</p>
<p>The client logs:</p>
<p>[...]</p>
</blockquote>
<p>Checking the <strong><em>mds.0</em></strong> logs: at <strong><em>epoch 5</em></strong>, <strong><em>mds.0</em></strong> was still in the <strong><em>up:creating</em></strong> state:</p>
<pre>
.21.15.136:6789/0] -- mdsbeacon(4270/d up:active seq=4 v4) v8 -- 0x5639ce331080 con 0x5639ce159400
2022-05-31T03:13:39.494+0000 7f21b3085700 10 mds.0.snapclient sync
2022-05-31T03:13:39.494+0000 7f21b3085700 10 mds.0.snapclient refresh want 1
2022-05-31T03:13:39.494+0000 7f21b9091700 1 -- [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] <== mon.1 v2:172.21.15.136:3300/0 46 ==== mdsmap(e 5) v2 ==== 1094+0+0 (secure 0 0 0) 0x5639ce1bb380 con 0x5639ce159400
2022-05-31T03:13:39.494+0000 7f21b9091700 1 mds.d Updating MDS map to version 5 from mon.1
2022-05-31T03:13:39.494+0000 7f21b9091700 10 mds.d my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-05-31T03:13:39.494+0000 7f21b9091700 10 mds.d mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-05-31T03:13:39.494+0000 7f21b9091700 10 mds.d my gid is 4270
2022-05-31T03:13:39.494+0000 7f21b9091700 10 mds.d map says I am mds.0.4 state up:creating
2022-05-31T03:13:39.494+0000 7f21b9091700 10 mds.d msgr says I am [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825]
2022-05-31T03:13:39.494+0000 7f21b9091700 10 mds.d handle_mds_map: handling map as rank 0
2022-05-31T03:13:39.494+0000 7f21b9091700 10 notify_mdsmap: mds.metrics
2022-05-31T03:13:39.494+0000 7f21b9091700 10 notify_mdsmap: mds.metrics: rank0 is unavailable
2022-05-31T03:13:39.494+0000 7f21b9091700 10 reset_seq: mds.metrics: last_updated_seq=0
2022-05-31T03:13:39.494+0000 7f21b9091700 20 set_next_seq: mds.metrics: current sequence number 0, setting next sequence number 0
2022-05-31T03:13:39.495+0000 7f21b9091700 1 -- [v2:172.21.15.136:6835/1045808825,v1:172.21.15.136:6837/1045808825] <== mon.1 v2:172.21.15.136:3300/0 47 ==== osd_map(26..26 src has 1..26) v4 ==== 685+0+0 (secure 0 0 0) 0x5639cd41f6c0 con 0x5639ce159400
2022-05-31T03:13:39.495+0000 7f21b9091700 7 mds.0.server operator(): full = 0 epoch = 26
2022-05-31T03:13:39.495+0000 7f21b9091700 10 mds.0.server apply_blocklist: killed 0
</pre>
<p>So no MDS rank was <strong><em>up:active</em></strong> yet (a quick way to confirm this from the cluster side is sketched below).</p>
<blockquote>
<p>Still not sure why <strong><em>ceph-fuse</em></strong> didn't detect any active MDS.</p>
</blockquote>
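<p>For reference, a quick way to confirm from the cluster side that no rank is active is to dump the FSMap and print each daemon's state. A minimal sketch (not part of the original report), assuming the <strong><em>ceph</em></strong> CLI is available on the node:</p>
<pre>
import json
import subprocess

# "ceph fs dump" returns the FSMap, whose per-filesystem mdsmap lists
# every MDS daemon with its current state (up:creating, up:active, ...).
out = subprocess.check_output(['ceph', 'fs', 'dump', '--format=json'])
fsmap = json.loads(out)

for fs in fsmap['filesystems']:
    mdsmap = fs['mdsmap']
    for info in mdsmap['info'].values():
        print("%s mds.%s rank %s: %s"
              % (mdsmap['fs_name'], info['name'], info['rank'], info['state']))
</pre>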
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217049
2022-06-02T09:44:43Z
Xiubo Li
xiubli@redhat.com
<p>Maybe we should wait for a while when mounting <strong><em>ceph-fuse</em></strong>?</p>
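<p>For instance, instead of a fixed sleep, the mount itself could be retried until it succeeds. A rough sketch only (mount point and timeout are illustrative, this is not the actual fix):</p>
<pre>
import subprocess
import time

# Keep retrying the ceph-fuse mount for up to ~2 minutes instead of
# giving up on the first failure while the MDS is still coming up.
deadline = time.time() + 120
while True:
    ret = subprocess.call(['sudo', 'ceph-fuse', '/home/ubuntu/cephtest/mnt.admin'])
    if ret == 0:
        break  # mounted successfully
    if time.time() >= deadline:
        raise RuntimeError("ceph-fuse kept failing, last exit code %d" % ret)
    time.sleep(5)  # give the MDS time to become active
</pre>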
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217264
2022-06-06T12:57:40Z
Venky Shankar
vshankar@redhat.com
<ul><li><strong>Category</strong> set to <i>Administration/Usability</i></li><li><strong>Assignee</strong> set to <i>Xiubo Li</i></li><li><strong>Target version</strong> set to <i>v18.0.0</i></li><li><strong>Backport</strong> set to <i>quincy, pacific</i></li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217476
2022-06-08T05:09:14Z
Xiubo Li
xiubli@redhat.com
<p>EC is enabled, and the test creates an EC pool (<strong><em>k=2</em></strong> data plus <strong><em>m=2</em></strong> coding chunks, so at least 4 OSDs are needed) and sets it as the directory layout's data pool:</p>
<pre>
ceph:
  cephfs:
    ec_profile:
    - m=2
    - k=2
    - crush-failure-domain=osd
    max_mds: 5
    session_timeout: 300
    standby_replay: true
</pre>
<p>The logs:</p>
<pre>
2022-06-06T22:30:37.348 INFO:teuthology.orchestra.run:Running command with timeout 300
2022-06-06T22:30:37.348 DEBUG:teuthology.orchestra.run.smithi102:> stat --file-system '--printf=%T
2022-06-06T22:30:37.349 DEBUG:teuthology.orchestra.run.smithi102:> ' -- /home/ubuntu/cephtest/mnt.admin
2022-06-06T22:30:37.405 INFO:teuthology.orchestra.run.smithi102.stdout:fuseblk
2022-06-06T22:30:37.405 INFO:tasks.cephfs.fuse_mount:ceph-fuse is mounted on /home/ubuntu/cephtest/mnt.admin
2022-06-06T22:30:37.406 INFO:teuthology.orchestra.run:Running command with timeout 300
2022-06-06T22:30:37.406 DEBUG:teuthology.orchestra.run.smithi102:> sudo chmod 1777 /home/ubuntu/cephtest/mnt.admin
2022-06-06T22:30:37.493 INFO:teuthology.orchestra.run:Running command with timeout 300
2022-06-06T22:30:37.494 DEBUG:teuthology.orchestra.run.smithi102:> (cd /home/ubuntu/cephtest/mnt.admin && exec bash -c 'setfattr -n ceph.dir.layout.pool -v cephfs_data_ec . && getfattr -n ceph.dir.layout .')
2022-06-06T22:30:37.654 INFO:teuthology.orchestra.run.smithi102.stdout:# file: .
2022-06-06T22:30:37.654 INFO:teuthology.orchestra.run.smithi102.stdout:ceph.dir.layout="stripe_unit=4194304 stripe_count=1 object_size=4194304 pool=cephfs_data_ec"
2022-06-06T22:30:37.655 INFO:teuthology.orchestra.run.smithi102.stdout:
2022-06-06T22:30:37.656 INFO:tasks.cephfs.fuse_mount:Running fusermount -u on ubuntu@smithi102.front.sepia.ceph.com...
2022-06-06T22:30:37.657 INFO:teuthology.orchestra.run:Running command with timeout 300
2022-06-06T22:30:37.657 DEBUG:teuthology.orchestra.run.smithi102:> sudo fusermount -u /home/ubuntu/cephtest/mnt.admin
2022-06-06T22:30:37.737 INFO:tasks.cephfs.fuse_mount.ceph-fuse.admin.smithi102.stderr:ceph-fuse[124989]: fuse finished with error 0 and tester_r 0
2022-06-06T22:30:37.743 INFO:teuthology.orchestra.run:waiting for 300
2022-06-06T22:30:37.771 INFO:tasks.cephfs.mount:Cleaning up mount ubuntu@smithi102.front.sepia.ceph.com
</pre>
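<p>The <strong><em>setfattr</em></strong>/<strong><em>getfattr</em></strong> pair in this payload only writes and reads the directory layout vxattr; the same thing in Python (mount path as in the log above):</p>
<pre>
import os

# Equivalent to:
#   setfattr -n ceph.dir.layout.pool -v cephfs_data_ec .
#   getfattr -n ceph.dir.layout .
mnt = '/home/ubuntu/cephtest/mnt.admin'
os.setxattr(mnt, 'ceph.dir.layout.pool', b'cephfs_data_ec')
print(os.getxattr(mnt, 'ceph.dir.layout').decode())
</pre>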
<p>In qa/tasks/cephfs/filesystem.py, <strong><em>create()</em></strong> at Line#680 mounts a fuse client and then sets the EC pool in the layout:</p>
<pre>
626 def create(self):
...
653 if self.metadata_overlay:
654 self.mon_manager.raw_cluster_cmd('fs', 'new',
655 self.name, self.metadata_pool_name, data_pool_name,
656 '--allow-dangerous-metadata-overlay')
657 else:
658 self.mon_manager.raw_cluster_cmd('fs', 'new',
659 self.name,
660 self.metadata_pool_name,
661 data_pool_name)
662
663 if self.ec_profile and 'disabled' not in self.ec_profile:
664 ec_data_pool_name = data_pool_name + "_ec"
665 log.debug("EC profile is %s", self.ec_profile)
666 cmd = ['osd', 'erasure-code-profile', 'set', ec_data_pool_name]
667 cmd.extend(self.ec_profile)
668 self.mon_manager.raw_cluster_cmd(*cmd)
669 self.mon_manager.raw_cluster_cmd(
670 'osd', 'pool', 'create', ec_data_pool_name,
671 'erasure', ec_data_pool_name,
672 '--pg_num_min', str(self.pg_num_min),
673 '--target_size_ratio', str(self.target_size_ratio_ec))
674 self.mon_manager.raw_cluster_cmd(
675 'osd', 'pool', 'set',
676 ec_data_pool_name, 'allow_ec_overwrites', 'true')
677 self.add_data_pool(ec_data_pool_name, create=False)
678 self.check_pool_application(ec_data_pool_name)
679
680 self.run_client_payload(f"setfattr -n ceph.dir.layout.pool -v {ec_data_pool_name} . && getfattr -n ceph.dir.layout .")
681
...
</pre>
<p>The code of <strong><em>run_client_payload()</em></strong>:</p>
<pre>
713 def run_client_payload(self, cmd):
714 # avoid circular dep by importing here:
715 from tasks.cephfs.fuse_mount import FuseMount
716 d = misc.get_testdir(self._ctx)
717 m = FuseMount(self._ctx, d, "admin", self.client_remote, cephfs_name=self.name)
718 m.mount_wait()
719 m.run_shell_payload(cmd)
720 m.umount_wait(require_clean=True)
721
</pre>
<p>But this runs so early that the MDSes in the ceph cluster are not ready yet, so we need to wait until at least <strong><em>rank 0</em></strong> is <strong><em>up:active</em></strong> before mounting the client.</p>
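<p>One way to do that is to poll until rank 0 reports <strong><em>up:active</em></strong> before <strong><em>run_client_payload()</em></strong> mounts the client. A minimal sketch against the qa helpers (the method name and timeout are mine, not necessarily what the actual PR does):</p>
<pre>
import time

def _wait_for_rank0_active(self, timeout=180):
    # Poll the FSStatus until rank 0 of this filesystem is up:active,
    # so a fuse client mounted right afterwards can find an MDS.
    elapsed = 0
    while elapsed < timeout:
        try:
            info = self.status().get_rank(self.id, 0)
            if info['state'] == 'up:active':
                return
        except Exception:
            pass  # rank 0 may not exist yet while still up:creating
        time.sleep(5)
        elapsed += 5
    raise RuntimeError("rank 0 did not become up:active within %ds" % timeout)
</pre>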
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217477
2022-06-08T05:29:49Z
Jos Collin
<ul><li><strong>Pull request ID</strong> set to <i>46560</i></li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=217478
2022-06-08T05:43:53Z
Xiubo Li
xiubli@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Fix Under Review</i></li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=218369
2022-06-20T04:22:52Z
Venky Shankar
vshankar@redhat.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=218371
2022-06-20T04:25:34Z
Backport Bot
<ul><li><strong>Copied to</strong> <i><a href="/issues/56105">Backport #56105</a>: pacific: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536</i> added</li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=218373
2022-06-20T04:25:44Z
Backport Bot
<ul><li><strong>Copied to</strong> <i><a href="/issues/56106">Backport #56106</a>: quincy: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536</i> added</li></ul>
CephFS - Bug #55824: ceph-fuse[88614]: ceph mount failed with (65536) Unknown error 65536
https://tracker.ceph.com/issues/55824?journal_id=220557
2022-07-18T01:37:05Z
Xiubo Li
xiubli@redhat.com
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>