Ceph : Issues
https://tracker.ceph.com/ | 2024-03-05T14:16:47Z
rgw - Bug #64719 (Pending Backport): SSL session id reuse speedup mechanism of the SSL_CTX_set_session_id_context
https://tracker.ceph.com/issues/64719 | 2024-03-05T14:16:47Z | Mark Kogan <mkogan@redhat.com>
<p>The OpenSSL session-id reuse acceleration mechanism that is described in SSL_CTX_set_session_id_context</p>
<p><a class="external" href="https://www.openssl.org/docs/man1.0.2/man3/SSL_CTX_set_session_id_context.html">https://www.openssl.org/docs/man1.0.2/man3/SSL_CTX_set_session_id_context.html</a><br /><em>SSL_CTX_set_session_id_context, SSL_set_session_id_context - set context within which session can be reused (server side only)</em></p>
<p>is currently not operating.</p>
<p>The check methodology uses the 'openssl s_client' command below; note the `--reconnect` option, which reconnects 5 times:<br /><pre>
echo "" | openssl s_client -connect 0:8443 --reconnect -no_ticket -tls1_2 |& grep Session-ID
</pre><br />When the mechanism is not working correctly, the session-ids will be different;<br />when it is working correctly, the session-ids will be the same<br />(see the examples below).</p>
<p>Performance measurements:<br />when the mechanism is not working, a loop of 1000 openssl --connect --reconnect ... takes 38.870 seconds;<br />when the mechanism is working, the same loop of 1000 openssl --connect --reconnect ... takes 16.038 seconds.</p>
<pre>
// BEFORE FIX:
❯ time (for I in {1..1000}; do echo $I ; echo "" | openssl s_client -connect x.x.x.ceph.com:8443 --reconnect -no_ticket -tls1_2 |& grep 'Session-ID:' > openssl.txt ; done)
( for I in {1..1000}; do; echo $I; echo "" | openssl s_client -connect | ) 9.19s user 6.67s system 40% cpu 38.870 total
^^^^^^
❯ cat openssl.txt
Session-ID: 0CAB532FC91584CAC1BBB0A91FF874C88CD4233C426BD7F5332E6A32643DB668
Session-ID: E8349831EC98AC87215FAFCA12CC8573DEEDB4845522D417103AEB5109C5407D
Session-ID: 6B5B566EDE2D84F8D43F023D451896FF9B50DF4EA1AE76EED9300AB2C8730B10
Session-ID: ACDBD3EEDC4416C685BE962A6402869A6ECD25C00474EE457216C644E40719ED
Session-ID: AB4C2EC629017FE0433C3B3702AB44E0030F5FDFEF0D48117958034BC71F3AF7
Session-ID: 56BE99BC9E55A29A72A10B3BB88EEB3C40ED381140484382EB36186A5B56FB59
// AFTER FIX:
❯ time (for I in {1..1000}; do echo $I ; echo "" | openssl s_client -connect x.x.x.ceph.com:8443 --reconnect -no_ticket -tls1_2 |& grep 'Session-ID:' > openssl.txt ; done)
( for I in {1..1000}; do; echo $I; echo "" | openssl s_client -connect | ) 7.94s user 5.86s system 86% cpu 16.038 total
^^^^^^
❯ cat openssl.txt
Session-ID: 6791FAC534C991F5787568CCEB4DC3BE5F160872B5681AC967CFCB8864ED2593
Session-ID: 6791FAC534C991F5787568CCEB4DC3BE5F160872B5681AC967CFCB8864ED2593
Session-ID: 6791FAC534C991F5787568CCEB4DC3BE5F160872B5681AC967CFCB8864ED2593
Session-ID: 6791FAC534C991F5787568CCEB4DC3BE5F160872B5681AC967CFCB8864ED2593
Session-ID: 6791FAC534C991F5787568CCEB4DC3BE5F160872B5681AC967CFCB8864ED2593
Session-ID: 6791FAC534C991F5787568CCEB4DC3BE5F160872B5681AC967CFCB8864ED2593
</pre>
Ceph - Bug #64548 (Fix Under Review): ceph-base: /var/lib/ceph/crash/posted not chowned to ceph:ceph
https://tracker.ceph.com/issues/64548 | 2024-02-23T13:18:48Z | Christian Rohmann
<p>The Debian package ceph-base's postinst applies a chown to ceph:ceph in <code>https://github.com/ceph/ceph/blob/87f6091b9e3c22af23199d5dc07c1ba57029ea6b/debian/ceph-base.postinst#L36</code></p>
<p>This covers the folder <code>/var/lib/ceph/crash</code>, but not the subfolder <code>posted</code>, which is also part of the package.<br />This leaves ceph-crash unable to move posted crashes into that subfolder:</p>
<pre>
ceph-crash[3797824]: ERROR:ceph-crash:Error scraping /var/lib/ceph/crash: [Errno 13] Permission denied: '/var/lib/ceph/crash/2024-02-23T06:10:15.784710Z_8eed1e80-1aaa-41a6-a41b-f20866aa17a0' -> '/var/lib/ceph/crash/posted/2024-02-23T06:10:15.784710Z_8eed1e80-1aaa-41a6-a41b-f20866aa17a0'
</pre>
<p>1) One solution could be to have ceph-crash (<a class="external" href="https://github.com/ceph/ceph/blob/main/src/ceph-crash.in">https://github.com/ceph/ceph/blob/main/src/ceph-crash.in</a>) create this subfolder itself and just remove it from the Debian package at <br /><a class="external" href="https://github.com/ceph/ceph/blob/87f6091b9e3c22af23199d5dc07c1ba57029ea6b/debian/ceph-base.dirs#L9C1-L9C26">https://github.com/ceph/ceph/blob/87f6091b9e3c22af23199d5dc07c1ba57029ea6b/debian/ceph-base.dirs#L9C1-L9C26</a> so it is not placed there on package install.</p>
<p>2) Another would be to extend the for loop's element list with <code>/var/lib/ceph/crash/*</code>.</p>
<p>I'll gladly send in a patch for whichever solution you like better.</p>
CephFS - Bug #64479 (Pending Backport): Memory leak detected when accessing a CephFS volume from ...
https://tracker.ceph.com/issues/64479 | 2024-02-17T07:17:28Z | Xavi Hernandez
<p>When files are continuously created and deleted on a CephFS volume accessed through Samba with libcephfs (Samba's vfs_ceph module), a steady increase in used memory is observed. The growth is relatively slow but constant, and it eventually triggers the OOM killer.</p>
<p>The test I used to reproduce it was (see also the libcephfs sketch after the list):</p>
<ol>
<li>while true; do dd if=/dev/zero of=/mnt/samba/file bs=4k count=1 status=none; rm -f /mnt/samba/file; done</li>
</ol>
<p>The loop can also be run in parallel with different files to make the leak appear faster.</p>
Ceph - Bug #64456 (New): Missing entries for hardware alerts from the MIB file
https://tracker.ceph.com/issues/64456 | 2024-02-15T22:49:15Z | Paul Cuzner
<p>While working on adding nvmeof alerts with an SNMP trap, I noticed that although the hardware alerts reference an OID, the corresponding entries are missing from the CEPH-MIB file.</p>
<p>When you run validate_rules.py manually (with snmptranslate installed, from net-snmp-utils), you see the following:</p>
<p>Problem Report</p>
<pre><code>Group     Severity  Alert Name              Problem Description
-----     --------  ----------              -------------------
hardware  Error     HardwareStorageError    rule defines an OID 1.3.6.1.4.1.50495.1.2.1.13.1 that is missing from the MIB file (CEPH-MIB.txt)
hardware  Error     HardwareMemoryError     rule defines an OID 1.3.6.1.4.1.50495.1.2.1.13.2 that is missing from the MIB file (CEPH-MIB.txt)
hardware  Error     HardwareProcessorError  rule defines an OID 1.3.6.1.4.1.50495.1.2.1.13.3 that is missing from the MIB file (CEPH-MIB.txt)
hardware  Error     HardwareNetworkError    rule defines an OID 1.3.6.1.4.1.50495.1.2.1.13.4 that is missing from the MIB file (CEPH-MIB.txt)
hardware  Error     HardwarePowerError      rule defines an OID 1.3.6.1.4.1.50495.1.2.1.13.5 that is missing from the MIB file (CEPH-MIB.txt)
hardware  Error     HardwareFanError        rule defines an OID 1.3.6.1.4.1.50495.1.2.1.13.6 that is missing from the MIB file (CEPH-MIB.txt)</code></pre>
<p>No problems were detected in the unit tests file.</p>
CephFS - Bug #64440 (Pending Backport): mds: reversed encoding of MDSMap max_xattr_size/bal_rank_mask
https://tracker.ceph.com/issues/64440 | 2024-02-15T02:30:44Z | Patrick Donnelly <pdonnell@redhat.com>
<p>The main branch needs to be updated to match reef's encoded order for max_xattr_size/bal_rank_mask. That change then needs to be backported to reef HEAD to maintain upgrade continuity for v18.2.1 -> v18.2.2 | squid.</p>
<p>See also: <a class="external" href="https://github.com/ceph/ceph/pull/53340#discussion_r1490282785">https://github.com/ceph/ceph/pull/53340#discussion_r1490282785</a></p>
rgw - Bug #64439 (New): If-Match Header Not Unquoted On Conditional PutObject
https://tracker.ceph.com/issues/64439 | 2024-02-14T21:19:22Z | Damian Peckett
<p>It looks like some codepaths have been missed over the years, and the s3tests also seem to be supplying an unquoted ETag.</p>
<p>I have a proposed fix on a branch here: <a class="external" href="https://github.com/dpeckett/ceph/commit/0ae123dafef3274a1f3ab292915554cbe9701ce6">https://github.com/dpeckett/ceph/commit/0ae123dafef3274a1f3ab292915554cbe9701ce6</a></p>
<p>Let me know if anything is missing from the commit; the s3tests probably also need to be corrected.</p>
CephFS - Bug #64390 (New): client: async I/O stalls if the data pool gets full
https://tracker.ceph.com/issues/64390 | 2024-02-12T14:12:28Z | Dhairya Parmar
<p>test case:<br /><pre><code class="cpp syntaxhl">
TEST_F(TestClient, LlreadvLlwritevDataPoolFull) {
  /* Test performing async I/O after filling the fs and make sure it handles
     the read/write gracefully */
  int mypid = getpid();
  char filename[256];

  client->unmount();
  TearDown();
  SetUp();

  sprintf(filename, "test_llreadvllwritevdatapoolfullfile%u", mypid);

  Inode *root, *file;
  root = client->get_root();
  ASSERT_NE(root, (Inode *)NULL);

  Fh *fh;
  struct ceph_statx stx;

  ASSERT_EQ(0, client->ll_createx(root, filename, 0666,
                                  O_RDWR | O_CREAT | O_TRUNC,
                                  &file, &fh, &stx, 0, 0, myperm));

  struct statvfs stbuf;
  int64_t rc;
  rc = client->ll_statfs(root, &stbuf, myperm);
  ASSERT_EQ(rc, 0);
  int64_t fs_available_space = stbuf.f_bfree * stbuf.f_bsize;
  ASSERT_GT(fs_available_space, 0);

  // Fill all available space with synchronous writes.
  const int64_t BUFSIZE = 1024 * 1024 * 1024;
  int64_t bytes_written = 0, offset = 0;
  char* buf = new char[BUFSIZE];
  char* small_buf = NULL;
  memset(buf, 0xCC, BUFSIZE);

  while (fs_available_space) {
    if (fs_available_space >= BUFSIZE) {
      bytes_written = client->ll_write(fh, offset, BUFSIZE, buf);
      ASSERT_GT(bytes_written, 0);
      offset += BUFSIZE;
      fs_available_space -= BUFSIZE;
    } else {
      small_buf = new char[fs_available_space];
      memset(small_buf, 0xDD, fs_available_space);
      bytes_written = client->ll_write(fh, offset, fs_available_space, small_buf);
      ASSERT_GT(bytes_written, 0);
      break;
    }
  }

  std::unique_ptr<C_SaferCond> writefinish = nullptr;
  std::unique_ptr<C_SaferCond> readfinish = nullptr;
  writefinish.reset(new C_SaferCond("test-nonblocking-writefinish-datapool-full"));
  readfinish.reset(new C_SaferCond("test-nonblocking-readfinish-datapool-full"));

  char* out_buf_0 = new char[BUFSIZE];
  memset(out_buf_0, 0xDD, BUFSIZE);
  char* out_buf_1 = new char[BUFSIZE];
  memset(out_buf_1, 0xFF, BUFSIZE);
  char* out_buf_2 = new char[BUFSIZE];
  memset(out_buf_2, 0xFF, BUFSIZE);
  char* out_buf_3 = new char[BUFSIZE];
  memset(out_buf_3, 0xFF, BUFSIZE);
  char* out_buf_4 = new char[BUFSIZE];
  memset(out_buf_4, 0xFF, BUFSIZE);
  char* out_buf_5 = new char[BUFSIZE];
  memset(out_buf_5, 0xFF, BUFSIZE);

  struct iovec iov_out[6] = {
    {out_buf_0, BUFSIZE},
    {out_buf_1, BUFSIZE},
    {out_buf_2, BUFSIZE},
    {out_buf_3, BUFSIZE},
    {out_buf_4, BUFSIZE},
    {out_buf_5, BUFSIZE},
  };

  bufferlist bl;

  // The pool is now full, so the async writev is expected to fail with ENOSPC.
  rc = client->ll_preadv_pwritev(fh, iov_out, 6, 0, true, writefinish.get(),
                                 nullptr);
  ASSERT_EQ(rc, 0);
  bytes_written = writefinish->wait();
  ASSERT_EQ(bytes_written, -CEPHFS_ENOSPC);

  client->ll_release(fh);
  ASSERT_EQ(0, client->ll_unlink(root, filename, myperm));

  delete[] buf;
  delete[] small_buf;
  delete[] out_buf_0;
  delete[] out_buf_1;
  delete[] out_buf_2;
  delete[] out_buf_3;
  delete[] out_buf_4;
  delete[] out_buf_5;
}
</code></pre></p>
<p>Firstly, the assertion fails after the async write call:<br /><pre><code class="text syntaxhl">2024-02-12T19:09:43.795+0530 7f84bac686c0 19 client.4304 C_Write_Finisher::try_complete this 0x5594b702bcc0 onuninlinefinished 1 iofinished 1 iofinished_r 2147483647 fsync_finished 1
2024-02-12T19:09:43.795+0530 7f84bac686c0 19 client.4304 complete with iofinished_r 2147483647
/home/dparmar/CephRepoForRunningTestsLocally/ceph/src/test/client/nonblocking.cc:800: Failure
Expected equality of these values:
bytes_written
Which is: 2147483647
-28
</code></pre></p>
<p>I expected the API to return ENOSPC, but it returned 2147483647 (~2 GiB), i.e. as if 33% of the data had been written; that shouldn't happen, since all the available space had already been filled in the first place.</p>
<p>We do get the ENOSPC error when releasing the file handle after this:<br /><pre><code class="text syntaxhl">2024-02-12T19:09:43.795+0530 7f84bf65c9c0 1 client.4304 _release_fh 0x5594b6f277f0 on inode 0x10000000000.head(faked_ino=0 nref=8 ll_ref=1 cap_refs={4=0,1024=0,4096=0,8192=0} open={3=0} mode=100666 size=106287857664/110582824960 nlink=1 btime=2024-02-12T18:42:52.646736+0530 mtime=2024-02-12T19:09:43.796040+0530 ctime=2024-02-12T19:09:43.796040+0530 change_attr=100 caps=p(0=p) flushing_caps=Fw objectset[0x10000000000 ts 0/0 objects 1000 dirty_or_tx 0] parents=0x1.head["test_llreadvllwritevdatapoolfullfile1269955"] 0x7f84900088e0) caught async_err = (28) No space left on device
</code></pre></p>
<p>and after that the call stalls:</p>
<pre><code class="text syntaxhl"><span class="CodeRay">2024-02-12T19:09:43.976+0530 7f84977fe6c0 20 client.4304 upkeep thread waiting interval 1.000000000s
2024-02-12T19:09:44.614+0530 7f84b1ffb6c0 1 client.4304 _handle_full_flag: FULL: cancelling outstanding operations on 1
2024-02-12T19:09:44.614+0530 7f84b1ffb6c0 1 client.4304 _handle_full_flag: FULL: cancelling outstanding operations on 2
2024-02-12T19:09:44.614+0530 7f84b1ffb6c0 1 client.4304 _handle_full_flag: FULL: cancelling outstanding operations on 3
2024-02-12T19:09:44.614+0530 7f84b1ffb6c0 10 client.4304 unmounting: trim pass, size was 0+2
2024-02-12T19:09:44.614+0530 7f84b1ffb6c0 20 client.4304 trim_cache size 0 max 16384
2024-02-12T19:09:44.614+0530 7f84b1ffb6c0 10 client.4304 unmounting: trim pass, size still 0+2
2024-02-12T19:09:44.977+0530 7f84977fe6c0 20 client.4304 tick
2024-02-12T19:09:44.977+0530 7f84977fe6c0 20 client.4304 collect_and_send_metrics
2024-02-12T19:09:44.977+0530 7f84977fe6c0 20 client.4304 collect_and_send_global_metrics
2024-02-12T19:09:44.977+0530 7f84977fe6c0 10 client.4304 _put_inode on 0x1.head(faked_ino=0 nref=2 ll_ref=0 cap_refs={1024=0} open={} mode=40755 size=0/0 nlink=1 btime=2024-02-12T18:41:58.976066+0530 mtime=2024-02-12T18:42:52.646736+0530 ctime=2024-02-12T18:42:52.646736+0530 change_attr=1 caps=pAsLsXs(0=pAsLsXs) has_dir_layout 0x7f84900081e0) n = 1
2024-02-12T19:09:44.977+0530 7f84977fe6c0 10 client.4304 remove_cap mds.0 on 0x1.head(faked_ino=0 nref=1 ll_ref=0 cap_refs={1024=0} open={} mode=40755 size=0/0 nlink=1 btime=2024-02-12T18:41:58.976066+0530 mtime=2024-02-12T18:42:52.646736+0530 ctime=2024-02-12T18:42:52.646736+0530 change_attr=1 caps=pAsLsXs(0=pAsLsXs) has_dir_layout 0x7f84900081e0)
2024-02-12T19:09:44.977+0530 7f84977fe6c0 15 client.4304 remove_cap last one, closing snaprealm 0x7f84900080f0
2024-02-12T19:09:44.977+0530 7f84977fe6c0 20 client.4304 put_snap_realm 0x1 0x7f84900080f0 2 -> 1
2024-02-12T19:09:44.977+0530 7f84977fe6c0 10 client.4304 _put_inode deleting 0x1.head(faked_ino=0 nref=1 ll_ref=0 cap_refs={1024=0} open={} mode=40755 size=0/0 nlink=1 btime=2024-02-12T18:41:58.976066+0530 mtime=2024-02-12T18:42:52.646736+0530 ctime=2024-02-12T18:42:52.646736+0530 change_attr=1 caps=- has_dir_layout 0x7f84900081e0)
2024-02-12T19:09:44.977+0530 7f84977fe6c0 10 client.4304 _put_inode on 0x10000000000.head(faked_ino=0 nref=4 ll_ref=0 cap_refs={4=0,1024=0,4096=0,8192=0} open={3=0} mode=100666 size=106287857664/110582824960 nlink=1 btime=2024-02-12T18:42:52.646736+0530 mtime=2024-02-12T19:09:43.796040+0530 ctime=2024-02-12T19:09:43.796040+0530 change_attr=100 caps=p(0=p) flushing_caps=Fw objectset[0x10000000000 ts 0/0 objects 332 dirty_or_tx 0] 0x7f84900088e0) n = 2
2024-02-12T19:09:44.977+0530 7f84977fe6c0 20 client.4304 trim_cache size 0 max 16384
2024-02-12T19:09:44.977+0530 7f84977fe6c0 20 client.4304 upkeep thread waiting interval 1.000000000s
2024-02-12T19:09:45.682+0530 7f84b1ffb6c0 1 client.4304 _handle_full_flag: FULL: cancelling outstanding operations on 1
2024-02-12T19:09:45.682+0530 7f84b1ffb6c0 1 client.4304 _handle_full_flag: FULL: cancelling outstanding operations on 2
2024-02-12T19:09:45.682+0530 7f84b1ffb6c0 1 client.4304 _handle_full_flag: FULL: cancelling outstanding operations on 3
2024-02-12T19:09:45.682+0530 7f84b1ffb6c0 10 client.4304 unmounting: trim pass, size was 0+1
2024-02-12T19:09:45.682+0530 7f84b1ffb6c0 20 client.4304 trim_cache size 0 max 16384
2024-02-12T19:09:45.682+0530 7f84b1ffb6c0 10 client.4304 unmounting: trim pass, size still 0+1
2024-02-12T19:09:45.977+0530 7f84977fe6c0 20 client.4304 tick
2024-02-12T19:09:45.977+0530 7f84977fe6c0 20 client.4304 collect_and_send_metrics
2024-02-12T19:09:45.977+0530 7f84977fe6c0 20 client.4304 collect_and_send_global_metrics
2024-02-12T19:09:45.977+0530 7f84977fe6c0 20 client.4304 trim_cache size 0 max 16384
2024-02-12T19:09:45.977+0530 7f84977fe6c0 20 client.4304 upkeep thread waiting interval 1.000000000s
2024-02-12T19:09:46.977+0530 7f84977fe6c0 20 client.4304 tick
</code></pre>
CephFS - Bug #64389 (Triaged): client: check if pools are full when mounting
https://tracker.ceph.com/issues/64389 | 2024-02-12T11:08:14Z | Dhairya Parmar
<p>otherwise the mounting stalls:</p>
<pre><code class="text syntaxhl"><span class="CodeRay">2024-02-12T16:20:27.609+0530 7f03e4e7a9c0 10 client.0 osdmap pool full0
2024-02-12T16:20:27.609+0530 7f03dca4d6c0 10 client.0 ms_handle_connect on v2:127.0.0.1:40206/0
2024-02-12T16:20:27.609+0530 7f03e4e7a9c0 10 client.4513 Subscribing to map 'mdsmap'
2024-02-12T16:20:27.610+0530 7f03c4ff96c0 20 client.4513 tick
2024-02-12T16:20:27.610+0530 7f03c4ff96c0 20 client.4513 collect_and_send_metrics
2024-02-12T16:20:27.610+0530 7f03c4ff96c0 20 client.4513 collect_and_send_global_metrics
2024-02-12T16:20:27.610+0530 7f03c4ff96c0 5 client.4513 collect_and_send_global_metrics MDS rank 0 is not ready yet -- not sending metric
2024-02-12T16:20:27.610+0530 7f03c4ff96c0 20 client.4513 trim_cache size 0 max 16384
2024-02-12T16:20:27.610+0530 7f03c4ff96c0 20 client.4513 upkeep thread waiting interval 1.000000000s
2024-02-12T16:20:27.610+0530 7f03dca4d6c0 1 client.4513 _handle_full_flag: FULL: cancelling outstanding operations on 1
2024-02-12T16:20:27.610+0530 7f03dca4d6c0 1 client.4513 _handle_full_flag: FULL: cancelling outstanding operations on 2
2024-02-12T16:20:27.610+0530 7f03dca4d6c0 1 client.4513 _handle_full_flag: FULL: cancelling outstanding operations on 3
2024-02-12T16:20:27.610+0530 7f03dca4d6c0 1 client.4513 handle_mds_map epoch 533
2024-02-12T16:20:27.610+0530 7f03e4e7a9c0 20 client.4513 populate_metadata read hostname 'li-d7acf5cc-234b-11b2-a85c-8f0e65e32dfd.ibm.com'
2024-02-12T16:20:27.610+0530 7f03e4e7a9c0 10 client.4513 did not get mds through better means, so chose random mds 0
2024-02-12T16:20:27.610+0530 7f03e4e7a9c0 20 client.4513 mds is 0
2024-02-12T16:20:27.610+0530 7f03e4e7a9c0 10 client.4513 _open_mds_session mds.0
2024-02-12T16:20:27.610+0530 7f03e4e7a9c0 10 client.4513 waiting for session to mds.0 to open
2024-02-12T16:20:27.610+0530 7f03dca4d6c0 10 client.4513 ms_handle_connect on v2:127.0.0.1:6830/1345341561
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 20 client.4513 tick
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 10 client.4513 renew_caps()
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 15 client.4513 renew_caps requesting from mds.0
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 10 client.4513 renew_caps mds.0
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 20 client.4513 collect_and_send_metrics
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 20 client.4513 collect_and_send_global_metrics
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 5 client.4513 collect_and_send_global_metrics: no session with rank=0 -- not sending metric
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 20 client.4513 trim_cache size 0 max 16384
2024-02-12T16:20:28.610+0530 7f03c4ff96c0 20 client.4513 upkeep thread waiting interval 1.000000000s
2024-02-12T16:20:29.048+0530 7f03dca4d6c0 1 client.4513 handle_mds_map epoch 534
2024-02-12T16:20:29.610+0530 7f03c4ff96c0 20 client.4513 tick
2024-02-12T16:20:29.610+0530 7f03c4ff96c0 20 client.4513 collect_and_send_metrics
2024-02-12T16:20:29.610+0530 7f03c4ff96c0 20 client.4513 collect_and_send_global_metrics
2024-02-12T16:20:29.610+0530 7f03c4ff96c0 5 client.4513 collect_and_send_global_metrics: no session with rank=0 -- not sending metric
2024-02-12T16:20:29.610+0530 7f03c4ff96c0 20 client.4513 trim_cache size 0 max 16384
2024-02-12T16:20:29.610+0530 7f03c4ff96c0 20 client.4513 upkeep thread waiting interval 1.000000000s
2024-02-12T16:20:30.610+0530 7f03c4ff96c0 20 client.4513 tick
</code></pre>
<p>health status:<br /><pre><code class="text syntaxhl"> health: HEALTH_ERR
2 client(s) laggy due to laggy OSDs
1 MDSs report slow metadata IOs
3 full osd(s)
3 pool(s) full
</code></pre></p>
<p>from mon logs:<br /><pre><code class="text syntaxhl">mon.a.log:1432535: "message": "2 slow ops, oldest one blocked for 3318 sec, mon.a has slow ops",
mon.a.log:1434391: "message": "2 slow ops, oldest one blocked for 3318 sec, mon.a has slow ops",
mon.a.log:980570: "message": "0 slow ops, oldest one blocked for 34 sec, osd.0 has slow ops",
mon.a.log:981411: "message": "0 slow ops, oldest one blocked for 34 sec, osd.0 has slow ops",
mon.a.log:982393: "message": "0 slow ops, oldest one blocked for 34 sec, osd.0 has slow ops"
</code></pre></p>
<p>mds log:<br /><pre><code class="text syntaxhl">mds.c.log:331991:2024-02-12T16:35:03.624+0530 7f8a13ef46c0 20 mds.beacon.c 9 slow metadata IOs found
</code></pre></p>
<p>It seems there is no room left for the client, and so the mounting hangs.</p>
CephFS - Bug #64348 (Triaged): mds: possible memory leak in up:rejoin when opening cap inodes (fr...
https://tracker.ceph.com/issues/64348 | 2024-02-08T04:46:10Z | Venky Shankar <vshankar@redhat.com>
<p>This seems to happen when there are entries in the OFT (open file table) for which the MDS prefetches inodes. The config <code>mds_oft_prefetch_dirfrags</code>, which is disabled by default, only disables prefetching of dirfrags; the OFT will still prefetch inodes, and there <em>seems</em> to be a memleak somewhere (which isn't getting exercised in our qa suite, else we probably would have noticed it in the valgrind test).</p>
<p>The memleak causes the MDS to get OOM-killed (partly also because the cache limits aren't really taken into consideration in this state). This was observed in a couple of user clusters. Unfortunately the logs didn't provide any hints beyond the MDS prefetching inodes from the OFT and the MDS RSS hitting the node memory limit.</p>
Ceph - Bug #64319 (New): OSD does not move itself to crush_location on start, root=default is not...
https://tracker.ceph.com/issues/64319 | 2024-02-05T18:01:36Z | Niklas Hambuechen
<p>I'm currently setting up a Ceph (16.2.7) cluster where it's important to reflect the `datacenter` location in CRUSH, and encountered some issues that may be code bugs or documentation bugs.</p>
<p>First, <strong>crush_location</strong> does not seem to have a manual reference entry anywhere.<br />It is mentioned in <a class="external" href="https://docs.ceph.com/en/pacific/rados/operations/crush-map/#custom-location-hooks">https://docs.ceph.com/en/pacific/rados/operations/crush-map/#custom-location-hooks</a> but unlike other ceph.conf options its syntax is not fully explained, and it does not have a proper reference entry.<br /><a class="external" href="https://docs.ceph.com/en/pacific/rados/operations/crush-map/#crush-location">https://docs.ceph.com/en/pacific/rados/operations/crush-map/#crush-location</a> describes the general bucket syntax, but doesn't say that this very syntax works for <strong>crush_location</strong>.</p>
<p>Second, <a class="external" href="https://docs.ceph.com/en/pacific/rados/operations/crush-map/#crush-location">https://docs.ceph.com/en/pacific/rados/operations/crush-map/#crush-location</a> (and same for Reef) says that</p>
<pre>
3. Not all keys need to be specified. For example, by default, Ceph automatically sets an OSD’s location to be root=default host=HOSTNAME (based on the output from hostname -s).
</pre>
<p>I found that not to work: If I specify in <strong>ceph.conf</strong>:</p>
<pre>
[osd]
crush_location = region=HEL zone=HEL1 datacenter=HEL1-DC8 host=backupfs-1
</pre>
<p>then the `root=default` bucket is empty in <strong>ceph osd tree</strong>:</p>
<pre>
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-6 439.30875 region HEL
-5 439.30875 zone HEL1
-23 146.43625 datacenter HEL1-DC3
-22 146.43625 host backupfs-3
26 hdd 14.61089 osd.26 up 1.00000 1.00000
27 hdd 14.61089 osd.27 up 1.00000 1.00000
28 hdd 14.61089 osd.28 up 1.00000 1.00000
...
-1 0 root default
</pre>
<p>In the above, <strong>region</strong> and <strong>root</strong> are on the same level, which is wrong (certainly unintended).</p>
<p>This causes errors: the default CRUSH rules do not place any data, because they contain <strong>take default</strong> and default is empty. As a result, <strong>ceph status</strong> shows:</p>
<pre>
100.000% pgs unknown
</pre>
<p>and <strong>acting = []</strong>.</p>
<p>So it seems that the "Not all keys need to be specified ... Ceph automatically sets an OSD's location to be root=default" statement is either wrong or confusing (if it means "this only applies unless you defined `crush_location = ...` explicitly", why would one say "not all keys need ...").</p>
<p>This could be fixed by running <strong>ceph osd crush move HEL root=default</strong>.</p>
<p>Third, I expected that simply changing <strong>ceph.conf</strong> to</p>
<pre>
[osd]
crush_location = root=default region=HEL zone=HEL1 datacenter=HEL1-DC8 host=backupfs-1
</pre>
<p>and restarting the OSD daemon should work, because the docs at <a class="external" href="https://docs.ceph.com/en/pacific/rados/operations/crush-map/#crush-location">https://docs.ceph.com/en/pacific/rados/operations/crush-map/#crush-location</a> say <strong>each time the OSD starts, it verifies it is in the correct location in the CRUSH map and, if it is not, it moves itself</strong>.</p>
<p>My OSD did not move itself; the <strong>ceph osd tree</strong> output stayed as shown above after restarting the OSD.</p>
<p>So in summary, the following issues:</p>
<p>1. <strong>crush_location</strong> has no explicit documentation.<br />2. "Not all keys need to be specified" is wrong.<br />3. The docs statement "verifies it is in the correct location in the CRUSH map and, if it is not, it moves itself" seems wrong.</p>
<p>It would be nice if somebody could point out what the intended behaviour is (docs wrong or code wrong?) so the docs can be fixed. Thank you!</p>
CephFS - Bug #64313 (Pending Backport): client: do not proceed with I/O if filehandle is invalid
https://tracker.ceph.com/issues/64313 | 2024-02-05T12:31:08Z | Dhairya Parmar
<p>Otherwise the client crashes when the invalid handle is dereferenced.</p>
<pre><code class="text syntaxhl"><span class="CodeRay"> 0> 2024-02-05T17:48:35.230+0530 7f0a94c309c0 -1 *** Caught signal (Segmentation fault) **
in thread 7f0a94c309c0 thread_name:ceph_test_clien
ceph version 19.0.0-1018-g774ce9f98b7 (774ce9f98b7ef83f5b17268dae8f637fea775c94) squid (dev)
1: ./bin/ceph_test_client(+0x2941b4) [0x55914517d1b4]
2: /lib64/libc.so.6(+0x3dbb0) [0x7f0a9425fbb0]
3: (Client::_preadv_pwritev_locked(Fh*, iovec const*, unsigned int, long, bool, bool, Context*, ceph::buffer::v15_2_0::list*, bool, bool)+0x7b) [0x55914510ba33]
4: (Client::ll_preadv_pwritev(Fh*, iovec const*, int, long, bool, Context*, ceph::buffer::v15_2_0::list*, bool, bool)+0xbb) [0x55914510c395]
5: (TestClient_LlreadvLlwritevInvalidFileDescriptor_Test::TestBody()+0x1d0) [0x55914509ec24]
6: (void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x1b) [0x55914519c3db]
7: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x80) [0x5591451a4ccc]
8: (testing::Test::Run()+0xb4) [0x5591451957ee]
9: (testing::TestInfo::Run()+0x104) [0x5591451958f4]
10: (testing::TestSuite::Run()+0xb2) [0x5591451959a8]
11: (testing::internal::UnitTestImpl::RunAllTests()+0x36b) [0x55914519705f]
12: (bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x1b) [0x55914519c687]
13: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x80) [0x5591451a5229]
14: (testing::UnitTest::Run()+0x63) [0x559145195ac3]
15: (RUN_ALL_TESTS()+0x11) [0x55914509d84b]
16: main()
17: /lib64/libc.so.6(+0x27b8a) [0x7f0a94249b8a]
18: __libc_start_main()
19: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</code></pre>
CephFS - Bug #64298 (New): CephFS metadata pool has large OMAP objects corresponding to strays
https://tracker.ceph.com/issues/64298 | 2024-02-02T12:37:42Z | Alexander Patrakov
<p>Hello developers,</p>
<p>A customer has a cluster which currently has 4 large OMAP objects (one old and three new) in its metadata pool. I am aware of <a class="external" href="https://tracker.ceph.com/issues/45333">https://tracker.ceph.com/issues/45333</a>, and in this comment <a class="external" href="https://tracker.ceph.com/issues/45333#note-6">https://tracker.ceph.com/issues/45333#note-6</a> a procedure for triggering the directory fragmentation exists: reconstruct the directory path and list that directory to get it fragmented. However, in our case, this procedure is inapplicable.</p>
<pre>
# rados getxattr --pool=mainfs.meta 100290d9cb3.00000000 parent | ceph-dencoder type inode_backtrace_t import - decode dump_json
{
"ino": 1100200385715,
"ancestors": [
{
"dirino": 1543,
"dname": "100290d9cb3",
"version": 318702055
},
{
"dirino": 256,
"dname": "stray7",
"version": 1405762425
}
],
"pool": 2,
"old_pools": []
}
</pre>
<p>See: it is a stray. In fact, all three new large OMAP objects correspond to stray directories, which for this reason cannot be listed. Instructions should be provided on how to deal with this situation.</p>
<p>Regarding possible snapshots: the oldest snapshot of a directory that "officially" should have snapshots is dated January 28, 2024. There might be older snapshots of other directories, I have not searched for them and I don't know if they exist.</p>
<p>Regarding the contents of one of the stray objects, I did this to get some statistics:</p>
<pre>
# ceph tell mds.0 dump tree "~mdsdir/stray7" > stray7.json
# ls -l stray7.json
-rw-r--r-- 1 root root 710084873 Feb 2 08:25 stray7.json
# wc -l stray7.json
23391176 stray7.json
# grep stray_prior_path stray7.json | wc -l
135172
# grep stray_prior_path stray7.json | grep -v '"stray_prior_path": ""' | wc -l
358
</pre>
<p>I can confirm that the entries with a non-empty stray_prior_path are "clustered" in two different directories. I checked one entry manually: it does not exist as either a file or a directory, but its parent does, and the parent contains a lot of existing subdirectories named in a similar way.</p>
teuthology - Bug #13700 (New): Command failed with status 139: ''sudo adjust-ulimits ceph-coverag...
https://tracker.ceph.com/issues/13700 | 2015-11-05T07:39:46Z | bo cai
<p>Many of my tasks failed during execution for the following reason:</p>
<pre>
2015-11-05T14:17:05.103 INFO:tasks.rados.rados.0.dtod003.stdout:3998: left oid 57 (ObjNum 1694 snap 377 seq_num 1694)
2015-11-05T14:17:05.103 INFO:tasks.rados.rados.0.dtod003.stdout:3998: done (0 left)
2015-11-05T14:17:05.137 INFO:tasks.rados.rados.0.dtod003.stderr:0 errors.
2015-11-05T14:17:05.137 INFO:tasks.rados.rados.0.dtod003.stderr:
2015-11-05T14:17:05.158 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2015-11-05T14:17:05.158 INFO:tasks.thrashosds:joining thrashosds
2015-11-05T14:17:05.163 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 125, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/ceph-qa-suite_master/tasks/thrashosds.py", line 179, in task
    thrash_proc.do_join()
  File "/home/teuthworker/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 435, in do_join
    self.thread.get()
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on dtod003 with status 139: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
2015-11-05T14:17:05.168 DEBUG:teuthology.run_tasks:Unwinding manager ceph
2015-11-05T14:17:05.169 INFO:teuthology.orchestra.run.dtod003:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format json'
2015-11-05T14:17:05.832 INFO:teuthology.orchestra.run.dtod003.stderr:2015-11-05 14:17:06.940572 7fe9e77d8700 -1 WARNING: the following dangerous and experimental features are enabled: keyvaluestore,ms-type-async
2015-11-05T14:17:05.916 INFO:teuthology.orchestra.run.dtod003.stderr:2015-11-05 14:17:07.018080 7fe9e77d8700 -1 WARNING: the following dangerous and experimental features are enabled: keyvaluestore,ms-type-async
2015-11-05T14:17:05.916 INFO:teuthology.orchestra.run.dtod003.stderr:2015-11-05 14:17:07.019019 7fe9e77d8700 -1 WARNING: experimental feature 'ms-type-async' is enabled
2015-11-05T14:17:05.917 INFO:teuthology.orchestra.run.dtod003.stderr:Please be aware that this feature is experimental, untested,
2015-11-05T14:17:05.917 INFO:teuthology.orchestra.run.dtod003.stderr:unsupported, and may result in data corruption, data loss,
2015-11-05T14:17:05.918 INFO:teuthology.orchestra.run.dtod003.stderr:and/or irreparable damage to your cluster. Do not use
2015-11-05T14:17:05.918 INFO:teuthology.orchestra.run.dtod003.stderr:feature with important data.
2015-11-05T14:17:05.918 INFO:teuthology.orchestra.run.dtod003.stderr:
</pre>
<p>See the detailed log.</p>
Calamari - Bug #12127 (New): calamari virtualenv applies SWIG flag cerrosaswarn too broadly
https://tracker.ceph.com/issues/12127 | 2015-06-23T21:29:54Z | Christina Meno <cmeno@redhat.com>
<p>In <a class="external" href="https://github.com/ceph/calamari/pull/299">https://github.com/ceph/calamari/pull/299</a><br />we added <a class="external" href="https://github.com/joehandzik/calamari/commit/23523483ca74f5f1474e0eac836e6eb58d8b9c38">https://github.com/joehandzik/calamari/commit/23523483ca74f5f1474e0eac836e6eb58d8b9c38</a><br />to allow m2crypto to be built on CentOS 7.</p>
<p>Other requirements in requirements.txt will get the same treatment and could fail silently.</p>
<p>Three options come to mind for fixing this (in order of preference):<br />1. treat m2crypto as a system-level package dependency<br />2. fix whatever is broken in m2crypto upstream<br />3. narrow the salt so that only m2crypto is built with this flag</p>
Calamari - Bug #12080 (New): calamari crush location needs to deal with configurable cephx keyrin...
https://tracker.ceph.com/issues/12080 | 2015-06-18T21:26:17Z | Christina Meno <cmeno@redhat.com>
<pre>
21:12:03 gmeno | off_topic: you wanted to know where this was https://github.com/ceph/calamari/blob/master/salt/srv/salt/base/calamari-crush-location.py
21:12:15 dmick | ah right
21:12:41 dmick | oh lookie there, try the osd key first explicitly, then try the admin key
21:13:27 gmeno | yep
21:13:39 dmick | you realize, of course, that those keyring paths are configurable, and you really ought to be asking ceph where they are
21:13:48 dmick | which of course requires you to have an admin key, probably :)
21:14:00 dmick | sigh
21:14:19 dmick | actually I wonder if that's true; maybe ceph-conf doesn't need keys
21:14:28 gmeno | hah that is something I should have guessed, convention is an outdated concept in linux
21:15:43 dmick | ceph-conf has a hidden switch (isn't that cool?) that actually asks rather than just parsing config files
21:15:46 dmick | so it gets defaults too
21:15:54 dmick | --show-config-value
21:16:28 dmick | $ ceph-conf --name osd.0 --show-config-value 'keyring'
21:16:28 dmick | /var/lib/ceph/osd/ceph-0/keyring
21:16:50 dmick | $ ceph-conf --name mon.a --show-config-value 'keyring'
21:16:50 dmick | /etc/ceph/ceph.mon.a.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin
21:17:04 dmick | and yay it doesn't appear to need a key itself
</pre>