Ceph : Issues
https://tracker.ceph.com/
2024-03-11T19:20:43Z

rgw - Feature #64851 (New): RGW: Support read_from_replica everywhere
https://tracker.ceph.com/issues/64851
2024-03-11T19:20:43Z, Greg Farnum <gfarnum@redhat.com>

We would like RGW to support read-from-replica for stretch clusters.

rbd - Feature #64850 (New): rbd: Support read_from_replica everywhere
https://tracker.ceph.com/issues/64850
2024-03-11T19:19:57Z, Greg Farnum <gfarnum@redhat.com>

librbd already supports read-from-replica via the rbd_read_from_replica_policy config; does RBD need other modifications to support local reads in other tools? The mirror daemon comes to mind.
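
For context, a minimal sketch of how a librbd-based client opts into localized reads today, by setting the existing options on its cluster handle before connecting. This assumes a reachable cluster and a client.admin keyring; the crush_location value is a placeholder for the client's real topology.

#include <rados/librados.h>

/* Sketch: enable localized reads for an RBD client before connecting.
 * The crush_location below is a placeholder for the client's actual
 * position in the CRUSH hierarchy. */
int enable_localized_rbd_reads(rados_t *cluster)
{
    int ret = rados_create(cluster, "admin");
    if (ret < 0)
        return ret;
    rados_conf_read_file(*cluster, NULL);
    /* Read from the replica in the client's CRUSH location, not the primary. */
    rados_conf_set(*cluster, "rbd_read_from_replica_policy", "localize");
    rados_conf_set(*cluster, "crush_location", "datacenter=site1");
    return rados_connect(*cluster);
}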

RADOS - Feature #64849 (New): rados: Support read_from_replica everywhere
https://tracker.ceph.com/issues/64849
2024-03-11T19:18:58Z, Greg Farnum <gfarnum@redhat.com>

The Objecter supports read-from-replica if you pass in the LOCALIZE_READS flag. If we want to serve all read IO from a local copy in a stretch cluster, do we need to do anything else?
For instance, OSDs have their own Objecter to serve as a RADOS client for both cache pool (yuck) and copy_from purposes. Do we need extra configs to make sure it gets that setting? Are there other components within RADOS that need to be updated?
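
For reference, a minimal librados sketch of what passing that flag looks like from a client today; it assumes 'io' is an ioctx already open on a connected cluster handle and that "myobject" exists.

#include <rados/librados.h>

/* Sketch: issue one localized read through librados. */
int localized_read(rados_ioctx_t io)
{
    char buf[4096];
    size_t bytes_read = 0;
    int rval = 0;

    rados_read_op_t op = rados_create_read_op();
    rados_read_op_read(op, 0, sizeof(buf), buf, &bytes_read, &rval);
    /* LOCALIZE_READS tells the Objecter it may send this read to a
     * replica in the client's CRUSH location instead of the primary. */
    int ret = rados_read_op_operate(op, io, "myobject",
                                    LIBRADOS_OPERATION_LOCALIZE_READS);
    rados_release_read_op(op);
    return ret;
}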

mgr - Feature #64848 (New): mgr: Support read_from_replica everywhere (including modules)
https://tracker.ceph.com/issues/64848
2024-03-11T19:14:17Z, Greg Farnum <gfarnum@redhat.com>

We would like to be able to use read-from-replica throughout the Ceph stack. Some mgr modules access the cluster (mgr/volumes has a cephfs client; we support postgres-on-ceph and modules make use of it; etc.). How should we enable read-from-replica for those modules?

Linux kernel client - Feature #64847 (New): krbd/kcephfs: Support read_from_replica everywhere
https://tracker.ceph.com/issues/64847
2024-03-11T19:12:07Z, Greg Farnum <gfarnum@redhat.com>

We would like to be able to use read-from-replica throughout the Ceph stack. Should RBD and CephFS in the kernel use the same config, or different ones? Do we need new options here?

CephFS - Bug #64846 (New): CephFS: support read_from_replica everywhere
https://tracker.ceph.com/issues/64846
2024-03-11T19:10:16Z, Greg Farnum <gfarnum@redhat.com>

We would like to be able to use read-from-replica throughout the CephFS stack. Right now, there's a libcephfs::ceph_localize_reads() function, which I presume is invoked by Ganesha? But there is no way for admins to configure a client to do local reads, and no way to configure the MDS to use them. We should support that.
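
Assuming that function behaves like the other libcephfs toggles, configuring a userspace client for local reads would look roughly like this sketch (default ceph.conf and client.admin credentials assumed):

#include <cephfs/libcephfs.h>

/* Sketch: mount CephFS and opt this client into localized reads. */
int mount_with_local_reads(struct ceph_mount_info **cmount)
{
    int ret = ceph_create(cmount, "admin");
    if (ret < 0)
        return ret;
    ceph_conf_read_file(*cmount, NULL);
    ret = ceph_mount(*cmount, "/");
    if (ret < 0)
        return ret;
    /* Ask the client to read from replicas in its CRUSH location. */
    return ceph_localize_reads(*cmount, 1);
}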

Ceph - Feature #64845 (New): Support read_from_replica everywhere
https://tracker.ceph.com/issues/64845
2024-03-11T19:07:31Z, Greg Farnum <gfarnum@redhat.com>

RADOS supports reads from replicas now, and has done so for a while. It is not on by default and requires setting a flag in the Objecter/librados, because reading from random OSDs is less cache-efficient. But it is desirable in stretch cluster scenarios, which Ceph is seeing more of as time goes by.

Right now, only RBD supports read-from-replica, because the initial implementation did not properly order reads with in-progress writes on the OSD side, so it was only viable for snapshots. Happily, that changed several releases ago. So we should extend this option to every component! This ticket will serve as an umbrella, and perhaps hold the actual work.

I see two obvious approaches to resolving this:
1) Every component adds a config analogous to rbd_read_from_replica_policy. (I see CephFS has a libcephfs::ceph_localize_reads() function, but no obvious way to set it — weird?)
2) We make a RADOS config option that can be set by any random Ceph client.

The second is more broadly applicable and a shorter config, but leaves out the possibility of components applying appropriate system-specific checks or tweaks. The first is a little more work?
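
To make the second approach concrete, a hypothetical sketch; note that "rados_read_from_replica_policy" is an invented option name for illustration, not a real Ceph config:

#include <rados/librados.h>

/* Hypothetical sketch of approach 2: a single generic option, honored
 * by the Objecter for every component built on librados.
 * "rados_read_from_replica_policy" is invented for illustration. */
int connect_with_generic_policy(rados_t *cluster)
{
    int ret = rados_create(cluster, "admin");
    if (ret < 0)
        return ret;
    rados_conf_read_file(*cluster, NULL);
    /* Any RADOS client (RGW, RBD, CephFS, mgr modules) would inherit
     * this without per-component plumbing. */
    rados_conf_set(*cluster, "rados_read_from_replica_policy", "localize");
    return rados_connect(*cluster);
}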

CephFS - Feature #62364 (New): support dumping rstats on a particular path
https://tracker.ceph.com/issues/62364
2023-08-09T00:36:06Z, Greg Farnum <gfarnum@redhat.com>

Especially now that we have rstats disabled by default, we need an easy way to dump rstats (primarily rbytes, though not exclusively that) on given paths. This is really useful when monitoring things like deletion progress in a large directory, or tracking down problems — especially in mgr/volumes environments.

There are a few ways we could approach this — perhaps we add "fs" or "subvolume" commands that just do a lookup on the mgr's built-in client? We could extend the cache dump options (which already let you dump a particular inode by number) to let you dump a particular path? Or add a brand-new admin socket command on the MDS that does a full path-walk lookup?
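
In the meantime, where a client mount is available, rbytes for a single path can already be read through the CephFS virtual xattrs; a sketch using libcephfs, assuming 'cmount' is already mounted and rstats are being maintained for the tree:

#include <stdlib.h>
#include <cephfs/libcephfs.h>

/* Sketch: read the recursive-bytes rstat for one path via the
 * ceph.dir.rbytes virtual xattr. */
long long path_rbytes(struct ceph_mount_info *cmount, const char *path)
{
    char buf[32] = {0};
    int ret = ceph_getxattr(cmount, path, "ceph.dir.rbytes",
                            buf, sizeof(buf) - 1);
    if (ret < 0)
        return (long long)ret;
    return atoll(buf);
}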

mgr - Bug #59580 (Pending Backport): memory leak (RESTful module, maybe others?)
https://tracker.ceph.com/issues/59580
2023-04-28T00:13:00Z, Greg Farnum <gfarnum@redhat.com>

There are two separate reports on the mailing list of memory leaks in the mgr module:

[ceph-users] Memory leak in MGR after upgrading to pacific:

After upgrading from Octopus (15.2.17) to Pacific (16.2.12) two days
ago, I'm noticing that the MGR daemons keep failing over to standby and
then back every 24hrs. Watching the output of 'ceph orch ps' I can see
that the memory consumption of the mgr is steadily growing until it
becomes unresponsive.
When the mgr becomes unresponsive, tasks such as RESTful calls start to
fail, and the standby eventually takes over after ~20 minutes. I've
included a log of memory consumption (in 10 minute intervals) at the end
of this message. While the cluster recovers during this issue, the loss
of usage data during the outage, and the fact that it's occurring at
all, are problematic. Any assistance would be appreciated.
Note, this is a cluster that has been upgraded from an original jewel
based ceph using filestore, through bluestore conversion, container
conversion, and now to Pacific. The data below shows memory use with
three mgr modules enabled: cephadm, restful, iostat. By disabling
iostat, I can reduce the rate of memory consumption increasing to about
200MB/hr.

[ceph-users] MGR Memory Leak in Restful:

We've hit a memory leak in the Manager Restful interface, in versions
17.2.5 & 17.2.6. On our main production cluster the active MGR grew to
about 60G until the oom_reaper killed it, causing a successful failover
and restart of the failed one. We can then see that the problem is
recurring, actually on all 3 of our clusters.
We've traced this to when we enabled full Ceph monitoring by Zabbix last
week. The leak is about 20GB per day, and seems to be proportional to
the number of PGs. For some time we just had the default settings, and
no memory leak, but had not got around to finding why many of the Zabbix
items were showing as Access Denied. We traced this to the MGR's MON
CAPS which were "mon 'profile mgr'".
The MON logs showed recurring:
log_channel(audit) log [DBG] : from='mgr.284576436 192.168.xxx.xxx:0/2356365' entity='mgr.host1' cmd=[{"format": "json", "prefix": "pg dump"}]: access denied
Changing the MGR CAPS to "mon 'allow *'" and restarting the MGR
immediately allowed that to work, and all the follow-on REST calls worked.
log_channel(audit) log [DBG] : from='mgr.283590200 192.168.xxx.xxx:0/1779' entity='mgr.host1' cmd=[{"format": "json", "prefix": "pg dump"}]: dispatch
However it has also caused the memory leak to start.
We've reverted the CAPS and are back to how we were.

CephFS - Bug #56522 (Resolved): Do not abort MDS on unknown messages
https://tracker.ceph.com/issues/56522
2022-07-11T17:18:59Z, Greg Farnum <gfarnum@redhat.com>

Right now, in Server::dispatch(), we abort the MDS if we get a message type we don't understand.

This is horrible: it means that any malicious client can crash the server just by sending a message of a new type! That's a trivial denial of service.
Besides malicious clients, it also means that when there's a protocol issue, such as a new client erroneously sending new messages to the server, it crashes the whole system instead of just the new client.

Instead, we'll need to drop the message in a way that makes any kind of sense — perhaps we respond to unknown messages by blacklisting the client and closing the session?
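
A schematic sketch of the proposed shape; the real dispatch code is C++ inside ceph-mds, and the type and helper here are invented for illustration only:

/* Schematic only: on an unknown message type, evict the sender rather
 * than aborting the daemon. */
struct session;

static void blocklist_and_close_session(struct session *s) { (void)s; }

static void dispatch_sketch(int msg_type, struct session *s)
{
    switch (msg_type) {
    /* ... all known client message types handled in cases here ... */
    default:
        /* Instead of ceph_abort(): drop the message and evict the
         * sender, so one misbehaving client cannot take down the MDS. */
        blocklist_and_close_session(s);
        break;
    }
}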

CephFS - Bug #44916 (Resolved): client: syncfs flush is only fast with a single MDS
https://tracker.ceph.com/issues/44916
2020-04-02T15:51:52Z, Greg Farnum <gfarnum@redhat.com>

When we invoke Client::syncfs, we call into flush_caps_sync() and that invokes check_caps() for everything dirty, adding the internal flag CHECK_CAPS_SYNCHRONOUS to the last dirty cap in the list. check_caps() will set FLAG_SYNC on the MClientCaps message in that case.

But that method only sends the FLAG_SYNC on a single MDS session, and if we have dirty data for more than one MDS, we will have to wait for all of them to decide to commit to disk! This may make syncfs() quite slow on scale-out clusters.
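
To illustrate the problem, a schematic sketch; the real implementation is C++ in the ceph client, and send_cap_flush() is an invented stand-in for the MClientCaps send path:

#define FLAG_SYNC 0x1

struct dirty_cap { int mds_rank; int flags; };

static void send_cap_flush(struct dirty_cap *c) { (void)c; }

/* Schematic: only the final dirty cap gets FLAG_SYNC, so all the other
 * MDS sessions flush lazily and syncfs() waits on each of them. */
static void flush_caps_sync_sketch(struct dirty_cap *dirty, int n)
{
    for (int i = 0; i < n; i++) {
        if (i == n - 1)
            dirty[i].flags |= FLAG_SYNC;
        send_cap_flush(&dirty[i]);
    }
}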

CephFS - Feature #18475 (Resolved): qa: run xfstests in the nightlies
https://tracker.ceph.com/issues/18475
2017-01-10T14:40:59Z, Greg Farnum <gfarnum@redhat.com>

We have manually run xfstests against ceph-fuse and kceph before, but apparently don't do so in the nightlies. Jeff recently did it again and found a bunch of bugs.

Since xfstests is obviously a major test suite, this is silly and we want to run it. Fix this!

CephFS - Bug #14557 (Duplicate): snaps: failed snaptest-multiple-capsnaps.sh
https://tracker.ceph.com/issues/14557
2016-01-29T05:19:07Z, Greg Farnum <gfarnum@redhat.com>

This is on a testing branch, but it's about to get merged to master.
http://pulpito.ceph.com/gregf-2016-01-26_15:35:20-fs-greg-fs-testing-126---basic-mira/44959/

2016-01-26T19:58:40.416 INFO:tasks.workunit:Running workunit fs/snaps/snaptest-multiple-capsnaps.sh...
2016-01-26T19:58:40.417 INFO:teuthology.orchestra.run.mira069:Running (workunit test fs/snaps/snaptest-multiple-capsnaps.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=0ec5adcdcfce5ef675be0fa88c3806dd27a8deb7 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/fs/snaps/snaptest-multiple-capsnaps.sh'
2016-01-26T19:58:40.462 INFO:tasks.workunit.client.0.mira069.stderr:+ set -e
2016-01-26T19:58:40.462 INFO:tasks.workunit.client.0.mira069.stderr:+ ceph mds set allow_new_snaps true --yes-i-really-mean-it
2016-01-26T19:58:41.782 INFO:tasks.workunit.client.0.mira069.stderr:enabled new snapshots
2016-01-26T19:58:41.801 INFO:tasks.workunit.client.0.mira069.stderr:+ echo asdf
2016-01-26T19:58:41.802 INFO:tasks.workunit.client.0.mira069.stderr:+ mkdir .snap/1
2016-01-26T19:58:41.820 INFO:tasks.workunit.client.0.mira069.stderr:+ chmod 777 a
2016-01-26T19:58:41.825 INFO:tasks.workunit.client.0.mira069.stderr:+ mkdir .snap/2
2016-01-26T19:58:41.862 INFO:tasks.workunit.client.0.mira069.stderr:+ echo qwer
2016-01-26T19:58:41.869 INFO:tasks.workunit.client.0.mira069.stderr:+ mkdir .snap/3
2016-01-26T19:58:41.891 INFO:tasks.workunit.client.0.mira069.stderr:+ chmod 666 a
2016-01-26T19:58:41.893 INFO:tasks.workunit.client.0.mira069.stderr:+ mkdir .snap/4
2016-01-26T19:58:41.926 INFO:tasks.workunit.client.0.mira069.stderr:+ echo zxcv
2016-01-26T19:58:41.934 INFO:tasks.workunit.client.0.mira069.stderr:+ mkdir .snap/5
2016-01-26T19:58:41.972 INFO:tasks.workunit.client.0.mira069.stderr:+ ls -al .snap/1/a .snap/2/a .snap/3/a .snap/4/a .snap/5/a
2016-01-26T19:58:42.003 INFO:tasks.workunit.client.0.mira069.stdout:-rw-rw-r-- 1 ubuntu ubuntu 0 Jan 27 03:58 .snap/1/a
2016-01-26T19:58:42.003 INFO:tasks.workunit.client.0.mira069.stdout:-rwxrwxrwx 1 ubuntu ubuntu 5 Jan 27 03:58 .snap/2/a
2016-01-26T19:58:42.003 INFO:tasks.workunit.client.0.mira069.stdout:-rwxrwxrwx 1 ubuntu ubuntu 5 Jan 27 03:58 .snap/3/a
2016-01-26T19:58:42.003 INFO:tasks.workunit.client.0.mira069.stdout:-rw-rw-rw- 1 ubuntu ubuntu 5 Jan 27 03:58 .snap/4/a
2016-01-26T19:58:42.004 INFO:tasks.workunit.client.0.mira069.stdout:-rw-rw-rw- 1 ubuntu ubuntu 5 Jan 27 03:58 .snap/5/a
2016-01-26T19:58:42.004 INFO:tasks.workunit.client.0.mira069.stderr:+ grep asdf .snap/1/a
2016-01-26T19:58:42.008 INFO:tasks.workunit:Stopping ['fs/snaps/snaptest-multiple-capsnaps.sh'] on client.0...
2016-01-26T19:58:42.008 INFO:teuthology.orchestra.run.mira069:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0 /home/ubuntu/cephtest/clone'

teuthology - Bug #12381 (Resolved): nodes are missing dbench
https://tracker.ceph.com/issues/12381
2015-07-17T11:03:19Z, Greg Farnum <gfarnum@redhat.com>

http://pulpito.ceph.com/teuthology-2015-07-14_23:04:02-fs-next---basic-multi/974430/

2015-07-16T12:10:07.279 INFO:tasks.workunit.client.0.plana35.stderr:/home/ubuntu/cephtest/workunit.client.0/suites/dbench.sh: line 5: dbench: command not found
2015-07-16T12:10:07.280 INFO:tasks.workunit:Stopping ['suites/dbench.sh'] on client.0...
2015-07-16T12:10:07.280 INFO:teuthology.orchestra.run.plana35:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/workunit.client.0'
2015-07-16T12:10:07.352 ERROR:teuthology.parallel:Exception in parallel execution

I went and checked on the node, and it indeed does not have dbench installed. I checked plana34, and dbench is present. Apologies if this is the wrong tracker, but I suspect it's a result of the switch from Chef to Ansible...

CephFS - Cleanup #4744 (In Progress): mds: pass around LogSegments via std::shared_ptr
https://tracker.ceph.com/issues/4744
2013-04-17T10:06:10Z, Greg Farnum <gfarnum@redhat.com>

These really ought to be ref-counted in some way to prevent early expiry.