Activity
From 03/16/2017 to 04/14/2017
04/14/2017
- 10:00 PM Backport #19620 (In Progress): kraken: MDS server crashes due to inconsistent metadata.
- 09:59 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- *master PR*: https://github.com/ceph/ceph/pull/14234
- 09:57 PM Backport #19483 (In Progress): kraken: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it"...
- 09:55 PM Backport #19335 (In Progress): kraken: MDS heartbeat timeout during rejoin, when working with lar...
- 09:54 PM Backport #19045 (In Progress): kraken: buffer overflow in test LibCephFS.DirLs
- 09:52 PM Backport #18950 (In Progress): kraken: mds/StrayManager: avoid reusing deleted inode in StrayMana...
- 09:49 PM Backport #18899 (In Progress): kraken: Test failure: test_open_inode
- 09:48 PM Backport #18706 (In Progress): kraken: fragment space check can cause replayed request fail
- 09:46 PM Backport #18700 (In Progress): kraken: client: fix the cross-quota rename boundary check conditions
- 09:44 PM Backport #18616 (In Progress): kraken: segfault in handle_client_caps
- 09:43 PM Backport #18566 (In Progress): kraken: MDS crashes on missing metadata object
- 09:42 PM Backport #18562 (In Progress): kraken: Test Failure: kcephfs test_client_recovery.TestClientRecovery
- 09:40 PM Backport #18552 (In Progress): kraken: ceph-fuse crash during snapshot tests
- 09:39 PM Bug #18166 (Resolved): monitor cannot start because of "FAILED assert(info.state == MDSMap::STATE...
- 09:39 PM Bug #18166: monitor cannot start because of "FAILED assert(info.state == MDSMap::STATE_STANDBY)"
- kraken backport is unnecessary (fix already in v11.2.0)
- 09:39 PM Backport #18283 (Resolved): kraken: monitor cannot start because of "FAILED assert(info.state == ...
- Already included in v11.2.0...
- 08:07 PM Support #16738: mount.ceph: unknown mount options: rbytes and norbytes
- Ceph: v10.2.7
Linux Kernel: 4.9.21-040921-generic on Ubuntu 16.04.1
The "rbytes" option in fstab seems to be work... - 01:42 PM Bug #16886 (Can't reproduce): multimds: kclient hang (?) in tests
- 12:50 PM Bug #19630 (Fix Under Review): StrayManager::num_stray is inaccurate
- https://github.com/ceph/ceph/pull/14554
- 09:26 AM Bug #19630 (Resolved): StrayManager::num_stray is inaccurate
- 09:50 AM Bug #19204 (Pending Backport): MDS assert failed when shutting down
- 09:48 AM Feature #19551 (Resolved): CephFS MDS health messages should be logged in the cluster log
- 08:28 AM Bug #18680 (Fix Under Review): multimds: cluster can assign active mds beyond max_mds during fail...
- 08:27 AM Bug #18680: multimds: cluster can assign active mds beyond max_mds during failures
- commit "mon/MDSMonitor: only allow deactivating the mds with max rank" in https://github.com/ceph/ceph/pull/14550 sho...
- 07:59 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- by commit a1499bc4 (mds: stop purging strays when mds is being shutdown)
- 07:56 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
- 07:58 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
- commit 20d43372 (mds: drop superfluous MMDSOpenInoReply)
- 07:55 AM Bug #19239 (Fix Under Review): mds: stray count remains static after workflows complete
- should be fixed by commits in https://github.com/ceph/ceph/pull/14550...
04/13/2017
- 04:23 PM Bug #19395 (Fix Under Review): "Too many inodes in cache" warning can happen even when trimming i...
- The bad state is long gone, so I'm just going to change this ticket to fixing the weird case where we were getting a ...
- 03:39 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Yeah, you're right Darrell, operator error :) I guess the VM did not have the correct storage network attached when I tried t...
- 02:22 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- FYI, I don't think the module signature stuff is an issue. It's just notifying you that you've loaded a module that d...
- 12:23 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- One difference I noticed between 4.4 and 4.9 kernels
- with 4.4 kernel on cephfs directory sizes (total bytes of al...
- 08:04 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Yeah, did that.
My test server with compiled 4.9 kernel and patched ceph module mounted cephfs just fine.
It might...
- 02:15 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> Well, managed to patch ceph kernel module with 2b1ac852 commit, but my Ubuntu 4.9 kernel will not...
- 03:02 PM Bug #19566 (Fix Under Review): MDS crash on mgr message during shutdown
- https://github.com/ceph/ceph/pull/14505
- 02:24 PM Bug #19566 (In Progress): MDS crash on mgr message during shutdown
- 02:31 PM Backport #19620 (Resolved): kraken: MDS server crashes due to inconsistent metadata.
- https://github.com/ceph/ceph/pull/14574
- 02:31 PM Backport #19619 (Resolved): jewel: MDS server crashes due to inconsistent metadata.
- https://github.com/ceph/ceph/pull/14676
- 12:03 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- Patrick was going to spin up a patch for this, so reassigning to him. Patrick, if I have that wrong, then please just...
- 11:07 AM Bug #19406 (Pending Backport): MDS server crashes due to inconsistent metadata.
- Marking as pending backport for the fix to data-scan which seems likely to be the underlying cause https://github.com...
- 08:13 AM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Well, currently the last MDS fails over to the old balancer, so he can in fact shift his load back to the others acco...
04/12/2017
- 10:27 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Well, managed to patch ceph kernel module with 2b1ac852 commit, but my Ubuntu 4.9 kernel will not load unsigned modul...
- 01:36 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- could you try adding commit 2b1ac852 (ceph: try getting buffer capability for readahead/fadvise) to your 4.9.x kernel...
- 07:36 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- I do have mmap disabled in Dovecot conf.
Relevant bits from Dovecot conf:
mmap_disable = yes
mail_nfs_index = ... - 02:17 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> Got another error with 4.9.21 kernel.
>
> On the cephfs node /sys/kernel/debug/ceph/xxx/:
>
...
- 08:29 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- We could make the greedyspill.lua balancer check to see if it is the last MDS. Then just return instead of failing. I...
- 03:17 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- and the cpu use on MDS0 stays at +/- 250%
- 03:16 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- This error shouldn't be an expected occurrence. I'll create a fix for this.
- 03:15 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- I see this in the logs...
- 03:12 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Nope. Is your load on mds.0? If yes, and it gets heavily loaded, and if mds.1 has load = 0, then I expect the balan...
- 03:09 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- did you modify the greedyspill.lua script at all?
- 02:51 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- BTW, you need to set debug_mds_balancer = 2 to see the balance working.
- 02:49 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Yes. For example, when I have 50 clients untarring the linux kernel into unique directories, the load is moved around.
- 02:39 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Dan, do you see any evidence of actual load balancing?
- 02:39 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- I also see this error. I have 2 Active/active mdses. The first shows no errors, the second shows the errors above. No...
- 11:56 AM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Ahh, it's even documented:...
- 11:55 AM Bug #19589 (Resolved): greedyspill.lua: :18: attempt to index a nil value (field '?')
- The included greedyspill.lua doesn't seem to work in a simple 3-active MDS scenario....
- 01:58 PM Bug #19593 (Resolved): purge queue and standby replay mds
- The current code opens the purge queue when the mds starts standby replay. The purge queue can have changed when the standby repl...
- 12:46 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- John Spray wrote:
> Intuitively I would say that things we expose as xattrs should probably bump ctime when they're ...
- 12:30 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- Intuitively I would say that things we expose as xattrs should probably bump ctime when they're set, if we're being c...
- 11:54 AM Bug #19388 (Closed): mount.ceph does not accept -s option
- 11:43 AM Bug #18578 (Resolved): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
- 11:43 AM Backport #18707 (Resolved): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_non...
- 10:46 AM Bug #18461 (Resolved): failed to reconnect caps during snapshot tests
- 10:46 AM Backport #18678 (Resolved): kraken: failed to reconnect caps during snapshot tests
- 10:24 AM Bug #19205 (Resolved): Invalid error code returned by MDS is causing a kernel client WARNING
- 10:24 AM Backport #19206 (Resolved): jewel: Invalid error code returned by MDS is causing a kernel client ...
04/11/2017
- 09:03 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Also dumped the mds cache (healthy cluster state 8 hours later)
Searched inode in mds error log:...
- 01:54 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Got another error with 4.9.21 kernel.
On the cephfs node /sys/kernel/debug/ceph/xxx/:...
- 08:16 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- No, I mean the ctime. You only want to update the mtime if you're changing the directory's contents. I would consider...
- 07:55 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- > Should we be updating the ctime and change_attr in the ceph.dir.layout case as well?
Did you mean mtime?
I th...
- 07:32 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- I think you're right -- well spotted! In particular, we need to bump it in the case where we update the ctime (ceph.f...
- 07:21 PM Bug #19583 (Resolved): mds: change_attr not inc in Server::handle_set_vxattr
- Noticed this was missing, was this intentional?
- 07:22 PM Bug #19388: mount.ceph does not accept -s option
- Looking at the complete picture, I think it's easiest (and acceptable for me) to wait for Jessie's successor, Stretch...
- 06:53 AM Feature #19578: mds: optimize CDir::_omap_commit() and CDir::_committed() for large directory
- We can track dirty dentries in a dirty list; CDir::_omap_commit() and CDir::_committed() then only need to check dentries...
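The dirty-list idea above as a minimal standalone sketch (hypothetical simplified Dentry/Dir types, not the actual CDir/CDentry code): commit walks only a side list of dirty entries instead of scanning every dentry in the fragment.

    #include <iostream>
    #include <list>
    #include <string>
    #include <unordered_map>

    // Hypothetical simplified stand-ins for CDentry/CDir, for illustration only.
    struct Dentry {
      bool dirty = false;
    };

    struct Dir {
      std::unordered_map<std::string, Dentry> items;  // the whole dirfrag
      std::list<std::string> dirty_list;              // names of dentries needing commit

      void mark_dirty(const std::string& name) {
        Dentry& dn = items[name];
        if (!dn.dirty) {
          dn.dirty = true;
          dirty_list.push_back(name);
        }
      }

      // Analogue of _omap_commit()/_committed(): walk only the dirty list,
      // not every dentry in the fragment.
      void commit() {
        for (const std::string& name : dirty_list) {
          std::cout << "committing " << name << "\n";
          items[name].dirty = false;
        }
        dirty_list.clear();
      }
    };

    int main() {
      Dir dir;
      for (int i = 0; i < 100000; ++i) dir.items["f" + std::to_string(i)];
      dir.mark_dirty("f42");
      dir.mark_dirty("f99");
      dir.commit();  // visits 2 dentries instead of 100000
    }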
- 06:51 AM Feature #19578 (Resolved): mds: optimize CDir::_omap_commit() and CDir::_committed() for large di...
- CDir::_omap_commit() and CDir::_committed() need to traverse whole dirfrag to find dirty dentries. It's not efficienc...
- 03:38 AM Bug #19450 (Fix Under Review): PurgeQueue read journal crash
- https://github.com/ceph/ceph/pull/14447
- 02:18 AM Bug #19450: PurgeQueue read journal crash
- The mds always first reads all entries in the mds log, then writes new entries to it. The partial entry is detected and dropped...
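A rough sketch of that read-then-append behaviour, assuming a toy length-prefixed entry format (not the real mds journal encoding) and a little-endian host: complete entries are replayed, a short trailing entry is treated as partial and dropped, and new writes resume from the end of the last complete entry.

    #include <cstdint>
    #include <cstring>
    #include <iostream>
    #include <vector>

    // Hypothetical journal layout for illustration: [u32 length][payload] repeated.
    std::vector<std::vector<uint8_t>> read_entries(const std::vector<uint8_t>& log,
                                                   size_t& write_pos) {
      std::vector<std::vector<uint8_t>> entries;
      size_t pos = 0;
      while (pos + sizeof(uint32_t) <= log.size()) {
        uint32_t len;
        std::memcpy(&len, log.data() + pos, sizeof(len));  // assumes little-endian host
        if (pos + sizeof(len) + len > log.size()) {
          // Partial entry at the tail: drop it, new entries overwrite from here.
          break;
        }
        entries.emplace_back(log.begin() + pos + sizeof(len),
                             log.begin() + pos + sizeof(len) + len);
        pos += sizeof(len) + len;
      }
      write_pos = pos;  // subsequent appends start after the last complete entry
      return entries;
    }

    int main() {
      std::vector<uint8_t> log = {3, 0, 0, 0, 'a', 'b', 'c',   // complete entry
                                  9, 0, 0, 0, 'x', 'y'};       // truncated entry
      size_t write_pos = 0;
      auto entries = read_entries(log, write_pos);
      std::cout << entries.size() << " complete entries, next write at offset "
                << write_pos << "\n";
    }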
04/10/2017
- 10:58 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Some weird issues arose with my applications (unrelated to cephfs) with the 4.10.x kernel.
The same fix is in 4.9.x k...
- 02:19 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Upgraded cephfs clients to 4.10.9 kernel.
- 12:08 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Can you try updating the cephfs client to a 4.10.x kernel?
- 09:59 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Error happened again:...
- 11:48 AM Bug #19566 (Resolved): MDS crash on mgr message during shutdown
- ...
04/09/2017
- 05:30 PM Bug #19388: mount.ceph does not accept -s option
- Fair point about options getting passed through.
The thing I'm not sure about here is who is going to backport a f...
- 01:16 PM Bug #19388: mount.ceph does not accept -s option
- mount.ceph passes options which it does not recognize directly to the kernel mount function, therefore a full impleme...
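A small sketch of that pass-through pattern (generic code with a hypothetical set of handled options, not the actual mount.ceph.c source): options the helper recognizes are consumed, everything else is forwarded to the kernel unchanged, so new kernel-side options need no helper changes.

    #include <iostream>
    #include <set>
    #include <sstream>
    #include <string>

    // Illustrative only: split a comma-separated option string into options the
    // helper handles itself and options passed straight through to mount(2).
    int main() {
      const std::set<std::string> handled = {"name", "secretfile"};  // hypothetical subset
      std::string opts = "name=admin,secretfile=/etc/ceph/secret,rbytes,sloppy";

      std::string passthrough;
      std::istringstream in(opts);
      std::string opt;
      while (std::getline(in, opt, ',')) {
        std::string key = opt.substr(0, opt.find('='));
        if (handled.count(key)) {
          std::cout << "helper consumes: " << opt << "\n";
        } else {
          passthrough += passthrough.empty() ? opt : "," + opt;
        }
      }
      std::cout << "forwarded to the kernel: " << passthrough << "\n";
    }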
- 12:45 PM Bug #19450: PurgeQueue read journal crash
- hit this again...
04/08/2017
- 10:03 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- The master install-deps.sh installed a flood of other stuff the earlier one didn't. But I still get fatal errors when runni...
- 09:52 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- I'll download latest master instead and try.
- 09:41 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- Okay, I followed the guide at the Ceph site on how to prepare and compile Ceph.
I already did run install_deps.sh in or...
04/07/2017
- 03:04 PM Feature #19551 (Fix Under Review): CephFS MDS health messages should be logged in the cluster log
This was inspired by a ceph.log from cern's testing, in which we could see mysterious-looking fsmap epoch bump mess...
- 03:01 PM Feature #19551 (Resolved): CephFS MDS health messages should be logged in the cluster log
- 08:37 AM Bug #19239 (In Progress): mds: stray count remains static after workflows complete
- 07:26 AM Bug #18850 (Rejected): Leak in MDCache::handle_dentry_unlink
- 05:53 AM Bug #19445 (Resolved): client: warning: ‘*’ in boolean context, suggest ‘&&’ instead #14263
- The PR#14308 is merged.
04/06/2017
- 01:30 PM Bug #18850: Leak in MDCache::handle_dentry_unlink
- It's likely some CInode/CDir/CDentry in cache are not properly freed when the mds process exits; not caused by MDCache::h...
- 01:03 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- Christoffer Lilja wrote:
> Hi,
>
> I have little spare time to spend on this and can't get past this compile erro...
04/05/2017
- 07:54 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- Hi,
I have little spare time to spend on this and can't get past this compile error due to my very limited knowled...
- 02:03 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- Christoffer Lilja wrote:
> I downloaded the 10.2.6 tar and tried to apply the above patch from Zheng Yan but it got ...
- 05:10 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Thanks for the link Zheng. Looks like that fix is in mainline as of 4.10.2. When I have a chance, I'll try upgrading ...
- 09:13 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> My cephfs clients (Ubuntu Xenial) are running kernel from Ubuntu PPA: http://kernel.ubuntu.com/~k...
- 08:24 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- My cephfs clients (Ubuntu Xenial) are running kernel from Ubuntu PPA: http://kernel.ubuntu.com/~kernel-ppa/mainline/v...
- 07:39 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> Hit the same bug today with 4.4.59 kernel client.
> 2 MDS servers (1 standby) 10.2.6-1 on Ubuntu...
- 07:36 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Darrell Enns wrote:
> Zheng Yan wrote:
> > probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a...
- 01:33 PM Bug #19501 (Fix Under Review): C_MDSInternalNoop::complete doesn't free itself
- 01:33 PM Bug #19501: C_MDSInternalNoop::complete doesn't free itself
- https://github.com/ceph/ceph/pull/14347
- 01:17 PM Bug #19501 (Resolved): C_MDSInternalNoop::complete doesn't free itself
- This causes a memory leak
- 09:20 AM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
- geng jichao wrote:
> I have a question, if the file struct is destroyed,how to ensure that cache_ctl.index is correc... - 05:52 AM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
- I have a question: if the file struct is destroyed, how do we ensure that cache_ctl.index is correct?
In other words, req...
- 01:32 AM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
- kernel patch https://github.com/ceph/ceph-client/commit/b7e2eee12aa174bc91279a7cee85e9ea73092bad
- 06:01 AM Bug #19438: ceph mds error "No space left on device"
- ceph mds error "No space left on device" ,This problem I have encountered in the 10.2.6 version 。
Need to add a lin... - 04:12 AM Bug #19450: PurgeQueue read journal crash
- ...
04/04/2017
- 09:09 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Very interesting! That means we have:
4.4.59 - bug
4.7.5 - no bug (or at least rare enough that I'm not seeing it...
- 06:03 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Hit the same bug today with 4.4.59 kernel client.
2 MDS servers (1 standby) 10.2.6-1 on Ubuntu Trusty.
From mds l...
- 09:05 PM Bug #19388: mount.ceph does not accept -s option
- The kernel client has a mount helper in src/mount/mount.ceph.c -- although that only comes into play if the ceph pack...
- 01:50 PM Bug #19306 (Fix Under Review): fs: mount NFS to cephfs, and then ls a directory containing a larg...
- https://github.com/ceph/ceph/pull/14317
- 01:18 PM Bug #19456: Hadoop suite failing on missing upstream tarball
- I see the hadoop task is in ceph/teuthology - would it make sense to move it to ceph/ceph?
- 12:44 PM Backport #19483 (Resolved): kraken: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" co...
- https://github.com/ceph/ceph/pull/14573
- 12:44 PM Backport #19482 (Resolved): jewel: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" com...
- https://github.com/ceph/ceph/pull/14674
- 12:42 PM Backport #19466 (Resolved): jewel: mds: log rotation doesn't work if mds has respawned
- https://github.com/ceph/ceph/pull/14673
- 01:03 AM Bug #19445 (In Progress): client: warning: ‘*’ in boolean context, suggest ‘&&’ instead #14263
- https://github.com/ceph/ceph/pull/14308
04/03/2017
- 09:29 PM Bug #19456: Hadoop suite failing on missing upstream tarball
- Looks like v2.5.2 is too old. It is now in the Apache archive: https://archive.apache.org/dist/hadoop/core/hadoop-2.5...
- 08:20 PM Bug #19456 (Rejected): Hadoop suite failing on missing upstream tarball
- This is jewel v10.2.7 point release candidate
Also true on all branches
Run: http://pulpito.ceph.com/yuriw-2017-0...
- 08:15 PM Bug #19388: mount.ceph does not accept -s option
- Looking at it, it seems I'll have to update the Linux kernel as well to support the sloppy option. Doing so in a sane...
- 12:53 PM Bug #19306 (In Progress): fs: mount NFS to cephfs, and then ls a directory containing a large num...
- 12:44 PM Bug #19450 (Resolved): PurgeQueue read journal crash
- ...
- 11:16 AM Bug #19437 (Fix Under Review): fs: The mount point breaks off when mds switch happened.
- https://github.com/ceph/ceph/pull/14267
04/02/2017
- 07:30 AM Bug #19445 (Resolved): client: warning: ‘*’ in boolean context, suggest ‘&&’ instead #14263
- The following warning appears during make. Here roll_die and empty() return bool. This can be easily fixed by removi...
04/01/2017
- 06:15 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- I downloaded the 10.2.6 tar and tried to apply the above patch from Zheng Yan but it got rejected. I guess it's writt...
03/31/2017
- 11:49 PM Bug #19291 (Pending Backport): mds: log rotation doesn't work if mds has respawned
- The bug is also in jewel 10.2.6, due to 6efad699249ba7c6928193dba111dbb23b606beb.
- 10:41 AM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- The log says "objects: max 100" but the default of client_oc_max_objects is 1000, so I assumed there was some manual ...
- 10:32 AM Bug #19438: ceph mds error "No space left on device"
- Hmm, so fragmentation is enabled but apparently isn't happening? Is your ceph.conf with "mds_bal_frag = true" presen...
- 10:17 AM Bug #19438 (Won't Fix): ceph mds error "No space left on device"
- While testing the MDS cluster with a bash script, I created a test directory under the ceph mount path; in the test direc...
- 05:04 AM Bug #19437 (Resolved): fs: The mount point breaks off when mds switch happened.
My ceph version is jewel and my cluster has two nodes. I start two mds, one active and the other is hot-standby mo...
- 02:58 AM Bug #18579 (Fix Under Review): Fuse client has "opening" session to nonexistent MDS rank after MD...
- should be fixed in https://github.com/ceph/ceph/pull/13698
03/30/2017
- 02:33 PM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- I don't think I have changed the defaults. How can I see the values I use?
- 12:35 PM Bug #19426: knfs blogbench hang
- I got several oopses at other places. Seems like random memory corruption; it only happens when cephfs is exported by nf...
- 09:57 AM Bug #19426: knfs blogbench hang
- ...
- 08:34 AM Bug #19426 (Can't reproduce): knfs blogbench hang
- http://pulpito.ceph.com/yuriw-2017-03-24_21:44:17-knfs-jewel-integration-ktdreyer-testing-basic-smithi/941067/
http:...
- 01:23 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- ...
03/29/2017
- 07:09 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- I don't have a build environment nor the Ceph source code on the machines.
I'll download the source code, apply the ...
- 06:45 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- Looking at what DataScan.cc currently does, looks like we were injecting things with hash set to zero, so it would pi...
- 02:02 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- looks like ~mds0's dir_layout.dl_dir_hash is wrong. Please check if below patch can prevent mds from crash...
- 01:28 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- The logfile contains file structure that in itself is sensitive information; that's why it's truncated.
I'll see if...
- 01:01 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- To work out how the metadata is now corrupted (i.e. what the recovery tools broke), it would be useful to have the fu...
- 12:43 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- Can you go back to your logs from when the original metadata damage occurred? There should be a message in your clus...
- 10:55 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- I see now that the two other MDS servers crash the same way.
- 10:44 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- A NIC broke and caused tremendous load on the server. Shortly after, I had to kill the server by power; note that this ...
- 10:15 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- Please also tell us what else was going on around this event. Had you recently upgraded? Had another daemon recentl...
- 09:43 AM Bug #19406: MDS server crashes due to inconsistent metadata.
- please set 'debug_mds = 20', restart the mds and upload the mds.log
- 07:57 AM Bug #19406 (Resolved): MDS server crashes due to inconsistent metadata.
- -48> 2017-03-26 23:34:38.886779 7f4b95030700 5 mds.neutron handle_mds_map epoch 620 from mon.1
-47> 2017-03-2...
03/28/2017
- 10:21 PM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- Hmm.
I've tried setting some comically low limits:...
- 09:09 PM Bug #19401 (Fix Under Review): MDS goes readonly writing backtrace for a file whose data pool has...
- So on reflection I realise that the deletion/recovery cases are not an issue because those code paths already handle ...
- 05:28 PM Bug #19401 (Resolved): MDS goes readonly writing backtrace for a file whose data pool has been re...
- Reproduce:
1. create a pool
2. add it as a data pool
3. set that pool in a layout on the client, and write a file
...
- 09:05 PM Bug #19395: "Too many inodes in cache" warning can happen even when trimming is working
- Couple of observations from today:
* after an mds failover, the issue cleared and we're back to having "ino" (CIno...
- 02:55 PM Bug #19395: "Too many inodes in cache" warning can happen even when trimming is working
- Cache dumps in /root/19395 on mira060
- 01:36 PM Bug #19395 (Resolved): "Too many inodes in cache" warning can happen even when trimming is working
- 01:29 PM Bug #19291 (Resolved): mds: log rotation doesn't work if mds has respawned
- 01:29 PM Bug #19282 (Resolved): RecoveryQueue::_recovered asserts out on OSD errors
- 01:28 PM Fix #19288 (Resolved): Remove legacy "mds tell"
- 01:27 PM Bug #16709 (Pending Backport): No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" command
- 01:25 PM Feature #17604 (Resolved): MDSMonitor: raise health warning when there are no standbys but there ...
03/27/2017
- 05:26 PM Bug #19388: mount.ceph does not accept -s option
- OK, if you're motivated to do this (and update the pull request to implement -s rather than dropping it) then I don't...
- 05:24 PM Bug #19388: mount.ceph does not accept -s option
- I wasn't aware of this fix in autofs, thanks for pointing me to it.
However, several mount programs accept the -s ... - 10:32 AM Bug #19388: mount.ceph does not accept -s option
- According to the release notes, autofs 5.1.1 includes a fix to avoid passing -s to filesystems other than NFS (https:...
- 03:39 AM Bug #19388 (Fix Under Review): mount.ceph does not accept -s option
- https://github.com/ceph/ceph/pull/14158
- 11:58 AM Bug #16842: mds: replacement MDS crashes on InoTable release
- A patch that should allow MDSs to throw out the invalid sessions and start up. Does not fix whatever got the system ...
03/26/2017
- 12:31 PM Bug #19388 (Closed): mount.ceph does not accept -s option
- The mount.ceph tool does not accept the -s (sloppy) option. On Debian Jessie (and possibly more distributions), autof...
03/23/2017
- 01:35 PM Feature #18509 (Fix Under Review): MDS: damage reporting by ino number is useless
- 01:30 PM Feature #18509: MDS: damage reporting by ino number is useless
- https://github.com/ceph/ceph/pull/14104
- 02:16 AM Feature #19362 (Resolved): mds: add perf counters for each type of MDS operation
- It's desirable to know the types of operations are being performed on the MDS so we have a better idea over time what...
03/22/2017
- 06:31 AM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
- I have used the offset parameter of the ceph_dir_llseek function, and it will be passed to the mds in the readdir request, if ...
- 03:38 AM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
- This bug is not specific to kernel client. Enabling directory fragments can help. The complete fix is make client enc...
03/21/2017
- 08:56 PM Bug #16842: mds: replacement MDS crashes on InoTable release
- If anyone needs the debug level "30/30" dump I can send it directly to them.
Also, so far I have not nuked the cep...
- 04:08 PM Bug #16842: mds: replacement MDS crashes on InoTable release
- I don't have any brilliant ideas for debugging this (can't go back in time to find out how those bogus inode ranges m...
- 08:39 PM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- Just in case it wasn't clear from the log snippet above: the current counter is continuously growing until the process...
- 08:25 PM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- Hmm, I've looked at the code now and it turns out we do already have a client_oc_max_dirty setting that is supposed t...
- 05:18 PM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- I see ... so the fix would be to have a dirty_limit (like vm.dirty_ratio) above which all new requests would not be ca...
- 03:11 PM Bug #19343: OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limits (jew...
- Hmm, could be that we just have no limit on the number of dirty objects (trim is only dropping clean things), so if w...
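A toy illustration of that failure mode (hypothetical names, not the real ObjectCacher code): when trimming may only evict clean objects and the workload dirties objects faster than the flusher cleans them, the cache grows far past its configured object limit.

    #include <deque>
    #include <iostream>

    struct CachedObject { bool dirty; };

    int main() {
      const size_t max_objects = 100;          // analogous to an "objects: max" setting
      std::deque<CachedObject> cache;

      // Writer dirties new objects much faster than the flusher cleans them.
      for (int i = 0; i < 1000; ++i) {
        cache.push_back({/*dirty=*/true});
        if (i % 10 == 0 && !cache.empty()) cache.front().dirty = false;  // slow flusher

        // Trim: only clean objects may be dropped, dirty ones must stay.
        while (cache.size() > max_objects && !cache.front().dirty) cache.pop_front();
      }
      std::cout << "objects in cache: " << cache.size()
                << " (limit was " << max_objects << ")\n";
    }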
- 01:10 PM Bug #19343 (New): OOM kills ceph-fuse: objectcacher doesn't seem to respect its max objects limit...
- Creating a 100G file on CephFS reproducibly triggers OOM to kill the
ceph-fuse process.
With objectcacher debug ...
- 01:01 PM Backport #19335 (Resolved): kraken: MDS heartbeat timeout during rejoin, when working with large ...
- https://github.com/ceph/ceph/pull/14572
- 01:01 PM Backport #19334 (Resolved): jewel: MDS heartbeat timeout during rejoin, when working with large a...
- https://github.com/ceph/ceph/pull/14672
03/20/2017
- 07:07 AM Bug #19245 (Resolved): Crash in PurgeQueue::_execute_item when deletions happen extremely quickly
03/19/2017
- 08:25 AM Bug #19306 (Resolved): fs: mount NFS to cephfs, and then ls a directory containing a large number...
- The ceph_readdir function saves a lot of data in file->private_data, including the last_name which is used as the offset. Howe...
03/18/2017
- 09:17 PM Bug #19240: multimds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel build ...
- Note: inodes loaded is visible in mds-ino+.png in both workflow directories.
- 09:16 PM Bug #19240: multimds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel build ...
- I believe I have this one figured out. The client requests graph (mds-request.png) shows that the 8 MDS workflow is r...
03/17/2017
- 04:37 PM Bug #19291 (Fix Under Review): mds: log rotation doesn't work if mds has respawned
- https://github.com/ceph/ceph/pull/14021
- 12:00 PM Bug #17939 (Fix Under Review): non-local cephfs quota changes not visible until some IO is done
- https://github.com/ceph/ceph/pull/14018
- 11:51 AM Bug #19282 (Fix Under Review): RecoveryQueue::_recovered asserts out on OSD errors
- https://github.com/ceph/ceph/pull/14017
- 11:36 AM Fix #19288 (Fix Under Review): Remove legacy "mds tell"
- https://github.com/ceph/ceph/pull/14015
03/16/2017
- 10:17 PM Bug #19291: mds: log rotation doesn't work if mds has respawned
- Sorry hit submit on accident before finishing writing this up. Standby!
- 10:16 PM Bug #19291 (Resolved): mds: log rotation doesn't work if mds has respawned
- If an MDS respawns then its "comm" name becomes "exe" which confuses logrotate since it relies on killall. What ends...
- 09:42 PM Bug #16842: mds: replacement MDS crashes on InoTable release
Hi All,
After upgrading to 10.2.6 on Debian Jessie, the MDS server fails to start. Below is what is written to ...
- 08:44 PM Feature #19286 (Closed): Log/warn if "mds bal fragment size max" is reached
- 10:32 AM Feature #19286: Log/warn if "mds bal fragment size max" is reached
- OK, not such a big deal for me, I guess I can increase fragment size till Luminous release.
- 10:14 AM Feature #19286: Log/warn if "mds bal fragment size max" is reached
- Were you planning to work on this?
I don't think we're likely to work on this as a new feature, given that in Lumi...
- 09:51 AM Feature #19286 (Closed): Log/warn if "mds bal fragment size max" is reached
- Since in Jewel cephfs directory fragmentation is disabled by default, it would be nice to know when we hit "mds bal frag...
- 04:49 PM Fix #19288 (Resolved): Remove legacy "mds tell"
This has been deprecated for a while in favour of the new-style "ceph tell mds.<id>". Luminous seems like a good t...
- 10:14 AM Bug #18151 (Resolved): Incorrect report of size when quotas are enabled.
- 03:10 AM Bug #18151: Incorrect report of size when quotas are enabled.
- Hi John. Let us close it now since according to Greg it has been resolved. Still did not upgrade but will open a new ...