Activity

From 08/18/2016 to 09/16/2016

09/16/2016

01:54 PM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
{
"ops": [],
"linger_ops": [],
"pool_ops": [],
"pool_stat_ops": [],
"statfs_ops": [],
"...
Henrik Korkuc
01:50 PM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
... Zheng Yan
06:56 AM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
It seems there is a blocked write in client.2816210. Could you use gdb to attach to that ceph-fuse, then run "thread appl... Zheng Yan
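For reference, a minimal sketch of attaching gdb to a running ceph-fuse process (the exact command is truncated above; "thread apply all bt" is the standard gdb form for dumping backtraces of all threads, and the pidof lookup is illustrative):

    gdb -p $(pidof ceph-fuse)
    (gdb) thread apply all bt
    (gdb) detach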
09:51 AM Backport #17285 (In Progress): ceph-mon leaks in MDSMonitor when ceph-mds process is running but ...
https://github.com/ceph/ceph/pull/10238 Nathan Cutler
09:51 AM Backport #17285: ceph-mon leaks in MDSMonitor when ceph-mds process is running but MDS is not con...
h3. original description
Memory leaks every time you see the following in the monitor log:
@mon.0@0(leader).mds e1 warn...
Nathan Cutler
07:38 AM Backport #17285: ceph-mon leaks in MDSMonitor when ceph-mds process is running but MDS is not con...
PR https://github.com/ceph/ceph/pull/10238 fixes this leak on the hammer branch. After v10.0.5 there was a massive MDSM... Igor Podoski
07:33 AM Backport #17285 (Resolved): ceph-mon leaks in MDSMonitor when ceph-mds process is running but MDS...
https://github.com/ceph/ceph/pull/10238 Igor Podoski
09:19 AM Bug #17286 (Resolved): Failure in dirfrag.sh

So, we've been running with dirfrags off for ages because we weren't setting the flag to allow it:
https://github....
John Spray
07:14 AM Bug #17284 (New): Bogus "unmatched fragstat", "unmatched rstat" after missing dirfrag object
Ideally we should not spit out these mismatch messages, because they're just pointing out that the empty dirfrag (aft... John Spray

09/15/2016

06:19 PM Bug #16255 (Pending Backport): ceph-create-keys: sometimes blocks forever if mds "allow" is set
https://github.com/ceph/ceph/pull/10415 Sage Weil
09:57 AM Bug #16857 (Duplicate): Crash in Client::_invalidate_kernel_dcache
John Spray
08:26 AM Bug #17192 (Duplicate): "SELinux denials" in knfs-master-testing-basic-smithi
Ian Colle
06:42 AM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
Attaching.
btw, as you'll probably notice, the 2016-09-11-170046 directory is in another path in this dump. It was move...
Henrik Korkuc
06:24 AM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
the client causing the problem is 2816210; please dump its cache Zheng Yan

09/14/2016

09:04 PM Feature #17276 (Resolved): stick client PID in client_metadata
As best I can tell, we no longer provide any way for an admin to backtrack from a misbehaving client session to a spe... Greg Farnum
02:36 PM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
after I killed that ceph-fuse, another ceph-fuse hung. Attaching outputs from that one Henrik Korkuc
01:52 PM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
please run the "dump cache", "mds_requests" and "objecter_requests" commands for client.7397637 and upload the outputs. Zheng Yan
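For reference, a hedged sketch of running these commands against the ceph-fuse admin socket (the socket path is an assumption; adjust it to the asok file of the client in question):

    ceph daemon /var/run/ceph/ceph-client.admin.asok dump cache
    ceph daemon /var/run/ceph/ceph-client.admin.asok mds_requests
    ceph daemon /var/run/ceph/ceph-client.admin.asok objecter_requests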
12:56 PM Bug #17275: MDS long-time blocked ops. ceph-fuse locks up with getattr of file
forgot to attach files... Attaching now. Also recompressed mds_cache with xz for it to go below 1MB (50% smaller fil... Henrik Korkuc
12:52 PM Bug #17275 (Resolved): MDS long-time blocked ops. ceph-fuse locks up with getattr of file
On a 10.2.1 cluster we are seeing some blocked ceph-fuse (tested with 10.2.2) metadata accesses to files.
I cannot r...
Henrik Korkuc
10:36 AM Bug #17271 (Fix Under Review): Failure in snaptest-git-ceph.sh
https://github.com/ceph/ceph/pull/11078 Zheng Yan
08:24 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
I have tested the built ceph-fuse (ceph-fuse-10.2.2-2.g666cfe6.x86_64.rpm) in my environment (with osd-0.94.3). It work... xiangyang yu
05:16 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
I will test your built ceph-fuse with OSD (0.94.3) later. xiangyang yu
04:13 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
But you were originally running ceph-fuse-10.2.2-0.el7.centos.x86_64, so the osdc part was always jewel.
It's possi...
John Spray
02:10 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
I have tried all the Jewel packages and it runs correctly, so I think the problem is in osdc in ceph-0.94-3.
There mus...
xiangyang yu
07:11 AM Bug #17231 (Resolved): CephFS : IOs get terminated while reading/writing on larger files
It was a tool use case problem, unrelated to CephFS. Closing the bug. Vishal Kanaujia
07:09 AM Bug #17231: CephFS : IOs get terminated while reading/writing on larger files
The problem lay with the usage of VDBench: if you have 256 threads but only 10 files, VDBench might fail.
From VDBench us...
Vishal Kanaujia
04:10 AM Bug #17181 (Duplicate): "[ FAILED ] LibCephFS.ThreesomeInterProcessRecordLocking" in smoke
John Spray
01:30 AM Bug #16914: multimds: pathologically slow deletions in some tests
Okay, so after looking through the mds logs for the client getattr op posted above, it appears Greg is right. It appear... Patrick Donnelly

09/13/2016

11:42 PM Bug #16914: multimds: pathologically slow deletions in some tests
Looking at this new run which toggles fuse_default_permissions:
http://pulpito.ceph.com/pdonnell-2016-08-11_20:40:...
Patrick Donnelly
10:37 PM Bug #17181 (Need More Info): "[ FAILED ] LibCephFS.ThreesomeInterProcessRecordLocking" in smoke
Ceph bug? Zack Cerza
01:51 PM Bug #17270: [cephfs] fuse client crash when adding a new osd
Jason suggests that 1a48a8a is the culprit; I've pushed branch wip-17270 to the gitbuilders, which is a revert of that... John Spray
12:12 PM Bug #17270: [cephfs] fuse client crash when adding a new osd

The OSD op that's triggering this....
John Spray
11:24 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
I just added the debug objectcacher and objecter log levels and reproduced the fuse-client crash. Please see the log att... xiangyang yu
10:25 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
Thanks for the report, please could you try reproducing with --debug-objectcacher=10 and debug-objecter=7 so that we ... John Spray
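For reference, a hedged example of remounting with those debug levels (the monitor address and mountpoint are illustrative):

    ceph-fuse -m mon-host:6789 /mnt/cephfs --debug-objectcacher=10 --debug-objecter=7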
10:01 AM Bug #17270: [cephfs] fuse client crash when adding a new osd
A detailed log is attached below. xiangyang yu
09:54 AM Bug #17270 (Resolved): [cephfs] fuse client crash when adding a new osd
Hello everyone,
I have hit a ceph-fuse crash when I add an OSD to the OSD pool.
I am writing data through ceph-fuse, th...
xiangyang yu
12:10 PM Bug #17236: MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) Cannot send...
http://pulpito.ceph.com/jspray-2016-09-13_08:34:56-fs-wip-no-recordlock-test-testing-basic-smithi/413348/
Promotin...
John Spray
10:45 AM Bug #17271: Failure in snaptest-git-ceph.sh
Zheng, could you take a look please? John Spray
10:44 AM Bug #17271 (Resolved): Failure in snaptest-git-ceph.sh

Either in addition to http://tracker.ceph.com/issues/17172 or we didn't really fix it.
This test branch was runn...
John Spray
05:58 AM Bug #17253: Crash in Client::_invalidate_kernel_dcache when reconnecting during unmount
This was not specific to that test configuration, happening in a normal test branch:
http://qa-proxy.ceph.com/teutho...
John Spray

09/12/2016

07:59 AM Backport #17264 (Resolved): jewel: multimds: allow_multimds not required when max_mds is set in c...
https://github.com/ceph/ceph/pull/10997 Loïc Dachary
12:25 AM Bug #17259: multimds: ranks >= max_mds may be assigned after reducing max_mds
Yeah, ranks are assigned out of the 'in' set rather than sequentially, so if you deactivate one in the middle of the ... John Spray

09/11/2016

11:23 PM Bug #17259 (Won't Fix): multimds: ranks >= max_mds may be assigned after reducing max_mds
I'm not sure if this is really a problem or not but I noticed in:
http://pulpito.ceph.com/pdonnell-2016-09-10_15:1...
Patrick Donnelly

09/10/2016

08:54 AM Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
Latest:
http://pulpito.ceph.com/jspray-2016-09-05_12:30:10-kcephfs:recovery-master-testing-basic-mira/...
John Spray

09/09/2016

04:03 PM Bug #16919 (Resolved): MDS: Standby replay daemons don't drop purged strays
Not backporting because it's a behavioural cleanup rather than a bugfix John Spray
01:28 PM Bug #17253 (Resolved): Crash in Client::_invalidate_kernel_dcache when reconnecting during unmount
... John Spray
12:14 PM Feature #17249 (Fix Under Review): cephfs tool for finding files that use named PGs
https://github.com/ceph/ceph/pull/11026 John Spray

09/08/2016

11:02 PM Feature #17249 (Resolved): cephfs tool for finding files that use named PGs
Sometimes when a data pool is damaged, it is useful to work out which files are affected.
http://lists.ceph.com/pi...
John Spray
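Until such a tool lands, a hedged sketch of the manual approach for one file (the pool name and path are illustrative): a file's first data object is named after its inode number in hex, and "ceph osd map" reports the PG and OSDs holding it:

    ino=$(stat -c %i /mnt/cephfs/somefile)   # inode number (decimal)
    obj=$(printf '%x.00000000' "$ino")       # first object of that file
    ceph osd map cephfs_data "$obj"          # -> pg id and acting OSDs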
02:00 PM Backport #17246 (Resolved): jewel: Log path as well as ino when detecting metadata damage
https://github.com/ceph/ceph/pull/11418 Loïc Dachary
01:59 PM Backport #17244 (Resolved): jewel: Failure in snaptest-git-ceph.sh
https://github.com/ceph/ceph/pull/11419 Loïc Dachary
01:55 PM Bug #17240: inode_permission error with kclient when running client IO with recovery operations.
Cluster state at the time:-
ceph -s
cluster ee17af9f-24f1-425e-abd3-d2289102dec1
health HEALTH_ERR
...
Rohith Radhakrishnan
01:49 PM Bug #17240 (Closed): inode_permission error with kclient when running client IO with recovery ope...
Steps to reproduce:-
1) With client IO running, take a few OSDs down and out of the cluster
2) Bring the OSDs back up to ...
Rohith Radhakrishnan
10:32 AM Bug #17172 (Pending Backport): Failure in snaptest-git-ceph.sh
John Spray
02:57 AM Bug #17231: CephFS : IOs get terminated while reading/writing on larger files
the slow requests seem to be caused by "dirty page writeback" Zheng Yan

09/07/2016

10:07 PM Bug #16983 (Resolved): mds: handle_client_open failing on open
Randy privately confirmed https://github.com/ceph/ceph/pull/8778 resolves his problem. Closing. Patrick Donnelly
08:18 PM Feature #16973 (Pending Backport): Log path as well as ino when detecting metadata damage
John Spray
02:43 PM Feature #16973: Log path as well as ino when detecting metadata damage
Marking for backport because although it's not a bugfix, it's a supportability thing. John Spray
08:16 PM Bug #17236 (Resolved): MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) ...

http://qa-proxy.ceph.com/teuthology/teuthology-2016-09-05_17:25:02-kcephfs-master-testing-basic-mira/401388/
OSD...
John Spray
01:56 PM Bug #17231: CephFS : IOs get terminated while reading/writing on larger files
I don't have any idea what this test does, but your pasted bit includes as output:
> 17:22:57.411 Do you maybe hav...
Greg Farnum
11:59 AM Bug #17231: CephFS : IOs get terminated while reading/writing on larger files
Yes, I ran it on multiple clients, but each client had been mounted with different sub-directories (each with 6 files-1... Parikshith B
11:07 AM Bug #17231: CephFS : IOs get terminated while reading/writing on larger files
I see several clients in your MDS log. How are the operations in your test split between clients?
It is possible ...
John Spray
10:54 AM Bug #17231 (Resolved): CephFS : IOs get terminated while reading/writing on larger files
CephFS Environment:
1 MDS node
1 Ceph Kernel Filesystem, Ubuntu 14.04 LTS, kernel version 4.4.0-36-generic
Test ...
Parikshith B
10:48 AM Feature #17230 (Resolved): ceph_volume_client: py3 compatible
Manila drivers and their CIs are encouraged to be py3 compatible. Manila's cephfs_native driver uses ceph_volume_clie... Ramana Raja
03:36 AM Bug #17212: Unable to remove symlink / fill_inode badness on ffff88025f049f88
The MDS bugs are fixed by https://github.com/ceph/ceph/pull/8778/commits; they will be included in the next jewel release... Zheng Yan

09/06/2016

04:07 PM Bug #17105: multimds: allow_multimds not required when max_mds is set in ceph.conf at startup
Backport: https://github.com/ceph/ceph/pull/10997 Patrick Donnelly
01:43 PM Bug #17105 (Pending Backport): multimds: allow_multimds not required when max_mds is set in ceph....
John Spray
03:42 PM Support #17171: Ceph-fuse client hangs on unmount
Hmm, so if that's reproducible then it sounds like we could reproduce it by killing an MDS, invoking umount, seeing ... John Spray
02:48 PM Support #17171: Ceph-fuse client hangs on unmount
Actually, when it becomes responsive, the unmount is still hanging and only `kill -9` helps. Arturas Moskvinas
01:45 PM Support #17171: Ceph-fuse client hangs on unmount
To be clear, you're saying that while the server cluster is unresponsive, ceph-fuse hangs on unmount? That is expect... John Spray
12:59 PM Support #17171: Ceph-fuse client hangs on unmount
Actually, after several checks, it seems ceph is not responding from time to time due to very high load/d... Arturas Moskvinas
12:22 PM Bug #17216: ceph_volume_client: recovery of partial auth update is broken
The reproducer https://github.com/ceph/ceph-qa-suite/pull/1166 Ramana Raja
11:55 AM Bug #17216 (Resolved): ceph_volume_client: recovery of partial auth update is broken
I ran into the following traceback when the volume_client tries
to recover from a partial auth update:
Connecting t...
Ramana Raja
12:17 PM Feature #16973 (Fix Under Review): Log path as well as ino when detecting metadata damage
https://github.com/ceph/ceph/pull/10996 John Spray
11:48 AM Bug #17212: Unable to remove symlink / fill_inode badness on ffff88025f049f88
The complete log file has been uploaded with ceph-post-file under id cd309feb-70b3-4291-b32f-8c559ecd3866
The prob...
Burkhard Linke
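For reference, a minimal sketch of uploading a log with ceph-post-file (the file path and description are illustrative); it prints an id like the one above for sharing in the ticket:

    ceph-post-file -d 'mds log for #17212' /var/log/ceph/ceph-mds.a.log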
09:41 AM Bug #17212: Unable to remove symlink / fill_inode badness on ffff88025f049f88
The SessionMap crash will be related to the "client.4031546 does not advance its oldest_client_tid" message, as with ... John Spray
09:40 AM Bug #17212: Unable to remove symlink / fill_inode badness on ffff88025f049f88
Please can we have the MDS log from the Locker::check_inode_max_size crash?
Is that happening with both the kernel...
John Spray
01:06 AM Bug #16842: mds: replacement MDS crashes on InoTable release
@John Spray
added a complete MDS (cephfs103) log, with debug_mds set to 30/30
huanwen ren

09/05/2016

01:09 PM Bug #17212 (Resolved): Unable to remove symlink / fill_inode badness on ffff88025f049f88
We have a symlink in our filesystem that we cannot remove.
Ceph MDS version: 10.2.2
Kernel version: 4.4.0-34-gene...
Burkhard Linke
11:13 AM Bug #16842: mds: replacement MDS crashes on InoTable release
Hmm, so the client session thinks it had something preallocated that the inotable thinks is already free.
I wonder...
John Spray
07:46 AM Bug #16842: mds: replacement MDS crashes on InoTable release
I met the same problem; the ceph version is 10.2.2.0.
The client just simply copies files to the dir of the ceph-fuse mount
...
huanwen ren

09/02/2016

06:52 PM Backport #17206 (In Progress): jewel: ceph-fuse crash in Client::get_root_ino
Nathan Cutler
06:44 PM Backport #17206 (Resolved): jewel: ceph-fuse crash in Client::get_root_ino
https://github.com/ceph/ceph/pull/10921 Nathan Cutler
06:44 PM Backport #17207 (Resolved): jewel: ceph-fuse crash on force unmount with file open
https://github.com/ceph/ceph/pull/10958 Nathan Cutler
09:04 AM Bug #17197 (Resolved): ceph-fuse crash in Client::get_root_ino

(this is a retroactively created ticket for a fix that was pushed without a ticket)
Failure:
http://pulpito.cep...
John Spray
08:46 AM Bug #17172 (Fix Under Review): Failure in snaptest-git-ceph.sh
Zheng Yan
08:42 AM Bug #17172 (Resolved): Failure in snaptest-git-ceph.sh
https://github.com/ceph/ceph/pull/10957 Zheng Yan
08:44 AM Bug #16764: ceph-fuse crash on force unmount with file open
https://github.com/ceph/ceph/pull/10958 Zheng Yan
06:58 AM Bug #16764 (Pending Backport): ceph-fuse crash on force unmount with file open
John Spray
06:59 AM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
The area of code mentioned above was fixed in master for http://tracker.ceph.com/issues/16764
That wasn't marked f...
John Spray
06:09 AM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
Maybe this is a solved problem:
> while (!ll_unclosed_fh_set.empty()) {
set<Fh*>::iterator it = ll_unclosed_fh...
chuan jiang
04:42 AM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
Can you fix that up and make sure it was just a backport error, Zheng? We aren't seeing this in master at all so I pr... Greg Farnum
12:11 AM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
... Zheng Yan
12:15 AM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
No, unless you can accept losing data (use umount -f /mnt/xxx) Zheng Yan

09/01/2016

10:13 PM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
Ran 6 times against commit:e400999a2cb0972919e35dd8510f8d85f48ceace (jewel-samba-1) and got zero failures. That's one... Greg Farnum
08:42 PM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
Well, I ran the test in question (samba-basic, btrfs, install, fuse, smbtorture) 9 times and got 4 failures. There ar... Greg Farnum
05:31 PM Bug #17193: truncate can cause unflushed snapshot data lose
Added some more debugging for this to my wip qa-suite branch https://github.com/ceph/ceph-qa-suite/pull/1156/commits/... John Spray
05:25 PM Bug #17193 (Resolved): truncate can cause unflushed snapshot data lose

Failure in test TestStrays.test_snapshot_remove
http://qa-proxy.ceph.com/teuthology/jspray-2016-08-30_12:07:21-kce...
John Spray
05:14 PM Bug #17192: "SELinux denials" in knfs-master-testing-basic-smithi
This is the same as http://tracker.ceph.com/issues/16397, right? John Spray
04:29 PM Bug #17192 (Duplicate): "SELinux denials" in knfs-master-testing-basic-smithi
This is for the jewel 10.2.3 release.
See #17074, which we closed for hammer.
Run: http://pulpito.front.sepia.ceph.com/yuriw-2...
Yuri Weinstein
05:09 PM Support #17183: caught error when trying to handle auth request, probably malformed request
I'm guessing blob_size=2 is never a reasonable thing for the MDS to be sending to the mon, so I'd suspect that someth... John Spray
03:47 PM Support #17183: caught error when trying to handle auth request, probably malformed request
The keyring in question has mon "allow *" osd "allow *" mds "allow *" permissions, and is configured in the ceph.conf... Chris MacNaughton
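A hedged sketch of what a keyring stanza with those caps typically looks like (the entity name and key are placeholders):

    [mds.newhost]
        key = <base64 key>
        caps mon = "allow *"
        caps osd = "allow *"
        caps mds = "allow *"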
02:15 PM Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
Updated test branch that only calls umount when really needed (https://github.com/ceph/ceph-qa-suite/pull/1156)
Lo...
John Spray
10:13 AM Support #17171: Ceph-fuse client hangs on unmount
Hmm, the logs are pretty useless at the moment; they only contain entries like:... Arturas Moskvinas
07:18 AM Support #17171: Ceph-fuse client hangs on unmount
This happens when the automount/autofs process decides to unmount the filesystem because no process has been using it for a couple of... Arturas Moskvinas
09:16 AM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
Upgraded to 4.8.0-040800rc1-generic. Results are different now.
When the pool becomes full, we get the below message:-
l...
Rohith Radhakrishnan

08/31/2016

10:01 PM Support #17171: Ceph-fuse client hangs on unmount
When are you doing this unmount? If it's on shutdown, and it happens to be unmounted after networking gets shut down,... Greg Farnum
09:21 PM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
Okay, there's also one at http://pulpito.ceph.com/teuthology-2016-08-14_02:35:02-samba-jewel---basic-mira/
That se...
Greg Farnum
08:19 PM Bug #17184: "Segmentation fault" in samba-jewel---basic-mira
I'm not seeing this at all on master (just from browsing http://pulpito.ceph.com/?suite=samba#)
Jewel has a core d...
Greg Farnum
03:47 PM Bug #17184 (Rejected): "Segmentation fault" in samba-jewel---basic-mira
This is for the jewel 10.2.3 release.
It seems to be verified by the last several runs.
Runs:
http://pulpito.ceph.com/teuthol...
Yuri Weinstein
09:03 PM Bug #16909 (Resolved): Stopping an MDS rank does not stop standby-replays for that rank
Greg Farnum
09:01 PM Bug #17172: Failure in snaptest-git-ceph.sh
This also showed up in a testing branch of mine: http://pulpito.ceph.com/gregf-2016-08-29_04:30:16-fs-greg-fs-testing... Greg Farnum
08:03 PM Support #17183: caught error when trying to handle auth request, probably malformed request
You'll need to be a little more clear about the keyring involved; I imagine that's the problem. You should be able to... Greg Farnum
03:44 PM Support #17183 (New): caught error when trying to handle auth request, probably malformed request
When trying to start up a new MDS server, I'm getting an authentication failure. Attached is a snippet of the authent... Chris MacNaughton
04:01 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
I am using 4.4.8-040408-generic Rohith Radhakrishnan
01:29 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
I tried the pool quota on a 4.8-rc1 kernel. The kernel does recover from the hang when the quota is unset Zheng Yan
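For reference, a hedged sketch of unsetting a pool quota (the pool name is illustrative; setting max_bytes back to 0 removes the limit):

    ceph osd pool set-quota cephfs_data max_bytes 0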
10:24 AM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
In this latest kernel, the warnings appear only when we try to unmount the FS. And the umount command hangs and fails... Rohith Radhakrishnan
09:48 AM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
This is the expected behaviour (otherwise cephfs would need to drop some dirty data silently). Does the kernel stop printing ... Zheng Yan
08:19 AM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
Reproduced with below 4.8 kernel :-
rack6-client-5:~$ uname -a
Linux rack6-client-5 4.4.8-040408-generic #201604200...
Rohith Radhakrishnan
03:13 PM Bug #17181 (Duplicate): "[ FAILED ] LibCephFS.ThreesomeInterProcessRecordLocking" in smoke
Run: http://pulpito.ceph.com/teuthology-2016-08-31_05:00:01-smoke-master-testing-basic-vps/
Job: 394020
Logs: http:...
Yuri Weinstein

08/30/2016

12:20 PM Bug #17173 (Resolved): Duplicate damage table entries
Seen on mira021 long-running MDS.... John Spray
10:14 AM Bug #17172 (Resolved): Failure in snaptest-git-ceph.sh
This run on master:
http://pulpito.ceph.com/jspray-2016-08-29_11:24:10-fs-master-testing-basic-mira/389772/...
John Spray
09:32 AM Support #17171 (Closed): Ceph-fuse client hangs on unmount
We use autofs/automount to mount/unmount ceph-fuse mounts and from time to time ceph-fuse client hangs on umount and ... Arturas Moskvinas
12:37 AM Bug #16255: ceph-create-keys: sometimes blocks forever if mds "allow" is set
I've had the same issue when using ceph-deploy gatherkeys (jewel).
If I change "mds 'allow'" to "mds 'allow *'", it's t...
huanwen ren
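A hedged sketch of applying that cap change to an existing key (the entity name is illustrative):

    ceph auth caps client.admin mon 'allow *' osd 'allow *' mds 'allow *'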

08/29/2016

09:23 PM Bug #17105: multimds: allow_multimds not required when max_mds is set in ceph.conf at startup
New PR: https://github.com/ceph/ceph/pull/10914 Patrick Donnelly
06:20 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
It would be helpful; we're still *surprised* that this is a problem. Just noting that we don't include it in our nigh... Greg Farnum
02:22 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
@Greg: How do we proceed now? Is there a need to test with the 4.8 kernel now? Rohith Radhakrishnan
02:19 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
Rohith Radhakrishnan wrote:
> 4.4 is the latest for Ubuntu 14.04.5. But let me see if I can get hold of a 16.04 ...
Rohith Radhakrishnan
02:17 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
4.4 is the latest for Ubuntu 14.04.5. But let me see if I can get hold of a 16.04 machine with a 4.8 kernel and try... Rohith Radhakrishnan
02:16 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
Turns out we don't actually test the kernel against full pools; see #9466 for updates on it. Greg Farnum
01:55 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
4.4.0 is pretty old at this point, and there are some fixes that may help this that have gone upstream since then. Is... Jeff Layton
02:15 PM Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
Updating title to reflect that these days we have lots of tests (in fs/recovery, which is now a bit of a silly name f... John Spray

08/26/2016

04:14 PM Feature #9880 (Resolved): mds: more gracefully handle EIO on missing dir object
I think we're good to go, then. Greg Farnum
03:01 PM Feature #9880: mds: more gracefully handle EIO on missing dir object
no specific suggestions Sage Weil
03:55 PM Bug #17113: MDS EImport crashing with mds/journal.cc: 2929: FAILED assert(mds->sessionmap.get_ver...
I think the logs you've provided should be enough. Thanks! Greg Farnum
10:43 AM Bug #17113: MDS EImport crashing with mds/journal.cc: 2929: FAILED assert(mds->sessionmap.get_ver...
Will full logs be enough for a diagnosis?
I'd like to start recovering this cluster, but if you would need me to run ad...
Tomasz Torcz

08/25/2016

10:47 PM Backport #17126 (Resolved): mds: fix double-unlock on shutdown
Loïc Dachary
07:28 PM Feature #11172 (In Progress): mds: inode filtering on 'dump cache' asok
Douglas Fuller
05:33 PM Feature #12274 (Fix Under Review): mds: start forward scrubs from all subtree roots, skip non-aut...
https://github.com/ceph/ceph/pull/10876 Douglas Fuller
05:31 PM Backport #16946: jewel: client: nlink count is not maintained correctly
FYI: Github is annoying and does some kind of timestamp sort when displaying commits. I'm not sure if it's the origin... Greg Farnum
05:17 PM Backport #16946: jewel: client: nlink count is not maintained correctly
@Jeff this is a very unusual situation and I apologize for the noise. It turns out that github does not display the c... Loïc Dachary
03:13 PM Backport #16946 (In Progress): jewel: client: nlink count is not maintained correctly
Jeff Layton
03:13 PM Backport #16946: jewel: client: nlink count is not maintained correctly
You want the latter approach, and you want to pick them in the order they were originally committed, in case we need ... Jeff Layton
02:54 PM Backport #16946 (Need More Info): jewel: client: nlink count is not maintained correctly
Actually, you were right to ask, my question was about something else :-) It's good to know that the four commits are... Loïc Dachary
02:40 PM Backport #16946 (New): jewel: client: nlink count is not maintained correctly
This is perfect, thank you! Loïc Dachary
02:38 PM Backport #16946 (In Progress): jewel: client: nlink count is not maintained correctly
Jeff Layton
12:11 PM Backport #16946: jewel: client: nlink count is not maintained correctly
Yes. I think you'll want the entire patch pile from that PR. These 4 patches at least:
https://github.com/ceph/cep...
Jeff Layton
11:59 AM Backport #16946 (Need More Info): jewel: client: nlink count is not maintained correctly
git cherry-pick -x https://github.com/ceph/ceph/pull/10386/commits/f3605d39e53b3ff777eb64538abfa62a5f98a4f2 which is ... Loïc Dachary
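A hedged sketch of the backport workflow being discussed (the branch name, remote, and SHAs are placeholders; per Jeff's note above, pick the commits in their original order, oldest first):

    git checkout -b wip-16946-jewel upstream/jewel
    git cherry-pick -x <oldest-sha>
    git cherry-pick -x <next-sha>   # repeat for all 4 commits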
04:59 PM Bug #17074 (Closed): "SELinux denials" in knfs-master-testing-basic-smithi
per IRC
(09:54:34 AM) yuriw: loicd dgalloway can we say that old tests for hammer ran in ovh never had SELinux enabl...
Yuri Weinstein
04:53 PM Bug #17074: "SELinux denials" in knfs-master-testing-basic-smithi
the suite definitely passed in previous point releases
http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2016-04-24...
Yuri Weinstein
04:47 PM Bug #17074 (Need More Info): "SELinux denials" in knfs-master-testing-basic-smithi
I don't think CephFS/knfs tests and SELinux ever worked on Hammer. Yuri, can you find evidence they did or else close... Greg Farnum
04:55 PM Feature #4142 (Duplicate): MDS: forward scrub: Implement cross-MDS scrubbing
Douglas Fuller
04:25 PM Bug #16592 (Need More Info): Jewel: monitor asserts on "mon/MDSMonitor.cc: 2796: FAILED assert(in...
Moving this down and setting Need More Info based on Patrick's investigation and the new asserts; let me know if that... Greg Farnum
04:23 PM Bug #15903: smbtorture failing on pipe_number test
We aren't seeing this in regular nightlies; marking it down. Greg Farnum
03:28 PM Bug #17113: MDS EImport crashing with mds/journal.cc: 2929: FAILED assert(mds->sessionmap.get_ver...
It's not super-likely the rebooting client actually caused this problem. If it did, it was only incidentally, and it'... Greg Farnum
06:55 AM Bug #17113: MDS EImport crashing with mds/journal.cc: 2929: FAILED assert(mds->sessionmap.get_ver...
The full log was uploaded with ceph-post-file: 610fd186-9150-4e6b-8050-37dc314af39b
Before I recover, I'd really like to se...
Tomasz Torcz
12:35 PM Bug #16655 (Resolved): ceph-fuse is not linked to libtcmalloc
Loïc Dachary
12:35 PM Bug #15705 (Resolved): ceph status mds output ignores active MDS when there is a standby replay
Loïc Dachary
11:56 AM Backport #15968 (Resolved): jewel: ceph status mds output ignores active MDS when there is a stan...
Loïc Dachary
11:54 AM Backport #15968 (In Progress): jewel: ceph status mds output ignores active MDS when there is a s...
Loïc Dachary
11:56 AM Backport #16697 (Resolved): jewel: ceph-fuse is not linked to libtcmalloc
Loïc Dachary
11:54 AM Backport #16697 (In Progress): jewel: ceph-fuse is not linked to libtcmalloc
Loïc Dachary
11:56 AM Backport #17131 (In Progress): jewel: Jewel: segfault in ObjectCacher::FlusherThread
Loïc Dachary
06:27 AM Backport #17131 (Resolved): jewel: Jewel: segfault in ObjectCacher::FlusherThread
https://github.com/ceph/ceph/pull/10864 Loïc Dachary
07:23 AM Bug #15702 (Resolved): mds: wrongly treat symlink inode as normal file/dir when symlink inode is ...
Loïc Dachary
07:20 AM Backport #16083 (Resolved): jewel: mds: wrongly treat symlink inode as normal file/dir when symli...
Loïc Dachary
01:11 AM Bug #16610 (Pending Backport): Jewel: segfault in ObjectCacher::FlusherThread
This got merged to master forever ago. Guess it should get backported too. Greg Farnum

08/24/2016

11:41 PM Bug #17105 (Fix Under Review): multimds: allow_multimds not required when max_mds is set in ceph....
PR: https://github.com/ceph/ceph/pull/10848 Patrick Donnelly
09:55 PM Bug #17096 (Won't Fix): Pool name is not displayed after changing CephFS File layout using extend...
I think this is just a result of not having the current OSDMap yet. If you're doing IO on the client, you're unlikely... Greg Farnum
08:59 PM Backport #17126 (Resolved): mds: fix double-unlock on shutdown
https://github.com/ceph/ceph/pull/10847 Loïc Dachary
06:00 PM Bug #17113 (Need More Info): MDS EImport crashing with mds/journal.cc: 2929: FAILED assert(mds->s...
It looks like you're running with multiple active MDSes, which is not currently recommended. We saw this in #16043 as... Greg Farnum
09:44 AM Bug #17113 (Can't reproduce): MDS EImport crashing with mds/journal.cc: 2929: FAILED assert(mds->...
I have a tiny Ceph cluster (3x mon, 8x osd, 2x mds) with ceph-mds-10.2.2-2.fc24.x86_64.
Recently, one of the clients usin...
Tomasz Torcz
04:24 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
We increased the pool size further, but the system is in the same state
Steps done:-
=========================...
Rohith Radhakrishnan
01:36 PM Bug #17115: kernel panic when running IO with cephfs and resource pool becomes full
These are warnings (write blocked for too long) rather than a panic. When the pool is full, OSD write requests get paused. If... Zheng Yan
01:12 PM Bug #17115 (Resolved): kernel panic when running IO with cephfs and resource pool becomes full
Steps:-
Create a data pool with a limited quota size and start running IO from a client. After the pool becomes full, ...
Rohith Radhakrishnan
03:53 PM Bug #16288 (Resolved): mds: `session evict` tell command blocks forever with async messenger (Tes...
Loïc Dachary
08:41 AM Support #17079: Io runs only on one pool even though 2 pools are attached to cephfs FS.
You are right. I could do that. Rohith Radhakrishnan
07:17 AM Support #17079: Io runs only on one pool even though 2 pools are attached to cephfs FS.
There is no option to do that. Your requirement is strange; why not enlarge the quota of the first pool? Zheng Yan
05:58 AM Support #17079: Io runs only on one pool even though 2 pools are attached to cephfs FS.
@Zheng: What I would like to achieve is that after adding 2 pools to a CephFS, I should be able to redirect the objects f... Rohith Radhakrishnan

08/23/2016

06:23 PM Bug #17105: multimds: allow_multimds not required when max_mds is set in ceph.conf at startup
I think we want to force users to set multi-mds flags explicitly, not implicitly via the initial config. I'm fine wit... Greg Farnum
06:02 PM Bug #17105 (Resolved): multimds: allow_multimds not required when max_mds is set in ceph.conf at ...
Problem:... Patrick Donnelly
04:08 PM Bug #17099 (Closed): MDS command for listing mds_cache_size
The config option can be shown through the standard config interface. The counter values are exported via the perf co... Greg Farnum
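For reference, hedged examples of the two interfaces mentioned (the daemon name mds.a is illustrative):

    ceph daemon mds.a config get mds_cache_size   # current config value
    ceph daemon mds.a perf dump                   # exported perf counters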
07:52 AM Bug #17099 (Closed): MDS command for listing mds_cache_size
I am not able to find mds_cache_size listed anywhere, e.g. in ceph mds dump or elsewhere. If currently there is no way ... Rohith Radhakrishnan
01:44 PM Backport #16621 (Resolved): jewel: mds: `session evict` tell command blocks forever with async me...
Loïc Dachary
01:27 PM Bug #17096: Pool name is not displayed after changing CephFS File layout using extended attributes
Just saw the note: *Note When reading layouts, the pool will usually be indicated by name. However, in rare cases whe... Rohith Radhakrishnan
07:39 AM Bug #16396 (Resolved): Fix shutting down mds timed-out due to deadlock
Loïc Dachary
07:39 AM Bug #16358 (Resolved): Session::check_access() is buggy
Loïc Dachary
07:39 AM Bug #16164 (Resolved): mds: enforce a dirfrag limit on entries
Loïc Dachary
07:39 AM Bug #16137 (Resolved): client: crash in unmount when fuse_use_invalidate_cb is enabled
Loïc Dachary
07:39 AM Bug #16042 (Resolved): MDS Deadlock on shutdown active rank while busy with metadata IO
Loïc Dachary
07:39 AM Bug #16022 (Resolved): MDSMonitor::check_subs() is very buggy
Loïc Dachary
07:39 AM Bug #16013 (Resolved): Failing file operations on kernel based cephfs mount point leaves unaccess...
Loïc Dachary
07:39 AM Bug #12653 (Resolved): fuse mounted file systems fails SAMBA CTDB ping_pong rw test with v9.0.2
Loïc Dachary
06:51 AM Backport #16037 (Resolved): jewel: MDSMonitor::check_subs() is very buggy
Loïc Dachary
06:51 AM Backport #16215 (Resolved): jewel: client: crash in unmount when fuse_use_invalidate_cb is enabled
Loïc Dachary
06:51 AM Backport #16299 (Resolved): jewel: mds: fix SnapRealm::have_past_parents_open()
Loïc Dachary
06:51 AM Backport #16320 (Resolved): jewel: fs: fuse mounted file systems fails SAMBA CTDB ping_pong rw te...
Loïc Dachary
06:51 AM Backport #16515 (Resolved): jewel: Session::check_access() is buggy
Loïc Dachary
06:50 AM Backport #16560 (Resolved): jewel: mds: enforce a dirfrag limit on entries
Loïc Dachary
06:50 AM Backport #16620 (Resolved): jewel: Fix shutting down mds timed-out due to deadlock
Loïc Dachary
06:50 AM Backport #16625 (Resolved): jewel: Failing file operations on kernel based cephfs mount point lea...
Loïc Dachary
06:50 AM Backport #16797 (Resolved): jewel: MDS Deadlock on shutdown active rank while busy with metadata IO
Loïc Dachary

08/22/2016

06:45 PM Bug #17096 (Won't Fix): Pool name is not displayed after changing CephFS File layout using extend...
Steps:-
1) Create a pool and a metadata pool, create a new cephfs using the pools, and mount the file system from ...
Rohith Radhakrishnan
11:47 AM Support #17079: Io runs only on one pool even though 2 pools are attached to cephfs FS.
Tried setting a non-default pool using "setfattr", but I am not able to set more than one pool on a directory at a tim... Rohith Radhakrishnan

08/19/2016

04:18 PM Bug #14716: "Thread.cc: 143: FAILED assert(status == 0)" in fs-hammer---basic-smithi
Same in hammer 0.94.8
http://qa-proxy.ceph.com/teuthology/yuriw-2016-08-18_20:11:00-fs-master---basic-smithi/373246/...
Yuri Weinstein
01:18 PM Support #17079: Io runs only on one pool even though 2 pools are attached to cephfs FS.
the first pool is the default pool. See http://docs.ceph.com/docs/master/cephfs/file-layouts/ for how to store files in no... Zheng Yan
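A hedged example of steering new files under a directory to a non-default data pool via file layouts (the pool name and paths are illustrative; the pool must already be attached to the filesystem):

    setfattr -n ceph.dir.layout.pool -v my_second_pool /mnt/cephfs/somedir
    getfattr -n ceph.dir.layout /mnt/cephfs/somedir   # verify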
11:23 AM Support #17079 (New): Io runs only on one pool even though 2 pools are attached to cephfs FS.
Steps:-
1) Create a pool and a metadata pool and create a new cephfs using the pools.
2) Now create another data ...
Rohith Radhakrishnan

08/18/2016

08:44 PM Bug #17074: "SELinux denials" in knfs-master-testing-basic-smithi
Not the result of an environmental issue or system misconfiguration. David Galloway
08:21 PM Bug #17074 (Closed): "SELinux denials" in knfs-master-testing-basic-smithi
This is for the hammer 0.94.8 point release tests
Run: http://pulpito.front.sepia.ceph.com/yuriw-2016-08-17_20:57:47-knfs-...
Yuri Weinstein
06:47 AM Bug #17069: multimds: slave rmdir assertion failure
Strange. Have you ever used snapshots on the testing cluster? Zheng Yan
 
