Activity
From 06/09/2015 to 07/08/2015
07/08/2015
- 01:28 PM Backport #11737 (Resolved): MDS is crashed (mds/CDir.cc: 1391: FAILED assert(!is_complete()))
- 12:55 PM Backport #12098 (In Progress): kernel_untar_build fails on EL7
- 12:55 PM Backport #11999 (In Progress): cephfs Dumper tries to load whole journal into memory at once
- 12:12 PM Support #11923 (Resolved): MDS init script starts multiple instances when MDS is referenced in ce...
- Yes, in a default configuration you don't list the daemons and sysvinit starts them based on what folders exist in /v...
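For illustration, a minimal sketch of what the init script keys off, assuming the default sysvinit layout (the instance name "a" is hypothetical):
~~~
# sysvinit starts one ceph-mds per directory it finds under /var/lib/ceph/mds;
# the 'sysvinit' marker file is what flags an instance for automatic startup
ls /var/lib/ceph/mds/ceph-a/
# removing the marker should stop the init script from starting that instance
rm /var/lib/ceph/mds/ceph-a/sysvinit
~~~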
- 11:57 AM Support #11923: MDS init script starts multiple instances when MDS is referenced in ceph.conf
- Greg,
I'd like to put in a slight correction in the 'ps' output, even if it may be the same as what Brian said in the...
- 09:43 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- please use the new timeout patch
- 08:15 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- please try the attached patches (the second one is a modified version of your timeout patch)
- 02:10 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- Hi Greg,
Not exactly. We have enabled timeouts in 2 ways: one is the patch you saw, the other is in Client::tick()...
07/07/2015
- 04:25 PM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- Do I guess correctly that by the timeout mechanism you're referring to the patch in http://tracker.ceph.com/issues/12...
- 12:07 PM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- We met this issue *because we enabled a timeout mechanism in tick on the cephfs client for all the mds requests*. Once time...
- 04:02 PM Support #11923: MDS init script starts multiple instances when MDS is referenced in ceph.conf
- How was this ceph.conf and MDS set up? If you remove the "sysvinit" file from the MDS directory it shouldn't be start...
- 03:09 PM Bug #12222 (Resolved): MDSMonitor: set max_mds doesn't respect MAX_MDS
- Various allocation ranges are calculated based on MAX_MDS, but we don't actually stop the user from creating more MDS...
07/06/2015
- 02:08 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- Thanks, guys. I will keep digging to see if we can find something. Will keep you updated and pls stay tuned.
07/03/2015
- 10:49 AM Feature #9755: Fence late clients during reconnect timeout
- Nope -- the machinery went in to barrier on OSD epoch after blacklisting a client, but the actual act of blacklisting...
- 08:41 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- Hi David,
After changing CentOS 7 client OS to Fedora 21 (kernel 3.18) the deadlock problem disappeared.
I did ...
- 05:53 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- please dump the mds cache when this happens (ceph mds tell \* dumpcache, cache dump file at /cachedump.* on the machine that ru...
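For clarity, the command being asked for above (the dump lands on the machine running the active MDS):
~~~
# ask every MDS to dump its cache; look for /cachedump.* on the MDS host afterwards
ceph mds tell \* dumpcache
~~~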
- 05:35 AM Bug #10944: Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock
- Hi Greg and Yan,
We are using ceph-dokan on Windows and have hit the same problem a few times. And this deadlock happ...
- 05:18 AM Bug #12209 (Won't Fix): CephFS should have a complete timeout mechanism to avoid endless waiting ...
- Recently, when we ran stress tests on cephfs through ceph-dokan on Windows, there were always some ceph-dokan ...
07/02/2015
- 09:30 PM Feature #12204 (Resolved): ceph-fuse: warn and shut down when there is no MDS present
- Right now if you try to mount ceph-fuse and there's no MDS in the system, it simply hangs. This is confusing for new ...
- 09:27 PM Feature #8358: client: opportunistically update backtraces on files
- I think the steps to do this are:
1) Have MDS provide a bufferlist to the client whenever an inode is created (as an...
- 08:55 PM Feature #9755: Fence late clients during reconnect timeout
- Didn't this get done when the epoch barrier stuff did? (If not, please unassign.)
- 11:16 AM Support #11923: MDS init script starts multiple instances when MDS is referenced in ceph.conf
- Hello Sage,
The listing of /var/lib/ceph/mds is as follows:
~~~
# ls -l /var/lib/ceph/mds/
total 0
drwxr-x...
~~~
- 10:48 AM Bug #12088: cephfs client crash after enable readahead mechanism through setting conf option 'cli...
- no extra lock is needed. the async readahead context is called while client_lock is locked.
- 09:30 AM Bug #12088: cephfs client crash after enable readahead mechanism through setting conf option 'cli...
- Hi Yan,
Thanks for the patch. We also have a similar patch tested internally, which uses readahead's pending count...
- 08:47 AM Bug #12088: cephfs client crash after enable readahead mechanism through setting conf option 'cli...
- please try the attached patch
- 09:01 AM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- no idea what happened, please try using a newer kernel (such as the 4.0 kernel) on both the NFS server and NFS client
- 07:59 AM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- The desktop machine does not have access to the ceph network at all. That's why I have to use an NFS gateway.
dd on N...
- 07:40 AM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- it's likely the client on your desktop machine does not have RW permission to the pools. please try doing a direct write on ...
- 07:02 AM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- Zheng Yan wrote:
> it's likely your client does not have RW permission to the pools
I don't think the problem is ...
- 01:27 AM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- it's likely your client does not have RW permission to the pools
- 02:56 AM Bug #11481: "mds/MDSTable.cc: 146: FAILED assert(is_undef())" on standby->replay transition
- fixed in https://github.com/ceph/ceph/pull/4658
07/01/2015
- 05:32 PM Bug #11746: cephfs Dumper tries to load whole journal into memory at once
- Hammer backport: https://github.com/ceph/ceph/pull/5120
- 03:51 PM Bug #12094 (Duplicate): "Segmentation fault" in smoke-master-distro-basic-multi run
- #12123
06/30/2015
- 10:29 PM Cleanup #12191 (Resolved): Remove ceph-mds --journal-check aka ONESHOT_REPLAY
Now that we have separate tools for validating the journal, we should remove MDSMap::STATE_ONESHOT_REPLAY -- it add...
- 02:57 PM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- Replicated pools also seem to be affected:
On client:
:/ceph/test$ ls
:/ceph/test$ touch foo
:/ceph/test$ cp ...
- 02:36 PM Bug #12189: Editing / Creating files fails for NFS-over-CephFS on EC pool with cache tier
- ~# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
126T 53823G 75664G 58.38 ...
- 02:33 PM Bug #12189 (Won't Fix): Editing / Creating files fails for NFS-over-CephFS on EC pool with cache ...
- Ubuntu 14.04, Kernel 3.13.0-55-generic
Standard kernel-based NFS server
Ceph Hammer release
~# ceph version
ceph...
- 09:23 AM Bug #12123 (Resolved): testlibcephfs: segfault in preadv/pwritev tests
- 03:40 AM Bug #9994 (Fix Under Review): ceph-qa-suite: nfs mount timeouts
- 03:40 AM Bug #9994: ceph-qa-suite: nfs mount timeouts
- https://github.com/ceph/teuthology/pull/554
06/29/2015
- 11:23 AM Bug #12172 (Resolved): tasks.cephfs.test_auto_repair.TestMDSAutoRepair fails
- ...
- 09:55 AM Bug #12172 (Fix Under Review): tasks.cephfs.test_auto_repair.TestMDSAutoRepair fails
- 09:54 AM Bug #12172: tasks.cephfs.test_auto_repair.TestMDSAutoRepair fails
- https://github.com/ceph/ceph-qa-suite/pull/473
06/26/2015
- 04:21 PM Feature #12107: mds: use versioned wire protocol; obviate CEPH_MDS_PROTOCOL
- Mmmm. For over-the-wire encodings that don't go to disk, it's only about whether cross-version daemons can communicat...
- 01:24 PM Bug #12175 (Resolved): Fix ceph-fuse --help
- Problems with this:
* it starts running after printing the help output, instead of quitting
* it omits useful/i...
- 01:03 PM Feature #12138: cephfs-data-scan: write inode backtraces when injecting to lost+found
- Hmm. When we do move something into lost+found, are we sure we can't make use of the backtrace any more, or might it ...
- 10:55 AM Bug #12123: testlibcephfs: segfault in preadv/pwritev tests
- Please use the new pull request:
https://github.com/ceph/ceph/pull/5084
- 03:14 AM Bug #12123: testlibcephfs: segfault in preadv/pwritev tests
- A pull request fixing this issue has been submitted and is pending review.
- 03:13 AM Bug #12123 (Fix Under Review): testlibcephfs: segfault in preadv/pwritev tests
- https://github.com/ceph/ceph/pull/5083
- 01:25 AM Bug #12172 (Resolved): tasks.cephfs.test_auto_repair.TestMDSAutoRepair fails
- http://qa-proxy.ceph.com/teuthology/teuthology-2015-06-23_23:04:02-fs-next---basic-multi/947532/teuthology.log
06/25/2015
- 01:22 PM Feature #12107: mds: use versioned wire protocol; obviate CEPH_MDS_PROTOCOL
- yes, it's better to not rely on CEPH_MDS_PROTOCOL. (CEPH_MDS_PROTOCOL will make the rework easier)
- 11:03 AM Feature #12107: mds: use versioned wire protocol; obviate CEPH_MDS_PROTOCOL
- Oh right, so as Zheng reminds me in his patch for #12105, we do have the CEPH_MDS_PROTOCOL value for these changes.
... - 11:01 AM Bug #12105 (Resolved): CInode misses oldest_snap field during migration
- ...
- 08:36 AM Bug #12105 (Fix Under Review): CInode misses oldest_snap field during migration
- 08:39 AM Bug #12019 (Resolved): multiple_rsync failure
- 02:30 AM Bug #11989 (Resolved): Cephfs Kernel Client data corruption
06/24/2015
- 11:03 PM Feature #4161 (Fix Under Review): MDS: add file layout to head object
- https://github.com/ceph/ceph/pull/5070
- 01:43 PM Feature #4161 (In Progress): MDS: add file layout to head object
- 12:46 PM Feature #12145 (Resolved): cephfs-data-scan: pgls filter for 0th file objects
- Currently, when iterating over 0th objects, we are actually listing all the objects and selecting the ones we care ab...
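A rough sketch of the "list everything, then filter" approach this ticket wants to replace with a pgls filter, assuming the usual <inode>.<index> object naming and a placeholder pool name:
~~~
# every file's 0th object has index 00000000, so select those from a full listing
rados -p cephfs_data ls | grep '\.00000000$'
~~~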
- 12:44 PM Feature #12144 (Resolved): cephfs-data-scan: integrated with sharded pgls
- This bit was taken out pending the new-style sharded pgls that should be in infernalis.
cephfs-data-scan should ta...
- 12:39 PM Feature #12143 (New): cephfs-data-scan: Tool for orchestrating multiple workers
To run at any kind of scale, this tool requires multiple workers executing across multiple clients.
It would be ...
- 12:35 PM Feature #12142 (New): cephfs-data-scan: Structured output of errors and operations done
Need a machine-consumable list of:
* Any I/O or encoding errors encountered (i.e. objects that might need manual ...
- 12:33 PM Feature #12141 (New): cephfs-data-scan: File size correction from backward scan
- Currently, if a dentry already exists at the backtrace location which points to the inode, we do nothing. We should ...
- 12:30 PM Feature #12140 (New): cephfs-data-scan: Use ancestor file layouts when injecting inodes
- Currently we synthesize layouts on a best effort basis when injecting. We should also look for ancestor layouts whic...
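For reference, a hedged example of how an ancestor directory's layout can be inspected on a healthy, mounted filesystem (the path is a placeholder); during injection the tool would need to derive the same information from whatever metadata it can still read:
~~~
# directory layouts are exposed as a virtual xattr on a CephFS mount
# (only present on directories that have an explicit layout set)
getfattr -n ceph.dir.layout /mnt/cephfs/some/ancestor/dir
~~~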
- 12:29 PM Feature #12139 (New): cephfs-data-scan: cache fragtrees during injection
In order to inject dentries, we have to learn the fragtree of the directory we're injecting into. The process exis...
- 12:28 PM Feature #12138 (New): cephfs-data-scan: write inode backtraces when injecting to lost+found
- Currently, for cases where we inject linkage for an inode into /lost+found, the inode's backtrace potentially still p...
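For context, a hedged way to inspect the backtrace in question by hand, assuming it is stored in the first data object's "parent" xattr (object and pool names are placeholders):
~~~
# dump the raw backtrace blob from the file's 0th object and decode it
rados -p cephfs_data getxattr 10000000000.00000000 parent > backtrace.bin
ceph-dencoder type inode_backtrace_t import backtrace.bin decode dump_json
~~~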
- 12:26 PM Feature #12137 (New): cephfs-data-scan: backward scan of dirfrag objects, inject orphans
Similar procedure to what we already have for data objects: inject linkage for orphaned (detected a la #12133) dirf...
- 12:24 PM Feature #12136 (New): fsck: snapshots: Enumerate snapshots during scan_extents
Optionally, issue a RADOS op to list all the snapshots for every object seen, and accumulate these into a set on th...
- 12:20 PM Feature #12135 (New): cephfs-data-scan: Layout override by path
We can't always guess file layouts correctly. Provide a mechanism for users to manually specify the file layout to...
- 12:18 PM Feature #12134 (New): cephfs-data-scan: Filter on ino/path/dname expression
Sometimes the user might want to go and recover only certain files from a damaged filesystem, or they might know th...
- 12:17 PM Feature #12133 (Resolved): cephfs-data-scan: Filter on inodes not touched by forward scrub
Where forward scrub has marked those inodes that it has touched, add an option (possibly the default) to cephfs-dat...
- 12:15 PM Feature #12132 (Resolved): cephfs-data-scan: Cleanup phase
A phase to remove the xattrs created during the scan_extents phase. They are small and harmless, but we should at ...
- 12:13 PM Feature #12131 (New): cephfs-data-scan: Update InoTable after injection
Currently, inodes are injected without any consideration to whether their number is still regarded as free in inota...
- 12:11 PM Feature #12130 (New): cephfs-data-scan: Accumulate dirfrag sizes on injection
Use a RADOS class to increment an xattr for the number of dentries injected into a fragment object during repair. ...
06/23/2015
- 01:38 PM Bug #12123: testlibcephfs: segfault in preadv/pwritev tests
- Emailed JevonQ to ask him to take a look at this
- 10:39 AM Bug #12123 (Resolved): testlibcephfs: segfault in preadv/pwritev tests
- http://qa-proxy.ceph.com/teuthology/teuthology-2015-06-19_23:04:01-fs-master---basic-multi/941831/...
- 10:26 AM Bug #9994: ceph-qa-suite: nfs mount timeouts
- Still happening:
http://qa-proxy.ceph.com/teuthology/teuthology-2015-06-19_23:10:01-knfs-next-testing-basic-multi/94...
06/22/2015
- 12:00 PM Bug #12105: CInode misses oldest_snap field during migration
- Probably can't do much about this safely without addressing the larger encoding issues here (#12107)
- 11:02 AM Bug #12105 (Resolved): CInode misses oldest_snap field during migration
See CInode::_encode_base -- because encoding is duplicated here wrt InodeStore.
Need to either update that fn or...
- 11:46 AM Feature #12107: mds: use versioned wire protocol; obviate CEPH_MDS_PROTOCOL
- This code needs a rework to support versioning, because the outer message encoding in e.g. handle_discover_reply oper...
- 11:37 AM Feature #12107 (Resolved): mds: use versioned wire protocol; obviate CEPH_MDS_PROTOCOL
- 11:05 AM Feature #12106 (New): CInodes encoded unversioned in dirfrags
- Where we encode CInodes in the omap values of a dirfrag, we do it without any ENCODE_START decorators (InodeStoreBase...
06/21/2015
- 02:19 PM Bug #12088: cephfs client crash after enable readahead mechanism through setting conf option 'cli...
- While diving into the source code, we found the code path which will cause the crash. Described as the following step...
06/19/2015
- 07:13 PM Backport #12098 (Resolved): kernel_untar_build fails on EL7
- https://github.com/ceph/ceph/pull/5119
- 07:12 PM Backport #12097 (Resolved): kernel_untar_build fails on EL7
- https://github.com/ceph/ceph/pull/6000
- 03:39 PM Bug #12094 (Duplicate): "Segmentation fault" in smoke-master-distro-basic-multi run
- Run: http://pulpito.ceph.com/teuthology-2015-06-19_05:00:05-smoke-master-distro-basic-multi/
Job: 940481
Logs: http... - 11:03 AM Bug #12088 (Resolved): cephfs client crash after enable readahead mechanism through setting conf ...
- I run the fio tool to test the randread performance of cephfs. The ceph client will crash when I enable readahead on cep...
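For reproduction context, an illustrative fio randread invocation against a CephFS mount; the parameters and path are placeholders, not the reporter's exact command:
~~~
fio --name=randread --directory=/mnt/cephfs --rw=randread --bs=4k --size=1G --numjobs=4
~~~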
06/18/2015
- 10:16 AM Bug #11989: Cephfs Kernel Client data corruption
- Zheng Yan wrote:
> please try the attached patch
I have tried your patch with my test case with 300GB of data and...
- 03:52 AM Bug #11989: Cephfs Kernel Client data corruption
- please try the attached patch
06/17/2015
- 01:32 PM Bug #11989: Cephfs Kernel Client data corruption
- I reproduced this locally
- 09:24 AM Bug #11784: ceph-fuse hang on unmount (stuck dentry refs)
- Hmm, there shouldn't have been any activity on the mount by this point. Maybe we've got some other kind of bug, though.
- 12:58 AM Bug #11784: ceph-fuse hang on unmount (stuck dentry refs)
- ...
06/16/2015
- 03:33 PM Bug #11758 (Pending Backport): kernel_untar_build fails on EL7
- 03:33 PM Bug #11758 (Resolved): kernel_untar_build fails on EL7
- 03:16 PM Bug #11758 (Fix Under Review): kernel_untar_build fails on EL7
- https://github.com/ceph/ceph/pull/4967
- 01:52 PM Bug #11985 (In Progress): MDS asserts in objecter when transitioning from replay to DNE
- 01:49 PM Bug #11541 (Resolved): MDS is crashed (mds/CDir.cc: 1391: FAILED assert(!is_complete()))
- 11:22 AM Bug #11913 (Resolved): Failure in TestClusterFull.test_barrier
- commit:bf9a9a2d9ff2be129b303d535899f60ad49f7c23
- 10:23 AM Bug #12019: multiple_rsync failure
- Yeah, I think this was me being silly when reading the log; I read straight from the rsync invocation to the error, w...
- 10:09 AM Bug #12019: multiple_rsync failure
- commit:0804655725d84d866a32826203638fcfd71d4b51
Since we're using sudo to copy we presumably need it to delete. I ...
- 07:13 AM Bug #12019 (Fix Under Review): multiple_rsync failure
- https://github.com/ceph/ceph/pull/4964
- 07:10 AM Bug #11989: Cephfs Kernel Client data corruption
- could you please provide me a list of corrupt blocks (offset and size of corrupt block). Besides, could you please tr...
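One hedged way to produce such a list when the original copy of the file is still available (paths are placeholders):
~~~
# cmp -l prints the 1-based byte offset and the differing byte values for every mismatch
cmp -l /mnt/cephfs/streamed-copy.bin /srv/source/original.bin | head
~~~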
06/15/2015
- 09:56 AM Bug #12019 (Resolved): multiple_rsync failure
- Related to #11781?
This is running with the recent change to source files from a dir in /tmp instead of directly f...
- 05:50 AM Bug #11989: Cephfs Kernel Client data corruption
- Zheng Yan wrote:
> are there any suspected message when this happens?
dmesg is silent, ceph logs in /var/log/ceph...
- 01:57 AM Bug #11989: Cephfs Kernel Client data corruption
- are there any suspected message when this happens?
06/12/2015
- 07:31 PM Backport #11999 (Resolved): cephfs Dumper tries to load whole journal into memory at once
- https://github.com/ceph/ceph/pull/5120
- 07:06 PM Bug #11989: Cephfs Kernel Client data corruption
- I imagine this is a result of some kind of memory exhaustion, but I'm not sure how best to diagnose it or if there ar...
- 01:22 PM Bug #11989 (Resolved): Cephfs Kernel Client data corruption
- Hi. I get random data corruption with the cephfs kernel client. I do streaming from a non-ceph machine using "cat <fi...
- 06:19 PM Bug #11986: logs changing during tarball generation at end of job
- Actually this job doesn't have log rotation enabled at all.
- 12:34 PM Bug #11986 (Closed): logs changing during tarball generation at end of job
http://pulpito.ceph.com/teuthology-2015-06-08_23:04:01-fs-master---basic-multi/926320/...
- 12:31 PM Bug #11985 (Resolved): MDS asserts in objecter when transitioning from replay to DNE
Seen once:
http://pulpito.ceph.com/teuthology-2015-06-08_23:04:01-fs-master---basic-multi/926330/...
06/11/2015
- 01:56 PM Feature #3826 (Resolved): uclient: Be more aggressive about checking for pools we can't write to
- Zheng did this a few months ago; we now write to a test object in every newly-seen pool.
- 01:55 PM Feature #4885 (Resolved): dumper: do an incremental log dump
- https://github.com/ceph/ceph/pull/4835
- 01:52 PM Feature #11588 (Resolved): teuthology: set up log rotate for MDS logs
- https://github.com/ceph/ceph-qa-suite/pull/452
- 01:28 PM Bug #11959 (Resolved): qa-suite: /usr copy needs more perms
- 10:13 AM Bug #11959 (Fix Under Review): qa-suite: /usr copy needs more perms
- That's kind of yucky of virtual box to put those in /usr/lib but whatever!
https://github.com/ceph/ceph/pull/4930
- 05:23 AM Bug #11959 (Resolved): qa-suite: /usr copy needs more perms
- /a/ubuntu-2015-06-10_10:18:19-fs-greg-fs-testing---basic-multi/928341/teuthology.log...
- 01:15 PM Bug #11913 (Fix Under Review): Failure in TestClusterFull.test_barrier
- https://github.com/ceph/ceph-qa-suite/pull/457
06/10/2015
- 08:34 PM Support #11923: MDS init script starts multiple instances when MDS is referenced in ceph.conf
- Can you attach the ceph.conf and an ls -al of the instance directory? This will happen if the section is in ceph.co...
- 02:19 PM Bug #10950 (Resolved): Unable to remove MDS host: error handling
- Merged to master in commit:5441f89c022aa1f4df084a4280e45c5c5b278f00
- 02:18 PM Bug #11746 (Pending Backport): cephfs Dumper tries to load whole journal into memory at once
- Merged to master in commit:04a11f0f2f6d46091d6868ba1cc2fec7a4e7a99c
- 01:19 PM Feature #11950 (Resolved): Strays enqueued for purge cause MDCache to exceed size limit
If your purge operations are going slowly (either because of throttle or because of slow data pool), and you do lot...