Activity
From 03/27/2013 to 04/25/2013
04/25/2013
- 06:34 PM Bug #4742: mds: stuck clientreplay request
- Yeah, we've discussed this some on github around wip-4742 and on irc. :)
- 06:31 PM Bug #4742: mds: stuck clientreplay request
- Looks like a client bug, it may add cap releases to the replay requests. (encode_cap_releases() should be called when...
- 10:38 AM Bug #4742: mds: stuck clientreplay request
- Logs for two runs, one is stuck in replay from a setattr, the other is stuck in replay from a rename.
04/23/2013
- 03:53 PM Feature #4799 (Resolved): Client Security for CephFS
- As discussed on the #ceph IRC channel with gregaf and others, I would find some added level of client security in Cep...
- 01:34 PM Bug #4721 (Resolved): libcephfs tests fail when using ceph-deploy
- strange that it works fine on the latest next branch [0.60-624-g426e3be-1precise] ...
- 10:29 AM Bug #4742: mds: stuck clientreplay request
- Attaching mds log from an mds stuck on clientreplay. Looks like the setattr gets put on the inode waiting list by the lo...
04/21/2013
- 06:12 AM Bug #4753: mds/Locker.cc: 4167: FAILED assert(0)
- Additional: I resolved it at runtime by changing assert(0) to some lock (IMHO the first one in this case) on one node and found for...
04/19/2013
- 10:17 AM Bug #4105: mds: fix up the Dumper
- This has annoyed me a couple more times and I think it's now at the top of the queue, so here we go again.
- 10:08 AM Bug #4746: client: invalidate callback can deadlock
- pushed wip-fuse to ceph-client.git
- 09:42 AM Bug #4753: mds/Locker.cc: 4167: FAILED assert(0)
- You mean file_eval should just short-circuit if it's scanning? That seems like the most sensible place for it, but I'...
- 09:31 AM Bug #4753: mds/Locker.cc: 4167: FAILED assert(0)
- yeah, that transition doesn't make sense. i think it should do nothing in the scan state..
- 09:05 AM Bug #4753: mds/Locker.cc: 4167: FAILED assert(0)
- file_eval is trying to move ifile from "scan" to "mixed" in order to serve up the client caps, and scatter_mix doesn'...
- 02:13 AM Bug #4601: symlink with size zero
- I was looking at the <inode>.<frag>_head* file in the osd that held the directory where the link was stored. As it t...
04/18/2013
- 05:22 PM Bug #4753 (Resolved): mds/Locker.cc: 4167: FAILED assert(0)
- Every mds crashed after some startup checks: "mds/Locker.cc: 4167: FAILED assert(0)":
mds/Locker.cc: 4167: FAILED ...
- 05:12 PM Bug #4746: client: invalidate callback can deadlock
- The suggestion from Maxim is to modify fuse to serialize reads and invalidate via a mutex. That ought to do the tric...
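The mutex idea above can be sketched in a few lines. This is a hypothetical illustration (PageCache, read, write, and invalidate are made-up names, not the actual fuse or ceph-fuse code): a single mutex serializes reads against the invalidate callback, so an invalidate always waits for in-flight reads rather than deadlocking against them.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical page cache; stands in for the fuse-side cache state.
class PageCache {
  std::mutex io_mutex;              // serializes reads and invalidates
  std::map<int, std::string> pages; // fake cached pages

public:
  std::string read(int page) {
    std::lock_guard<std::mutex> g(io_mutex); // invalidate cannot interleave
    auto it = pages.find(page);
    return it == pages.end() ? "" : it->second;
  }
  void write(int page, const std::string& data) {
    std::lock_guard<std::mutex> g(io_mutex);
    pages[page] = data;
  }
  void invalidate(int page) {
    std::lock_guard<std::mutex> g(io_mutex); // waits for in-flight reads
    pages.erase(page);
  }
};
```

The point of the design is that the invalidate path never runs concurrently with a read touching the same cache, at the cost of serializing all I/O through one lock.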
- 09:37 AM Bug #4746: client: invalidate callback can deadlock
- It's not any of our internal locking that are getting stuck; it's the VFS inode mutexes in combination with us. If I ...
- 07:31 AM Bug #4746: client: invalidate callback can deadlock
- The invalidate is queued in a separate thread, and when we call the invalidate, we don't have the client lock held. ...
- 05:06 PM Bug #4601: symlink with size zero
- >I looked a bit in the ceph-osd file holding the directory that contains the symlink, and I can see ^Q in the yes_hea...
- 04:57 PM Bug #1945 (Can't reproduce): blogbench hang on caps
- We haven't seen this in a long time (at least, that's marked here), and there's been a ton of work here over the last...
- 04:39 PM Bug #4732: uclient: client/Inode.cc: 126: FAILED assert(cap_refs[c] > 0)
- This was in the async invalidate thread, so I'm turning this down. It should probably be investigated alongside/after...
- 04:34 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- Okay, pushed the update for more debugging, and am downgrading this to "High" since it only appears under so many fai...
- 04:17 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- Also, both of these are the same job as the first incident was: fsstress workunit on ceph-fuse, messenger failure inj...
- 04:15 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- Those machines are cleared out again, of course (d'oh!). Next time we see this we need to gather up everything we can...
- 04:03 PM Bug #4741: MDS: stuck in clientreplay
- Interesting; on #4742 it was clearly waiting on a request because it kept saying "still have 1 active replay requests...
- 03:57 PM Bug #4741 (Duplicate): MDS: stuck in clientreplay
- This is a duplicate of #4742. It looks like setattr is the culprit. I was able to generate a core file of the mds w...
- 11:13 AM Bug #4741: MDS: stuck in clientreplay
- Also /a/teuthology-2013-04-18_01:01:07-fs-next-testing-basic/15101
- 03:58 PM Bug #4721 (Need More Info): libcephfs tests fail when using ceph-deploy
- (Trying to track the responsibility flow more clearly.)
- 03:19 PM Bug #4721: libcephfs tests fail when using ceph-deploy
- Have you reproduced this, Tamil? Since all the tests are failing I'm pretty sure this is some kind of authentication ...
- 03:57 PM Bug #4742 (In Progress): mds: stuck clientreplay request
- 03:57 PM Bug #4742: mds: stuck clientreplay request
- Marked #4741 as a duplicate of this bug. It looks like setattr is the culprit. I was able to generate a core file o...
- 01:57 PM Bug #4722: kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- I did a checkout of v3.5, and caps.c:1006 is...
- 01:37 PM Bug #4738: libceph: unlink vs. readdir (and other dir orders)
- I don't believe locking is implemented yet via the Samba VFS bindings, since we don't have a userspace implementation...
- 01:27 PM Bug #4738: libceph: unlink vs. readdir (and other dir orders)
- On top only:
vfs objects = scannedonly ceph
And if I switch to:
vfs objects = scannedonly
or:
vfs objects = c...
- 11:03 AM Bug #3637 (Resolved): client: not issuing caps for with clients doing shared writes
- Merged into next in commit:efbe2e8b55ba735673a3fdb925a6304915f333d8
04/17/2013
- 07:42 PM Bug #4713 (Resolved): mds: hang related to access from two clients
- The following have been committed to the "testing" branch
of the ceph-client git repository. With them in place
I ...
- 07:39 PM Bug #4706 (Resolved): kclient: Oops when two clients concurrently write a file
- The following have been committed to the ceph-client
"testing" branch:
8f68229 libceph: change how "safe" callbac...
- 07:38 PM Bug #4679 (Resolved): ceph: hang while running blogbench on mira nodes
- Sorry Greg, I should have been in better communication
with you. I have been testing these all afternoon and
Sage ...
- 03:48 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- I believe Sage has been over all these now. I'm trying to go over the newest versions off the mailing list as well, n...
- 07:20 PM Bug #4726 (Can't reproduce): mds: segv during blogbench in remove_pending_backtraces
- I wasn't able to reproduce this after more than 200 runs, so I'm marking it as Can't reproduce for now.
- 05:37 PM Bug #3597 (Resolved): ceph-fuse: denying root access
- Oh, this was a bug that got fixed in commit:d87035c0c4ff, included in v0.60.
- 05:05 PM Bug #4746: client: invalidate callback can deadlock
- Hmm, you're right, this is a more fundamental problem.
- 04:50 PM Bug #4746: client: invalidate callback can deadlock
- Maybe; we didn't think this through much beyond going "yep, that's broken".
However, I think we can queue up the i...
- 04:44 PM Bug #4746: client: invalidate callback can deadlock
- "We may need to introduce a second locking layer to deal with this, that covers draining out all VFS requests before ...
- 03:04 PM Bug #4746 (Resolved): client: invalidate callback can deadlock
- I saw this when testing the fix for #3637. We appear to be (correctly) safe against deadlocks on our own locks, but w...
- 04:12 PM Feature #4326: qa: add samba + (kclient|ceph-fuse) to suite
- I think you might have mentioned you were trying to do this while you were working on the samba vfs-based ones? If no...
- 04:09 PM Bug #1878 (Resolved): ceph.ko doesn't setattr (lchown, utimes) on symlinks
- I've pushed this to our testing branch. It's presently commit:baf0169b77f6a0c384a15fb425e5700fb0239e89, although that...
- 03:59 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- And he gave me a reviewed-by tag. Will merge this tomorrow morning after some more testing.
- 03:53 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- This now appears to be passing (I've got it continuing to loop in the background), but it needs review and merging. S...
- 03:05 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- That latest issue was #4746. Turning off the callback and testing again...
- 05:42 AM Bug #3637: client: not issuing caps for with clients doing shared writes
- Zheng Yan wrote:
> there are only 4 states that allow Fw caps, they are MIX, MIX_EXCL, EXCL and EXCL_MIX. they all a...
- 05:39 AM Bug #3637: client: not issuing caps for with clients doing shared writes
- Greg Farnum wrote:
> I don't remember how all the locking works when you have multiple writers, but I don't believe ...
- 10:17 AM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- And also /a/teuthology-2013-04-16_01:00:52-fs-next-testing-basic/13665
- 09:26 AM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- This just happened again at /a/teuthology-2013-04-17_01:00:56-fs-master-testing-basic/14248 (it's still running, for ...
- 10:12 AM Bug #4742: mds: stuck clientreplay request
- Looks like a setattr and a create:
ubuntu@plana72:~$ sudo ceph --admin-daemon /var/run/ceph/ceph-client.0.19374.as...
- 09:36 AM Bug #4742 (Resolved): mds: stuck clientreplay request
- /a/teuthology-2013-04-17_01:00:56-fs-master-testing-basic/14246
It has a single request which isn't completing; wh...
- 10:06 AM Cleanup #4744 (In Progress): mds: pass around LogSegments via std::shared_ptr
- These really ought to be ref-counted in some way to prevent early expiry.
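The ref-counting idea can be sketched with std::shared_ptr. LogSegment and Journal here are simplified stand-ins for the real MDS classes, a minimal sketch rather than the actual implementation: the journal owns segments via shared_ptr, so trimming drops only the journal's reference and any in-flight operation still holding one keeps its segment alive instead of dereferencing freed memory.

```cpp
#include <cstdint>
#include <list>
#include <memory>

// Hypothetical LogSegment; the real MDS class carries much more state.
struct LogSegment {
  uint64_t seq;
  explicit LogSegment(uint64_t s) : seq(s) {}
};

class Journal {
  std::list<std::shared_ptr<LogSegment>> segments;

public:
  // Appending hands back a shared reference an operation may retain.
  std::shared_ptr<LogSegment> append(uint64_t seq) {
    segments.push_back(std::make_shared<LogSegment>(seq));
    return segments.back();
  }
  // Trimming only drops the journal's own reference.
  void trim_front() { segments.pop_front(); }
};
```

With raw pointers, trim_front() would leave any operation that had cached the segment pointing at freed memory; with shared_ptr the segment simply expires when the last holder releases it.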
- 09:34 AM Bug #4741 (Duplicate): MDS: stuck in clientreplay
- /a/teuthology-2013-04-17_01:00:56-fs-master-testing-basic/14249
I can't find any hints, except that it is in fact ...
- 09:00 AM Feature #3243 (In Progress): qa: test samba reexport via libcephfs vfs plugin in teuthology
- 08:58 AM Feature #3242 (Resolved): samba: push plugin upstream
- Posted patches to mailing list:
https://lists.samba.org/archive/samba-technical/2013-April/091651.html
- 08:01 AM Bug #4738 (Need More Info): libceph: unlink vs. readdir (and other dir orders)
- Denis,
I've seen similar behavior with the smbtorture dir1 test, but it happens without the vfs_ceph module. Does...
- 04:54 AM Bug #4738 (Closed): libceph: unlink vs. readdir (and other dir orders)
- Combining (stacking) in samba vfs_scannedonly with vfs_ceph, I experienced some bugs, looks like libceph readdir prob...
04/16/2013
- 06:41 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- Greg Farnum wrote:
> I don't remember how all the locking works when you have multiple writers, but I don't believe ...
- 03:43 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- Okay, it's not quite that simple. This (all following the data writeout; I think this is the data check — anyway, thi...
- 02:58 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- Reproduced at last. There continues to be a problem with the fix branch too :( but it's not a max_size issue; one of ...
- 01:47 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- And that wasn't working because teuthology was creating working dirs like /tmp/cephtest/gregf@kai-2013-04-16_12-59-21...
- 10:48 AM Bug #3637 (Fix Under Review): client: not issuing caps for with clients doing shared writes
- Regarding the testing (which I'm doing now), what those warnings turned out to mean is that each instance had their o...
- 10:37 AM Bug #3637: client: not issuing caps for with clients doing shared writes
- I don't remember how all the locking works when you have multiple writers, but I don't believe either of those suppos...
- 01:11 PM Feature #4734: libcephfs: async interfaces
- If/when we do this, whoever does so should please be careful to refactor our synchronous interfaces in terms of the a...
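A minimal sketch of that refactor, assuming a hypothetical callback-based async_read (not an actual libcephfs call): the synchronous variant is implemented on top of the async one by wrapping the callback in a promise and blocking on the future, so there is only one I/O code path.

```cpp
#include <functional>
#include <future>
#include <string>
#include <thread>

// Hypothetical async read: invokes the callback with the data when done.
// A background thread stands in for real asynchronous I/O.
std::thread async_read(const std::string& path,
                       std::function<void(std::string)> cb) {
  return std::thread([path, cb = std::move(cb)] { cb("contents of " + path); });
}

// Synchronous read expressed in terms of the async interface.
std::string sync_read(const std::string& path) {
  std::promise<std::string> p;
  auto fut = p.get_future();
  std::thread t = async_read(
      path, [&p](std::string data) { p.set_value(std::move(data)); });
  std::string result = fut.get(); // block until the callback fires
  t.join();
  return result;
}
```

The design benefit is the one the comment asks for: a single asynchronous implementation, with the blocking API reduced to a thin wrapper.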
- 12:48 PM Feature #4734 (New): libcephfs: async interfaces
Implement async interfaces to libcephfs, at the least for the write and read calls.
This is motivated by the cep...
- 12:53 PM Bug #4732: uclient: client/Inode.cc: 126: FAILED assert(cap_refs[c] > 0)
- You might want to grab the ceph-fuse binary too so that the core dump is useful.
- 12:37 PM Bug #4732 (Closed): uclient: client/Inode.cc: 126: FAILED assert(cap_refs[c] > 0)
- ...
- 09:59 AM Bug #4729 (Can't reproduce): mds: stuck in clientreplay
- Unfortunately by the time I got in one of the machines had been allocated for another job, and now it looks like the ...
- 07:52 AM Bug #4729 (Can't reproduce): mds: stuck in clientreplay
- job was...
- 09:31 AM Bug #4694 (Resolved): client: put_snap_realm assert failure
- Looks good to me; I merged it into next. This was an impressively narrow race so we couldn't get a good reproducer go...
04/15/2013
- 04:38 PM Documentation #4727 (Resolved): upgrade doc has to be modified to include upgrading ceph-mds as well
- Changed package to ceph-mds: http://ceph.com/docs/master/install/upgrading-ceph/#upgrading-a-metadata-server
- 04:26 PM Documentation #4727 (In Progress): upgrade doc has to be modified to include upgrading ceph-mds a...
- 11:42 AM Documentation #4727 (Resolved): upgrade doc has to be modified to include upgrading ceph-mds as well
- http://ceph.com/docs/master/install/upgrading-ceph/
In the above mentioned doc, in section "upgrading a metadata s...
- 12:47 PM Bug #4713 (Fix Under Review): mds: hang related to access from two clients
- I have tested the commands listed above on a system with the
patches described here:
http://tracker.ceph.com/is...
- 11:03 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- I ran the blogbench test with all of the above-mentioned
patches applied on a mira cluster and I never saw it hang.
...
- 09:35 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- FYI, these kernel patches (Zheng's and mine) are available on
the ceph-client git repository branch "review/wip-4706...
- 09:27 AM Bug #4679 (Fix Under Review): ceph: hang while running blogbench on mira nodes
- > Found 5 bugs, fixed 4.
I reviewed the four kernel patches (they were posted on the mailing
list). I also provi...
- 09:15 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- > The fix for writepages race is easier than I thought, patch is attached.
This is interesting. When I was workin...
- 10:59 AM Bug #4660: mds: segfault in queue_backtrace_update
- *blink*
Of course it's not; sorry about that.
- 10:57 AM Bug #4660 (Resolved): mds: segfault in queue_backtrace_update
- That isn't the same bug. Opening #4726 for that issue.
- 10:52 AM Bug #4660 (In Progress): mds: segfault in queue_backtrace_update
- ubuntu@teuthology:/a/teuthology-2013-04-13_01:00:48-fs-next-testing-basic/12134
- 10:57 AM Bug #4726 (Can't reproduce): mds: segv during blogbench in remove_pending_backtraces
ubuntu@teuthology:/a/teuthology-2013-04-13_01:00:48-fs-next-testing-basic/12134
2013-04-13T18:52:50.199 INFO:t...
- 09:33 AM Bug #4706 (Fix Under Review): kclient: Oops when two clients concurrently write a file
- I have posted two patches, one which resolves the
crash due to an interrupt while waiting and one
that resolves Zhe...
- 08:46 AM Bug #3579: kclient: Use less secure random number generator so we don't consume entropy
- commit 442318d09506d33e811d9d6a7bd2514287df729d
04/13/2013
- 09:46 AM Bug #4722 (Can't reproduce): kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- Top of Call trace:...
04/12/2013
- 11:07 PM Bug #4721: libcephfs tests fail when using ceph-deploy
- I'm able to reproduce this failure.
I'm much less familiar with libceph than I am the libcephfs-java code, so I'm g...
- 05:42 PM Bug #4721: libcephfs tests fail when using ceph-deploy
- and the logs are placed in burnupi06.front.sepia.ceph.com:/home/ubuntu/apr12_cdep_libcephfs/
- 05:41 PM Bug #4721 (Resolved): libcephfs tests fail when using ceph-deploy
- ceph version : 0.60-467-g6b98162-1precise
config.yaml used to reproduce
tamil@ubuntu:~/test_logs_cuttlefish/apr...
- 08:36 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- If Locker::_do_cap_update can't get wrlock for a given client, the client should have no Fw cap. I think we can make ...
- 04:47 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- I'm having difficulty reproducing this at all on current next, but am leaving it churning in the background... :/
...
- 01:36 PM Feature #3242 (In Progress): samba: push plugin upstream
- Sam has been working on this for the last couple days.
- 11:06 AM Bug #3579 (Resolved): kclient: Use less secure random number generator so we don't consume entropy
- 10:13 AM Bug #4660 (Resolved): mds: segfault in queue_backtrace_update
- The commit that hit this segv above looks like it was off of master, whereas the fix went into next. I was able to r...
- 09:30 AM Bug #4694 (Fix Under Review): client: put_snap_realm assert failure
- Pushed wip-4694. Still trying to reproduce this reliably so that I can test the proposed fix.
- 09:26 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- Zheng Yan wrote:
> The Oops is caused by uninitialized req->r_inode
Already tracked down the Oops. Time to sleep,...
- 09:07 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- FYI I just reproduced the problem without interrupt
and it matches what I saw before. (So I don't believe
the inte...
- 07:39 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- I also proposed a fix: [PATCH 1/4] ceph: add osd request to inode unsafe list in advance
- 07:22 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- Zheng I think I have a fix. I'm going to test it first,
but then I'd like to supply it to you to see if it resolves...
- 05:23 AM Bug #4706 (New): kclient: Oops when two clients concurrently write a file
- > Found a potential cause. the request may complete before adding it
> to the unsafe list.
I think that not being...
- 12:09 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- The Oops is caused by uninitialized req->r_inode
- 07:35 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- The fix for writepages race is easier than I thought, patch is attached.
- 01:08 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- Found 5 bugs, fixed 4. The remaining one is a race between truncate and writepages. Truncate message from MDS can cha...
04/11/2013
- 08:26 PM Bug #4714 (Duplicate): kclient: ceph_sync_{read,write} only accept single buffer.
- So readv and writev are broken for SYNC IO
- 07:28 PM Bug #4713: mds: hang related to access from two clients
- I discovered this while trying to reproduce the issue
in http://tracker.ceph.com/issues/4706.
I documented it the...
- 07:24 PM Bug #4713 (Resolved): mds: hang related to access from two clients
- 06:31 PM Bug #4706: kclient: Oops when two clients concurrently write a file
- This crash looks a little bit familiar to me, and I think
I created a bug for it, but at the moment I can't find it.... - 05:52 PM Bug #4706: kclient: Oops when two clients concurrently write a file
- OK, well I believe I have reproduced the problem.
I did this on two nodes simultaneously:
dd if=/dev/zero of=...
- 09:23 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- Yes, the testing branch of ceph-client. The hint to trigger the Oops is multiple clients writing data to a file at the same ti...
- 08:52 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- Well, I unfortunately got the same problem using
the "bobtail" branch.
Specifically what I'm doing:...
- 08:15 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- Well that's interesting.
I haven't been working with the ceph file system much so
I'm not sure what to expect. B...
- 07:43 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- > the request may complete before adding it to the unsafe list.
That looks like a reasonable explanation to me. A... - 06:28 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- ...
- 05:56 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- It is a new issue in the sync write path, nothing to do with cap revoke. Alex has made quite a lot of changes in that...
- 05:01 AM Bug #4706: kclient: Oops when two clients concurrently write a file
- Them doing a sync write is probably correct as their concurrency is being managed by the MDS now, and they aren't goi...
- 06:06 PM Bug #3637 (In Progress): client: not issuing caps for with clients doing shared writes
- Since I apparently forgot to mention it here, this has nothing to do with #4489; I just pattern-matched a little too ...
- 09:09 AM Bug #4644 (Resolved): mds crashing after upgrade from 0.58 to 0.60
- Merged into next as of commit:d777b8e66b2e950266e52589c129b00f77b8afc0 (Thanks Sam!).
- 02:25 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- So, patch tested; the mds is running fine now. Thanks!
- 02:18 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- The last patch seems to work. At least the mds doesn't crash anymore. Also df reports non-bogus values.
I'll add this patch to gen...
- 12:14 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- Let me know if I can test patches for you! :)
- 09:06 AM Bug #4451 (Resolved): client: Ceph client not releasing cap
- Merged into next via commit:e32849c4eef2f5d911288aabeac0a6967b1e6ae4
I'm electing not to backport this despite its...
- 08:16 AM Fix #4708 (Rejected): MDS: journaler pre-zeroing is dangerous
- See http://pastebin.com/NJd0UCfF
At first glance it looks like there's a short and a missing log object, and then ...
- 08:15 AM Bug #4105: mds: fix up the Dumper
- Promoting this to high as it can be so useful for gathering important debug data; it would be nice to have done befor...
04/10/2013
- 11:52 PM Bug #4706 (Resolved): kclient: Oops when two clients concurrently write a file
- ...
- 08:31 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- The code looks good.
- 01:10 PM Bug #4644 (Fix Under Review): mds crashing after upgrade from 0.58 to 0.60
- Hurray, I did manage to reproduce so I guess I just missed before, and indeed it works with that patch and fails with...
- 12:38 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- I'm having trouble reproducing this bug, but I'm probably not going through the right steps. A patch that I think sho...
- 12:20 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- If you have some patch that we can test, I'd be glad. =)
- 10:27 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- Ah, this looks to be less bad than I thought — the (struct_v == 2) check should be (struct_v <= 2) is all, from the s...
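The versioning fix described above can be illustrated with a toy decoder (this is not Ceph's actual session_info_t wire format, just a sketch): a decode that insists on struct_v == 2 rejects maps written by older code, while handling every version up to 2 keeps old payloads readable.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

struct SessionInfo {
  uint8_t struct_v;
  uint32_t value;
};

// Toy decode: byte 0 is struct_v, the rest is a version-dependent payload.
// The reported bug amounts to writing `struct_v == 2` here, which chokes
// on structures encoded by an older release.
SessionInfo decode(const std::vector<uint8_t>& buf) {
  SessionInfo s{buf.at(0), 0};
  if (s.struct_v < 1 || s.struct_v > 2)
    throw std::runtime_error("unsupported version");
  if (s.struct_v == 1) {
    // v1 stored a 16-bit value...
    s.value = buf.at(1) | (buf.at(2) << 8);
  } else {
    // ...v2 widened it to 32 bits.
    s.value = buf.at(1) | (buf.at(2) << 8) | (buf.at(3) << 16) |
              (uint32_t(buf.at(4)) << 24);
  }
  return s;
}
```

The general rule this illustrates: version checks in decoders should accept a range ("everything up to what I know how to read"), not a single value.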
- 09:03 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- update directly from IRC, as alexxy is still having registration issues:
<alexxy> joao: upgrade was from version 0...
- 09:11 AM Bug #3579 (Fix Under Review): kclient: Use less secure random number generator so we don't consume entropy
- Patches sent to the mailing list and pushed to wip-3579.
- 09:07 AM Bug #4569: ceph-mds: segfault
- It looks like this fix didn't make it into 0.60. See #4696.
- 09:06 AM Bug #4696: MDS Crashes with Segmentation fault near Objecter::handle_osd_op_reply
- Oh you're using 0.60. Looks like that commit didn't make it into the 0.60 release. It will be fixed in the next one!
- 09:04 AM Bug #4696 (Duplicate): MDS Crashes with Segmentation fault near Objecter::handle_osd_op_reply
- This is a duplicate of #4569. It's fixed in 0.60 if you're willing to upgrade.
- 06:37 AM Bug #4696 (Duplicate): MDS Crashes with Segmentation fault near Objecter::handle_osd_op_reply
- Limited logs at http://goo.gl/VAIFh...
- 05:23 AM Bug #4679 (In Progress): ceph: hang while running blogbench on mira nodes
- I reproduced a hang, it is an 'i_mutex + cap revoking' deadlock....
- 12:58 AM Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
- For xattrs, there is no difference between symbol links and regular file. For setattr, I think the only difference is...
04/09/2013
- 07:49 PM Bug #4451: client: Ceph client not releasing cap
- Please review again based on the latest changed pushed to wip-4451.
- 04:27 PM Bug #4451: client: Ceph client not releasing cap
- Does this need more review or just testing? (I ask because I notice you've got two reviewed-by tags on it, although I...
- 08:48 AM Bug #4451: client: Ceph client not releasing cap
- Thanks Yan for fixing up that patch and testing it out. The inode check was just cruft from the previous changes, an...
- 06:00 AM Bug #4451: client: Ceph client not releasing cap
- After removing the path_is_mine check, MDCache::parallel_fetch_traverse_dir() needs to skip non-auth dirfrags. The modif...
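The skip-non-auth change can be sketched as a simple filter. DirFrag and frags_to_fetch are hypothetical stand-ins for CDir and the real traversal, which track authority through much richer state; the point is only that non-auth frags are passed over rather than fetched (or asserted on).

```cpp
#include <vector>

// Hypothetical dirfrag with an auth flag; the real CDir derives this
// from its authority, not a plain bool.
struct DirFrag {
  int id;
  bool is_auth;
};

// Sketch of the traversal: collect only the frags this MDS is
// authoritative for, skipping the rest.
std::vector<int> frags_to_fetch(const std::vector<DirFrag>& frags) {
  std::vector<int> out;
  for (const auto& f : frags) {
    if (!f.is_auth)
      continue;  // non-auth dirfrag: someone else's responsibility
    out.push_back(f.id);
  }
  return out;
}
```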
- 06:34 PM Bug #4644 (In Progress): mds crashing after upgrade from 0.58 to 0.60
- That shouldn't be a problem for v0.58; it included version 2 session_info_t. You sure that's the version you upgraded...
- 06:18 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- The 26th byte of Norbert's sessionmap is 1. If I'm not wrong, it's struct_v for session_info_t. But the oldest versio...
- 10:58 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- alexxy's sessionmap doesn't look anything like a sessionmap should; this won't fix his issue. Norbert's is at least s...
- 06:20 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- alexxy on IRC is reporting that the patch doesn't work. He would have provided his report himself, but it appears th...
- 04:13 PM Bug #4618 (Resolved): Journaler: _is_readable() and _prefetch() don't communicate correctly
- Merged into next in commit:8eb5465c10840d047a894d1a4f079ff8b8d608b5. This would apply to bobtail as well if we decide...
- 03:12 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- Not off-hand, but I haven't spent any time thinking about it yet. This one could be differences between how aggressiv...
- 03:03 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- We've only seen a certain set of errors at the mds with the kernel client (this one and #4660 - although they may be ...
- 02:57 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- *sigh* Yep...
I've marked this as an MDS issue for now, but it could be a broader protocol change or something as ... - 02:45 PM Bug #4679 (Rejected): ceph: hang while running blogbench on mira nodes
- I re-ran the blogbench test 10 times using the "bobtail"
branch of ceph and never saw a hang.
I'm going to call t...
- 12:13 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- I got another hang without any debug info being dumped
from the MDS. This time I just abandoned it. I'm about
to ...
- 02:50 PM Bug #4694 (Resolved): client: put_snap_realm assert failure
- ...
- 11:04 AM Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
- I'm actually not sure how the symlink stuff is represented in our kernel client or the VFS — do these functions handl...
- 08:31 AM Bug #4660 (In Progress): mds: segfault in queue_backtrace_update
- 08:30 AM Bug #4660: mds: segfault in queue_backtrace_update
- Alex hit the same segfault with the next branch yesterday, looks like the commit 3cdc61ec doesn't fix this bug. The ...
04/08/2013
- 08:32 PM Bug #4680 (Closed): mds: log possibly not trimming
- 2013-03-28 10:27:35.154461 7f1fc96b8700 10 mds.0.log trim 2 / 30 segments, 10 / -1 events, 0 (0) expiring, 0 (0) expi...
- 10:32 AM Bug #4680: mds: log possibly not trimming
- Yeah, it's not a generic never trimming; just not certain about this one. It could also be fine and just that there's...
- 10:27 AM Bug #4680: mds: log possibly not trimming
- I've seen it trim logs in the tests I've been running, but that's with mds_log_segment_size=16K and mds_log_max_segme...
- 10:04 AM Bug #4680 (Closed): mds: log possibly not trimming
- Apparently there are a lot of old files showing up in the log replay, and I noticed previously on a different issue t...
- 08:20 PM Bug #4644 (Fix Under Review): mds crashing after upgrade from 0.58 to 0.60
- there is a typo in session_info_t::decode
- 08:04 PM Bug #4451: client: Ceph client not releasing cap
- Greg Farnum wrote:
> Although I think the MDS would need to have the inode in cache for that to happen — it would ha...
- 10:59 AM Bug #4451: client: Ceph client not releasing cap
- Zheng Yan wrote:
> "Regarding the cap export, is it possible that the client has a cap that it thinks belongs to the...
- 09:43 AM Bug #4451: client: Ceph client not releasing cap
- "Regarding the cap export, is it possible that the client has a cap that it thinks belongs to the mds, but the mds do...
- 09:13 AM Bug #4451: client: Ceph client not releasing cap
- "After removing the path_is_mine check in Server::handle_client_reconnect(), I think we should also call mdcache->rej...
- 04:41 PM Bug #4685 (Can't reproduce): BUG: unable to handle kernel NULL pointer dereference at
- 0.56.4 ceph, 3.8 kernel...
- 02:22 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- It looked very promising. 4 successful passes, but the
last one hung again. This time there were two blogbench
ta...
- 12:26 PM Bug #4679: ceph: hang while running blogbench on mira nodes
- One pass succeeded, so it's looking good.
I'll let it run 5 times and if all are successful, I'll just
close this...
- 11:56 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- I talked with Sam Lang who said I should try again with
mds debugging on. That led to more info getting dumped
on ...
- 11:01 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- ...
- 10:49 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- Actually, the other common theme (maybe more important)
is the involvement of an in-progress ceph_setattr() call.
...
- 10:40 AM Bug #4679 (In Progress): ceph: hang while running blogbench on mira nodes
- Unfortunately it looks like I've reproduced the problem
with my patches. The common theme is ceph_aio_write(), so
...
- 10:04 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- I ran those tests a few times with the testing branch and
the problem did not show up. I reduced the test to just
...
- 05:49 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- Here is an excerpt of the yaml file driving the
tests, leading up to the blogbench run:...
- 05:29 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- Here are the versions of ceph and teuthology I'm using
while running these tests:
ceph
f5ba0fb mon: make 'osd cr...
- 05:26 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- Here is a log of the commits in place during these
tests. (I know, quite a few...) The last one is
the current te...
- 05:24 AM Bug #4679: ceph: hang while running blogbench on mira nodes
- Here is an excerpt of the stack trace generated using:
echo t > /proc/sysrq-trigger
[31482.585095] blogbench....
- 05:21 AM Bug #4679 (Resolved): ceph: hang while running blogbench on mira nodes
- I have seen this only on mira nodes, now twice on two
consecutive attempts. I've run the same set of tests
with th...
- 11:02 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Said he could look at this for me today.
- 09:29 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Heh, no; that was supposed to be a 10. Re-pushed; thanks!
- 09:34 AM Bug #3579 (In Progress): kclient: Use less secure random number generator so we don't consume ent...
- 07:16 AM Bug #4660 (Fix Under Review): mds: segfault in queue_backtrace_update
- Pushed a fix to wip-4660. The mdr was getting deleted before we queued the backtrace for update, so mdr->ls was inva...
04/07/2013
- 01:46 AM Bug #1878 (Fix Under Review): ceph.ko doesn't setattr (lchown, utimes) on symlinks
- ceph_symlink_iops does not have getattr/setattr and xattr-related methods
- 01:25 AM Bug #4241 (Duplicate): SELinux fails because it can't set xattrs
- This is the same problem as #1878 (ceph_symlink_iops doesn't have setattr method)
04/06/2013
- 11:30 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Confirmed, I tested with my system, and the journal-check can load the journal.
But, there is a line in commit:
<...
04/05/2013
- 04:02 PM Bug #4618 (Fix Under Review): Journaler: _is_readable() and _prefetch() don't communicate correctly
- There were a couple related bugs which prevented this from working right. I don't guarantee it's bug-free now, but th...
- 04:32 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Can I continue testing cephfs, or can you make the fix for this bug quickly so I can verify it on my system?
- 03:37 PM Bug #4451: client: Ceph client not releasing cap
- After removing the path_is_mine check in Server::handle_client_reconnect(), I think we should also call mdcache->rejo...
- 10:25 AM Bug #4451 (Fix Under Review): client: Ceph client not releasing cap
- Pushed a proposed fix to wip-4451. The fix is to not adjust the conditional for checking if an inode is auth or not....
- 10:26 AM Bug #4660 (In Progress): mds: segfault in queue_backtrace_update
- 09:37 AM Bug #4660: mds: segfault in queue_backtrace_update
- No wonder this wasn't showing up in my bug queue!
- 08:20 AM Bug #4660 (Resolved): mds: segfault in queue_backtrace_update
- ...
- 09:36 AM Bug #4565 (Can't reproduce): MDS/client: issue decoding MClientReconnect on MDS
- I've had this running for more than 24 hours and it still hasn't reproduced. I'll let it keep going, but I don't beli...
04/04/2013
- 11:15 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- sessionmap, command is rados --pool=metadata get mds0_sessionmap /tmp/sessionmap (without -o) :)
- 11:07 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- logfile with debug mds = 20...
- 05:16 PM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- I guess this bug was introduced by commit 0bcf2ac081b8386fe00387b654aa5676a7902c80...
- 11:29 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- I got a SessionMap from alexxy and it somehow has a bad version number attached to it. More importantly when I hexdum...
- 10:36 AM Bug #4644 (Need More Info): mds crashing after upgrade from 0.58 to 0.60
- It failed to decode the SessionMap properly here, but I can't tell why and the code hasn't changed at all between tho...
- 03:34 AM Bug #4644: mds crashing after upgrade from 0.58 to 0.60
- alexxy @ IRC also hit this issue. Attaching log.
- 02:37 AM Bug #4644 (Resolved): mds crashing after upgrade from 0.58 to 0.60
- after upgrade from 0.58 to 0.60, one mds is crashed and still crashing directly after start...
- 03:03 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- 02:18 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Okay, so the next entry is >40MB and we have 38MB in our read buffer. I'm not certain, but I think our use of "temp_f...
- 12:54 PM Bug #4618 (In Progress): Journaler: _is_readable() and _prefetch() don't communicate correctly
- 12:53 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Okay, there's not a lot there so apparently it doesn't have as much data as it thinks it needs in order to read the n...
04/03/2013
- 06:09 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Greg Farnum wrote:
> Are those logs posted somewhere? That indicates it's waiting to be allowed to read the stuff pa...
- 05:41 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Are those logs posted somewhere? That indicates it's waiting to be allowed to read the stuff past where it stopped, b...
- 04:50 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- just a guess: with journaler debug, there is a line:...
- 03:08 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- you said "My off-hand guess is that something isn't getting cleaned up properly with the slave requests, which leads ...
- 03:07 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- I think of it every time I hear "stuck in replay", that's all. I haven't looked at the logs or anything.
- 02:59 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Sorry, but I'm a bit lost about why that might apply here. Are you just speculating or did something in the logs look...
- 02:57 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- see commit 7e04504d3ed119bb43a4eb99ca524b39dc3696bc. But the bug should just make replay slow.
- 02:38 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- here is a logcut with "debug journaler = 20": http://pastebin.com/nrzJg87E
- 01:59 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Yeah, that all looks good too. My off-hand guess is that something isn't getting cleaned up properly with the slave r...
- 01:52 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Don't forget #3351.. if the osd returns a short read on an object before the end of the journal, the Journaler replay...
- 01:35 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- if you tell me (here or irc) where to add new debug/assert lines, we can hunt down this bug.
- 01:15 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Thanks. (For future onlookers, the summary of those links is that everything is perfectly normal and as it should be,...
- 01:02 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Here is the status: http://pastebin.com/x1XEvuWc
Here is the config dump: http://pastebin.com/YTFbY5jW
- 10:09 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- The MDS maintains a journal that it writes metadata into before committing the aggregated updates into the actual ino...
- 02:01 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Greg Farnum wrote:
> Sorry, I mean the mds journal, not the debug logs, when referring to the size.
So the mds jo...
- 03:43 PM Bug #3266 (Resolved): "ceph mds tell 0 dumpcache /etc/passwd" is not cool
- Merged in with commit:32aac00c7043aa1564272697879b1c626814b143
- 03:33 PM Bug #3266 (Fix Under Review): "ceph mds tell 0 dumpcache /etc/passwd" is not cool
- wip-3266
- 03:02 PM Bug #4582 (Resolved): mds: Client hang on fsstress with mds_thrasher
- 09:41 AM Bug #4582 (Fix Under Review): mds: Client hang on fsstress with mds_thrasher
- With the latest changes to the mds merged to master, and the fix from #4637, I was able to get a successful run of fs...
- 01:35 PM Bug #4489 (New): ceph fs hangs on file stat
- Never mind, forgot the other one involved max size changes.
- 01:05 PM Bug #4489 (Duplicate): ceph fs hangs on file stat
- All right; that should be more stable for you. :)
Thanks for the steps to reproduce. I'm going to tentatively mark...
- 01:27 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- Starting to look at this now.
- 01:04 PM Bug #3637: client: not issuing caps for with clients doing shared writes
- #4489 is probably a duplicate of this and has steps to reproduce, if we need alternate angles of attack. (And we shou...
- 12:56 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- [Meant to post this yesterday but I guess I forgot to hit submit.]
Sadly, this test didn't slurp up any logs, so all...
- 12:53 PM Bug #4637 (Resolved): mds: standby takeover stuck in rejoin
- Thanks. Don't you ever sleep? :)
Merged into master in commit:0d6ddd926432821842a7e40fdb78d793ab0737bb
- 12:37 PM Bug #4637: mds: standby takeover stuck in rejoin
- Greg's fix looks good, sorry for the bug.
- 10:45 AM Bug #4637: mds: standby takeover stuck in rejoin
- Pushed that to wip-no-fail-whoami-4637. Sage, Yan, care to check it out? :)
- 10:33 AM Bug #4637: mds: standby takeover stuck in rejoin
- Can you try this patch instead, and see if that works? (If it does I'll want a review from Sage or Yan; it looks okay...
- 08:43 AM Bug #4637 (Fix Under Review): mds: standby takeover stuck in rejoin
- Pushed a fix to wip-4637.
- 08:40 AM Bug #4637 (Resolved): mds: standby takeover stuck in rejoin
- With current master, with one active mds and one standby, if the active fails, the standby gets stuck in rejoin while...
- 12:44 PM Bug #4638 (Duplicate): client: fsstress and mds_thrasher hangs client on unmount
- This is the same problem as #4451 (client inodes getting disconnected on unmount).
- 09:42 AM Bug #4638 (Duplicate): client: fsstress and mds_thrasher hangs client on unmount
After a successful run of fsstress and mds_thrasher, the client hangs on unmount and eventually returns EBUSY.
04/02/2013
- 11:24 PM Bug #1535 (Resolved): concurrent creating and removing directories crashes cmds
- I think this has been fixed by commit 00025462
- 10:48 PM Bug #1945: blogbench hang on caps
- Sorry for the delay, I didn't notice the notification. I fixed several bugs that may cause hangs of this type, but I...
- 07:24 PM Bug #4489: ceph fs hangs on file stat
- Hm, snapdirname is something obfuscated (but has no use, actually).
I've got the same error one more time, so I bel...
- 06:14 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Sorry, I mean the mds journal, not the debug logs, when referring to the size.
- 05:12 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Greg Farnum wrote:
> Strange, it looks like you have an MDS log of about 1236MB, which is...large. What config optio...
- 04:28 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- Strange, it looks like you have an MDS log of about 1236MB, which is...large. What config options are you setting?
... - 12:36 PM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- I changed back to max_mds 1. same result:...
- 09:42 AM Bug #4618: Journaler: _is_readable() and _prefetch() don't communicate correctly
- I'll check my assumptions today (already downloaded the logs), but with multiple active MDSes this doesn't warrant a ...
- 07:14 AM Bug #4618 (Resolved): Journaler: _is_readable() and _prefetch() don't communicate correctly
- The Journaler has mechanisms to try and read extra data if an event is large enough that it exceeds the current prefe...
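The prefetch bookkeeping this describes can be sketched roughly like this. This is a simplified model for illustration only, with assumed field names loosely modeled on the real Journaler's read/received/requested positions; it is not the actual src/osdc/Journaler.cc code:

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of journal-read bookkeeping (hypothetical names).
// [read_pos, received_pos) is what the read buffer holds;
// requested_pos marks how far reads have already been issued.
struct JournalReader {
  uint64_t read_pos = 0;         // next byte the replayer will consume
  uint64_t received_pos = 0;     // end of data already buffered
  uint64_t requested_pos = 0;    // end of data already requested
  uint64_t fetch_len = 4 << 20;  // assumed default prefetch window

  // An entry is readable only if its length header plus payload
  // fit entirely within the buffered region.
  bool is_readable(uint64_t next_entry_len) const {
    return read_pos + sizeof(uint32_t) + next_entry_len <= received_pos;
  }

  // If the next entry is larger than the prefetch window, the request
  // must be widened past fetch_len; failing to do so leaves
  // is_readable() false forever, which is the stall in this ticket.
  void prefetch(uint64_t next_entry_len) {
    uint64_t target = read_pos + sizeof(uint32_t) + next_entry_len;
    if (target > requested_pos)
      requested_pos = target;  // issue reads up to the full entry
    else if (requested_pos < read_pos + fetch_len)
      requested_pos = read_pos + fetch_len;
  }
};
```

In the failure described below (a >40MB entry with only 38MB buffered), the readable check fails and the prefetch side must be the one to widen the request.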
- 02:48 PM Bug #4619 (Resolved): mds: anchortable hangs on new cluster
- Merged and pushed to master in commit:3842ff7d677bae98462f7d050f5fda9d85f6273d
- 02:20 PM Bug #4619: mds: anchortable hangs on new cluster
- Code looks good, sorry for the bug!
- 01:06 PM Bug #4619 (Fix Under Review): mds: anchortable hangs on new cluster
- recovery_done() breaks on a fresh machine because of the populate_mydir() ordering. The problem is that both recover...
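The ordering problem described here can be sketched as a gate that fires only once both startup steps have completed. This is an assumed shape for illustration, not the actual patch:

```cpp
// Hypothetical sketch: neither recovery_done() nor populate_mydir()
// can assume the other has already run on a fresh cluster, so the
// follow-on step fires from whichever completion path runs last.
struct StartupGate {
  bool recovery_done = false;
  bool mydir_populated = false;
  bool ready_sent = false;

  // Call from either completion path; returns true exactly once,
  // only after both steps have finished.
  bool maybe_send_ready() {
    if (recovery_done && mydir_populated && !ready_sent) {
      ready_sent = true;
      return true;  // safe to send the "ready" message here
    }
    return false;
  }
};
```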
- 09:52 AM Bug #4619 (In Progress): mds: anchortable hangs on new cluster
- Sage said he'd look at the double-send as well.
- 09:27 AM Bug #4619 (Resolved): mds: anchortable hangs on new cluster
- commit:968c6c0c9408b33904041e5ddbd9ea738e831713
- 09:13 AM Bug #4619: mds: anchortable hangs on new cluster
- I think this isn't correct. If we restart the table server MDS, it will send two ready messages to the table client. ...
- 09:02 AM Bug #4619: mds: anchortable hangs on new cluster
- Code looks good, assuming the tests run.
Sorry about that! :(
- 08:15 AM Bug #4619 (Fix Under Review): mds: anchortable hangs on new cluster
- wip-4619
- 08:14 AM Bug #4619 (Resolved): mds: anchortable hangs on new cluster
- 02:30 PM Bug #4621 (Rejected): failed pjd chown/00.t 124
- Okay, all symlink attempts that made it to the MDS were successes, and I can't find any failed ceph-fuse symlink/ll_s...
- 01:59 PM Bug #4621: failed pjd chown/00.t 124
- Sorry, not an lchown, just a symlink create.
- 01:29 PM Bug #4621: failed pjd chown/00.t 124
- Well, it's always an adventure to figure out which one is busted, but it looks to be an lchown on a symlink failing. ...
- 09:30 AM Bug #4621 (Rejected): failed pjd chown/00.t 124
- 2013-04-02T09:04:34.029 INFO:teuthology.task.workunit.client.0.out:../pjd-fstest-20090130-RC-open24/tests/chown/00.t ...
- 02:27 PM Feature #4630 (New): make lchown work in ceph-fuse for pjd
- pjd doesn't believe that ceph-fuse supports lchown. Maybe this is pjd's fault; maybe it's ours. Figure out why so tha...
- 11:49 AM Documentation #2206: Need a control command to gracefully shutdown an active MDS prior to planned...
- This is partially documented by 0c16b31db7a5ed72a9c306ae91b191c326d0776a on github.
04/01/2013
- 03:18 PM Bug #3266: "ceph mds tell 0 dumpcache /etc/passwd" is not cool
- Before anybody embarks on solving this, I assume there's a standard way to handle this by outlawing certain kinds of ...
- 01:23 PM Bug #2657: kclient: direct io write larger than 8MiB fails
- in testing, there is now a test workunit
- 01:23 PM Bug #2657 (Resolved): kclient: direct io write larger than 8MiB fails
- 01:22 PM Bug #4434 (Resolved): looping waiting for quorum after upgrade
- Whoops!
- 01:14 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- I'll look into the code around this today.
- 11:03 AM Bug #4489: ceph fs hangs on file stat
- Why are you specifying the snapdirname to that weird value when mounting this?
- 11:00 AM Bug #4405: MDCache::populate_mydir can loop forever
- This dump has 1063591 inodes in the cache, of which only 122104 are non-stray. That doesn't seem quite right.
I do...
- 09:37 AM Bug #4590 (Resolved): ceph-fuse: fsx fails with 'client oc = false'
- commit:c01e2e42f368ca003e03debe9a7bd5f12eb79d2c
03/31/2013
- 10:33 AM Bug #4601 (Can't reproduce): symlink with size zero
- Somehow I got into a situation in which a number of symlinks, all of them created and later modified at about the sam...
03/29/2013
- 09:05 PM Bug #4590 (Resolved): ceph-fuse: fsx fails with 'client oc = false'
- ...
- 03:22 PM Bug #4582: mds: Client hang on fsstress with mds_thrasher
- Oh, yeah, we can do the same in the userspace client. I'll do that and re-push. Thanks Yan!
- 03:12 PM Bug #4582: mds: Client hang on fsstress with mds_thrasher
- FYI:
The kclient deals with this case by calling wake_up_session_caps(). It just clears i_wanted_max_size/i_requested...
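The idea can be sketched in userspace terms as follows. The struct and field names here are hypothetical and only mirror the kernel client's i_wanted_max_size/i_requested_max_size fields; this is not the real code:

```cpp
#include <cstdint>

// Hypothetical sketch: on session reset, forget the outstanding
// max_size request so the next write path re-issues it, instead of
// waiting forever on a request the recovered MDS never saw.
struct InodeCaps {
  uint64_t wanted_max_size = 0;
  uint64_t requested_max_size = 0;

  // Returns true if a (re-)request must be sent to the MDS.
  bool want_max_size(uint64_t size) {
    if (size <= requested_max_size)
      return false;             // already asked for at least this much
    wanted_max_size = size;
    requested_max_size = size;  // record the outstanding request
    return true;
  }

  // Called when the session to the MDS resets: clear the record so
  // the waiter asks again (the wake_up_session_caps() behavior).
  void on_session_reset() {
    wanted_max_size = 0;
    requested_max_size = 0;
  }
};
```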
- I believe those are okay as truncate size changes should end up actually journaled (as setattrs) so they'll be replay...
- 12:58 PM Bug #4582: mds: Client hang on fsstress with mds_thrasher
- I spent most of this morning figuring out if it made sense to send the full cap (ceph_mds_caps -- and get rid of the ...
- 12:31 PM Bug #4582: mds: Client hang on fsstress with mds_thrasher
- I'm not sure this is wrong, but it's confusing me a bit. I thought that the Client sent all capabilities it holds bac...
- 12:14 PM Bug #4582: mds: Client hang on fsstress with mds_thrasher
- I just pushed wip-4582. Testing it on the fsstress test with mds_thrasher now. I'm not positive this is the right a...
- 11:53 AM Bug #4582 (In Progress): mds: Client hang on fsstress with mds_thrasher
- 11:53 AM Bug #4582 (Resolved): mds: Client hang on fsstress with mds_thrasher
While trying to reproduce #4565, fsstress eventually hangs where the client is waiting for a max size update that t...
- 01:55 PM Feature #4583 (Resolved): libcephfs: add test that kills a client and verifies mds cleans it up
- 01:28 PM Feature #4022 (In Progress): client: qa: test non-cached operation (force sync mode)
- 01:24 PM Fix #4191 (Resolved): qa: multiple mds in nightly (non-failure case)
- 11:31 AM Bug #4578 (Resolved): client: hangs on unlink
- 11:16 AM Bug #4578: client: hangs on unlink
- This patch solves the problem :)
- 12:51 AM Bug #4578: client: hangs on unlink
- yes, patch is also attached
- 11:11 AM Feature #4442 (Resolved): java: add topology API support
- Err, forgot to close. Thanks. ebc3abaf6dc62678f5ef5914862e9d8f216fffbf
- 11:05 AM Feature #4442: java: add topology API support
- I think this already got reviewed and merged, right? Or is there something else we need?
- 11:02 AM Bug #4569 (Resolved): ceph-mds: segfault
- commit:4f8ba0e7756a1b0647867db0e9b5549b3e82f6b1 in master. This wasn't a bug in any released versions, so no backports.
- 10:50 AM Bug #4569: ceph-mds: segfault
- In case it matters at all, the segfault was happening when I was furiously sigterm'n my hung-on-unlink client.
- 10:33 AM Bug #4569: ceph-mds: segfault
- Yep, the problem here is that the Session was created during replay and it never had a Connection associated with it ...
- 10:20 AM Bug #4569: ceph-mds: segfault
- In the logs the session in question is one that failed to reconnect. Was there a different event that caused the MDS ...
03/28/2013
- 08:47 PM Bug #4578 (Resolved): client: hangs on unlink
- Looks like somebody accidentally deleted #4570 (and there's no undelete in Redmine best I can tell), so this ticket w...
- 06:58 PM Feature #4576 (Rejected): java: support ByteBuffer interface for NIO and NIO.2 high-perf I/O
- ByteBuffer interface in NIO avoids needless copying, and is used by NIO.2 and the new VFS infrastructure in Java 7. T...
- 10:21 AM Bug #4569: ceph-mds: segfault
- It looks like the session is getting closed because it's stale, and then killed, but the session->connection field pas...
- 10:00 AM Feature #4354 (In Progress): mds: add an equivalent to the OSD OpTracker
- 07:31 AM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- Update on trying to track this down...running this test in teuthology, I don't hit the same assertion, but I do see t...
03/27/2013
- 09:23 PM Bug #4308 (Won't Fix): ceph-fuse crashed during blogbench test (argonaut)
- this is most likely memory corruption in argonaut's ceph-fuse.
- 09:21 PM Bug #4564 (Resolved): client: Close session doesn't wait for outstanding requests
- 09:09 AM Bug #4564 (Fix Under Review): client: Close session doesn't wait for outstanding requests
- Pushed a fix to wip-4564.
- 07:13 AM Bug #4564 (Resolved): client: Close session doesn't wait for outstanding requests
Ran into another failure related to testing #4451 on the client where the following occurs:
client sends create/...
- 11:45 AM Bug #4569 (Resolved): ceph-mds: segfault
- I started receiving this segfault in ceph-mds with the latest master today....
- 09:35 AM Bug #4565 (Resolved): MDS/client: issue decoding MClientReconnect on MDS
- ...
- 08:26 AM Bug #4539 (Resolved): include/elist.h: 92: FAILED assert(_head.empty()) from MDLog::standby_trim_...
- commit:295c92c
- 07:47 AM Bug #4539 (Fix Under Review): include/elist.h: 92: FAILED assert(_head.empty()) from MDLog::stand...
- Yep. There's no state bit, and the cache is unchanged by the backtrace updates list. The standby mds is free to cle...
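The invariant behind that assert can be sketched as a tiny intrusive list whose head refuses to be destroyed while items remain linked. This is a simplified model for illustration, not the real include/elist.h:

```cpp
#include <cassert>

// Minimal model of the invariant: every item linked into the list
// must be unlinked before the list head is destroyed, or the
// destructor's assert fires (the assert(_head.empty()) in this bug).
struct item {
  item *next = nullptr, *prev = nullptr;
  void remove_myself() {
    prev->next = next;
    next->prev = prev;
    next = prev = nullptr;
  }
};

struct elist {
  item _head;  // sentinel node; list is circular through it
  elist() { _head.next = _head.prev = &_head; }
  ~elist() { assert(empty()); }  // refuse destruction while populated
  bool empty() const { return _head.next == &_head; }
  void push_back(item *i) {
    i->prev = _head.prev;
    i->next = &_head;
    _head.prev->next = i;
    _head.prev = i;
  }
};
```

Under this model, trimming a segment while items are still tracked on its list is exactly what trips the assert.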
- 08:04 AM Bug #4555 (Resolved): The CephFileSystem class is missing the createNonRecursive method
- 0a5175722a8444579715c1871c09c246969e7890