Activity
From 08/09/2011 to 09/07/2011
09/07/2011
- 04:58 PM Bug #1509 (Resolved): cfuse sometimes hangs after unmount
- recent regression, now fixed by commit:fc587d6caa2376f95fe15567bd632a2d4b8bb81f
- 09:54 AM Bug #1425 (Resolved): mds: stuck in prexlock
09/06/2011
- 10:23 PM Bug #1509: cfuse sometimes hangs after unmount
- This is usually caused by leaked inode references. A full client log (debug ms = 1, debug client = 20, debug objectc...
- 12:32 PM Bug #1509 (Can't reproduce): cfuse sometimes hangs after unmount
- After fusermount completes successfully, cfuse did not exit in these runs:
teuthology:~teuthworker/archive/nightly...
- 10:01 PM Bug #1472: cfuse hangs with v0.34
- Any update on this? Were you able to reproduce?
- 12:44 PM Bug #1511 (Closed): fsstress failure with 3 active mds
- Logs are in teuthology:~teuthworker/archive/nightly_coverage_2011-09-05/653...
- 12:40 PM Bug #1510 (Resolved): fsx failure on cfuse
- Logs are in teuthology:~teuthworker/archive/nightly_coverage_2011-09-05/623:...
09/05/2011
- 10:45 AM Bug #1108: Large number of files in a directory makes things grind to a halt
- I've just re-created the cluster I was testing this on, and given a 50G lv to store the ceph logs on, so running ever...
09/02/2011
- 09:04 PM Cleanup #1499 (Resolved): mds: clean up directory layouts
- Rip out all the default_layout stuff and just stick this in the inode_t::layout value. This should remove a lot of a...
- 03:16 PM Bug #1437: cfuse can't change permissions of a file
09/01/2011
- 09:40 PM Bug #1435: mds: loss of layout policies upon mds restart
- Seriously, if we just put it in the layout field, this...
- 09:36 PM Bug #1435: mds: loss of layout policies upon mds restart
- Okay, I see at least one problem.. the IFILE lock state isn't sharing the default_file_layout with other nodes. CIno...
- 03:04 PM Bug #1460 (Resolved): mds: file locks don't work right with 0-length locks
- I updated locktest.c; it now fails before my fix and succeeds afterwards. Hurray!
(Also the task that runs it now ac...
- 10:59 AM Bug #1460 (In Progress): mds: file locks don't work right with 0-length locks
- Oh, I pushed this patch yesterday since it seems to be working. But I'm leaving this bug open until I can clean up th...
- 02:29 PM Bug #1425 (In Progress): mds: stuck in prexlock
- 01:15 PM Bug #1467 (Resolved): cfuse crash during fsx workunit
- This is just a bad assert, fixed by commit:c8c205fa73078c1ee46152ed860084a272867f5e
- 11:52 AM Bug #1467 (In Progress): cfuse crash during fsx workunit
- This wasn't the OSD reply bug - got this crash again today:...
- 12:40 PM Bug #1464 (In Progress): mds crash during shutdown (after trivial_sync workunit on kclient)
- 09:36 AM Bug #1472: cfuse hangs with v0.34
- I was able to run the active mds in debug mode when a hang occurred. This is the log a few seconds before and after ...
- 09:29 AM Bug #1472: cfuse hangs with v0.34
- Can you 'ceph mds tell 0 dumpcache /tmp/foo' and grep out the inode that the open is blocked on?
- 09:06 AM Bug #1472: cfuse hangs with v0.34
- Again stuck in an open:
(gdb) bt
#0 0x00007f7407a33bac in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/x86_...
- 09:04 AM Feature #626: qa: add IOR, rompio, or other parallel workloads suite
- IOR depends on mpi. mpich2 is pretty easy to set up (there's a package).
I think an ior task would need to:
- t...
08/31/2011
- 09:48 PM Bug #1447 (Resolved): mds: does not validate pool IDs in handle_client_set[dir]layout
- 09:45 PM Bug #1437: cfuse can't change permissions of a file
- Sam, is this something you can reproduce? All we should need is a client log.. something like '--log-file foo --log-...
- 06:51 PM Bug #1108 (Closed): Large number of files in a directory makes things grind to a halt
- Anything new here? Large directories aren't a part of our qa yet, but when they are this'll come up...
- 06:23 PM Cleanup #431 (Resolved): mds: clean up inode journaling internal interfaces
- 05:41 PM Bug #1318: directories disappear across multiple rsyncs
- added a workunit misc/multiple_rsyncs.sh to do a couple rsyncs and make sure no additional files are transferred. src...
- 04:58 PM Bug #1467 (Closed): cfuse crash during fsx workunit
- same, I think this was the MOSDOpReply bug
- 03:52 PM Bug #1472: cfuse hangs with v0.34
- At about the time of that last client hang (_open), I do see these messages in the active mds log:
2011-08-31 17...
- 03:28 PM Bug #1472: cfuse hangs with v0.34
- At most 20 processes running at any given time (different instances of the same application) from a single client, re...
- 03:15 PM Bug #1472: cfuse hangs with v0.34
- What does the workload look like?
- 03:12 PM Bug #1472 (In Progress): cfuse hangs with v0.34
- Well, so much for that then.
Are these actually new hangs compared to v0.33? Newly-noticed but possibly present be...
- 03:05 PM Bug #1472: cfuse hangs with v0.34
- I have verified that this hang is not due to osds crashing. With all osds running, and all pgs active+clean, I still...
- 02:31 PM Bug #1472: cfuse hangs with v0.34
- Well with 3 OSDs down you probably lost access to some objects?
It probably shouldn't hang all other requests on t...
- 02:26 PM Bug #1472: cfuse hangs with v0.34
- Only 3 osds crashed though. It seems like there should be other PGs on other osds that are still accessible, unless ...
- 02:14 PM Bug #1472 (Duplicate): cfuse hangs with v0.34
- Yeah, this is probably due to dead OSDs, so the client's unable to find anywhere to read the data from and is just wa...
- 01:34 PM Bug #1472: cfuse hangs with v0.34
- FYI: These hangs may have just been caused by osd failures (see #1473). I will update if this issue persists.
- 12:10 PM Bug #1472 (Can't reproduce): cfuse hangs with v0.34
- I see hangs with cfuse that appear to be at random (random requests to servers). Here are the backtraces of some cfu...
- 09:19 AM Bug #1367 (Resolved): cfuse and mon crash after dbench
08/30/2011
- 11:20 PM Bug #1467 (Resolved): cfuse crash during fsx workunit
- Logs are in teuthology:~teuthworker/archive/nightly_coverage_2011-08-30/276...
- 11:04 PM Bug #1464 (Can't reproduce): mds crash during shutdown (after trivial_sync workunit on kclient)
- Logs are in teuthology:~teuthworker/archive/nightly_coverage_2011-08-30/293...
- 02:32 PM Bug #1460 (Resolved): mds: file locks don't work right with 0-length locks
- Right now it just doesn't handle them properly. See, eg ...
- 01:25 PM Bug #1456 (Resolved): cfuse: crash in snaptest2 during full snaps run
08/29/2011
- 01:51 PM Bug #1456 (Resolved): cfuse: crash in snaptest2 during full snaps run
- cfuse seems to be failing on master with the following config:
roles:
- - mon.0
- mds.0
- osd.0
- - mon.1
...
- 09:17 AM Bug #1367 (In Progress): cfuse and mon crash after dbench
- ok, just hit the top one after 35 runs.
08/25/2011
- 09:19 PM Bug #1444 (Resolved): client: crash on flush completion under blogbench
- 08:48 AM Bug #1444 (Resolved): client: crash on flush completion under blogbench
- ...
- 09:19 PM Bug #1391 (Resolved): client: crash on std::string in insert_trace()
- 03:57 PM Bug #1318: directories disappear across multiple rsyncs
- Looking at these symptoms again, I wonder if this could have been a result of the path_traverse changes we were makin...
- 01:08 PM Bug #1446 (Resolved): cephfs: pool option doesn't work
- Fixed in commit:65b30507590e9ef47623b7bfe1e672aba01ce823
- 09:14 AM Bug #1446 (Resolved): cephfs: pool option doesn't work
- While testing the pool layout option, it's accepted, but reading back the pool it's still located in pool 0.
This ...
- 01:01 PM Bug #1405 (Resolved): cephfs: shouldn't have to specify all layout options
- Fixed in userspace commit:b8267492551f1adc5e0079a670b20f6180de18f0
and kernel client commit:7c296cadd05d28329e595b...
- 08:14 AM Bug #1405: cephfs: shouldn't have to specify all layout options
- The in-kernel code rejects any layout that doesn't set the stripe unit (and if you set the object_size it makes sure ...
- 12:51 PM Feature #1448 (Resolved): test hadoop on sepia
- - set it up on some sepia nodes (8?)
- do some basic testing of ceph vs hdfs
from doug cutting:...
- 12:45 PM Bug #1368 (Can't reproduce): mds crash after blogbench on cfuse
- 12:45 PM Bug #1367 (Can't reproduce): cfuse and mon crash after dbench
- 10:01 AM Bug #1447 (Resolved): mds: does not validate pool IDs in handle_client_set[dir]layout
- Yep, there's no checking that they're valid mds data pools or even that they exist!
08/24/2011
- 10:27 PM Bug #1405: cephfs: shouldn't have to specify all layout options
- This is going to be related to #1446, obviously. I'll take both.
- 05:43 PM Bug #1391: client: crash on std::string in insert_trace()
- 03:11 PM Bug #1391: client: crash on std::string in insert_trace()
- Hmm, any other hints on what workloads might trigger this? I'm not getting anything from valgrind or my workloads.
...
- 11:23 AM Bug #1391 (New): client: crash on std::string in insert_trace()
- Reopening this...
- 03:10 PM Bug #1442 (Resolved): client: non-empty ObjectSet on last inode->put()
- fixed by commit:f2381f97dea9f3563897857c0a0482281b449b61
- 12:56 PM Bug #1442 (Resolved): client: non-empty ObjectSet on last inode->put()
- commit:e9b739f8dd39f3373dd0869a0fd5436350e1e3f3...
- 01:12 PM Bug #1429 (Resolved): cfuse assert failed assert(diri->dn_set.size() < 2)
- fixed by commit:6c6fa6dffddb6f388d03ca59e95844ddf845f491
08/23/2011
- 11:33 AM Bug #1437 (Can't reproduce): cfuse can't change permissions of a file
- I've hit a case where I cannot change the permissions of a script to 755.
> chmod 777 ./extract_full.sh
> ls -...
- 10:33 AM Bug #1391: client: crash on std::string in insert_trace()
- I've been seeing a segfault in a similar spot regularly, but it's been hard to reproduce. The segfault is always in...
08/22/2011
- 04:12 PM Bug #1435 (Resolved): mds: loss of layout policies upon mds restart
- Cluster running ceph 0.33 + patch to add support for “ceph mds add_data_pool”.
I set up layout policies for variou...
- 04:09 PM Bug #1433 (Resolved): mds: assert in path_traverse
- Fixed by commit:b03a1841b4b08c82fa37a45dc31a0c0255949235
- 03:10 PM Bug #1433: mds: assert in path_traverse
- Yep! Need it to wait in that case. Pushing as soon as I write some documentation for path_traverse.
- 02:46 PM Bug #1433: mds: assert in path_traverse
- Looks like it's bailing out because another client is holding a lock, so the (existing) null dentry isn't readable. R...
- 01:44 PM Bug #1433 (Resolved): mds: assert in path_traverse
- While testing my teuthology lock test: ...
- 03:16 PM Bug #1366 (Can't reproduce): mds segfault
- 02:05 PM Bug #1428 (Resolved): MDS: Load and pin stray dirs in memory
- 10:28 AM Bug #1428 (Resolved): MDS: Load and pin stray dirs in memory
- MDCache::populate_mydir() already does some of this; we also need it to load each frag and set the STICKY flag on the...
- 01:26 PM Bug #1429: cfuse assert failed assert(diri->dn_set.size() < 2)
- Hopefully -- we'll have to reproduce with logging and check it out in more detail. My concern is that it may be revea...
- 01:23 PM Bug #1429: cfuse assert failed assert(diri->dn_set.size() < 2)
- Sounds like an easy fix. For dirs it should just unlink the old link in insert_trace (or whatever it is).
- 01:19 PM Bug #1429: cfuse assert failed assert(diri->dn_set.size() < 2)
- There's probably something wonky going on with the way the client is handling moved directories -- that assert is bec...
- 11:28 AM Bug #1429 (Resolved): cfuse assert failed assert(diri->dn_set.size() < 2)
- This assertion happens when a directory is moved on one client, and then the other client changes to that directory. ...
08/21/2011
- 09:34 PM Bug #1425 (Resolved): mds: stuck in prexlock
- See mds.a.log on sepia78.
- setattr request starts locking
- auth_pins auth stuff
- rdlocks parent dirs, does no...
08/19/2011
- 09:14 PM Bug #1367: cfuse and mon crash after dbench
- nuked and unlocked nodes, nothing useful there.
- 04:38 PM Bug #1417 (Resolved): mds: failed assert on xlock
- Well, I hit a path_traverse bug instead. I'm going to mark this particular one as resolved unless it pops up again.
- 04:26 PM Bug #1417: mds: failed assert on xlock
- Testing that fix I worked out with Sage.
- 02:33 PM Bug #1417: mds: failed assert on xlock
- Okay:
1) dispatch client1 request, gets xlock on filelock (lock_xlock)
2) early_reply to client1 request, which cal...
- 09:17 AM Bug #1417 (Resolved): mds: failed assert on xlock
- ...
08/18/2011
- 11:30 AM Bug #1405: cephfs: shouldn't have to specify all layout options
- And you also need to specify these even if you only want to set the pool. :(
- 11:02 AM Bug #1405 (Resolved): cephfs: shouldn't have to specify all layout options
- Right now, you need to specify all layout options in cephfs (of the stripe unit, stripe count, and block size, anyway...
08/17/2011
- 03:58 PM Bug #1389 (Resolved): re-created snapshot gets removed by mds journal replay
- fixed by commit:d60d5319ad5d6674488cab96b4a452ff553e779b
- 03:26 PM Bug #1399: mds crash
- replay crash looks like the one fixed in commit:8c5e7dcf8cf7f3daa65eb9905, yay!
- 02:17 PM Bug #1399 (Resolved): mds crash
- I'm not sure I can reproduce the second (replay) crash. Sam, next time you see one of these, please capture a replay...
- 09:12 AM Bug #1399 (In Progress): mds crash
- original crash is fixed by commit:e98669ea69059e26e0c4aa72c46e0be5bfc96386
- 08:07 AM Bug #1399: mds crash
- Hmmm. If ...
- 07:53 AM Bug #1399: mds crash
- As for the original error, it does seem reproducible by creating a snapshot of a directory using the mkdir system cal...
- 07:28 AM Bug #1399: mds crash
- I removed the assertion: assert(in->is_head());
That allowed the mds servers to restart and complete recovery, and...
- 02:16 PM Bug #1393 (Resolved): cfuse failed 3 pjd tests
08/16/2011
- 09:58 PM Bug #1399: mds crash
- Sam, do you still have this cluster? Can you restart the mds with debug mds = 20 and attach the resulting log? There...
- 03:11 PM Bug #1399 (Resolved): mds crash
- After running successfully with one active mds and two standbys, the active mds has crashed, and on restart, it crash...
08/15/2011
- 01:31 PM Cleanup #1307 (Closed): client cleanup
- 09:58 AM Feature #1398 (New): qa: multiclient file io test
- test read/write consistency across clients.
I'm thinking:
- teuthology task gets list of client names (or uses all...
- 09:49 AM Bug #1391 (Can't reproduce): client: crash on std::string in insert_trace()
- It's not clear from code inspection where this might be coming from, unless there is general heap corruption. If you...
08/12/2011
- 10:34 AM Bug #1390 (Resolved): MDS crash in function 'bool Locker::issue_caps(CInode*, Capability*)', in t...
- pushed to next/master, will be in v0.33.
- 04:01 AM Bug #1390: MDS crash in function 'bool Locker::issue_caps(CInode*, Capability*)', in thread '0x7f...
- It's looking good, cluster has started up okay, no metadata crashes :-)
08/11/2011
- 04:36 PM Bug #1393 (Resolved): cfuse failed 3 pjd tests
- Teuthology results are in teuthology:~teuthworker/archive/full_suite_coverage_20110811/35/teuthology.log
Nodes sepia...
- 03:51 PM Bug #1390: MDS crash in function 'bool Locker::issue_caps(CInode*, Capability*)', in thread '0x7f...
- Will it be safe to cherry-pick this onto 0.32? Else I can try packaging up that branch and deploying it!
- 03:23 PM Bug #1390: MDS crash in function 'bool Locker::issue_caps(CInode*, Capability*)', in thread '0x7f...
- Pushed commit:26871eff1740d6ec5b9b287bf47e098db913fb27 (branch wip-needissue) that should fix this. Can you let me k...
- 03:36 AM Bug #1390 (Resolved): MDS crash in function 'bool Locker::issue_caps(CInode*, Capability*)', in t...
- Yesterday I was playing about with snapshots, which seemed to expose a bug in btrfs (delayed_inode.c) which I have si...
- 12:54 PM Bug #1391 (Resolved): client: crash on std::string in insert_trace()
- Random cfuse client crash. Sorry I don't have a core file for this. It only happened on
*** Caught signal (Segm...
- 08:24 AM Bug #1389: re-created snapshot gets removed by mds journal replay
- I'm pretty sure I got it with both, before I understood why my snapshots were disappearing. Once I did, I only teste...
08/10/2011
- 06:15 PM Bug #1389: re-created snapshot gets removed by mds journal replay
- Is this with the kernel or fuse client?
- 06:10 PM Bug #1389 (Resolved): re-created snapshot gets removed by mds journal replay
- Say you enter a snapshot pseudo-dir then run:
mkdir test
rmdir test
mkdir test
then restart the mds, and run:...
- 11:27 AM Bug #1360: mds crash during pjd workunit on cfuse
- Machines are nuked and unlocked.
08/09/2011
- 12:40 PM Bug #1368: mds crash after blogbench on cfuse
- unlocked the nodes
- 12:38 PM Bug #1368: mds crash after blogbench on cfuse
- The crash was on shutdown. Have core file but gitbuilder binaries were expired.
Running in a loop to reproduce.