Activity
From 12/13/2014 to 01/11/2015
01/11/2015
- 11:35 PM Subtask #10489: Mpi tests fail on both ceph-fuse and kclient
- Please remove the sudo before /home/ubuntu/cephtest/fsx-mpi; otherwise the rank of every process will be zero.
- 10:38 PM Bug #10416: quota test failures
- The configure log looks like the following:
overrides:
admin_socket:
branch: master
ceph:
conf:
...
01/09/2015
- 07:58 PM Bug #10448: wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() of fs/ceph/...
- direct io failure is fixed by "ceph: fix reading inline data when i_size > PAGE_SIZE"
- 04:18 PM Bug #10448: wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() of fs/ceph/...
- the directio failure is caused by the 'reading inline data' changes
- 10:47 AM Bug #10448: wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() of fs/ceph/...
- Zheng?
- 02:51 PM Feature #10504: kclient: include client version in client_metadata
- That userspace bit got merged.
- 02:48 PM Feature #10504: kclient: include client version in client_metadata
- https://github.com/ceph/ceph/pull/3341 for the userspace side!
- 01:53 PM Feature #10504 (Resolved): kclient: include client version in client_metadata
- It would be helpful for debugging. Probably just put it in Client::populate_metadata() for userspace; I imagine the k...
- 01:48 PM Bug #10503 (Resolved): ceph-fuse: quota code is not 32-bit safe for vxattr output
- From the gitbuilders:...
- 11:02 AM Feature #10498 (New): ObjectCacher: order wakeups when write calls block on throttling
- The ObjectCacher can block write calls if the dirty data limits are exceeded by sleeping on a cond. Unfortunately, we...
- 09:56 AM Subtask #10489: Mpi tests fail on both ceph-fuse and kclient
- If I'm understanding my quick skim correctly, this is not "MPI tests are failing" but "this mpi-fsx" test is failing,...
- 03:39 AM Subtask #10489: Mpi tests fail on both ceph-fuse and kclient
- Is this a new failure on the existing cluster, or something that's come up on the new cluster?
01/08/2015
- 05:52 PM Feature #1398: qa: multiclient file io test
- See #10489 for a related issue.
- 05:50 PM Feature #1398 (Fix Under Review): qa: multiclient file io test
- pull request #284 for wip-1398-multiclientio-wusui has been made.
commit: f93e531de6fa69a3ad32117c613094fc0aa0283e
- 01:16 PM Feature #1398: qa: multiclient file io test
- changes look ok, although you can probably just use mnt.0 instead of the gmnt symlink.
also, make sure the -dev p...
- 05:18 PM Subtask #10489: Mpi tests fail on both ceph-fuse and kclient
- Running teuthology using the following yaml demonstrates this problem:...
- 05:15 PM Subtask #10489: Mpi tests fail on both ceph-fuse and kclient
- Running teuthology using the following yaml demonstrates this problem:...
- 05:05 PM Subtask #10489 (New): Mpi tests fail on both ceph-fuse and kclient
- Mpi tests fail on ceph-fuse and kclient. I will post some tests and results that demonstrate this failure.
- 10:40 AM Bug #10041 (Resolved): ceph-fuse: never exit when no MDS server is available
- Merged into master as of commit:1b22857f2bc945ad2f5e60a2cc6f9be24d977079.
- 06:14 AM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- OK, I now have a test running with all three kernel patches plus the one mentioned in #10450.
- 04:39 AM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- Looks like a kernel bug; please try the attached kernel patches.
- 02:07 AM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- Out of 8 crashes, there was only one incident with slow requests in the logs. I added the mds/mon logs for further in...
01/07/2015
- 06:59 PM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- There are some file-lock-related fixes after 3.14. Did you see something like the following in ceph.log?
slow request 121.3...
- 12:28 AM Bug #10041 (Fix Under Review): ceph-fuse: never exit when no MDS server is available
01/06/2015
- 07:18 PM Feature #1398: qa: multiclient file io test
- Here's the story so far.
The teuthology tests for mpi testing used a yaml file with the following tasks:...
- 07:08 PM Bug #10416: quota test failures
- The nondeterministic design is a nightmare.
- 11:00 AM Bug #10416: quota test failures
- It is not fixed yet:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-26_23:04:02-fs-master-testing-basic-...
- 06:36 PM Bug #10412 (Resolved): samba: failed smbtorture run
- this bug was introduced after giant
- 08:18 AM Bug #10412 (Pending Backport): samba: failed smbtorture run
- 04:26 AM Bug #10412 (Fix Under Review): samba: failed smbtorture run
- 11:26 AM Bug #10465: Audit and fix ceph-qa-suite exec tasks
- Also, I noticed this because http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-26_23:04:02-fs-master-testing-bas...
- 11:26 AM Bug #10465 (Resolved): Audit and fix ceph-qa-suite exec tasks
- suites/fs/basic/tasks/cfuse_workunit_suites_truncate_delay.yaml executes a dd and a truncate on a file in the local n...
- 11:03 AM Bug #10448: wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() of fs/ceph/...
- Zheng, we've got a new failing directio test at http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-28_23:08:01-kc...
- 10:42 AM Bug #10413: samba: coredumps after tests run
- That looks good to me (I guess?), but I don't think putting it into our own samba repo is the right place. IIRC we're...
- 07:46 AM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- Regarding the stuck requests .. I have definitely seen one or two stuck mds requests. But I had then thought they wer...
- 05:31 AM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- I think there was a stuck request, which prevented mds from trimming completed_requests
01/05/2015
- 11:35 PM Bug #10448: wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() of fs/ceph/...
- push a fix to the testing branch
- 03:44 PM Bug #10448: wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() of fs/ceph/...
- I don't think rbd uses anything under fs/ceph, so here's a category where Zheng and Sage are more likely to notice it.
- 09:03 PM Feature #1398: qa: multiclient file io test
- Here's what appears to be a bunch of problems:...
- 04:05 PM Feature #1398: qa: multiclient file io test
- Here's what looks to be the last issue:...
- 06:20 PM Bug #10413 (Fix Under Review): samba: coredumps after tests run
- https://github.com/ceph/samba/pull/1
- 05:48 AM Bug #10413 (In Progress): samba: coredumps after tests run
- smbd can call cephwrap_{getcwd,chdir,stat} after umount
- 03:49 PM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- If this is two separate bugs I'd like to figure out if we can prevent the session_info_t from growing so large pretty...
- 03:54 AM Bug #10449: MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message too long"
- Funnily enough, I was just noticing this in the code while working on #9883 -- all the MDS table persistence code use...
- 01:30 AM Bug #10449 (Resolved): MDS crashing in "C_SM_Save::finish" due to OSD response "-90 ((90) Message...
- The relevant part of the log is the following:...
- 03:28 PM Bug #10436: ceph-fuse: snapshot flushing from page cache to Client is not coherent
- Just to be clear, the problem here is that
1) there is dirty data in the page cache
2) a snapshot happens
3) ceph-...
- 03:26 PM Bug #10323: lock get stuck in snap->sync state
- https://github.com/ceph/ceph/pull/3270
- 03:26 PM Bug #10343: qa/workunits/snaps/snaptest-xattrwb.sh fails
- https://github.com/ceph/ceph/pull/3270
01/04/2015
- 11:38 PM Bug #10448 (Resolved): wrong parameter passed to ceph_zero_pape_vector_range() in striped_read() ...
- Hi, all
A bug is found in striped_read() of fs/ceph/file.c.
striped_read() calls ceph_zero_pape_vector_range() at...
12/28/2014
- 07:09 PM Bug #10436 (Resolved): ceph-fuse: snapshot flushing from page cache to Client is not coherent
- The fuse kernel module has no understanding of snapshots. It does not flush data when a snapshot is created. Besides the fus...
- 06:57 PM Bug #10312 (Fix Under Review): creating snapshot makes parent snapshot lost
- 06:55 PM Bug #10315 (Fix Under Review): set last snapid according to removed snaps in data pools
- 06:54 PM Bug #10323 (Fix Under Review): lock get stuck in snap->sync state
- 06:54 PM Bug #10343 (Fix Under Review): qa/workunits/snaps/snaptest-xattrwb.sh fails
- 06:11 PM Bug #10387 (Fix Under Review): ceph-fuse doesn't release capability on deleted directory
12/26/2014
- 05:01 AM Bug #10336 (Need More Info): hung ffsb test
- Need logs to diagnose this.
- 04:23 AM Bug #10387 (In Progress): ceph-fuse doesn't release capability on deleted directory
12/23/2014
12/22/2014
- 06:25 PM Bug #10415 (Fix Under Review): libcephfs: test failures
- It hung on umount...
- 09:38 AM Bug #10415 (Resolved): libcephfs: test failures
- hang on GetOsdCrushLocation test, with no core dumps:
http://pulpito.ceph.com/teuthology-2014-12-17_23:04:01-fs-mast...
- 05:34 PM Bug #10417 (In Progress): snaptest-2.sh is failing
- 09:54 AM Bug #10417 (Resolved): snaptest-2.sh is failing
- All we've got is the teuthology log, whose last line is when it's deleting the directory tree (but not the snapshots)...
- 02:25 PM Bug #10423 (Closed): update hadoop gitbuilders
- They're building "branch-1", whatever that is, and so I suspect they aren't pointing at cephfs-hadoop. :(
- 09:58 AM Bug #10414 (Resolved): client: valgrind found uninit value used in conditional
- merged fix to master in 547ee783d44b8598bdf30aad191252aaecca0754
- 09:36 AM Bug #10414: client: valgrind found uninit value used in conditional
- http://pulpito.ceph.com/teuthology-2014-12-17_23:04:01-fs-master-testing-basic-multi/667065/
- 09:34 AM Bug #10414 (Resolved): client: valgrind found uninit value used in conditional
- http://pulpito.ceph.com/teuthology-2014-12-17_23:04:01-fs-master-testing-basic-multi/667063/...
- 09:49 AM Bug #10416 (Resolved): quota test failures
- The quota tests failed. Maybe this is fixed already?
http://pulpito.ceph.com/teuthology-2014-12-19_23:04:06-fs-mas...
- 09:20 AM Bug #9994: ceph-qa-suite: nfs mount timeouts
- teuthology-2014-12-17_23:10:01-knfs-master-testing-basic-multi/667121/
http://pulpito.ceph.com/teuthology-2014-12-17...
- 09:16 AM Bug #10413 (Resolved): samba: coredumps after tests run
- These tests had coredumps without corresponding failures in teuthology.log or backtraces in any of the logs. The clie...
- 09:08 AM Bug #10412 (Resolved): samba: failed smbtorture run
- http://pulpito.ceph.com/teuthology-2014-12-17_23:14:01-samba-master-testing-basic-multi/667136/
I'm not even sure ...
12/21/2014
- 04:53 PM Bug #10381: health HEALTH_WARN mds ceph239 is laggy
- Greg Farnum wrote:
> Can you provide the output of "ceph mds dump" and "ceph osd dump"?
>
> It looks like the MDS...
12/19/2014
- 02:44 PM Bug #10277 (Pending Backport): ceph-fuse: Consistent pjd failure in getcwd
- Let it cook for a bit in master before doing the giant backport.
- 11:37 AM Bug #10381: health HEALTH_WARN mds ceph239 is laggy
- Can you provide the output of "ceph mds dump" and "ceph osd dump"?
It looks like the MDS is trying to access a poo...
- 09:31 AM Bug #10382: mds/MDS.cc: In function 'void MDS::heartbeat_reset()
- I tried to reproduce this today with debug_mds set to 10 and 20, but I wasn't able to reproduce it at that moment.
...
- 07:00 AM Bug #10382 (In Progress): mds/MDS.cc: In function 'void MDS::heartbeat_reset()
- 07:28 AM Feature #10393 (Rejected): client: remove mount prefix shenanigans for quota
- Once we have a better subvolume abstraction in place, replace the ancestor opens that are needed by snaps with someth...
- 07:26 AM Feature #10392 (Rejected): mds: refactor subvolume vs snaprealm, capture quota trees
- Unify the subvolume abstraction that is used by snapshots and quotas. This will make the ownership of files by a quo...
- 06:53 AM Feature #10390 (Resolved): mds: throttle deletes in flight
- Make sure the number of deletes due to purging strays is bounded in some way.
- 05:30 AM Feature #10388 (Resolved): Add MDS perf counters for stray/purge status
- It would be useful to know how many inodes are currently in the stray folders, waiting to be deleted, and also how ma...
- 05:25 AM Bug #10387 (Resolved): ceph-fuse doesn't release capability on deleted directory
- I thought this was a repeat of #10164, but then noticed that the dirfrag objects were deleted once the client unmou...
12/18/2014
- 11:56 PM Bug #10382 (Resolved): mds/MDS.cc: In function 'void MDS::heartbeat_reset()
- While running an Active/Standby set of MDSes I see this happen quite often when stopping the Active MDS:...
- 10:52 PM Bug #10381 (Resolved): health HEALTH_WARN mds ceph239 is laggy
- Hi there.
Today, I ran a script to do some tests on my ceph cluster via a cephfs client, including dd/rm/cp files less...
- 11:04 AM Fix #10377 (New): Impose lower bound on "mds events per segment"
- Currently, if the MDS is started with a small mds_events_per_segment, it can go into a trim/rejournal cycle when fi...
- 07:10 AM Feature #10369 (Resolved): qa-suite: detect unexpected MDS failovers and daemon crashes
- Currently some of our tests can be run with standby MDSs, and a failover event might occur without our tests noticing...
- 06:05 AM Bug #10368: Assertion in _trim_expired_segments
- Reproduce (took two tries to see the crash again, so it's intermittent)...
- 05:43 AM Bug #10368 (Resolved): Assertion in _trim_expired_segments
- ...
- 04:18 AM Bug #10361: teuthology: only direct admin socket commands to active MDS
- Filesystem::mds_asok would be a good place to put this
12/17/2014
- 10:33 PM Bug #10344 (In Progress): qa/workunits/snaps/snaptest-git-ceph.sh fails
- 06:58 PM Bug #10336: hung ffsb test
- Logs of passed tests also contain "opendir: No such file or directory"
- 06:49 PM Bug #10335 (Resolved): MDS: disallow flush_path and related commands if not active
- 03:34 PM Bug #10335 (Fix Under Review): MDS: disallow flush_path and related commands if not active
- https://github.com/ceph/ceph/pull/3198
- 03:19 PM Bug #10335 (In Progress): MDS: disallow flush_path and related commands if not active
- Oh duh, right, this is us hitting the standby MDS with a flush request. (Thus the mds.-1 up there.) I guess the admin...
- 03:18 PM Bug #10361 (Resolved): teuthology: only direct admin socket commands to active MDS
- Right now our flush and scrub asok command tests are directed at mds.a, but sometimes that's not the active MDS. Find...
- 02:15 PM Bug #10277: ceph-fuse: Consistent pjd failure in getcwd
- Needs test (or at least test description) and some performance numbers.
- 02:14 PM Bug #9997 (Resolved): test_client_pin case is failing
- Hum, I was thinking that we could backport the simple fix since most users will be on older kernels where it behaves ...
- 12:10 AM Bug #9997: test_client_pin case is failing
- The fix is buggy; we shouldn't backport it. We should use the patches for #10277 instead.
- 12:07 AM Bug #9997: test_client_pin case is failing
- https://github.com/ceph/ceph/pull/3191
- 02:11 PM Feature #7317 (Resolved): mds: behave with fs fills (e.g., allow deletion)
- Merged to master in commit:be7b2f8a30b12841d260c4ce486a59ed28953a4f
- 05:47 AM Bug #10343: qa/workunits/snaps/snaptest-xattrwb.sh fails
- only fuse-client fails
12/16/2014
- 11:27 PM Bug #10344 (Resolved): qa/workunits/snaps/snaptest-git-ceph.sh fails
- 11:26 PM Bug #10343 (Resolved): qa/workunits/snaps/snaptest-xattrwb.sh fails
- 01:43 PM Bug #10302 (Resolved): "fsync-tester.sh: line 10: lsof: command not found"
- 12:22 PM Bug #10316 (Won't Fix): pool quota no offect for cephfs client
- Oh, you mean you got to 1050MB instead of exactly 1GB.
Yeah, that's expected behavior; these are all soft limits. ...
- 12:09 PM Bug #9997 (Pending Backport): test_client_pin case is failing
- Can we get a giant backport for this, please?
- 11:02 AM Bug #10336 (Can't reproduce): hung ffsb test
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-14_23:08:01-kcephfs-next-testing-basic-multi/
It looks like...
- 10:56 AM Bug #10335 (Resolved): MDS: disallow flush_path and related commands if not active
- ...
- 06:53 AM Bug #8576 (Resolved): teuthology: nfs tests failing on umount
- Haven't seen this since we made those changes!
- 01:43 AM Bug #10323 (Resolved): lock get stuck in snap->sync state
- ...
12/15/2014
- 07:19 PM Bug #10316: pool quota no offect for cephfs client
- I think the data has already been written to RADOS, because my ceph cluster and client were rebooted but my data is not lo...
- 11:32 AM Bug #10316: pool quota no offect for cephfs client
- I suspect this is just that you're writing into the local page cache (and ceph client userspace cache), rather than t...
- 06:48 AM Feature #9883 (In Progress): journal-tool: smarter scavenge (conditionally update dir objects)
12/14/2014
- 11:58 PM Bug #10316 (Won't Fix): pool quota no offect for cephfs client
- ceph version:0.87
ceph cluster os:redhat 6
mds num: 1
step:
(1) set the pool 'data' max-bytes limit to 1G
[root...
- 11:47 PM Bug #10315 (Resolved): set last snapid according to removed snaps in data pools
- Handle the case where we create cephfs using existing pools whose removed snaps are not empty.
- 07:24 PM Bug #10312 (Resolved): creating snapshot makes parent snapshot lost
- [root@zhyan-kvm1 testdir]# mkdir dir1
[root@zhyan-kvm1 testdir]# cd dir1
[root@zhyan-kvm1 dir1]# mkdir dir2
[root@...