Activity

From 11/20/2014 to 12/19/2014

12/19/2014

02:44 PM Bug #10277 (Pending Backport): ceph-fuse: Consistent pjd failure in getcwd
Let it cook for a bit in master before doing the giant backport. Greg Farnum
11:37 AM Bug #10381: health HEALTH_WARN mds ceph239 is laggy
Can you provide the output of "ceph mds dump" and "ceph osd dump"?
It looks like the MDS is trying to access a poo...
Greg Farnum
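For reference, the requested state dumps can be gathered like this (assuming a client with admin credentials):

    ceph mds dump
    ceph osd dump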
09:31 AM Bug #10382: mds/MDS.cc: In function 'void MDS::heartbeat_reset()'
I tried to reproduce this today with debug_mds set to 10 and 20, but I wasn't able to trigger it at the time.
...
Wido den Hollander
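A sketch of raising MDS verbosity on a live daemon for a reproduction attempt like this one (replace 'a' with your MDS id):

    # bump the mds debug level at runtime, without a restart
    ceph tell mds.a injectargs '--debug-mds 20'
    # or persistently in ceph.conf:
    #   [mds]
    #       debug mds = 20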
07:00 AM Bug #10382 (In Progress): mds/MDS.cc: In function 'void MDS::heartbeat_reset()'
John Spray
07:28 AM Feature #10393 (Rejected): client: remove mount prefix shenanigans for quota
Once we have a better subvolume abstraction in place, replace the ancestor opens that are needed by snaps with someth... Sage Weil
07:26 AM Feature #10392 (Rejected): mds: refactor subvolume vs snaprealm, capture quota trees
Unify the subvolume abstraction that is used by snapshots and quotas. This will make the ownership of files by a quo... Sage Weil
06:53 AM Feature #10390 (Resolved): mds: throttle deletes in flight
Make sure the number of deletes due to purging strays is bounded in some way. Sage Weil
05:30 AM Feature #10388 (Resolved): Add MDS perf counters for stray/purge status
It would be useful to know how many inodes are currently in the stray folders, waiting to be deleted, and also how ma... John Spray
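Such counters would be read through the MDS admin socket, e.g. (the counter names themselves are what this feature would define):

    # dump all perf counters from a local MDS daemon
    ceph daemon mds.a perf dump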
05:25 AM Bug #10387 (Resolved): ceph-fuse doesn't release capability on deleted directory

I thought this was a repeat of #10164, but then noticed that the dirfrag objects were deleted once the client unmou...
John Spray

12/18/2014

11:56 PM Bug #10382 (Resolved): mds/MDS.cc: In function 'void MDS::heartbeat_reset()'
While running a Active/Standby set of MDSes I see this happen quite often when stopping the Active MDS:... Wido den Hollander
10:52 PM Bug #10381 (Resolved): health HEALTH_WARN mds ceph239 is laggy
Hi there.
Today, I ran a script to do some tests on my ceph cluster via a cephfs client, including dd/rm/cp of files less...
science luo
11:04 AM Fix #10377 (New): Impose lower bound on "mds events per segment"

Currently, if the MDS is started with a small mds_events_per_segment, it can go into a trim/rejournal cycle when fi...
John Spray
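For illustration, the pathological configuration looks something like this (option name as given in the ticket; values this small are exactly what the proposed lower bound would clamp):

    [mds]
        # far too small; triggers the trim/rejournal cycle described above
        mds events per segment = 1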
07:10 AM Feature #10369 (Resolved): qa-suite: detect unexpected MDS failovers and daemon crashes
Currently some of our tests can be run with standby MDSs, and a failover event might occur without our tests noticing... John Spray
06:05 AM Bug #10368: Assertion in _trim_expired_segments
Reproduce (took two tries to see the crash again, so it's intermittent)... John Spray
05:43 AM Bug #10368 (Resolved): Assertion in _trim_expired_segments
... John Spray
04:18 AM Bug #10361: teuthology: only direct admin socket commands to active MDS
Filesystem::mds_asok would be a good place to put this. John Spray

12/17/2014

10:33 PM Bug #10344 (In Progress): qa/workunits/snaps/snaptest-git-ceph.sh fails
Zheng Yan
06:58 PM Bug #10336: hung ffsb test
Logs of passed tests also contain "opendir: No such file or directory". Zheng Yan
06:49 PM Bug #10335 (Resolved): MDS: disallow flush_path and related commands if not active
Zheng Yan
03:34 PM Bug #10335 (Fix Under Review): MDS: disallow flush_path and related commands if not active
https://github.com/ceph/ceph/pull/3198 Greg Farnum
03:19 PM Bug #10335 (In Progress): MDS: disallow flush_path and related commands if not active
Oh duh, right, this is us hitting the standby MDS with a flush request. (Thus the mds.-1 up there.) I guess the admin... Greg Farnum
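A sketch of the admin socket invocation involved, which with this fix should be rejected on a non-active daemon (the path argument is illustrative):

    # only sensible against the active MDS; a standby no longer accepts it
    ceph daemon mds.a flush_path /some/dir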
03:18 PM Bug #10361 (Resolved): teuthology: only direct admin socket commands to active MDS
Right now our flush and scrub asok command tests are directed at mds.a, but sometimes that's not the active MDS. Find... Greg Farnum
02:15 PM Bug #10277: ceph-fuse: Consistent pjd failure in getcwd
Needs test (or at least test description) and some performance numbers. Greg Farnum
02:14 PM Bug #9997 (Resolved): test_client_pin case is failing
Hmm, I was thinking that we could backport the simple fix since most users will be on older kernels where it behaves ... Greg Farnum
12:10 AM Bug #9997: test_client_pin case is failing
The fix is buggy; we shouldn't backport it. We should use the patches for #10277 instead. Zheng Yan
12:07 AM Bug #9997: test_client_pin case is failing
https://github.com/ceph/ceph/pull/3191 Zheng Yan
02:11 PM Feature #7317 (Resolved): mds: behave with fs fills (e.g., allow deletion)
Merged to master in commit:be7b2f8a30b12841d260c4ce486a59ed28953a4f Greg Farnum
05:47 AM Bug #10343: qa/workunits/snaps/snaptest-xattrwb.sh fails
Only the fuse client fails. Zheng Yan

12/16/2014

11:27 PM Bug #10344 (Resolved): qa/workunits/snaps/snaptest-git-ceph.sh fails
Zheng Yan
11:26 PM Bug #10343 (Resolved): qa/workunits/snaps/snaptest-xattrwb.sh fails
Zheng Yan
01:43 PM Bug #10302 (Resolved): "fsync-tester.sh: line 10: lsof: command not found"
Sage Weil
12:22 PM Bug #10316 (Won't Fix): pool quota no effect for cephfs client
Oh, you mean you got to 1050MB instead of exactly 1GB.
Yeah, that's expected behavior; these are all soft limits. ...
Greg Farnum
12:09 PM Bug #9997 (Pending Backport): test_client_pin case is failing
Can we get a giant backport for this, please? Greg Farnum
11:02 AM Bug #10336 (Can't reproduce): hung ffsb test
http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-14_23:08:01-kcephfs-next-testing-basic-multi/
It looks like...
Greg Farnum
10:56 AM Bug #10335 (Resolved): MDS: disallow flush_path and related commands if not active
... Greg Farnum
06:53 AM Bug #8576 (Resolved): teuthology: nfs tests failing on umount
Haven't seen this since we made those changes! Greg Farnum
01:43 AM Bug #10323 (Resolved): lock get stuck in snap->sync state
... Zheng Yan

12/15/2014

07:19 PM Bug #10316: pool quota no effect for cephfs client
I think the data was already put into RADOS, because my ceph cluster and client were rebooted but my data is not lo... wei qiaomiao
11:32 AM Bug #10316: pool quota no effect for cephfs client
I suspect this is just that you're writing into the local page cache (and ceph client userspace cache), rather than t... Greg Farnum
06:48 AM Feature #9883 (In Progress): journal-tool: smarter scavenge (conditionally update dir objects)
John Spray

12/14/2014

11:58 PM Bug #10316 (Won't Fix): pool quota no effect for cephfs client
ceph version: 0.87
ceph cluster os: redhat 6
mds num: 1
steps:
(1) set the pool 'data' max-bytes limit to 1G
[root...
wei qiaomiao
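Step (1) above corresponds to something like this (byte count assumed to be 1 GiB; note these are soft limits):

    # impose a 1 GiB quota on the 'data' pool
    ceph osd pool set-quota data max_bytes $((1024 * 1024 * 1024))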
11:47 PM Bug #10315 (Resolved): set last snapid according to removed snaps in data pools
Handle the case where we create cephfs using existing pools and the existing pools' removed snaps are not empty. Zheng Yan
07:24 PM Bug #10312 (Resolved): creating snapshot makes parent snapshot lost
[root@zhyan-kvm1 testdir]# mkdir dir1
[root@zhyan-kvm1 testdir]# cd dir1
[root@zhyan-kvm1 dir1]# mkdir dir2
[root@...
Zheng Yan
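For context, CephFS snapshots are taken by making directories under the hidden .snap directory; a hypothetical continuation of the repro above might look like this (snapshot names invented here):

    mkdir dir1/.snap/snap1        # snapshot the parent directory
    mkdir dir1/dir2/.snap/snap2   # per the report, the parent snapshot then goes missing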

12/11/2014

12:49 PM Bug #10302: "fsync-tester.sh: line 10: lsof: command not found"
lsof is definitely being installed on the machines, but it appears to be a $PATH issue. Dan and I talked about this a ... Sandon Van Ness
12:45 PM Bug #10302 (Resolved): "fsync-tester.sh: line 10: lsof: command not found"
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-09_13:44:11-powercycle-giant-distro-basic-multi/64... Yuri Weinstein
10:54 AM Bug #10288 (Resolved): ceph fs ls fails to list newly created fs
Greg Farnum
06:02 AM Bug #10288 (Fix Under Review): ceph fs ls fails to list newly created fs
https://github.com/ceph/ceph/pull/3150 (master)
https://github.com/ceph/ceph/pull/3151 (giant)
John Spray

12/10/2014

11:03 AM Bug #10164 (Resolved): Dirfrag objects for deleted dir not purged until MDS restart
Merged to master as of commit:24ca9f1c259d6222b54290bc4ea2030f0271af8f.
Test run at http://pulpito.ceph.com/gregf-20...
Greg Farnum

12/09/2014

10:24 PM Bug #10288: ceph fs ls fails to list newly created fs
This is probably going to be something obvious in the MDSMonitor. Greg Farnum
09:38 PM Bug #10288 (Resolved): ceph fs ls fails to list newly created fs
Hi!
After upgrading from .6 to .8 (giant current from ceph ubuntu packages), I wanted to play with CephFS. I foll...
Steve H.
06:14 PM Feature #1398: qa: multiclient file io test
Currently I am testing with the following yaml file.... Anonymous
01:49 PM Bug #10248: messenger: failed Pipe::connect::assert(m) in Hadoop client
Hmm, the client only calls _closed_mds_session if:
1) it gets back a session close
2) the session goes stale
2a)...
Greg Farnum

12/08/2014

11:12 PM Bug #10277 (Fix Under Review): ceph-fuse: Consistent pjd failure in getcwd
Zheng Yan
02:58 PM Bug #10277 (Resolved): ceph-fuse: Consistent pjd failure in getcwd
"job-working-directory: error retrieving current directory: getcwd: cannot access parent directories: No such file or... Greg Farnum
03:03 PM Bug #10263 (Resolved): [ERR] bad backtrace on dir ino 600
They're all happy now, merged everything in. Greg Farnum
12:36 PM Bug #10263: [ERR] bad backtrace on dir ino 600
Merged in the patch for Giant as of commit:247a6fac54854e92a7df0e651e248a262d3efa05.
The others are a little unhap...
Greg Farnum
02:05 PM Bug #10248: messenger: failed Pipe::connect::assert(m) in Hadoop client
The new assert for wip-10057 would trigger this.
This looks like a corner case in the session close + reopen seque...
Sage Weil

12/07/2014

05:50 PM Bug #10263 (Fix Under Review): [ERR] bad backtrace on dir ino 600
It was introduced by the 'verify backtrace on fetching dirfrag' patch. Stray directories of an old fs have no backtrace, th... Zheng Yan

12/06/2014

05:36 PM Bug #10263 (Resolved): [ERR] bad backtrace on dir ino 600
ubuntu@teuthology:/a/sage-bug-10171-base/639742
and the other runs in this set. It's an upgrade test:...
Sage Weil

12/05/2014

05:48 PM Feature #1398: qa: multiclient file io test
The problem, I believe, is that we need to install ceph and make sure that we have some mount points before we run the ... Anonymous

12/04/2014

07:36 PM Bug #10229 (Resolved): Filer: lock inversion with Objecter
Zheng Yan
05:26 PM Feature #1398: qa: multiclient file io test
... Anonymous
10:37 AM Bug #10248 (New): messenger: failed Pipe::connect::assert(m) in Hadoop client
We have logs and a core dump from the QA run: http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-30_23:12:01-hado... Greg Farnum

12/03/2014

09:31 PM Bug #10229: Filer: lock inversion with Objecter
Zheng Yan
10:40 AM Bug #10229 (Resolved): Filer: lock inversion with Objecter
Saw this on a next test (http://qa-proxy.ceph.com/teuthology/sage-2014-12-01_11:11:17-fs-next-distro-basic-multi/6289... Greg Farnum
05:55 PM Feature #1398: qa: multiclient file io test
Note to self:
Try: rbd import to create an image, rbd resize the image, make sure reads return EOF at the right...
Anonymous
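Spelled out, that sequence is roughly (file and image names are placeholders):

    rbd import ./source.img testimg    # create an image from a local file
    rbd resize --size 2048 testimg     # grow the image (size in MB)
    # reads beyond the original data should hit EOF/zeros at the right offset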
06:56 AM Fix #10135 (Resolved): OSDMonitor: allow adding cache pools to cephfs pools already in use
26e8cf174b8e76b4282ce9d9c1af6ff12f5565a9 Greg Farnum
05:16 AM Bug #10164 (Fix Under Review): Dirfrag objects for deleted dir not purged until MDS restart
https://github.com/ceph/ceph/pull/3071 Zheng Yan

12/02/2014

06:57 AM Bug #9997 (Resolved): test_client_pin case is failing
Merged to next (https://github.com/ceph/ceph/pull/3056) John Spray
06:53 AM Bug #10217 (Resolved): old fuse should warn on flock
This works in master. Greg Farnum
06:19 AM Bug #10217: old fuse should warn on flock
Yes, we need a recent version of ceph-fuse and MDS. The old version does not support interrupting flock. Zheng Yan
03:37 AM Bug #10217 (Resolved): old fuse should warn on flock

Test failure: test_filelock (tasks.mds_client_recovery.TestClientRecovery):
http://pulpito.front.sepia.ceph.com/sa...
John Spray
03:46 AM Fix #10135 (Fix Under Review): OSDMonitor: allow adding cache pools to cephfs pools already in use
giant backport PR: https://github.com/ceph/ceph/pull/3055 John Spray
03:35 AM Bug #10151 (Resolved): mds client cache pressure health warning oscillates on/off
The version on next has a pass on client-limits (the one that exercises health): http://pulpito.front.sepia.ceph.com/... John Spray

12/01/2014

06:05 PM Fix #10135 (Pending Backport): OSDMonitor: allow adding cache pools to cephfs pools already in use
merged to next in commit:25fc21b837ba74bab2f6bc921c78fb3c43993cf5
This also should go into giant (I think Firefly ...
Greg Farnum
05:58 PM Bug #10011 (Resolved): Journaler: failed on shutdown or EBLACKLISTED
giant commit:65f6814847fe8644f5d77a9021fbf13043b76dbe Greg Farnum
06:37 AM Bug #10011 (Fix Under Review): Journaler: failed on shutdown or EBLACKLISTED
Haven't seen any failures around this, let's backport to giant: https://github.com/ceph/ceph/pull/3047 John Spray
06:59 AM Bug #10164 (In Progress): Dirfrag objects for deleted dir not purged until MDS restart
Zheng: assigning to you since you mentioned you were working on it. John Spray
06:34 AM Bug #9997 (Fix Under Review): test_client_pin case is failing
https://github.com/ceph/ceph/pull/3045 John Spray
04:42 AM Bug #9994: ceph-qa-suite: nfs mount timeouts
http://pulpito.ceph.com/teuthology-2014-11-23_23:10:01-knfs-next-testing-basic-multi/617093/
http://pulpito.ceph.com...
John Spray
04:20 AM Feature #9881 (Resolved): mds: admin command to flush the mds journal
Merged to master (forgot the Fixes:, doh)... John Spray
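The command added by this feature is issued over the MDS admin socket:

    # write back and trim the journal of a running MDS
    ceph daemon mds.a flush journal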

11/26/2014

10:26 PM Bug #9997: test_client_pin case is failing
For 3.18+ kernels, I think we can iterate over all dir inodes and invalidate dentries one by one. Zheng Yan
12:19 AM Bug #9997: test_client_pin case is failing
Yes, I think it's caused by the d_invalidate change. In the 3.18-rc kernel, d_invalidate() unhashes the dentry regardless of whether the... Zheng Yan

11/25/2014

09:27 AM Fix #10135 (Fix Under Review): OSDMonitor: allow adding cache pools to cephfs pools already in use
https://github.com/ceph/ceph/pull/3008 John Spray
04:42 AM Bug #9997: test_client_pin case is failing
After much head scratching and log examination, this appears to be a kernel regression (assuming our behaviour was va... John Spray

11/24/2014

05:19 PM Bug #10151 (Pending Backport): mds client cache pressure health warning oscillates on/off
Merged to master as of commit:aa4d1478647ce416e9cf4e8fcd32411230639f40. I like to let things go through testing befor... Greg Farnum
09:20 AM Bug #10151: mds client cache pressure health warning oscillates on/off
Opened PR against master instead of next by mistake. Next PR is https://github.com/ceph/ceph/pull/2996 John Spray
03:16 AM Bug #10151 (Fix Under Review): mds client cache pressure health warning oscillates on/off
master: https://github.com/ceph/ceph/pull/2989
giant: https://github.com/ceph/ceph/pull/2990
John Spray
09:07 AM Bug #9997 (In Progress): test_client_pin case is failing
John Spray

11/23/2014

08:30 PM Bug #9997: test_client_pin case is failing
http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-16_23:04:01-fs-next-testing-basic-multi/603971/ Greg Farnum

11/21/2014

12:34 PM Bug #9674 (Resolved): nightly failed multiple_rsync.sh
I haven't seen this fail since then, hurray. Greg Farnum
06:39 AM Bug #10151 (In Progress): mds client cache pressure health warning oscillates on/off
Reproduced this locally by just allowing 3 mons in a vstart cluster and following the procedure from the mds_client_l... John Spray
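A vstart cluster of that shape can be brought up from a source tree roughly like so (daemon counts as described in the comment):

    # from the src directory: 3 mons, 1 mds, new cluster
    MON=3 OSD=3 MDS=1 ./vstart.sh -n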
12:50 AM Bug #10151: mds client cache pressure health warning oscillates on/off
Yes -- the leader is reporting the health warning but the peons are not.
The warning is "Client 2922132 failing to...
John Spray
06:34 AM Fix #10135: OSDMonitor: allow adding cache pools to cephfs pools already in use
Yeah, we didn't think about this first time around because the focus was on cache tiers to EC pools, but it would mak... John Spray
03:58 AM Bug #10164: Dirfrag objects for deleted dir not purged until MDS restart
Alternatively, a less contrived way to see the issue: just do a loop of "cp -r /etc . ; rm -rf ./etc" in a filesystem mo... John Spray
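That reproducer as a one-liner, run from a CephFS mountpoint:

    # each iteration leaves behind dirfrag objects that are never purged
    while true; do cp -r /etc . && rm -rf ./etc; done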
03:14 AM Bug #10164 (Resolved): Dirfrag objects for deleted dir not purged until MDS restart

Seen while playing with the #9881 flush functionality: the dirfrag objects for deleted directories are never cleane...
John Spray

11/20/2014

09:59 AM Bug #10151 (Resolved): mds client cache pressure health warning oscillates on/off
Seeing this on the lab cluster. Not sure if it is a problem in the mds health reporting or the mon, but it goes on and o... Sage Weil