Activity
From 01/07/2017 to 02/05/2017
02/05/2017
- 05:04 AM Feature #10792 (Fix Under Review): qa: enable thrasher for MDS cluster size (vary max_mds)
- PR: https://github.com/ceph/ceph/pull/13262
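For context, a hedged sketch of the kind of max_mds variation such a thrasher exercises (the filesystem name "cephfs" and the explicit deactivate step are assumptions, not taken from the ticket):
# Jewel/Kraken may first require: ceph fs set cephfs allow_multimds true
ceph fs set cephfs max_mds 2    # grow the active MDS cluster to two ranks
ceph fs set cephfs max_mds 1    # lower the target size again
ceph mds deactivate 1           # pre-Luminous: explicitly stop rank 1 so the cluster shrinks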
- 01:36 AM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
- This exact error popped up in a branch of mine (http://pulpito.ceph.com/gregf-2017-02-04_03:30:50-fs-wip-17594---basi...
02/04/2017
- 07:12 AM Bug #18816: MDS crashes with log disabled
- test
- 06:51 AM Bug #18816 (Resolved): MDS crashes with log disabled
- Note the "mds_log = false" below. If you do that, this happens:
Have crashed the MDS daemon while executing different...
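For reference, a minimal ceph.conf fragment reproducing the reported condition (placing the option in the [mds] section is an assumption; the ticket only quotes the option itself):
[mds]
    mds_log = false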
02/03/2017
- 05:30 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Cache dump is here:
https://knowledgenetworkbc-my.sharepoint.com/personal/darrelle_knowledge_ca/_layouts/15/guesta...
- 07:24 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Does the mds cache dump need to be done while it is hung? It's a production system, so I wasn't able to leave it in a...
- 06:57 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61cy
- 01:40 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- please run 'ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0' and upload /tmp/cachedump.0. Besides, please check...
- 12:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- Comparing logs I noticed that MDS clock is ~30s behind client. ntpd was dead on one of test servers... Will try to co...
- 10:17 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- Actually, one more thing: following this crash, the clients did not fail over to the standby MDS. Processes accessing...
- 09:57 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- Excellent! Sorry that my search of the tracker didn't find that issue.
We'll apply that backport when it's ready.
- 09:50 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- duplicate of http://tracker.ceph.com/issues/18578. The fix is pending backport
- 09:29 AM Bug #18802 (New): Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc:...
- A user just did:...
02/02/2017
- 07:13 PM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
- I've had two occurrences in the past 3 weeks where filesystem activity hangs, with the MDS reporting a client "failing t...
- 05:15 PM Bug #18797: valgrind jobs hanging in fs suite
- They're going dead during teardown when teuthology tries to list cephtest/ but something went wrong much earlier with...
- 05:08 PM Bug #18797: valgrind jobs hanging in fs suite
- Ignore the last passing link above, that was actually a hammer run (which shows up as fs-master for some reason).
...
- 04:55 PM Bug #18797 (Duplicate): valgrind jobs hanging in fs suite
First one to show the issue was:
http://pulpito.ceph.com/teuthology-2017-01-28_17:15:01-fs-master---basic-smithi
...
- 11:12 AM Bug #18754 (Fix Under Review): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
- https://github.com/ceph/ceph/pull/13227/commits/3b899fb0c6153c30ad6ef782499f79ba5c7b2a22
- 09:53 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- forgot to add a timeline:
08:08:29 mounted
08:09:54 iptables up
08:10:50 ls stuck
08:16:02 iptables down
08:17:5...
- 08:25 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- Attaching client and MDS logs. This time I was mounting from another server and firewalling both input and output to ...
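A rough sketch of the reproduction described in the timeline above (mount point, MDS address, and the exact iptables rules are assumptions; the reporter only states that MDS traffic was firewalled):
ceph-fuse /mnt/cephfs                        # mount
iptables -A OUTPUT -d 192.0.2.10 -j DROP     # "iptables up": cut traffic to the MDS host
iptables -A INPUT  -s 192.0.2.10 -j DROP
ls /mnt/cephfs                               # hangs while the MDS is unreachable
iptables -D OUTPUT -d 192.0.2.10 -j DROP     # "iptables down": restore connectivity
iptables -D INPUT  -s 192.0.2.10 -j DROP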
- 04:10 AM Bug #18755 (Fix Under Review): multimds: MDCache.cc: 4735: FAILED assert(in)
- https://github.com/ceph/ceph/pull/13227/commits/097dc86b330392c4aa89f0033ce6e40528481857
02/01/2017
- 11:14 PM Bug #18757 (New): Jewel ceph-fuse does not recover after lost connection to MDS
- Hmm, now that I actually read the log (like a reasonable person :-)) it is a little bit strange that the server is se...
- 10:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- Yes, I am reproducing it artificially after a short network blip caused permanent mount point hangs on multiple servers...
- 07:59 PM Bug #18757 (Rejected): Jewel ceph-fuse does not recover after lost connection to MDS
- Clients which lose connectivity to the MDS are evicted after a timeout given by the "mds session timeout" setting. E...
- 12:26 PM Bug #18757 (Resolved): Jewel ceph-fuse does not recover after lost connection to MDS
- After ceph-fuse loses connection to MDS for a few minutes, it does not recover - accessing the mountpoint hangs processes.
...
- 06:53 PM Feature #18763 (New): Ganesha instances should restart themselves when blacklisted
If I evict the client session that a ganesha instance was using, then the libcephfs instance will start to see sess...
- 02:25 PM Bug #18759 (Fix Under Review): multimds suite tries to run norstats tests against kclient
- https://github.com/ceph/ceph/pull/13089
- 01:11 PM Bug #18759 (Resolved): multimds suite tries to run norstats tests against kclient
You get failures on the workunit like:
jspray-2017-02-01_02:51:18-multimds-wip-jcsp-testing-20170201-testing-basic...
- 03:33 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- 03:32 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- ...
- 01:11 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- There are actually 3 assertion failures (on 3 different MDS) in this run!...
- 01:07 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
- ...
- 02:37 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
- 02:19 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
- ceph-mds.a.log...
- 12:55 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
- ...
01/31/2017
- 01:34 PM Bug #17801 (Resolved): Cleanly reject "session evict" command when in replay
- 01:34 PM Backport #18010 (Resolved): jewel: Cleanly reject "session evict" command when in replay
- 01:25 PM Bug #17193: truncate can cause unflushed snapshot data lose
- Re-enable test: https://github.com/ceph/ceph/pull/13200
- 01:17 PM Bug #17193 (Resolved): truncate can cause unflushed snapshot data lose
- 01:17 PM Backport #18103 (Resolved): jewel: truncate can cause unflushed snapshot data lose
- 01:16 PM Bug #18408 (Resolved): lookup of /.. in jewel returns -ENOENT
- 01:16 PM Backport #18413 (Resolved): jewel: lookup of /.. in jewel returns -ENOENT
- 01:15 PM Bug #18519 (Resolved): speed up readdir by skipping unwanted dn
- 01:15 PM Backport #18520 (Resolved): jewel: speed up readdir by skipping unwanted dn
- 01:14 PM Backport #18565 (Resolved): jewel: MDS crashes on missing metadata object
- 01:14 PM Backport #18551 (Resolved): jewel: ceph-fuse crash during snapshot tests
- 01:12 PM Backport #18282 (Resolved): jewel: monitor cannot start because of "FAILED assert(info.state == M...
- 01:12 PM Bug #18086 (Resolved): cephfs: fix missing ll_get for ll_walk
- 01:11 PM Backport #18195 (Resolved): jewel: cephfs: fix missing ll_get for ll_walk
- 01:11 PM Bug #17954 (Resolved): standby-replay daemons can sometimes miss events
- 01:10 PM Backport #18192 (Resolved): jewel: standby-replay daemons can sometimes miss events
- 01:01 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Current status of lab cluster is:
* Fixed the "missing dirfrag object" damage with a script that removed the offe... - 11:59 AM Bug #18743 (Resolved): Scrub considers dirty backtraces to be damaged, puts in damage table even ...
Two things are wrong here:
* When running scrub_path /teuthology-archive on the lab filesystem, I get a flurry ...
- 07:40 AM Bug #9935 (Resolved): client: segfault on ceph_rmdir path "/"
- 07:39 AM Backport #18611 (Resolved): jewel: client: segfault on ceph_rmdir path "/"
01/30/2017
- 07:28 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Would that have caused ESTALE?
- 01:41 PM Bug #18730 (Closed): mds: backtrace issues getxattr for every file with cap on rejoin
In Server::handle_client_reconnect, the inode numbers that had client caps but were not in cache are passed into MDCa...
01/28/2017
- 11:17 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Dan Mick wrote:
> [...]
teuthology-2016-12-18_02:01:14-rbd-master-distro-basic-smithi is not in root directory,...
- 04:37 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Tried them again tonight after repairing the broken stray object, and they worked this time. <shrug>
I guess the ...
01/27/2017
- 10:55 PM Bug #17236 (Resolved): MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) ...
- 10:54 PM Bug #17832 (Resolved): "[ FAILED ] LibCephFS.InterProcessLocking" in jewel v10.2.4
- 10:53 PM Bug #17270 (Resolved): [cephfs] fuse client crash when adding a new osd
- 10:53 PM Bug #17275 (Resolved): MDS long-time blocked ops. ceph-fuse locks up with getattr of file
- 10:52 PM Bug #17611 (Resolved): mds: false "failing to respond to cache pressure" warning
- 10:51 PM Bug #17716 (Resolved): MDS: false "failing to respond to cache pressure" warning
- 10:51 PM Bug #18131 (Resolved): ceph-fuse not clearing setuid/setgid bits on chown
- 10:48 PM Bug #18361 (Resolved): Test failure: test_session_reject (tasks.cephfs.test_sessionmap.TestSessio...
- 06:56 PM Bug #18717 (Resolved): multimds: FAILED assert(0 == "got export_cancel in weird state")
- ...
- 06:31 PM Backport #18708 (Resolved): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_none...
- https://github.com/ceph/ceph/pull/13459
- 06:31 PM Backport #18707 (Resolved): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_non...
- https://github.com/ceph/ceph/pull/13555
- 06:31 PM Backport #18706 (Resolved): kraken: fragment space check can cause replayed request fail
- https://github.com/ceph/ceph/pull/14568
- 06:31 PM Backport #18705 (Resolved): jewel: fragment space check can cause replayed request fail
- https://github.com/ceph/ceph/pull/14668
- 04:44 PM Backport #18700: kraken: client: fix the cross-quota rename boundary check conditions
- master one: https://github.com/ceph/ceph/pull/12489
- 04:34 PM Backport #18700 (Resolved): kraken: client: fix the cross-quota rename boundary check conditions
- https://github.com/ceph/ceph/pull/14567
- 04:40 PM Bug #18660 (Pending Backport): fragment space check can cause replayed request fail
- 04:38 PM Bug #11124 (Resolved): MDSMonitor: refuse to do "fs new" on metadata pools containing objects
- 04:37 PM Bug #18578 (Pending Backport): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
- 04:37 PM Backport #18699: jewel: client: fix the cross-quota rename boundary check conditions
- master one: https://github.com/ceph/ceph/pull/12489
- 04:34 PM Backport #18699 (Resolved): jewel: client: fix the cross-quota rename boundary check conditions
- https://github.com/ceph/ceph/pull/14667
- 01:18 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- They all return ESTALE. Not sure what else I need to be doing
- 01:07 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- ...
- 01:04 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Duh, ls was fine:
ls -ld * | sort -n -k 5
drwxrwxr-x 1 1001 1001 18446744057908416832 Jan 23 02:23 teuthology-201...
- 01:00 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- I would have sworn Greg directed me to try that, but perhaps we didn't include 'force'. Shrug. Thanks for the help....
01/26/2017
- 11:28 PM Bug #18691 (New): multimds thrash: FAILED assert(_head.empty())
- ...
- 05:30 PM Backport #18100 (In Progress): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jewel ...
- Here it is: https://github.com/ceph/ceph/pull/13139
- 09:18 AM Backport #18100 (Need More Info): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jew...
- @John - In http://tracker.ceph.com/issues/17837#note-18 you mentioned that you backported this to jewel. If you still...
- 04:57 PM Bug #18661 (Fix Under Review): Test failure: test_open_inode
- https://github.com/ceph/ceph/pull/13137
- 04:16 PM Bug #18661: Test failure: test_open_inode
- You're right, that piece of test code is racy :-/ Need to get the remote python code running inside open_background ...
- 09:14 AM Bug #18661: Test failure: test_open_inode
- 08:32 AM Bug #18661: Test failure: test_open_inode
- ...
- 09:23 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- 07:13 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- "ceph daemon mds.mira049 scrub_path / repair recursive force" will find and fix any other issue. But it will take ver...
- 07:06 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Fixed by:
ceph daemon mds.mira049 scrub_path /teuthology-archive/sage-2016-11-12_02:26:45-rados-wip-sage-testing--...
- 09:09 AM Backport #18192 (In Progress): jewel: standby-replay daemons can sometimes miss events
- 09:08 AM Backport #18195 (In Progress): jewel: cephfs: fix missing ll_get for ll_walk
- 09:03 AM Bug #18675 (Fix Under Review): client: during multimds thrashing FAILED assert(session->requests....
- https://github.com/ceph/ceph/pull/13124
- 09:03 AM Backport #18282 (In Progress): jewel: monitor cannot start because of "FAILED assert(info.state =...
- 06:28 AM Backport #18652 (Resolved): jewel: Test failure: test_session_reject (tasks.cephfs.test_sessionma...
- 05:49 AM Bug #18680 (Resolved): multimds: cluster can assign active mds beyond max_mds during failures
- From: http://pulpito.ceph.com/pdonnell-2017-01-25_22:42:21-multimds:thrash-wip-multimds-tests-testing-basic-mira/7483...
- 04:18 AM Backport #18551 (In Progress): jewel: ceph-fuse crash during snapshot tests
- 04:12 AM Backport #18565 (In Progress): jewel: MDS crashes on missing metadata object
01/25/2017
- 11:52 PM Backport #18678: kraken: failed to reconnect caps during snapshot tests
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-07_17:15:02-fs-master---basic-smithi/698957/
- 11:20 PM Backport #18678 (In Progress): kraken: failed to reconnect caps during snapshot tests
- https://github.com/ceph/ceph/pull/13112
- 11:18 PM Backport #18678 (Resolved): kraken: failed to reconnect caps during snapshot tests
- https://github.com/ceph/ceph/pull/13112
- 11:52 PM Backport #18679: jewel: failed to reconnect caps during snapshot tests
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-07_17:15:02-fs-master---basic-smithi/698957/
- 11:23 PM Backport #18679 (In Progress): jewel: failed to reconnect caps during snapshot tests
- https://github.com/ceph/ceph/pull/13113
- 11:22 PM Backport #18679 (Resolved): jewel: failed to reconnect caps during snapshot tests
- https://github.com/ceph/ceph/pull/13113
- 11:35 PM Bug #18574 (Resolved): cephfs test failures (ceph.com/qa is broken, should be download.ceph.com/qa)
- 11:34 PM Backport #18604 (Resolved): kraken: cephfs test failures (ceph.com/qa is broken, should be downlo...
- 11:32 PM Bug #18309 (Resolved): TestVolumeClient.test_evict_client failure creating pidfile
- 11:32 PM Backport #18439 (Resolved): kraken: TestVolumeClient.test_evict_client failure creating pidfile
- 11:31 PM Backport #18540 (Resolved): kraken: Test failure: test_session_reject (tasks.cephfs.test_sessionm...
- 11:29 PM Backport #18612 (Resolved): kraken: client: segfault on ceph_rmdir path "/"
- 11:28 PM Backport #18531 (Resolved): kraken: speed up readdir by skipping unwanted dn
- 11:27 PM Bug #18311 (Resolved): Decode errors on backtrace will crash MDS
- 11:26 PM Backport #18463 (Resolved): kraken: Decode errors on backtrace will crash MDS
- 10:11 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- I don't know how to repair this or even identify other instances.
- 06:57 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
- I'll plan to leave this open for the next week or two and we can see if any failures crop up between now and then. If...
- 05:21 PM Bug #18675 (Resolved): client: during multimds thrashing FAILED assert(session->requests.empty())
- ...
- 05:07 PM Backport #17705: jewel: ceph_volume_client: recovery of partial auth update is broken
- The test commits from https://github.com/ceph/ceph-qa-suite/pull/1221 were moved into the main ceph/ceph.git PR after...
- 02:04 PM Backport #17705 (Resolved): jewel: ceph_volume_client: recovery of partial auth update is broken
- 03:49 PM Bug #18670 (Duplicate): cephfs get strange permission deny when playing with git.
- Aha, yes, removing the caps works.
Thanks for pointing it out so quickly :)
- 03:24 PM Bug #18670: cephfs get strange permission deny when playing with git.
- It is likely that you are hitting http://tracker.ceph.com/issues/17858
Try modifying your client's auth caps to re...
- 03:12 PM Bug #18670 (Duplicate): cephfs get strange permission deny when playing with git.
- Can reproduce it reliably, via git pull <any_project>. The failed file changed between each attempt, but always...
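For the record, a hedged example of the caps change suggested above (client name and pool are placeholders; #17858 concerns overly restrictive "path=" MDS caps, so the sketch simply drops that restriction):
ceph auth get client.git                        # inspect the current caps
ceph auth caps client.git \
    mon 'allow r' \
    mds 'allow rw' \
    osd 'allow rw pool=cephfs_data'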
- 02:16 PM Bug #18646 (Fix Under Review): mds: rejoin_import_cap FAILED assert(session)
- 02:16 PM Bug #18646: mds: rejoin_import_cap FAILED assert(session)
- https://github.com/ceph/ceph/pull/12974/commits/eef9e568dcccc0bb84322527c45028d3dc275c6b
- 02:06 PM Backport #17974 (Resolved): jewel: ceph/Client segfaults in handle_mds_map when switching mds
- 02:06 PM Bug #17858 (Resolved): Cannot create deep directories when caps contain "path=/somepath"
- 02:05 PM Backport #18008 (Resolved): jewel: Cannot create deep directories when caps contain "path=/somepath"
- 02:05 PM Bug #17216 (Resolved): ceph_volume_client: recovery of partial auth update is broken
- 02:04 PM Backport #18615 (Resolved): jewel: segfault in handle_client_caps
- 02:04 PM Bug #17800 (Resolved): ceph_volume_client.py : Error: Can't handle arrays of non-strings
- 02:03 PM Backport #18026 (Resolved): jewel: ceph_volume_client.py : Error: Can't handle arrays of non-strings
- 02:03 PM Bug #17798 (Resolved): Clients without pool-changing caps shouldn't be allowed to change pool_nam...
- 02:03 PM Backport #17956 (Resolved): jewel: Clients without pool-changing caps shouldn't be allowed to cha...
- 02:02 PM Backport #18603 (Resolved): jewel: cephfs test failures (ceph.com/qa is broken, should be downloa...
- 01:59 PM Backport #18462 (Resolved): jewel: Decode errors on backtrace will crash MDS
- 12:32 PM Bug #18663 (Fix Under Review): teuthology teardown hangs if kclient umount fails
- https://github.com/ceph/ceph/pull/13099
- 12:09 PM Bug #18663 (Resolved): teuthology teardown hangs if kclient umount fails
http://qa-proxy.ceph.com/teuthology/jspray-2017-01-25_02:52:36-multimds-wip-jcsp-testing-20170124-testing-basic-smi...
- 10:53 AM Bug #18662: TestClientLimits hang on teuthology teardown
- ...
- 08:53 AM Bug #18662 (New): TestClientLimits hang on teuthology teardown
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-16_17:25:01-kcephfs-master-testing-basic-mira/722735/teutholog...
- 08:43 AM Bug #18661 (Resolved): Test failure: test_open_inode
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-15_10:10:01-fs-jewel---basic-smithi/719557/
- 07:01 AM Bug #18660 (Fix Under Review): fragment space check can cause replayed request fail
- https://github.com/ceph/ceph/pull/13095
- 06:57 AM Bug #18660 (Resolved): fragment space check can cause replayed request fail
01/24/2017
- 11:36 PM Bug #18600 (Fix Under Review): multimds suite tries to run quota tests against kclient, fails
- https://github.com/ceph/ceph/pull/13089
- 10:09 PM Bug #16397 (Fix Under Review): nfsd selinux denials causing knfs tests to fail
- Patch to unpin from ubuntu, waiting for that test to go through https://github.com/ceph/ceph/pull/13088
- 10:04 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
- ...
- 09:41 PM Feature #16219: test: smallfile benchmark tool
- Just saw this, I don't know teuthology but I wrote smallfile. Ben Turner in QE has automated Gluster performance reg...
- 04:14 PM Backport #18652 (In Progress): jewel: Test failure: test_session_reject (tasks.cephfs.test_sessio...
- 02:21 PM Backport #18652 (Fix Under Review): jewel: Test failure: test_session_reject (tasks.cephfs.test_s...
- https://github.com/ceph/ceph/pull/13085
- 02:18 PM Backport #18652 (Resolved): jewel: Test failure: test_session_reject (tasks.cephfs.test_sessionma...
- https://github.com/ceph/ceph/pull/13085
- 02:18 PM Bug #18361: Test failure: test_session_reject (tasks.cephfs.test_sessionmap.TestSessionMap)
- Oops, this needed backporting to jewel as well.
- 01:28 PM Bug #18646: mds: rejoin_import_cap FAILED assert(session)
- I've been thinking a bit about how we handle eviction in the multimds case, and whether we perhaps ought to centraliz...
- 06:40 AM Bug #18646: mds: rejoin_import_cap FAILED assert(session)
- Yes, it's reasonable. If the client did not close the session voluntarily, it's likely the session was killed (due to tim...
- 04:43 AM Bug #18646 (Resolved): mds: rejoin_import_cap FAILED assert(session)
- ...
01/23/2017
- 09:25 PM Bug #18641 (Can't reproduce): mds: stalled clients apparently due to stale sessions
- 4/16 clients building the kernel with 2 active MDS blocked on IO. After digging into the ceph-fuse log, I found that ...
- 05:35 PM Backport #18615: jewel: segfault in handle_client_caps
- Thanks for the backport, Alexey!
- 05:34 PM Backport #18615 (In Progress): jewel: segfault in handle_client_caps
- 08:44 AM Backport #18615 (Fix Under Review): jewel: segfault in handle_client_caps
- 08:43 AM Backport #18615: jewel: segfault in handle_client_caps
- https://github.com/ceph/ceph/pull/13060
- 02:15 PM Bug #18600 (In Progress): multimds suite tries to run quota tests against kclient, fails
- 02:13 PM Feature #18537: libcephfs cache invalidation upcalls
- iirc, we still have the file handle cache, even if we disable caching in general, right?
Is it the case that we wo...
- 11:54 AM Backport #18602 (Resolved): hammer: cephfs test failures (ceph.com/qa is broken, should be downlo...
- 09:58 AM Bug #17370: knfs ffsb hang on master
- new one http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-21_17:35:01-knfs-master-testing-basic-mira/736357/
...
- 06:00 AM Bug #17594 (In Progress): cephfs: permission checking not working (MDS should enforce POSIX permi...
01/22/2017
- 06:20 AM Bug #16768 (Fix Under Review): multimds: check_rstat assertion failure
- https://github.com/ceph/ceph/pull/13052
01/20/2017
- 06:26 PM Bug #17531 (Resolved): mds fails to respawn if executable has changed
- 06:25 PM Bug #17670 (Resolved): multimds: mds entering up:replay and processing down mds aborts
- 06:24 PM Bug #17518 (Resolved): monitor assertion failure when deactivating mds in (invalid) fscid 0
- 05:36 PM Backport #18612 (In Progress): kraken: client: segfault on ceph_rmdir path "/"
- 03:43 PM Backport #18612 (Resolved): kraken: client: segfault on ceph_rmdir path "/"
- https://github.com/ceph/ceph/pull/13030
- 05:16 PM Backport #18611 (In Progress): jewel: client: segfault on ceph_rmdir path "/"
- 03:43 PM Backport #18611 (Resolved): jewel: client: segfault on ceph_rmdir path "/"
- https://github.com/ceph/ceph/pull/13029
- 05:03 PM Backport #18531 (In Progress): kraken: speed up readdir by skipping unwanted dn
- 05:02 PM Backport #18531: kraken: speed up readdir by skipping unwanted dn
- deleted description
- 03:42 PM Backport #18531 (New): kraken: speed up readdir by skipping unwanted dn
- 04:12 PM Backport #18604 (In Progress): kraken: cephfs test failures (ceph.com/qa is broken, should be dow...
- 03:41 PM Backport #18604 (Resolved): kraken: cephfs test failures (ceph.com/qa is broken, should be downlo...
- https://github.com/ceph/ceph/pull/13024
- 04:03 PM Backport #18603 (In Progress): jewel: cephfs test failures (ceph.com/qa is broken, should be down...
- 03:41 PM Backport #18603 (Resolved): jewel: cephfs test failures (ceph.com/qa is broken, should be downloa...
- https://github.com/ceph/ceph/pull/13023
- 03:52 PM Backport #18602 (In Progress): hammer: cephfs test failures (ceph.com/qa is broken, should be dow...
- 03:41 PM Backport #18602 (Resolved): hammer: cephfs test failures (ceph.com/qa is broken, should be downlo...
- https://github.com/ceph/ceph/pull/13022
- 03:45 PM Backport #18616 (Resolved): kraken: segfault in handle_client_caps
- https://github.com/ceph/ceph/pull/14566
- 03:45 PM Backport #18615 (Resolved): jewel: segfault in handle_client_caps
- https://github.com/ceph/ceph/pull/13060
- 03:12 PM Bug #18600 (Resolved): multimds suite tries to run quota tests against kclient, fails
- http://pulpito.ceph.com/jspray-2017-01-19_17:59:40-multimds-wip-jcsp-testing-20170119b-testing-basic-smithi/731274/
... - 02:29 PM Bug #18487 (Resolved): Crash in MDCache::split_dir -- FAILED assert(dir->is_auth())
- 02:24 PM Bug #16768: multimds: check_rstat assertion failure
- Hmm, well it didn't grab the logs for some reason but I did get the crashing MDS's log before the test tore down. It...
- 12:27 PM Bug #16768: multimds: check_rstat assertion failure
- I noticed that failure while the test was still stuck trying to unmount the kernel client, so I went in and killed th...
- 12:19 PM Bug #16768: multimds: check_rstat assertion failure
- Another instance:
http://qa-proxy.ceph.com/teuthology/jspray-2017-01-19_17:59:40-multimds-wip-jcsp-testing-20170119b...
- 12:29 PM Bug #8090 (Duplicate): multimds: mds crash in check_rstats
- This is very similar to #16768 (which is happening in more recent runs), so I'm going to close this.
01/19/2017
- 08:16 AM Bug #9935 (Pending Backport): client: segfault on ceph_rmdir path "/"
- 08:10 AM Bug #8808 (Resolved): multimds: stale NFS file handle on delete
- https://github.com/ceph/ceph/commit/530bd6e67a0842733edff7ac036f18ed990788f9
- 07:52 AM Bug #4829: client: handling part of MClientForward incorrectly?
- I can't see why we shouldn't encode cap releases after receiving MClientForward. The old releases are for the old MDS,...
- 07:31 AM Bug #18487 (Fix Under Review): Crash in MDCache::split_dir -- FAILED assert(dir->is_auth())
- https://github.com/ceph/ceph/pull/12994
- 05:18 AM Bug #18589 (Duplicate): ceph_volume_client.py doesn't create enough mds caps
- Assuming that you encountered this issue with the kernel client, the bug was http://tracker.ceph.com/issues/17191, which wa...
- 02:12 AM Bug #18578 (Fix Under Review): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
- https://github.com/ceph/ceph/pull/12973
01/18/2017
- 06:37 PM Bug #18589: ceph_volume_client.py doesn't create enough mds caps
- This report applies to mounting with the kernel, not ceph-fuse, right? I think that makes this a kernel issue where i...
- 06:02 PM Bug #18589: ceph_volume_client.py doesn't create enough mds caps
- Pull request at https://github.com/ceph/ceph/pull/12985
- 05:48 PM Bug #18589 (Duplicate): ceph_volume_client.py doesn't create enough mds caps
- In _authorize_ceph() at https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L1032, the caps is ...
- 02:47 PM Bug #18579: Fuse client has "opening" session to nonexistent MDS rank after MDS cluster shrink
- I'll take this one.
- 09:23 AM Bug #18579: Fuse client has "opening" session to nonexistent MDS rank after MDS cluster shrink
- Logs in /home/jspray/18579 (should be world readable) on teuthology
- 08:58 AM Bug #18579 (Resolved): Fuse client has "opening" session to nonexistent MDS rank after MDS cluste...
- ...
- 08:30 AM Bug #18578 (Resolved): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
- A user said he encountered this assertion while running vdbench
01/17/2017
- 05:38 PM Bug #18574 (Pending Backport): cephfs test failures (ceph.com/qa is broken, should be download.ce...
- 04:16 PM Bug #18574 (Fix Under Review): cephfs test failures (ceph.com/qa is broken, should be download.ce...
- https://github.com/ceph/ceph/pull/12964
- 04:14 PM Bug #18574 (Resolved): cephfs test failures (ceph.com/qa is broken, should be download.ceph.com/qa)
- Like http://tracker.ceph.com/issues/18542 but for the remaining references to ceph.com.
- 03:13 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- maybe there is a bad remote link in the directory
- 11:13 AM Bug #16842: mds: replacement MDS crashes on InoTable release
- Hmm, this one got lost somehow; targeting it to Luminous so that it at least gets looked at.
- 08:35 AM Backport #18566 (Resolved): kraken: MDS crashes on missing metadata object
- https://github.com/ceph/ceph/pull/14565
- 08:35 AM Backport #18565 (Resolved): jewel: MDS crashes on missing metadata object
- https://github.com/ceph/ceph/pull/13119
- 08:35 AM Backport #18562 (Resolved): kraken: Test Failure: kcephfs test_client_recovery.TestClientRecovery
- https://github.com/ceph/ceph/pull/14564
- 08:34 AM Backport #18552 (Resolved): kraken: ceph-fuse crash during snapshot tests
- https://github.com/ceph/ceph/pull/14563
- 08:34 AM Backport #18551 (Resolved): jewel: ceph-fuse crash during snapshot tests
- https://github.com/ceph/ceph/pull/13120
01/16/2017
- 09:45 PM Backport #18540 (Fix Under Review): kraken: Test failure: test_session_reject (tasks.cephfs.test_...
- 09:45 PM Backport #18540 (Resolved): kraken: Test failure: test_session_reject (tasks.cephfs.test_sessionm...
- https://github.com/ceph/ceph/pull/12951
- 03:09 PM Bug #18363 (Can't reproduce): Test failure: test_ops_throttle (tasks.cephfs.test_strays.TestStrays)
- This appears to have been something intermittent, and the deletion code is going to change anyway so I'm going to ski...
- 03:08 PM Bug #18362 (Duplicate): Test failure: test_evict_client (tasks.cephfs.test_volume_client.TestVolu...
- 02:18 PM Feature #18537: libcephfs cache invalidation upcalls
- My intent in this experiment was, yes, that upcalls be synchronous. To be brief I'd say that my understanding was, if...
- 12:42 PM Feature #18537: libcephfs cache invalidation upcalls
- This really seems like the wrong approach to me. Is this callback going to be synchronous or async? Imagine you get a...
- 12:24 PM Feature #18537 (Rejected): libcephfs cache invalidation upcalls
- Matt Benjamin did some work in this area:
https://github.com/linuxbox2/nfs-ganesha/tree/ceph-invalidates
https://...
- 12:24 PM Feature #18490: client: implement delegation support in userland cephfs
- Ah, I think I had (incorrectly) assumed that the work Matt did on invalidations before had been merged, but if that's...
- 12:19 PM Bug #18530: ceph tell mds prints warning about ms_handle_reset
- I think the OSD does do roughly the same sort of thing (hidden inside Objecter), so it might be instructive to look a...
- 12:15 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
- Without having looked into this in detail yet, my presumption would be that the bug is that the repair code isn't fix...
01/15/2017
- 03:39 PM Feature #17855 (Fix Under Review): Don't evict a slow client if it's the only client
- https://github.com/ceph/ceph/pull/12935
01/14/2017
- 01:33 AM Bug #18532 (New): mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirs...
- Somehow a path in the long-running cluster got a corrupted number of files/subdirs, and responds to "rm -rf" with "ca...
- 12:42 AM Backport #18531 (Resolved): kraken: speed up readdir by skipping unwanted dn
- https://github.com/ceph/ceph/pull/13028
- 12:28 AM Feature #18514: qa: don't use a node for each kclient
- Probably, though I'm not really sure. I just found https://github.com/ceph/ceph-qa-suite/pull/1156/ in the depths of ...
- 12:26 AM Bug #17193: truncate can cause unflushed snapshot data lose
- So do we think this is fixed or not? Need to undo https://github.com/ceph/ceph-qa-suite/pull/1156/commits/5f1abf9c310...
- 12:00 AM Bug #18530 (New): ceph tell mds prints warning about ms_handle_reset
- Client::ms_handle_reset logs at level 0, and ceph tell mds seems to always print two of those log messages. I *think...
01/13/2017
- 04:00 PM Feature #18490: client: implement delegation support in userland cephfs
- Matt B. also had some upcall/invalidate work that may be relevant here that he has in these branches:
https://gith...
- 01:39 PM Backport #18520 (In Progress): jewel: speed up readdir by skipping unwanted dn
- 01:30 PM Backport #18520 (Resolved): jewel: speed up readdir by skipping unwanted dn
- https://github.com/ceph/ceph/pull/12921
- 01:29 PM Bug #18519: speed up readdir by skipping unwanted dn
- https://github.com/ceph/ceph/pull/12870
- 01:29 PM Bug #18519 (Resolved): speed up readdir by skipping unwanted dn
- we hit an MDS CPU bottleneck (100% on one core, as it is single-threaded) in our CephFS production environment.
Troublesh...
- 11:46 AM Feature #18514: qa: don't use a node for each kclient
- What's the proposed solution here? To isolate the tests that require killing mounts in a directory with a different ...
- 07:07 AM Feature #18514 (Resolved): qa: don't use a node for each kclient
- https://github.com/ceph/ceph-qa-suite/pull/1156/commits/c5f6dfc14f47cca251dcac5c53f6369fd36ace1a
Right now each ke...
- 11:24 AM Bug #18396 (Pending Backport): Test Failure: kcephfs test_client_recovery.TestClientRecovery
- 11:23 AM Bug #18306 (Pending Backport): segfault in handle_client_caps
- 11:23 AM Bug #18361 (Pending Backport): Test failure: test_session_reject (tasks.cephfs.test_sessionmap.Te...
- 11:22 AM Bug #18179 (Pending Backport): MDS crashes on missing metadata object
- 11:22 AM Bug #18460 (Pending Backport): ceph-fuse crash during snapshot tests
- 02:59 AM Feature #18513 (Resolved): MDS: scrub: forward scrub reports missing backtraces on new files as d...
- It appears that running a recursive repair scrub_path results in reports of MDS damage on files that are new enough t...
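For reference, the kind of invocation being discussed (the daemon name is a placeholder and the flag order follows the earlier comments on #18532; inspecting the damage table afterwards via the admin socket is an assumption):
ceph daemon mds.<id> scrub_path / repair recursive force   # whole-tree forward scrub with repair
ceph daemon mds.<id> damage ls                              # see what ended up in the damage table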
- 01:09 AM Feature #18509: MDS: damage reporting by ino number is useless
- Path string is certainly the one I was thinking of.
- 12:58 AM Feature #18509: MDS: damage reporting by ino number is useless
The log message reporting the path is still there:...
01/12/2017
- 11:30 PM Feature #18509 (Resolved): MDS: damage reporting by ino number is useless
- We had two damaged directories on the long-running cluster, but examining the directories in question (other than thr...
- 10:54 PM Feature #18490: client: implement delegation support in userland cephfs
- This is basically what we've discussed previously in this area. My main concern is just designing an interface that c...
- 10:51 PM Bug #18461: failed to reconnect caps during snapshot tests
01/11/2017
- 08:07 PM Feature #18490: client: implement delegation support in userland cephfs
- I've created an nfs-ganesha category to match our Samba category.
- 05:52 PM Feature #18490 (Resolved): client: implement delegation support in userland cephfs
- To properly implement NFSv4 delegations in ganesha, we need something that operates a little like Linux's fcntl(..., ...
- 05:49 PM Feature #11950 (Fix Under Review): Strays enqueued for purge cause MDCache to exceed size limit
- https://github.com/ceph/ceph/pull/12786
- 05:00 PM Feature #18489 (New): mds: Multi-MDS-aware dirfrag split/join test
- Similar to the existing dirfrag tests, but do some import/exporting of the resulting fragments so that they span mult...
- 01:04 PM Bug #18487 (Resolved): Crash in MDCache::split_dir -- FAILED assert(dir->is_auth())
- ...
- 12:40 PM Feature #18477: O_TMPFILE support in libcephfs
- Yeah, with Linux' O_TMPFILE you can definitely do I/O to the inode before it's linked, and I think it'd be good to mi...
- 11:49 AM Feature #18477: O_TMPFILE support in libcephfs
- I was assuming that when doing it ephemerally we would not be allowing any data IO operations on the inode until it w...
- 11:50 AM Feature #18483: Forward scrub ops are not in Op Tracker
- See also http://tracker.ceph.com/issues/17852
- 01:22 AM Feature #18483 (New): Forward scrub ops are not in Op Tracker
- We started a forward scrub on the LRC today to look for any busted rstats since we appear to have leaked data somewhe...
01/10/2017
- 10:38 PM Feature #18477: O_TMPFILE support in libcephfs
- I'm pretty skeptical that doing it ephemerally (without initially setting it up as a journaled stray) is a feasible s...
- 08:50 PM Feature #18477: O_TMPFILE support in libcephfs
- The stray would end up getting journaled, probably never written to backing store as long as the link operation came ...
- 07:19 PM Feature #18477: O_TMPFILE support in libcephfs
- I think it makes sense to optimize for the success case here. In most cases, the link will be successful and it'll en...
- 07:12 PM Feature #18477: O_TMPFILE support in libcephfs
- Main decision here is probably whether it should be a stray or some new mechanism.
Strays feel like overkill here ...
- 05:39 PM Feature #18477 (New): O_TMPFILE support in libcephfs
- nfs-ganesha could make use of the ability to create a disconnected inode (pinned only by an open file descriptor) tha...
- 02:40 PM Feature #18475 (Resolved): qa: run xfstests in the nightlies
- We have manually run xfstests against ceph-fuse and kceph before, but apparently don't do so in the nightlies. Jeff r...
- 02:16 PM Support #16526 (Resolved): cephfs client side quotas - nfs-ganesha
- Yep, I think so.
- 01:53 PM Support #16526: cephfs client side quotas - nfs-ganesha
- In a Ganesha (V2.5-dev-6) and Ceph (latest Jewel) setup, I set `client quota = true` in the client section of the cep...
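A minimal sketch of the setup being described (paths and sizes are placeholders; "client quota = true" is the option quoted above, and the quota itself is set through the ceph.quota vxattrs):
# ceph.conf on the libcephfs / Ganesha host
[client]
    client quota = true
# set a quota on a directory from a CephFS mount
setfattr -n ceph.quota.max_bytes -v 100000000 /mnt/cephfs/somedir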
- 09:45 AM Bug #18460 (Fix Under Review): ceph-fuse crash during snapshot tests
- https://github.com/ceph/ceph/pull/12859
- 03:40 AM Bug #18461 (Fix Under Review): failed to reconnect caps during snapshot tests
- 03:40 AM Bug #18461: failed to reconnect caps during snapshot tests
- https://github.com/ceph/ceph/pull/12852
01/09/2017
- 01:58 PM Backport #18462 (In Progress): jewel: Decode errors on backtrace will crash MDS
- 11:21 AM Backport #18462 (Resolved): jewel: Decode errors on backtrace will crash MDS
- https://github.com/ceph/ceph/pull/12836
- 01:57 PM Backport #18463 (In Progress): kraken: Decode errors on backtrace will crash MDS
- 11:21 AM Backport #18463 (Resolved): kraken: Decode errors on backtrace will crash MDS
- https://github.com/ceph/ceph/pull/12835
- 01:02 PM Bug #18396 (Fix Under Review): Test Failure: kcephfs test_client_recovery.TestClientRecovery
- kernel_mount.py does implement force umount
https://github.com/ceph/ceph/pull/12833
- 08:36 AM Bug #18396: Test Failure: kcephfs test_client_recovery.TestClientRecovery
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-05_11:20:01-kcephfs-kraken-testing-basic-smithi/691532/
http:...
- 11:04 AM Bug #18311 (Pending Backport): Decode errors on backtrace will crash MDS
- 08:31 AM Bug #18461 (Resolved): failed to reconnect caps during snapshot tests
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-07_17:15:02-fs-master---basic-smithi/698957/
- 07:47 AM Bug #18460 (Resolved): ceph-fuse crash during snapshot tests
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-05_11:10:02-fs-kraken---basic-smithi/691432/teuthology.log
...
01/08/2017
- 08:04 PM Bug #11124 (Fix Under Review): MDSMonitor: refuse to do "fs new" on metadata pools containing obj...
- https://github.com/ceph/ceph/pull/12825