Activity
From 08/13/2015 to 09/11/2015
09/11/2015
- 04:01 PM Support #13055: Problem with disconnect fuse by mds
- When "transported point is not connected" happens, could you check whether the ceph-fuse process still exists? If not, run ...
- 02:35 PM Support #13055: Problem with disconnect fuse by mds
- And about high load... I've checked the mds nodes for memory overload or some other physical overload - didn't find anythi...
- 02:31 PM Support #13055: Problem with disconnect fuse by mds
- Zheng Yan wrote:
> Can you find any clue in ceph-fuse.log?
There is no ceph-fuse.log on clients...
fuse have same...
- 02:26 PM Support #13055: Problem with disconnect fuse by mds
- dmesg and other client logs are empty...
I have 5 clients mounted to ceph.
3 of them are using more often(ma...
- 02:20 PM Support #13055: Problem with disconnect fuse by mds
- "Transport endpoint is not connected" almost always means that the ceph-fuse client process has gone away. Check dmes...
- 02:20 PM Support #13055: Problem with disconnect fuse by mds
- Zheng Yan,
yes, as I pasted earlier in my config, I have changed it to mds_session_autoclose = 60
- 02:19 PM Support #13055: Problem with disconnect fuse by mds
- Sergey Mir wrote:
> fuse is just disconnecting from ceph and says "transported point is not connected" and thats all...
- 02:14 PM Support #13055: Problem with disconnect fuse by mds
- >> 2015-09-10 13:11:17.068839 mds.0 [INF] closing stale session client.343916 (client_ip):0/8623 after 64.626196
MDS...
- 01:59 PM Support #13055: Problem with disconnect fuse by mds
- fuse is just disconnecting from ceph and says "transported point is not connected" and that's all, no logs or some oth...
- 01:41 PM Support #13055: Problem with disconnect fuse by mds
- We need to know how the system is misbehaving -- not just to see log snippets.
What is the symptom of the problem ...
- 12:59 PM Support #13055: Problem with disconnect fuse by mds
- Need some solution to solve that problem. It drops again...
2015-09-11 15:42:57.450895 7fb118d06700 0 -- (mds00/m...
- 11:09 AM Support #13055: Problem with disconnect fuse by mds
- fuse loses its connection a few times a day, with output in the logs as I pasted in my previous message from a few different log ...
- 10:43 AM Support #13055: Problem with disconnect fuse by mds
- The log snippets you've pasted don't make much sense — you have some entries going back in time, and it looks like th...
- 10:25 AM Support #13055 (Closed): Problem with disconnect fuse by mds
- Hello. Nobody in hex chat (ceph) knows the answer to this, so I'm asking for advice here.
I have ceph version 0.94.3 (95cefea9fd...
- 01:37 PM Backport #13044 (In Progress): LibCephFS.GetPoolId failure
- 09:06 AM Backport #13044 (Resolved): LibCephFS.GetPoolId failure
- https://github.com/ceph/ceph/pull/5887
- 01:19 PM Bug #12971 (In Progress): TestQuotaFull fails
- Looks like this commit might have broken it:...
- 02:03 AM Bug #12806: nfs restart failures
09/10/2015
- 11:16 AM Bug #12209: CephFS should have a complete timeout mechanism to avoid endless waiting or unpredict...
- Yes. I'm also not very comfortable with the patches that I've looked at, but it's been a while (they have been langui...
- 11:08 AM Bug #12209: CephFS should have a complete timeout mechanism to avoid endless waiting or unpredict...
- This all sounds a bit strange. The designed behaviour is to block on slow metadata operations, not time out. How ma...
- 11:05 AM Bug #12971: TestQuotaFull fails
- Greg's theory sounds plausible, will look
09/09/2015
- 12:53 PM Backport #12499 (Resolved): ceph-fuse 0.94.2-1trusty segfaults / aborts
- 12:33 PM Bug #12598 (Pending Backport): LibCephFS.GetPoolId failure
- 12:28 PM Bug #13011 (Duplicate): LibCephFS.ReleaseMounted: FAILED assert(crypto_context != __null)
- 12:23 PM Bug #13011 (Duplicate): LibCephFS.ReleaseMounted: FAILED assert(crypto_context != __null)
- This happened on a hammer branch with one CephFS related backport ( https://github.com/ceph/ceph/commit/256620e37fd94...
- 12:10 PM Bug #12875: LibCephFS.LibCephFS.InterProcessLocking segment fault.
- This might be a consequence of the wip-mds-caps branch, and will get more testing in there.
- 12:02 PM Bug #12971: TestQuotaFull fails
- That's not quite the problem. The MDS calls Objecter::unset_honor_osdmap_full(), at which point the Objecter should l...
09/08/2015
- 01:40 PM Bug #12820 (Resolved): stuck looping on 'ls /sys/fs/fuse/connections'
- 01:39 PM Bug #12875 (Can't reproduce): LibCephFS.LibCephFS.InterProcessLocking segment fault.
- 08:09 AM Bug #12875 (Need More Info): LibCephFS.LibCephFS.InterProcessLocking segment fault.
- The lockdep errors are a side effect of the previous failure....
- 09:23 AM Bug #12971: TestQuotaFull fails
- The objecter pauses all write osd requests (including delete requests) when the cluster is full or the pool quota is reached. T...
- 08:29 AM Bug #12674 (Need More Info): Semi-reproducible crash of ceph-fuse
09/07/2015
- 10:53 AM Bug #12732 (Resolved): very slow read when a file has holes.
- https://github.com/ceph/ceph-client/commit/3e8b3d8cbf92aba5485e68bc5cba6eee2075ee71
- 10:33 AM Bug #12895: Failure in TestClusterFull.test_barrier
- ...
- 08:38 AM Bug #12971: TestQuotaFull fails
- ...
09/06/2015
- 03:25 PM Bug #12417 (Resolved): segfault launching ceph-fuse with bad --name
- 03:25 PM Backport #12500 (Resolved): segfault launching ceph-fuse with bad --name
- 10:02 AM Bug #12971 (Resolved): TestQuotaFull fails
- http://qa-proxy.ceph.com/teuthology/teuthology-2015-09-01_23:04:01-fs-infernalis---basic-multi/1041649/teuthology.log...
- 09:59 AM Bug #12896 (Rejected): EIO in multiple_rsync.sh
- John Spray wrote:
> Looking at the paths, rsync *seems* to be complaining about the local files (in /tmp). Maybe it...
09/04/2015
- 03:23 PM Bug #12822 (Resolved): ceph-fuse crash in test_client_recovery
- sorry for clobbering this - restored the state
- 03:18 PM Bug #12822 (In Progress): ceph-fuse crash in test_client_recovery
09/03/2015
- 07:10 PM Bug #12909 (Resolved): cmake: client/fuse_ll.cc can't locate fuse_lowlevel.h
- 11:05 AM Bug #11482: kclient: intermittent log warnings "client.XXXX isn't responding to mclientcaps(revoke)"
- Okay, added debug mds 20 to the other kcephfs subsuites so we should be good next time we see it for jobs scheduled a...
09/02/2015
- 03:13 PM Bug #12896: EIO in multiple_rsync.sh
- Looking at the paths, rsync *seems* to be complaining about the local files (in /tmp). Maybe it's just a bad test node.
- 01:21 PM Bug #12896: EIO in multiple_rsync.sh
- This failure looks weird. There is no ll_read entry in the client log; there are ll_write entries, but no error.
- 10:42 AM Bug #11746 (Resolved): cephfs Dumper tries to load whole journal into memory at once
- 10:35 AM Backport #12098 (Resolved): kernel_untar_build fails on EL7
- 10:35 AM Backport #11999 (Resolved): cephfs Dumper tries to load whole journal into memory at once
- 08:27 AM Backport #12590 (In Progress): "ceph mds add_data_pool" check for EC pool is wrong
09/01/2015
- 06:03 PM Bug #12909 (Resolved): cmake: client/fuse_ll.cc can't locate fuse_lowlevel.h
- "Commit f064e90ae554b64741284ef1cdf8a00bb7b4a312":https://github.com/ceph/ceph/commit/f064e90ae554b64741284ef1cdf8a00...
- 04:23 PM Bug #12820: stuck looping on 'ls /sys/fs/fuse/connections'
- 10:33 AM Bug #12820 (Fix Under Review): stuck looping on 'ls /sys/fs/fuse/connections'
- https://github.com/ceph/ceph-qa-suite/pull/551
- 12:30 PM Bug #12776 (Fix Under Review): qa: standby MDS not shutting down, "reached maximum tries (50) aft...
- https://github.com/ceph/ceph/pull/5739
- 11:22 AM Bug #12896 (Rejected): EIO in multiple_rsync.sh
- http://pulpito.ceph.com/teuthology-2015-08-28_23:04:01-fs-master---basic-multi/1037227/...
- 11:12 AM Bug #12895 (Can't reproduce): Failure in TestClusterFull.test_barrier
- teuthology-2015-08-24_23:04:02-fs-master---basic-multi/1030586/
08/31/2015
- 02:58 PM Bug #12875 (Can't reproduce): LibCephFS.LibCephFS.InterProcessLocking segment fault.
- ...
- 02:29 PM Bug #12806 (Fix Under Review): nfs restart failures
- https://github.com/ceph/ceph-qa-suite/pull/550
- 09:16 AM Bug #12806: nfs restart failures
- looks like it's added by ceph-qa-suite/tasks/qemu.py.
- 12:53 PM Bug #11482: kclient: intermittent log warnings "client.XXXX isn't responding to mclientcaps(revoke)"
- For some reason the override we provided isn't being added to the configs, I created #12869 for that. :/
08/29/2015
08/28/2015
- 01:13 PM Feature #10369: qa-suite: detect unexpected MDS failovers and daemon crashes
- We just keep re-creating this feature: #12821
- 01:13 PM Bug #12821 (Duplicate): mds_thrasher: handle MDSes failing on startup
- #10369
- 01:11 PM Bug #12821: mds_thrasher: handle MDSes failing on startup
- This is kind of a special case of http://tracker.ceph.com/issues/10369 -- there is a more general need for something ...
- 11:31 AM Bug #12821 (Duplicate): mds_thrasher: handle MDSes failing on startup
- http://pulpito.ceph.com/teuthology-2015-08-21_23:04:01-fs-master---basic-multi/1026045/...
- 01:02 PM Bug #12822 (Resolved): ceph-fuse crash in test_client_recovery
- If we actually see that it's crashing with the timeout we can reopen this.
- 12:58 PM Bug #12822: ceph-fuse crash in test_client_recovery
- I would expect us to see a backtrace from ceph-fuse stderr in the case of an actual crash. Seems more like the clien...
- 11:53 AM Bug #12822 (Resolved): ceph-fuse crash in test_client_recovery
- http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020395/
Sadly there are absolutely...
- 12:12 PM Feature #12823 (Rejected): cephfs_test_runner: print test names when executing them
- They're already logged, like this:...
- 11:57 AM Feature #12823 (Rejected): cephfs_test_runner: print test names when executing them
- It looks like we don't print out test names. When running through a whole suite that makes telling where we are in a ...
- 12:10 PM Bug #12776: qa: standby MDS not shutting down, "reached maximum tries (50) after waiting for 300 ...
- Actually, I just tried sending SIGTERM to a standby mds here, and it's getting stuck too.
- 12:08 PM Bug #12776: qa: standby MDS not shutting down, "reached maximum tries (50) after waiting for 300 ...
- It's getting the signal, but not making it through shutdown:...
- 11:48 AM Bug #12820: stuck looping on 'ls /sys/fs/fuse/connections'
- http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020354/
- 11:24 AM Bug #12820 (Resolved): stuck looping on 'ls /sys/fs/fuse/connections'
- http://pulpito.ceph.com/teuthology-2015-08-21_23:04:01-fs-master---basic-multi/1025967/...
- 11:47 AM Bug #12612 (Can't reproduce): fuse jobs fail to start on centos7
- Okay, haven't seen this particular one again, just the new one #12820.
- 10:11 AM Bug #12808: smbtorture failure on scan-pipe
- It doesn't look like that in the error logs to me; I think it just failed to allocate — but perhaps I'm misreading th...
- 02:19 AM Bug #12808: smbtorture failure on scan-pipe
- This test case makes smbd allocate tens of GBs of memory. Maybe smbd got killed during the test.
- 10:04 AM Bug #12806: nfs restart failures
- Zheng, do we have any idea how the machines are getting into that duplicated export state? It looks pretty clear that...
- 01:35 AM Bug #12806 (Resolved): nfs restart failures
- 01:35 AM Bug #12806: nfs restart failures
- ...
- 09:38 AM Bug #12657 (Can't reproduce): Failure in TestStrays.test_ops_throttle
- Hmm, now this looks like a teuthology burp. The stats polling is meant to happen every second, but in this instance ...
- 08:18 AM Bug #12777: qa: leftover files in cephtest directory
- merged to next and master.
- 01:56 AM Bug #12777: qa: leftover files in cephtest directory
08/27/2015
- 04:55 PM Bug #12777 (Fix Under Review): qa: leftover files in cephtest directory
- Oops, this is CephFSTestCase.tearDown not getting called (so there's still a client mount, so the dir can't be remo...
- 01:34 PM Bug #12777: qa: leftover files in cephtest directory
- This shows up in the logs as 'rmdir -- /home/ubuntu/cephtest'
- 04:20 PM Bug #12806: nfs restart failures
- iirc starting the nfs service on el7 depends on rpcbind, nfs-lock already being up, so that might be what's missing here
- 01:19 PM Bug #12806 (Resolved): nfs restart failures
- ...
- 01:46 PM Bug #12808 (New): smbtorture failure on scan-pipe
- Log summary line: "Command failed on burnupi59 with status 137: 'TESTDIR=/home/ubuntu/cephtest bash -s'"...
- 01:35 PM Bug #12807 (Duplicate): rmdir cephtest failing
- #12777
- 01:30 PM Bug #12807 (Duplicate): rmdir cephtest failing
- http://pulpito.ceph.com/teuthology-2015-08-22_23:04:02-fs-next---basic-multi/1027445/
http://pulpito.ceph.com/teutho...
08/25/2015
- 02:00 PM Bug #11789: knfs mount fails with "getfh failed: Function not implemented"
- We think this is the NFS kernel module not being loaded. If this is still happening we need to figure out why.
- 01:58 PM Bug #12653 (Fix Under Review): fuse mounted file systems fails SAMBA CTDB ping_pong rw test with ...
- 01:56 PM Bug #11783 (Fix Under Review): protocol: flushing caps on MDS restart can go bad
- 01:52 PM Bug #11784 (Can't reproduce): ceph-fuse hang on unmount (stuck dentry refs)
- 01:51 PM Bug #9994: ceph-qa-suite: nfs mount timeouts
- 01:51 PM Bug #12365 (Resolved): kcephfs: hang on umount
- Haven't seen this since then.
- 12:59 PM Bug #12777: qa: leftover files in cephtest directory
- http://pulpito.ceph.com/teuthology-2015-08-21_23:04:01-fs-master---basic-multi/1026047/
- 12:13 PM Bug #12777 (Resolved): qa: leftover files in cephtest directory
- http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020426/
http://pulpito.ceph.com/teut...
- 12:33 PM Bug #11482: kclient: intermittent log warnings "client.XXXX isn't responding to mclientcaps(revoke)"
- http://pulpito.ceph.com/teuthology-2015-08-17_23:08:05-kcephfs-master-testing-basic-multi/1020518/
Doesn't have th...
- 12:10 PM Bug #12776 (Resolved): qa: standby MDS not shutting down, "reached maximum tries (50) after waiti...
- http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020415/
The standby MDS doesn't lo...
08/24/2015
- 06:45 PM Bug #12715 (Resolved): "[ERR] bad backtrace on dir ino 600" in cluster log"
- 06:42 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- https://github.com/ceph/ceph-qa-suite/pull/539
- 06:36 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- whitelist "bad backtrace on dir ino" warning message (per irc chat with Greg)
- 05:30 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- Also in run:
http://pulpito.ceph.com/teuthology-2015-08-21_08:42:54-upgrade:firefly-x-hammer-distro-basic-vps/
Jobs...
- 03:37 AM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- The test uses 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f). The newest firefly includes the fix (commit a5970963).
- 01:22 PM Bug #12710 (Resolved): fsstress.sh fails
- 01:22 PM Bug #12709 (Resolved): hammer chmod.sh fails
- fixed by commit 81a311a744987564b70852fdacfd915523c73b5d
- 01:21 PM Bug #12711 (Resolved): mds get damaged
- 08:36 AM Bug #12676 (Resolved): MDSMap assertion in MDCache::trim (multimds)
- ...
- 08:33 AM Bug #12321 (Can't reproduce): MDS crash when try to connect clients
- 03:52 AM Bug #12321: MDS crash when try to connect clients
- John Spray wrote:
> Hi, do you have any updates for us on this? If the system is unavailable for any more debug the...
- 06:30 AM Bug #12753 (Resolved): cls_cephfs_client encodes time_t directly
- fixed by 1213dde3d207d0d91ccecfca4dd6af1bdee0ed65
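The report for Bug #12753 recommends casting `time_t` to `uint32_t` or `uint64_t` so that its encoded size is explicit. A minimal sketch of that idea follows. It does not use Ceph's own encoding functions; the helpers `put_u64_le`, `get_u64_le`, `encode_time` and `decode_time` are hypothetical names introduced only for illustration, and the choice of 64 bits and little-endian byte order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <vector>

// Hypothetical helper (not a Ceph API): append a 64-bit value as exactly
// 8 bytes, least significant byte first, regardless of the host platform.
inline void put_u64_le(std::vector<unsigned char>& out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xff));
    }
}

// Hypothetical helper: read back 8 little-endian bytes starting at `off`.
inline std::uint64_t get_u64_le(const std::vector<unsigned char>& in,
                                std::size_t off = 0) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        v |= static_cast<std::uint64_t>(in[off + i]) << (8 * i);
    }
    return v;
}

// Cast time_t to a fixed-width type before encoding, so the encoded
// size does not depend on how wide time_t happens to be on this build.
inline void encode_time(std::vector<unsigned char>& out, std::time_t t) {
    put_u64_le(out, static_cast<std::uint64_t>(t));
}

inline std::time_t decode_time(const std::vector<unsigned char>& in,
                               std::size_t off = 0) {
    return static_cast<std::time_t>(get_u64_le(in, off));
}
```

With this shape, `encode_time` always appends 8 bytes, whether `time_t` is 32 or 64 bits wide on the build target.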
08/21/2015
- 09:40 PM Bug #12753 (Resolved): cls_cephfs_client encodes time_t directly
- Fails to build on i386.
We should never encode time_t directly; cast to uint32_t or uint64_t so it is sized explic...
- 08:54 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- Zheng, Sage mentioned that this may have been fixed by you, can you take a look?
- 08:19 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- This is an old bug, right? We should just whitelist this?
- 07:06 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
- Run http://pulpito.ceph.com/teuthology-2015-08-21_08:42:54-upgrade:firefly-x-hammer-distro-basic-vps/
Jobs: 1024928,...
- 05:44 AM Bug #12732: very slow read when a file has holes.
- '(rc < 0 && rc != -ENOENT)' should work. Please send a patch to ceph-devel@vger.kernel.org
08/20/2015
- 10:07 AM Bug #12732 (Resolved): very slow read when a file has holes.
- When a file in cephfs has holes, reading this file is very slow.
This problem can be reproduced by the commands bel...
08/19/2015
- 03:27 PM Bug #12727 (Duplicate): fsstress.sh failed in powercycle suite
- dup of #12710
- 03:03 PM Bug #12727 (Duplicate): fsstress.sh failed in powercycle suite
- Run: http://pulpito.ceph.com/teuthology-2015-08-18_09:06:57-powercycle-hammer-testing-basic-multi/
Job: 1020696
Log...
08/18/2015
- 12:07 PM Bug #12710: fsstress.sh fails
- commit:47519365484056e1731cac54cce835332d258121
- 07:28 AM Bug #12710 (Fix Under Review): fsstress.sh fails
- https://github.com/ceph/ceph/pull/5595
- 12:01 PM Bug #12711: mds get damaged
- merged in commit:3cfb7e4ccc08a67ceec73ee684049320c75e9bb2
- 06:59 AM Bug #12711 (Fix Under Review): mds get damaged
- https://github.com/ceph/ceph/pull/5594
08/17/2015
- 08:14 PM Bug #12715 (Resolved): "[ERR] bad backtrace on dir ino 600" in cluster log"
- Run: http://pulpito.ceph.com/teuthology-2015-08-14_16:56:20-upgrade:firefly-x-hammer-distro-basic-multi/
Job: 101471...
- 01:06 PM Bug #12711 (Resolved): mds get damaged
- http://pulpito.ceph.com/teuthology-2015-08-10_23:08:02-kcephfs-master-testing-basic-multi/1010323/
It's easy to re...
- 08:06 AM Bug #12710 (Resolved): fsstress.sh fails
- Seeing quite a lot of fsstress failures. One of them is http://qa-proxy.ceph.com/teuthology/teuthology-2015-08-10_23:04:02-...
- 07:50 AM Bug #11783: protocol: flushing caps on MDS restart can go bad
- Saw this again: http://pulpito.ceph.com/teuthology-2015-08-11_23:04:02-fs-next---basic-multi/1011375
- 03:49 AM Bug #12709 (Resolved): hammer chmod.sh fails
- http://magna002.ceph.redhat.com/teuthology-2015-08-13_18:04:02-fs-hammer---basic-magna/173814/teuthology.log...
08/14/2015
- 08:48 AM Bug #12676 (Fix Under Review): MDSMap assertion in MDCache::trim (multimds)
- 08:47 AM Bug #12676: MDSMap assertion in MDCache::trim (multimds)
- https://github.com/ceph/ceph/pull/5583
08/13/2015
- 02:48 PM Bug #12598 (Resolved): LibCephFS.GetPoolId failure
- commit:4d4fe9dbc0eb0d0eaa9a608474fecc892626f542