Activity

From 08/13/2015 to 09/11/2015

09/11/2015

04:01 PM Support #13055: Problem with disconnect fuse by mds
When "transport endpoint is not connected" happens, could you check whether the ceph-fuse process still exists? If not, run ... Zheng Yan
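A minimal sketch of the kind of check requested here, assuming Python is available on the client. "Transport endpoint is not connected" is the message for errno ENOTCONN, so a stat() on the mount point that fails with that errno suggests the FUSE daemon behind it is gone. The helper names are hypothetical and not part of any Ceph tooling.

```python
import errno
import os


def is_transport_disconnected(exc):
    """Return True if exc is an OSError carrying ENOTCONN."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOTCONN


def mount_is_disconnected(path):
    """stat() the path and report whether it fails with ENOTCONN.

    Other failures (for example ENOENT) are not treated as a
    disconnected mount and simply return False.
    """
    try:
        os.stat(path)
    except OSError as exc:
        return is_transport_disconnected(exc)
    return False
```

Calling `mount_is_disconnected("/mnt/cephfs")` (the path is a placeholder) from a cron job or monitoring script would flag the condition; whether the ceph-fuse process still exists has to be checked separately.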
02:35 PM Support #13055: Problem with disconnect fuse by mds
And about high load... I've checked the MDS nodes for memory overload or some other physical overload - didn't find anythi... Sergey Mir
02:31 PM Support #13055: Problem with disconnect fuse by mds
Zheng Yan wrote:
Can you find any clue in ceph-fuse.log?
there is no ceph-fuse.log on clients...
fuse have same...
Sergey Mir
02:26 PM Support #13055: Problem with disconnect fuse by mds
dmesg and the other client logs are empty...
I have 5 clients mounted to Ceph.
3 of them are used more often (ma...
Sergey Mir
02:20 PM Support #13055: Problem with disconnect fuse by mds
"Transport endpoint is not connected" almost always means that the ceph-fuse client process has gone away. Check dmes... Greg Farnum
02:20 PM Support #13055: Problem with disconnect fuse by mds
Zheng Yan,
Yes, as I pasted earlier in my config, I have changed it to mds_session_autoclose = 60
Sergey Mir
02:19 PM Support #13055: Problem with disconnect fuse by mds
Sergey Mir wrote:
> fuse is just disconnecting from ceph and says "transported point is not connected" and thats all...
Zheng Yan
02:14 PM Support #13055: Problem with disconnect fuse by mds
>> 2015-09-10 13:11:17.068839 mds.0 [INF] closing stale session client.343916 (client_ip):0/8623 after 64.626196
MDS...
Zheng Yan
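The idle time in the log line quoted above (64.626196 seconds) is just past the mds_session_autoclose = 60 value mentioned in this thread. The following is a rough sketch, not Ceph tooling, for pulling the client id and idle time out of such lines so they can be compared against the configured timeout. The regular expression is an assumption based only on the single line shown here, and the function name is hypothetical.

```python
import re

# Pattern derived from the one log line quoted in this thread.
STALE_SESSION_RE = re.compile(
    r"closing stale session client\.(?P<client>\d+) .* "
    r"after (?P<idle>\d+(?:\.\d+)?)"
)


def parse_stale_session(line):
    """Return (client_id, idle_seconds), or None if the line does not match."""
    match = STALE_SESSION_RE.search(line)
    if match is None:
        return None
    return int(match.group("client")), float(match.group("idle"))
```

Any idle time returned that exceeds the configured autoclose value would be consistent with the MDS closing the session as stale.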
01:59 PM Support #13055: Problem with disconnect fuse by mds
FUSE just disconnects from Ceph and says "transport endpoint is not connected", and that's all; no logs or some oth... Sergey Mir
01:41 PM Support #13055: Problem with disconnect fuse by mds
We need to know how the system is misbehaving -- not just to see log snippets.
What is the symptom of the problem ...
John Spray
12:59 PM Support #13055: Problem with disconnect fuse by mds
I need a solution to this problem. It drops again...
2015-09-11 15:42:57.450895 7fb118d06700 0 -- (mds00/m...
Sergey Mir
11:09 AM Support #13055: Problem with disconnect fuse by mds
FUSE loses its connection a few times a day, with log output like what I pasted in my previous message from a few different log ... Sergey Mir
10:43 AM Support #13055: Problem with disconnect fuse by mds
The log snippets you've pasted don't make much sense — you have some entries going back in time, and it looks like th... Greg Farnum
10:25 AM Support #13055 (Closed): Problem with disconnect fuse by mds
Hello. Nobody in the ceph chat (HexChat) knows the answer, so I'm asking for advice here.
I have ceph version 0.94.3 (95cefea9fd...
Sergey Mir
01:37 PM Backport #13044 (In Progress): LibCephFS.GetPoolId failure
Abhishek Varshney
09:06 AM Backport #13044 (Resolved): LibCephFS.GetPoolId failure
https://github.com/ceph/ceph/pull/5887 Abhishek Varshney
01:19 PM Bug #12971 (In Progress): TestQuotaFull fails
Looks like this commit might have broken it:... John Spray
02:03 AM Bug #12806: nfs restart failures
Zheng Yan

09/10/2015

11:16 AM Bug #12209: CephFS should have a complete timeout mechanism to avoid endless waiting or unpredict...
Yes. I'm also not very comfortable with the patches that I've looked at, but it's been a while (they have been langui... Greg Farnum
11:08 AM Bug #12209: CephFS should have a complete timeout mechanism to avoid endless waiting or unpredict...
This all sounds a bit strange. The designed behaviour is to block on slow metadata operations, not time out. How ma... John Spray
11:05 AM Bug #12971: TestQuotaFull fails
Greg's theory sounds plausible, will look John Spray

09/09/2015

12:53 PM Backport #12499 (Resolved): ceph-fuse 0.94.2-1trusty segfaults / aborts
Loïc Dachary
12:33 PM Bug #12598 (Pending Backport): LibCephFS.GetPoolId failure
Loïc Dachary
12:28 PM Bug #13011 (Duplicate): LibCephFS.ReleaseMounted: FAILED assert(crypto_context != __null)
Loïc Dachary
12:23 PM Bug #13011 (Duplicate): LibCephFS.ReleaseMounted: FAILED assert(crypto_context != __null)
This happened on a hammer branch with one CephFS related backport ( https://github.com/ceph/ceph/commit/256620e37fd94... Loïc Dachary
12:10 PM Bug #12875: LibCephFS.LibCephFS.InterProcessLocking segment fault.
This might be a consequence of the wip-mds-caps branch, and will get more testing in there. Greg Farnum
12:02 PM Bug #12971: TestQuotaFull fails
That's not quite the problem. The MDS calls Objecter::unset_honor_osdmap_full(), at which point the Objecter should l... Greg Farnum

09/08/2015

01:40 PM Bug #12820 (Resolved): stuck looping on 'ls /sys/fs/fuse/connections'
Sage Weil
01:39 PM Bug #12875 (Can't reproduce): LibCephFS.LibCephFS.InterProcessLocking segment fault.
Sage Weil
08:09 AM Bug #12875 (Need More Info): LibCephFS.LibCephFS.InterProcessLocking segment fault.
The lockdep errors are a side effect of the previous failure.... Zheng Yan
09:23 AM Bug #12971: TestQuotaFull fails
The objecter pauses all OSD write requests (including delete requests) when the cluster is full or the pool quota is reached. T... Zheng Yan
08:29 AM Bug #12674 (Need More Info): Semi-reproducible crash of ceph-fuse
Zheng Yan

09/07/2015

10:53 AM Bug #12732 (Resolved): very slow read when a file has holes.
https://github.com/ceph/ceph-client/commit/3e8b3d8cbf92aba5485e68bc5cba6eee2075ee71 Zheng Yan
10:33 AM Bug #12895: Failure in TestClusterFull.test_barrier
... Zheng Yan
08:38 AM Bug #12971: TestQuotaFull fails
... Zheng Yan

09/06/2015

03:25 PM Bug #12417 (Resolved): segfault launching ceph-fuse with bad --name
Loïc Dachary
03:25 PM Backport #12500 (Resolved): segfault launching ceph-fuse with bad --name
Loïc Dachary
10:02 AM Bug #12971 (Resolved): TestQuotaFull fails
http://qa-proxy.ceph.com/teuthology/teuthology-2015-09-01_23:04:01-fs-infernalis---basic-multi/1041649/teuthology.log... Zheng Yan
09:59 AM Bug #12896 (Rejected): EIO in multiple_rsync.sh
John Spray wrote:
> Looking at the paths, rsync *seems* to be complaining about the local files (in /tmp). Maybe it...
Zheng Yan

09/04/2015

03:23 PM Bug #12822 (Resolved): ceph-fuse crash in test_client_recovery
sorry for clobbering this - restored the state Nathan Cutler
03:18 PM Bug #12822 (In Progress): ceph-fuse crash in test_client_recovery
Nathan Cutler

09/03/2015

07:10 PM Bug #12909 (Resolved): cmake: client/fuse_ll.cc can't locate fuse_lowlevel.h
Sage Weil
11:05 AM Bug #11482: kclient: intermittent log warnings "client.XXXX isn't responding to mclientcaps(revoke)"
Okay, added debug mds 20 to the other kcephfs subsuites so we should be good next time we see it for jobs scheduled a... Greg Farnum

09/02/2015

03:13 PM Bug #12896: EIO in multiple_rsync.sh
Looking at the paths, rsync *seems* to be complaining about the local files (in /tmp). Maybe it's just a bad test node. John Spray
01:21 PM Bug #12896: EIO in multiple_rsync.sh
This failure looks weird. There is no ll_read entry in the client log; there are ll_write entries, but no error. Zheng Yan
10:42 AM Bug #11746 (Resolved): cephfs Dumper tries to load whole journal into memory at once
Loïc Dachary
10:35 AM Backport #12098 (Resolved): kernel_untar_build fails on EL7
Loïc Dachary
10:35 AM Backport #11999 (Resolved): cephfs Dumper tries to load whole journal into memory at once
Loïc Dachary
08:27 AM Backport #12590 (In Progress): "ceph mds add_data_pool" check for EC pool is wrong
Loïc Dachary

09/01/2015

06:03 PM Bug #12909 (Resolved): cmake: client/fuse_ll.cc can't locate fuse_lowlevel.h
"Commit f064e90ae554b64741284ef1cdf8a00bb7b4a312":https://github.com/ceph/ceph/commit/f064e90ae554b64741284ef1cdf8a00... Casey Bodley
04:23 PM Bug #12820: stuck looping on 'ls /sys/fs/fuse/connections'
Greg Farnum
10:33 AM Bug #12820 (Fix Under Review): stuck looping on 'ls /sys/fs/fuse/connections'
https://github.com/ceph/ceph-qa-suite/pull/551 John Spray
12:30 PM Bug #12776 (Fix Under Review): qa: standby MDS not shutting down, "reached maximum tries (50) aft...
https://github.com/ceph/ceph/pull/5739 John Spray
11:22 AM Bug #12896 (Rejected): EIO in multiple_rsync.sh

http://pulpito.ceph.com/teuthology-2015-08-28_23:04:01-fs-master---basic-multi/1037227/...
John Spray
11:12 AM Bug #12895 (Can't reproduce): Failure in TestClusterFull.test_barrier

teuthology-2015-08-24_23:04:02-fs-master---basic-multi/1030586/
John Spray

08/31/2015

02:58 PM Bug #12875 (Can't reproduce): LibCephFS.LibCephFS.InterProcessLocking segment fault.
... Sage Weil
02:29 PM Bug #12806 (Fix Under Review): nfs restart failures
https://github.com/ceph/ceph-qa-suite/pull/550 Zheng Yan
09:16 AM Bug #12806: nfs restart failures
looks like it's added by ceph-qa-suite/tasks/qemu.py. Zheng Yan
12:53 PM Bug #11482: kclient: intermittent log warnings "client.XXXX isn't responding to mclientcaps(revoke)"
For some reason the override we provided isn't being added to the configs, I created #12869 for that. :/ Greg Farnum

08/29/2015

09:20 AM Bug #12344 (Can't reproduce): libcephfs-java/test.sh: com.ceph.fs.CephMountTest fails
Loïc Dachary

08/28/2015

01:13 PM Feature #10369: qa-suite: detect unexpected MDS failovers and daemon crashes
We just keep re-creating this feature: #12821 Greg Farnum
01:13 PM Bug #12821 (Duplicate): mds_thrasher: handle MDSes failing on startup
#10369 Greg Farnum
01:11 PM Bug #12821: mds_thrasher: handle MDSes failing on startup
This is kind of a special case of http://tracker.ceph.com/issues/10369 -- there is a more general need for something ... John Spray
11:31 AM Bug #12821 (Duplicate): mds_thrasher: handle MDSes failing on startup
http://pulpito.ceph.com/teuthology-2015-08-21_23:04:01-fs-master---basic-multi/1026045/... Greg Farnum
01:02 PM Bug #12822 (Resolved): ceph-fuse crash in test_client_recovery
If we actually see that it's crashing with the timeout we can reopen this. Greg Farnum
12:58 PM Bug #12822: ceph-fuse crash in test_client_recovery
I would expect us to see a backtrace from ceph-fuse stderr in the case of an actual crash. Seems more like the clien... John Spray
11:53 AM Bug #12822 (Resolved): ceph-fuse crash in test_client_recovery
http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020395/
Sadly there are absolutely...
Greg Farnum
12:12 PM Feature #12823 (Rejected): cephfs_test_runner: print test names when executing them
They're already logged, like this:... John Spray
11:57 AM Feature #12823 (Rejected): cephfs_test_runner: print test names when executing them
It looks like we don't print out test names. When running through a whole suite that makes telling where we are in a ... Greg Farnum
12:10 PM Bug #12776: qa: standby MDS not shutting down, "reached maximum tries (50) after waiting for 300 ...
Actually, I just tried sending SIGTERM to a standby mds here, and it's getting stuck too. John Spray
12:08 PM Bug #12776: qa: standby MDS not shutting down, "reached maximum tries (50) after waiting for 300 ...
It's getting the signal, but not making it through shutdown:... John Spray
11:48 AM Bug #12820: stuck looping on 'ls /sys/fs/fuse/connections'
http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020354/ Greg Farnum
11:24 AM Bug #12820 (Resolved): stuck looping on 'ls /sys/fs/fuse/connections'
http://pulpito.ceph.com/teuthology-2015-08-21_23:04:01-fs-master---basic-multi/1025967/... Greg Farnum
11:47 AM Bug #12612 (Can't reproduce): fuse jobs fail to start on centos7
Okay, haven't seen this particular one again, just the new one #12820. Greg Farnum
10:11 AM Bug #12808: smbtorture failure on scan-pipe
It doesn't look like that in the error logs to me; I think it just failed to allocate — but perhaps I'm misreading th... Greg Farnum
02:19 AM Bug #12808: smbtorture failure on scan-pipe
This test case makes smbd allocate tens of GB of memory. Maybe smbd got killed during the test. Zheng Yan
10:04 AM Bug #12806: nfs restart failures
Zheng, do we have any idea how the machines are getting into that duplicated export state? It looks pretty clear that... Greg Farnum
01:35 AM Bug #12806 (Resolved): nfs restart failures
Zheng Yan
01:35 AM Bug #12806: nfs restart failures
... Zheng Yan
09:38 AM Bug #12657 (Can't reproduce): Failure in TestStrays.test_ops_throttle
Hmm, now this looks like a teuthology burp. The stats polling is meant to happen every second, but in this instance ... John Spray
08:18 AM Bug #12777: qa: leftover files in cephtest directory
merged to next and master. John Spray
01:56 AM Bug #12777: qa: leftover files in cephtest directory
Zheng Yan

08/27/2015

04:55 PM Bug #12777 (Fix Under Review): qa: leftover files in cephtest directory

Oops, this is CephFSTestCase.tearDown not getting called (so there's still a client mount, so the dir can't be remo...
John Spray
01:34 PM Bug #12777: qa: leftover files in cephtest directory
This shows up in the logs as 'rmdir -- /home/ubuntu/cephtest' Greg Farnum
04:20 PM Bug #12806: nfs restart failures
IIRC, starting the nfs service on el7 depends on rpcbind and nfs-lock already being up, so that might be what's missing here. John Spray
01:19 PM Bug #12806 (Resolved): nfs restart failures
... Greg Farnum
01:46 PM Bug #12808 (New): smbtorture failure on scan-pipe
Log summary line: "Command failed on burnupi59 with status 137: 'TESTDIR=/home/ubuntu/cephtest bash -s'"... Greg Farnum
01:35 PM Bug #12807 (Duplicate): rmdir cephtest failing
#12777 Greg Farnum
01:30 PM Bug #12807 (Duplicate): rmdir cephtest failing
http://pulpito.ceph.com/teuthology-2015-08-22_23:04:02-fs-next---basic-multi/1027445/
http://pulpito.ceph.com/teutho...
Greg Farnum

08/25/2015

02:00 PM Bug #11789: knfs mount fails with "getfh failed: Function not implemented"
We think this is the NFS kernel module not being loaded. If this is still happening we need to figure out why. Greg Farnum
01:58 PM Bug #12653 (Fix Under Review): fuse mounted file systems fails SAMBA CTDB ping_pong rw test with ...
Greg Farnum
01:56 PM Bug #11783 (Fix Under Review): protocol: flushing caps on MDS restart can go bad
Greg Farnum
01:52 PM Bug #11784 (Can't reproduce): ceph-fuse hang on unmount (stuck dentry refs)
Sage Weil
01:51 PM Bug #9994: ceph-qa-suite: nfs mount timeouts
Greg Farnum
01:51 PM Bug #12365 (Resolved): kcephfs: hang on umount
Haven't seen this since then. Greg Farnum
12:59 PM Bug #12777: qa: leftover files in cephtest directory
http://pulpito.ceph.com/teuthology-2015-08-21_23:04:01-fs-master---basic-multi/1026047/ Greg Farnum
12:13 PM Bug #12777 (Resolved): qa: leftover files in cephtest directory
http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020426/
http://pulpito.ceph.com/teut...
Greg Farnum
12:33 PM Bug #11482: kclient: intermittent log warnings "client.XXXX isn't responding to mclientcaps(revoke)"
http://pulpito.ceph.com/teuthology-2015-08-17_23:08:05-kcephfs-master-testing-basic-multi/1020518/
Doesn't have th...
Greg Farnum
12:10 PM Bug #12776 (Resolved): qa: standby MDS not shutting down, "reached maximum tries (50) after waiti...
http://pulpito.ceph.com/teuthology-2015-08-17_23:04:01-fs-master---basic-multi/1020415/
The standby MDS doesn't lo...
Greg Farnum

08/24/2015

06:45 PM Bug #12715 (Resolved): "[ERR] bad backtrace on dir ino 600" in cluster log"
Greg Farnum
06:42 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
https://github.com/ceph/ceph-qa-suite/pull/539 Yuri Weinstein
06:36 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
whitelist "bad backtrace on dir ino" warning message (per irc chat with Greg) Yuri Weinstein
05:30 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
Also in run:
http://pulpito.ceph.com/teuthology-2015-08-21_08:42:54-upgrade:firefly-x-hammer-distro-basic-vps/
Jobs...
Yuri Weinstein
03:37 AM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
The test uses 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f). The newest firefly includes the fix (commit a5970963). Zheng Yan
01:22 PM Bug #12710 (Resolved): fsstress.sh fails
Zheng Yan
01:22 PM Bug #12709 (Resolved): hammer chmod.sh fails
fixed by commit 81a311a744987564b70852fdacfd915523c73b5d Zheng Yan
01:21 PM Bug #12711 (Resolved): mds get damaged
Zheng Yan
08:36 AM Bug #12676 (Resolved): MDSMap assertion in MDCache::trim (multimds)
... John Spray
08:33 AM Bug #12321 (Can't reproduce): MDS crash when try to connect clients
John Spray
03:52 AM Bug #12321: MDS crash when try to connect clients
John Spray wrote:
> Hi, do you have any updates for us on this? If the system is unavailable for any more debug the...
zcc icy
06:30 AM Bug #12753 (Resolved): cls_cephfs_client encodes time_t directly
fixed by 1213dde3d207d0d91ccecfca4dd6af1bdee0ed65 Zheng Yan

08/21/2015

09:40 PM Bug #12753 (Resolved): cls_cephfs_client encodes time_t directly
Fails to build on i386.
We should never encode time_t directly; cast to uint32_t or uint64_t so it is sized explic...
Sage Weil
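The point about explicit sizing can be illustrated outside the Ceph C++ code. This is a Python analogue, not the actual fix: a native-size field can differ in width between platforms such as i386 and x86_64, while a fixed 64-bit field always produces the same bytes. The function names and the choice of little-endian byte order are assumptions made for the example.

```python
import struct


def encode_timestamp(seconds):
    """Encode a timestamp as an explicit little-endian unsigned 64-bit field."""
    return struct.pack("<Q", int(seconds))


def decode_timestamp(buf):
    """Decode the 8-byte field written by encode_timestamp()."""
    (seconds,) = struct.unpack("<Q", buf)
    return seconds
```

Because the width is part of the format string, the encoded size is 8 bytes on every platform, which is the property the comment asks for when it says the value should be cast to an explicitly sized type.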
08:54 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
Zheng, Sage mentioned that this may have been fixed by you; can you take a look? Yuri Weinstein
08:19 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
This is an old bug, right? We should just whitelist this? Sage Weil
07:06 PM Bug #12715: "[ERR] bad backtrace on dir ino 600" in cluster log"
Run http://pulpito.ceph.com/teuthology-2015-08-21_08:42:54-upgrade:firefly-x-hammer-distro-basic-vps/
Jobs: 1024928,...
Yuri Weinstein
05:44 AM Bug #12732: very slow read when a file has holes.
'(rc < 0 && rc != -ENOENT)' should work. Please send a patch to ceph-devel@vger.kernel.org Zheng Yan
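For readers less used to kernel-style return codes, the suggested condition can be sketched as follows. The real change belongs in the kernel client's C code; this Python version only mirrors the logic. The assumption here, which the truncated thread does not confirm, is that -ENOENT is what a read returns for a hole and therefore should not be handled as a failure. The function name is hypothetical.

```python
import errno


def is_real_read_error(rc):
    """Mirror of '(rc < 0 && rc != -ENOENT)' for negative-errno return codes."""
    return rc < 0 and rc != -errno.ENOENT
```

With this predicate, a non-negative byte count and -ENOENT both pass through as non-errors, while any other negative code, such as -EIO, is reported as a failure.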

08/20/2015

10:07 AM Bug #12732 (Resolved): very slow read when a file has holes.
When a file in cephfs has holes, reading this file is very slow.
This problem can be reproduced by the commands bel...
caifeng zhu

08/19/2015

03:27 PM Bug #12727 (Duplicate): fsstress.sh failed in powercycle suite
dup of #12710 Zheng Yan
03:03 PM Bug #12727 (Duplicate): fsstress.sh failed in powercycle suite
Run: http://pulpito.ceph.com/teuthology-2015-08-18_09:06:57-powercycle-hammer-testing-basic-multi/
Job: 1020696
Log...
Yuri Weinstein

08/18/2015

12:07 PM Bug #12710: fsstress.sh fails
commit:47519365484056e1731cac54cce835332d258121 Greg Farnum
07:28 AM Bug #12710 (Fix Under Review): fsstress.sh fails
https://github.com/ceph/ceph/pull/5595 Zheng Yan
12:01 PM Bug #12711: mds get damaged
merged in commit:3cfb7e4ccc08a67ceec73ee684049320c75e9bb2 Greg Farnum
06:59 AM Bug #12711 (Fix Under Review): mds get damaged
https://github.com/ceph/ceph/pull/5594 Zheng Yan

08/17/2015

08:14 PM Bug #12715 (Resolved): "[ERR] bad backtrace on dir ino 600" in cluster log"
Run: http://pulpito.ceph.com/teuthology-2015-08-14_16:56:20-upgrade:firefly-x-hammer-distro-basic-multi/
Job: 101471...
Yuri Weinstein
01:06 PM Bug #12711 (Resolved): mds get damaged
http://pulpito.ceph.com/teuthology-2015-08-10_23:08:02-kcephfs-master-testing-basic-multi/1010323/
It's easy to re...
Zheng Yan
08:06 AM Bug #12710 (Resolved): fsstress.sh fails
Seeing quite a lot of fsstress failures. One of them is http://qa-proxy.ceph.com/teuthology/teuthology-2015-08-10_23:04:02-... Zheng Yan
07:50 AM Bug #11783: protocol: flushing caps on MDS restart can go bad
Saw this again: http://pulpito.ceph.com/teuthology-2015-08-11_23:04:02-fs-next---basic-multi/1011375 Zheng Yan
03:49 AM Bug #12709 (Resolved): hammer chmod.sh fails
http://magna002.ceph.redhat.com/teuthology-2015-08-13_18:04:02-fs-hammer---basic-magna/173814/teuthology.log... Zheng Yan

08/14/2015

08:48 AM Bug #12676 (Fix Under Review): MDSMap assertion in MDCache::trim (multimds)
Zheng Yan
08:47 AM Bug #12676: MDSMap assertion in MDCache::trim (multimds)
https://github.com/ceph/ceph/pull/5583 Zheng Yan

08/13/2015

02:48 PM Bug #12598 (Resolved): LibCephFS.GetPoolId failure
commit:4d4fe9dbc0eb0d0eaa9a608474fecc892626f542 Sage Weil
 
