Bug #15249

closed

fix jenkins to not impose long paths

Added by Loïc Dachary about 8 years ago. Updated over 7 years ago.

Status: Rejected
Priority: High
Assignee: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The deep workspace in which jenkins runs creates problems, such as hitting the length limitation on the #! line of scripts (80), that are painfully worked around with changes such as

https://github.com/ceph/ceph/commit/93c790d72a004d47155a645dc759422cc85cc5c4
https://github.com/ceph/ceph/commit/c540835cf3dde2a91e582879fd6997b41b5c9cf0

This is bound to happen from time to time as this deep path is unique to jenkins. There exists a solution:

https://wiki.jenkins-ci.org/display/JENKINS/Short+Workspace+Path+Plugin

and that would avoid hacks such as setting CEPH_BUILD_VIRTUALENV=/tmp/ in the environment, which is fragile to maintain and lives in an environment variable that is difficult to relate to the actual problem it solves.

Actions #1

Updated by David Galloway about 8 years ago

  • Project changed from sepia to devops
Actions #3

Updated by Brad Hubbard almost 8 years ago

Just to make things clearer.

"#! (80)" refers to the 80 char limit for shebang paths (#!/path).
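
To make the limit concrete, here is a minimal sketch that checks whether an interpreter path would still fit on a shebang line. The 80 is simply the figure quoted in this ticket (the real limit depends on the kernel), counting the whole line including "#!" is an assumption, and the virtualenv location under the workspace is hypothetical.

```python
import os

# Assumption: 80 is the figure quoted in this ticket, not a value read from the kernel.
SHEBANG_LIMIT = 80


def shebang_too_long(interpreter: str, limit: int = SHEBANG_LIMIT) -> bool:
    """Return True if the line '#!' + interpreter is longer than `limit` characters."""
    return len("#!" + interpreter) > limit


# Workspace prefix taken from the error messages in this ticket.
workspace = "/home/jenkins-build/build/workspace/ceph-pull-requests"
# Hypothetical virtualenv location, for illustration only.
venv_python = os.path.join(workspace, "src/some-tool/virtualenv/bin/python")

print(shebang_too_long(venv_python))             # deep jenkins workspace
print(shebang_too_long("/tmp/venv/bin/python"))  # short path
```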

Actions #4

Updated by Anonymous almost 8 years ago

Checking that plugin

Actions #5

Updated by Dan Mick almost 8 years ago

"The plugin detects path length limitation for given node automatically". Hmm...

Actions #6

Updated by Anonymous almost 8 years ago

This plugin is in place and activated, let's see how it goes.

Actions #7

Updated by Kefu Chai almost 8 years ago

2016-07-05 04:34:48.203187 7f565f516700 -1 asok(0x1506220) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: The UNIX domain socket path /home/jenkins-build/build/workspace/ceph-pull-requests/build/src/test/testdir/test-7200/out/client.admin.18645.asok is too long! The maximum length on this system is 107

see https://jenkins.ceph.com/job/ceph-pull-requests/8529/consoleFull

Maybe we should use a smaller setting for "org.jenkinsci.plugins.shortwspath.BUILD_PATH_LENGTH"?
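
For reference, a small sketch of the check behind that error message. On Linux the sun_path field of sockaddr_un is 108 bytes including the trailing NUL, which is where the 107 in the log comes from. The helper name socket_path_fits is made up for this example.

```python
# 107 is the maximum reported in the log above (108-byte sun_path minus the NUL).
MAX_SOCKET_PATH = 107


def socket_path_fits(path: str, limit: int = MAX_SOCKET_PATH) -> bool:
    """Return True if `path`, measured in bytes, fits within `limit`."""
    return len(path.encode()) <= limit


# Path copied from the failing build quoted above.
failing = (
    "/home/jenkins-build/build/workspace/ceph-pull-requests"
    "/build/src/test/testdir/test-7200/out/client.admin.18645.asok"
)

print(len(failing), socket_path_fits(failing))
```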

Actions #8

Updated by Kefu Chai over 7 years ago

  • Project changed from devops to Ceph

Erwan talked with the author of the plugin. It seems we don't need to change any configuration:

"I had a discussion with the author and it seems it doesn't need any"

So we should fix the tests on our end.

Actions #9

Updated by Anonymous over 7 years ago

This issue doesn't seem to be related only to our CI.
It sounds pretty incorrect to create a socket in such a long path.
I'd vote for changing the C code, which creates the socket in the current path instead of targeting a strict (short) location.

Actions #10

Updated by Anonymous over 7 years ago

For me the issue is in qa/workunits/ceph-helpers.sh.

It seems to me that "--admin-socket=$dir/\$cluster-\$name.asok" is what leads to the issue we have.

Maybe we should put the admin socket in /tmp instead of $dir, making the path shorter.
Sockets are not supposed to be put in long paths.
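
A rough sketch of that idea, in Python rather than in ceph-helpers.sh: create a short private directory under /tmp and bind the socket there, so the path length no longer depends on where the build runs. The function name bind_admin_socket and the "asok-" prefix are invented for this example and are not what Ceph does.

```python
import os
import socket
import tempfile


def bind_admin_socket(name: str, base: str = "/tmp"):
    """Bind a UNIX socket named `name`.asok in a fresh short directory under `base`."""
    # mkdtemp creates a directory readable only by the current user.
    sock_dir = tempfile.mkdtemp(prefix="asok-", dir=base)
    path = os.path.join(sock_dir, name + ".asok")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    return server, path


server, path = bind_admin_socket("ceph-client.admin")
print(path, len(path))

# The caller is responsible for cleaning up.
server.close()
os.unlink(path)
os.rmdir(os.path.dirname(path))
```

The path stays well under the 107 limit whatever the workspace is, at the price of the socket living outside the test directory.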

Actions #11

Updated by Loïc Dachary over 7 years ago

Developers run and write tests working in directories that look like /home/foo/something/ceph or even /home/foo/ceph. If jenkins runs the same tests in paths that are significantly longer, this problem is bound to resurface on a regular basis. If jenkins can be convinced to run the test in a shorter path the problem will go away forever. It will not happen in jenkins because developers will notice it locally: it will cease to be a jenkins specific problem.

Actions #12

Updated by Loïc Dachary over 7 years ago

  • Project changed from Ceph to CI

Here is the kind of work that running tests in long paths generates: https://github.com/ceph/ceph/pull/12066/files .

I want to reiterate that this is going to keep happening because the 60-character path created by jenkins is not what developers have on their machines. And asking all developers to be aware of this long path situation specific to the build environment is unrealistic.

Fixing jenkins to provide (via a plugin or anything else) a shorter build path will solve that issue permanently. Not fixing jenkins is the guarantee that the issue will resurface on a regular basis and require time from developers.

Actions #13

Updated by Loïc Dachary over 7 years ago

It surfaced again here https://jenkins.ceph.com/job/ceph-pull-requests/14630/

The UNIX domain socket path /home/jenkins-build/build/workspace/ceph-pull-requests/build/src/test/td/t-7202/out/client.xx-profile-ro.27743.asok is too long

Out of this long path

jenkins: /home/jenkins-build/build/workspace/ceph-pull-requests
ceph: build/src/test/td/t-7202/out/client.xx-profile-ro.27743.asok
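
The split above can be turned into a quick budget calculation. This is only a sketch: the 107 limit comes from the earlier error message, the developer prefix is the example from comment #11, and the helper names are made up.

```python
MAX_SOCKET_PATH = 107  # limit reported in the earlier error message

jenkins_prefix = "/home/jenkins-build/build/workspace/ceph-pull-requests"
developer_prefix = "/home/foo/something/ceph"  # example path from comment #11
ceph_suffix = "build/src/test/td/t-7202/out/client.xx-profile-ro.27743.asok"


def total_length(prefix: str, suffix: str) -> int:
    """Length of prefix + '/' + suffix."""
    return len(prefix) + 1 + len(suffix)


def max_prefix_length(suffix: str, limit: int = MAX_SOCKET_PATH) -> int:
    """Longest workspace prefix that still leaves room for `suffix`."""
    return limit - 1 - len(suffix)


for label, prefix in (("jenkins", jenkins_prefix), ("developer", developer_prefix)):
    print(label, len(prefix), total_length(prefix, ceph_suffix))
print("prefix budget:", max_prefix_length(ceph_suffix))
```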

Actions #14

Updated by Loïc Dachary over 7 years ago

https://jenkins.ceph.com/job/ceph-pull-requests/14956/

The UNIX domain socket path /home/jenkins-build/build/workspace/ceph-pull-requests/build/src/test/td/t-7202/out/client.xx-profile-ro.31641.asok is too long

Actions #17

Updated by Loïc Dachary over 7 years ago

  • Priority changed from Normal to High

Increasing the priority as this is becoming more frequent.

Actions #23

Updated by Alfredo Deza over 7 years ago

  • Status changed from New to Rejected

I am going to reject this as a required fix in Jenkins.

A test that is willingly using a $dir variable that can (and does in some environments like Jenkins) produce long paths is a test that should be fixed.

We also agree with comment #15249-10 from Erwan: sockets should not be created on long paths.

Actions #24

Updated by Loïc Dachary over 7 years ago

I'm speechless, great work, thanks for the help.

Actions #27

Updated by Andrew Schoen over 7 years ago

I agree that this should be fixed in the tests and not in jenkins. We had a similar problem with long workspace directories in the ceph-ansible PR tests, and we got around it by setting the working directory for tox to a shorter path in /tmp instead of using the default workspace path.

Actions #28

Updated by Anonymous over 7 years ago

Loic Dachary wrote:

I'm speechless, great work, thanks for the help.

Loic, it's not fair to be ironic with us. We spent time on this trying to identify why we have this issue.
And I'm really sorry, but the issue is not about jenkins at all. The limitation to 107 is not our design but Linux's.

The current code puts the socket in a potentially long path, which is a game of chance: it may pass or it may fail depending on the location of the build. That's not really a robust design.

This bug can be triggered without jenkins. Jenkins is just the one that shows the issue.
This report clearly shows the bug in another context, with the same solution we offer: https://bugs.mysql.com/bug.php?id=42512
A socket should be put in a short directory, and we cannot bet on what the actual build location is.

So I'm sorry, but the solution is to change "--admin-socket=$dir/\$cluster-\$name.asok" to a shorter, constant path such as /tmp.

Actions #29

Updated by Dan Mick over 7 years ago

If we were to modify the tests, how would they cooperate with Jenkins to choose paths so that Jenkins can correctly recover/clean up? It seems like a preferable solution would be to allocate a job-specific temporary directory and tie it with the job somehow so that it's properly purged when builds are deleted. One could use the standard mktemp family, but then Jenkins is unaware that the job has allocated outside its workspace.

I wonder if Jenkins has ways to ask for such resources.
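
One possible shape for such a job-specific directory, as a sketch only: name the temporary directory after Jenkins' BUILD_TAG environment variable so that leftovers can be traced back to a build, and remove it when the job finishes normally. The name job_tmpdir and the fallback tag "local" are assumptions, and a job that is killed would still leave the directory behind, so this does not fully answer the cleanup question.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def job_tmpdir(base: str = "/tmp"):
    """Yield a temporary directory named after the current job, removed on exit."""
    # BUILD_TAG is set by Jenkins; fall back to "local" (an assumption) elsewhere.
    tag = os.environ.get("BUILD_TAG", "local").replace("/", "-")
    path = tempfile.mkdtemp(prefix=tag + "-", dir=base)
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


with job_tmpdir() as tmp:
    print(tmp)  # the tests would place their sockets under this directory
```

Because the directory name starts with the build tag, an out-of-band cleanup step could look for stale directories belonging to deleted builds.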

Actions #30

Updated by Loïc Dachary over 7 years ago

@Christina Meno could you please confirm that this rejection is, in your opinion, the right answer to this problem?
