Bug #10622

kernel install failures on sha1 length checks

Added by Greg Farnum about 9 years ago. Updated about 9 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-20_23:04:03-fs-next-testing-basic-multi/715943/
Plus 3 more in the same run, and a few others I've seen.

2015-01-21T12:42:05.498 INFO:teuthology.task.kernel:Checking client client.0 for new kernel version...
2015-01-21T12:42:05.499 INFO:teuthology.task.kernel:Checking kernel version of client.0, want ...
2015-01-21T12:42:05.499 INFO:teuthology.orchestra.run.plana64:Running: 'uname -r'
2015-01-21T12:42:05.597 DEBUG:teuthology.task.kernel:current kernel version is 3.19.0-rc5-ceph-00020-g5460340
2015-01-21T12:42:05.597 DEBUG:teuthology.task.kernel:extracting sha1, 3.19.0-rc5-ceph-00020-g5460340 -> 5460340
2015-01-21T12:42:05.597 ERROR:teuthology.task.kernel:Saw exception
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/task/kernel.py", line 560, in wait_for_reboot
    assert not need_to_install(ctx, client, need_install[client]), \
  File "/home/teuthworker/src/teuthology_master/teuthology/task/kernel.py", line 159, in need_to_install
    assert m >= 6, "cur_sha1 and/or version is too short, m = %d" % m
AssertionError: cur_sha1 and/or version is too short, m = 0

I'm not familiar with the code involved here, but it sure looks to me like it's correctly pulling out a 7-character hash, which for some reason is considered to be 0-length by the check function. Maybe it's doing a regex and the lack of alphabetical characters is throwing it off?

History

#1 Updated by Zack Cerza about 9 years ago

Not quite:

2015-01-21T12:38:27.647 INFO:teuthology.task.kernel:Checking client mon.a for new kernel version...
2015-01-21T12:38:27.647 INFO:teuthology.task.kernel:Checking kernel version of mon.a, want 3.19.0-rc5-ceph-00020-g5460340...
2015-01-21T12:38:27.648 INFO:teuthology.orchestra.run.plana41:Running: 'uname -r'
2015-01-21T12:38:27.656 DEBUG:teuthology.task.kernel:current kernel version is 3.19.0-rc5-ceph-00020-g5460340
2015-01-21T12:38:27.657 DEBUG:teuthology.task.kernel:utsrelease strings match, do not need to install
2015-01-21T12:38:27.657 INFO:teuthology.task.kernel:Checking client client.0 for new kernel version...
2015-01-21T12:38:27.657 INFO:teuthology.task.kernel:Checking kernel version of client.0, want ...

Note that client.0's 'want ...' line has no version number. Also note that client.0 is plana64, running RHEL. The other two nodes in this job are running Ubuntu.
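To illustrate why an empty 'want' value ends in "m = 0", here is a minimal sketch of a prefix-based sha1 comparison. It is not the actual teuthology code: the function name needs_install, the regex, and the use of min() over the two lengths are assumptions, chosen only to be consistent with the log lines and the assertion message in the traceback.

```python
import re


def needs_install(cur_version, wanted):
    """Hypothetical stand-in for the check in kernel.py (names assumed)."""
    if '.' in wanted:
        # wanted looks like a full utsrelease string: compare it directly
        return cur_version != wanted
    # Otherwise treat wanted as a sha1 and compare it against the
    # abbreviated hash embedded in the `uname -r` output.
    match = re.search(r'[-_]g([0-9a-f]{6,40})', cur_version)
    if not match:
        return True
    cur_sha1 = match.group(1)
    # Assumed: only the common prefix length of the two strings is compared.
    m = min(len(cur_sha1), len(wanted))
    assert m >= 6, "cur_sha1 and/or version is too short, m = %d" % m
    return cur_sha1[:m] != wanted[:m]
```

Under these assumptions, the 7-character hash is extracted correctly, but an empty wanted string contains no '.', so it is treated as a sha1 of length 0 and the assertion fires.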

#2 Updated by Zack Cerza about 9 years ago

Hrm. So, at this point I need to be able to reproduce it; I have a branch that adds some debug output to help me figure out what is going on.

Unfortunately, I can't seem to reproduce...

#4 Updated by Andrew Schoen about 9 years ago

After looking into this, the code path that gets the version to check for during reboots is returning an empty string as the version. I suspect this is a gitbuilder issue.

See this URL: http://gitbuilder.ceph.com/kernel-rpm-redhatenterpriseserver7-x86_64-basic/sha1/5460340d1fbdf2cfa152bc2c60ae3dfadd4da62c/version

That is the URL that teuthology calls to get the version to check for after the node restarts during the kernel install.
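As a rough sketch of how such a lookup could guard against an empty response, the snippet below builds a URL of the same shape as the one above and rejects a blank body. The helper names version_url and fetch_wanted_version are hypothetical, and the fetch callable is injected so the example makes no claim about how teuthology actually performs the HTTP request.

```python
def version_url(base, flavor_dir, sha1):
    """Build a gitbuilder-style version URL (layout taken from the URL above)."""
    return "{}/{}/sha1/{}/version".format(base.rstrip('/'), flavor_dir, sha1)


def fetch_wanted_version(fetch, url):
    """Return the version text from url, failing loudly if it is empty.

    fetch is any callable that takes a URL and returns the body as a string.
    """
    text = fetch(url).strip()
    if not text:
        # An empty version would otherwise surface later as "m = 0".
        raise ValueError("empty version returned by %s" % url)
    return text
```

Failing at fetch time would point directly at the source of the empty string instead of at the later length assertion.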

#5 Updated by Andrew Schoen about 9 years ago

If we gave the sha1 to need_version[role] here (https://github.com/ceph/teuthology/blob/master/teuthology/task/kernel.py#L1115) we could avoid the call to gitbuilder.

wait_for_reboot calls need_to_install, which we had already called just a few lines above with the sha1, and that call worked correctly.

#6 Updated by Andrew Schoen about 9 years ago

Lots of failures again last night related to this, with improved logging exposing the real issue.

http://pulpito.front.sepia.ceph.com/teuthology-2015-01-29_23:04:04-fs-next-testing-basic-multi/

#7 Updated by Zack Cerza about 9 years ago

  • Status changed from New to Resolved
