Bug #3736

closed

kernel build: failures starting in 3.8-rc1

Added by Alex Elder over 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development

Description

Kernels as of version 3.8-rc1 are not building properly in
the autobuilder. The initial symptom was that the config phase
of the build entered an infinite loop.

I managed to diagnose the cause of this and created a fix
that involved changing our config file.

Now the kernel builds, but something in the "perf" portion
of the build is failing.

I'm assigning this to Gary for now, but I am still doing a
few things to narrow down the cause.

Actions #1

Updated by Alex Elder over 11 years ago

I'm retroactively updating this so a bit about what's been
done gets documented.

The problem was in the Kconfig for the SCTP protocol. I
found that one recent commit had modified that file and
verified it was the cause. I sent the following note to
the author of that commit (which pretty fully describes
the original problem).

---------------
To: Neil Horman <> (and others)
Subject: Config Loop

A commit added in the 3.8-rc1 merge window has resulted in
my kernel config entering an infinite loop handling the
"Default SCTP cookie HMAC encoding" option.

commit 0d0863b02002c25140a1b9e113b81211bcc780e8
sctp: Change defaults on cookie hmac selection
http://marc.info/?l=linux-netdev&m=135553459303505

The problem lies in my config file containing this line:

CONFIG_SCTP_HMAC_MD5=y

I normally configure my kernel using a fixed config file
(occasionally updated) using (roughly) this:

yes "" | make oldconfig

The result looks like this:

. . .
DCCP connection probing (NET_DCCPPROBE) [M/n/?] m
*
* The SCTP Protocol (EXPERIMENTAL)
*
    The SCTP Protocol (EXPERIMENTAL) (IP_SCTP) [M/y/?] m
    SCTP: Association probing (NET_SCTPPROBE) [M/n/?] m
    SCTP: Debug messages (SCTP_DBG_MSG) [N/y/?] n
    SCTP: Debug object counts (SCTP_DBG_OBJCNT) [N/y/?] n
    Default SCTP cookie HMAC encoding
    1. Enable optional MD5 hmac cookie generation
    (SCTP_DEFAULT_COOKIE_HMAC_MD5) (NEW)
    2. Enable optional SHA1 hmac cookie generation
    (SCTP_DEFAULT_COOKIE_HMAC_SHA1) (NEW)
    3. Use no hmac alg in SCTP cookie generation
    (SCTP_DEFAULT_COOKIE_HMAC_NONE) (NEW)
    choice[1-3?]: Default SCTP cookie HMAC encoding
    1. Enable optional MD5 hmac cookie generation
    (SCTP_DEFAULT_COOKIE_HMAC_MD5) (NEW)
    2. Enable optional SHA1 hmac cookie generation
    (SCTP_DEFAULT_COOKIE_HMAC_SHA1) (NEW)
    3. Use no hmac alg in SCTP cookie generation
    (SCTP_DEFAULT_COOKIE_HMAC_NONE) (NEW)
    choice[1-3?]: Default SCTP cookie HMAC encoding
    . . . and so on.

I find that I can correct this with the patch below.
I expect others will bump into the same problem. In
particular, I notice my Ubuntu config files contain
that same line.

I don't know how best to handle this, but I thought
I would report it in case someone has a good solution.

-Alex
 arch/x86/configs/autobuild |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/arch/x86/configs/autobuild
===================================================================
--- a/arch/x86/configs/autobuild
+++ b/arch/x86/configs/autobuild
@@ -940,7 +940,7 @@ CONFIG_NET_SCTPPROBE=m
 # CONFIG_SCTP_DBG_OBJCNT is not set
 # CONFIG_SCTP_HMAC_NONE is not set
 # CONFIG_SCTP_HMAC_SHA1 is not set
-CONFIG_SCTP_HMAC_MD5=y
+CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5=y
 CONFIG_RDS=m
 CONFIG_RDS_RDMA=m
 CONFIG_RDS_TCP=m

Actions #2

Updated by Alex Elder over 11 years ago

I changed our config file (the file "kernel-config" in the
autobuild-ceph git repository) in the way described in my
previous update. That allowed the kernel config to complete
and the kernel part of the build to finish.
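
For anyone who needs to make the same change to other config files, the edit can be sketched as a small script. This is only an illustration, not what was actually run: the function name and the RENAMES table are made up here, and the only rename it knows about is the one shown in the patch above.

```python
import re

# Old symbol -> new symbol, taken from the one-line patch above.
# (The table and function name are hypothetical, for illustration only.)
RENAMES = {"CONFIG_SCTP_HMAC_MD5": "CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5"}

# Matches a symbol at the start of an assignment line such as "CONFIG_FOO=y".
# Comment lines like "# CONFIG_FOO is not set" do not match.
SYMBOL_RE = re.compile(r"(CONFIG_[A-Za-z0-9_]+)(?==)")


def migrate_config(text, renames=RENAMES):
    """Return the config text with renamed symbols, keeping each value as-is."""
    out = []
    for line in text.splitlines(keepends=True):
        m = SYMBOL_RE.match(line)
        if m and m.group(1) in renames:
            # Swap only the symbol name; "=y", "=m", etc. are left untouched.
            line = renames[m.group(1)] + line[m.end(1):]
        out.append(line)
    return "".join(out)


sample = (
    "# CONFIG_SCTP_HMAC_SHA1 is not set\n"
    "CONFIG_SCTP_HMAC_MD5=y\n"
    "CONFIG_RDS=m\n"
)
print(migrate_config(sample), end="")
```

The "is not set" comment lines are deliberately left alone, as in the patch.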

I did hear back from Neil Horman. I'll update on that
shortly.

Actions #3

Updated by Alex Elder over 11 years ago

Neil Horman sent a response to my message and suggested
three possible alternatives to fix the underlying problem,
none of which were very nice. I was in the process of
responding to his message when I found what I think is
the real bug with his original commit. It took me a
bit to verify all of this, but I just sent a bug fix
for the problem to Neil and the appropriate mailing list.

I think the right thing to do is to add this fix to our
kernel tree for now--until it gets committed upstream.
But I'll wait to hear back on my proposed fix before
doing so.

Actions #4

Updated by Alex Elder over 11 years ago

Despite a working build of the kernel, the package build
overall is still failing. It has something to do with building
the "perf" tools. I created branch "wip-bisect" and let
autobuilder bisect the problem. So far I have learned that
commit 090f8cc fails, but its (main) parent commit aefb058 does
not. So I am now building branch "wip-bisect-2" which starts
with the other parent commit, cc1b39d, and will see whether
that pinpoints a commit that started this problem.
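
The parent-by-parent search described here can be sketched roughly as follows. This is a simplified illustration, not the autobuilder's actual logic: build_ok stands in for "push a branch and see whether the autobuilder succeeds", and the commit names in the toy history are invented. In practice "git bisect" does this kind of search more efficiently.

```python
def find_first_bad(commit, parents, build_ok):
    """Starting from a commit known to fail, follow failing parents until
    reaching a commit whose parents all build. That commit is the culprit.

    parents  -- dict mapping a commit name to a list of its parent commits
    build_ok -- callable returning True if the given commit builds
    """
    while True:
        failing = [p for p in parents.get(commit, []) if not build_ok(p)]
        if not failing:
            return commit
        # Descend into the first failing parent and repeat.
        commit = failing[0]


# Toy history with invented names: M2 and M1 are merges, C2 breaks the build.
parents = {
    "M2": ["A1", "M1"],
    "M1": ["B1", "C2"],
    "C2": ["C1"],
    "C1": ["base"],
    "A1": ["base"],
    "B1": ["base"],
    "base": [],
}
broken = {"M2", "M1", "C2"}


def build_ok(commit):
    return commit not in broken


culprit = find_first_bad("M2", parents, build_ok)
print(culprit)
```

Each step costs one build per parent, which is why every new "wip-bisect" branch takes a while to report back.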

Actions #5

Updated by Alex Elder over 11 years ago

Heard back from Neil as well as Vlad Yasevich about my
proposed fix and they both ack'd it. Linus was in on
the discussion and pulled in my fix possibly even before
those ack's got sent...

Until we get our Linux 3.8-rc1 and later build problem
resolved there's no need to update our own trees with
my fix, so I'll hold off on that for now.

Gary has agreed to pick up on resolving the build issue
(related to "perf") that remains.

Actions #6

Updated by Alex Elder over 11 years ago

Looks like commit 6ca2a9c is the first one in that branch
that fails. It has a parent ce37f40 that succeeds.

I've pushed branch wip-bisect-3 that's set to that commit's
other parent, af3df2c. That one looks promising because
it involves perf/Documentation, which is the area where I
found some trouble.

Actions #7

Updated by Alex Elder over 11 years ago

Sure enough, this is the commit that causes the problem:
af3df2c perf tools: Try to build Documentation when installing

Maybe the problem is that we don't have "asciidoc" and "xmlto"
installed on our target systems. These are apparently newly
required (and checking for them is the purpose of that commit).
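
A quick way to test that guess on a build host is to look for the two tools on the PATH. A minimal sketch, assuming the tools are invoked under exactly the names "asciidoc" and "xmlto"; the helper name is made up for this example.

```python
import shutil


def missing_tools(tools):
    """Return the tools from the list that are not found on the PATH."""
    return [t for t in tools if shutil.which(t) is None]


print(missing_tools(["asciidoc", "xmlto"]))
```

An empty list means both tools are present, in which case the cause lies elsewhere.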

Actions #8

Updated by Anonymous over 11 years ago

The remaining issue is that the patch we apply to scripts/package/builddeb to build the perf tools is out of date. I am testing a new patch. A longer-term fix may be to work with the builddeb maintainer to have that script build the perf tools as well.

Actions #9

Updated by Ian Colle over 11 years ago

  • Status changed from New to In Progress

Actions #10

Updated by Anonymous about 11 years ago

Branch: refs/heads/master
Home: https://github.com/ceph/autobuild-ceph
Commit: 0ff4f9a9ce82b37288b3bbcc5b5d65b5ae0b5ff7
https://github.com/ceph/autobuild-ceph/commit/0ff4f9a9ce82b37288b3bbcc5b5d65b5ae0b5ff7
Author: Gary Lowell <>
Date: 2013-01-17 (Thu, 17 Jan 2013)

Changed paths:
M perf.patch
Log Message:
-----------
perf.patch: Updated for recent kernels.

Changes in the perf tools source requires access to more header
files from the buiild area. Bug 3736.

Signed-off-by: Gary Lowell <>

Actions #11

Updated by Anonymous about 11 years ago

The immediate kernel build problems have been solved by recreating the patch that is applied to the Debian package build scripts to build the perf tools. The longer-term solution is to work with the kernel packaging maintainers to officially add the perf tools to the Debian build script. That effort may deserve its own bug or task.

Actions #12

Updated by Anonymous about 11 years ago

  • Status changed from In Progress to Resolved

The problem that resulted in this bug being opened has been solved with the updated patch. I've created two new bugs for the issues that were raised while investigating the original problem:

bug 4004 for the gzip internal error problem.
feature 4005 to add performance tools to the debian packaging script.

The commit for the solution to the original problem is:
commit 0ff4f9a9ce82b37288b3bbcc5b5d65b5ae0b5ff7
Author: Gary Lowell <>
Date: Thu Jan 17 23:05:03 2013 -0800

perf.patch:  Updated for recent kernels.
Changes in the perf tools source requires access to more header
files from the buiild area. Bug 3736.
Signed-off-by: Gary Lowell <>