Re: [HTCondor-devel] New ld option --build-id stores a hash in the elf header


Date: Thu, 3 Jun 2021 13:27:57 -0500 (CDT)
From: Carl Edquist <edquist@xxxxxxxxxxx>
Subject: Re: [HTCondor-devel] New ld option --build-id stores a hash in the elf header
On second thought:

I think i have to recommend against doing a custom --build-id in condor builds.

The default behavior for this option is actually really useful:

    --build-id
    --build-id=style
        ...
        If style is omitted, "sha1" is used.

        The "md5" and "sha1" styles produces an identifier that is always
        the same in an identical output file, but will be unique among all
        nonidentical output files.  It is not intended to be compared as a
        checksum for the file's contents.  A linked file may be changed
        later by other tools, but the build ID bit string identifying the
        original linked file does not change.

So the build-id will always be tied to a particular build.

My notes tell me you can extract the build-id with

	readelf -n "$binary" | awk '/Build ID:/ {print $3}'

though as Greg says, nowadays file(1) will also tell you the BuildID for a binary that has one; its output is just a bit less trivial to parse.
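For the file(1) route, a minimal sketch of the parsing step might look like the following. It assumes file(1) reports the ID in the form `BuildID[sha1]=<hex>`; the function name `build_id_from_file_output` is made up for illustration.

```shell
# Hypothetical helper: read file(1)-style output on stdin and print
# only the hex build-id. Assumes the "BuildID[<style>]=<hex>" form.
build_id_from_file_output() {
    sed -n 's/.*BuildID\[[^]]*]=\([0-9a-f]\{1,\}\).*/\1/p'
}

# Typical use (prints nothing if the binary carries no build-id):
#   file "$binary" | build_id_from_file_output
```

The readelf pipeline is still the simpler choice when binutils is available, since it does not depend on the exact wording of the file(1) description.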

Anyway, if we tracked the build-id for binaries in our official releases, we could use that for a lookup table to match a binary to an exact build.
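As a rough sketch of such a lookup table: assume a plain-text index with one line per released binary, where the first field is the build-id and the remaining fields describe the build. The index format, the function name `lookup_build`, and the sample entries are all hypothetical.

```shell
# Hypothetical index format, one line per binary:
#   <build-id> <release> <platform> <binary>
#
# lookup_build INDEX_FILE BUILD_ID
# Prints the description fields for a matching build-id and returns
# non-zero if the build-id is not in the index.
lookup_build() {
    awk -v id="$2" '
        $1 == id { $1 = ""; sub(/^ /, ""); print; found = 1 }
        END      { exit !found }
    ' "$1"
}

# Typical use, combined with the readelf one-liner:
#   id=$(readelf -n "$binary" | awk '/Build ID:/ {print $3}')
#   lookup_build official-builds.idx "$id"
```

An index like this can be generated after the fact by running the extraction step over the binaries of each release.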


*** More importantly though ***

rpmbuild already handles build-ids for you automatically, and in particular it relies on them for the debuginfo system.

It copies out and strips the debug symbols for binaries into debuginfo files that are identified with a particular build-id.

The really wonderful part is those debug symbols are stored in a separate -debuginfo rpm. And the whole yum ecosystem knows how to use it.
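To illustrate how a build-id identifies a debuginfo file: on RPM-based systems the debuginfo packages conventionally install entries under /usr/lib/debug/.build-id/, using the first two hex digits as a directory name and the rest as the file name. A small sketch of that mapping follows; the function name `debuginfo_path_for` is made up, and the exact layout may differ between distributions.

```shell
# Hypothetical helper: map a hex build-id to the conventional
# debuginfo location, /usr/lib/debug/.build-id/<first 2>/<rest>.debug
debuginfo_path_for() {
    id=$1
    rest=${id#??}                 # everything after the first two characters
    printf '/usr/lib/debug/.build-id/%s/%s.debug\n' "${id%"$rest"}" "$rest"
}

# Typical use:
#   id=$(readelf -n "$binary" | awk '/Build ID:/ {print $3}')
#   ls -l "$(debuginfo_path_for "$id")"
```

If that path exists, the matching debug symbols for that exact build are installed.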

So for instance if i have the osg repos installed, i can

	yum install condor
	debuginfo-install condor

and then get all the debug symbols installed for my exact build of condor that i have installed, as well as any dependent symbols for any other packages, so that i can run condor (or a core dump from it) under gdb, and have all the debug symbols available for the entire stack (including non-condor libraries).

Point being: It's really a lovely system and i am afraid it will all break if you try to shove in a git commit hash instead. (And as it happens, the git hash is also less precise, as it tells you which commit the binary was built from, but not which platform or build options. The build-id refers back to the precise build itself, which we can even index after the fact for all of our official releases.)


Anyway,

Measure once, cut as many times as you like.

Carl

On Wed, 2 Jun 2021, Carl Edquist wrote:

The rpm has a %git_rev macro that makesrpm.sh populates - i don't know if that path is taken these days for the official builds.

Also a couple of tricky bits that come to mind - what do you do if there are local changes to the git repo; and, the build configure parameters can still vary even given a single commit.

Carl

On Tue, 1 Jun 2021, Tim Theisen via HTCondor-devel wrote:

 It doesn't get passed all the way in. However, I think that the git hash
 could be included in a file in the official source tarball. That way
 even downstream distributions would have the git sha in their builds.

 ...Tim

 On 6/1/21 9:14 AM, Greg Thain via HTCondor-devel wrote:

 All:

 Spending a relaxing weekend reading through linker documentation, I
 see that the gnu ld linker now has an option --build-id, which can
 take a hex string which represents a hash, and store it in the
 .note.gnu.build-id section of the elf file.  Newer versions of the
 "file" command display this hash.  I would suggest that all of our
 builds try to set this to the git source sha, so we can have better
 tracking of our binaries.  The nice thing about this is it isn't in
 the binary at all, not even in the text segment, so there are fewer
 "accidentally create a dependency that rebuilds the world" issues.

 Do we have the git hash handy when we do the rpm builds?

 -greg

 _______________________________________________
 HTCondor-devel mailing list
 HTCondor-devel@xxxxxxxxxxx
 https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel

 --
 Tim Theisen
 Release Manager
 HTCondor & Open Science Grid
 Center for High Throughput Computing
 Department of Computer Sciences
 University of Wisconsin - Madison
 4261 Computer Sciences and Statistics
 1210 W Dayton St
 Madison, WI 53706-1685
 +1 608 265 5736
