Thoughts on tight coupling of tests to system GNU ld #1120

davidlattimore · 2025-09-27T10:13:44Z

davidlattimore
Sep 27, 2025
Maintainer

Following on from a comment on #1104: the tight coupling of the tests to the system's GNU ld has definitely had its challenges, and I'm unsure what's best to do. Testing against GNU ld (and lld) often makes it easier to see what we did wrong, especially when making a change that's supposed to be a pure refactoring. If a change that shouldn't affect the output causes a test or two to fail, diffing against the GNU ld output often shows exactly what we got wrong in the binary. If the binary just segfaulted or gave the wrong result, it might be a lot harder to figure out what went wrong.

Also, we have on a few occasions discovered actual bugs when we attempted to run the tests on a new platform with different GCC / GNU ld versions / default config etc.

The downside, of course, is that people who run the tests to check that the linker works on their platform see failed tests that don't mean "the linker is broken", just "the linker produced a different result than GNU ld".

We also end up running on a lot of different platforms in our CI, but the linker itself likely doesn't behave differently on them. It's just that the object files produced by GCC, and the system linker we compare against, are slightly different.

For actually developing on the linker and verifying changes to the linker, I think the diff-testing approach works well. For people who just want a quick check that the linker works, I'm less sure.

One option might be to add a setting to test-config.toml that turns diffing on and off. We could then have it off by default, on in CI, and recommend that developers who are making changes turn it on.
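
A minimal sketch of how such a switch could gate the test steps. The names `TestSettings`, `diff_enabled`, `Step` and `planned_steps` are hypothetical and not taken from the existing test harness or from test-config.toml:

```rust
/// Hypothetical test settings; the field name is illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TestSettings {
    /// Whether to diff our output against reference linkers (GNU ld, lld).
    pub diff_enabled: bool,
}

impl Default for TestSettings {
    fn default() -> Self {
        // Off by default, so a quick "does it work here?" run cannot fail
        // purely because the output differs from GNU ld.
        Self { diff_enabled: false }
    }
}

/// The steps a single integration test would perform.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    /// Link with the linker under test and execute the result.
    LinkAndRun,
    /// Compare the output against what the reference linkers produced.
    DiffAgainstReference,
}

/// Linking and running always happens; diffing only when enabled.
pub fn planned_steps(settings: TestSettings) -> Vec<Step> {
    let mut steps = vec![Step::LinkAndRun];
    if settings.diff_enabled {
        steps.push(Step::DiffAgainstReference);
    }
    steps
}
```

CI and developers changing the linker would set `diff_enabled` to true; everyone else would get only the link-and-run step.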

I'm curious as to what others think.

mati865 · 2025-09-27T18:05:42Z

mati865
Sep 27, 2025
Collaborator Sponsor

This would only disable the diffing with other linkers, but the integration tests would still run, right?
Sounds good to me.

1 reply

davidlattimore Oct 1, 2025
Maintainer Author

Yep, the integration tests would still run. There are a few tests where the bulk of the value of the test comes from diffing, but the majority also execute the resulting binary and that would still happen.

daniel-levin · 2025-09-28T09:22:34Z

daniel-levin
Sep 28, 2025

I suggest relaxing (hah) the notion of equality to equivalence in the integration tests. It shouldn't matter if GNU ld or lld report slightly different answers from wild. Indeed, they sometimes disagree with each other. These linkers:

❯ ld.lld --version
LLD 18.1.8 (compatible with GNU linkers)
❯ ld.bfd --version
GNU ld version 2.41-37.fc40

result in these integration test "failures":

---- integration_test::program_name_49___cpp_integration_cc__ stdout ----
wild: /home/levinda/workplace/wild/wild/tests/build/cpp-integration.cc-clang-model-large-host.wild
ld: /home/levinda/workplace/wild/wild/tests/build/cpp-integration.cc-clang-model-large-host.ld
lld: /home/levinda/workplace/wild/wild/tests/build/cpp-integration.cc-clang-model-large-host.lld
rel.extra-opt.R_X86_64_REX_GOTPCRELX.RexMovIndirectToAbsolute.dynamic-non-pie
  `/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../lib64/crt1.o` .text _start
  ORIG 0x000018: [ 48 8b 3d 00 00 00 00 ] mov 0x1F,%rdi
                            ^^^^^^^^^^^ R_X86_64_REX_GOTPCRELX
  ORIG main -4
  wild 0x402688: [ 48 c7 c7 90 27 40 00 ] mov $0x402790,%rdi
                            ^^^^^^^^^^^ R_X86_64_32 RexMovIndirectToAbsolute
  wild main
  wild TRACE: relaxation applied relaxation=RexMovIndirectToAbsolute, value_flags=ADDRESS | NON_INTERPOSABLE,
  wild TRACE: resolution_flags=DIRECT, rel_kind=Absolute,
  wild TRACE: value=0x402790, symbol_name=main
  ld   0x401108: [ 48 8b 3d d9 2e 00 00 ] mov 0x403FE8,%rdi
                            ^^^^^^^^^^^ R_X86_64_REX_GOTPCRELX NoOp
  ld   GOT->main
  lld  0x201d68: [ 48 8d 3d 01 01 00 00 ] lea 0x201E70,%rdi
                            ^^^^^^^^^^^ R_X86_64_PC32 MovIndirectToLea
  lld  main

Moreover, the relaxations are applied to whichever crt1.o the system happens to have.
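
To make the equality-versus-equivalence idea concrete, here is a rough sketch. The types `Access` and `Resolved` are made up for illustration and are not linker-diff's actual data model. It also deliberately ignores whether a given strategy is valid for the kind of output being produced:

```rust
/// How an instruction ends up obtaining the symbol's address after linking.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    /// Address encoded as an immediate (wild above: `mov $0x402790,%rdi`).
    Absolute,
    /// Address computed relative to the instruction pointer (lld above: `lea`).
    PcRelative,
    /// Address loaded from a GOT entry (GNU ld above left the `mov` as it was).
    ViaGot,
}

/// What one linker did with one relocation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Resolved {
    pub symbol: String,
    pub access: Access,
}

/// Strict equality: same symbol and same access strategy.
pub fn identical(a: &Resolved, b: &Resolved) -> bool {
    a == b
}

/// Equivalence: same symbol; the access strategy is free to differ.
pub fn equivalent(a: &Resolved, b: &Resolved) -> bool {
    a.symbol == b.symbol
}
```

With the three results above (all resolving to `main`), `identical` is false for every pair while `equivalent` is true for every pair.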

1 reply

davidlattimore Oct 1, 2025
Maintainer Author

To some extent, linker-diff does try to treat certain things as equivalent, e.g. different formats of PLT entries. I guess the tricky part is working out what is and isn't equivalent. In this case (a defined symbol in a non-relocatable executable), it's valid to do what Wild did, but if it were a shared object, it likely wouldn't be. The fix for this specific diff would probably be to just add rel.extra-opt.R_X86_64_REX_GOTPCRELX.RexMovIndirectToAbsolute.dynamic-non-pie to our default ignore list.
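
Roughly, an ignore list filters reported diff keys before deciding whether the test failed. This is only a sketch: `DEFAULT_IGNORED` and `unexpected_diffs` are hypothetical names, and the real linker-diff ignore mechanism may work differently (for example with prefixes or patterns rather than exact matches):

```rust
/// Diff keys treated as acceptable differences (exact match in this sketch).
pub const DEFAULT_IGNORED: &[&str] = &[
    "rel.extra-opt.R_X86_64_REX_GOTPCRELX.RexMovIndirectToAbsolute.dynamic-non-pie",
];

/// Returns the reported diff keys that are not covered by the ignore list.
/// A test would fail only if this list is non-empty.
pub fn unexpected_diffs<'a>(reported: &[&'a str], ignored: &[&str]) -> Vec<&'a str> {
    reported
        .iter()
        .copied()
        .filter(|key| !ignored.contains(key))
        .collect()
}
```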

I still wonder though if we want to keep imposing diffing against GNU ld on people who are packaging wild and just want a quick assurance that "things basically work".

daniel-levin · 2025-10-05T13:32:08Z

daniel-levin
Oct 5, 2025

I've got a proposal. Wild could have a second and third suite of integration tests, named quite differently from the one that exists now. The second suite of integration tests would pass or fail based on a computation of some kind. That is, they'd have to produce binaries. Those binaries would have to actually load and execute and produce some kind of a signal that they're kosher. You could embed additional data produced by wild as "evidence".

I'll spell out the tradeoffs:

Pros:

Packagers and integrators can run this test suite to check if wild works on the platform they are running on.
Disagreements with lld/ld.bfd are irrelevant. Sometimes those linkers produce un-runnable binaries anyhow (especially on platforms with second or third-tier support like Illumos). This is the main benefit.
This test suite would be virtually identical for non-ELF platforms, which may be what wild's future holds.

Cons:

Not all of wild's outputs are executables.
Bug reports become more burdensome for the reporter, as they have to report their toolchain in detail.
There is no external data to compare wild against.
Wild could produce binaries with errors which don't show up in the test suite.

The third integration test suite is something I am far more in favor of. It requires conviction: conviction that wild is good enough to "self test". For the record, I think it is. That is, each test case:

Uses the ambient toolchain(s) to produce one or more binaries.
Opens those binaries with object and runs assertions.
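
A sketch of what one such assertion could look like. To keep it dependency-free it reads a few ELF header fields by hand instead of using the object crate mentioned above; `ElfSummary` and `summarize_elf` are made-up names, and a real suite would check far more than the header:

```rust
/// A few ELF header fields a test might assert on.
#[derive(Debug, PartialEq, Eq)]
pub struct ElfSummary {
    pub is_64_bit: bool,
    pub little_endian: bool,
    pub e_type: u16,
    pub e_machine: u16,
}

// Values from the ELF specification.
pub const ET_EXEC: u16 = 2; // non-relocatable executable
pub const ET_DYN: u16 = 3; // shared object or position-independent executable
pub const EM_X86_64: u16 = 62;

/// Parses the start of an ELF file; returns None if it isn't a valid ELF header.
pub fn summarize_elf(bytes: &[u8]) -> Option<ElfSummary> {
    // e_ident is 16 bytes, followed by e_type (2 bytes) and e_machine (2 bytes).
    if bytes.len() < 20 || !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    let is_64_bit = match bytes[4] {
        1 => false, // ELFCLASS32
        2 => true,  // ELFCLASS64
        _ => return None,
    };
    let little_endian = match bytes[5] {
        1 => true,  // ELFDATA2LSB
        2 => false, // ELFDATA2MSB
        _ => return None,
    };
    let read_u16 = |offset: usize| {
        let raw = [bytes[offset], bytes[offset + 1]];
        if little_endian {
            u16::from_le_bytes(raw)
        } else {
            u16::from_be_bytes(raw)
        }
    };
    Some(ElfSummary {
        is_64_bit,
        little_endian,
        e_type: read_u16(16),
        e_machine: read_u16(18),
    })
}
```

A test case would read the file that wild produced (e.g. with `std::fs::read`) and assert on the summary, for instance that a non-PIE x86-64 link yields `e_type == ET_EXEC` and `e_machine == EM_X86_64`.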

The primary difference between today's integration tests and this third category is that "what did ld do" is removed as an input entirely. It should be a separate test suite because linker-diff is too useful not to be part of the regular development process: for the most part ld does the right thing, and simply figuring out what it did and what wild did, then comparing them with colorised output and decoded instructions, is invaluable.

The clearest benefit of this third suite over today's integration tests is that whatever it loses in fidelity (if anything) is made up for by the sheer volume of tests that can be run. You can take a particular test case and specialize it for every platform and architecture, so all the idiosyncrasies and subtle differences between platforms are runnable from almost any machine. It'd be great if you could make a change on, say, Linux and see how it cascades to the Illumos/FreeBSD-specific tests without having to run it on such a machine. In my experience this technique, self-inflicted combinatorial explosion, is very effective. Plus, there's nothing like a huge test suite to incentivize people to write fast software.

2 replies

daniel-levin Oct 5, 2025

There is also the possibility of checking hand-made binaries into source control and writing tests against those... it's not particularly pretty. But it would solve the portability problem and decouple the tests from GNU ld.

davidlattimore Oct 6, 2025
Maintainer Author

To an extent, our existing tests operate in all of those ways at once. Many of our tests get executed and indicate that they pass by use of an exit code of 42. We also have a number of assertions that can be made about the files that get produced. e.g. DoesNotContain, Contains, ExpectSym, ExpectDynSym, NoSym. We'd probably have more except that we've relied to an extent on diff-testing to fill that gap.
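
For readers unfamiliar with the harness, the two kinds of check look roughly like this. This is a simplified sketch, not the actual test code: the function names are made up, and treating Contains / DoesNotContain as a raw byte-substring search is an assumption about their semantics:

```rust
use std::process::Command;

/// Exit code a passing test binary returns (42, as described above).
pub const PASS_EXIT_CODE: i32 = 42;

/// A run passes only if the process exited normally with the expected code.
/// `None` means it was killed by a signal, e.g. a segfault.
pub fn exit_code_passes(code: Option<i32>) -> bool {
    code == Some(PASS_EXIT_CODE)
}

/// Runs the linked binary and reports whether it signalled success.
pub fn run_and_check(binary_path: &str) -> std::io::Result<bool> {
    let status = Command::new(binary_path).status()?;
    Ok(exit_code_passes(status.code()))
}

/// Contains-style assertion: does the output file contain these bytes?
pub fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    // `windows(0)` would panic, so handle the empty needle first.
    needle.is_empty() || haystack.windows(needle.len()).any(|w| w == needle)
}

/// DoesNotContain-style assertion, as the negation of `contains`.
pub fn does_not_contain(haystack: &[u8], needle: &[u8]) -> bool {
    !contains(haystack, needle)
}
```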

I should also mention that linker-diff does do a small amount of validation. We should probably have a way to run that even if we're not diffing.

What I'm less sure about is the need for these to be completely independent test suites.

daniel-levin · 2025-10-06T12:21:10Z

daniel-levin
Oct 6, 2025

#1158

Made it possible to disable diffing (and only diffing) via a test setting.

0 replies
