Testing linking behaviour

While fixing linking problems of libg3d on Windows, I read the binutils ld man page again for the first time in years and noticed some useful parameters for checking for linking misbehaviour on Linux. In some rare situations the behaviour is a design decision of upstream and not really a "bug", but in most cases someone forgot to link against a library and their current system configuration hides the problem from them. So it isn't a good idea to insist that upstream has a bug, or to "fix" it without listening to upstream, unless you liked the infamous OpenSSL random number generator bug in Debian-related distributions.

Most of the information below is only correct on systems with lazy loading and lazy binding, and where vague symbols are allowed in shared objects. I don't know how -Bdirect (direct binding on Solaris) affects it, but all patches I know of that bring this functionality to GNU binutils support vague symbols, which are the possible problems we want to detect.

I've prepared some examples to make it easier to understand what the different graphs mean.

Undefined symbol in library

[Figure: ../../../notes/20081209-no_undefined.svg]

This was the problem I had under Windows. Here we have a library which uses functionality from another one but doesn't link against it. The linker only complains about unresolved symbols when we also forget to specify the missing library when linking prog. The reason is the way the dynamic linker locates symbols under Linux: under Windows the linker wants to know which library a symbol comes from, but under Linux the dynamic linker searches for symbols in the pool of symbols exported by all currently loaded libraries. That is also what makes it possible to use LD_PRELOAD to provide or override symbols for a program.

What the dynamic linker does to find a symbol should be equivalent to dlsym(NULL, "symbolname"), but relying on this has several disadvantages: it is not portable to systems with different dynamic linkers, such as Windows, and prog's build system can easily break when the directly used library starts to use more libraries.
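To make the dlsym() comparison concrete, here is a minimal sketch in C, assuming a dynamically linked program on a glibc system. The helper names lookup_global and say_via_lookup are made up for illustration; RTLD_DEFAULT is the handle glibc defines as a null pointer, so it corresponds to the dlsym(NULL, ...) call mentioned above. On older glibc versions the program has to be linked with -ldl.

```c
#define _GNU_SOURCE   /* needed for RTLD_DEFAULT on glibc */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look a symbol up in the global scope of
 * everything that is already loaded into the process.
 * Returns NULL when no loaded object exports the name. */
void *lookup_global(const char *name)
{
    return dlsym(RTLD_DEFAULT, name);
}

/* Hypothetical helper: resolve puts() by name instead of by a
 * link-time reference, then call it. Returns -1 if the lookup fails. */
int say_via_lookup(const char *msg)
{
    int (*fn)(const char *) = (int (*)(const char *))lookup_global("puts");

    if (fn == NULL)
        return -1;
    return fn(msg);
}
```

Calling say_via_lookup("hello") prints the line through whichever puts() comes first in the lookup order, which is the same mechanism LD_PRELOAD relies on to override symbols.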

The easiest way to detect such a problem is to pass --no-undefined to ld, or -Wl,--no-undefined when invoking ld via gcc. Prepare yourself for a big headache, though, when your shared objects get symbols from prog (for example in a plugin system).

Indirectly linked library via an unused library

[Figure: ../../../notes/20081209-as_needed.svg]

A problem you definitely ran into when compiling Gentoo in May 2004 with the newest guides from forums.gentoo.org was that many programs didn't link. The reason was the promotion of -Wl,--as-needed as the ultimate solution for faster startup of KDE applications and other programs with a large number of libraries in their ELF DT_NEEDED entries.

What it really does is look at the supplied shared objects and check whether our program actually uses them. If it does not, the library is not added to DT_NEEDED, so the specified shared object is not loaded on program startup.
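Whether a library really ended up being loaded at startup can be checked from inside the program. The following is only a sketch, assuming glibc's dl_iterate_phdr() from <link.h>; the helper name list_loaded_objects is hypothetical.

```c
#define _GNU_SOURCE   /* dl_iterate_phdr() is an extension, not ISO C */
#include <link.h>
#include <stdio.h>

/* Callback invoked once for every object loaded into the process. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    const char *name = info->dlpi_name;

    (void)size;   /* size of the info structure, unused here */

    /* An empty name usually denotes the main program itself. */
    if (name == NULL || name[0] == '\0')
        name = "(unnamed, usually the main program)";
    printf("%s\n", name);

    ++*count;
    return 0;     /* a non-zero return value would stop the iteration */
}

/* Hypothetical helper: print every loaded object and return how many
 * objects were reported. */
int list_loaded_objects(void)
{
    int count = 0;

    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Building the same prog with and without -Wl,--as-needed and comparing the printed lists should show the unused libraries disappearing.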

In this example we have a libs3d which uses libg3d, and a prog which wants to use functions from libg3d but links only against libs3d. The normal behaviour of the linker is to follow the DT_NEEDED of libs3d and load libg3d, which provides libg3d's symbols to prog. When the build is run with --as-needed, libs3d is discarded because prog doesn't use any of the symbols it provides. Even though this is a big problem for distributions which really want to use AS_NEEDED() in their linker scripts (Mandriva and Gentoo have been working for years to fix it), it is just a side effect of --as-needed and will not provide a good test in most situations.

Indirectly linked, forgotten library

[Figure: ../../../notes/20081209-no_add_needed.svg]

It is really easy to make the --as-needed test useless: prog only has to use one symbol of libs3d. The latent problem remains, though: the build of prog will fail as soon as libs3d stops using libg3d and therefore no longer provides libg3d's symbols to prog.

A better way to test for indirectly linked shared objects is to pass --no-add-needed to ld (newer binutils releases call this option --no-copy-dt-needed-entries). This discards all shared objects that are specified only indirectly, via the DT_NEEDED of directly specified shared objects. So we get (nearly) the same behaviour as with --as-needed in our previous example.

Conclusion

It is a good quick test to use -Wl,--no-add-needed -Wl,--no-undefined in your LDFLAGS when compiling your project, and -Wl,--as-needed can also help to find uselessly linked shared objects. Finding them needs a little more manual work with scanelf -n, but every package maintainer will thank you for reducing the binary dependencies of their packages.