The Wonders of Dynamically Linked Objects

Programming with ELF Dynamic Linking under Linux

Revised Tuesday, March 18, 1997

One of the nicer side effects of the a.out to ELF move was the introduction of a new set of library functions that allow applications to load compiled modules of code at runtime. This facility, when used intelligently, provides great opportunities for dynamic extensibility without any associated recompilation costs. Used carelessly, it leaves a backdoor open for malicious code to behave inappropriately. This article will deal with the native features of Linux-based systems for supporting libraries of existing code.

All About Libraries and Linking...

Perhaps the most common denominator of code reuse in Linux is the use of libraries. Libraries are precompiled binary archives of code and data. Linux programs typically make use of the standard C library, and occasionally the math library.

Using the standard libraries is a good idea for many reasons - it saves you from having to write common routines (and stub interfaces to kernel system calls). Being in common use amongst many other programmers, standard libraries are usually quite well understood. You will find it easier to share code and coding ideas with others, or to look for help when in trouble.

With ELF binaries, there are two distinct methods of linking with libraries - static linking and dynamic linking. Static linking is the simplest. The libraries that a program uses are combined with the code of the application itself - the result is a self-contained executable.

However, statically linking the same libraries again and again with each separate executable is wasteful - both of disk space and also of memory (a precious resource). Hence, the introduction of dynamic linking facilities.

With dynamic linking, the compiled application binds with the shared library code at run-time. The resulting application is much smaller in terms of the number of sectors it uses on disk, and also in its memory footprint. There is now only one copy of the library code on disk for all applications that use it (by dynamically linking), instead of it being combined with each application and stored multiple times (as is the case with static linking). The ld.so program is responsible for performing the dynamic binding between a shared library and an application at run-time.

With memory usage, the situation is a little different. Linux manages memory as finely grained 4 KB allocations called pages. When an application binds at run-time to a library, the library's code and data segments are loaded into memory (by ld.so) if they do not already reside there. The linking is then performed using COPY_ON_WRITE semantics.

In Linux, the virtual memory system gives each application its own private address space. With COPY_ON_WRITE, the library segments are initially mapped into the application's address space in a read-only, shared fashion. If, at some later stage, the application tries to write to part of these shared memory segments (for example, to change or set the value of a library variable), a special exception (called a page fault) will be raised for the offending memory address. The offending page will be copied and privately mapped into the application's address space, before the offending instruction is retried.

In this manner, not only will multiple applications share a single copy of a library on disk, but it is very likely that a high proportion of the library will be shared at run-time, thereby reducing overall memory requirements for the system. With the old a.out format, each library had its own entry point, at which the system tried to map it in virtual memory. With ELF, library code segments are relocatable -- thus, the one library can be mapped anywhere in an application's virtual memory space.

Along with the considerable advantages of an economical, prudent use of system resources, dynamic linking brings with it a few additional complexities compared with static linking. When an application is run, it may not necessarily find all the libraries it requires. If so, it will die with an error message indicating that it could not find the required library. Similarly, an application may find versions of the libraries that differ from the versions it was built with.

The standard search for dynamic libraries first checks the environment variable LD_LIBRARY_PATH, which contains a colon-separated list of directories to search in. After these, the directories listed in /etc/ld.so.conf are searched, followed by /usr/lib and /lib.

When an executable is run, first its code and data segments are mapped directly from the filesystem (or cache) into memory (using mmap()). Next, /etc/ld.so.cache is mmap()'ed into memory and interrogated to see if the executable's shared library dependencies are already memory resident. If not, they are opened and mapped into memory also. Then the shared library code is mmap()'ed into the executable's address space (read-write for data segments, read-execute for code segments). The personality() system call is used to support emulation of foreign binary formats, and is called by the kernel on behalf of an executable before it (the executable) starts. Finally, ld.so does some housekeeping on who the user is (geteuid(), getuid(), getgid(), getegid() are checked for security reasons if the user wants their own library path configuration) and runs the executable code.

TIP: to view the system calls an application makes when it starts up, use the strace command to start it - e.g. strace ls.

Libraries, by convention, are named with both major and minor revision numbers. An increment in a minor number indicates small changes in the library - perhaps a bug fix or two, or relatively minor new features. Programs are not expected to break as a result of minor number changes. Patch levels are a further specialization of the minor number, usually reserved for bug-fixes. Patch levels are only common with libraries that see many iterations, such as the C library.

Major number increments signify large structural or implementation changes to a library - to such a degree that programs linked with a lower major version of the library would not be expected to work with the newer version.

At system boot-up time, symbolic links to the libraries and a cache file of library locations are created by /sbin/ldconfig. Thus, libc.so.5 will always point to the version of the C library with major number 5, and the largest minor version. The naming convention of shared libraries in Linux is libname.so.major.minor.patchlevel - where major is the major version number, minor the minor version number, and patchlevel the patch level: for example, libc.so.5.4.17

An example of what can happen when shared libraries are upgraded or changed is the behaviour of Netscape 3.x up until 4.03 (Communicator). Behaviour of the C library changed in regard to freeing memory after libc.so.5.0.9. Bugs in the Netscape code cause segmentation faults when used with these libraries. A simple workaround to this is to use a shell script wrapper around Netscape to force it to link with an earlier version of the library (which would not be as unforgiving). Assuming both libc.so.5.0.9 and libXpm.so.4.6 are located in /usr/local/netscape/lib, the following script achieves the desired effect.
#!/bin/sh
#
# Convenient to place this as /usr/local/bin/netscape :-)
#

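# NB: no spaces around '=' - the shell would otherwise treat the
# variable name as a command to run
CLASSPATH=/usr/local/netscape/java_301  # handy to setup Java here too
MOZILLA_HOME=/usr/local/netscape        # for communicator

LD_PRELOAD=/usr/local/netscape/lib/libc.so.5.0.9 # or wherever your old
                                                 # libc is

export LD_PRELOAD CLASSPATH MOZILLA_HOME

# "$@" passes each argument on unchanged, even if it contains spaces
/usr/local/netscape/netscape "$@"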

Another example of preloading libraries is libnoflash.so from David Tong, Sun Microsystems (david.tong@sun.com). libnoflash is a small library which aims to eliminate many cases of colormap flashing in X. It works by preloading before the Xlib library, and thereby intercepting calls to the XAllocColor function in the library. It is available (in source form) from http://c3-a.snvl1.sfba.home.com/ColormapFAQ.html.

Okay, now that you know all about shared dynamically linked libraries, let's create one and dynamically link to it.

Dynamic linking requires that the dynamically linked objects not make any assumptions about where they will be placed in a process' address space. This is to ensure that the end-developer's programs are protected as much as possible from implementation details of the shared library. Since this code may be shared between many processes, it needs to be independent of its position in memory. Such code is referred to (logically enough) as position independent code (PIC).

Listing one is a simple function to print a message and return 0. It is going to be compiled as a shared library. Listing two shows the prototype information for other programs which want to call this function (by dynamically linking with the library). Listing three shows just such a program - it looks completely normal. The programmer can be unaware when writing code whether the library is going to be statically or dynamically linked.

Listing four shows a Makefile to build the shared library, and create the application. Note the -L. -lHelloWorld in the LIBS variable passed to gcc ($(CC)). The -L. signifies that the search for libraries at link time is to include the current directory ('.'), and the -lHelloWorld that the library to link with is called libHelloWorld.so. When a library name is specified with -l, the lib prefix and .so suffix are automatically added to it.

The library itself is built in two stages - first a position independent object file is created. Then this object file is linked into a shared library. The procedure is identical regardless of whether you are compiling C or C++ code.


/*
 * Listing 1: helloWorld.c
 * Example shared library code
 *
 */
#include <stdio.h>

int HelloWorld(char *msg)
{
    if (0 == msg)
    {
        puts("Hello, World!");
    }
    else
    {
        puts(msg);
    }

    return 0;
}
Listing 1. Example shared library code
/*
 * Listing 2: helloWorld.h
 * Shared library header file
 * prototypes (and data structures if any)
 */
#ifndef __hello_world_h
#define __hello_world_h

int HelloWorld(char *msg);

#endif
Listing 2. Shared library header file
/*
 * Listing 3:  testcode.c
 * Example code using the shared library
 */

#include "helloWorld.h"

int main(void)
{
    HelloWorld("My Message");

    return 0;
}
Listing 3. Example code using the shared library
# GNUmakefile for test listing 1-3

all: libHelloWorld.so testcode

CFLAGS = -Wall -pedantic
LIBS = -L. -lHelloWorld
INC = 
CC = gcc

testcode: testcode.c libHelloWorld.so
	$(CC) $(CFLAGS) $(INC) -o testcode testcode.c $(LIBS)

libHelloWorld.so: helloWorld.o
	$(CC) $(CFLAGS) -shared -o libHelloWorld.so helloWorld.o

helloWorld.o: helloWorld.c
	$(CC) $(CFLAGS) -fPIC -c helloWorld.c

clean:
	-$(RM) testcode libHelloWorld.so helloWorld.o
Listing 4. Makefile for listings 1-3

Dynamic Loading

With ELF, the dynamic linking tools allow arbitrary ELF binary objects to be accessed - or at least they do on Solaris 2.x (the dynamic link features owe their origins to Solaris). Linux, however, appears to provide this feature only for ELF shared libraries. The procedure is quite simple:

  1. open a "handle" to an ELF shared library. (If the library contains an _init function, it is invoked when the archive is first opened.)
  2. lookup and bind to a particular symbol (variable / function)
  3. use the symbol in your code
  4. close the handle, unloading the library if necessary. (If the archive contains a _fini routine, it is invoked just before the archive is unloaded.)

Doing the Dynamic Thing in C...


/*
 * helloWorld.c
 */

#include <stdio.h>

void HelloWorld(void)
{
    printf("Hello World, from a dynamically linked function.\n");
}
Listing 5. Library code
/*
 * testStub.c
 */

#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(int argc, char *argv[])   /* argv[0] is used in the error messages */
{
    void* handle = 0;
    void (*helloWorld)(void) = 0;
    const char* error = 0;

    handle = dlopen("./libHelloWorld.so", RTLD_LAZY);
    if (0 == handle)
    {
        error = dlerror();
        if (0 != error)
        {
            fprintf(stderr, "%s: %s\n", argv[0], error);
        }

        exit (EXIT_FAILURE);
    }

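    dlerror();    /* clear any stale error state before calling dlsym() */
    helloWorld =  (void(*)(void))dlsym(handle, "HelloWorld");
    error = dlerror();
    if (0 != error || 0 == helloWorld)
    {
        /* a NULL function pointer must never be called, error or not */
        fprintf(stderr, "%s: %s\n", argv[0],
                (0 != error) ? error : "HelloWorld resolved to NULL");
        dlclose(handle);
        exit (EXIT_FAILURE);
    }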

    (*helloWorld)();

    dlclose(handle);

    return (EXIT_SUCCESS);
}
Listing 6. Test stub code
#
# Makefile for test programs
#

all: libHelloWorld.so testStub

CFLAGS = -Wall -pedantic
LIBS = -ldl
CC = gcc

testStub: testStub.c
	$(CC) $(CFLAGS) $(OPTIMIZATIONS) -o testStub testStub.c $(LIBS)

libHelloWorld.so: helloWorld.o
	$(CC) $(CFLAGS) -shared -o libHelloWorld.so helloWorld.o 

helloWorld.o: helloWorld.c
	$(CC) $(CFLAGS) -fPIC -c helloWorld.c

clean:
	-$(RM) testStub libHelloWorld.so helloWorld.o
Listing 7. Makefile for sample code

It is time now for some C. The sample code in listing 5 shows the implementation of a library with a single function (HelloWorld()). Listing 6 contains sample stub code to demonstrate the dynamic invocation of this function in an ELF shared library. Listing 7 gives a simple Makefile to build this example. First, in listing 6, a handle to the library (or ELF object file if Solaris) is obtained using the dlopen() library call. The first argument is the filename of the archive. If this is not an absolute pathname, the search for the library will check:

  • a colon-separated list of directories in the environment variable LD_ELF_LIBRARY.
  • if (and only if) LD_ELF_LIBRARY is not set, then a similar LD_LIBRARY_PATH is checked.
  • next comes the list of libraries specified in /etc/ld.so.conf.
  • then the libraries in /usr/lib.
  • finally, /lib.

If the filename is NULL, a handle is returned for the current program. The handle SHOULD NOT BE interpreted or manipulated directly by the application programmer. By convention, dynamically linked libraries and shared objects are named with the prefix lib and the suffix .so.

After the filename, the next parameter is a flag specifying when the symbols from the library are to be resolved - either RTLD_LAZY (when the object is first used) or RTLD_NOW (immediately) are valid.

If RTLD_GLOBAL is bitwise-OR'ed with the flag, external symbols defined in the library will be made available to subsequently loaded libraries. If the testStub program was linked with the -rdynamic option (e.g. gcc -rdynamic ..., which passes --export-dynamic on to ld), global symbols in the testStub executable will also be used to resolve symbols in the dlopen()'ed object.

Error messages are obtainable in the form of human-readable strings via the dlerror() call.

The dlsym() function takes a valid handle and a null-terminated symbol name. It returns the address where the symbol is loaded. If the symbol could not be found, dlsym() will return NULL. However, the value of the symbol may legitimately be NULL, so a NULL return on its own is ambiguous. dlerror() returns a pointer to an error message if an error did occur, else it returns NULL. Therefore, to be sure an error has occurred, you should test that the value returned by dlerror() is not NULL.

It is necessary to store the value returned by dlerror() into a variable, as subsequent calls to dlerror() will return NULL (and if you had not stored it, you would have lost your error message!!!).

dlclose() is called when the object's symbols are no longer required.

TIP: to view the symbols in a (non-stripped) file, use the nm (1) command. Be prepared to pipe the output through your favourite pager (e.g. less), as it is often quite terse, but lengthy.

Why use this Dynamic Loading?

When would this feature be used? Well, I can think of two powerful uses. Imagine a widget / GUI toolkit that automatically detects whether it is running under X or on an SVGAlib-capable console. It then acts appropriately - depending on which is needed, the GUI library could dynamically load and bind with either the required X11 or SVGAlib libraries.

Another use perhaps would be to add a feature akin to the Amiga's datatypes. With the Amiga, once a datatype for JPEG files (for example) is installed, all datatype-aware applications will automatically be able to handle JPEGs (decode, display, encode etc). Dynamic linking of ELF libraries would allow this feature to be developed for Linux.

Doing the Dynamic Thing in C++...

Linking in C++ is a slight bit trickier than with C. For a start there is the additional complexity of name mangling. With C++, functions can be overloaded by argument (not return type). C++ also provides for type safe linkage. The way the compiler/linker handle these features is to form a unique signature for each function by joining the arguments with the namespace and function name. This process is called name mangling.

Name mangling must be prevented for C functions when linking C and C++ programs. Otherwise the C++ compiler will mangle the C prototype and the linker will not be able to link the required function later. It is prevented by declaring a section of code (prototypes, etc) as having C linkage:

extern "C"
{
    /*
     * declarations inside these curly braces are given C linkage,
     * so their names are not mangled - the code itself is still
     * compiled as C++
     */

    int MyCFunction(int a, char b);
}
Listing 8. Protecting C code from C++ mangling

Where name mangling becomes a problem with dynamic linking is when using the dlsym() function. dlsym() knows nothing about C++ name mangling. Thus if you look for a function by name, you must specify its mangled form if it is a C++ function.

Unfortunately, the ARM (C++ Annotated Reference Manual) actively encourages compiler writers to make their name mangling different from that of other compilers for the same platform. This ensures incompatible libraries are detected at link time rather than at run time, but makes it almost impossible to use dlsym() without making assumptions about the compiler an ELF object was created with.

Virtual base classes are quite useful with dynamic loading. However, due to name mangling and linking incompatibilities, they cannot be pure virtual base classes. The factory method also cannot be a static member of the class for the same reason - it must be a C function, with a C signature. To ensure this, encapsulate it within an extern "C" block as follows, where MyObject is the class from which an object is to be instantiated -- see listing 9.

extern "C"
{
    MyObject* CreateObject(void);
}
Listing 9. Factory "method" prototype

This is actually an example of the Abstract Factory Design Pattern (see figure 1). The motivation behind software design patterns is to get software designs correct faster. A design pattern, simply put, is the record of an elegant solution (design) to a specific, re-occurring problem in object-oriented software design. This pattern-based approach to designing, and the subsequent cataloguing of such design in software "cookbooks" have been readily adopted by the object-oriented community in its quest for the holy grail of software re-usability.


Figure 1. Abstract Factory Design Pattern


In this manner, you can compile your code for a particular type of (object) interface, but dynamically bind to different implementations of this interface at run-time. As a simple example of this, consider an image manipulation base class. Two specializations of this class may support GIF images, and JPEG images -- as shown in figure 2.


Figure 2. Images Inheritance Model


Assume that the base class (Image) supports two methods - Read() for reading from a file, and Print() to dump to a PostScript printer. Each specialization will override these methods with class specific implementations (i.e. an implementation to read and print GIF images, and one to read and print JPEG images). Assume also that the Read() method returns -1 on the error condition of not being able to correctly load the file (i.e. the file is not of the correct type).

An application is built by linking with the base class. At run-time, it wants to load and print a particular image type. It opens the directory where the GIF and JPEG ELF binaries are, and dynamically binds to the GIF binary. It then instantiates a GIF object, and tries to read in the image. If the read fails, then it assumes the image is not in GIF format. It then releases the GIF binary, and repeats the procedure for the JPEG binary. Assume, finally, that the application repeats the procedure for each binary in the directory until one of them successfully reads the image, or they all have failed.

To add functionality for this application to support PNG images, for example, all you have to do is create a PNG class which inherits from Image. Compile this class as an ELF shared library, and place it in the same directory. Now, the application you have created will automatically support PNG images, since it will dynamically bind to this additional functionality at run-time.

Conclusions

This article introduced the dynamic linking and loading features of the ELF binary file format, as supported in Linux. The dynamic loading of libraries and the dynamic binding with C functions was demonstrated. A technique for the instantiation of C++ objects with dynamically bound implementation was presented, via the use of C compile-time linkage and the Abstract Factory design pattern. Design Patterns are outside the scope of this article, but a reference to the most influential pattern "cookbook" is given in the "For further reading" section.

Dynamic loading is a powerful operating system feature, allowing the development of modular, extensible software. Many existing applications make use of this feature to allow for third-party enhancement of functionality via plug-ins -- a notable example of this is Netscape.

It is important to remember, however, that since the dynamically loaded object is in the same virtual memory space as the invoking executable, bugs in its implementation affect the entire application. The GIMP, a powerful image manipulation package, takes an alternative approach to the use of plug-ins by having each plug-in run as a separate process, which communicates with the main process via the system's IPC facilities.


For further reading

  • dlopen (3) Linux man pages by Adam Richter.
  • dlopen (3X), dlsym (3X), dlerror (3X), dlclose (3X) Solaris man pages
  • comp.lang.c++ FAQ, originally by Marshall Cline (cline@parashift.com), now maintained by Herb Sutter (herbs@connobj.com)
  • Design Patterns: Elements of Reusable Object-Oriented Software, Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Addison-Wesley, Reading, Massachusetts, October, 1994. ISBN 0-201-63361-2.
  • libnoflash source and documentation available from http://c3-a.snvl1.sfba.home.com/ColormapFAQ.html
Copyright 1997, Ivan Griffin.