Linker and Libraries Guide

Chapter 4 Shared Objects

Shared objects are one form of output created by the link-editor and are generated by specifying the -G option. In the following example, the shared object libfoo.so.1 is generated from the input file foo.c.

$ cc -o libfoo.so.1 -G -K pic foo.c

A shared object is an indivisible unit that is generated from one or more relocatable objects. Shared objects can be bound with dynamic executables to form a runnable process. As their name implies, shared objects can be shared by more than one application. Because of this potentially far-reaching effect, this chapter describes this form of link-editor output in greater depth than has been covered in previous chapters.

For a shared object to be bound to a dynamic executable or another shared object, it must first be available to the link-edit of the required output file. During this link-edit, any input shared objects are interpreted as if they had been added to the logical address space of the output file being produced. All the functionality of the shared object is made available to the output file.

Any input shared objects become dependencies of this output file. A small amount of bookkeeping information is maintained within the output file to describe these dependencies. The runtime linker interprets this information and completes the processing of these shared objects as part of creating a runnable process.

The following sections expand upon the use of shared objects within the compilation and runtime environments. These environments are introduced in Runtime Linking.

Naming Conventions

Neither the link-editor nor the runtime linker interprets any file by virtue of its file name. All files are inspected to determine their ELF type (see ELF Header). This information enables the link-editor to deduce the processing requirements of the file. However, shared objects usually follow one of two naming conventions, depending on whether they are being used as part of the compilation environment or the runtime environment.

When used as part of the compilation environment, shared objects are read and processed by the link-editor. Although these shared objects can be specified by explicit file names as part of the command passed to the link-editor, the -l option is usually used to take advantage of the link-editor's library search capabilities. See Shared Object Processing.

A shared object that is applicable to this link-editor processing should be designated with the prefix lib and the suffix .so. For example, /lib/libc.so is the shared object representation of the standard C library made available to the compilation environment. By convention, 64–bit shared objects are placed in a subdirectory of the lib directory called 64. For example, the 64–bit counterpart of /lib/libc.so.1 is /lib/64/libc.so.1.

When used as part of the runtime environment, shared objects are read and processed by the runtime linker. To allow for change in the exported interface of the shared object over a series of software releases, provide the shared object as a versioned file name.

A versioned file name commonly takes the form of a .so suffix followed by a version number. For example, /lib/libc.so.1 is the shared object representation of version one of the standard C library made available to the runtime environment.
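
As a rough illustration of this convention, the following sketch extracts the version number that follows the .so suffix. Neither the link-editor nor the runtime linker interprets names this way, so this is only something a packaging or audit tool might do. The helper name so_version is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the version number that follows the last ".so." in the file
 * name component of path, or -1 if the name carries no version.
 */
long so_version(const char *path)
{
    const char *base, *p, *last = NULL;
    char *end;
    long version;

    assert(path != NULL);

    /* Only the final path component is considered. */
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    /* Locate the last occurrence of ".so" in the file name. */
    for (p = strstr(base, ".so"); p != NULL; p = strstr(p + 1, ".so"))
        last = p;

    if (last == NULL || last[3] != '.')
        return -1;              /* for example, libc.so */

    version = strtol(last + 4, &end, 10);
    if (end == last + 4)
        return -1;              /* no digits after ".so." */
    return version;
}
```

With this sketch, so_version("/lib/libc.so.1") yields 1, while the compilation environment name /lib/libc.so yields -1.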

If a shared object is never intended for use within a compilation environment, its name might drop the conventional lib prefix. Examples of shared objects that fall into this category are those used solely with dlopen(3C). A suffix of .so is still recommended to indicate the actual file type. In addition, a version number is strongly recommended to provide for the correct binding of the shared object across a series of software releases. Chapter 5, Application Binary Interfaces and Versioning describes versioning in more detail.


Note –

The shared object name used in a dlopen(3C) is usually represented as a simple file name that has no `/' in the name. The runtime linker can then use a set of rules to locate the actual file. See Loading Additional Objects for more details.


Recording a Shared Object Name

The recording of a dependency in a dynamic executable or shared object will, by default, be the file name of the associated shared object as it is referenced by the link-editor. For example, the following dynamic executables, that are built against the same shared object libfoo.so, result in different interpretations of the same dependency.


$ cc -o ../tmp/libfoo.so -G foo.o
$ cc -o prog main.o -L../tmp -lfoo
$ elfdump -d prog | grep NEEDED
       [1]  NEEDED        0x123         libfoo.so

$ cc -o prog main.o ../tmp/libfoo.so
$ elfdump -d prog | grep NEEDED
       [1]  NEEDED        0x123         ../tmp/libfoo.so

$ cc -o prog main.o /usr/tmp/libfoo.so
$ elfdump -d prog | grep NEEDED
       [1]  NEEDED        0x123         /usr/tmp/libfoo.so

As these examples show, this mechanism of recording dependencies can result in inconsistencies due to different compilation techniques. Also, the location of a shared object as referenced during the link-edit might differ from the eventual location of the shared object on an installed system. To provide a more consistent means of specifying dependencies, shared objects can record within themselves the file name by which they should be referenced at runtime.

During the link-edit of a shared object, its runtime name can be recorded within the shared object itself by using the -h option. In the following example, the shared object's runtime name, libfoo.so.1, is recorded within the file itself. This identification is known as an soname.


$ cc -o ../tmp/libfoo.so -G -K pic -h libfoo.so.1 foo.c

The following example shows how the soname recording can be displayed using elfdump(1) and referring to the entry that has the SONAME tag.


$ elfdump -d ../tmp/libfoo.so | grep SONAME
       [1]  SONAME        0x123         libfoo.so.1

When the link-editor processes a shared object that contains an soname, this is the name that is recorded as a dependency within the output file being generated.

If this new version of libfoo.so is used during the creation of the dynamic executable prog from the previous example, all three methods of creating the executable result in the same dependency recording.


$ cc -o prog main.o -L../tmp -lfoo
$ elfdump -d prog | grep NEEDED
       [1]  NEEDED        0x123         libfoo.so.1

$ cc -o prog main.o ../tmp/libfoo.so
$ elfdump -d prog | grep NEEDED
       [1]  NEEDED        0x123         libfoo.so.1

$ cc -o prog main.o /usr/tmp/libfoo.so
$ elfdump -d prog | grep NEEDED
       [1]  NEEDED        0x123         libfoo.so.1

In the previous examples, the -h option is used to specify a simple file name, that has no `/' in the name. This convention enables the runtime linker to use a set of rules to locate the actual file. See Locating Shared Object Dependencies for more details.

Inclusion of Shared Objects in Archives

The mechanism of recording an soname within a shared object is essential if the shared object is ever processed from an archive library.

An archive can be built from one or more shared objects and then used to generate a dynamic executable or shared object. Shared objects can be extracted from the archive to satisfy the requirements of the link-edit. Unlike the processing of relocatable objects, which are concatenated to the output file being created, any shared objects extracted from the archive are recorded as dependencies. See Archive Processing for more details on the criteria for archive extraction.

The name of an archive member is constructed by the link-editor and is a concatenation of the archive name and the object within the archive. For example.


$ cc -o libfoo.so.1 -G -K pic foo.c
$ ar -r libfoo.a libfoo.so.1
$ cc -o main main.o libfoo.a
$ elfdump -d main | grep NEEDED
       [1]  NEEDED        0x123         libfoo.a(libfoo.so.1)

Because a file with this concatenated name is unlikely to exist at runtime, providing an soname within the shared object is the only means of generating a meaningful runtime file name for the dependency.


Note –

The runtime linker does not extract objects from archives. Therefore, in this example, the required shared object dependencies must be extracted from the archive and made available to the runtime environment.


Recorded Name Conflicts

When shared objects are used to create a dynamic executable or another shared object, the link-editor performs several consistency checks. These checks ensure that any dependency names recorded in the output file are unique.

Conflicts in dependency names can occur if two shared objects used as input files to a link-edit both contain the same soname. For example.


$ cc -o libfoo.so -G -K pic -h libsame.so.1 foo.c
$ cc -o libbar.so -G -K pic -h libsame.so.1 bar.c
$ cc -o prog main.o -L. -lfoo -lbar
ld: fatal: recording name conflict: file `./libfoo.so' and \
    file `./libbar.so' provide identical dependency names: libsame.so.1
ld: fatal: File processing errors. No output written to prog

A similar error condition occurs if the file name of a shared object that does not have a recorded soname matches the soname of another shared object used during the same link-edit.

If the runtime name of a shared object being generated matches one of its dependencies, the link-editor also reports a name conflict.


$ cc -o libbar.so -G -K pic -h libsame.so.1 bar.c -L. -lfoo
ld: fatal: recording name conflict: file `./libfoo.so' and \
    -h option provide identical dependency names: libsame.so.1
ld: fatal: File processing errors. No output written to libbar.so
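
The consistency checks described in this section can be approximated by the following sketch. It is a simplified model, not the link-editor's implementation. The structure and function names are hypothetical, and each input is represented only by the name it was referenced by and its optional soname.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct input {
    const char *referenced; /* name as referenced, for example "libfoo.so" */
    const char *soname;     /* recorded soname, or NULL if none */
};

/* The name recorded for a dependency: its soname when one exists. */
static const char *recorded_name(const struct input *in)
{
    return (in->soname != NULL) ? in->soname : in->referenced;
}

/*
 * Return the first dependency name that is not unique, or NULL when
 * no conflict exists. out_soname models the -h value of the object
 * being generated, or NULL when -h is not used.
 */
const char *find_name_conflict(const struct input *in, size_t n,
                               const char *out_soname)
{
    size_t i, j;

    assert(in != NULL || n == 0);

    for (i = 0; i < n; i++) {
        const char *name = recorded_name(&in[i]);

        /* Conflict between a dependency and the output's own name. */
        if (out_soname != NULL && strcmp(name, out_soname) == 0)
            return name;

        /* Conflict between two input shared objects. */
        for (j = i + 1; j < n; j++)
            if (strcmp(name, recorded_name(&in[j])) == 0)
                return name;
    }
    return NULL;
}
```

Applied to the inputs libfoo.so and libbar.so from the first example, both carrying the soname libsame.so.1, the sketch reports libsame.so.1 as the conflicting name.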

Shared Objects With Dependencies

Shared objects can have their own dependencies. The search rules used by the runtime linker to locate shared object dependencies are covered in Directories Searched by the Runtime Linker. If a shared object does not reside in one of the default search directories, then the runtime linker must explicitly be told where to look. For 32–bit objects, the default search directories are /lib and /usr/lib. For 64–bit objects, the default search directories are /lib/64 and /usr/lib/64. The preferred mechanism for indicating the requirement of a non-default search path is to record a runpath in the object that has the dependencies. A runpath can be recorded by using the link-editor's -R option.

In the following example, the shared object libfoo.so has a dependency on libbar.so, which is expected to reside in the directory /home/me/lib at runtime or, failing that, in the default location.


$ cc -o libbar.so -G -K pic bar.c
$ cc -o libfoo.so -G -K pic foo.c -R/home/me/lib -L. -lbar
$ elfdump -d libfoo.so | egrep "NEEDED|RUNPATH"
       [1]  NEEDED        0x123         libbar.so
       [2]  RUNPATH       0x456         /home/me/lib

The shared object is responsible for specifying all runpaths required to locate its dependencies. Any runpaths specified in the dynamic executable are only used to locate the dependencies of the dynamic executable. These runpaths are not used to locate any dependencies of the shared objects.
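
The search described above, runpath first and then the default directories, can be sketched as follows. This is a simplified model with hypothetical names. It ignores the LD_LIBRARY_PATH variables and any token expansion, skips empty runpath elements, and only builds the candidate path names rather than testing whether the files exist.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PATH_LEN 256

/*
 * Fill out[] with candidate path names for the dependency name:
 * each colon-separated runpath directory in order, followed by the
 * default directories for the ELF class. Returns the number of
 * candidates written, which never exceeds max.
 */
size_t candidate_paths(const char *runpath, const char *name, int is64,
                       char out[][PATH_LEN], size_t max)
{
    static const char *const def32[] = { "/lib", "/usr/lib" };
    static const char *const def64[] = { "/lib/64", "/usr/lib/64" };
    const char *const *defs = is64 ? def64 : def32;
    size_t n = 0, i;

    assert(name != NULL && out != NULL);

    while (runpath != NULL && *runpath != '\0' && n < max) {
        size_t len = strcspn(runpath, ":");

        if (len > 0)    /* empty elements are skipped in this sketch */
            snprintf(out[n++], PATH_LEN, "%.*s/%s",
                     (int)len, runpath, name);
        runpath += len;
        if (*runpath == ':')
            runpath++;
    }

    for (i = 0; i < 2 && n < max; i++)
        snprintf(out[n++], PATH_LEN, "%s/%s", defs[i], name);
    return n;
}
```

For the 32–bit libfoo.so in the example, the candidates for libbar.so would be /home/me/lib/libbar.so, /lib/libbar.so, and /usr/lib/libbar.so, in that order.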

The LD_LIBRARY_PATH family of environment variables has a more global scope. Any path names specified using these variables are used by the runtime linker to search for any shared object dependencies. Although useful as a temporary mechanism that influences the runtime linker's search path, the use of these environment variables is strongly discouraged in production software. See Directories Searched by the Runtime Linker for a more extensive discussion.

Dependency Ordering

When dynamic executables and shared objects have dependencies on the same common shared objects, the order in which the objects are processed can become less predictable.

For example, assume a shared object developer generates libfoo.so.1 with the following dependencies.


$ ldd libfoo.so.1
        libA.so.1 =>     ./libA.so.1
        libB.so.1 =>     ./libB.so.1
        libC.so.1 =>     ./libC.so.1

If you create a dynamic executable prog, using this shared object, and define an explicit dependency on libC.so.1, the resulting shared object order will be as follows.


$ cc -o prog main.c -R. -L. -lC -lfoo
$ ldd prog
        libC.so.1 =>     ./libC.so.1
        libfoo.so.1 =>   ./libfoo.so.1
        libA.so.1 =>     ./libA.so.1
        libB.so.1 =>     ./libB.so.1

Any requirement on the order of processing the shared object libfoo.so.1 dependencies would be compromised by the construction of the dynamic executable prog.

Developers who place special emphasis on symbol interposition and .init section processing should be aware of this potential change in shared object processing order.
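
The change in order shown by the two ldd(1) listings can be reproduced with a small model. The following sketch assumes that dependencies are processed breadth-first, with each object recorded once at its first appearance, which is consistent with the output above. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPS 8

struct object {
    const char *name;
    const char *needed[MAX_DEPS];   /* NULL-terminated NEEDED list */
};

static const struct object *find(const struct object *objs, size_t n,
                                 const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(objs[i].name, name) == 0)
            return &objs[i];
    return NULL;    /* no entry: treated as having no dependencies */
}

static int seen(const char *order[], size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(order[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Fill order[] with the dependencies of root in breadth-first order.
 * The order[] array doubles as the work queue. Returns the number of
 * entries written, which never exceeds max.
 */
size_t load_order(const struct object *objs, size_t nobjs, const char *root,
                  const char *order[], size_t max)
{
    const struct object *cur;
    size_t count = 0, head = 0, i;

    assert(objs != NULL && root != NULL && order != NULL);

    cur = find(objs, nobjs, root);
    while (cur != NULL) {
        for (i = 0; i < MAX_DEPS && cur->needed[i] != NULL; i++) {
            const char *dep = cur->needed[i];

            if (count < max && strcmp(dep, root) != 0 &&
                !seen(order, count, dep))
                order[count++] = dep;
        }
        /* Advance to the next queued object that has dependencies. */
        cur = NULL;
        while (cur == NULL && head < count)
            cur = find(objs, nobjs, order[head++]);
    }
    return count;
}
```

Given prog with the dependencies libC.so.1 and libfoo.so.1, the sketch yields libC.so.1, libfoo.so.1, libA.so.1, libB.so.1. Starting from libfoo.so.1 alone, it yields libA.so.1, libB.so.1, libC.so.1.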

Shared Objects as Filters

Shared objects can be defined to act as filters. This technique involves associating the interfaces that the filter provides with an alternative shared object. At runtime, the alternative shared object supplies one or more of the interfaces provided by the filter. This alternative shared object is referred to as a filtee. A filtee is built in the same manner as any shared object is built.

Filtering provides a mechanism of abstracting the compilation environment from the runtime environment. At link-edit time, a symbol reference that binds to a filter interface is resolved to the filter's symbol definition. At runtime, a symbol reference that binds to a filter interface can be redirected to an alternative shared object.

Individual interfaces that are defined within a shared object can be defined as filters by using the mapfile keywords FILTER or AUXILIARY. Alternatively, a shared object can define all of the interfaces the shared object offers as filters by using the link-editor's -F or -f flag. These techniques are typically used individually, but can also be combined within the same shared object.

Two forms of filtering exist.

Standard filtering

This filtering requires only a symbol table entry for the interface being filtered. At runtime, the implementation of a filter symbol definition must be provided from a filtee.

Interfaces are defined to act as standard filters by using the link-editor's mapfile keyword FILTER, or by using the link-editor's -F flag. This mapfile keyword or flag is qualified with the name of one or more filtees that must supply the symbol definition at runtime.

A filtee that cannot be processed at runtime is skipped. A standard filter symbol that cannot be located within the filtee also causes the filtee to be skipped. In both of these cases, the symbol definition provided by the filter is not used to satisfy this symbol lookup.

Auxiliary filtering

This filtering provides a similar mechanism to standard filtering, except the filter provides a fall back implementation corresponding to the auxiliary filter interfaces. At runtime, the implementation of the symbol definition can be provided from a filtee.

Interfaces are defined to act as auxiliary filters by using the link-editor's mapfile keyword AUXILIARY, or by using the link-editor's -f flag. This mapfile keyword or flag is qualified with the name of one or more filtees that can supply the symbol definition at runtime.

A filtee that cannot be processed at runtime is skipped. An auxiliary filter symbol that cannot be located within the filtee also causes the filtee to be skipped. In both of these cases, the symbol definition provided by the filter is used to satisfy this symbol lookup.
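
The difference between the two forms of filtering can be summarized by the following sketch. It models a single filtee only, and the type and function names are hypothetical. It is not a description of how the runtime linker is implemented.

```c
#include <assert.h>

enum filter_kind { STANDARD_FILTER, AUXILIARY_FILTER };

enum source {
    FROM_FILTEE,    /* the filtee supplies the definition */
    FROM_FILTER,    /* the filter's own definition is used */
    NOT_SATISFIED   /* the filter does not satisfy this lookup */
};

/*
 * Decide where a filter symbol's definition comes from at runtime.
 * filtee_loaded is nonzero if the filtee could be processed, and
 * filtee_has_symbol is nonzero if the filtee defines the symbol.
 */
enum source resolve_filter_symbol(enum filter_kind kind, int filtee_loaded,
                                  int filtee_has_symbol)
{
    assert(kind == STANDARD_FILTER || kind == AUXILIARY_FILTER);

    if (filtee_loaded && filtee_has_symbol)
        return FROM_FILTEE;

    /* The filtee is skipped. Only an auxiliary filter falls back. */
    return (kind == AUXILIARY_FILTER) ? FROM_FILTER : NOT_SATISFIED;
}
```

The only case in which the two forms differ is when the filtee is skipped: the auxiliary filter then supplies its own definition, while the standard filter supplies nothing.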

Generating Standard Filters

To generate a standard filter, you first define a filtee on which the filtering is applied. The following example builds a filtee filtee.so.1, supplying the symbols foo and bar.


$ cat filtee.c
char * bar = "defined in filtee";

char * foo()
{
        return("defined in filtee");
}
$ cc -o filtee.so.1 -G -K pic filtee.c

Standard filtering can be provided in one of two ways. To declare all of the interfaces offered by a shared object to be filters, use the link-editor's -F flag. To declare individual interfaces of a shared object to be filters, use a link-editor mapfile and the FILTER keyword.

In the following example, the shared object filter.so.1 is defined to be a filter. filter.so.1 offers the symbols foo and bar, and is a filter on the filtee filtee.so.1. In this example, the environment variable LD_OPTIONS is used to prevent the compiler driver from interpreting the -F option.


$ cat filter.c
#include <stddef.h>

char * bar = NULL;

char * foo()
{
	return (NULL);
}
$ LD_OPTIONS='-F filtee.so.1' \
cc -o filter.so.1 -G -K pic -h filter.so.1 -R. filter.c
$ elfdump -d filter.so.1 | egrep "SONAME|FILTER"
    [2]  SONAME           0xee     filter.so.1
    [3]  FILTER           0xfb     filtee.so.1

The link-editor can reference the standard filter filter.so.1 as a dependency when creating a dynamic executable or shared object. The link-editor uses information from the symbol table of the filter to satisfy any symbol resolution. However, at runtime, any reference to the symbols of the filter results in the additional loading of the filtee filtee.so.1. The runtime linker uses the filtee to resolve any symbols defined by filter.so.1. If the filtee is not found, or a filter symbol is not found in the filtee, the filter is skipped for this symbol lookup.

For example, the following dynamic executable prog references the symbols foo and bar, which are resolved during link-edit from the filter filter.so.1. The execution of prog results in foo and bar being obtained from the filtee filtee.so.1, not from the filter filter.so.1.


$ cat main.c
#include <stdio.h>

extern char * bar, * foo();

int main()
{
        (void) printf("foo is %s: bar is %s\n", foo(), bar);
        return (0);
}
$ cc -o prog main.c -R. filter.so.1
$ prog
foo is defined in filtee: bar is defined in filtee

In the following example, the shared object filter.so.2 defines one of its interfaces, foo, to be a filter on the filtee filtee.so.1.


Note –

As no source code is supplied for foo(), the mapfile keyword, FUNCTION, is used to ensure a symbol table entry for foo is created.



$ cat filter.c
char * bar = "defined in filter";
$ cat mapfile
{
	global:
		foo = FUNCTION FILTER filtee.so.1;
};
$ cc -o filter.so.2 -G -K pic -h filter.so.2 -M mapfile -R. filter.c
$ elfdump -d filter.so.2 | egrep "SONAME|FILTER"
    [2]  SONAME           0xd8     filter.so.2
    [3]  SUNW_FILTER      0xfb     filtee.so.1
$ elfdump -y filter.so.2 | egrep "foo|bar"
    [1]  F    [3] filtee.so.1      foo
   [10]  D        <self>           bar

At runtime, any reference to the symbol foo of the filter results in the additional loading of the filtee filtee.so.1. The runtime linker uses the filtee to resolve only the symbol foo defined by filter.so.2. Reference to the symbol bar always uses the symbol from filter.so.2, as no filtee processing is defined for this symbol.

For example, the following dynamic executable prog references the symbols foo and bar, which are resolved during link-edit from the filter filter.so.2. The execution of prog results in foo being obtained from the filtee filtee.so.1, and bar being obtained from the filter filter.so.2.


$ cc -o prog main.c -R. filter.so.2
$ prog
foo is defined in filtee: bar is defined in filter

In these examples, the filtee filtee.so.1 is uniquely associated to the filter. The filtee is not available to satisfy symbol lookup from any other objects that might be loaded as a consequence of executing prog.

Standard filters provide a convenient mechanism for defining a subset interface of an existing shared object. Standard filters provide for the creation of an interface group spanning a number of existing shared objects. Standard filters also provide a means of redirecting an interface to its implementation. Several standard filters are used in the Solaris OS.

The /usr/lib/libsys.so.1 filter provides a subset of the standard C library /usr/lib/libc.so.1. This subset represents the ABI-conforming functions and data items that reside in the C library that must be imported by a conforming application.

The /lib/libxnet.so.1 filter uses multiple filtees. This library provides socket and XTI interfaces from /lib/libsocket.so.1, /lib/libnsl.so.1, and /lib/libc.so.1.

libc.so.1 defines interface filters to the runtime linker. These interfaces provide an abstraction between the symbols referenced in a compilation environment from libc.so.1, and the actual implementation binding produced within the runtime environment to ld.so.1(1).

libnsl.so.1 defines the standard filter gethostname(3C) against libc.so.1. Historically, both libnsl.so.1 and libc.so.1 have provided the same implementation for this symbol. By establishing libnsl.so.1 as a filter, only one implementation of gethostname() need exist. As libnsl.so.1 continues to export gethostname(), the interface of this library continues to remain compatible with previous releases.

Because the code in a standard filter is never referenced at runtime, adding content to any functions defined as filters is redundant. Any filter code might require relocation, which would result in an unnecessary overhead when processing the filter at runtime. Functions are best defined as empty routines, or directly from a mapfile. See Defining Additional Symbols with a mapfile.

When generating data symbols within a filter, always associate the data with a section. This association can be produced by defining the symbol within a relocatable object file. This association can also be produced by defining the symbol within a mapfile together with a size declaration and no value declaration. See Defining Additional Symbols with a mapfile. The resulting data definition ensures that references from a dynamic executable are established correctly.

Some of the more complex symbol resolutions carried out by the link-editor require knowledge of a symbol's attributes, including the symbol's size. Therefore, you should generate the symbols in the filter so that their attributes match the attributes of the symbols in the filtee. Maintaining attribute consistency ensures that the link-editing process analyzes the filter in a manner that is compatible with the symbol definitions used at runtime. See Symbol Resolution.


Note –

The link-editor uses the ELF class of the first relocatable file that is processed to govern the class of object that is created. Use the link-editor's -64 option to create a 64–bit filter solely from a mapfile.


Generating Auxiliary Filters

To generate an auxiliary filter, you first define a filtee on which the filtering is applied. The following example builds a filtee filtee.so.1, supplying the symbol foo.


$ cat filtee.c
char * foo()
{
        return("defined in filtee");
}
$ cc -o filtee.so.1 -G -K pic filtee.c

Auxiliary filtering can be provided in one of two ways. To declare all of the interfaces offered by a shared object to be auxiliary filters, use the link-editor's -f flag. To declare individual interfaces of a shared object to be auxiliary filters, use a link-editor mapfile and the AUXILIARY keyword.

In the following example, the shared object filter.so.1 is defined to be an auxiliary filter. filter.so.1 offers the symbols foo and bar, and is an auxiliary filter on the filtee filtee.so.1. In this example, the environment variable LD_OPTIONS is used to prevent the compiler driver from interpreting the -f option.


$ cat filter.c
char * bar = "defined in filter";

char * foo()
{
        return ("defined in filter");
}
$ LD_OPTIONS='-f filtee.so.1' \
cc -o filter.so.1 -G -K pic -h filter.so.1 -R. filter.c
$ elfdump -d filter.so.1 | egrep "SONAME|AUXILIARY"
    [2]  SONAME           0xee     filter.so.1
    [3]  AUXILIARY        0xfb     filtee.so.1

The link-editor can reference the auxiliary filter filter.so.1 as a dependency when creating a dynamic executable or shared object. The link-editor uses information from the symbol table of the filter to satisfy any symbol resolution. However, at runtime, any reference to the symbols of the filter results in a search for the filtee filtee.so.1. If this filtee is found, the runtime linker uses the filtee to resolve any symbols defined by filter.so.1. If the filtee is not found, or a symbol from the filter is not found in the filtee, then the original symbol within the filter is used.

For example, the following dynamic executable prog references the symbols foo and bar, which are resolved during link-edit from the filter filter.so.1. The execution of prog results in foo being obtained from the filtee filtee.so.1, not from the filter filter.so.1. However, bar is obtained from the filter filter.so.1, as this symbol has no alternative definition in the filtee filtee.so.1.


$ cat main.c
#include <stdio.h>

extern char * bar, * foo();

int main()
{
        (void) printf("foo is %s: bar is %s\n", foo(), bar);
        return (0);
}
$ cc -o prog main.c -R. filter.so.1
$ prog
foo is defined in filtee: bar is defined in filter

In the following example, the shared object filter.so.2 defines the interface foo to be an auxiliary filter on the filtee filtee.so.1.


$ cat filter.c
char * bar = "defined in filter";

char * foo()
{
        return ("defined in filter");
}
$ cat mapfile
{
	global:
		foo = AUXILIARY filtee.so.1;
};
$ cc -o filter.so.2 -G -K pic -h filter.so.2 -M mapfile -R. filter.c
$ elfdump -d filter.so.2 | egrep "SONAME|AUXILIARY"
    [2]  SONAME           0xd8     filter.so.2
    [3]  SUNW_AUXILIARY   0xfb     filtee.so.1
$ elfdump -y filter.so.2 | egrep "foo|bar"
    [1]  A    [3] filtee.so.1      foo
   [10]  D        <self>           bar

At runtime, any reference to the symbol foo of the filter results in a search for the filtee filtee.so.1. If the filtee is found, the filtee is loaded. The filtee is then used to resolve the symbol foo defined by filter.so.2. If the filtee is not found, the symbol foo defined by filter.so.2 is used. Reference to the symbol bar always uses the symbol from filter.so.2, as no filtee processing is defined for this symbol.

For example, the following dynamic executable prog references the symbols foo and bar, which are resolved during link-edit from the filter filter.so.2. If the filtee filtee.so.1 exists, the execution of prog results in foo being obtained from the filtee filtee.so.1, and bar being obtained from the filter filter.so.2.


$ cc -o prog main.c -R. filter.so.2
$ prog
foo is defined in filtee: bar is defined in filter

If the filtee filtee.so.1 does not exist, the execution of prog results in foo and bar being obtained from the filter filter.so.2.


$ prog
foo is defined in filter: bar is defined in filter

In these examples, the filtee filtee.so.1 is uniquely associated to the filter. The filtee is not available to satisfy symbol lookup from any other objects that might be loaded as a consequence of executing prog.

Auxiliary filters provide a mechanism for defining an alternative interface of an existing shared object. This mechanism is used in the Solaris OS to provide optimized functionality within hardware capability, and platform specific shared objects. See Hardware Capability Specific Shared Objects, Instruction Set Specific Shared Objects, and System Specific Shared Objects for examples.


Note –

The environment variable LD_NOAUXFLTR can be set to disable the runtime linker's auxiliary filter processing. Because auxiliary filters are frequently employed to provide platform specific optimizations, this option can be useful in evaluating filtee use and their performance impact.


Filtering Combinations

Individual interfaces that define standard filters, together with individual interfaces that define auxiliary filters, can be defined within the same shared object. This combination of filter definitions is achieved by using the mapfile keywords FILTER and AUXILIARY to assign the required filtees.

A shared object that defines all of its interfaces to be filters by using the -F or -f option is either a standard filter or an auxiliary filter.

A shared object can define individual interfaces to act as filters, together with defining all the interfaces of the object to act as filters. In this case, the individual filtering defined for an interface is processed first. When a filtee for an individual interface filter cannot be established, the filtee defined for all the interfaces of the filter provides a fall back if appropriate.

For example, consider the filter filter.so.1. This filter defines that all interfaces act as auxiliary filters against the filtee filtee.so.1 using the link-editor's -f flag. filter.so.1 also defines the individual interface foo to be a standard filter against the filtee foo.so.1 using the mapfile keyword FILTER. filter.so.1 also defines the individual interface bar to be an auxiliary filter against the filtee bar.so.1 using the mapfile keyword AUXILIARY.

An external reference to foo results in processing the filtee foo.so.1. If foo cannot be found from foo.so.1, then no further processing of the filter is carried out. In this case, no fall back processing is performed because foo is defined to be a standard filter.

An external reference to bar results in processing the filtee bar.so.1. If bar cannot be found from bar.so.1, then processing falls back to the filtee filtee.so.1. In this case, fall back processing is performed because bar is defined to be an auxiliary filter. If bar cannot be found from filtee.so.1, then the definition of bar within the filter filter.so.1 is finally used to resolve the external reference.
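
The fall back sequence for filter.so.1 can be modeled with the following sketch. It covers only the combination described in this example, that is, per-interface filters layered over an object-wide auxiliary filter established with -f. All names are hypothetical, and a filtee that cannot be processed is represented by a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kind { NONE, STANDARD, AUXILIARY };

struct filtee {
    const char *name;
    const char *symbols[4];     /* NULL-terminated list of definitions */
};

static int provides(const struct filtee *f, const char *sym)
{
    size_t i;

    if (f == NULL)              /* filtee could not be processed */
        return 0;
    for (i = 0; i < 4 && f->symbols[i] != NULL; i++)
        if (strcmp(f->symbols[i], sym) == 0)
            return 1;
    return 0;
}

/*
 * Return the name of the object that supplies sym, or NULL if the
 * filter is skipped for this lookup. sym_kind and sym_filtee describe
 * the per-interface filter. aux_filtee is the object-wide -f filtee.
 */
const char *resolve_combined(const char *filter_name, const char *sym,
                             enum kind sym_kind,
                             const struct filtee *sym_filtee,
                             const struct filtee *aux_filtee)
{
    assert(filter_name != NULL && sym != NULL);

    /* The individual filtering for the interface is processed first. */
    if (sym_kind != NONE) {
        if (provides(sym_filtee, sym))
            return sym_filtee->name;
        if (sym_kind == STANDARD)
            return NULL;        /* no fall back for a standard filter */
    }

    /* Fall back to the filtee defined for all interfaces. */
    if (provides(aux_filtee, sym))
        return aux_filtee->name;

    /* Finally, use the definition within the filter itself. */
    return filter_name;
}
```

In this model, a reference to foo that foo.so.1 cannot satisfy returns NULL, whereas a reference to bar that bar.so.1 cannot satisfy is resolved from filtee.so.1 or, failing that, from filter.so.1.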

Filtee Processing

The runtime linker's processing of a filter defers loading a filtee until a filter symbol is referenced. This implementation is analogous to the filter performing a dlopen(3C), using mode RTLD_LOCAL, on each of its filtees as the filtee is required. This implementation accounts for differences in dependency reporting that can be produced by tools such as ldd(1).

The link-editor's -z loadfltr option can be used when creating a filter to cause the immediate processing of its filtees at runtime. In addition, the immediate processing of all filtees within a process can be triggered by setting the LD_LOADFLTR environment variable to any value.

Performance Considerations

A shared object can be used by multiple applications within the same system. The performance of a shared object affects the applications that use the shared object, and the system as a whole.

Although the code within a shared object directly affects the performance of a running process, the performance issues discussed here relate to the runtime processing of the shared object. The following sections investigate this processing in more detail by looking at aspects such as text size and purity, together with relocation overhead.

Analyzing Files

Various tools are available to analyze the contents of an ELF file. To display the size of a file, use the size(1) command.


$ size -x libfoo.so.1
59c + 10c + 20 = 0x6c8

$ size -xf libfoo.so.1
..... + 1c(.init) + ac(.text) + c(.fini) + 4(.rodata) + \
..... + 18(.data) + 20(.bss) .....

The first example indicates the size of the shared object's text, data, and bss, a categorization used in previous releases of the SunOS operating system.

The ELF format provides a finer granularity for expressing data within a file by organizing the data into sections. The second example displays the size of each of the file's loadable sections.

Sections are allocated to units known as segments. Some segments describe how portions of a file are mapped into memory. See mmap(2). These loadable segments can be displayed by using the elfdump(1) command and examining the LOAD entries.


$ elfdump -p -NPT_LOAD libfoo.so.1

Program Header[0]:
    p_vaddr:      0           p_flags:    [ PF_X PF_R ]
    p_paddr:      0           p_type:     [ PT_LOAD ]
    p_filesz:     0x59c       p_memsz:    0x59c
    p_offset:     0           p_align:    0x10000

Program Header[1]:
    p_vaddr:      0x10630     p_flags:    [ PF_X PF_W PF_R ]
    p_paddr:      0           p_type:     [ PT_LOAD ]
    p_filesz:     0x10c       p_memsz:    0x12c
    p_offset:     0x630       p_align:    0x10000

There are two loadable segments in the shared object libfoo.so.1, commonly referred to as the text and data segments. The text segment is mapped to allow reading and execution of its contents, PF_X PF_R. The data segment is mapped to also allow its contents to be modified, PF_W. The memory size, p_memsz, of the data segment differs from the file size, p_filesz. This difference accounts for the .bss section, which is part of the data segment, and is dynamically created when the segment is loaded.
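The relationship between p_filesz, p_memsz, and the .bss section can be checked with simple arithmetic. The following is a minimal sketch rather than any linker interface. The structure load_segment and the function zero_fill_size are hypothetical names introduced for illustration, and the values are copied from Program Header[1] in the previous display.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two size fields of a program header. */
struct load_segment {
        size_t p_filesz;        /* bytes backed by the file */
        size_t p_memsz;         /* bytes occupied in memory */
};

/* Bytes that exist only in memory, such as .bss, created at load time. */
size_t
zero_fill_size(const struct load_segment *seg)
{
        assert(seg->p_memsz >= seg->p_filesz);
        return (seg->p_memsz - seg->p_filesz);
}

/* Data segment of libfoo.so.1, values taken from the elfdump(1) output. */
static const struct load_segment libfoo_data = { 0x10c, 0x12c };
```

Calling zero_fill_size(&libfoo_data) yields 0x20, which matches the size of the .bss section reported by size(1).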

Programmers usually think of a file in terms of the symbols that define the functions and data elements within their code. These symbols can be displayed using nm(1). For example.


$ nm -x libfoo.so.1

[Index]   Value      Size      Type  Bind  Other Shndx   Name
.........
[39]    |0x00000538|0x00000000|FUNC |GLOB |0x0  |7      |_init
[40]    |0x00000588|0x00000034|FUNC |GLOB |0x0  |8      |foo
[41]    |0x00000600|0x00000000|FUNC |GLOB |0x0  |9      |_fini
[42]    |0x00010688|0x00000010|OBJT |GLOB |0x0  |13     |data
[43]    |0x0001073c|0x00000020|OBJT |GLOB |0x0  |16     |bss
.........

The section that contains a symbol can be determined by referencing the section index (Shndx) field from the symbol table and by using dump(1) to display the sections within the file. For example.


$ dump -hv libfoo.so.1

libfoo.so.1:
           **** SECTION HEADER TABLE ****
[No]    Type    Flags   Addr      Offset    Size      Name
.........
[7]     PBIT    -AI     0x538     0x538     0x1c      .init

[8]     PBIT    -AI     0x554     0x554     0xac      .text

[9]     PBIT    -AI     0x600     0x600     0xc       .fini
.........
[13]    PBIT    WA-     0x10688   0x688     0x18      .data

[16]    NOBI    WA-     0x1073c   0x73c     0x20      .bss
.........

The output from both the previous nm(1) and dump(1) examples shows the association of the functions _init, foo, and _fini to the sections .init, .text and .fini. These sections, because of their read-only nature, are part of the text segment.

Similarly, the data arrays data and bss are associated with the sections .data and .bss respectively. These sections, because of their writable nature, are part of the data segment.


Note –

The previous dump(1) display has been simplified for this example.


Underlying System

When an application is built using a shared object, the entire loadable contents of the object are mapped into the virtual address space of that process at runtime. Each process that uses a shared object starts by referencing a single copy of the shared object in memory.

Relocations within the shared object are processed to bind symbolic references to their appropriate definitions. This results in the calculation of true virtual addresses that could not be derived at the time the shared object was generated by the link-editor. These relocations usually result in updates to entries within the process's data segments.

The memory management scheme underlying the dynamic linking of shared objects shares memory among processes at the granularity of a page. Memory pages can be shared as long as they are not modified at runtime. If a process writes to a page of a shared object when writing a data item, or relocating a reference to a shared object, it generates a private copy of that page. This private copy will have no effect on other users of the shared object. However, this page has lost any benefit of sharing between other processes. Text pages that become modified in this manner are referred to as impure.
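The private-copy behavior described above can be observed with an ordinary private file mapping. The following sketch assumes a POSIX system and uses mmap(2) with MAP_PRIVATE on a scratch file. The function name and the scratch file template are hypothetical, and the sketch only models the page behavior; it does not reproduce how the runtime linker maps a shared object.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a scratch file privately, write to the mapping, and report whether
 * the file contents stayed intact. Returns 1 if the write stayed private,
 * 0 if it reached the file, and -1 if a system call failed.
 */
int
private_write_stays_private(void)
{
        char path[] = "/tmp/cowdemoXXXXXX";     /* hypothetical scratch file */
        const char original[] = "shared";
        char check[sizeof (original)];
        char *map;
        int fd, result = -1;

        if ((fd = mkstemp(path)) == -1)
                return (-1);
        (void) unlink(path);

        if (write(fd, original, sizeof (original)) !=
            (ssize_t)sizeof (original))
                goto out;

        map = mmap(NULL, sizeof (original), PROT_READ | PROT_WRITE,
            MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED)
                goto out;

        map[0] = 'S';   /* the first write gives this process a private page */

        /* Re-read the file: the underlying data should be unchanged. */
        if (pread(fd, check, sizeof (check), 0) == (ssize_t)sizeof (check))
                result = (memcmp(check, original, sizeof (original)) == 0 &&
                    map[0] == 'S');

        (void) munmap(map, sizeof (original));
out:
        (void) close(fd);
        return (result);
}
```

The process sees its own modification, while the file, and therefore any other process mapping it, still sees the original contents.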

The segments of a shared object that are mapped into memory fall into two basic categories: the text segment, which is read-only, and the data segment, which is read-write. See Analyzing Files for how to obtain this information from an ELF file. An overriding goal when developing a shared object is to maximize the text segment and minimize the data segment. This optimizes the amount of code sharing while reducing the amount of processing needed to initialize and use a shared object. The following sections present mechanisms that can help achieve this goal.

Lazy Loading of Dynamic Dependencies

You can defer the loading of a shared object dependency until the dependency's first reference, by establishing the object as lazy loadable. See Lazy Loading of Dynamic Dependencies.

For small applications, a typical thread of execution can reference all the application's dependencies. The application loads all of its dependencies whether the dependencies are defined lazy loadable or not. However, under lazy loading, dependency processing can be deferred from process startup and spread throughout the process's execution.

For applications with many dependencies, lazy loading often results in some dependencies not being loaded at all. Dependencies that are not referenced for a particular thread of execution are not loaded.

Position-Independent Code

The code within a dynamic executable is typically position-dependent, and is tied to a fixed address in memory. Shared objects, on the other hand, can be loaded at different addresses in different processes. Position-independent code is not tied to a specific address. This independence allows the code to execute efficiently at a different address in each process that uses the code. Position-independent code is recommended for the creation of shared objects.

The compiler can generate position-independent code under the -K pic option.

If a shared object is built from position-dependent code, the text segment can require modification at runtime. This modification allows relocatable references to be assigned to the location at which the object has been loaded. The relocation of the text segment requires the segment to be remapped as writable. This modification requires a swap space reservation, and results in a private copy of the text segment for the process. The text segment is no longer sharable between multiple processes. Position-dependent code typically requires more runtime relocations than the corresponding position-independent code. Overall, the overhead of processing text relocations can cause serious performance degradation.

When a shared object is built from position-independent code, relocatable references are generated as indirections through data in the shared object's data segment. The code within the text segment requires no modification. All relocation updates are applied to corresponding entries within the data segment. See Global Offset Table (Processor-Specific) and Procedure Linkage Table (Processor-Specific) for more details on the specific indirection techniques.

The runtime linker attempts to handle text relocations should these relocations exist. However, some relocations can not be satisfied at runtime.

The x64 position-dependent code sequence typically generates code which can only be loaded into the lower 32–bits of memory. The upper 32–bits of any address must all be zeros. Since shared objects are typically loaded at the top of memory, the upper 32–bits of an address are required. Position-dependent code within an x64 shared object is therefore insufficient to cope with relocation requirements. Use of such code within a shared object can result in runtime relocation errors.


$ prog
ld.so.1: prog: fatal: relocation error: R_AMD64_32: file \
    libfoo.so.1: symbol (unknown): value 0xfffffd7fff0cd457 does not fit

Position-independent code can be loaded in any region in memory, and hence satisfies the requirements of shared objects for x64.

This situation differs from the default ABS64 mode that is used for 64–bit SPARCV9 code. This position-dependent code is typically compatible with the full 64–bit address range. Thus, position-dependent code sequences can exist within SPARCV9 shared objects. Use of either the ABS32 mode, or ABS44 mode for 64–bit SPARCV9 code, can still result in relocations that can not be resolved at runtime. However, each of these modes requires the runtime linker to relocate the text segment.

Regardless of the runtime linker's capabilities, or differences in relocation requirements, shared objects should be built using position-independent code.

You can identify a shared object that requires relocations against its text segment. The following example uses elfdump(1) to determine whether a TEXTREL dynamic entry exists.


$ cc -o libfoo.so.1 -G -R. foo.c
$ elfdump -d libfoo.so.1 | grep TEXTREL
       [9]  TEXTREL       0

Note –

The value of the TEXTREL entry is irrelevant. The presence of this entry in a shared object indicates that text relocations exist.


To prevent the creation of a shared object that contains text relocations, use the link-editor's -z text flag. This flag causes the link-editor to generate diagnostics indicating the source of any position-dependent code used as input. The following example shows how position-dependent code results in a failure to generate a shared object.


$ cc -o libfoo.so.1 -z text -G -R. foo.c
Text relocation remains                       referenced
    against symbol                  offset      in file
foo                                 0x0         foo.o
bar                                 0x8         foo.o
ld: fatal: relocations remain against allocatable but \
non-writable sections

Two relocations are generated against the text segment because of the position-dependent code generated from the file foo.o. Where possible, these diagnostics indicate any symbolic references that are required to carry out the relocations. In this case, the relocations are against the symbols foo and bar.

Text relocations within a shared object can also occur when hand-written assembler code is included and does not include the appropriate position-independent prototypes.


Note –

You might want to experiment with some simple source files to determine coding sequences that enable position-independence. Use the compiler's ability to generate intermediate assembler output.


SPARC: -K pic and -K PIC Options

For SPARC binaries, a subtle difference between the -K pic option and an alternative -K PIC option affects references to global offset table entries. See Global Offset Table (Processor-Specific).

The global offset table is an array of pointers, the size of whose entries is constant for 32–bit (4–bytes) and 64–bit (8–bytes). The following code sequence makes reference to an entry under -K pic.

        ld    [%l7 + j], %o0    ! load &j into %o0

Where %l7 is the precomputed value of the symbol _GLOBAL_OFFSET_TABLE_ of the object making the reference.

This code sequence provides a 13–bit displacement constant for the global offset table entry. This displacement therefore provides for 2048 unique entries for 32–bit objects, and 1024 unique entries for 64–bit objects. If the creation of an object requires more than the available number of entries, the link-editor produces a fatal error.


$ cc -K pic -G -o libfoo.so.1 a.o b.o ... z.o
ld: fatal: too many symbols require `small' PIC references:
        have 2050, maximum 2048 -- recompile some modules -K PIC.

To overcome this error condition, compile some of the input relocatable objects with the -K PIC option. This option provides a 32–bit constant for the global offset table entry.

        sethi %hi(j), %g1
        or    %g1, %lo(j), %g1    ! get 32–bit constant GOT offset
        ld    [%l7 + %g1], %o0    ! load &j into %o0
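The entry limits quoted earlier for -K pic follow from the width of the displacement. A 13–bit displacement spans 8192 bytes, which is divided by the size of one global offset table entry. The following is a minimal sketch of that arithmetic. The function name got_entries is hypothetical, and the sketch assumes that the whole displacement range is available for table entries.

```c
#include <assert.h>

/*
 * Number of global offset table entries addressable by a displacement
 * of disp_bits bits, given entries of entry_size bytes.
 */
unsigned long
got_entries(unsigned int disp_bits, unsigned int entry_size)
{
        assert(entry_size != 0 && disp_bits < 32);
        return ((1UL << disp_bits) / entry_size);
}
```

With these assumptions, got_entries(13, 4) gives 2048 for 32–bit objects, and got_entries(13, 8) gives 1024 for 64–bit objects.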

You can investigate the global offset table requirements of an object using elfdump(1) with the -G option. You can also examine the processing of these entries during a link-edit using the link-editor's debugging tokens -D got,detail.

Ideally, frequently accessed data items benefit from using the -K pic model. You can reference a single entry using both models. However, determining which relocatable objects should be compiled with either option can be time consuming, and the performance improvement realized is small. A recompilation of all relocatable objects with the -K PIC option is typically easier.

Remove Unused Material

The inclusion of functions and data that are not used by the object being built is wasteful. This material bloats the object, which can result in unnecessary relocation overhead and associated paging activity. References to unused dependencies are also wasteful. These references result in the unnecessary loading and processing of other shared objects.

Unused sections are displayed during a link-edit when using the link-editor's debugging token -D unused. Sections identified as unused should be removed from the link-edit. Unused sections can be eliminated using the link-editor's -z ignore option.

The link-editor identifies a section from a relocatable object as unused when the section is allocatable, no other section binds (relocates) to the section, and the section provides no global symbols.

You can improve the link-editor's ability to eliminate sections by defining the shared object's external interfaces. By defining an interface, global symbols that are not defined as part of the interface are reduced to locals. Reduced symbols that are unreferenced from other objects are now clearly identified as candidates for elimination.

Individual functions and data variables can be eliminated by the link-editor if these items are assigned to their own sections. This section refinement is achieved using compiler options such as -xF. Earlier compilers only provided for the assignment of functions to their own sections. Newer compilers have extended the -xF syntax to assign data variables to their own sections. Earlier compilers required C++ exception handling to be disabled when using -xF. This restriction has been dropped with later compilers.

If all allocatable sections from a relocatable object can be eliminated, the entire file is discarded from the link-edit.

In addition to input file elimination, the link-editor also identifies unused dependencies. A dependency is deemed unused if the dependency is not bound to by the object being produced. An object can be built with the -z ignore option to eliminate the recording of unused dependencies.

The -z ignore option applies only to the files that follow the option on the link-edit command line. The -z ignore option is cancelled with -z record.

Maximizing Shareability

As mentioned in Underlying System, only a shared object's text segment is shared by all processes that use the object. The object's data segment typically is not shared. Each process using a shared object generates a private memory copy of its entire data segment as data items within the segment are written to. Reduce the data segment, either by moving data elements that are never modified into the text segment, or by removing the data items completely.

The following sections describe several mechanisms that can be used to reduce the size of the data segment.

Move Read-Only Data to Text

Data elements that are read-only should be moved into the text segment using const declarations. For example, the following character string resides in the .data section, which is part of the writable data segment.


char * rdstr = "this is a read-only string";

In contrast, the following character string resides in the .rodata section, which is the read-only data section contained within the text segment.


const char * rdstr = "this is a read-only string";

Reducing the data segment by moving read-only elements into the text segment is admirable. However, moving data elements that require relocations can be counterproductive. For example, examine the following array of strings.


char * rdstrs[] = { "this is a read-only string",
                    "this is another read-only string" };

A better definition might seem to be the following.


const char * const rdstrs[] = { ..... };

This definition ensures that the strings and the array of pointers to these strings are placed in a .rodata section. Unfortunately, although the user perceives the array of addresses as read-only, these addresses must be relocated at runtime. This definition therefore results in the creation of text relocations. Representing the array as:


const char * rdstrs[] = { ..... };

ensures the array pointers are maintained in the writable data segment where they can be relocated. The array strings are maintained in the read-only text segment.


Note –

Some compilers, when generating position-independent code, can detect read-only assignments that result in runtime relocations. These compilers arrange for placing such items in writable segments. For example, .picdata.
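Where the strings are short and of similar length, another option is to avoid pointers altogether by storing the strings in a fixed-width, two-dimensional character array. Because such a table contains no addresses, the table itself needs no runtime relocation and can remain read-only. The following is a sketch of this alternative, which is not taken from the original text. The names MSG_WIDTH, RDSTRS_COUNT, and rdstr_get are hypothetical, and the technique trades some padding for the absence of relocations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed width: large enough for the longest string plus its NUL. */
#define MSG_WIDTH       40

/* No pointers are stored, so the table holds nothing to relocate. */
static const char rdstrs[][MSG_WIDTH] = {
        "this is a read-only string",
        "this is another read-only string"
};

#define RDSTRS_COUNT    (sizeof (rdstrs) / sizeof (rdstrs[0]))

/* Return the string at index, or NULL if the index is out of range. */
const char *
rdstr_get(size_t index)
{
        return (index < RDSTRS_COUNT ? rdstrs[index] : NULL);
}
```

This layout suits tables of short messages. For strings of widely varying length, the padding can outweigh the saving.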


Collapse Multiply-Defined Data

Data can be reduced by collapsing multiply-defined data. A program with multiple occurrences of the same error messages can be better off defining one global datum, and having all other instances reference this datum. For example.


const char * Errmsg = "prog: error encountered: %d";

foo()
{
        ......
        (void) fprintf(stderr, Errmsg, error);
        ......
}

The main candidates for this sort of data reduction are strings. String usage in a shared object can be investigated using strings(1). The following example generates a sorted list of the data strings within the file libfoo.so.1. Each entry in the list is prefixed with the number of occurrences of the string.


$ strings -10 libfoo.so.1 | sort | uniq -c | sort -rn

Use Automatic Variables

Permanent storage for data items can be removed entirely if the associated functionality can be designed to use automatic (stack) variables. Any removal of permanent storage usually results in a corresponding reduction in the number of runtime relocations required.

Allocate Buffers Dynamically

Large data buffers should usually be allocated dynamically rather than being defined using permanent storage. Often this results in an overall saving in memory, as only those buffers needed by the present invocation of an application are allocated. Dynamic allocation also provides greater flexibility by enabling the buffer's size to change without affecting compatibility.
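The following sketch contrasts a permanently allocated buffer with one that is allocated on first use. The names iobuf, iobuf_size, and iobuf_get are hypothetical, and the one megabyte figure in the comment is only an illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Permanent storage alternative, shown for comparison only:
 *     static char iobuf[1 << 20];
 * This would add one megabyte to the .bss of every process.
 */
static char *iobuf;             /* only a pointer lives in permanent storage */
static size_t iobuf_size;

/*
 * Return a buffer of at least size bytes, allocating or growing the
 * buffer on demand. Returns NULL if the allocation fails.
 */
char *
iobuf_get(size_t size)
{
        if (iobuf == NULL || iobuf_size < size) {
                char *grown = realloc(iobuf, size);

                if (grown == NULL)
                        return (NULL);
                iobuf = grown;
                iobuf_size = size;
        }
        return (iobuf);
}
```

A process that never calls iobuf_get() pays only for the pointer and the size field, and the buffer can grow in a later release without changing the size of any exported symbol.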

Minimizing Paging Activity

Any process that accesses a new page causes a page fault, which is an expensive operation. Because shared objects can be used by many processes, any reduction in the number of page faults that are generated by accessing a shared object can benefit the process and the system as a whole.

Organizing frequently used routines and their data to an adjacent set of pages frequently improves performance because it improves the locality of reference. When a process calls one of these functions, the function might already be in memory because of its proximity to the other frequently used functions. Similarly, grouping interrelated functions improves locality of references. For example, if every call to the function foo() results in a call to the function bar(), place these functions on the same page. Tools like cflow(1), tcov(1), prof(1) and gprof(1) are useful in determining code coverage and profiling.

Isolate related functionality to its own shared object. The standard C library has historically been built containing many unrelated functions. Only rarely, for example, will any single executable use everything in this library. Because of widespread use, determining what set of functions are really the most frequently used is also somewhat difficult. In contrast, when designing a shared object from scratch, maintain only related functions within the shared object. This improves locality of reference and has the side effect of reducing the object's overall size.

Relocations

In Relocation Processing, the mechanisms by which the runtime linker relocates dynamic executables and shared objects to create a runable process were covered. Relocation Symbol Lookup and When Relocations Are Performed categorized this relocation processing into two areas to simplify and help illustrate the mechanisms involved. These same two categorizations are also ideally suited for considering the performance impact of relocations.

Symbol Lookup

When the runtime linker needs to look up a symbol, by default it does so by searching in each object. The runtime linker starts with the dynamic executable, and progresses through each shared object in the same order that the objects are loaded. In many instances, the shared object that requires a symbolic relocation turns out to be the provider of the symbol definition.

In this situation, if the symbol used for this relocation is not required as part of the shared object's interface, then this symbol is a strong candidate for conversion to a static or automatic variable. Symbol reduction can also be applied to remove symbols from a shared object's interface. See Reducing Symbol Scope for more details. By making these conversions, the link-editor incurs the expense of processing any symbolic relocation against these symbols during the shared object's creation.

The only global data items that should be visible from a shared object are those that contribute to its user interface. Historically this has been a hard goal to accomplish, because global data are often defined to allow reference from two or more functions located in different source files. By applying symbol reduction, unnecessary global symbols can be removed. See Reducing Symbol Scope. Any reduction in the number of global symbols exported from a shared object results in lower relocation costs and an overall performance improvement.

The use of direct bindings can also significantly reduce the symbol lookup overhead within a dynamic process that has many symbolic relocations and many dependencies. See Direct Bindings.
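The default search order described at the start of this section can be pictured with a toy model. The following sketch is not the runtime linker's implementation. The structure object, the function lookup, and the sample symbol lists are all hypothetical, and the model ignores hash tables, direct bindings, and symbol visibility.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: each loaded object exports a NULL-terminated list of names. */
struct object {
        const char *name;
        const char *const *symbols;
};

/* Return the first object, in load order, that defines symbol. */
const struct object *
lookup(const struct object *objs, size_t nobjs, const char *symbol)
{
        for (size_t i = 0; i < nobjs; i++) {
                const char *const *s;

                for (s = objs[i].symbols; *s != NULL; s++) {
                        if (strcmp(*s, symbol) == 0)
                                return (&objs[i]);
                }
        }
        return (NULL);
}

static const char *const prog_syms[] = { "main", NULL };
static const char *const libfoo_syms[] = { "foo", "bar", NULL };
static const char *const libbar_syms[] = { "bar", NULL };

/* Load order: the dynamic executable first, then its dependencies. */
static const struct object loaded[] = {
        { "prog", prog_syms },
        { "libfoo.so.1", libfoo_syms },
        { "libbar.so.1", libbar_syms }
};
```

In this model, a reference to bar from libbar.so.1 is satisfied by libfoo.so.1, because libfoo.so.1 appears earlier in the load order. Every object ahead of the eventual provider is searched first, which is the cost that symbol reduction and direct bindings aim to avoid.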

When Relocations are Performed

All immediate reference relocations must be carried out during process initialization before the application gains control. However, any lazy reference relocations can be deferred until the first instance of a function being called. Immediate relocations typically result from data references. Therefore, reducing the number of data references also reduces the runtime initialization of a process.

Initialization relocation costs can also be deferred by converting data references into function references. For example, you can return data items by a functional interface. This conversion usually results in a perceived performance improvement because the initialization relocation costs are effectively spread throughout the process's execution. Some of the functional interfaces might never be called by a particular invocation of a process, thus removing their relocation overhead altogether.

The advantage of using a functional interface can be seen in the section, Copy Relocations. This section examines a special, and somewhat expensive, relocation mechanism employed between dynamic executables and shared objects. It also provides an example of how this relocation overhead can be avoided.
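A functional interface of the kind described in this section might look like the following sketch. The table foo_limits, its values, and the accessor foo_limit are hypothetical. The point is that the data stays local to the shared object, and callers reach the data through a function reference that can be bound lazily.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Instead of exporting the data directly, for example:
 *     int foo_limits[] = { ... };
 * keep the data local to the shared object.
 */
static const int foo_limits[] = { 16, 64, 256 };        /* hypothetical */

/* Exported accessor. Returns -1 for an index outside the table. */
int
foo_limit(size_t index)
{
        if (index >= sizeof (foo_limits) / sizeof (foo_limits[0]))
                return (-1);
        return (foo_limits[index]);
}
```

A caller uses foo_limit(1) rather than foo_limits[1]. The table can later change size without affecting any caller.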

Combined Relocation Sections

The relocation sections within relocatable objects are typically maintained in a one-to-one relationship with the sections to which the relocations must be applied. However, when an executable or shared object is built with the -z combreloc option, all but the procedure linkage table relocations are placed into a single common section named .SUNW_reloc.

Combining relocation records in this manner enables all RELATIVE relocations to be grouped together. All symbolic relocations are sorted by symbol name. The grouping of RELATIVE relocations permits optimized runtime processing using the DT_RELACOUNT/DT_RELCOUNT .dynamic entries. Sorted symbolic entries help reduce runtime symbol lookup.

Copy Relocations

Shared objects are usually built with position-independent code. References to external data items from code of this type employ indirect addressing through a set of tables. See Position-Independent Code for more details. These tables are updated at runtime with the real address of the data items. These updated tables enable access to the data without the code itself being modified.

Dynamic executables, however, are generally not created from position-independent code. Any references to external data they make can seemingly only be achieved at runtime by modifying the code that makes the reference. Modifying a read-only text segment is to be avoided. The copy relocation technique can resolve this reference.

Suppose the link-editor is used to create a dynamic executable, and a reference to a data item is found to reside in one of the dependent shared objects. Space is allocated in the dynamic executable's .bss, equivalent in size to the data item found in the shared object. This space is also assigned the same symbolic name as defined in the shared object. Along with this data allocation, the link-editor generates a special copy relocation record that instructs the runtime linker to copy the data from the shared object to the allocated space within the dynamic executable.

Because the symbol assigned to this space is global, it is used to satisfy any references from any shared objects. The dynamic executable inherits the data item. Any other objects within the process that make reference to this item are bound to this copy. The original data from which the copy is made effectively becomes unused.

The following example of this mechanism uses an array of system error messages that is maintained within the standard C library. In previous SunOS operating system releases, the interface to this information was provided by two global variables, sys_errlist[], and sys_nerr. The first variable provided the array of error message strings, while the second conveyed the size of the array itself. These variables were commonly used within an application in the following manner.


$ cat foo.c
extern int      sys_nerr;
extern char *   sys_errlist[];

char *
error(int errnumb)
{
        if ((errnumb < 0) || (errnumb >= sys_nerr))
                return (0);
        return (sys_errlist[errnumb]);
}

The application uses the function error to provide a focal point to obtain the system error message associated with the number errnumb.

Examining a dynamic executable built using this code shows the implementation of the copy relocation in more detail.


$ cc -o prog main.c foo.c
$ nm -x prog | grep sys_
[36]  |0x00020910|0x00000260|OBJT |WEAK |0x0  |16 |sys_errlist
[37]  |0x0002090c|0x00000004|OBJT |WEAK |0x0  |16 |sys_nerr
$ dump -hv prog | grep bss
[16]    NOBI    WA-    0x20908   0x908    0x268   .bss
$ dump -rv prog

    **** RELOCATION INFORMATION ****

.rela.bss:
Offset      Symndx                Type              Addend

0x2090c     sys_nerr              R_SPARC_COPY      0
0x20910     sys_errlist           R_SPARC_COPY      0
..........

The link-editor has allocated space in the dynamic executable's .bss to receive the data represented by sys_errlist and sys_nerr. These data are copied from the C library by the runtime linker at process initialization. Thus, each application that uses these data gets a private copy of the data in its own data segment.

There are two drawbacks to this technique. First, each application pays a performance penalty for the overhead of copying the data at runtime. Second, the size of the data array sys_errlist has now become part of the C library's interface. Suppose the size of this array were to change, perhaps as new error messages are added. Any dynamic executables that reference this array have to undergo a new link-edit to be able to access any of the new error messages. Without this new link-edit, the allocated space within the dynamic executable is insufficient to hold the new data.

These drawbacks can be eliminated if the data required by a dynamic executable are provided by a functional interface. The ANSI C function strerror(3C) returns a pointer to the appropriate error string, based on the error number supplied to it. One implementation of this function might be:


$ cat strerror.c
static const char * sys_errlist[] = {
        "Error 0",
        "Not owner",
        "No such file or directory",
        ......
};
static const int sys_nerr =
        sizeof (sys_errlist) / sizeof (char *);

char *
strerror(int errnum)
{
        if ((errnum < 0) || (errnum >= sys_nerr))
                return (0);
        return ((char *)sys_errlist[errnum]);
}

The error routine in foo.c can now be simplified to use this functional interface. This simplification in turn removes any need to perform the original copy relocations at process initialization.
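One possible form of the simplified routine is sketched below. The sketch assumes that the application is content with whatever strerror(3C) returns for an unrecognized error number. That behavior is implementation-defined, so it can differ from the null pointer returned by the original error routine.

```c
#include <assert.h>
#include <string.h>

/* foo.c rewritten to use the functional interface instead of the data. */
char *
error(int errnumb)
{
        return (strerror(errnumb));
}
```

Because foo.c no longer references sys_errlist or sys_nerr, the link-editor has no data items to allocate in the executable's .bss and no copy relocations to record for them.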

Additionally, because the data are now local to the shared object, the data are no longer part of its interface. The shared object therefore has the flexibility of changing the data without adversely affecting any dynamic executables that use it. Eliminating data items from a shared object's interface generally improves performance while making the shared object's interface and code easier to maintain.

ldd(1), when used with either the -d or -r options, can verify any copy relocations that exist within a dynamic executable.

For example, suppose the dynamic executable prog had originally been built against the shared object libfoo.so.1 and the following two copy relocations had been recorded.


$ nm -x prog | grep _size_
[36]   |0x000207d8|0x40|OBJT |GLOB |15  |_size_gets_smaller
[39]   |0x00020818|0x40|OBJT |GLOB |15  |_size_gets_larger
$ dump -rv prog | grep _size_
0x207d8     _size_gets_smaller    R_SPARC_COPY      0
0x20818     _size_gets_larger     R_SPARC_COPY      0

A new version of this shared object is supplied that contains different data sizes for these symbols.


$ nm -x libfoo.so.1 | grep _size_
[26]   |0x00010378|0x10|OBJT |GLOB |8   |_size_gets_smaller
[28]   |0x00010388|0x80|OBJT |GLOB |8   |_size_gets_larger

Running ldd(1) against the dynamic executable reveals the following.


$ ldd -d prog
    libfoo.so.1 =>   ./libfoo.so.1
    ...........
    copy relocation sizes differ: _size_gets_smaller
       (file prog size=40; file ./libfoo.so.1 size=10);
       ./libfoo.so.1 size used; possible insufficient data copied
    copy relocation sizes differ: _size_gets_larger
       (file prog size=40; file ./libfoo.so.1 size=80);
       ./prog size used; possible data truncation

ldd(1) shows that the dynamic executable will copy as much data as the shared object has to offer, but only accepts as much as its allocated space allows.
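The rule reported by ldd(1) amounts to copying the smaller of the two sizes. The following sketch models only that rule. The function name copy_reloc_size is hypothetical, and the sizes used in the note are the hexadecimal sizes shown in the previous displays.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes transferred by a copy relocation in this simplified model:
 * no more than the shared object provides, and no more than the
 * space allocated in the dynamic executable.
 */
size_t
copy_reloc_size(size_t exec_size, size_t lib_size)
{
        assert(exec_size > 0 && lib_size > 0);
        return (exec_size < lib_size ? exec_size : lib_size);
}
```

For _size_gets_smaller, copy_reloc_size(0x40, 0x10) gives 0x10, so part of the executable's space is never filled. For _size_gets_larger, copy_reloc_size(0x40, 0x80) gives 0x40, so the remaining 0x40 bytes of the new data are truncated.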

Copy relocations can be eliminated by building the application from position-independent code. See Position-Independent Code.

Using the -B symbolic Option

The link-editor's -B symbolic option enables you to bind symbol references to their global definitions within a shared object. This option is historic, in that it was designed for use in creating the runtime linker itself.

Defining an object's interface and reducing non-public symbols to local is preferable to using the -B symbolic option. See Reducing Symbol Scope. Using -B symbolic can often result in some non-intuitive side effects.

If a symbolically bound symbol is interposed upon, then references to the symbol from outside of the symbolically bound object bind to the interposer. The object itself is already bound internally. Essentially, two symbols with the same name are now being referenced from within the process. A symbolically bound data symbol that results in a copy relocation creates the same interposition situation. See Copy Relocations.


Note –

Symbolically bound shared objects are identified by the .dynamic flag DF_SYMBOLIC. This flag is informational only. The runtime linker processes symbol lookups from these objects in the same manner as any other object. Any symbolic binding is assumed to have been created at the link-edit phase.


Profiling Shared Objects

The runtime linker can generate profiling information for any shared objects that are processed during the running of an application. The runtime linker is responsible for binding shared objects to an application and is therefore able to intercept any global function bindings. These bindings take place through .plt entries. See When Relocations Are Performed for details of this mechanism.

The LD_PROFILE environment variable specifies the name of a shared object to profile. Only a single shared object can be analyzed with this environment variable at one time. However, the setting can be used to analyze the use of that shared object by one or more applications. In the following example, the use of libc by a single invocation of the command ls(1) is analyzed.


$ LD_PROFILE=libc.so.1  ls -l

In the following example, the environment variable setting is recorded in a configuration file. This setting causes any application's use of libc to accumulate the analyzed information.


# crle  -e LD_PROFILE=libc.so.1
$ ls -l
$ make
$ ...

When profiling is enabled, a profile data file is created, if it does not already exist. The file is mapped by the runtime linker. In the previous examples, this data file is /var/tmp/libc.so.1.profile. 64-bit libraries require an extended profile format and are written using the .profilex suffix. You can also specify an alternative directory to store the profile data using the LD_PROFILE_OUTPUT environment variable.
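
The naming rule described above can be sketched as a small helper. This is an illustration only, not the runtime linker's implementation: the function name profile_path() and its parameters are hypothetical, and the use of the final path component of the object name is an assumption made for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the profile data file name for a profiled object.
 *   outdir   - value of LD_PROFILE_OUTPUT, or NULL/"" for the default
 *   object   - value of LD_PROFILE, for example "libc.so.1"
 *   is_64bit - nonzero selects the extended .profilex suffix
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int profile_path(char *buf, size_t len, const char *outdir,
                        const char *object, int is_64bit)
{
    const char *dir = (outdir != NULL && *outdir != '\0') ? outdir : "/var/tmp";
    /* Assumption: only the last path component names the data file. */
    const char *slash = strrchr(object, '/');
    const char *base = (slash != NULL) ? slash + 1 : object;
    int n = snprintf(buf, len, "%s/%s.%s", dir, base,
                     is_64bit ? "profilex" : "profile");

    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

With no output directory and the object name libc.so.1, the helper yields the /var/tmp/libc.so.1.profile name used in the examples in this section.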

This profile data file is used to deposit profil(2) data and call count information related to the use of the specified shared object. This profiled data can be directly examined with gprof(1).


Note –

gprof(1) is most commonly used to analyze the gmon.out profile data created by an executable that has been compiled with the -xpg option of cc(1). The runtime linker's profile analysis does not require any code to be compiled with this option. Applications whose dependent shared objects are being profiled should not make calls to profil(2), because this system call does not provide for multiple invocations within the same process. For the same reason, these applications must not be compiled with the -xpg option of cc(1). This compiler-generated mechanism of profiling is also built on top of profil(2).


One of the most powerful features of this profiling mechanism is the ability to analyze a shared object as it is used by multiple applications. Frequently, profiling analysis is carried out using one or two applications. However, a shared object, by its very nature, can be used by a multitude of applications. Analyzing how these applications use the shared object can offer insights into where energy might be spent to improve the overall performance of the shared object.

The following example shows a performance analysis of libc during the build of several applications within a source hierarchy.


$ LD_PROFILE=libc.so.1 ; export LD_PROFILE
$ make
$ gprof -b /lib/libc.so.1 /var/tmp/libc.so.1.profile
.....

granularity: each sample hit covers 4 byte(s) ....

                                  called/total     parents
index  %time    self descendents  called+self    name      index
                                  called/total     children
.....
-----------------------------------------------
                0.33        0.00      52/29381     _gettxt [96]
                1.12        0.00     174/29381     _tzload [54]
               10.50        0.00    1634/29381     <external>
               16.14        0.00    2512/29381     _opendir [15]
              160.65        0.00   25009/29381     _endopen [3]
[2]     35.0  188.74        0.00   29381         _open [2]
-----------------------------------------------
.....
granularity: each sample hit covers 4 byte(s) ....

   %  cumulative    self              self    total         
 time   seconds   seconds    calls  ms/call  ms/call name   
 35.0     188.74   188.74    29381     6.42     6.42  _open [2]
 13.0     258.80    70.06    12094     5.79     5.79  _write [4]
  9.9     312.32    53.52    34303     1.56     1.56  _read [6]
  7.1     350.53    38.21     1177    32.46    32.46  _fork [9]
 ....

The special name <external> indicates a reference from outside of the address range of the shared object being profiled. Thus, in the previous example, 1634 calls to the function open(2) within libc occurred from the dynamic executables, or from other shared objects, bound with libc while the profiling analysis was in progress.
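
The figures in the call graph entry for _open can be cross-checked with a little arithmetic: the per-caller call counts and self times add up to the totals on the [2] line, and the ms/call column of the flat profile is the self time divided by the call count. The following sketch, using only numbers copied from the example output, shows the calculation. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* One caller line from the gprof call-graph entry for _open. */
struct caller {
    const char *name;
    double      self_seconds;
    long        calls;
};

/* Figures copied from the example output. */
static const struct caller open_callers[] = {
    { "_gettxt",      0.33,    52 },
    { "_tzload",      1.12,   174 },
    { "<external>",  10.50,  1634 },
    { "_opendir",    16.14,  2512 },
    { "_endopen",   160.65, 25009 },
};

/* Sum the call counts attributed to each caller. */
static long total_calls(const struct caller *c, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += c[i].calls;
    return sum;
}

/* Sum the self time attributed to each caller. */
static double total_self(const struct caller *c, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += c[i].self_seconds;
    return sum;
}

/* Average cost of one call, in milliseconds. */
static double ms_per_call(double seconds, long calls)
{
    return calls > 0 ? seconds * 1000.0 / (double)calls : 0.0;
}

/* Share of the total calls made by one caller, as a percentage. */
static double call_share_percent(long part, long total)
{
    return total > 0 ? 100.0 * (double)part / (double)total : 0.0;
}
```

Applied to the example, the five caller counts sum to the 29381 calls reported for _open, and the 1634 external calls account for roughly 5.6 percent of that total.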


Note –

The profiling of shared objects is multithread safe, except in the case where one thread calls fork(2) while another thread is updating the profile data information. The use of fork1(2) removes this restriction.