For the Chinese translation, please click 关于DLL的一些你不会想要知道的知识.
I’ve recently had cause to investigate how dynamic linking is implemented on Windows. This post is basically a brain dump of everything I’ve learnt on the issue. This is mostly for my future reference, but I hope it will be useful to others too as I’m going to bring together lots of information you would otherwise have to hunt around for.
Without further ado, here we go:
The Windows executable loader is responsible for doing all dynamic loading and symbol resolution before running the code. The linker works out what functions are exported or imported by each image (an image is a DLL or EXE file) by inspecting the .edata and .idata sections of those images, respectively.
The contents of these sections are covered in detail by the PE/COFF specification.
The .edata section records the exports of the image (yes, EXEs can export things). This takes the form of:
The export address table: an array of length N holding the addresses of the exported functions/data (the addresses are stored relative to the image base). Indexes into this table are called ordinals.
The export name pointer table: an array of length M holding pointers to strings that represent the name of an export. This array is lexically ordered by name, to allow binary searches for a given export.
The export ordinal table: a parallel array of length M holding the ordinal of the corresponding name in the export name pointer table.
(As an alternative to importing an image’s export by its name, it is possible to import by specifying an ordinal. Importing by ordinal is slightly faster at runtime because the dynamic linker doesn’t have to do a lookup. Furthermore, if the import is not given a name by the exporting DLL, importing by ordinal is the only way to do the import.)
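The relationship between the three export tables can be sketched in C. This is only an illustrative model, not the real PE layout: the `ExportTables` struct, the `find_by_name` and `find_by_ordinal` helpers and the `sample_*` data are all hypothetical names, and addresses are modelled as ordinary pointers rather than offsets from the image base.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of one image's export tables. */
typedef struct {
    const void *const *addresses;   /* export address table, length N */
    size_t address_count;
    const char *const *names;       /* export name pointer table, sorted, length M */
    const unsigned short *ordinals; /* export ordinal table, parallel to names */
    size_t name_count;
} ExportTables;

/* Import by ordinal: a bounds-checked array index, no searching needed. */
static const void *find_by_ordinal(const ExportTables *t, size_t ordinal) {
    assert(t != NULL);
    return ordinal < t->address_count ? t->addresses[ordinal] : NULL;
}

/* Import by name: binary search the sorted name table, then map the
   matching index to an ordinal through the parallel ordinal table. */
static const void *find_by_name(const ExportTables *t, const char *name) {
    size_t lo = 0, hi = t->name_count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(name, t->names[mid]);
        if (cmp == 0)
            return find_by_ordinal(t, t->ordinals[mid]);
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL; /* no export with that name */
}

/* Sample data: three exports, of which ordinal 0 has no name. */
static int sample_a = 1, sample_b = 2, sample_c = 3;
static const void *const sample_addresses[] = { &sample_c, &sample_a, &sample_b };
static const char *const sample_names[] = { "alpha", "beta" };
static const unsigned short sample_ordinals[] = { 1, 2 };
static const ExportTables sample = {
    sample_addresses, 3, sample_names, sample_ordinals, 2
};
```

In this model `find_by_name(&sample, "alpha")` goes through the name and ordinal tables, while the export at ordinal 0 has no name and can only be reached with `find_by_ordinal(&sample, 0)`.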
How does the .edata section get created in the first place? There are two main methods:
Most commonly, the exports start life in the object files created by compiling some source code that defines a function or some data declared with the __declspec(dllexport) modifier. The compiler just emits an appropriate .edata section naming these exports.
Less commonly, the programmer might write a .def file specifying which functions they would like to export. By supplying this to dlltool --output-exp, an export file can be generated. An export file is just an object file which only contains a .edata section, exporting (via some unresolved references that will be filled in by the linker in the usual way) the symbols named in the .def file. This export file must be named by the programmer when they come to link their object files together into a DLL.
In both these cases, the linker collects the .edata sections from all objects named on the link line to build the .edata for the overall image file. One last possible way that the .edata can be created is by the linker itself, without having to put .edata into any object files:
- The linker could choose to export all symbols defined by object files named on the link line. For example, this is the default behaviour of GNU ld (the behaviour can also be explicitly asked for using --export-all-symbols). In this case, the linker generates the .edata section itself. (GNU ld also supports specifying a .def file on the command line, in which case the generated section will export just those things named by the .def file.)
The .idata section records those things that the image imports. It consists of:
For every image from which symbols are imported:
The filename of the image. Used by the dynamic linker to locate it on disk.
The import lookup table: an array of length N, in which each entry is either an ordinal or a pointer to a string representing the name to import.
The import address table: an array of N pointers. The dynamic linker is responsible for filling out this array with the address of the function/data named by the corresponding symbol in the import lookup table.
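A rough C sketch of what the dynamic linker does with these tables for one imported image. Everything here is a simplified assumption made for illustration: the `ImportLookup` and `Export` structs, the `resolve` and `bind_imports` helpers and the `demo_*` data are hypothetical, and names are found with a linear scan rather than the binary search a real loader can use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one import lookup table entry. */
typedef struct {
    int by_ordinal;     /* non-zero: use `ordinal`; zero: use `name` */
    size_t ordinal;
    const char *name;
} ImportLookup;

/* Hypothetical, much simplified view of the exporting image:
   the array index of each entry plays the role of its ordinal. */
typedef struct {
    const char *name;   /* NULL for exports that have no name */
    void *address;
} Export;

/* Find the address that a single lookup entry asks for. */
static void *resolve(const Export *exports, size_t count, const ImportLookup *want) {
    if (want->by_ordinal)
        return want->ordinal < count ? exports[want->ordinal].address : NULL;
    for (size_t i = 0; i < count; i++)
        if (exports[i].name != NULL && strcmp(exports[i].name, want->name) == 0)
            return exports[i].address;
    return NULL;
}

/* Fill each import address table slot from the corresponding lookup entry.
   Returns the number of imports that could not be resolved. */
static size_t bind_imports(const Export *exports, size_t export_count,
                           const ImportLookup *lookup, void **iat, size_t n) {
    size_t missing = 0;
    assert(lookup != NULL && iat != NULL);
    for (size_t i = 0; i < n; i++) {
        iat[i] = resolve(exports, export_count, &lookup[i]);
        if (iat[i] == NULL)
            missing++;
    }
    return missing;
}

/* Sample data: one import by name and one import by ordinal. */
static int demo_value = 42;
static int demo_other = 7;
static const Export demo_exports[] = {
    { NULL, &demo_other },
    { "data_export", &demo_value },
};
static const ImportLookup demo_lookup[] = {
    { 0, 0, "data_export" },
    { 1, 0, NULL },
};
static void *demo_iat[2];
```

After `bind_imports(demo_exports, 2, demo_lookup, demo_iat, 2)` each `demo_iat` slot holds the address named by the lookup entry at the same index, which is the parallel-array relationship described above.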
The ways in which .idata entries are created are as follows:
Most commonly, they originate in a library of object files called an import library. This import library can be created by using dlltool on the DLL you wish to import from, or on a .def file of the type we discussed earlier. Just like the export file, the import library must be named by the user on the link line.
Alternatively, some linkers (like GNU ld) let you specify a DLL directly on the link line. The linker will automatically generate .idata entries for any symbols that you must import from the DLL.
Notice that unlike the case when we were exporting symbols, __declspec(dllimport) does not cause .idata sections to be generated.
Import libraries are a bit more complicated than they first appear. The Windows dynamic loader fills the import address table with the addresses of the imported symbols (say, the address of a function Func). However, when the assembly code in other object files says call Func, it expects Func to name the address of that code. But we don’t know that address until runtime: the only thing we know statically is the address where that address will be placed by the dynamic linker. We will call this address __imp__Func.
To deal with this extra level of indirection, the import library exports a function Func that just dereferences __imp__Func (to get the actual function pointer) and then jmps to it. All of the other object files in the project can now say call Func just as they would if Func had been defined in some other object file, rather than a DLL. For this reason, saying __declspec(dllimport) in the declaration of a dynamically linked function is optional (though in fact you will get slightly more efficient code if you add it, as we will see later).
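The stub that the import library provides can be imitated in portable C. This is a sketch under stated assumptions, not what dlltool actually emits: `imp_Func` stands in for the `__imp__Func` slot, `real_Func` is a hypothetical implementation living in the DLL, and `fake_load` plays the part of the dynamic linker. The real stub is a single indirect `jmp`, whereas a C function can only approximate that with a call and return.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the import address table slot the text calls __imp__Func.
   In a real image the dynamic linker writes this slot at load time. */
static int (*imp_Func)(int) = NULL;

/* Stand-in for the stub the import library defines under the name Func:
   load the pointer from the slot and transfer control to it. */
static int Func(int x) {
    assert(imp_Func != NULL); /* the slot must be filled before any call */
    return imp_Func(x);
}

/* Hypothetical implementation that lives "inside the DLL". */
static int real_Func(int x) {
    return x * 2;
}

/* Plays the role of the dynamic linker for this single import. */
static void fake_load(void) {
    imp_Func = real_Func;
}
```

Client code that calls `Func(21)` never needs to know about the slot, while code that calls through `imp_Func` directly skips the extra hop. That difference is the small saving `__declspec(dllimport)` buys on function declarations.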
Unfortunately, there is no equivalent trick if you want to import data from another DLL. If we have some imported data myData, there is no way the import library can be defined so that a mov $eax, myData in an object file linked against it writes to the storage for myData in that DLL. Instead, the import library defines a symbol __imp__myData that resolves to the address at which the linked-in address of the storage can be found. The compiler then ensures that when you read or write a variable declared with __declspec(dllimport), those reads and writes go through the __imp__myData indirection. Because different code needs to be generated at the use site, __declspec(dllimport) declarations on data imports are not optional.
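The extra indirection for data can be pictured with a small C model. It is only a sketch: `imp_myData` stands in for the `__imp__myData` slot, `dll_storage_for_myData` is a hypothetical stand-in for the variable inside the DLL, and `fake_load` imitates the dynamic linker. The two accessor functions show the kind of code the compiler has to generate at each use site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical storage for myData, living "inside the DLL". */
static int dll_storage_for_myData = 42;

/* Stand-in for the slot the text calls __imp__myData: it holds the
   address of the storage, not the value itself. */
static int *imp_myData = NULL;

/* Plays the role of the dynamic linker filling in the slot. */
static void fake_load(void) {
    imp_myData = &dll_storage_for_myData;
}

/* What a read of a dllimport variable amounts to: load the pointer
   from the slot, then load the value through it. */
static int read_myData(void) {
    assert(imp_myData != NULL);
    return *imp_myData;
}

/* What a write amounts to: load the pointer, then store through it. */
static void write_myData(int value) {
    assert(imp_myData != NULL);
    *imp_myData = value;
}
```

Because a write lands in the DLL’s own storage rather than in a private copy, every image that imports the variable sees the same value.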
Theory is all very well but it can be helpful to see all the pieces in play.
First, let’s build a simple DLL exporting both functions and data. For maximum clarity, we’ll use an explicit export file rather than decorating our functions with __declspec(dllexport) or supplying a .def file to the linker.
First, let’s write the .def file, library.def. (The DATA keyword and LIBRARY line only affect how the import library is generated, as explained later on. Ignore them for now.)
Build an export file from that:
$ dlltool --output-exp library_exports.o -d library.def
The resulting object basically just contains an .edata section that exports the symbols _data_export and _function_export under the names data_export and function_export, respectively:
$ objdump -xs library_exports.o
We’ll fulfil these symbols with a trivial implementation of the DLL, library.c:
int data_export = 42;
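A minimal sketch of what a complete library.c along these lines might look like. Only the `data_export` definition comes from the text. The signature and body of `function_export` are hypothetical placeholders, chosen so that the function visibly depends on the exported data.

```c
/* Exported data, as given in the text. */
int data_export = 42;

/* Exported function. The body is an assumption made for illustration:
   any definition with this name would satisfy the export file. */
int function_export(void) {
    return data_export + 1;
}
```

Nothing in this file mentions exporting at all. In this setup the export file is what tells the linker which of the symbols to publish.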
We can put it together into a DLL:
$ gcc -shared -o library.dll library.c library_exports.o
The export table for the DLL is as follows, showing that we have exported what we wanted:
The Export Tables (interpreted .edata section contents)
When we come to look at using the DLL, things become a lot more interesting. First, we need an import library:
$ dlltool --output-lib library.dll.a -d library.def
(The reason that we have an import library but an export object is that using a library for the imports allows the linker to discard .idata for any imports that are not used. Conversely, the linker can never discard any .edata entry because any export may potentially be used by a user of the DLL.)
This import library is rather complex. It contains one object for each export (for example disds00001.o) but also two other object files (such as disdh.o) that set up the header and footer of the import list. (The header of the import list contains, among other things, the name of the DLL to link in at runtime, as derived from the LIBRARY line of the .def file.)
$ objdump -xs library.dll.a
Note that the object corresponding to data_export has an empty .text section, whereas function_export does define some code. If we disassemble it we get this:
The relocation of type dir32 tells the linker how to fill in the address being dereferenced by the jmp. We can see that _function_export, when entered, will jump directly to the function at the address loaded from the memory named .idata$5. Inspection of the complete .idata section satisfies us that .idata$5 is the address of the import address table slot for the function_export import name, and hence the address where the absolute address of the loaded function_export import can be found.
Although only function_export gets a corresponding _function_export function, both of the exports have led to a symbol with the __imp__ prefix (__imp__data_export and __imp__function_export) being defined in the import library. As discussed before, this symbol stands for the address at which the pointer to the data/function will be inserted by the dynamic linker. As such, the __imp__ symbols always point directly into the import address table.
With an import library in hand, we can write some client code, main1.c, that uses our exports.
Build and link it against the import library and we will get the results we expect:
$ gcc main1.c library.dll.a -o main1 && ./main1
The reason that this works even though there is no data_export symbol defined by library.dll.a is that the __declspec(dllimport) qualifier on our data_export declaration in main1.c has caused the compiler to generate code that uses the __imp__data_export symbol directly, as we can see if we disassemble the generated code:
$ gcc -c main1.c -o main1.o && objdump --disassemble -r main1.o
In fact, we can see that the generated code doesn’t even use the _function_export symbol, preferring __imp__function_export. Essentially, the code of the _function_export symbol in the import library has been inlined at every use site. This is why using __declspec(dllimport) can improve performance of cross-DLL calls, even though it is entirely optional on function declarations.
We might wonder what happens if we drop the __declspec(dllimport) qualifier on our declarations. Because of our discussion about the difference between data and function imports earlier, you might expect linking to fail. Our test file, main2.c, is the same client code with the qualifiers removed.
Let’s try it out:
$ gcc main2.c library.dll.a -o main2 && ./main2
What the hell – it worked? This is a bit surprising. The reason that it works, even though the import library library.dll.a does not define the _data_export symbol, is a nifty feature of GNU ld called auto-import. Without auto-import the link fails as we would expect:
$ gcc main2.c library.dll.a -o main2 -Wl,--disable-auto-import && ./main2
The Microsoft linker does not implement auto-import, so this is the error you would get if you were using the Microsoft toolchain.
However, there is a way to write client code that does not depend on auto-import or use the __declspec(dllimport) keyword. Our new client, main3.c, demonstrates it.
In this code, we directly use the __imp__-prefixed symbols from the import library. These name an address at which the real address of the import can be found, which is reflected by our C-preprocessor definitions of data_export and function_export.
This code compiles perfectly even without auto-import:
$ gcc main3.c library.dll.a -o main3 -Wl,--disable-auto-import && ./main3
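The idea behind main3.c can be sketched in a self-contained form. This is not the original listing. In a real build the two slots would be extern declarations resolved by the import library, whereas here they are defined locally as `imp_function_export` and `imp_data_export`, and a hypothetical `fake_load` fills them, so that the example links without any DLL. The `dll_data`, `dll_function` and `client` names are likewise assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the import address table slots that the import library
   would define as __imp__function_export and __imp__data_export. */
static int (*imp_function_export)(void) = NULL;
static int *imp_data_export = NULL;

/* Every use of the plain name expands to a dereference of its slot. */
#define function_export (*imp_function_export)
#define data_export (*imp_data_export)

/* Hypothetical contents of the DLL. */
static int dll_data = 42;
static int dll_function(void) {
    return dll_data + 1;
}

/* Plays the role of the dynamic linker. */
static void fake_load(void) {
    imp_function_export = dll_function;
    imp_data_export = &dll_data;
}

/* Client code written as if the imports were ordinary symbols. */
static int client(void) {
    assert(imp_function_export != NULL && imp_data_export != NULL);
    data_export += 1;         /* writes through the data slot */
    return function_export(); /* calls through the function slot */
}
```

The macros make the indirection invisible at the use sites, which is roughly the job the compiler does for you when a declaration carries `__declspec(dllimport)`.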
If you have followed along until this point you should have a solid understanding of how DLL import and export are implemented on Windows.
As a bonus, I’m going to explain how auto-import is implemented by the GNU linker. It is a rather cute hack you may get a kick out of.
As a reminder, auto-import is a feature of the linker that allows the programmer to declare an item of DLL-imported data with a simple extern keyword, without having to explicitly use __declspec(dllimport). This is extremely convenient because this is exactly how most *nix source code declares symbols it expects to import from a shared library, so by supporting this use case that *nix code becomes more portable to Windows.
Auto-import kicks in whenever the linker finds an object file making use of a symbol foo which is not defined by any other object in the link, but where a symbol __imp_foo is defined by some object. In this case, it assumes that the use of foo is an attempt to access some DLL-imported data item called foo.
Now, the problem is that the linker needs to replace the use of foo with the address of foo itself. However, all we seem to know statically is an address where that address will be placed at runtime (__imp_foo). To square the circle, the linker plays a clever trick.
The trick is to extend the .idata of the image being created with an entry for a “new” DLL. The new entry is set up as follows:
The filename of the image being imported is set to the same filename as the .idata entry that supplies __imp_foo. So if __imp_foo was being filled out by an address in Bar.dll, our new .idata entry will use Bar.dll here.
The import lookup table is of length 1, and its sole entry is a pointer to the name of the imported symbol corresponding to __imp_foo. So if __imp_foo is filled out by the address of the foo export from Bar.dll, the name of the symbol we put in here will be foo.
The import address table is of length 1 and, here is the clever bit, is located precisely at the location in the object file that was referring to the (undefined) symbol foo.
This solution neatly defers the task of filling out the address that the object file wants to the dynamic linker. The reason that the linker can play this trick is that it can see all of the object code that goes into the final image, and can thus fix all of the sites that need to refer to the imported data.
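The trick can be modelled in a few lines of C. This is a loose illustration and not how GNU ld represents anything internally: `UseSite` stands in for a spot in the object code that holds the address of foo, `AutoImport` for the extra one-entry .idata record, and `fake_load` for the dynamic linker. The struct layouts, and the single hard-coded export inside `fake_load`, are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a place in the object code whose operand should be the
   address of foo, which is unknown until the DLL is loaded. */
typedef struct {
    int *operand;
} UseSite;

/* Stand-in for one extra .idata entry created by auto-import. */
typedef struct {
    const char *dll;     /* filename copied from the entry that supplies __imp_foo */
    const char *symbol;  /* sole entry of the import lookup table */
    int **address_slot;  /* the one-entry import address table: the use site itself */
} AutoImport;

/* Hypothetical storage for foo, living "inside Bar.dll". */
static int bar_dll_foo = 42;

/* Two places in the object code that refer to foo. */
static UseSite site_a, site_b;

/* One extra import entry per use site, each aimed at that site's operand. */
static AutoImport auto_imports[] = {
    { "Bar.dll", "foo", &site_a.operand },
    { "Bar.dll", "foo", &site_b.operand },
};

/* Plays the role of the dynamic linker: it simply fills in every import
   address table slot, unaware that the slots sit in the middle of code. */
static void fake_load(void) {
    size_t n = sizeof auto_imports / sizeof auto_imports[0];
    for (size_t i = 0; i < n; i++) {
        const AutoImport *imp = &auto_imports[i];
        if (strcmp(imp->dll, "Bar.dll") == 0 && strcmp(imp->symbol, "foo") == 0)
            *imp->address_slot = &bar_dll_foo;
    }
}

/* "Executing" a use site: read foo through the patched operand. */
static int run_site(const UseSite *site) {
    assert(site->operand != NULL);
    return *site->operand;
}
```

The model needs one `AutoImport` record per use site, so an image that refers to the same imported data from many places ends up with many small import entries for the same DLL.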
Note that in general the final image’s .idata will contain several entries for the same DLL: one from the import library, and one for every place in any object file in the link which referred to some data exported by the DLL. Although this is somewhat unusual behaviour, the Windows loader has no problem with there being several imports of the same DLL.
Unfortunately, the scheme described above only works if the object code has an undefined reference to foo itself. What if instead it has a reference to foo+N, an address N bytes after the address of foo itself? There is no way to set up the .idata so that the dynamic linker adds a constant to the address it fills in, so we seem to be stuck.
Alas, such relocations are reasonably common, and originate from code that accesses a field of a DLL-imported structure type. Cygwin actually contains another hack to make auto-import work in such cases, known as “pseudo-relocations”. If you want to know the details of how these work, there is more information in the original thread on the topic.
Dynamic linking on Windows is hairier than it at first appears. I hope this article has gone some way to clearing up the meaning of the mysterious dllimport and dllexport keywords, and to clarifying the role of the import and export libraries.
Linux and friends implement dynamic linking in a totally different manner to Windows. The scheme they use is more flexible and allows more in-memory sharing of code, but incurs a significant runtime penalty (especially on i386). For more details see here and the Dynamic Linking section of the ELF spec.