Imperfect C++ Practical Solutions for Real-Life Programming By Matthew Wilson
	Chapter 9. Dynamic Libraries

9.4. Versioning

The second major source of dynamic linking problems is version changes. Some of these problems are obvious, others much less so, but they can all end up breaking your executables, and other executables too if you're really unlucky.

In bad cases, changes you've made can leave you needing an older version of a given dynamic library for some of the applications on a system and a newer version for others. This is "DLL Hell" [Rich2002].

9.4.1 Lost Functions

The first, and perhaps most obvious, versioning problem arises when a newer version of a dynamic library is deployed without a function that was present in a prior version. In this case, any dependent link units will not load, and so your executable will not load. The simple solution is to never remove any functions from your dynamic libraries, and operating systems mandate this as part of their ABIs.

Some vendors take a more detailed approach and assign stability levels to their APIs, which indicate to developers how each API will evolve, or not, in future versions. Developers use this information to guide their use of the APIs and to plan the evolution of their own software consistently with that of the vendors.

9.4.2 Changed Signatures

When using C linkage, the symbols in dynamic libraries contain no information as to the function's arguments. This applies both to C code and to C++ code using extern "C". The danger here is that if the function's signature is changed, the client code in any dependent link units will still be linked to the new form of the function at load time, with the obvious consequences to the robustness of your executable and the gloss on your resume. With mangling, this does not happen, since a different function signature results in a different mangled name.

Naturally, one of the main principles of a good software engineer is to avoid changing the signature of functions that have been "released," that is, where client code may exist that is outside the control of the programming team effecting the change. In practice, once you've been through a release cycle, you should refrain from changing functions even when you're "sure," since surety is an ephemeral notion in computing. If you need to provide a function with changed semantics, add a new one and deprecate, but do not remove, the old one.
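The "add a new one and deprecate the old one" advice might be sketched as follows. The names mylib_find and mylib_find_from are hypothetical, invented purely for illustration. The point is only that the released function keeps its exact signature and is reimplemented in terms of its replacement.

```cpp
#include <cassert>
#include <cstring>

extern "C" {

// New entry point (hypothetical): the search starts at index `start`.
// Returns the index of the first match, or -1 if there is none.
int mylib_find_from(char const* s, char c, int start)
{
    assert(s != nullptr && start >= 0);
    int const len = static_cast<int>(std::strlen(s));
    for (int i = start; i < len; ++i)
    {
        if (s[i] == c)
        {
            return i;
        }
    }
    return -1;
}

// Original entry point: marked as deprecated in the documentation, but
// still exported with its released signature, so existing client link
// units continue to load and behave as before.
int mylib_find(char const* s, char c)
{
    return mylib_find_from(s, c, 0);
}

} // extern "C"
```

With a C++14 compiler, the declaration of the old function in the library header could additionally carry the [[deprecated]] attribute, so that clients are warned when they recompile, without any effect on binaries already deployed.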

There's another aspect of this that's worth highlighting. On most systems, the load-time fixing up of dynamic library imports and exports is done based on names. On Win32, however, exported symbols can be represented in the export table either by name or by ordinal number. Using ordinals can make your dynamic libraries smaller, and it can also be used to hide function names from overeager reverse engineers. The downside is that client link units rely on the ordinal for a given function being immutable. Removing an ordinal from a Win32 DLL almost guarantees that client executables will not load. Reusing an ordinal for a new function will simply mean that caller and callee are expecting different things, and you'll have a nasty crash. As reported in [Rich1997], Microsoft favors export by function name, and the Win32 system DLLs mostly follow this convention.

There's one nice, but perverse, use of ordinals, which is that it does allow you to rename APIs without breaking client code, as long as the function signatures stay the same. However, doing this without taking a misstep can be tricky, so I wouldn't recommend it.

9.4.3 Behavioral Changes

The most significant part of all software engineering activity is maintenance [Glas2003], and you will inevitably have to modify the behavior of existing functions. These changes can be bug fixes, or can be semantic enhancements. When making enhancements, you must retain backward compatibility. This is often only achievable if you've planned for forward compatibility in the original design. For example, you may provide flags for one of the parameters that stipulate what behavior you want from the function. In that case, you can enhance the function by adding new flags.
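A minimal sketch of the flags technique follows, assuming a hypothetical function mylib_normalise and invented flag names. The first version understands one flag, and the second version adds another without disturbing the signature or the meaning of the first.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical flag values. Once released, a value is never changed or reused.
unsigned const MYLIB_F_TRIM_TRAILING = 0x0001; // present in version 1
unsigned const MYLIB_F_UPPERCASE     = 0x0002; // added in version 2
unsigned const MYLIB_F_ALL_KNOWN     = MYLIB_F_TRIM_TRAILING | MYLIB_F_UPPERCASE;

// Modifies s in place according to flags. Returns 0 on success, or -1 if
// any flag is not recognised by this version of the library.
extern "C" int mylib_normalise(char* s, unsigned flags)
{
    assert(s != nullptr);
    if (flags & ~MYLIB_F_ALL_KNOWN)
    {
        return -1; // a client built against a newer version than this one
    }
    if (flags & MYLIB_F_TRIM_TRAILING)
    {
        std::size_t n = std::strlen(s);
        while (n > 0 && s[n - 1] == ' ')
        {
            s[--n] = '\0';
        }
    }
    if (flags & MYLIB_F_UPPERCASE) // version 2 behavior, opt-in only
    {
        for (char* p = s; *p != '\0'; ++p)
        {
            *p = static_cast<char>(std::toupper(static_cast<unsigned char>(*p)));
        }
    }
    return 0;
}
```

Old clients never pass the new flag, so they see exactly the behavior they were built against. Rejecting unrecognised flags is one possible design choice, not the only one. It lets a newer client detect that it has been loaded alongside an older library.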

In principle, you are free to change any behavior of your software that is not defined in its published interface documentation. However, in practice, you still have to be careful. Even if you're fixing a bug, sometimes you need to be aware that some client code may be dependent on the buggy behavior. For example, you may have a function that writes some data into a caller-supplied buffer. The first version always fills out the remaining part of the buffer with 0s. This is not part of the original design, and no client code had a problem with it, initially. Eventually, you come across a requirement that needs the unused remainder to be left intact. Unfortunately, there are now client applications that have been written to rely on the 0 padding.

Your actions in these cases will depend on real-world factors, such as the number of clients and your support resources. For example, operating system vendors, with huge user bases, usually opt to preserve such undocumented "features" because of the likely hit to their business. If you must preserve the old behavior, the only option is to introduce a new function with the correct semantics, and leave the old one around for old client link units.
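The buffer example might be handled along the following lines. Both function names are hypothetical. The old function keeps its undocumented zero fill, and a new function provides the semantics the new requirement calls for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

extern "C" {

// New function (hypothetical name): copies as much of src as fits, writes a
// terminating nul, and leaves the rest of the buffer untouched.
// Returns the number of characters copied, excluding the nul.
std::size_t mylib_copy_nofill(char* dest, std::size_t cchDest, char const* src)
{
    assert(dest != nullptr && src != nullptr && cchDest > 0);
    std::size_t n = std::strlen(src);
    if (n > cchDest - 1)
    {
        n = cchDest - 1;
    }
    std::memcpy(dest, src, n);
    dest[n] = '\0';
    return n;
}

// Original function: retains the zero fill of the unused remainder, which
// was never documented but which deployed clients have come to rely on.
std::size_t mylib_copy(char* dest, std::size_t cchDest, char const* src)
{
    std::size_t const n = mylib_copy_nofill(dest, cchDest, src);
    std::memset(dest + n, 0, cchDest - n);
    return n;
}

} // extern "C"
```

Implementing the old function in terms of the new one keeps a single copy of the real logic, which is one way to limit the maintenance cost of carrying both.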

9.4.4 Constants

If you've got an enum, or a set of flags, for a dynamic library function, it's obvious that you must not change any of the values of the enum members or the flags, or you run the risk of breaking extant clients. However, where you have a constant (whether #define, const, or a class member constant) things can be a bit trickier. The compiler builds the constant's value into each link unit that uses it, so any change to the constant will only be reflected in code compiled and deployed after the change.

One way around this is to define the constant in a function in a core library, and all the dependent libraries call that function to elicit the constant at run time. Obviously this will only work for values that do not need to be evaluated at compile time.
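As a rough sketch of this approach, assume a core library exporting a hypothetical function corelib_max_connections, and a dependent library that consults it. In practice the two functions would live in separate link units. They are shown together here only for brevity.

```cpp
#include <cassert>

// --- core library (hypothetical) ---
// The value lives in exactly one place. Shipping a new core library with a
// different return value changes it for every dependent library at once.
extern "C" int corelib_max_connections()
{
    return 64;
}

// --- dependent library (hypothetical) ---
// The limit is elicited at run time on each call, rather than being
// compiled in, so no rebuild is needed when the core library changes.
bool deplib_can_accept(int currentConnections)
{
    assert(currentConnections >= 0);
    return currentConnections < corelib_max_connections();
}

// Note: the result is not a compile-time constant, so it cannot be used
// as, for example, an array dimension or a template argument.
```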

The C++ version of this is to declare a class constant and to define it within one library. To affect the behavior in all dependent libraries, one need only ship an updated version of the core library. Naturally, the dependent libraries must have been written to be able to work with different values of the constant, and your testing infrastructure must exercise this.
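The class constant form might look something like the following, where CoreLimits and deplib_has_capacity are hypothetical names. The essential detail is that the shared header declares the member without an initializer, so client code cannot fold the value in at compile time and must read it from the core library. On Win32 the member would also need to be exported and imported, for example with __declspec(dllexport) and __declspec(dllimport). That is omitted here to keep the sketch portable.

```cpp
#include <cassert>

// --- shared header (hypothetical) ---
struct CoreLimits
{
    // Declared only: no value is visible to clients at compile time.
    static int const maxConnections;
};

// --- core library source file: the single definition ---
int const CoreLimits::maxConnections = 64;

// --- dependent library (hypothetical) ---
// Written to work with whatever value the deployed core library supplies.
bool deplib_has_capacity(int currentConnections)
{
    assert(currentConnections >= 0);
    return currentConnections < CoreLimits::maxConnections;
}
```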