Prelink and address space randomization
Prelink (PDF) is a popular tool used to decrease program load time, shortening system boot time and making applications start faster. Developed by Jakob Jelinek at Red Hat, prelink relocates libraries on disk to save dynamic linking time.
When the dynamic linker loads a dynamically linked ELF binary, it has to also load and link all of the libraries before executing the program's entry point, _main(). This process involves relocating libraries—changing all addresses referenced in the library to reflect the actual addresses in memory. Relocating libraries involves iterating through each address in the library and replacing it with the real address as determined by the library's location in the process's virtual address space. Most relocations happen in the symbol table and PLT; but in rare cases there are also .text relocations which require fixed-position executable code to be patched in a slightly slower process.
The relocation process will slow down an application's launch. In order to speed up the process, prelink relocates the libraries ahead of time. This is done by scanning every executable to be prelinked, generating a graph of libraries that will be loaded at the same time as other libraries, and then calculating target addresses for each library at such that it will never be loaded at the same address as other libraries. These offsets are then stored in the shared object files themselves, and the symbol tables and segment addresses are all adjusted to reflect addresses based on the chosen base address.
Once prelink has done its job, the dynamic linker no longer has to concern itself with relocation. Libraries are loaded at the address specified in the library header and the symbol table is already correct. If anything forces the library to be loaded at a different address, then the library is relocated appropriately as usual; otherwise we can say goodbye to the load-time overhead of relocating libraries.
Kernel facilities supplying address space layout randomization for libraries cannot be used in conjunction with prelink; to do so would require relocating the libraries, defeating the purpose of prelinking. Address space randomization is a core feature of secure systems such as OpenBSD, Adamantix, Hardened Gentoo, Fedora Core, and Red Hat Enterprise Linux. It has appeared as part of PaX as well as part of Ingo Molnar's Exec Shield, and has been accepted into the mainline kernel as of 2.6.12 after submission by Arjan van de Ven.
The simple purpose of address space randomization is to make it more difficult to perform certain classes of attacks by changing where in memory important segments for the attack are loaded. If an attacker wants to execute injected shell code or manipulate the program to execute out of order, he obviously has to know where that code is. By shuffling memory segments around, these attacks become quite difficult; the chances of successful attack are mathematically described in the PaX documentation and Wikipedia.
In an attempt to restore some of the benefits of address space randomization, prelink is capable of randomly selecting the addresses used for prelinking. This makes it more difficult to perform certain attacks on a system, because the addresses used are unique to that system. This approach is, however, less effective than per-process randomization because the addresses stay constant until prelink is run again.
There is another implication that has to be examined with prelink. To understand this implication, let us first review a feature of prelink by examining the load address of the C standard library in two processes: a user-owned 'cat' and a root-owned 'bash'. The C standard library is interesting because, in practice, virtually all return-to-libc attacks utilize it exclusively.
user@icebox:~$ cat /proc/self/maps | grep libc | grep r-xp 4df2e000-4e053000 r-xp 00000000 08:07 81197 /lib/tls/i686/cmov/libc-2.3.6.so user@icebox:~$ sudo -s root@icebox:/home/user# cat /proc/$$/maps | grep libc | grep r-xp 4df2e000-4e053000 r-xp 00000000 08:07 81197 /lib/tls/i686/cmov/libc-2.3.6.so
Closely examining these quickly verifies that the address of glibc's executable code is the same between these two processes; this is consistent with the behavior of prelink. Because the library itself is relocated ahead of time, there is a preference for the dynamic linker to load it at that address. Examination of libc itself yields the below.
user@icebox:~$ readelf -S /lib/tls/i686/cmov/libc-2.3.6.so | head -n 6 There are 64 section headers, starting at offset 0x12d114: Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 0] NULL 00000000 000000 000000 00 0 0 0 [ 1] .note.ABI-tag NOTE 4df2e154 000154 000020 00 A 0 0 4
Computing 4df2e154 - 154, the address and offset taken from any given non-NULL segment, yields 4df2e000, the base address of libc. This makes sense; prelink rewrites the segment and symbol addresses for the library based on a specific load address, and the dynamic linker loads the library at that address to avoid relocating it. Further, any program that links with libc has to be able to read libc, and will thus be able to derive the same information.
All of this means that any program on the system using any prelinked library will be able to leak information about higher privileged tasks using the same library. This allows any attacker able to gain any form of local access—or more directly any ability to read libc—to gain information about the address space layout of higher privileged processes, including the load address of libc. As we know, this information is extremely valuable to an attacker wanting to exploit a privileged process without brute forcing library load addresses.
This vulnerability only applies to attackers with local access; but this is not an unreasonable requirement. Many web hosting companies give local shell access or allow PHP; either of these can be used to remotely fetch a copy of libc. Due to the nature of the dynamic linker and sane security design, the dynamic linker is exactly as privileged as the process it is starting; therefor, even the most stringent mandatory access policies on systems such as SELinux, grsecurity, or AppArmor cannot prevent this attack.
Besides avoiding prelinking, there is one other way to prevent this information leak from being exploited. All processes linked to a prelinked library need access to the library file and load that library at the same address; the point of exposure is the use of the same copy of the library. In order to prevent information leaking, then, you must have separate copy of each library common between any two programs you don't want to leak information about each other. This can be done with Xen, chroot jails, UML, or simply isolated machines, as long as the directory hierarchies are individually prelinked with prelink randomization. Each system will have a different set of addresses from every other system in this scheme. This of course requires more hardware, more disk space, more management, more memory, and more work.
The direct implications of this information leak depend on your exact security concerns. A web hosting company, for example, may not want to run prelink on its servers, given the risk of effectively losing the benefit of address space randomization. A home desktop, on the other hand, may only have to worry about a trojan using the information leak to stage an attack on a system service such as cups or dbus—and should probably worry about /proc/PID/maps first. While these are both essentially the concern of an attacker with local access, the likelihood of attack and the value of potential damages are different.
The prelink tool gives a useful decrease in program load time, and can help users reach their desktop and the programs they need to run more quickly. It does however have some unfortunate repercussions that must be examined, especially in security-sensitive environments relying on address space randomization.
| Index entries for this article | |
|---|---|
| GuestArticles | Moser, John Richard |
