Published on Feb 27, 2024
Programs written in C and C++ are susceptible to memory errors,including buffer overflows and dangling pointers. These errors,which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously costly to repair. Tracking down their location in the source code is difficult, even when the full memory state of the program is available.
Once the errors are finally found, fixing them remains challenging: even for critical security-sensitive bugs, the average time between initial reports and the issuance of a patch is nearly one month. We present Exterminator, a system that automatically corrects heap-based memory errors without programmer intervention. Exterminator exploits randomization to pinpoint errors with high precision.
From this information, Exterminator derives runtime patches that fix these errors both in current and subsequent executions. In addition, Exterminator enables collaborative bug correction by merging patches generated by multiple users. We present analytical and empirical results that demonstrate Exterminator’s effectiveness at detecting and correcting both injected and real faults.
Once Exterminator locates a buffer overflow, it determines the allocation site of the overflowed object, and the size of the over flow. For dangling pointer errors, Exterminator determines both the allocation and deletion sites of the dangled object, and computes how prematurely the object was freed. With this information in hand, Exterminator corrects the errors by generating runtime patches. These patches operate in the context of a correcting allocator. The correcting allocator prevents overflows by padding objects, and prevents dangling pointer errors by deferring object deallocations. These actions impose little space overhead because Exterminator’s runtime patches are tailored to the specific allocation and deallocation sites of each error.
After Exterminator completes patch generation, it both stores the patches to correct the bug in subsequent executions, and triggers a patch update in the running program to fix the bug in the current execution. Exterminator’s patches also compose straightforwardly, enabling collaborative bug correction: users running Exterminator can automatically merge their patches, thus systematically and continuously improving application reliability. Exterminator can operate in three distinct modes: an iterative mode for runs over the same input, a replicated mode that can correct errors on-the-fly, and a cumulative mode that corrects errors across multiple runs of the same application. We experimentally demonstrate that, in exchange for modest runtime overhead (geometric mean of 25%), Exterminator effectively isolates and corrects both injected and real memory errors, including buffer overflows in the Squid web caching server and the Mozilla web browser.
The DieHard system includes a bitmap-based, fully-randomized memory allocator that provides probabilistic memory safety. The latest version of DieHard, upon which Exterminator is based, adaptively sizes its heap be M times larger than the maximum needed by the application (see Figure 2). This version of Die-Hard allocates memory from increasingly large chunks that we call miniheaps. Each miniheap contains objects of exactly one size. If an allocation would cause the total number of objects to exceed1/M, DieHard allocates a new miniheap that is twice as large as the previous largest miniheap.
Allocation randomly probes a miniheap’s bitmap for the given size class for a free bit: this operation takes O(1) expected time. Freeing a valid object resets the appropriate bit. DieHard’s use of randomization across an over-provisioned heap makes it probabilistically likely that buffer overflows will land on free space, and unlikely that a recently-freed object will be reused soon, making dangling pointer errors rare.
DieHard optionally uses replication to increase the probability of successful execution. In this mode, it broadcasts inputs to a number of replicas of the application process, each equipped with a different random seed. A voter intercepts and compares outputs across the replicas, and only actually generates output agreed on by a plurality of the replicas. The independent randomization of each replica’s heap makes the probabilities of memory errors independent. Replication thus exponentially decreases the likelihood of a memory error affecting output, since the probability of an error striking a majority of the replicas is low.
Modes of Operation
Exterminator can be used in three modes of operation: an iterative mode suitable for testing or whenever all inputs are available, a replicated mode that is suitable both for testing and for restricted deployment scenarios, and a cumulative mode that is suitable for broad deployment. All of these rely on the generation of heap images, which Exterminator examines to isolate errors and compute runtime patches. If Exterminator discovers an error when executing a program, or if DieFast signals an error, Exterminator forces the process to emit a heap image file. This file is akin to a core dump, but contains less data (e.g., no code), and is organized to simplify processing. In addition to the full heap contents and heap metadata, the heap image includes the current allocation time (measured by the number of allocations to date)
Exterminator’s iterative mode operates without replication. To find a single bug, Exterminator is initially invoked via a command-line option that directs it to stop as soon as it detects an error. Exterminator then re-executes the program in “replay” mode over the same input (but with a new random seed). In this mode, Exterminator reads the allocation time from the initial heap image to abort execution at that point; we call this a malloc breakpoint. Exterminator then begins execution and ignores DieFast error signals that are raised before the malloc breakpoint is reached. Once it reaches the malloc breakpoint, Exterminator triggers another heap image dump. This process can be repeated multiple times to generate independent heap images. Exterminator then performs post-mortem error isolation and runtime patch generation. A small number of iterations usually suffices for Exterminator to generate runtime patches for an individual error. When run with a correcting memory allocator that in corporates these changes, these patches automatically fix the isolated errors.
The iterated mode described above works well when all inputs are available so that rerunning an execution is feasible. However, when applications are deployed in the field, such inputs may not be available, and replaying may be impractical. The replicated mode of operation allows Exterminator to correct errors while the program is running, without the need for multiple iterations. Like DieHard, Exterminator can run a number of differently randomized replicas simultaneously (as separate processes), broadcasting inputs to all and voting on their outputs. However, Exterminator uses DieFast-based heaps, each with a correcting allocator.
This organization lets Exterminator discover and fix errors. In replicated mode, when DieFast signals an error or the voter detects divergent output, Exterminator sends a signal that triggers a heap image dump for each replica. If the program crashes because of a segmentation violation, a signal handler also dumps a heap image. If DieFast signals an error, the replicas that dump a heap image do not have to stop executing. If their output continues to be in agreement, they can continue executing concurrently with the error isolation process. When the runtime patch generation process is complete, that process signals the running replicas to tell the correcting allocators to reload their runtime patches. Thus, subsequent allocations in the same process will be patched onthe- fly without interrupting execution.
While the replicated mode can isolate and correct errors on-the fly in deployed applications, it may not be practical in all situations. For example, replicating applications with high resource requirements may cause unacceptable overhead. In addition, multithreaded or non-deterministic applications can exhibit different allocation activity and so cause object ids to diverge across replicas. To support these applications, Exterminator uses its third mode of operation, cumulative mode, which isolates errors without replication or multiple identical executions.
When operating in cumulative mode, Exterminator reasons about objects grouped by allocation and deallocation sites instead of individual objects, since objects are no longer guaranteed to be identical across different executions. Because objects from a given site only occasionally cause errors, often at low frequencies, Exterminator requires more executions than in replicated or iterative mode in order to identify these low-frequency errors without a high false positive rate. Instead of storing heap images from multiple runs, Exterminator computes relevant statistics about each run and stores them in its patch file. The retained data is on the order of a few kilobytes per execution, compared to tens or hundreds of megabytes for each heap image.
Buffer Overfow Detection
Exterminator examines heap images looking for discrepancies across the heaps, both in overwritten canaries and in live objects. If an object is not equivalent across the heaps, exterminator considers it to be a candidate victim of an overflow. To identify victim objects, exterminator compares the contents of equivalent objects, as identified by their object Id across all heaps. Exterminator builds an overflow mask that comprises the discrepancies found across all heaps. However, because the same logical object may legitimately differ across multiple heaps, Exterminator must take care not to consider these occurrences as overflows. First, a freed object may differ across heaps because it was filled with canaries only in some of the heaps.
Exterminator uses the canary bitmap to identify this case. Second, an object can contain pointers to other objects, which are randomly located on their respective heaps. Exterminator uses both deterministic and probabilistic techniques to distinguish integers from pointers. Briefly, if a value interpreted as a pointer points inside the heap area and points to the same logical object across all heaps, then Exterminator considers it to be the same logical pointer, and thus not a discrepancy. Exterminator also handles the case where pointers point into dynamic libraries, which newer versions of Linux place at pseudorandom base addresses. Finally, an object can contain values that legitimately differ from process to process. Examples of these values include process ids, file handles, pseudorandom numbers, and pointers in data structures that depend on addresses (e.g., some red-black tree implementations). When Exterminator examines an object and encounters any word that differs at the same position across all the heaps, it considers it to be legitimately different, and not an indication of buffer overflow.
Exterminator Runtime Overhead
Here evaluate Exterminator’s performance with the SPECint2000 suite running reference workloads, as well as a suite of allocation-intensive benchmarks. We use the latter suite of benchmarks both because they are widely used in memory management studies and because their high allocation-intensity stresses memory management performance. For all experiments, we fix Exterminator’s heap multiplier (value of M) at 2. All results are the average of five runs on a quiescent, dual-processor Linux system with 3GB of RAM, with each 3.06 GHz Intel Xeon processor (hyperthreading active) equipped with 512K L2 caches. Our observed experimental variance is below 1%. We focus on the nonreplicated mode (iterative/cumulative), which we expect to be a key limiting factor for Exterminator’s performance and the most common usage scenario.
We compare the runtime of Exterminator (DieFast plus the correcting allocator) to the GNU libc allocator. This allocator is based on the Lea allocator, which is among the fastest avail-able. Figure shows that, versus this allocator, Exterminator degrades performance by from 0% (186.crafty) to 132% (cfrac), with a geometric mean of 25.1%. While Exterminator’s overhead is substantial for the allocation-intensive suite (geometric mean: 81.2%), where the cost of computing allocation and deallocation contexts dominates, its overhead is significantly less pronounced across the SPEC benchmarks (geometric mean: 7.2%).
This paper presents Exterminator, a system that automatically corrects heap-based memory errors in C and C++ programs with high probability. Exterminator operates entirely at the runtime level on unaltered binaries, and consists of three key components:
DieFast, a probabilistic debugging allocator
a probabilistic error isolation algorithm, and
a correcting memory allocator.
Ex-terminator’s probabilistic error isolation isolates the source and ex-tent of memory errors with provably low false positive and false negative rates. Its correcting memory allocator incorporates runtime patches that the error isolation algorithm generates to correct memory errors. Exterminator is not only suitable for use during testing, but also can automatically correct deployed programs.
More Seminar Topics:
Measuring Universal Intelligence,
Fast And Secure Protocol,
Case Based Reasoning System,
Mobile Number Portability,
Mobile Phone Cloning,
Nano Cars Into The Robotics,
Privacy Preserving Data Publishing,
Example Based Machine Translation,