Stream-Based Trace Compression
|
Contact:
Aleksandar Milenkovic
Assistant Professor of Electrical and Computer Engineering
The University of Alabama in Huntsville
301 Sparkman Drive, Huntsville, AL 35899
Email: milenka@ece.uah.edu
URL: http://www.ece.uah.edu/~milenka
Trace-driven simulations have been widely used in quantitative evaluations
of new techniques in computer architecture. To offer a faithful representation
of a specific workload, traces are very large, encompassing billions of
memory references and/or instructions. To efficiently store and use even
a small collection of traces, its size must be reduced beyond traditional
compression techniques.
Stream-based compression (SBC) is a new method for single-pass compression
of address traces and various extended trace formats. The SBC algorithm
relies on extracting instruction streams. An instruction stream is defined
as a sequential run of instructions, from the target of a taken branch to
the first taken branch in sequence. A stream table is created during compression,
encompassing all relevant information about streams, such as starting address,
stream length, instruction words and their types. All instructions from
a stream are replaced by its index in the stream table, creating a trace
of instruction streams.
Unlike instruction addresses, data address for a memory referencing
instruction stays rarely constant during program execution, but it a can
have a regular stride. The SBC trace compression features a very simple
and yet efficient on-line algorithm for compression of data address references.
The compressed data address trace encompasses the data address stride
and the number or repetitions for each memory-referencing instruction in
a stream. A change in a data address stride results in another compressed
record. Compressed records are ordered by the corresponding stream appearances
in the trace.
The SBC algorithm exploits several inherent characteristics of program
execution traces. Instruction traces consist of a fairly limited number
of different instruction streams, and the most of memory references exhibit
strong spatial and/or temporal locality, for example, a load having a constant
address stride across loop iterations.
Figure 1 illustrates the compression flow assuming Dinero+ input.
A Dinero+ trace record has fixed length fields: the header field (0 – data
read, 1 – data write, and 2 – instruction read), the address field, and the
instruction word field for the instruction read type. The stream-based compression
results in three files: Stream Table File (STF), Stream-Based Instruction
Trace (SBIT), and Stream-Based Data Trace (SBDT). Figure 2 demonstrates
the compression process using a short excerpt of a trace. Figure 3 shows
the SBC format for data address references.
Header file: mytrace.h
Dinero+ to SBC conversion: din_plus2sbc.c
SBC to Dinero+ conversion: sbc2din_plus.c
SBC to Dinero conversion: sbc2din.c
A. Milenkovic, M. Milenkovic, "Exploiting Streams in Instruction and
Data Address Trace Compression,"
Proceedings of the IEEE 6th Annual Workshop on Workload Characterization,
Austin, TX, USA, October 27, 2003, pp. 99-107.
[PDF]
[ps.gz]
[PPT
slides]
Last update: March 4, 2004