Stream-Based Trace Compression
(SBC)

Contact:
Aleksandar Milenkovic
Assistant Professor of Electrical and Computer Engineering
The University of Alabama in Huntsville
301 Sparkman Drive, Huntsville, AL 35899
Email: milenka@ece.uah.edu
URL: http://www.ece.uah.edu/~milenka

What is SBC?
How does it work?
Download programs to work with SBC
Learn more ...
Download SBC traces for SPEC CPU 2000

What is Stream-based Compression?

Trace-driven simulations have been widely used in quantitative evaluations of new techniques in computer architecture. To offer a faithful representation of a specific workload, traces are very large, encompassing billions of memory references and/or instructions. To efficiently store and use even a small collection of traces, its size must be reduced beyond traditional compression techniques.
Stream-based compression (SBC) is a new method for single-pass compression of address traces and various extended trace formats. The SBC algorithm relies on extracting instruction streams. An instruction stream is defined as a sequential run of instructions, from the target of a taken branch to the first taken branch in sequence. A stream table is created during compression, encompassing all relevant information about streams, such as starting address, stream length, instruction words and their types. All instructions from a stream are replaced by its index in the stream table, creating a trace of instruction streams.
Unlike instruction addresses, data address for a memory referencing instruction stays rarely constant during program execution, but it a can have a regular stride. The SBC trace compression features a very simple and yet efficient on-line algorithm for compression of data address references. The compressed data address trace encompasses the data address stride and the number or repetitions for each memory-referencing instruction in a stream. A change in a data address stride results in another compressed record. Compressed records are ordered by the corresponding stream appearances in the trace.

How does it work?

The SBC algorithm exploits several inherent characteristics of program execution traces. Instruction traces consist of a fairly limited number of different instruction streams, and the most of memory references exhibit strong spatial and/or temporal locality, for example, a load having a constant address stride across loop iterations.
Figure 1 illustrates the compression flow assuming Dinero+ input. A Dinero+ trace record has fixed length fields: the header field (0 – data read, 1 – data write, and 2 – instruction read), the address field, and the instruction word field for the instruction read type. The stream-based compression results in three files: Stream Table File (STF), Stream-Based Instruction Trace (SBIT), and Stream-Based Data Trace (SBDT). Figure 2 demonstrates the compression process using a short excerpt of a trace. Figure 3 shows the SBC format for data address references.

Figure 1. SBC Compression Flow: from Dinero+ to SBC.

sbc: an example

Figure 2. SBC Compression: An Example

Figure 3. SBC Data Trace Format.

Programs to work with SBC traces

The utility programs working with the SBC traces have been made available to the research community under the terms of the GNU General Public License.

README.txt
Header file: mytrace.h
Dinero+ to SBC conversion: din_plus2sbc.c
SBC to Dinero+ conversion: sbc2din_plus.c
SBC to Dinero conversion: sbc2din.c
DineroIV modified to read SBC traces (additional and modified files only): dinero4sbc.tgz

Learn more ...

A. Milenkovic, M. Milenkovic, "Exploiting Streams in Instruction and Data Address Trace Compression,"
Proceedings of the IEEE 6th Annual Workshop on Workload Characterization, Austin, TX, USA, October 27, 2003, pp. 99-107.
[PDF] [ps.gz] [PPT slides]

Download SBC Traces for SPEC CPU2000 Benchmarks

Traces are collected for the Alpha architecture running SPEC CPU2000 benchmarks with the reference input set.
Each trace encompasses first 2 billion instructions (F2B) and 2 billion instructions after skipping 50 billion instructions (M2B).
A trace includes instruction (SBIT - T1.*) and data address (SBDT - C3.*) components, and stream table (STF - tblID.*).
SBIT and SBDT components consist of 101 partial traces (each but last encompassing about 20 million instructions).

164.gzip

F2B: 164.gzip.f2b.tar [149,176,320 bytes]
M2B: 164.gzip.m2b.tar [143,093,760 bytes]

176.gcc

F2B: 176.gcc.f2b.tar [202,526,720 bytes]
M2B: 176.gcc.m2b.tar [132,024,320 bytes]

181.mcf

F2B: 181.mcf.f2b.tar [65,331,200 bytes]
M2B: 181.mcf.m2b.tar [548,188,160 bytes]

186.crafty

F2B: 186.crafty.f2b.tar [140,472,320 bytes]
M2B: 186.crafty.m2b.tar [124,088,320 bytes]

197.parser

F2B: 197.parser.f2b.tar [175,482,880 bytes]
M2B: 197.parser.m2b.tar [189,440,000 bytes]

252.eon

F2B: 252.eon.f2b.tar [85,268,480 bytes]
M2B: 252.eon.m2b.tar [87,992,320 bytes]

253.perlbmk

F2B: 253.perlbmk.f2b.tar [97,771,520 bytes]
M2B: 253.perlbmk.m2b.tar [45,742,080 bytes]

254.gap

F2B: 254.gap.f2b.tar [41,932,800 bytes]
M2B: 254.gap.m2b.tar [28,968,960 bytes]

255.vortex

F2B: 255.vortex.f2b.tar [287,467,520 bytes]
M2B: 255.vortex.m2b.tar [141,137,920 bytes]

300.twolf

F2B: 300.twolf.f2b.tar [297,359,360 bytes]
M2B: 300.twolf.m2b.tar [399,431,680 bytes]

168.wupwise

F2B: 168.wupwise.f2b.tar [10,434,560 bytes]
M2B: 168.wupwise.m2b.tar [15,575,040 bytes]

171.swim

F2B: 171.swim.f2b.tar [890,880 bytes]
M2B: 171.swim.m2b.tar [522,240 bytes]

172.mgrid

F2B: 172.mgrid.f2b.tar [3,788,800 bytes]
M2B: 172.mgrid.m2b.tar [3,553,280 bytes]

173.applu

F2B: 173.applu.f2b.tar [12,267,520 bytes]
M2B: 173.applu.m2b.tar [9,482,240 bytes]

177.mesa

F2B: 177.mesa.f2b.tar [26,603,520 bytes]
M2B: 177.mesa.m2b.tar [30,545,920 bytes]

178.galgel

F2B: 178.galgel.f2b.tar [2,990,080 bytes]
M2B: 178.galgel.m2b.tar [3,717,120 bytes]

179.art

F2B: 179.art.f2b.tar [1,287,785 bytes]
M2B: 179.art.m2b.tar [257,978 bytes]

183.equake

F2B: 183.equake.f2b.tar [16,998,400 bytes]
M2B: 183.equake.m2b.tar [91,248,640 bytes]

188.ammp

F2B: 188.ammp.f2b.tar [12,206,080 bytes]
M2B: 188.ammp.m2b.tar [73,646,080 bytes]

189.lucas

F2B: 189.lucas.f2b.tar [1,105,920 bytes]
M2B: 189.lucas.m2b.tar [1,187,840 bytes]

191.fma3d

F2B: 191.fma3d.f2b.tar [8,273,920 bytes]
M2B: 191.fma3d.m2b.tar [3,031,040 bytes]

200.sixtrack

F2B: 200.sixtrack.f2b.tar [24,760,320 bytes]
M2B: 200.sixtrack.m2b.tar [4,270,080 bytes]

301.apsi

F2B: 301.apsi.f2b.tar [14,417,920 bytes]
M2B: 301.apsi.m2b.tar [14,448,640 bytes]

free hit counter

Last update: March 4, 2004