Analyzing DaCapo and SPEC JVM with averroes using JDK 1.6
This tutorial steps through the process of downloading and running averroes to analyze the programs from the DaCapo-2006-10MR2 and the SPEC JVM98 benchmarks. I will refer to all these programs as benchmarks throughout the rest of this tutorial.
Prerequisites
You can either:
- download averroes-home.tar.gz (~174MB, the complete archive of files needed for this tutorial) on a machine that runs Java 1.7 or above (both the JDK and the JRE are expected to be installed), OR
- download this virtual machine (~12GB), which comes with all the prerequisites pre-installed and contains averroes-home.tar.gz on the desktop. Both the user name and the password for the administrator account on the virtual machine are aec. The virtual machine requires at least 10GB of RAM and has been tested on VirtualBox Version 5.0.20 r106931 on a MacBook Pro with 16GB RAM running OS X El Capitan 10.11.5.
After the download is complete, extract averroes-home.tar.gz to any directory you want.
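The extraction step can be sketched as a small shell function. The function name extract_averroes and the target directory in the usage line are illustrative assumptions, not part of the archive; only the archive name comes from this tutorial.

```shell
# Minimal sketch: extract the tutorial archive into a directory of your choice.
# extract_averroes is a hypothetical helper name, not a script shipped in the archive.
extract_averroes() {
  archive="$1"      # path to averroes-home.tar.gz
  target_dir="$2"   # any writable directory
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # -x extract, -z gunzip, -f archive file, -C change into target_dir first
  mkdir -p "$target_dir" && tar -xzf "$archive" -C "$target_dir"
}
```

For example, `extract_averroes averroes-home.tar.gz "$HOME/averroes-tutorial"` would unpack the archive under the given directory, which is an arbitrary example location.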
Source code
The source code for averroes and this tutorial is available at the following links:
Contents of averroes-home
The directory averroes-home should now contain the following artifacts:
- all-output-1.6: the output directory for any logs, call graphs, and placeholder libraries generated throughout this tutorial. It also contains the pre-computed dynamic call graphs, as it takes an insane amount of time to generate those for JDK 6 (~6 days!). The directory also contains the pre-computed doop and doop-averroes call graphs, as I could not include the Datalog engine used by Doop due to licensing issues.
- benchmarks: a directory that contains the JAR files for the benchmark programs.
- jre: a directory that contains Java 1.4.2_11 and Java 1.6.0_45, the two JDKs that averroes has been verified to work on.
- run-all: a convenience bash script that runs this tutorial 10 times, collecting statistics along the way.
- averroes.jar: the main runnable JAR for averroes. This program is the JAR factory that creates the placeholder library JAR file for any input program.
- averroes: a bash script that runs averroes on an input benchmark to generate its placeholder library.
- averroes-all: a bash script that executes the script averroes for all the benchmarks.
- tool.jar: an executable JAR that runs one of the tools (Spark, Doop, or WALA) with or without averroes to generate the call graph for the given benchmark program.
- run-tool: a bash script that generates the call graph for the given tool on a given benchmark program.
- run-benchmark: a bash script that runs all the tools (Spark, Doop, and WALA) with and without averroes on a given benchmark program.
- run-all-benchmarks: a bash script that executes the script run-benchmark for all the benchmarks.
- run-all-once: a bash script that runs this tutorial once instead of 10 times as in the case of the script run-all.
- probe.jar: a utility JAR for ProBe that can print out call graph information.
- latex.jar: a runnable JAR that generates the LaTeX tables for the statistics collected while running through the benchmarks.
- errorbars.jar: a runnable JAR that generates the statistics required to produce the time and memory bar charts with error bars.
Running the full tutorial
After extracting averroes-home.tar.gz, run the following commands on your terminal:
$ cd averroes-home/
$ ./run-all 1.6
This command runs the tutorial 10 times and takes roughly 20 hours to finish. So if you run it, kick back and relax, or better, do some other work while it finishes execution. For each run N, the script creates the output directory averroes-home/all-output-1.6/N, which contains the following:
- benchmarks-averroes: a directory that contains the JAR files generated by averroes for each benchmark program. More information on the output generated by averroes can be found here.
- callgraphs: a directory that contains the call graphs generated by the tools (Spark, Doop, and WALA) with and without averroes for each benchmark program.
For each benchmark program, the following log files are generated while running:
- benchmarks-averroes/<benchmark>/averroes.log: a log file that records statistics about the placeholder library generation.
- callgraphs/<benchmark>/doop.log: a log file that records the output of running Doop.
- callgraphs/<benchmark>/doop-gc.log: a log file that records the Java garbage collection statistics while running Doop. This information is used to determine the memory usage.
The last two log files are also generated for Spark and WALA, as well as for the versions of the three tools that analyze the programs with the placeholder libraries generated by averroes.
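To check which log files a run actually produced, something like the following sketch may help. It assumes only the directory layout described above (benchmarks-averroes and callgraphs under a run directory); the helper name list_run_logs is hypothetical.

```shell
# Minimal sketch: list every *.log file produced by one run of the tutorial.
# list_run_logs is a hypothetical helper, not a script shipped in averroes-home.
list_run_logs() {
  run_dir="$1"   # e.g. all-output-1.6/1
  # Search both per-run output directories and print the logs in sorted order.
  find "$run_dir/benchmarks-averroes" "$run_dir/callgraphs" \
       -type f -name '*.log' | sort
}
```

Running `list_run_logs all-output-1.6/1` from inside averroes-home should print one path per log file for the first run.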
Running the tutorial once
After extracting averroes-home.tar.gz, run the following commands on your terminal:
$ cd averroes-home/
$ ./run-all-once all-output-1.6/1 1.6
This command runs the tutorial just once (taking roughly 2 hours to finish), and its output is generated in the output directory averroes-home/all-output-1.6/1. The output follows the same structure as explained above.
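If you want more than one run but fewer than 10, a loop around run-all-once is one option. The sketch below assumes that run-all-once takes the output directory and the JDK version as its two arguments, as in the command above; the helper name run_n_times is hypothetical, and whether run-all is implemented this way is not something this tutorial states.

```shell
# Minimal sketch: repeat the single-run script a chosen number of times.
# run_n_times is a hypothetical helper; run it from inside averroes-home.
run_n_times() {
  runs="$1"   # how many runs to perform, e.g. 3
  jdk="$2"    # JDK version argument, e.g. 1.6
  n=1
  while [ "$n" -le "$runs" ]; do
    # Each run writes into its own numbered output directory.
    ./run-all-once "all-output-$jdk/$n" "$jdk" || return 1
    n=$((n + 1))
  done
}
```

For example, `run_n_times 3 1.6` would write its output to all-output-1.6/1 through all-output-1.6/3. Note that this may overwrite pre-computed files already present in those directories.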
Generating the LaTeX tables
To compare all the call graphs for the benchmarks, and generate the LaTeX tables that illustrate these comparisons, run the following commands on your terminal:
$ cd averroes-home/
$ java -jar latex.jar
When the program finishes execution, the following files will be generated in the directory averroes-home/tex:
- table-soundness.tex: a table that shows the effect of analyzing the placeholder library generated by averroes compared to analyzing the original library code on the soundness of the call graphs generated by Spark, Doop, and WALA.
- table-imprecision.tex: a table that shows the effect of analyzing the placeholder library generated by averroes compared to analyzing the original library code on the precision of the call graphs generated by Spark, Doop, and WALA.
- table-cgsize.tex: a table that shows some statistics about the call graphs generated throughout the tutorial.
- sparkave.stats: a CSV file for the frequency of imprecise library callbacks in the call graphs generated by Spark when analyzing the placeholder library generated by averroes compared to analyzing the original library code.
- doopave.stats: a CSV file for the frequency of imprecise library callbacks in the call graphs generated by Doop when analyzing the placeholder library generated by averroes compared to analyzing the original library code.
- walakave.stats: a CSV file for the frequency of imprecise library callbacks in the call graphs generated by WALA when analyzing the placeholder library generated by averroes compared to analyzing the original library code.
Generating diagrams with error bars
To print out the statistics of running the tutorial 10 times in a format that is easy to copy and paste into an Excel sheet to produce the bar charts for execution time and memory consumption with error bars, run the following commands on your terminal:
$ cd averroes-home/
$ java -jar errorbars.jar
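This tutorial does not describe the exact statistics that errorbars.jar prints. As a rough illustration of what typically sits behind an error bar, the sketch below computes the mean and the sample standard deviation of a column of per-run measurements. The helper name mean_stddev and the choice of standard deviation are assumptions, not a description of errorbars.jar.

```shell
# Minimal sketch: mean and sample standard deviation of one value per line.
# mean_stddev is a hypothetical helper; it does not reproduce errorbars.jar.
mean_stddev() {
  awk '{ sum += $1; sumsq += $1 * $1; n++ }
       END {
         if (n < 2) { print "need at least two values" > "/dev/stderr"; exit 1 }
         mean = sum / n
         # Sample variance (divides by n - 1); clamp tiny negative rounding errors.
         var = (sumsq - n * mean * mean) / (n - 1)
         if (var < 0) var = 0
         printf "%.2f\t%.2f\n", mean, sqrt(var)
       }'
}
```

Feeding it one measurement per line, for example `printf '1\n2\n3\n' | mean_stddev`, prints the mean and the standard deviation separated by a tab, which pastes cleanly into two spreadsheet columns.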
Call Graph Info
The averroes-home directory contains a utility JAR called probe.jar that can print out information about any call graph used throughout this tutorial. You can use it as follows:
$ java -cp probe.jar probe.CallGraphInfo [options] graph.txt.gzip
where options can be any combination of the following:
-m : print list of reachable methods
-e : print list of entry points
-j : ignore the Java standard library
-g : print list of call edges
-lib file : ignore methods in packages listed in file
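As a concrete combination of the options above, the following sketch prints the reachable methods of a call graph while ignoring the Java standard library. The helper name callgraph_methods and the JAVA override variable are hypothetical conveniences, and the graph path in the usage line is a placeholder that you should replace with a real call graph file from the callgraphs directory.

```shell
# Minimal sketch: list reachable methods (-m), ignoring the Java standard library (-j).
# callgraph_methods and the JAVA variable are hypothetical; run from inside averroes-home.
callgraph_methods() {
  graph="$1"   # path to a call graph file, e.g. some graph.txt.gzip
  "${JAVA:-java}" -cp probe.jar probe.CallGraphInfo -m -j "$graph"
}
```

For example, `callgraph_methods path/to/graph.txt.gzip` runs the same command shown above with the -m and -j options filled in.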
ECOOP '13 Validated Artifact
Most of the experiments discussed in the tutorial above were part of the artifact we submitted to ECOOP '13 in Montpellier, France. averroes has been verified by the Artifact Evaluation Committee to be consistent, complete, well-documented, and easy to reuse.