cgc is a prototype tool that generates a sound call graph for the application part of a program without analyzing the library code. It uses a context-insensitive pointer analysis to build the call graph on the fly. Although the prototype is implemented in Datalog for ease of modification and experimentation, it could be transcribed into Java and embedded into an analysis framework such as Soot or WALA.

More details about cgc, its pointer analysis, its assumptions, and its empirical evaluation can be found in my ECOOP’12 paper.

Workflow

Figure: cgc workflow

Call Graph Schema

cgc can output the generated call graph as a GXL document or as a directed DOT graph file. The DOT graph can be rendered with Graphviz, or cgc can convert it to a PNG or PS file that opens in any document previewer. The call graph GXL schema used by cgc can be found here.
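As a rough illustration of the DOT output format, the sketch below serializes a set of call edges as a directed DOT graph. It is not cgc's exporter: the function `to_dot`, the graph name `callgraph`, and the method names in the sample edges are all hypothetical, and cgc's real output may carry extra attributes.

```python
def to_dot(edges, name="callgraph"):
    """Render (caller, callee) pairs as a directed DOT graph (illustrative only)."""
    def quote(label):
        # DOT identifiers containing dots or spaces must be double-quoted.
        return '"' + label.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{"]
    # Deduplicate and sort so the output is stable across runs.
    for caller, callee in sorted(set(edges)):
        lines.append(f"  {quote(caller)} -> {quote(callee)};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical call edges, not taken from any benchmark.
edges = [("Main.main", "Foo.bar"), ("Foo.bar", "Foo.baz")]
dot_text = to_dot(edges)
print(dot_text)
```

Saved to a file such as callgraph.dot, text of this shape can be rendered with Graphviz, for example with `dot -Tpng callgraph.dot -o callgraph.png`.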

Logic Rules

The cgc pointer analysis is implemented declaratively in Datalog, which makes it easier to focus on “what” you want from an analysis rather than “how” to compute it. We used the context-insensitive pointer analysis from the Doop framework as our base analysis and modified it, heavily in many cases, to fit our requirements. The logic rules for cgc can be found here. A README file in the downloaded archive explains which files are taken as is from Doop, which are Doop files that were modified, and which are completely new in cgc.
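To give a feel for the declarative style, the sketch below evaluates two textbook points-to rules to a fixed point. These are not the cgc or Doop rules, which are far more detailed. The relation names `Alloc`, `Assign`, and `VarPointsTo`, the variable names, and the heap labels are assumptions chosen for illustration. The Datalog rules appear in the docstring, and the Python loop only spells out the "how" that a Datalog engine would otherwise handle.

```python
def var_points_to(alloc, assign):
    """Naive fixed-point evaluation of two illustrative rules:

        VarPointsTo(v, h)  :- Alloc(v, h).
        VarPointsTo(to, h) :- Assign(frm, to), VarPointsTo(frm, h).

    alloc  is a set of (variable, heap_object) pairs.
    assign is a set of (frm, to) pairs, meaning `to = frm`.
    """
    facts = set(alloc)  # first rule: every allocation is a points-to fact
    changed = True
    while changed:      # repeat the second rule until no new fact appears
        changed = False
        for frm, to in assign:
            for var, heap in list(facts):
                if var == frm and (to, heap) not in facts:
                    facts.add((to, heap))
                    changed = True
    return facts


# Hypothetical input facts: a = new A; b = new B; c = a; c = b; d = c
alloc = {("a", "new A"), ("b", "new B")}
assign = {("a", "c"), ("b", "c"), ("c", "d")}
result = var_points_to(alloc, assign)
```

In this example `c` and `d` each end up pointing to both heap objects, because the analysis is flow-insensitive and merges every assignment to a variable.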

Experiment Output

The output of the experiments from the empirical evaluation in my ECOOP’12 paper can be downloaded here. The download contains two folders, one for the DaCapo benchmarks and one for the SPEC JVM98 benchmarks. Each benchmark program has its own folder, which contains the following:

  • <benchmark_name>.stats: a text file that contains all the output of the three tools (cgc, Spark, and Doop) run on the benchmark program, including information such as preprocessing time and analysis time.
  • callgraph/: a folder that contains all the call graphs generated by the three tools as well as the difference call graphs between them.
  • database/: a folder that contains the LogicBlox database used by cgc and Doop after the analysis finishes execution.
  • reflection.stats: a text file that contains statistical information about the use of reflection in the corresponding benchmark.
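One plausible reading of a "difference call graph" is the set of edges that one tool reports and another does not. The sketch below shows that reading only. It is an assumption, not a description of how the files in callgraph/ were produced, and the helper `diff_edges` and the sample edges are hypothetical.

```python
def diff_edges(left, right):
    """Return the call edges present in `left` but absent from `right`."""
    return set(left) - set(right)


# Hypothetical edge sets from two tools, as (caller, callee) pairs.
tool_a = {("Main.main", "Foo.bar"), ("Main.main", "Lib.run")}
tool_b = {("Main.main", "Foo.bar")}

only_in_a = diff_edges(tool_a, tool_b)
only_in_b = diff_edges(tool_b, tool_a)
```

The difference is directional, so comparing two tools under this reading gives two graphs, one for each direction.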