J2BP

J2BP is a tool for predicate abstraction of Java programs. Given a set of Java classes and a set of predicates, for each original Java class it automatically generates an abstract Java class such that all fields and local variables of methods have the boolean type. Each boolean field or variable in a generated class represents a single predicate. Each statement in an original class is replaced by code that expresses the effect of the statement on the values of boolean variables (predicates).

Contact: Pavel Parizek, pparizek at uwaterloo.ca

Features

J2BP accepts the compiled Java classes (bytecode) as input and generates bytecode too (i.e., not source code). Input classes can only use the following data types of Java: boolean, char,short, int, and reference types except string constants. The unsupported types are: long, float, double, and string constants.

Current version of the tool targets only the Java Pathfinder (JPF) model checker. Generated abstract classes use JPF API for non-deterministic choice and atomic execution of code fragments. The tool is implemented in Scala and Java. It uses the WALA library for static analysis, the Yices SMT solver, and the ASM library for bytecode generation.

Collections. J2BP now supports also predicates for modeling the observable state of Java collections at the interface level. For this purpose, we defined a predicate language based on the array theory. See our paper at OOPSLA 2012 (available here) for details.

Download

  Binary release:  j2bp-2012-07-26-bin.tgz
  Source code:  j2bp-2012-07-26-src.tgz
  Examples:  examples.tgz

Running

To run the J2BP tool, you need a recent distribution of Scala to be installed on your computer. We tested the tool with Scala 2.8.1 that can be acquired here.
Download the current release of the tool and unpack the archive. Then execute the run.sh script that accepts the following arguments (all are required):
  1. name of the main class of the program (e.g., with the main method)
  2. path to the text file that contains the list of all Java classes forming the program (one per line), or the string "none" if the program consists of one Java class only
  3. path to the directory that contains compiled Java classes
  4. path to the text file that specifies all the predicates (its format is described below)
  5. path to the text file that specifies all the properties (its format is described below)
  6. path to the directory where the generated Java classes should be stored

Input

Predicates are specified in a text file that is divided into several sections. For each input class, there is one section for predicates over static fields of the class and one section for predicates over instance fields of the class. For each method of each input class, there is a section for predicates over local variables and parameters of the method. Local variables and parameters are referred to by terms localX and paramX, respectively, where X is the number of a variable or parameter. Local variables are numbered by the order of their first usage in the bytecode of the original method.
All predicates must be written in a prefix notation. The following relational operators are supported: <, <=, >, >=, =, !=. The left hand side and right hand side may involve the identifier of a local variable or method parameter, a field access expression, an array access expression, and the arithmetic operators + and -. Field access expressions can be defined using the dot notation (e.g., this.size) or the function syntax fread(field, obj) - the latter one has to be used for field accesses on complex object expressions. Names of instance fields have to be qualified with this. Array access expressions can be defined using the function syntax aread(arr, array_expr, index_expr). The given set of predicates should cover all expressions used in control statements (e.g., in while loops and if-else statements) in methods of the original Java class.
A small example of predicates' definition follows:
  [object: myjava.ArrayList]
  > this.size 0
  = this.size this.data.length

  [method: myjava.ArrayList.remove]
  <= this.size param1
  < local2 this.size
Properties are specified in a text file too. Each line is divided into two parts separated by the semicolon. The first part must contain the property formula (in the same syntax as input predicates). The format of the second part is full_method_name:bytecode_position. It specifies the full method name and bytecode position after which the property should hold.

See the examples.tgz archive for examples of predicate and property definitions.

Current version of the J2BP tool also imposes some restrictions on the input bytecode and on the Java source code from which the bytecode is compiled. All local variables in each method must be declared and initialized at the top-level scope in the method body (i.e., declaring local variables in nested scopes is not supported yet), because otherwise local variables do not have unique names (some local variables visible in different scopes may have the same name). Source code of the Java class should not use the increment operator ++ and constructs like data[size++] = o, because they are compiled using the dup instructions that are not fully resolved by the WALA module for bytecode processing.

Note that while the dot notation ("obj.field") can be used to express field accesses in the input predicates, the tool uses function syntax internally to represent field accesses for the purpose of evaluating formulas with a theorem prover.

Output

A generated abstract Java class contains one boolean field for each predicate over static and object fields in the original class. Each method of the abstract class has one boolean parameter for each predicate over parameters of the original method and one local variable of the boolean type for each predicate over local variables of the original method.
For illustration, given the example of predicates' definition above and a fragment of the Java class whose source code is below on the left, the J2BP tool generates an abstract program whose source code (i.e., decompiled bytecode) looks like the code fragment displayed below on the right. The Verify class is a part of JPF API.
  class ArrayList {
    Object[] data;
    int size = 0;

    Object remove(int idx) {
      if (this.size <= idx) return null;
      Object v = data[idx];
      for (int i = idx + 1; i < size; i++) {
        data[i-1] = data[i];
      }
      size--;
      return v;
    }
  }
 
  class ArrayListPA {
    boolean bf1;  // this.size > 0
    boolean bf2;  // this.size = this.data.length

    void remove(boolean bp1) {
      if (bp1) return;

      // omitted abstraction of the for loop

      // abstraction of "size--"
      Verify.beginAtomic();
      if (bf1 && bf2 && bp1 && bv1) {
        bf1 = Verify.getBoolean();
        bf2 = false;
        bp1 = true;
        bv1 = Verify.getBoolean();
      }
      else if (!bf1 && bf2 && bp1 && bv1) {
        bf1 = false;
        ...
      }
      ...
      Verify.endAtomic();
    }
  }
Library methods are completely abstracted away in the default configuration (with the support for collections disabled). Every call of a library method in the program is replaced with a code that sets all predicates over the result variable (in which the return value of the method call is stored) and all predicates over object fields to the non-deterministic boolean value.