-
-
Notifications
You must be signed in to change notification settings - Fork 354
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feature: Add support for dataflow analysis in new module spoon-datafl…
…ow (#2970)
- Loading branch information
Showing
58 changed files
with
8,975 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Copyright (c) 2019 Egor Bredikhin | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
# spoon-dataflow | ||
|
||
spoon-dataflow is a Z3 solver based data-flow analyzer for Java source code. | ||
|
||
## Capabilities | ||
At the moment spoon-dataflow has checkers which allow to detect logical errors (always true/false expressions) and null pointer dereferences. | ||
For example: | ||
```java | ||
void f(boolean c) { | ||
int x = 5; | ||
int y = c ? 3 : 2; | ||
if (x < y) {} // <= always false | ||
} | ||
|
||
void g(int x, int y) { | ||
if (2 * x + 3 * y == 12) { | ||
if (5 * x - 2 * y == 11) { | ||
if (x == 3) {} // <= always true | ||
if (y == 2) {} // <= always true | ||
} | ||
} | ||
} | ||
|
||
void h(Something x) { | ||
if (x == null) { | ||
x.f(); // <= null dereference | ||
} | ||
} | ||
``` | ||
Check out test directory for more examples. | ||
|
||
In general, spoon-dataflow is capable to evaluate expressions statically, perform symbolic execution, handle control flow of a program, and so on. It also features a proper memory model, so it nicely deals with reference aliasing. | ||
|
||
## Build and run | ||
In order to build spoon-dataflow you need JDK 8 or newer. Also, you have to download and install Z3. | ||
|
||
### Windows: | ||
1. Download z3-4.8.4 here: https://github.com/Z3Prover/z3/releases | ||
2. Add bin directory to your PATH environment variable | ||
3. Run `gradlew build` | ||
|
||
Note: Visual C++ 2015 Redistributable may be required. | ||
|
||
### Linux: | ||
1. Download z3-4.8.4 here https://github.com/Z3Prover/z3/releases | ||
2. Add bin directory to your LD_LIBRARY_PATH: `export LD_LIBRARY_PATH=/path/to/z3/bin` | ||
3. Run `./gradlew build` | ||
|
||
### macOS: | ||
1. Download z3-4.8.4 here https://github.com/Z3Prover/z3/releases | ||
2. Add bin directory to your DYLD_LIBRARY_PATH: `export DYLD_LIBRARY_PATH=/path/to/z3/bin` | ||
3. Run `./gradlew build` | ||
|
||
Note that you may still need to set up environment variables or java.lang.path in your IDE. | ||
|
||
Now you can go to the build directory and run the resulting jar: | ||
`java -jar spoon-dataflow.jar -sources <arg...> [-classpath <arg...>]` | ||
|
||
## Under the hood | ||
### AST | ||
spoon-dataflow uses Spoon library to build and traverse AST in a post-order manner. | ||
While traversing the AST, it translates the source code into a form understandable by Z3 solver. Then the checks are performed by the solver. | ||
|
||
### Static single assignment form (SSA) | ||
First of all, in order to be used with Z3, the code should be present in [Static Single Assignment form](https://en.wikipedia.org/wiki/Static_single_assignment_form). We do this to maintain the order of a program in a logic formula. | ||
Here is a simple example: | ||
|
||
Original form: | ||
``` | ||
x = 1; | ||
y = 2; | ||
x = x + y; | ||
``` | ||
SSA form: | ||
``` | ||
x0 = 1; | ||
y0 = 2; | ||
x1 = x0 + y0; | ||
``` | ||
|
||
### Conditions | ||
After traversing if statement, we create a new variable, which is the result of a special if-then-else function provided by Z3. This function essentially joins values from branches over a condition. | ||
|
||
Original form: | ||
``` | ||
if (c) { | ||
x = 1; | ||
} else { | ||
x = 2; | ||
} | ||
``` | ||
SSA form: | ||
``` | ||
if (c) { | ||
x0 = 1; | ||
} else { | ||
x1 = 2; | ||
} | ||
x3 = ITE(c0, x0, x1); | ||
``` | ||
|
||
### Functions and interprocedural analysis | ||
The most obvious approach to deal with function calls is to inline them. Unfortunately, we can do it only for small functions (like getters or setters), because of the performance considerations. | ||
Another approach is to build Function Summaries (automatically or manually), which contain information about contracts, purity, return values and so on. | ||
But in general, any unknown function resets the values of its arguments. | ||
|
||
### Loops | ||
The most obvious approach to deal with loops is to unroll them. Unfortunately, we can do it only when the number of iterations is known statically and relatively small. | ||
So in general, we have to find loop invariants and reset everything else, like for the functions. | ||
However, some special cases could be treated differently. | ||
|
||
### Memory model | ||
spoon-dataflow uses type-indexed memory model (Burstall’s memory model). | ||
In this memory model each data type or field has its own memory array. | ||
It allows to detect something like that: | ||
```java | ||
void m() { | ||
C a = new C(); | ||
a.x = 10; | ||
C b = a; | ||
b.x = 20; | ||
if (a.x == 20) {} // <= always true | ||
} | ||
``` | ||
Here are some useful links with a more detailed explanation of different memory models: | ||
- https://llvm.org/pubs/2009-01-VMCAI-ScalableMemoryModel.pdf | ||
- https://www.researchgate.net/publication/221430855_A_Memory_Model_for_Static_Analysis_of_C_Programs | ||
|
||
|
||
## To Do list | ||
At the moment this project is just a proof of concept, so there is a lot of work to do: | ||
- Interprocedural analysis and Function Summaries; | ||
- Parallel analysis and classes dependency graph; | ||
- Support for the remaining language constructions; | ||
- Better loop analysis; |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
plugins { | ||
id 'java' | ||
} | ||
|
||
group 'fr.inria.spoon' | ||
version '1.0' | ||
|
||
sourceCompatibility = 1.8 | ||
|
||
repositories { | ||
mavenCentral() | ||
maven { url 'https://raw.github.com/SpoonLabs/spoon-dependencies/master' } | ||
} | ||
|
||
dependencies { | ||
compile group: 'fr.inria.gforge.spoon', name: 'spoon-core', version: '7.2.0' | ||
compile group: 'commons-cli', name: 'commons-cli', version: '1.4' | ||
compile group: 'com.microsoft', name: 'z3', version: '4.8.4' | ||
testCompile group: 'junit', name: 'junit', version: '4.12' | ||
} | ||
|
||
task copyDependencies(type: Copy) { | ||
from configurations.compile | ||
into "${buildDir}/libs" | ||
} | ||
|
||
compileJava { | ||
dependsOn copyDependencies | ||
} | ||
|
||
jar { | ||
archiveName 'spoon-dataflow.jar' | ||
destinationDir buildDir | ||
manifest { | ||
attributes 'Main-Class': 'fr.inria.spoon.dataflow.Main' | ||
attributes 'Class-Path': configurations.compile.collect { "libs/" + it.getName() }.join(' ') | ||
} | ||
} |
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#Mon Apr 22 16:55:18 EDT 2019 | ||
distributionBase=GRADLE_USER_HOME | ||
distributionPath=wrapper/dists | ||
zipStoreBase=GRADLE_USER_HOME | ||
zipStorePath=wrapper/dists | ||
distributionUrl=https\://services.gradle.org/distributions/gradle-4.10-all.zip |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,172 @@ | ||
#!/usr/bin/env sh | ||
|
||
############################################################################## | ||
## | ||
## Gradle start up script for UN*X | ||
## | ||
############################################################################## | ||
|
||
# Attempt to set APP_HOME | ||
# Resolve links: $0 may be a link | ||
PRG="$0" | ||
# Need this for relative symlinks. | ||
while [ -h "$PRG" ] ; do | ||
ls=`ls -ld "$PRG"` | ||
link=`expr "$ls" : '.*-> \(.*\)$'` | ||
if expr "$link" : '/.*' > /dev/null; then | ||
PRG="$link" | ||
else | ||
PRG=`dirname "$PRG"`"/$link" | ||
fi | ||
done | ||
SAVED="`pwd`" | ||
cd "`dirname \"$PRG\"`/" >/dev/null | ||
APP_HOME="`pwd -P`" | ||
cd "$SAVED" >/dev/null | ||
|
||
APP_NAME="Gradle" | ||
APP_BASE_NAME=`basename "$0"` | ||
|
||
# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. | ||
DEFAULT_JVM_OPTS="" | ||
|
||
# Use the maximum available, or set MAX_FD != -1 to use that value. | ||
MAX_FD="maximum" | ||
|
||
warn () { | ||
echo "$*" | ||
} | ||
|
||
die () { | ||
echo | ||
echo "$*" | ||
echo | ||
exit 1 | ||
} | ||
|
||
# OS specific support (must be 'true' or 'false'). | ||
cygwin=false | ||
msys=false | ||
darwin=false | ||
nonstop=false | ||
case "`uname`" in | ||
CYGWIN* ) | ||
cygwin=true | ||
;; | ||
Darwin* ) | ||
darwin=true | ||
;; | ||
MINGW* ) | ||
msys=true | ||
;; | ||
NONSTOP* ) | ||
nonstop=true | ||
;; | ||
esac | ||
|
||
CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar | ||
|
||
# Determine the Java command to use to start the JVM. | ||
if [ -n "$JAVA_HOME" ] ; then | ||
if [ -x "$JAVA_HOME/jre/sh/java" ] ; then | ||
# IBM's JDK on AIX uses strange locations for the executables | ||
JAVACMD="$JAVA_HOME/jre/sh/java" | ||
else | ||
JAVACMD="$JAVA_HOME/bin/java" | ||
fi | ||
if [ ! -x "$JAVACMD" ] ; then | ||
die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME | ||
Please set the JAVA_HOME variable in your environment to match the | ||
location of your Java installation." | ||
fi | ||
else | ||
JAVACMD="java" | ||
which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. | ||
Please set the JAVA_HOME variable in your environment to match the | ||
location of your Java installation." | ||
fi | ||
|
||
# Increase the maximum file descriptors if we can. | ||
if [ "$cygwin" = "false" -a "$darwin" = "false" -a "$nonstop" = "false" ] ; then | ||
MAX_FD_LIMIT=`ulimit -H -n` | ||
if [ $? -eq 0 ] ; then | ||
if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then | ||
MAX_FD="$MAX_FD_LIMIT" | ||
fi | ||
ulimit -n $MAX_FD | ||
if [ $? -ne 0 ] ; then | ||
warn "Could not set maximum file descriptor limit: $MAX_FD" | ||
fi | ||
else | ||
warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT" | ||
fi | ||
fi | ||
|
||
# For Darwin, add options to specify how the application appears in the dock | ||
if $darwin; then | ||
GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\"" | ||
fi | ||
|
||
# For Cygwin, switch paths to Windows format before running java | ||
if $cygwin ; then | ||
APP_HOME=`cygpath --path --mixed "$APP_HOME"` | ||
CLASSPATH=`cygpath --path --mixed "$CLASSPATH"` | ||
JAVACMD=`cygpath --unix "$JAVACMD"` | ||
|
||
# We build the pattern for arguments to be converted via cygpath | ||
ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null` | ||
SEP="" | ||
for dir in $ROOTDIRSRAW ; do | ||
ROOTDIRS="$ROOTDIRS$SEP$dir" | ||
SEP="|" | ||
done | ||
OURCYGPATTERN="(^($ROOTDIRS))" | ||
# Add a user-defined pattern to the cygpath arguments | ||
if [ "$GRADLE_CYGPATTERN" != "" ] ; then | ||
OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)" | ||
fi | ||
# Now convert the arguments - kludge to limit ourselves to /bin/sh | ||
i=0 | ||
for arg in "$@" ; do | ||
CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -` | ||
CHECK2=`echo "$arg"|egrep -c "^-"` ### Determine if an option | ||
|
||
if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then ### Added a condition | ||
eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"` | ||
else | ||
eval `echo args$i`="\"$arg\"" | ||
fi | ||
i=$((i+1)) | ||
done | ||
case $i in | ||
(0) set -- ;; | ||
(1) set -- "$args0" ;; | ||
(2) set -- "$args0" "$args1" ;; | ||
(3) set -- "$args0" "$args1" "$args2" ;; | ||
(4) set -- "$args0" "$args1" "$args2" "$args3" ;; | ||
(5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;; | ||
(6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;; | ||
(7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;; | ||
(8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;; | ||
(9) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" "$args8" ;; | ||
esac | ||
fi | ||
|
||
# Escape application args | ||
save () { | ||
for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done | ||
echo " " | ||
} | ||
APP_ARGS=$(save "$@") | ||
|
||
# Collect all arguments for the java command, following the shell quoting and substitution rules | ||
eval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS "\"-Dorg.gradle.appname=$APP_BASE_NAME\"" -classpath "\"$CLASSPATH\"" org.gradle.wrapper.GradleWrapperMain "$APP_ARGS" | ||
|
||
# by default we should be in the correct project dir, but when run from Finder on Mac, the cwd is wrong | ||
if [ "$(uname)" = "Darwin" ] && [ "$HOME" = "$PWD" ]; then | ||
cd "$(dirname "$0")" | ||
fi | ||
|
||
exec "$JAVACMD" "$@" |
Oops, something went wrong.