CatenaD4J (c4j) is a dataset for evaluating existing techniques on repairing indivisible multi-hunk bugs. This repository also contains an implementation of a tool for detecting and creating indivisible bugs.
c4j works like a plugin for other datasets and currently uses Defects4J as its default backend, because c4j contains many bugs generated from Defects4J. It is nevertheless easy to switch the backend or to extend its commands and bugs.
Note
In our experiments, we discovered some flaky bugs in the Mockito project within this dataset. These bugs produced varying test results when run in different environments (e.g. sequential versus parallel execution, or under high CPU usage). We will be working to determine the reasons. For now, we recommend avoiding the bugs in the Mockito project.
CatenaD4J now contains 6 projects and 367 bugs generated from Defects4J.
-
The dataset consists of original Defects4J (d4j) bugs that are indivisible, and of new isolated bugs into which divisible d4j bugs have been split.
-
Each bug has its own new failing tests, each of which contains only a single valid assertion statement, so that techniques and debuggers cannot detect repair effects from a single failing test, which reflects a real debugging scenario.
-
All bugs in the dataset are isolated and minimal. We call them catena bugs: bugs that are catenated, i.e. that consist of hunks depending on each other, so that the bug is fixed only when all of its hunks are fixed.
-
All bugs are assigned a catena_id (cid). To find a specific bug, use the project_name that the bug belongs to, the bug_id that indicates its source bug, and the cid that distinguishes bugs generated from the same source bug.
A bug version is identified by the tag <bug_id><b/f><cid>, where b/f means the buggy/fixed version.
You can check all available cids for each bug_id in the file ./projects/<project_name>/bugs-registry.csv
Every line in bugs-registry represents a catena bug and conforms to the format <project_name>, <bug_id>, <cid>, <loader>.
Each valid bug must have an entry in bugs-registry so that the dataset knows how to check it out.
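As a rough illustration of the identifiers described above, the sketch below parses registry lines in the `<project_name>, <bug_id>, <cid>, <loader>` format and builds the `<bug_id><b/f><cid>` tag. The names `CatenaBug`, `parse_registry` and `version_tag` are hypothetical helpers that are not part of catena4j, the sample rows are made up for the example, and the sketch assumes the registry has no header row.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class CatenaBug:
    """One registry line (hypothetical helper type, not part of catena4j)."""
    project_name: str
    bug_id: str
    cid: str
    loader: str


def parse_registry(text):
    """Parse registry text; each non-empty line describes one catena bug."""
    bugs = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 fields, got {row!r}")
        bugs.append(CatenaBug(*(field.strip() for field in row)))
    return bugs


def version_tag(bug, fixed=False):
    """Build the <bug_id><b/f><cid> tag, e.g. 15b1 for a buggy version."""
    return f"{bug.bug_id}{'f' if fixed else 'b'}{bug.cid}"


# Illustrative rows only; read the real file from
# ./projects/<project_name>/bugs-registry.csv instead.
sample = "Chart, 15, 1, default\nChart, 15, 2, default\n"
print([version_tag(bug) for bug in parse_registry(sample)])  # ['15b1', '15b2']
```

The resulting tag is what the `-v` option of `catena4j checkout` expects.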
-
Python3
It is already installed on most Linux distributions and macOS. If your OS does not provide python3, see the official Python website for installation.
-
Defects4J v2.0
Check defects4j for installation.
-
Java 1.8
Check JDK 8 for installation.
-
Ensure you have docker installed on your computer
If you have no docker cli available, check Install Docker Engine for installation.
-
Check if curl is available
If you have no curl installed on your computer, check Install curl for installation.
See the curl man page for usage and troubleshooting.
-
Download the Dockerfile
curl https://raw.githubusercontent.com/universetraveller/CatenaD4J/main/Dockerfile -o Dockerfile
You can also download the Dockerfile in other ways (e.g. directly from this repo).
-
Build the docker image via Dockerfile
docker build -t catena4j:main -f ./Dockerfile .
-
Create a container with CatenaD4J using the built image
docker run -it catena4j:main /bin/bash
-
Ensure that all software listed in the Requirements section is installed.
-
Clone the repository
git clone https://github.com/universetraveller/CatenaD4J.git
-
Add executable script catena4j to environment variable PATH:
export PATH=$PATH:<path to this repo>
-
Check installation:
catena4j pids
Note that the script catena4j assumes the command python3 is usable; otherwise, edit the first line of the script so that it points to the path of your Python executable.
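Before running `catena4j pids`, it can help to confirm that the required executables are visible on PATH. The following is a minimal sketch under the assumption that the tools are invoked as `python3`, `java`, `defects4j` and `catena4j`; the helper name `missing_tools` is hypothetical and is not shipped with this repository.

```python
import shutil

# Executable names assumed from the Requirements and installation steps.
REQUIRED_TOOLS = ("python3", "java", "defects4j", "catena4j")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so that the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.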
Commands | Description |
---|---|
checkout | Check out the specific version of a bug |
export | Export specific information of a bug |
reset | Reset all unstaged modifications in a working directory |
pids | Print available project_names |
bids | Print available bug_ids that contain at least one cid |
cids | Print available catena ids for a bug_id |
info | Not implemented now |
test | Not implemented now |
compile | Not implemented now |
ver | Not implemented now |
If you pass a command that is not implemented to catena4j, the script forwards it to the backend and tries to run it there.
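The fallback behaviour can be pictured with the small sketch below. It only models the routing decision; the function name `route` and the assumption that arguments are forwarded unchanged to a `defects4j` executable are illustrative and may not match the actual script.

```python
# Commands listed as implemented in the table above.
IMPLEMENTED = {"checkout", "export", "reset", "pids", "bids", "cids"}


def route(command, argv, backend="defects4j"):
    """Decide which executable would handle a command.

    Returns a pair (executable, arguments). Implemented commands stay in
    catena4j; anything else is handed to the backend.
    """
    if command in IMPLEMENTED:
        return ("catena4j", [command, *argv])
    # Assumption: unknown commands are forwarded unchanged to the backend.
    return (backend, [command, *argv])


print(route("cids", ["-p", "Chart", "-b", "15"]))
print(route("test", ["-w", "./buggy"]))
```

With this model, `test`, `compile`, `info` and `ver` end up at the backend, while the six implemented commands are handled locally.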
- checkout bugs
catena4j checkout -p <project_name> -v <bug_id of defects4j><'b' or 'f'><cid of catena4j> -w <working_dir>
example: catena4j checkout -p Chart -v 15b1 -w ./buggy
- export tests.trigger or classes.modified
catena4j export -p <property_name> -w <working_dir> -o <output_file>
- reset working directory
catena4j reset [-w <working_dir>]
- print available ids
catena4j pids
catena4j bids -p <project_name>
catena4j cids -p <project_name> -b <bug_id>
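When driving the commands above from a script, it may be convenient to assemble the argument lists programmatically. The sketch below uses only the options shown in this section; the helper names `checkout_argv` and `export_argv` are hypothetical, and nothing is executed unless the list is passed to something like `subprocess.run`.

```python
import shlex


def checkout_argv(project_name, bug_id, cid, working_dir, fixed=False):
    """Build the argument list for `catena4j checkout`."""
    version = f"{bug_id}{'f' if fixed else 'b'}{cid}"
    return ["catena4j", "checkout", "-p", project_name,
            "-v", version, "-w", working_dir]


def export_argv(property_name, working_dir, output_file):
    """Build the argument list for `catena4j export`."""
    return ["catena4j", "export", "-p", property_name,
            "-w", working_dir, "-o", output_file]


argv = checkout_argv("Chart", 15, 1, "./buggy")
print(shlex.join(argv))  # catena4j checkout -p Chart -v 15b1 -w ./buggy
```

To actually run the command, pass the list to `subprocess.run(argv, check=True)` on a machine where catena4j is installed.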
We did not fully re-implement the checkout command of defects4j, so the default loader currently uses defects4j to check out original bugs. In addition, we have not implemented the test and compile commands, because defects4j already supports them.
However, defects4j is a complex system and works inefficiently. It calls perl, ant, java and other commands, so it runs considerably slower than using these commands directly. It would be better to re-implement defects4j's checkout, test and compile commands based on git and java.
In the future we will try to implement these commands and replace the current defects4j backend with lightweight ones where possible.
It is easy to take control of the dataset because all components are designed to be replaceable: you can design your own commands or change the behaviour of existing ones.
The implementation of the dataset is in the folder internal.
Loaders are the actual executors of the commands, and every bug must specify its loader in bugs-registry so that its information can be loaded.
Command entries are the entry points of commands. The command-line interface looks up the command implementation in command-entries using the command name given as input, and then passes all args to that implementation.
To add a custom command, implement it as a Python function that processes the args namespace, add the function to command-entries under a command name, and add this command name to the config file. The custom command can then be used from the command line.
The cids command is an example of creating a custom command. First, we implement the command in ids.py as def CIDS(args); then we add it to entries.py, either by adding the function to __entry_map or by calling the function register_command. After that, running catena4j cids prints the available catena ids.
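The registration flow can be sketched as follows. This is a self-contained model rather than the code in entries.py: `ENTRY_MAP` and the two-argument `register_command(name, func)` below are stand-ins written for this example, and the real signatures may differ. The `hello` command and its `-p` option are likewise hypothetical.

```python
import argparse

# Stand-in for the command table kept in entries.py (shape assumed).
ENTRY_MAP = {}


def register_command(name, func):
    """Hypothetical signature: map a command name to its implementation."""
    ENTRY_MAP[name] = func


def hello(args):
    """Example custom command; it receives the parsed args namespace."""
    return f"hello from project {args.p}"


register_command("hello", hello)

# Model of what the command-line interface does: parse the arguments,
# look the command up by name, and pass the namespace to it.
parser = argparse.ArgumentParser()
parser.add_argument("-p")
args = parser.parse_args(["-p", "Chart"])
print(ENTRY_MAP["hello"](args))  # hello from project Chart
```

Keeping commands as plain functions of the args namespace is what lets a new command be added without touching the command-line interface itself.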
To take control of bug loading, implement a custom loader as a Python class. Add the new loader to loaders.py and change the loader specified for a bug in bugs-registry; the script will then use the custom loader to load that bug.
Every loader must implement the functions load, fix and get_attr (see default_loader for their arguments).
Catena4J uses DefaultPathLoader as its default loader. It loads a bug's information from specifically formatted paths. By creating these files and adding a new entry to bugs-registry with the loader set to default, we can create a new active bug.
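The outline of a custom loader might look like the sketch below. Only the three method names come from this README; the class name `RecordingLoader`, the argument lists, and the comments about what each method does are assumptions made for illustration, so consult default_loader for the real interface.

```python
class RecordingLoader:
    """Hypothetical loader exposing load, fix and get_attr.

    It records calls instead of touching the file system, which makes the
    expected call pattern easy to inspect.
    """

    def __init__(self, attrs=None):
        self.attrs = dict(attrs or {})
        self.calls = []

    def load(self, project_name, bug_id, cid, working_dir):
        # Assumed role: prepare the buggy version in working_dir.
        self.calls.append(("load", project_name, bug_id, cid, working_dir))

    def fix(self, project_name, bug_id, cid, working_dir):
        # Assumed role: apply the fixing hunks in working_dir.
        self.calls.append(("fix", project_name, bug_id, cid, working_dir))

    def get_attr(self, name):
        # Assumed role: look up an exported property such as tests.trigger.
        return self.attrs.get(name)


# Placeholder attribute value, not taken from the dataset.
loader = RecordingLoader({"tests.trigger": "ExampleTest::testCase"})
loader.load("Chart", "15", "1", "./buggy")
print(loader.calls)
print(loader.get_attr("tests.trigger"))
```

A real loader would replace the recording logic with the actual checkout and patching steps, and would be referenced by name in the loader column of bugs-registry.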
The folder scripts contains all experiment code used to construct this dataset.
Please check the README of scripts for information and guidance on reproducing the experiments.
Check here for the statistics of current bugs
The current version is available. We will try to make this dataset more concrete and add some features in the future.
The development plan is listed below. However, because the work is time-consuming, some updates will only be developed when we have free time. Note that urgent updates (e.g. bug fixes) will be prioritized and addressed as fast as possible.
- An implementation of a faster test command using a custom test runner with an abort-on-failure feature that supports up to JUnit 5.
- An implementation of a faster checkout command to replace the current version (which uses defects4j's checkout).
- Adding a fast and usable code coverage tool.
- Adding a fast and usable spectrum-based fault localization tool.
- Complete replacement of the defects4j backend by re-implementing all used commands.
--------------------------------------------------------------------------------------------------------
File tree                          | Introduction
--------------------------------------------------------------------------------------------------------
.                                  | Root path of the repo
|-- Dockerfile                     | Script for docker to build an image containing this repo
|-- LICENSE                        | Open-source license file
|-- README.md                      | This file
|-- catena4j                       | Executable script of the dataset
|-- internal                       | Implementation of the dataset
|-- projects                       | Bugs data of the dataset
`-- scripts                        | Scripts for reproducing experiments in our FSE 2024 paper
    |-- Dockerfile.experiments     | Script for docker to build an image used for reproducing experiments
    |-- README.md                  | Steps and guides for reproducing experiments
    |-- analyze_tests              | Extract assertion-related identifiers from trigger tests
    |-- build.sh                   | Script to build the image for experiments in a single step
    |-- construct_database         | Prepare bugs and metadata for the experiments
    |-- generate_bugs              | Algorithm to detect and create indivisible bugs
    |-- install_requirements.sh    | Script used to install dependencies of experiments
    |-- parse_patches              | Extract hunk information of bugs
    `-- split_tests                | Algorithm for test minimization
--------------------------------------------------------------------------------------------------------
- Q. Xin, H. Wu, J. Tang, X. Liu, S. Reiss and J. Xuan. Detecting, Creating, Evaluating, and Understanding Indivisible Multi-Hunk Bugs. FSE 2024.