This is a basic Software Fault Prediction (SFP) program, written in Java.
It uses a Naive Bayes Classifier for predicting software faults.
A custom, rather small dataset is used (dataset.txt).
The dataset is structured as follows. Note that the 'module' column is not part of the dataset; it is shown only to make the structure easier to understand.

module | f1 | f2 | ... | fn | Faulty |
---|---|---|---|---|---|
1 | m1_f1 | m1_f2 | ... | m1_fn | 0 or 1 |
2 | m2_f1 | m2_f2 | ... | m2_fn | 0 or 1 |
... | ... | ... | ... | ... | 0 or 1 |
k | mk_f1 | mk_f2 | ... | mk_fn | 0 or 1 |

Where:
- f1 stands for feature 1, f2 for feature 2, and so on
- m1_f1 stands for the first feature of the first module
- n is the total number of features for each module
- k is the total number of modules in the dataset (the total number of lines)
- A module is either faulty (= 1) or non-faulty (= 0)
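
To make the row layout concrete, here is a minimal sketch of how one dataset line could be split into its n feature values and its Faulty label. It is not taken from the project: the class name `DatasetLineSketch`, its methods, and the assumption that values are comma-separated are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the project.
// Assumes comma-separated values with the Faulty label (0 or 1) in the last column.
class DatasetLineSketch {

    /** Returns the n feature values of a line, i.e. every column except the last. */
    static double[] parseFeatures(String line) {
        String[] parts = line.trim().split("\\s*,\\s*");
        double[] features = new double[parts.length - 1];
        for (int i = 0; i < features.length; i++) {
            features[i] = Double.parseDouble(parts[i]);
        }
        return features;
    }

    /** Returns the label from the last column: 1 = faulty, 0 = non-faulty. */
    static int parseLabel(String line) {
        String[] parts = line.trim().split("\\s*,\\s*");
        int label = Integer.parseInt(parts[parts.length - 1]);
        if (label != 0 && label != 1) {
            throw new IllegalArgumentException("Label must be 0 or 1, got: " + label);
        }
        return label;
    }

    public static void main(String[] args) {
        String line = "1.5, 20, 3, 1"; // made-up example values, not from dataset.txt
        System.out.println(Arrays.toString(parseFeatures(line)) + " -> " + parseLabel(line));
    }
}
```

If the actual files use a different separator, only the regular expression passed to `split` would need to change.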
The equations used are listed below.
Equation 1: Naive Bayes Classifier
Equation 2: Standard Deviation
Equation 3: Gaussian Distribution Function
Equation 4: Category Specific Set Size
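
As an illustration of how these equations might fit together, the following is a minimal Gaussian Naive Bayes sketch. It is not the project's implementation, and the class and method names are hypothetical. It assumes the sample standard deviation (dividing by n - 1), takes the category-specific set size to mean the number of modules in each category (used here for the class prior), and compares log scores rather than raw products to avoid numeric underflow.

```java
// Minimal Gaussian Naive Bayes sketch; all names are hypothetical.
class GaussianNaiveBayesSketch {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) sum += v;
        return sum / values.length;
    }

    // Sample standard deviation (divides by n - 1); this choice is an assumption.
    static double standardDeviation(double[] values) {
        double mu = mean(values);
        double sumSq = 0.0;
        for (double v : values) sumSq += (v - mu) * (v - mu);
        return Math.sqrt(sumSq / (values.length - 1));
    }

    // Gaussian probability density of x for the given mean and standard deviation.
    static double gaussian(double x, double mu, double sigma) {
        double z = (x - mu) / sigma;
        return Math.exp(-0.5 * z * z) / (sigma * Math.sqrt(2.0 * Math.PI));
    }

    // Log score of one category: log prior plus the sum of log feature densities.
    // classRows holds the training modules of that category, one row per module.
    static double logScore(double[] x, double[][] classRows, int totalRows) {
        double score = Math.log((double) classRows.length / totalRows);
        for (int j = 0; j < x.length; j++) {
            double[] column = new double[classRows.length];
            for (int i = 0; i < classRows.length; i++) column[i] = classRows[i][j];
            score += Math.log(gaussian(x[j], mean(column), standardDeviation(column)));
        }
        return score;
    }

    // Returns 1 (faulty) if the faulty category scores higher, otherwise 0.
    static int predict(double[] x, double[][] faulty, double[][] nonFaulty) {
        int total = faulty.length + nonFaulty.length;
        return logScore(x, faulty, total) > logScore(x, nonFaulty, total) ? 1 : 0;
    }

    public static void main(String[] args) {
        // Made-up training data: two features per module.
        double[][] nonFaulty = {{1.0, 2.0}, {2.0, 3.0}, {3.0, 4.0}};
        double[][] faulty = {{8.0, 9.0}, {9.0, 10.0}, {10.0, 11.0}};
        System.out.println(predict(new double[]{2.0, 3.0}, faulty, nonFaulty));
        System.out.println(predict(new double[]{9.0, 10.0}, faulty, nonFaulty));
    }
}
```

The sketch does not handle a feature whose standard deviation is zero within a category; a real implementation would need a guard or a smoothing term for that case.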
First compile the project:
javac model/SFPModel.java
Then run it:
java -cp . model.SFPModel
A second dataset file, "nasa_cm1_dataset.txt", is also included. It contains NASA's CM1 dataset, taken from the PROMISE repository.
Full link: http://promise.site.uottawa.ca/SERepository/datasets/cm1.arff
The dataset was slightly modified: the last value of each module (the fault label) was converted from a boolean to an integer (True = 1, False = 0), which makes it easier to process.
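
The label conversion could be reproduced with something like the sketch below. It is a hypothetical helper, not the script that was actually used, and it assumes each data line is comma-separated with the boolean label in the last position.

```java
// Hypothetical helper, not part of the project.
// Rewrites the last comma-separated value of a data line from true/false to 1/0.
class LabelConversionSketch {

    static String convertLabel(String line) {
        int lastComma = line.lastIndexOf(',');
        String label = line.substring(lastComma + 1).trim();
        String converted;
        if (label.equalsIgnoreCase("true")) {
            converted = "1";
        } else if (label.equalsIgnoreCase("false")) {
            converted = "0";
        } else {
            throw new IllegalArgumentException("Unexpected label: " + label);
        }
        // Keep everything up to and including the last comma, then append the integer label.
        return line.substring(0, lastComma + 1) + converted;
    }

    public static void main(String[] args) {
        // Made-up example line, not copied from the CM1 dataset.
        System.out.println(convertLabel("1.1,1.4,24,false"));
    }
}
```

Header and attribute lines of the original ARFF file would have to be skipped or removed separately, since only data lines end in a label.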