This is the source code and dataset for ByteCue. The dataset is saved in the datawash/data
folder.
If you want to train on your own dataset, start with step 1; otherwise, skip step 1.
Place the bytecode, CFG, and comment files under the data folder with the following names:
-train_story.txt
-train_summ.txt
-train_cfg.txt
-train_api_pair.txt
-eval_story.txt
-eval_summ.txt
-eval_cfg.txt
-eval_api_pair.txt
-test_story.txt
-test_summ.txt
-test_cfg.txt
-test_api_pair.txt
Each story and summary must be on a single line (see the sample text given).
Run preprocess.py
Command: python preprocess.py
This will create three tfrecord files under the datawash folder.
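Before running preprocess.py, it can help to confirm that the input files are in place. The sketch below is not part of the ByteCue code; the function names (expected_files, count_lines, check_data_dir) are hypothetical, and the data folder path is passed in by the caller. It assumes that line i of a story file pairs with line i of the matching summary file, so the two files should have the same number of lines.

```python
from pathlib import Path

# Split and file-kind names taken from the file list in this README.
SPLITS = ("train", "eval", "test")
KINDS = ("story", "summ", "cfg", "api_pair")


def expected_files(data_dir):
    """Return the twelve file paths this README lists, in order."""
    root = Path(data_dir)
    return [root / f"{split}_{kind}.txt" for split in SPLITS for kind in KINDS]


def count_lines(path):
    """Count the lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_data_dir(data_dir):
    """Return a list of readable problems; an empty list means no problem found."""
    problems = [
        f"missing file: {path.name}"
        for path in expected_files(data_dir)
        if not path.is_file()
    ]
    root = Path(data_dir)
    for split in SPLITS:
        story = root / f"{split}_story.txt"
        summ = root / f"{split}_summ.txt"
        # Assumption: one story per line and one summary per line, paired by index.
        if story.is_file() and summ.is_file():
            n_story, n_summ = count_lines(story), count_lines(summ)
            if n_story != n_summ:
                problems.append(
                    f"{split}: story has {n_story} lines, summ has {n_summ}"
                )
    return problems
```

Calling check_data_dir with the path of your data folder returns an empty list when all twelve files exist and the story and summary line counts agree.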
Run main.py
Command: python main.py
Configurations for the model can be changed in the config.py file.
- First, generate comments for the test set
Run generateCOMMENT.py
Command: python generateCOMMENT.py
- Then, evaluate the generated comments
Run evaluation.py
Command: python evaluation.py
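The commands above can also be chained from a single script. The following is a minimal sketch, not part of the repository: build_commands and run_pipeline are hypothetical helpers, the script names and their order are taken from this README, and the skip_preprocess flag assumes that "step 1" refers to the preprocess.py step.

```python
import subprocess
import sys

# Script names and order as listed in this README.
STEPS = ["preprocess.py", "main.py", "generateCOMMENT.py", "evaluation.py"]


def build_commands(steps=STEPS, skip_preprocess=False, python=sys.executable):
    """Build one argument list per script, optionally leaving out preprocess.py."""
    selected = [
        step for step in steps
        if not (skip_preprocess and step == "preprocess.py")
    ]
    return [[python, step] for step in selected]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        # With subprocess.run, check=True raises CalledProcessError
        # when a script exits with a non-zero status.
        runner(command, check=True)
```

From the folder that contains the scripts, run_pipeline(build_commands()) would run all four steps, and run_pipeline(build_commands(skip_preprocess=True)) would start at main.py.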