ByteCue: One Step Further Than Decompilation: Bytecode Comment Generation

This is the source code and dataset for ByteCue.

Quick start

If you want to train on your own dataset, start with Step1; otherwise, skip Step1.

Step1: data preprocess

Place the bytecode, CFG, and comment files under the data folder with the following names:
-train_story.txt
-train_summ.txt
-train_cfg.txt
-train_api_pair.txt
-eval_story.txt
-eval_summ.txt
-eval_cfg.txt
-eval_api_pair.txt
-test_story.txt
-test_summ.txt
-test_cfg.txt
-test_api_pair.txt
Each story and each summary must be on a single line (see the sample text provided).
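
Before running the preprocessing script, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of the ByteCue code base: the function name `check_data_dir` is hypothetical, and it assumes that line i of a `*_story.txt` file pairs with line i of the matching `*_summ.txt` file, so the two files should have the same number of lines.

```python
from pathlib import Path
from typing import List

SPLITS = ("train", "eval", "test")
KINDS = ("story", "summ", "cfg", "api_pair")


def count_lines(path: Path) -> int:
    """Count the lines in a text file without loading it all into memory."""
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def check_data_dir(data_dir: str = "data") -> List[str]:
    """Return a list of problems found; an empty list means the layout looks fine."""
    root = Path(data_dir)
    problems = []
    for split in SPLITS:
        counts = {}
        for kind in KINDS:
            path = root / f"{split}_{kind}.txt"
            if not path.is_file():
                problems.append(f"missing file: {path.name}")
                continue
            if kind in ("story", "summ"):
                counts[kind] = count_lines(path)
        # Assumption: stories and summaries are paired line by line.
        if len(counts) == 2 and counts["story"] != counts["summ"]:
            problems.append(
                f"{split}: story has {counts['story']} lines, "
                f"summ has {counts['summ']} lines"
            )
    return problems
```

Calling `check_data_dir("data")` from the repository root returns one message per missing file or mismatched pair.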

Run preprocess.py
Command: python preprocess.py
This creates three tfrecord files under the datawash folder.
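
A quick way to confirm that preprocessing produced its output is to list the tfrecord files in the datawash folder. This is only a sketch: the helper name `find_tfrecords` is hypothetical, and it assumes the generated file names contain the string "tfrecord", which this README does not state.

```python
from pathlib import Path


def find_tfrecords(out_dir="datawash"):
    """Return the sorted names of files in out_dir whose name contains 'tfrecord'."""
    root = Path(out_dir)
    if not root.is_dir():
        return []
    # Assumption: preprocess.py puts 'tfrecord' somewhere in each output file name.
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and "tfrecord" in p.name.lower()
    )
```

If `len(find_tfrecords())` is not 3 after preprocessing, the step likely failed or wrote its files under different names.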

Step2: train the model

Run main.py
Command: python main.py
The model configuration can be changed in config.py.

Step3: generate comments and test your trained model

First, generate comments for the test set.
Run generateCOMMENT.py
Command: python generateCOMMENT.py

Then, evaluate the generated comments.
Run evaluation.py
Command: python evaluation.py
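
The commands from the three steps can also be chained in a small driver script. The sketch below is an illustration rather than part of the repository: `build_commands` and `run_pipeline` are hypothetical names, and it assumes every script is run from the repository root without extra arguments, as in the commands above.

```python
import subprocess
import sys

# Script names and their order are taken from the steps above.
STEPS = ["preprocess.py", "main.py", "generateCOMMENT.py", "evaluation.py"]


def build_commands(skip_preprocess=False, python=sys.executable):
    """Return one command (as an argument list) per pipeline step."""
    steps = STEPS[1:] if skip_preprocess else STEPS
    return [[python, script] for script in steps]


def run_pipeline(skip_preprocess=False, dry_run=False):
    """Run the steps in order; with dry_run=True only print the commands."""
    for command in build_commands(skip_preprocess):
        print("running:", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError and stops at the first failing step.
            subprocess.run(command, check=True)
```

Use `run_pipeline(skip_preprocess=True)` when Step1 is skipped, and `run_pipeline(dry_run=True)` to preview the commands without running them.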

Because of LFS limitations, the dataset is hosted separately and can be downloaded from https://drive.google.com/drive/folders/1z0xh0KOFB8V-9LQmE0BTJyXkUU_t3kYD?usp=sharing. Unzip the downloaded .zip file, which contains the folders 'datawash', 'scripts', 'texar_repos', 'venv', and 'pretrained_model', then move these folders to the ByteCue root directory.
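
The unzip-and-move step can be scripted. The following sketch uses only the Python standard library; `install_dataset` and the scratch directory `_bytecue_unzip` are hypothetical names, and the code assumes the folders sit either at the top level of the archive or one level below it, since the archive layout is not described here.

```python
import shutil
import zipfile
from pathlib import Path

# Folder names as listed in the paragraph above.
FOLDERS = ("datawash", "scripts", "texar_repos", "venv", "pretrained_model")


def install_dataset(zip_path, repo_root=".", folders=FOLDERS):
    """Extract zip_path and move the listed folders into repo_root.

    Returns the names of the folders that were moved.
    """
    repo_root = Path(repo_root)
    staging = repo_root / "_bytecue_unzip"  # hypothetical scratch directory
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(staging)
    moved = []
    for name in folders:
        # Assumption: folders are at the archive top level or one level down.
        candidates = [staging / name] + sorted(staging.glob(f"*/{name}"))
        source = next((c for c in candidates if c.is_dir()), None)
        target = repo_root / name
        if source is None or target.exists():
            continue  # nothing to move, or avoid overwriting an existing folder
        shutil.move(str(source), str(target))
        moved.append(name)
    shutil.rmtree(staging)
    return moved
```

Existing folders in the root directory are left untouched, so the function is safe to rerun after a partial download.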
