This repository contains the experiment code for DeepProcess, an implementation for investigating sequence modeling and deep-neural-network-based process prediction. The files included here support experiments on sequence-to-sequence models, the Differentiable Neural Computer (DNC), and auxiliary modules for experimentation.
The DeepProcess project explores how deep learning models can be applied to process prediction problems. The implementation includes experiments with:
- Sequence-to-Sequence (Seq2Seq) models for baseline analysis.
- Prefix-Suffix models to predict sequences based on partial data.
- The Differentiable Neural Computer (DNC) for advanced memory and processing tasks.
The main experiment code is organized as follows:
- `baseline_seq2seq.py` # Baseline Seq2Seq model implementation
- `presuf_train.py` # Training script for prefix-suffix models
- `presuf_run.py` # Execution script for evaluating prefix-suffix models
- `controller.py` # Controller class used in the DNC
- `dnc.py` # Differentiable Neural Computer implementation
- `feedforward_controller.py` # Feedforward controller for the DNC
- `recurrent_controller.py` # Recurrent controller for the DNC
- `memory.py` # Memory module for the DNC
- `seq_helper.py` # Helper functions for sequence operations
- `utility.py` # Utility functions used throughout the project
To set up the environment for running these experiments, follow these steps:

1. Clone the repository:

   ```bash
   git clone https://github.com/asjad99/DeepProcess
   cd DeepProcess/deep_process_experiment_codes
   ```

2. Create a virtual environment and activate it:

   ```bash
   python -m venv deepprocess-env
   source deepprocess-env/bin/activate  # On Windows: deepprocess-env\Scripts\activate
   ```

3. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```
- Data for each experiment can be found in the `./data/BusinessProcess` folder.
- The file `presuf_run.py` contains code for 3 experiments.
- In `presuf_run.py`, there are train and test functions for each task; call the appropriate one for your experiment (see the sketch below).
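For example, a run might look like the following sketch. The names `train_task` and `test_task` are hypothetical placeholders; use the actual per-task train/test functions defined in `presuf_run.py`.

```python
# Hypothetical sketch -- the real per-task function names are in presuf_run.py.
from presuf_run import train_task, test_task  # placeholder names

if __name__ == "__main__":
    train_task()  # train the prefix-suffix model for the chosen experiment
    test_task()   # evaluate the trained model
```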
To train the baseline sequence-to-sequence model, use the following command:

```bash
python baseline_seq2seq.py
```

This script provides a foundational sequence-to-sequence model for comparison with the more advanced models in the repository.
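For orientation, the encoder-decoder idea behind such a baseline can be sketched in a few lines. The snippet below is a self-contained PyTorch illustration only, not the repository's implementation (which has its own architecture and attention mechanism); all dimensions are placeholder values.

```python
# Minimal encoder-decoder sketch (illustrative only; not the repo's model).
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        # Encode the source prefix into a final hidden state ...
        _, h = self.encoder(self.emb(src))
        # ... and condition the decoder on it to emit the suffix.
        dec_out, _ = self.decoder(self.emb(tgt), h)
        return self.out(dec_out)  # logits over the activity vocabulary

model = TinySeq2Seq(vocab_size=20)
src = torch.randint(0, 20, (4, 7))  # batch of 4 prefixes, length 7
tgt = torch.randint(0, 20, (4, 5))  # teacher-forced suffix inputs
logits = model(src, tgt)            # shape: (4, 5, 20)
```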
- In each function, hyperparameters are hard-coded.
- You can directly edit the hyperparameters in the function definitions to change their values.
- Method Type (edit in the constructor's arguments; see the sketch after this list):
  - LSTM Seq2Seq: `use_mem=False`
  - DNC: `use_mem=True`, `decoder_mode=True/False`, `dual_controller=False`, `write_protect=False`
  - DC-MANN: `use_mem=True`, `decoder_mode=True`, `dual_controller=True`, `write_protect=False`
  - DCw_MANN: `use_mem=True`, `decoder_mode=True/False`, `dual_controller=True`, `write_protect=True`
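The four method types differ only in these flags. One compact way to see the combinations (the flag names come from the constructor; the dict layout is just for illustration, and `decoder_mode` may be either value where both are listed above):

```python
# Flag combinations for each method type (illustrative summary).
METHOD_FLAGS = {
    "LSTM Seq2Seq": dict(use_mem=False),
    "DNC":          dict(use_mem=True, decoder_mode=False, dual_controller=False, write_protect=False),
    "DC-MANN":      dict(use_mem=True, decoder_mode=True,  dual_controller=True,  write_protect=False),
    "DCw_MANN":     dict(use_mem=True, decoder_mode=True,  dual_controller=True,  write_protect=True),
}
```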
- Model Parameters (edit in the constructor's arguments):
  - `use_emb=True/False`: whether to use an embedding layer.
  - `dual_emb=True/False`: if using an embedding layer, whether the encoder and decoder share one embedding or use two separate ones.
  - `hidden_controller_dim`: dimension of the controller's hidden state.
- Memory Parameters (if using memory):
  - `words_count`: number of memory slots.
  - `word_size`: size of each memory slot.
  - `read_heads`: number of read heads.
- Training Parameters (a combined constructor sketch follows this list):
  - `batch_size`: number of sequences sampled per batch.
  - `iterations`: maximum number of training steps.
  - `lm_train=True/False`: train in the language-model style (edit in the `prepare_sample_batch` function).
  - Optimizer: set in `dnc.py`, in the `build_loss_function_mask` function (the default is Adam).
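Putting these together, an instantiation might look roughly like the sketch below. The exact constructor signature is defined in `dnc.py`; the positional controller argument and all numeric values here are placeholders for illustration.

```python
# Hypothetical sketch -- check the constructor in dnc.py for the real signature.
from dnc import DNC
from recurrent_controller import RecurrentController

model = DNC(
    RecurrentController,  # controller class (placeholder position)
    # method type: DCw_MANN
    use_mem=True, decoder_mode=True, dual_controller=True, write_protect=True,
    # model parameters
    use_emb=True, dual_emb=False, hidden_controller_dim=256,
    # memory parameters (placeholder sizes)
    words_count=64, word_size=32, read_heads=1,
    # training parameters
    batch_size=16,
)
```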
- The current hyperparameters were chosen based on experience from other projects.
- Apart from the method types themselves, other hyperparameter combinations have not been extensively tested.
- File: `baseline_seq2seq.py`
- Description: Implements a standard Seq2Seq model to serve as a baseline for comparing the more advanced sequence models. The script includes a basic encoder-decoder architecture and attention mechanisms.
- Files: `presuf_train.py`, `presuf_run.py`
- Description: The `presuf_train.py` script trains models that predict suffixes from given prefixes; `presuf_run.py` then evaluates these models on different datasets. Together, these scripts produce meaningful sequence predictions from partial sequences.
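To make the task concrete: given a partially executed process trace, the model must predict the remaining activities. A minimal illustration of the prefix/suffix split (pure Python, with made-up activity names, independent of the repository's data pipeline):

```python
# A completed trace of process activities ...
trace = ["register", "review", "approve", "notify", "archive"]

# ... is split into an observed prefix and the suffix to be predicted.
prefix_len = 2
prefix, suffix = trace[:prefix_len], trace[prefix_len:]

print(prefix)  # ['register', 'review']            <- model input
print(suffix)  # ['approve', 'notify', 'archive']  <- prediction target
```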
The implementation of the Differentiable Neural Computer (DNC) is broken down into several components:

- Controller Modules:
  - `controller.py`: base controller class for managing the learning process.
  - `feedforward_controller.py`: feedforward controller used to manipulate the memory.
  - `recurrent_controller.py`: recurrent controller that adds sequence-dependent memory updates.
- Memory Management:
  - `memory.py`: implements the memory module responsible for reading and writing data for the DNC (a simplified sketch of content-based reading follows this list).
- DNC Core:
  - `dnc.py`: the main implementation of the Differentiable Neural Computer, including operations for managing external memory and interacting with controllers.
- Helper Functions:
  - `seq_helper.py`: utility functions that assist with sequence preprocessing and related tasks.
  - `utility.py`: general-purpose utility functions used throughout the project.
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to explore the individual scripts to understand their contributions to the overall sequence modeling experiments. If you have any questions or suggestions, please open an issue or submit a pull request!