A TensorFlow implementation of Andrej Karpathy's Char-RNN, a character-level language model using a multilayer recurrent neural network (RNN, LSTM or GRU). See his article The Unreasonable Effectiveness of Recurrent Neural Networks to learn more about this model.
- Python 2.7
- TensorFlow >= 0.7.0
- NumPy >= 1.10.0
Follow the instructions on the official TensorFlow website to install TensorFlow.
If you use their pip installation:
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py2-none-any.whl
The pip installation also installs the other necessary packages (including NumPy) for you.
If the installation finishes without errors, quickly test it by running:
python train.py --data_file=data/tiny_shakespeare.txt --num_epochs=10 --test
This will train char-rnn on the first 1000 characters of the tiny shakespeare corpus. The final train/valid/test perplexities should all be lower than 30.
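For reference, perplexity is the exponential of the average per-character cross-entropy (in nats), so a value below 30 means the model is, on average, about as uncertain as if it were choosing uniformly among fewer than 30 characters. The sketch below only illustrates that relationship; the function name `perplexity` is made up for this example and is not taken from train.py.

```python
import math


def perplexity(char_probs):
    """Perplexity from the probabilities a model assigned to each true next character."""
    # Average negative log-likelihood (cross-entropy in nats) per character.
    nll = -sum(math.log(p) for p in char_probs) / len(char_probs)
    return math.exp(nll)


# A model that always gives the correct character probability 1/30
# has a perplexity of 30, up to floating-point rounding.
print(perplexity([1.0 / 30] * 100))
```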
- train.py is the script for training.
- sample.py is the script for sampling.
- char_rnn_model.py implements the Char-RNN model.
To train on the tiny shakespeare corpus (included in data/) with default settings (this might take a while):
python train.py --data_file=data/tiny_shakespeare.txt
All output of this experiment is saved in a folder (output/ by default; you can specify the folder name using --output_dir=your-output-folder).
The experiment log is printed to stdout by default. To direct the log to a file instead, use --log_to_file (it is then saved in your-output-folder/experiment_log.txt).
The output folder layout:
your-output-folder
├── result.json # results (best validation and test perplexity) and experiment parameters.
├── vocab.json # vocabulary extracted from the data.
├── experiment_log.txt # your experiment log if you used --log_to_file in training.
├── tensorboard_log # folder containing logs for TensorBoard visualization.
├── best_model # folder containing the saved best model (based on validation set perplexity).
└── saved_model # folder containing the latest saved models (for continuing training).
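The JSON files in this layout can be inspected with a few lines of Python. This is a minimal sketch, assuming result.json holds a single JSON object at the top level; the helper names `load_json` and `summarize` are hypothetical, and the actual keys in the file are not listed here, so the code simply prints whatever it finds.

```python
import io
import json
import os


def load_json(output_dir, name):
    """Load a JSON file (e.g. result.json or vocab.json) from an output folder."""
    path = os.path.join(output_dir, name)
    with io.open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize(output_dir):
    """Print every top-level entry of result.json, sorted by key."""
    results = load_json(output_dir, "result.json")
    for key in sorted(results):
        print("%s: %s" % (key, results[key]))
```

For example, `summarize("your-output-folder")` prints the recorded results and parameters once training has written result.json.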
Note: train.py assumes the data file is UTF-8 encoded by default. Use --encoding=your-encoding to specify the encoding if your data file cannot be decoded as UTF-8.
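To see why the encoding matters, here is a rough sketch of the kind of preprocessing a character-level model needs: decode the file into text, then map each distinct character to an integer. It is not the code used by train.py; `read_text` and `build_vocab` are hypothetical names chosen for illustration.

```python
import io


def read_text(path, encoding="utf-8"):
    """Read the whole data file as text, decoding it with the given encoding."""
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


def build_vocab(text):
    """Map each distinct character to an integer index (sorted for determinism)."""
    chars = sorted(set(text))
    return {ch: i for i, ch in enumerate(chars)}


print(build_vocab(u"hello"))
```

If the file is decoded with the wrong encoding, `read_text` either raises a UnicodeDecodeError or yields wrong characters, and the vocabulary built from it is wrong as well.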
To sample from the best model of an experiment (with a given start_text and length):
python sample.py --init_dir=your-output-folder --start_text="The meaning of life is" --length=100
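Conceptually, sampling extends the start text one character at a time, each drawn from the model's predicted distribution over the next character. The sketch below shows that loop under stated assumptions: `next_char_probs` is a hypothetical stand-in for the trained model (sample.py's internals are not shown here), and the sketch appends `length` new characters, which may or may not match how --length is counted.

```python
import random


def sample_text(next_char_probs, start_text, length, seed=None):
    """Extend start_text by `length` characters drawn from the model's distribution.

    next_char_probs is a hypothetical stand-in for the trained model: a callable
    that takes the text so far and returns a dict {char: probability}.
    """
    rng = random.Random(seed)
    text = start_text
    for _ in range(length):
        probs = next_char_probs(text)
        chars = sorted(probs)
        # Draw one character in proportion to its probability.
        r = rng.random()
        cumulative = 0.0
        choice = chars[-1]  # fallback in case rounding keeps the sum below r
        for ch in chars:
            cumulative += probs[ch]
            if r < cumulative:
                choice = ch
                break
        text += choice
    return text


def toy_model(text):
    """Toy "model" that ignores its input and predicts 'a' or 'b' equally."""
    return {"a": 0.5, "b": 0.5}


print(sample_text(toy_model, "The meaning of life is ", 10, seed=0))
```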
You can use TensorBoard (a visualization tool in TensorFlow) to [visualize the learning](https://www.tensorflow.org/versions/r0.8/how_tos/summaries_and_tensorboard/index.html#tensorboard-visualizing-learning) (the "events" tab) and the computation graph (the "graph" tab).
First run:
tensorboard --logdir=your-output-folder/tensorboard_log
Then point your browser to http://localhost:6006 to view it. You can also specify the port using --port=your-port-number.
To continue a finished or interrupted experiment, run:
python train.py --data_file=your-data-file --init_dir=your-output-folder
train.py provides a list of hyperparameters you can tune.
To see the list of all hyperparameters, run:
python train.py --help