C# implementation of GPT-2.
TensorFlow's default GPU memory allocator appears to consume more memory than it needs.
If you know you have enough RAM and GPU memory, setting the TF_GPU_ALLOCATOR
environment variable to cuda_malloc might help.
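
A minimal sketch of applying that setting from a POSIX shell, assuming the variable must be present in the environment before the program (and therefore TensorFlow) starts. The launch command is left as a placeholder comment, since how you start the program depends on your setup.

```shell
# Set the allocator for the current shell session and any process started from it.
export TF_GPU_ALLOCATOR=cuda_malloc

# Confirm the value that child processes will inherit.
echo "TF_GPU_ALLOCATOR=$TF_GPU_ALLOCATOR"

# Now start the program from this same shell (placeholder, adjust to your setup).
# Windows cmd equivalent:        set TF_GPU_ALLOCATOR=cuda_malloc
# Windows PowerShell equivalent: $env:TF_GPU_ALLOCATOR = "cuda_malloc"
```

Setting the variable in the shell rather than inside the program avoids depending on when the native TensorFlow library reads its environment.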