This repository has been archived by the owner on Jan 30, 2021. It is now read-only.

How to fine tune xlnet large model with new text using XLnet-gen? #4

Open
GenTxt opened this issue Jul 18, 2019 · 3 comments

Comments

@GenTxt

GenTxt commented Jul 18, 2019

Hi:

Thanks for this repo. Text generation is my main interest, and I was wondering how the XLNet large model can be fine-tuned on new text and then used in XLnet-gen with language_generation.py.

I can create a small base model from scratch using your repo, but I don't have the GPU power to train a large one.

The GPT-2 fine-tuning repo by nshepperd, which uses the OpenAI 345M model, is very easy to use. Is a similar process possible in XLnet-gen?

The fine-tuning examples in the original XLNet repo don't seem applicable to text generation, or easy to adapt for it.

Any suggestions or new scripts are welcome.

Thanks

@rusiaaman
Owner

Thanks for the suggestion; quick fine-tuning code is a good idea. However, I am currently investigating why XLNet is not generating more coherent sentences, and it will take a few days before I finish that investigation and release a model/script better suited to text generation with XLNet.

Pull requests are welcome for the fine-tuning part (or any other part, really). If nobody has contributed an implementation by the time I finish my investigation, I will implement it myself.

Until then, you can use data_utils.py from the original repo to process your data and train_gpu.py to fine-tune on a Colab GPU, which is free.
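
For anyone trying that route, here is a rough sketch of the two steps: preprocess with data_utils.py, then continue training from the released checkpoint with train_gpu.py. It only builds and prints the two command lines; it does not run them. The flag names follow the pretraining example in the original XLNet README as I recall it, and the architecture values are those of XLNet-Large. Everything else is an assumption made for illustration: the paths, the batch size, the step count, and the idea that train_gpu.py picks up the released weights through an init_checkpoint flag. Please check each script's flag definitions before relying on any of it.

```python
import shlex

# Hypothetical locations; replace with your own.
CORPUS_GLOB = "my_corpus/*.txt"                   # plain-text files to fine-tune on
SAVE_DIR = "proc_data"                            # output directory for data_utils.py
PRETRAINED_DIR = "xlnet_cased_L-24_H-1024_A-16"   # assumed name of the unpacked XLNet-Large download
MODEL_DIR = "finetuned_model"                     # where new checkpoints should go
BATCH = 4                                         # illustrative guess for a single GPU

# Values copied from the README's pretraining example. They will likely
# need lowering to fit a single Colab GPU, and they should be identical
# in both steps so training can find the records it expects.
SHARED = {
    "seq_len": 512,
    "reuse_len": 256,
    "mask_alpha": 6,
    "mask_beta": 1,
    "num_predict": 85,
}


def build_command(script, flags):
    """Return an argv list of the form: python <script> --key=value ..."""
    return ["python", script] + [f"--{k}={v}" for k, v in flags.items()]


# Step 1: turn raw text into TFRecords.
preprocess = build_command("data_utils.py", {
    "bsz_per_host": BATCH,
    "num_core_per_host": 1,
    "input_glob": CORPUS_GLOB,
    "save_dir": SAVE_DIR,
    "num_passes": 20,
    "bi_data": True,
    "sp_path": f"{PRETRAINED_DIR}/spiece.model",
    **SHARED,
})

# Step 2: continue training from the released checkpoint (assumed flag:
# init_checkpoint). The flag list here is not exhaustive.
finetune = build_command("train_gpu.py", {
    "record_info_dir": f"{SAVE_DIR}/tfrecords",
    "model_dir": MODEL_DIR,
    "init_checkpoint": f"{PRETRAINED_DIR}/xlnet_model.ckpt",
    "train_batch_size": BATCH,
    "num_core_per_host": 1,
    "train_steps": 1000,        # illustrative
    "mem_len": 384,
    "perm_size": 256,
    "n_layer": 24,
    "d_model": 1024,
    "d_embed": 1024,
    "n_head": 16,
    "d_head": 64,
    "d_inner": 4096,
    "untie_r": True,
    **SHARED,
})

if __name__ == "__main__":
    # shlex.join quotes the glob so the shell passes it through unexpanded.
    print(shlex.join(preprocess))
    print(shlex.join(finetune))
```

The printed commands can be pasted into a Colab cell (prefixed with `!`) from inside a checkout of the original XLNet repo. Whether the resulting checkpoint loads directly into language_generation.py is not something this sketch addresses.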

@iamanai-1

I, too, would be interested in having quick fine-tuning code. :)
If you're still contributing to this project, I'd be glad if you wrote one.

@GenTxt
Author

GenTxt commented Dec 12, 2020 via email
