How to pre-train BART model #4151
Comments
We still need to provide a good docstring/notebook for this. It's on our to-do list. :-) Or @sshleifer, is there already something for BART?
Nothing yet, it would be good to add!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I have seen the same issue in fairseq BART!
Hi, any news about BART pre-training?
Can anyone tell me how to pre-train BART on my own dataset? I am quite confused.
Maybe this comment can help: #5096 (comment)
Any news on this, please?
Not so far; it would be great to have it. Thanks.
My co-worker and I wrote a demo based on the RoBERTa pretraining demo.
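For readers landing here, below is a minimal, illustrative sketch of such a setup (it is not the demo referenced above). It trains a freshly initialised BartForConditionalGeneration with the Trainer on a simple denoising objective: the encoder receives a corrupted copy of each line and the labels are the original token ids. The file name `train.txt`, the hyperparameters, and the simplified random-token masking (rather than BART's full span infilling and sentence permutation) are assumptions for illustration only.

```python
import random

from torch.utils.data import Dataset
from transformers import (
    BartConfig,
    BartForConditionalGeneration,
    BartTokenizerFast,
    DataCollatorForSeq2Seq,
    Trainer,
    TrainingArguments,
)

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")  # reuse the BART vocabulary
# Fresh, randomly initialised weights. BartConfig defaults to a bart-large-sized
# architecture; pass d_model, encoder_layers, etc. to make it smaller.
config = BartConfig(vocab_size=len(tokenizer))
model = BartForConditionalGeneration(config)


class DenoisingDataset(Dataset):
    """One document per line in a text file (hypothetical `train.txt`); yields noised/clean pairs."""

    def __init__(self, path, max_length=512, mask_prob=0.3):
        with open(path, encoding="utf-8") as f:
            self.lines = [line.strip() for line in f if line.strip()]
        self.max_length = max_length
        self.mask_prob = mask_prob

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, idx):
        ids = tokenizer(self.lines[idx], truncation=True, max_length=self.max_length)["input_ids"]
        corrupted = [
            tokenizer.mask_token_id
            if (random.random() < self.mask_prob and tok not in tokenizer.all_special_ids)
            else tok
            for tok in ids
        ]
        # Encoder sees the corrupted text; the decoder is trained to reconstruct the original.
        return {"input_ids": corrupted, "labels": ids}


args = TrainingArguments(
    output_dir="bart-pretraining",        # assumed output directory
    per_device_train_batch_size=8,        # assumed hyperparameters, not from the thread
    num_train_epochs=1,
    learning_rate=3e-4,
    logging_steps=100,
    save_steps=1000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=DenoisingDataset("train.txt"),
    # Pads input_ids, and pads labels with -100 so padding is ignored by the loss.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

Note that this token-level masking only approximates the paper's objective; BART's reported setup uses text infilling (whole spans collapsed to a single `<mask>`) plus sentence permutation, sketched further down the thread.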
Thanks for the code example. I am also planning on implementing pre-training from scratch, and I have several questions about the code.
For the first question, it is just like this. For the other question, I trained it with 12 GB of GPU memory, but it may also work with smaller GPU memory, and you can adjust the parameters to fit your server environment.
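As an illustrative aside (these numbers are assumptions, not the settings used in the demo above): with the Trainer, the usual knobs for fitting a run into less GPU memory are a smaller per-device batch size combined with gradient accumulation, plus mixed precision.

```python
from transformers import TrainingArguments

# Hypothetical settings for a smaller GPU: a small per-device batch size,
# gradient accumulation to keep the effective batch size at 4 * 8 = 32,
# and fp16 mixed precision to reduce activation memory.
args = TrainingArguments(
    output_dir="bart-pretraining",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    fp16=True,
    learning_rate=3e-4,
)
```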
@myechona Thanks for your code. I have a question about it: there are tasks like text infilling and sentence permutation during the pre-training stage, and I want to know whether "input_ids" holds the masked sentence and "labels" holds the original sentence.
If anyone wants to train their own mBART model, feel free to use this. Contributions are welcome!
Thanks for your code, it really helps.
I'm most interested in sentence infilling, which this script doesn't really seem to address (though my understanding was that BART training generally involves masking and permutation). Is there an additional step I need to add for the infilling functionality?
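For reference, here is a hedged sketch (not code from this thread or from fairseq) of the two noising functions the BART paper combines: text infilling, where spans with lengths drawn from Poisson(λ=3) are each replaced by a single `<mask>` token until roughly 30% of tokens are masked, and sentence permutation. It also shows the pairing asked about a few comments up: `input_ids` holds the corrupted text, `labels` the original. The example sentence and the simplifications (spans may overlap, 0-length spans are skipped) are assumptions.

```python
import random
import re

import numpy as np
from transformers import BartTokenizerFast

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")


def permute_sentences(text: str) -> str:
    """Sentence permutation: split on sentence-final punctuation and shuffle."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    random.shuffle(sentences)
    return " ".join(sentences)


def text_infilling(token_ids, mask_token_id, mask_ratio=0.3, poisson_lambda=3.0):
    """Text infilling: collapse spans (lengths ~ Poisson(lambda)) into a single <mask>
    each, until roughly `mask_ratio` of the tokens have been removed. Simplified: spans
    may overlap earlier masks, and 0-length spans are not inserted."""
    ids = list(token_ids)
    budget = int(round(len(ids) * mask_ratio))
    while budget > 0 and len(ids) > 1:
        span = int(max(1, min(np.random.poisson(poisson_lambda), budget, len(ids) - 1)))
        start = random.randrange(0, len(ids) - span + 1)
        ids[start:start + span] = [mask_token_id]  # whole span collapses to one <mask>
        budget -= span
    return ids


original = "BART is a denoising autoencoder. It is trained to map corrupted text back to the original text."

# Special tokens are added afterwards so they are never masked or shuffled.
clean_ids = tokenizer(original)["input_ids"]
noisy_body = text_infilling(
    tokenizer(permute_sentences(original), add_special_tokens=False)["input_ids"],
    tokenizer.mask_token_id,
)
noisy_ids = [tokenizer.bos_token_id] + noisy_body + [tokenizer.eos_token_id]

example = {"input_ids": noisy_ids, "labels": clean_ids}  # corrupted in, original out
```

In the Trainer sketch earlier in the thread, these two functions could replace the simple random-token masking inside `DenoisingDataset.__getitem__`.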
Hi, any update on this? @vanpelt
I actually decided to jump over to T5 and use the …
Great, I was initially looking at those scripts to get some ideas for the pre-training script, but I had since been hoping the Hugging Face folks might have come up with a resource for this. Apparently, it's still underway! :)
We've released nanoT5, which reproduces T5 (a model similar to BART) pre-training in PyTorch (not Flax). You can take a look! Any suggestions are more than welcome.
How can I pre-train a BART model in an unsupervised manner? Any example?