Skip to content

Conversation

@stas00
Copy link
Contributor

@stas00 stas00 commented Dec 11, 2021

This PR gives the reader an idea to travel a novel MoE Transformer architecture for a much faster training, albeit much larger memory requirements.

While most of the solutions are in the Tensorflow/TPU land, Deepspeed has introduced a pytorch solution based on Megatron-Deepspeed.

@sgugger

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this!

@stas00 stas00 merged commit 027074f into huggingface:master Dec 11, 2021
@stas00 stas00 deleted the doc-perf-moe branch December 11, 2021 02:24
@LysandreJik
Copy link
Member

As a reference for future PRs, it would be nice to put the images in a dataset hosted on hf.co - putting them in the repo weighs down the repo significantly (+72kb here) and it can't be removed from the history without altering it. Images in the docs can also be taken from a dataset, see SegFormer for example: https://huggingface.co/docs/transformers/master/en/model_doc/segformer#overview

See how it was added here: https://github.com/huggingface/transformers/blame/master/docs/source/model_doc/segformer.rst#L44-L45

I have opened this PR to clarify this in the contribution guide: #14738

@sgugger
Copy link
Collaborator

sgugger commented Dec 13, 2021

@LysandreJik maybe it would make it more obvious to everyone if we moved all images there. They won't disappear form history but there is no imgs folder, no one will be tempted to add one?

@LysandreJik
Copy link
Member

Yes, that's a very good call!

@stas00
Copy link
Contributor Author

stas00 commented Dec 13, 2021

That's a very good point about not blowing up the repo, @LysandreJik!

Will this actually work well though? Aren't browsers these days capable of blocking remote urls that don't coincide on the same domain for security conscious users?

@LysandreJik
Copy link
Member

LysandreJik commented Dec 13, 2021

We have tested it with @NielsRogge: it works well for the segformer above, and everything resides on huggingface.co.

I'll take care of migrating all the images (including the one above) in a PR this week, will ensure that the images are correctly linked.

@stas00
Copy link
Contributor Author

stas00 commented Dec 13, 2021

We have tested it with @NielsRogge: it works well for the segformer above, and everything resides on huggingface.co.

You have tested it how? Do you have the security tools installed? Like Privacy Badger plugin and alikes?

For context/background - many websites install all kinds of tracking devices, which can be an external js code, an image, etc. So tools like Privacy Badger will often skip any such 3rd party urls if configured so. Which may catch images as well.

Since I have Privacy Badger I can test this if you give me an example of a doc that includes remote images.

I'll take care of migrating all the images (including the one above) in a PR this week, will ensure that the images are correctly linked.

Thank you!

@LysandreJik
Copy link
Member

We empirically tested it - I use Brave which blocks ads and trackers but I don't use Privacy Badger, so I'd be happy for you to tell me how it works for you!

Please let me know if you see the images here: https://github.com/LysandreJik/transformers/blob/documentation-images/docs/source/parallelism.md#zero-data-parallel

and here: https://huggingface.co/docs/transformers/master/en/model_doc/segformer#overview

I have no issue with those, which are currently hosted in the following dataset: https://huggingface.co/datasets/huggingface/documentation-images/tree/main

If this is sufficient testing for you, let me know and I'll finalize/open the PR to remove the images and rely on the external dataset instead.

@stas00
Copy link
Contributor Author

stas00 commented Dec 14, 2021

Thank you for sharing how you tested it, @LysandreJik

and I tested with Privacy Badger - all works - it doesn't consider those as trackers.

So all is good!

Thank you!

@LysandreJik
Copy link
Member

Perfect, thank you for testing it out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants