Introduce improvements from OSLO #571
Comments
@sdtblck I saw you posted an issue regarding OSLO PP. Is there anything in PP you would like to improve?
#2 sounds quite clever and I strongly support it. Given that we are very far from the mainline DeepSpeed repo, would #1 involve a lot of unnecessary labor compared to doing it after we get back to the main version of DeepSpeed? #3 seems like a low-priority nice-to-have. I don't have any plans to use that normalization, though I'm sure some people might. That said, 90% of the use this library gets is currently internal to EleutherAI AFAIK, so things other people might want to use seem like a low priority.
#2: I'm going to create a new branch in the current neox repo and experiment. #1: This feature does not exist in DeepSpeed, so there is no need to worry about DeepSpeed upstream. Since I've already built it into a usable form in OSLO, it should be easy to add. #3: I totally agree with you. In addition, if there are any further parts that you would like to improve or experiment with, even if they have nothing to do with OSLO, please feel free to assign some tasks to me. I will happily help the neox project.
@hyunwoongko Ah, I think I misread your comments about #1 :) In that case I would certainly be interested in experimenting with it :) Honestly, far and away the most helpful thing you could do is figure out how to bring us back in line with the main DeepSpeed branch. I know that's a big ask though, so no worries if it's a bit daunting. In terms of building out the library, the other most important things on the horizon are #479 and #215. There are also some outstanding abandoned PRs with optimizers like Shampoo that would be nice to have cleaned up and finished. In terms of general library maintenance, #469 and various documentation improvements such as #506, #484, and #458 would all be quite helpful. We could also always use help designing and orchestrating experiments. We can happily provide the compute for anyone willing to do the work… DM me on Slack if you're interested.
@hyunwoongko -- Would you like to restart this effort?
@Quentin-Anthony sounds great.
AOTAutograd is a novel engine provided by functorch that can fuse all parts of a neural network. I added it to OSLO recently, and it makes training much faster. I would like to add this to GPTNeoX; what do you think? It would be nice to implement this on the DeeperSpeed side as well.
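For reference, here is a minimal sketch of what wrapping a module with AOTAutograd could look like, assuming a functorch build that exposes `memory_efficient_fusion` under `functorch.compile` (the exact entry point has moved between versions, and this is not the OSLO integration itself):

```python
# Hedged sketch: fuse a small module's forward and backward graphs with
# AOTAutograd via functorch. Assumes functorch is installed and that
# functorch.compile.memory_efficient_fusion is available; fusion with the
# default NVFuser backend requires CUDA.
import torch
from functorch.compile import memory_efficient_fusion

class MLP(torch.nn.Module):
    def __init__(self, dim=1024):
        super().__init__()
        self.fc1 = torch.nn.Linear(dim, 4 * dim)
        self.fc2 = torch.nn.Linear(4 * dim, dim)

    def forward(self, x):
        return self.fc2(torch.nn.functional.gelu(self.fc1(x)))

mlp = MLP().cuda()
# Traces the forward and backward passes ahead of time and hands both
# graphs to a fusing compiler, so the GeLU and surrounding pointwise ops
# can be collapsed into fewer kernels.
fused_mlp = memory_efficient_fusion(mlp)

x = torch.randn(8, 1024, device="cuda", requires_grad=True)
out = fused_mlp(x)
out.sum().backward()
```

In a full GPTNeoX model the same wrapping would presumably be applied per transformer layer rather than to the whole network at once.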
OSLO changed Megatron's MPU so that it can handle odd embedding sizes. Therefore, there is no need to add meaningless padding tokens, which can increase memory efficiency. Using this, I was also able to implement the TP automerging function, which can merge 70+ transformers architectures without checkpoint conversion scripts.
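To illustrate the idea (not OSLO's actual implementation), here is a small sketch of how a vocabulary that is not divisible by the tensor-parallel world size can be split without padding; the helper name is hypothetical:

```python
# Hypothetical helper: split a vocab of arbitrary size across TP ranks
# without padding, by giving the first (vocab_size % world_size) ranks
# one extra row each. Illustrative only; not the OSLO/NeoX code.
def vocab_range_for_rank(vocab_size: int, world_size: int, rank: int):
    base = vocab_size // world_size
    remainder = vocab_size % world_size
    start = rank * base + min(rank, remainder)
    size = base + (1 if rank < remainder else 0)
    return start, start + size

# Example: GPT-2's 50257-token vocab over 4 ranks -> shard sizes
# 12565, 12564, 12564, 12564 with no padding rows added.
for rank in range(4):
    print(rank, vocab_range_for_rank(50257, 4, rank))
```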
FusedRMSNorm was recently added to Apex and has been merged into OSLO. NeoX 20B doesn't seem to use RMSNorm, but this might be helpful.
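As a rough sketch of what using it might look like, assuming an Apex build that ships `apex.normalization.FusedRMSNorm` with its CUDA extensions compiled (interface details may vary by version):

```python
# Hedged sketch: drop-in use of Apex's fused RMSNorm kernel.
# Assumes a recent Apex build with CUDA extensions installed.
import torch
from apex.normalization import FusedRMSNorm

hidden_size = 4096
norm = FusedRMSNorm(hidden_size, eps=1e-6).cuda()

x = torch.randn(2, 128, hidden_size, device="cuda")
# RMS-normalizes over the last dimension and applies a learned scale,
# without the mean-centering step that LayerNorm performs.
y = norm(x)
```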
I will continue to write up the parts that I can improve.