Are there any plans to support context parallel? #2141

Open
dz1iang opened this issue Dec 10, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

dz1iang commented Dec 10, 2024

Long-text scenarios are quite common, and it would be a great help if context parallelism were supported for them.

joecummings added the enhancement (New feature or request) label Dec 10, 2024
joecummings (Contributor) commented

Take a look at @felipemello1's awesome RFC on how we plan to support even longer context lengths: #1244.

We started looking into it, but then de-prioritized the work in favor of onboarding new modalities because we found that, with our memory optimizations, we could easily get to a 64K context length.

What use case are you trying to work on? We could revisit our prioritization.

dz1iang commented Dec 11, 2024

In my opinion, everyone is chasing after O1, and that effort requires training on long texts in both SFT (Supervised Fine-Tuning) and RL (Reinforcement Learning).
