Pipeline invalid #2348

Closed
zhaozheng09 opened this issue Aug 29, 2024 · 2 comments

Comments

zhaozheng09 commented Aug 29, 2024

[image]

block_bucketize_sparse_features cannot be parallelized with the forward pass.
zhaozheng09 changed the title from "Pipeline is invalid" to "Pipeline invalid" on Aug 29, 2024
sarckk (Member) commented Dec 27, 2024

Hi @zhaozheng09, are you asking why the block_bucketize_sparse_features kernel doesn't overlap with the embedding table lookup kernel?

sarckk closed this as completed Jan 3, 2025

sarckk (Member) commented Jan 3, 2025

Even though you are launching block_bucketize_sparse_features and the embedding lookup kernel on different streams, they depend on each other: block_bucketize_sparse_features must run before the KJT all2all, and the embedding lookup can only happen after that all2all. So they cannot be scheduled in parallel. They would also be sharing the same compute resources on the GPU.
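
To make the dependency argument concrete, here is a toy scheduling model in plain Python. It is only a sketch and not TorchRec or CUDA code: the stream names, the durations, and the `Kernel`, `schedule`, and `overlaps` helpers are made up for illustration. Each kernel starts once its stream is free and all of its dependencies have finished.

```python
from dataclasses import dataclass, field


@dataclass
class Kernel:
    name: str
    stream: str          # hypothetical stream label
    duration: float      # made-up duration, arbitrary time units
    deps: list = field(default_factory=list)  # names of kernels that must finish first


def schedule(kernels):
    """Return {name: (start, end)}. Kernels must be listed in launch order,
    with every dependency appearing before the kernel that needs it."""
    stream_free = {}  # time at which each stream becomes idle
    times = {}
    for k in kernels:
        # Earliest time at which all dependencies have finished.
        deps_done = max((times[d][1] for d in k.deps), default=0.0)
        start = max(deps_done, stream_free.get(k.stream, 0.0))
        end = start + k.duration
        times[k.name] = (start, end)
        stream_free[k.stream] = end
    return times


def overlaps(a, b):
    """True if two (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


# The chain described above: bucketize -> KJT all2all -> embedding lookup.
kernels = [
    Kernel("block_bucketize_sparse_features", "data_dist_stream", 2.0),
    Kernel("kjt_all2all", "data_dist_stream", 3.0,
           deps=["block_bucketize_sparse_features"]),
    Kernel("embedding_lookup", "default_stream", 4.0, deps=["kjt_all2all"]),
]
times = schedule(kernels)
chained_overlap = overlaps(times["block_bucketize_sparse_features"],
                           times["embedding_lookup"])

# Hypothetical contrast: the same two kernels with no dependency between them.
independent = schedule([
    Kernel("block_bucketize_sparse_features", "data_dist_stream", 2.0),
    Kernel("embedding_lookup", "default_stream", 4.0),
])
independent_overlap = overlaps(independent["block_bucketize_sparse_features"],
                               independent["embedding_lookup"])

print(times)
print("chained overlap:", chained_overlap)
print("independent overlap:", independent_overlap)
```

In this model the lookup cannot start until the all2all ends, so using a second stream does not produce any overlap with the bucketize kernel. The independent case overlaps only because the model ignores contention; on a real GPU the two kernels would still compete for the same compute resources.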

Closing for now; feel free to reopen if this doesn't address your question.

sarckk reopened this Jan 3, 2025
sarckk closed this as completed Jan 3, 2025