How can I use TRT in torchrec? #2307

Open
yjjinjie opened this issue Aug 16, 2024 · 2 comments

Comments

@yjjinjie

I saw this example: https://github.com/pytorch/torchrec/blob/v0.2.0/examples/inference/dlrm_predict_single_gpu.py

But how can I split the model into its embedding part and its dense part, run the dense part through TRT, concatenate the two (EBC + TRT dense model), and export the result as a TorchScript model for C++ inference?

@PaulZhang12
Contributor

@yjjinjie That inference solution is legacy and will need to be cleaned up. For exporting a model to TorchScript for C++ serving, check out this file: https://github.com/pytorch/torchrec/blob/main/torchrec/inference/dlrm_predict.py

You can apply lowering to the dense part of the model individually and then still TorchScript the whole model, as in the example above.
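
To make the "lower the dense part, then TorchScript the whole model" pattern concrete, here is a minimal sketch in plain PyTorch. It is not taken from torchrec: `SparseArch`, `DenseArch`, `CombinedModel`, `lower_dense`, and `build_scripted_model` are hypothetical names, `nn.EmbeddingBag` stands in for an `EmbeddingBagCollection`, and `lower_dense` only traces the module where a real setup would call its TensorRT lowering tool. The point is the structure: the dense submodule is replaced by its lowered version before the combined module is scripted and saved.

```python
import io

import torch
import torch.nn as nn


class SparseArch(nn.Module):
    """Stand-in for the embedding side (assumption: torchrec would hold an EBC here)."""

    def __init__(self, num_embeddings: int, dim: int) -> None:
        super().__init__()
        self.bag = nn.EmbeddingBag(num_embeddings, dim, mode="sum")

    def forward(self, ids: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        return self.bag(ids, offsets)


class DenseArch(nn.Module):
    """Stand-in for the dense MLP, the only part that gets lowered."""

    def __init__(self, in_features: int, dim: int) -> None:
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_features, dim), nn.ReLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mlp(x)


def lower_dense(dense: nn.Module, example: torch.Tensor) -> torch.jit.ScriptModule:
    # Hypothetical lowering hook. Here it only traces the module; a TensorRT
    # backend would be swapped in at this point and should likewise return a
    # TorchScript-compatible module.
    return torch.jit.trace(dense.eval(), example)


class CombinedModel(nn.Module):
    """Concatenates the lowered dense output with the pooled embeddings."""

    def __init__(self, sparse: nn.Module, dense: nn.Module) -> None:
        super().__init__()
        self.sparse = sparse
        self.dense = dense

    def forward(
        self, dense_features: torch.Tensor, ids: torch.Tensor, offsets: torch.Tensor
    ) -> torch.Tensor:
        pooled = self.sparse(ids, offsets)
        projected = self.dense(dense_features)
        return torch.cat([projected, pooled], dim=1)


def build_scripted_model():
    torch.manual_seed(0)
    sparse = SparseArch(num_embeddings=10, dim=4).eval()
    dense = DenseArch(in_features=3, dim=4).eval()

    dense_features = torch.randn(2, 3)
    ids = torch.tensor([1, 2, 4, 5, 4, 3])
    offsets = torch.tensor([0, 3])  # two bags: ids[0:3] and ids[3:6]

    # Reference output from the eager modules, before any lowering.
    with torch.no_grad():
        eager_out = torch.cat([dense(dense_features), sparse(ids, offsets)], dim=1)

    # Lower only the dense part, then script the combined module around it.
    lowered = lower_dense(dense, dense_features)
    scripted = torch.jit.script(CombinedModel(sparse, lowered))

    # Round-trip through an in-memory buffer; a file path works the same way
    # and produces the artifact a C++ server would load.
    buffer = io.BytesIO()
    torch.jit.save(scripted, buffer)
    buffer.seek(0)
    reloaded = torch.jit.load(buffer)
    return reloaded, eager_out, (dense_features, ids, offsets)
```

Calling `build_scripted_model()` returns the reloaded TorchScript module, the eager reference output, and the example inputs, so the two outputs can be compared before the real lowering step is substituted.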

@yjjinjie
Author

yjjinjie commented Aug 20, 2024

Can you show an example of how to lower the dense part of the model individually?

Should I lower just the DenseArch?

https://github.com/pytorch/torchrec/blob/main/torchrec/models/dlrm.py#L115
