pytorch support inference on separate cuda stream #2706

jiyuanq
Contributor

@jiyuanq jiyuanq commented Jul 13, 2023

Description

This PR is an attempt to support running inference on separate CUDA streams in the PyTorch engine. With separate streams, concurrent inference requests on the same GPU no longer have to queue on a single stream, which should improve GPU utilization.

This PR also adds a boolean flag, inferenceSeparateCudaStream, controlled through the system property "ai.djl.pytorch.inference_separate_cuda_stream", that determines whether the new feature is enabled.

I considered exposing the full CUDA stream-related PyTorch API through JNI, but in the end decided to expose only a high-level boolean flag, mainly because:

  • it's much easier to expose only this boolean flag
  • there's no other use case that requires the full cuda stream api
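
For illustration, here is a minimal sketch of how the flag might be switched on from application code. It relies only on standard JDK calls and the property name quoted above. The class and its helper methods are hypothetical and not part of this PR. It is also an assumption that the engine reads the value as "true"/"false", and that the property has to be set before the PyTorch engine is initialized.

```java
// Hypothetical helper, not part of DJL: toggles the system property named in this PR.
final class SeparateCudaStreamFlag {

    // Property name as quoted in the PR description.
    static final String PROPERTY = "ai.djl.pytorch.inference_separate_cuda_stream";

    private SeparateCudaStreamFlag() {}

    /** Sets the property to "true" for the current JVM. */
    static void enable() {
        System.setProperty(PROPERTY, "true");
    }

    /**
     * Reads the property with Boolean.getBoolean, which returns true only when
     * the property is set and equals "true" (case-insensitive).
     * Assumption: the engine interprets the value the same way.
     */
    static boolean isEnabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: call this before the PyTorch engine is loaded.
        enable();
        System.out.println(PROPERTY + " = " + isEnabled());
    }
}
```

The same property can be passed on the JVM command line as -Dai.djl.pytorch.inference_separate_cuda_stream=true, which avoids any question of ordering relative to engine start-up.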

@jiyuanq jiyuanq requested review from zachgk, frankfliu and a team as code owners July 13, 2023 06:24
@jiyuanq
Contributor Author

jiyuanq commented Jul 14, 2023

@frankfliu @zachgk any chance you could take a look and give some early feedback? I checked the failed build, and it looks like a coding convention check failure (I'm not sure whether I missed any other error message). I'll fix it once there's consensus on the PR itself.

@frankfliu
Contributor

@jiyuanq
Thanks for your contribution. What you need to do is just run the following command:

./gradlew formatJava

Then check in the updated files.

See: https://github.com/deepjavalibrary/djl/blob/master/docs/development/development_guideline.md#coding-conventions

@jiyuanq
Contributor Author

jiyuanq commented Jul 14, 2023

thank you! just updated

@codecov-commenter

codecov-commenter commented Jul 14, 2023

Codecov Report

Patch coverage: 54.72% and no project coverage change.

Comparison is base (bb5073f) 72.08% compared to head (08fae7d) 72.09%.

Additional details and impacted files
@@             Coverage Diff             @@
##             master    #2706     +/-   ##
===========================================
  Coverage     72.08%   72.09%             
- Complexity     5126     7026   +1900     
===========================================
  Files           473      698    +225     
  Lines         21970    31264   +9294     
  Branches       2351     3225    +874     
===========================================
+ Hits          15838    22541   +6703     
- Misses         4925     7190   +2265     
- Partials       1207     1533    +326     
Impacted Files Coverage Δ
api/src/main/java/ai/djl/modality/cv/Image.java 69.23% <ø> (-4.11%) ⬇️
...rc/main/java/ai/djl/modality/cv/MultiBoxPrior.java 76.00% <ø> (ø)
.../main/java/ai/djl/modality/cv/output/Landmark.java 100.00% <ø> (ø)
...djl/modality/cv/transform/RandomFlipLeftRight.java 25.00% <0.00%> (-25.00%) ⬇️
...djl/modality/cv/transform/RandomFlipTopBottom.java 25.00% <0.00%> (-25.00%) ⬇️
...i/djl/modality/cv/translator/BigGANTranslator.java 21.42% <0.00%> (-5.24%) ⬇️
.../modality/cv/translator/ImageFeatureExtractor.java 0.00% <0.00%> (ø)
.../ai/djl/modality/cv/translator/YoloTranslator.java 27.77% <0.00%> (+18.95%) ⬆️
...ain/java/ai/djl/modality/cv/util/NDImageUtils.java 67.10% <0.00%> (+7.89%) ⬆️
api/src/main/java/ai/djl/modality/nlp/Decoder.java 63.63% <ø> (ø)
... and 226 more

... and 368 files with indirect coverage changes

@frankfliu frankfliu merged commit c588841 into deepjavalibrary:master Jul 14, 2023
@jiyuanq jiyuanq deleted the jiyuan/pytorch-support-inference-on-separate-cuda-stream branch July 14, 2023 05:29