pytorch support inference on separate cuda stream #2706
Conversation
@frankfliu @zachgk any chance you can take a look and give some early feedback? I checked the failed build; it looks like a coding convention check failure (not sure if I missed any other error message), which I'll fix once there's consensus on the PR itself.
@jiyuanq
And then check in the updated files.
Thank you! Just updated.
Codecov Report
Patch coverage:
Additional details and impacted files

```diff
@@             Coverage Diff             @@
##             master    #2706     +/-   ##
===========================================
  Coverage     72.08%   72.09%
- Complexity     5126     7026    +1900
===========================================
  Files           473      698     +225
  Lines         21970    31264    +9294
  Branches       2351     3225     +874
===========================================
+ Hits          15838    22541    +6703
- Misses         4925     7190    +2265
- Partials       1207     1533     +326
```

☔ View full report in Codecov by Sentry.
Description
This PR is an attempt to support running inference on separate CUDA streams in the PyTorch engine. By doing this, we can maximize GPU utilization when running concurrent inference requests on a GPU.
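For context, here is a minimal sketch of the kind of concurrent workload this targets, using DJL's standard Criteria/Predictor API. The model URL, input shape, and thread-pool size are hypothetical placeholders; each thread creates its own Predictor, since predictors are not thread-safe, and it is these concurrent predictors that per-stream dispatch would let overlap on the GPU:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import ai.djl.inference.Predictor;
import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class ConcurrentInferenceDemo {

    public static void main(String[] args) throws Exception {
        // Hypothetical model location; substitute a real TorchScript artifact.
        Criteria<NDList, NDList> criteria =
                Criteria.builder()
                        .setTypes(NDList.class, NDList.class)
                        .optModelUrls("file:///path/to/torchscript/model")
                        .optEngine("PyTorch")
                        .build();

        try (ZooModel<NDList, NDList> model = criteria.loadModel()) {
            ExecutorService pool = Executors.newFixedThreadPool(4);
            for (int i = 0; i < 4; i++) {
                pool.submit(() -> {
                    // One Predictor per thread; these are the concurrent
                    // requests that separate CUDA streams would overlap.
                    try (Predictor<NDList, NDList> predictor = model.newPredictor();
                            NDManager manager = model.getNDManager().newSubManager()) {
                        NDList input =
                                new NDList(manager.ones(new Shape(1, 3, 224, 224)));
                        NDList output = predictor.predict(input);
                        System.out.println(output);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }
}
```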
Also added a boolean flag, inferenceSeparateCudaStream, controlled through the system property "ai.djl.pytorch.inference_separate_cuda_stream", to determine whether this new feature is enabled (see the sketch below for how it might be set). I considered exposing the full CUDA stream related PyTorch API through JNI, but in the end decided to expose only a high-level boolean flag, mainly because:
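For illustration, a sketch of how the flag might be enabled, assuming the property is read when the PyTorch engine initializes, so it must be set beforehand (or passed as a JVM argument):

```java
// Assumption: the property is consulted during engine initialization, so set
// it before the first PyTorch engine call. Equivalent JVM flag:
//   -Dai.djl.pytorch.inference_separate_cuda_stream=true
System.setProperty("ai.djl.pytorch.inference_separate_cuda_stream", "true");
```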