[api] implements text-generation search algorithm #2637

KexinFeng · 2023-06-06T19:26:28Z

This PR succeeds PRs #2547, #2509, #2557, and #2572, which contain the benchmark outputs of the search results.

This PR contains only the features of LMSearch.

djl/examples/src/main/java/ai/djl/examples/inference/GPTInference.java contains the front_end design.

The model conversion to torchscript and onnx

See the Model tracing section in the PR descriptions of #2547 and #2509.

Demonstration

PR #2723 provides several examples that demonstrate how to use language-model text generation.

examples/src/main/java/ai/djl/examples/inference/GPTInference.java

jawaff · 2023-06-20T02:21:08Z

api/src/main/java/ai/djl/modality/nlp/generate/SearchConfig.java

+    private boolean suffixPadding;
+
+    /** Constructs a new ContrastiveSearchConfig object with default values. */
+    public SearchConfig() {


Are there any plans to support different configurations, since not all text-generation models are the same? I'm personally more interested in T5 than GPT2. T5 in particular is a different beast, with both an encoder and a decoder in contrast to GPT2's decoder-only approach. T5 also supports over a hundred special tokens: there are 100 "extra" tokens that can be used for a variety of things, including fill masks and potentially representing special words or instructions in the generated output.

https://huggingface.co/transformers/v3.0.2/model_doc/t5.html#t5tokenizer

There probably need to be different configurations and generation classes for each family of models out there. If we hardcode everything to GPT2, there are going to be breaking changes in the future. I'd suggest adding support for two different models to start with and coming up with a solution for adding support for others in the future.

KexinFeng · 2023-06-20

Regarding SearchConfig, I'm thinking of simply adding parameters to it. Not all of them will necessarily be used by a single model. This should solve the issue you mentioned about different search configurations, right?
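A minimal sketch of the "one config, optional parameters" idea discussed here. Everything in it is hypothetical and for illustration only: the class name `SharedSearchConfig`, the `paramsFor` helper, the algorithm names, and the default values are assumptions, not DJL's actual `SearchConfig` API. Only the `suffixPadding` field name comes from the diff above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not DJL's SearchConfig. */
final class SharedSearchConfig {
    // Default values are arbitrary placeholders chosen for this sketch.
    private int maxSeqLength = 20;        // read by every algorithm
    private boolean suffixPadding = true; // field name taken from the diff above
    private int beam = 3;                 // read only by beam search
    private int k = 4;                    // read only by contrastive search
    private float alpha = 0.6f;           // read only by contrastive search

    SharedSearchConfig setMaxSeqLength(int maxSeqLength) {
        this.maxSeqLength = maxSeqLength;
        return this;
    }

    SharedSearchConfig setBeam(int beam) {
        this.beam = beam;
        return this;
    }

    SharedSearchConfig setK(int k) {
        this.k = k;
        return this;
    }

    SharedSearchConfig setAlpha(float alpha) {
        this.alpha = alpha;
        return this;
    }

    /** Returns only the parameters that the named algorithm would read. */
    Map<String, Object> paramsFor(String algorithm) {
        Map<String, Object> used = new LinkedHashMap<>();
        used.put("maxSeqLength", maxSeqLength);
        used.put("suffixPadding", suffixPadding);
        switch (algorithm) {
            case "greedy":
                break; // needs nothing beyond the shared parameters
            case "beam":
                used.put("beam", beam);
                break;
            case "contrastive":
                used.put("k", k);
                used.put("alpha", alpha);
                break;
            default:
                throw new IllegalArgumentException("Unknown algorithm: " + algorithm);
        }
        return used;
    }
}
```

With this shape, `new SharedSearchConfig().setBeam(5).paramsFor("greedy")` contains only the shared parameters, so the unused `beam` value is silently ignored. That is the trade-off of a single shared config: it avoids a class per model family, but it cannot flag a parameter that has no effect for the chosen algorithm.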

jawaff · 2023-06-20T02:57:43Z

I'm just leaving my 2 cents since I'm interested in your work. It's great to see it get added to DJL. My only real fear is that there might need to be a lot of refactoring to get support for other models. Flan-T5 is one of the most powerful open source models (that supports commercial use) we have available and it has a variety of sizes available. I'd be most interested in seeing it be supported.

I'm not super familiar with GPT2 aside from it being a decoder-only model. There's a chance that T5 and GPT2 share some similarities in the decoder aspect, but T5 has an initial encoder pass over the inputs. The hidden state of the encoded inputs is then used for each pass of the decoder, alongside the ids that have been selected for generation so far.

jawaff · 2023-06-20T03:20:36Z

That's all I've got to add, good work. I just want to see this turn into a bigger feature beyond what you're working on.

frankfliu · 2023-06-20T16:44:58Z

> That's all I've got to add, good work. I just want to see this turn into a bigger feature beyond what you're working on.

We are planning to add the T5 model. This is just a starting point for adding text-generation support.

KexinFeng · 2023-06-20T16:59:08Z

@jawaff Thanks for pointing out the encoder-decoder model T5 to us and reminding us of the possible refactoring.

I think that to implement an encoder-decoder model, the major changes will be in the search algorithms, where we will need an `if (encoderDecoder)` block that computes the encoding. The rest of the code will basically be shared. This structure is also seen in Hugging Face Transformers.
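The branch described above can be sketched as a greedy search loop that runs the encoder once when an `encoderDecoder` flag is set and otherwise shares all decoding code. This is only an illustration under stated assumptions: the `Encoder` and `Decoder` interfaces, the `greedySearch` signature, and the `startTokenId` parameter are hypothetical and are not taken from this PR or from DJL.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the interfaces below are not DJL APIs. */
final class GreedySearchSketch {

    /** Assumed encoder: maps the input ids to an opaque "memory" vector. */
    interface Encoder {
        double[] encode(List<Integer> inputIds);
    }

    /** Assumed decoder: returns next-token logits; memory is null for decoder-only models. */
    interface Decoder {
        double[] nextLogits(List<Integer> ids, double[] encoderMemory);
    }

    static List<Integer> greedySearch(
            List<Integer> inputIds,
            Encoder encoder,
            Decoder decoder,
            boolean encoderDecoder,
            int startTokenId,
            int eosTokenId,
            int maxNewTokens) {
        double[] memory = null;
        List<Integer> ids;
        if (encoderDecoder) {
            // Encoder-decoder: encode the inputs once, then decode from a start token.
            memory = encoder.encode(inputIds);
            ids = new ArrayList<>();
            ids.add(startTokenId);
        } else {
            // Decoder-only: the prompt itself is the beginning of the sequence.
            ids = new ArrayList<>(inputIds);
        }
        // Shared part: pick the highest-scoring token until EOS or the length limit.
        for (int step = 0; step < maxNewTokens; step++) {
            int next = argMax(decoder.nextLogits(ids, memory));
            ids.add(next);
            if (next == eosTokenId) {
                break;
            }
        }
        return ids;
    }

    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

Only the setup differs between the two model families; the decoding loop is identical, which matches the expectation that most of the code can be shared. A real implementation would also need to handle batching and cached key/value states, which this sketch leaves out.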

codecov-commenter · 2023-06-23T02:25:49Z

Codecov Report

Patch coverage: 54.83% and project coverage change: -0.03 ⚠️

Comparison is base (bb5073f) 72.08% compared to head (8242299) 72.06%.

Additional details and impacted files

@@             Coverage Diff              @@
##             master    #2637      +/-   ##
============================================
- Coverage     72.08%   72.06%   -0.03%     
- Complexity     5126     7020    +1894     
============================================
  Files           473      698     +225     
  Lines         21970    31252    +9282     
  Branches       2351     3224     +873     
============================================
+ Hits          15838    22521    +6683     
- Misses         4925     7200    +2275     
- Partials       1207     1531     +324

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| api/src/main/java/ai/djl/modality/cv/Image.java | `69.23% <ø> (-4.11%)` | ⬇️ |
| ...rc/main/java/ai/djl/modality/cv/MultiBoxPrior.java | `76.00% <ø> (ø)` | |
| .../main/java/ai/djl/modality/cv/output/Landmark.java | `100.00% <ø> (ø)` | |
| ...djl/modality/cv/transform/RandomFlipLeftRight.java | `25.00% <0.00%> (-25.00%)` | ⬇️ |
| ...djl/modality/cv/transform/RandomFlipTopBottom.java | `25.00% <0.00%> (-25.00%)` | ⬇️ |
| ...i/djl/modality/cv/translator/BigGANTranslator.java | `21.42% <0.00%> (-5.24%)` | ⬇️ |
| .../modality/cv/translator/ImageFeatureExtractor.java | `0.00% <0.00%> (ø)` | |
| .../ai/djl/modality/cv/translator/YoloTranslator.java | `27.77% <0.00%> (+18.95%)` | ⬆️ |
| ...ain/java/ai/djl/modality/cv/util/NDImageUtils.java | `67.10% <0.00%> (+7.89%)` | ⬆️ |
| api/src/main/java/ai/djl/modality/nlp/Decoder.java | `63.63% <ø> (ø)` | |

... and 227 more

... and 368 files with indirect coverage changes


KexinFeng requested review from zachgk, frankfliu and a team as code owners June 6, 2023 19:26

KexinFeng force-pushed the LMSearch branch 8 times, most recently from 4490def to 24d3a8e Compare June 9, 2023 19:59

KexinFeng changed the title ~~Lm search~~ [api] LMSearch Jun 9, 2023

KexinFeng force-pushed the LMSearch branch from 24d3a8e to a5f322f Compare June 9, 2023 20:11

frankfliu reviewed Jun 10, 2023

examples/src/main/java/ai/djl/examples/inference/GPTInference.java (outdated)

KexinFeng force-pushed the LMSearch branch 2 times, most recently from 95ddf0e to 8d91ef7 Compare June 13, 2023 21:56

KexinFeng and others added 5 commits June 16, 2023 14:11

LMSearchOnPt

6743b67

Batch_scheduler greedy_and_beam constrastiveSearch LLMDecoder front_end_translator GPT2PT.merge

ba1ffc8

PtLMBlock

d23bec9

Refactor TextGenerator API

a9d516c

Refactor TextGenerator API

661f215

frankfliu force-pushed the LMSearch branch from 54e2d17 to c22ee77 Compare June 16, 2023 21:12

Add unit test

6511fce

frankfliu force-pushed the LMSearch branch from c22ee77 to 6511fce Compare June 16, 2023 21:56

jawaff reviewed Jun 20, 2023

comments

9fa5bb7

This was referenced Jun 21, 2023

Contrastive Search #2547

Closed

Greedy search and beam search #2557

Closed

doc

a071e0b

KexinFeng force-pushed the LMSearch branch from 39a6db1 to a071e0b Compare June 23, 2023 01:57

fmt

c9c6fd4

KexinFeng requested review from frankfliu and jawaff June 23, 2023 03:05

Fixes javadoc

8242299

frankfliu approved these changes Jun 27, 2023

frankfliu changed the title ~~[api] LMSearch~~ [api] implements text-generation search algorithm Jun 27, 2023

KexinFeng merged commit 68c7a03 into deepjavalibrary:master Jun 27, 2023

KexinFeng mentioned this pull request Jun 27, 2023

Batch the sequences with ContrastiveSeqBatchScheduler #2572

Closed

This was referenced Aug 22, 2023

[api] Restore Lm search unittest to recover coverage rate #2723

Merged

How to parse the inputs and outputs of a Seq2Seq model #2565

Open

KexinFeng mentioned this pull request Oct 5, 2023

Fix memory leak in zeroGradients() #2792

Open
