
Conversation

@chenmoneygithub (Contributor)

resolve #712

@chenmoneygithub changed the title from "Fix the sample truncation strategy" to "Fix the sampler truncation strategy" on Feb 2, 2023
@mattdangerw (Member) left a comment


Left a few comments. Overall, the thing that is still striking me about this design is how special-cased the ragged code for doing the pre- and post-processing is getting (even in token id space). It seems like we have a few phases for generation:

  1. tokenize inputs
  2. mask and pack inputs
  3. sample
  4. truncate and manipulate outputs
  5. detokenize

It seems unlikely that we will ever do 2) and 4) in a way that is perfectly general, so it's unfortunate that 2), 3), and 4) all come together. It does not feel very modular.

Not something we need to solve on this PR, but let's keep thinking about that.
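For context, step 4 is the part this PR changes (see the linked issue below): truncation should only cut at the first `end_token` that appears *after* the prompt. A minimal sketch of that idea, assuming dense inputs and hypothetical names (`truncate_after_end_token`, `prompt_lengths`) that are not the PR's actual implementation:

```python
import tensorflow as tf

def truncate_after_end_token(token_ids, prompt_lengths, end_token_id):
    """Cut each row at the first end_token generated after the prompt.

    token_ids: dense [batch, max_length] int32 generated sequences.
    prompt_lengths: [batch] int32 prompt length per row.
    Returns a tf.RaggedTensor; rows without a generated end_token keep
    their full length.
    """
    max_length = tf.shape(token_ids)[1]
    positions = tf.range(max_length)[tf.newaxis, :]
    # Ignore end tokens inside the prompt itself, so a prompt that
    # happens to contain end_token_id is not blindly truncated.
    is_end = tf.logical_and(
        tf.equal(token_ids, end_token_id),
        positions >= prompt_lengths[:, tf.newaxis],
    )
    # argmax over the boolean mask returns the index of the first True.
    first_end = tf.argmax(
        tf.cast(is_end, tf.int32), axis=-1, output_type=tf.int32
    )
    has_end = tf.reduce_any(is_end, axis=-1)
    lengths = tf.where(has_end, first_end, max_length)
    return tf.ragged.boolean_mask(token_ids, positions < lengths[:, tf.newaxis])
```

For example, with `end_token_id = 2`, a row `[5, 6, 2, 7, 2, 0]` and prompt length 3 would be trimmed to `[5, 6, 2, 7]`: the end token at index 2 is part of the prompt and is ignored.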

@chenmoneygithub (Contributor, Author)
@mattdangerw Yeah, this sampler implementation now contains a lot of things: preprocessing, the sampling algorithm, and postprocessing. It's a bit hard to figure out the best user flow within the class itself; we can keep thinking about it while we roll out support for GPT-2 text generation, including caching, long-sequence generation, and so on.
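To make the modularity point concrete, here is a minimal sketch of what step 3 could look like in isolation, assuming a `next_token_logits_fn` callable; the names are illustrative, not the KerasNLP API:

```python
import tensorflow as tf

def greedy_sample(next_token_logits_fn, prompt_ids, num_steps):
    """Minimal greedy decoding loop over dense token ids.

    next_token_logits_fn: maps [batch, length] ids -> [batch, vocab] logits.
    prompt_ids: dense [batch, length] int32 prompt.
    """
    ids = prompt_ids
    for _ in range(num_steps):
        logits = next_token_logits_fn(ids)
        # Pick the most likely next token for each sequence in the batch.
        next_id = tf.argmax(logits, axis=-1, output_type=tf.int32)
        ids = tf.concat([ids, next_id[:, tf.newaxis]], axis=-1)
    return ids
```

Everything ragged (packing beforehand, end-token truncation afterwards) would then live outside this loop.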

@mattdangerw (Member) left a comment


Thanks!

@chenmoneygithub merged commit 10d451e into keras-team:master on Feb 12, 2023

Development

Successfully merging this pull request may close this issue:

Sampler should not blindly truncate out everything after the first end_token