Glebashnik/feed field generator #32842

Open: wants to merge 26 commits into base branch master

glebashnik
Contributor

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

glebashnik requested a review from lesters on November 12, 2024, 13:21
@@ -0,0 +1,29 @@
package ai.vespa.generative;
Member

Same, I think we could put this in llm.generation

And maybe, since what separates it from Generator isn't that it's generating from a language model, but that it's taking an additional config specifying a prompt, we could call it something like PromptedGenerator, or ConfiguredGenerator?

@@ -31,7 +34,7 @@
*
* @author lesters
*/
-public class LocalLLM extends AbstractComponent implements LanguageModel {
+public class LocalLLM extends AbstractComponent implements LanguageModel, Generator {
Member

The role of Generator is to apply a LanguageModel to a generation task specified by a context. This is a more general API so it's a bit messy for it to know about that context (and also to call itself through a utility defined elsewhere). I see the purpose though ... maybe there's a better solution, but I need to get some breakfast now.
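
One way to read this concern is as an argument for composition: the model class implements only the model interface, and a separate component applies it to a generation task with a configured prompt. The sketch below is a minimal illustration under that assumption, not the code in this PR. `SimpleLanguageModel`, `PromptedGenerator`, and the `{input}` placeholder are hypothetical names and do not reflect the actual Vespa interfaces.

```java
// Hypothetical stand-in for a language model: maps a prompt to a completion.
interface SimpleLanguageModel {
    String complete(String prompt);
}

// Hypothetical wrapper: owns the prompt configuration, so the model stays unaware of it.
final class PromptedGenerator {
    private final SimpleLanguageModel model;
    private final String promptTemplate; // assumed to contain the placeholder {input}

    PromptedGenerator(SimpleLanguageModel model, String promptTemplate) {
        this.model = model;
        this.promptTemplate = promptTemplate;
    }

    // Fills the template with the input text and delegates to the wrapped model.
    String generate(String input) {
        return model.complete(promptTemplate.replace("{input}", input));
    }
}

class PromptedGeneratorDemo {
    public static void main(String[] args) {
        // A fake model that echoes its prompt in brackets, for illustration only.
        SimpleLanguageModel echo = prompt -> "[" + prompt + "]";
        PromptedGenerator generator = new PromptedGenerator(echo, "Summarize: {input}");
        System.out.println(generator.generate("Vespa is a search engine"));
        // prints "[Summarize: Vespa is a search engine]"
    }
}
```

Keeping the prompt in the wrapper means the model class does not need to know about the generation context.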

*
* @author glebashnik
*/
public class LanguageModelTextGenerator extends AbstractComponent implements TextGenerator {
Contributor Author

The name is a bit verbose but quite precise.
I also considered LMTextGenerator.

*
* @author glebashnik
*/
public interface TextGenerator {
Contributor Author

Renamed from Generator to TextGenerator since Generator is too general.

+ outputType.getName());

super.setOutputType(null, outputType, null, context); // TODO: Why not set actualOutput to outputType?
return outputType; // return input type the same as output type: string or array<string>
Contributor Author

glebashnik, Jan 6, 2025

Input-output type inference is straightforward: the input type is the same as the output type.
Converting a string to an array is handled outside, using a split expression (see the testGeneratorWithStringInputArrayOutput test).
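
The rule described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the PR's implementation: `FieldType`, `GenerateTypeInference`, and `split` are hypothetical stand-ins, and where exactly the split sits in the real indexing pipeline is not shown in this thread.

```java
// Hypothetical, simplified stand-in for the field types involved; not Vespa's DataType.
enum FieldType { STRING, ARRAY_OF_STRING, INT }

final class GenerateTypeInference {

    // The input type is the same as the output type,
    // and only string and array<string> are accepted.
    static FieldType inputTypeFor(FieldType outputType) {
        if (outputType != FieldType.STRING && outputType != FieldType.ARRAY_OF_STRING)
            throw new IllegalArgumentException(
                    "Generate expression requires either a string or array<string> output type, but got "
                            + outputType);
        return outputType;
    }

    // Stand-in for a split step that lives outside the generator and turns a string into an array.
    static java.util.List<String> split(String text, String separatorRegex) {
        return java.util.Arrays.asList(text.split(separatorRegex));
    }
}

class GenerateTypeInferenceDemo {
    public static void main(String[] args) {
        System.out.println(GenerateTypeInference.inputTypeFor(FieldType.ARRAY_OF_STRING)); // ARRAY_OF_STRING
        System.out.println(GenerateTypeInference.split("red, green, blue", ",\\s*"));      // [red, green, blue]
    }
}
```

Because the generate step never changes the type, any conversion between string and array<string> has to be an explicit, separate step.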

glebashnik requested a review from bratseth on January 6, 2025, 12:28
glebashnik
Contributor Author

@bratseth Let me know if you think the design and naming are good enough.
Left some explanations in comments here.

glebashnik marked this pull request as ready for review on January 7, 2025, 10:08
Member

bratseth left a comment

Looks great to me!

@@ -82,7 +82,7 @@ public DataType setOutputType(DataType outputType, VerificationContext context)
throw new VerificationException(this, "Generate expression requires either a string or array<string> output type, but got "
+ outputType.getName());

-super.setOutputType(null, outputType, null, context); // todo: Why not set actualOutput to outputType?
+super.setOutputType(null, outputType, null, context); // TODO: Why not set actualOutput to outputType?
Member

It won't make any difference when the required and actual types are the same.
(The reason for separating them is just that the required type may be a supertype of the actually produced type.)
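
To make the required versus actual distinction concrete, here is a small sketch. It uses `java.lang.Class` as a stand-in for Vespa's `DataType`, which is an assumption made purely for illustration; `OutputTypes` and `distinctionMatters` are hypothetical names.

```java
// Hypothetical pair of types for one step; java.lang.Class stands in for Vespa's DataType.
final class OutputTypes {
    final Class<?> required; // what the consumer of the output demands
    final Class<?> actual;   // what the step really produces

    OutputTypes(Class<?> required, Class<?> actual) {
        // The actual type must be the required type or one of its subtypes.
        if (!required.isAssignableFrom(actual))
            throw new IllegalArgumentException(actual.getName() + " is not a " + required.getName());
        this.required = required;
        this.actual = actual;
    }

    // True only when the required type is a strict supertype of the actual type.
    boolean distinctionMatters() {
        return !required.equals(actual);
    }
}

class OutputTypesDemo {
    public static void main(String[] args) {
        System.out.println(new OutputTypes(String.class, String.class).distinctionMatters());       // false
        System.out.println(new OutputTypes(CharSequence.class, String.class).distinctionMatters()); // true
    }
}
```

In the generate expression both types are the same, which corresponds to the first case.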

3 participants