Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Llm search #1531

Closed
wants to merge 23 commits into from
Closed

Llm search #1531

wants to merge 23 commits into from

Conversation

frankyhollywood
Copy link
Contributor

No description provided.

* Search for resources based on name or description, given query as a simple text.
*/
class FulltextAPI {
remoteURL = '/api/aisearch/';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put 'const' here

var conversation = new JSONObject();
conversation.put("id", obj.getJSONObject("conversation").getString("conversationId"));
conversation.put("topic", obj.getJSONObject("conversation").getJSONArray("messages").getJSONObject(0).getJSONObject("userInput").getString("input"));
conversation.put("start", new LlmConversation().getStartTime(obj.getJSONObject("conversation").getString("startTime")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, move all logic to another layer, 'service' not to mixup controller layer with logic and for easier logic testing.

Also, I would create instance of LlmConversation on the class instance creation instead of creating it on each request


@Override
protected void initApp() {
get("/newchat", "application/json", (req, res) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think of the API in REST/HTTP API way:

  • use HTTP DELETE method for deletion
  • use HTTP POST method for any type of creation (with the GET '/newchat' we change the state, it is not even idempotent)
  • not to use verbs in the path, use proper HTTP methods

We could consider API like this:

  • GET /chat - get all conversations
  • POST /chat - current /newchat
  • GET /chat/{id} - current /history/:id NOTE: I did not gat the difference between '/history/:id' and '/conversation/:id', could you elaborate please?
  • POST /chat/{id}/query
  • DELETE /chat/{id}

There is one more in the code POST '/' - I don't get it, please elaborate too

}

private Object handleError(Request req, Exception e) {
log.error("Unexpected error in AiSearchApp: " + e.getMessage() + "\n\nrequest: " + req.body());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't put class name in the log as the name may be changed, logger configuration is the right place for that

+ Math.abs(token.getPreferredUsername().hashCode());
}

throw new UnsupportedOperationException("User key not found in token");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnsupportedOperationException is used to literally say that something is not supported, should be another, at least RuntimeException and a proper wrapper

return "{}";
}

private String getUserKey() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest not to bring this method here not to violate single responsibility principle, rather it could be another method of RequestContext class

try {
return new String(Files.readAllBytes(path));
} catch (Exception e) {
return "{}";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like error handling missed here

try {
var obj = new JSONObject(c);
var conversation = new JSONObject();
conversation.put("id", obj.getJSONObject("conversation").getString("conversationId"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to use POJO classes as more Java way instead of manipulation over pure JSON, it would also mitigate appearance of potential errors

try (FileWriter writer = new FileWriter(file)) {
writer.write(result.toString());
} catch (Exception e) {
System.out.println("Error writing conversation history file: " + e.getMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use logger, not system output

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, check other places, there are 6 of them

@ewelinagr
Copy link
Member

To me the whole AISearch API belongs in Pluto, not Saturn. It is mainly based on redirection, this is one of the reasons Pluto was introduced for. Also, we already have ExternalSearch logic, which is really similar to this and stays within Pluto. To me Saturn should be all about our internal storage, not redirecting to external services

}

private Object handleError(Request req, Exception e) {
log.error("Unexpected error in AiSearchApp: " + e.getMessage() + "\n\nrequest: " + req.body());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, we don't let user know that something went wrong, let's re-though it and handle with proper http error code in the handler in BaseApp

implementation("org.json:json:20240303")

// LLM is customer specific. By default the jar will be a demo implementation, overwrite it with customer specific implementation
implementation files("libs/nl-hyve-fairspace-demo-v2.4.jar")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I am still not happy with the approach :( We are adding not only the dependency version, but artifact itself too. So, it is pretty tough to change it, too tight coupling. Also, let me know if I am stupid, I am not getting how we can conditionally AskAI switch on and off.

@tgreenwood
Copy link
Contributor

also I remember Ask AI is a POC (it also does not include any tests), but way are we adding it to the dev branch?

@tgreenwood
Copy link
Contributor

To me the whole AISearch API belongs in Pluto, not Saturn. It is mainly based on redirection, this is one of the reasons Pluto was introduced for. Also, we already have ExternalSearch logic, which is really similar to this and stays within Pluto. To me Saturn should be all about our internal storage, not redirecting to external services

There is a bit of logic around managing the chats and chat history, not sure it would be correct to put business logic to the gateway layer. I still believe it should be a separate service, not a part of Pluto nor Saturn. It would also let us easily toggle the whole feature

@ewelinagr ewelinagr force-pushed the ui-redesign branch 2 times, most recently from 17c0530 to 4d81cb8 Compare July 18, 2024 14:25
Base automatically changed from ui-redesign to dev July 24, 2024 10:21
@ewelinagr
Copy link
Member

ewelinagr commented Jul 24, 2024

Could you also make the naming more consistent? Especially the clases are currently named from LLM search, Fulltext search, AI search to just "conversation". I also see references like "Article search". I would prefix everything "AI" or "Llm" and then decide on single name afterwards, preferably "AISearch...", since this is how the API endpoint is named now. Then instead of:

= llm
== Conversation.js
== FulltextAPI.js
== LlmSearchPage.js

you would have:

= ai
== AISearchPage.js
== AISearchAPI.js
== AISearchConversation.js or AISearchPanel.js

.then(() => setLoading(false))
.catch(() => {
handleHttpError('Connection error.');
handleSearchError('Fairspace article search is not available at the moment');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Fairspace article search" - is this always articles? If so, we should make it more specific, otherwise we need more generic naming


const showArticle = articleId => {
if (articleId !== null && articleId.length > 0) {
LocalSearchAPI.lookupSearch(articleId, 'https://www.fns-cloud.eu/study').then(extractArticleIri);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like hardcoded prefixes in the generic UI, this is a component that works in a really specific setup...

return (
<div>
{responseArticles && responseArticles.length > 0 && (
<Typography variant="caption">Source studies:</Typography>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it assumes a model with "study" entities?

import LinkedDataEntityPage from '../metadata/common/LinkedDataEntityPage';
import {LocalSearchAPI} from '../search/SearchAPI';

const Conversation = props => {
Copy link
Member

@ewelinagr ewelinagr Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This huge class should be divided into smaller modules, currently it is hard to follow what happens here

@frankyhollywood frankyhollywood deleted the llm-search-rebase branch August 12, 2024 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants