Different LLM models for different tasks #57

Open
fraricci opened this issue Jul 19, 2024 · 1 comment
@fraricci
Collaborator

From what I have been reading, a better approach to handle a user query and the different tools is to use dedicated LLM agents, possibly backed by different models: one higher-level agent that digests the initial user query and decides what to do, and one (or more) agents dedicated to generating the JSON input for the different tools and gathering the JSON output.

I think this is what LLaMP is also doing (@chiang-yuan could comment on that).
It also seems to be the general approach suggested by Groq when presenting their new
models, specifically tuned for tool calling: https://wow.groq.com/introducing-llama-3-groq-tool-use-models/.
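
Something along these lines, as a rough sketch of the two-level split (the `call_llm` helper, the model names, and the tool registry are just placeholders, not the actual implementation):

```python
import json

# Placeholder for whatever chat-completion client ends up being used;
# takes a model name and a prompt and returns the raw text response.
def call_llm(model: str, prompt: str) -> str:
    raise NotImplementedError

# Hypothetical tool registry, only to illustrate the routing step.
TOOLS = {
    "get_equilibrium_lattice": "Return the relaxed structure for a chemical symbol.",
    "get_bulk_modulus": "Return the bulk modulus for a chemical symbol.",
}

def planner_agent(user_query: str) -> str:
    """Higher-level agent: digest the user query and decide which tool to call."""
    tool_list = "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())
    prompt = (
        "You route materials-science questions to tools.\n"
        f"Available tools:\n{tool_list}\n"
        f"Question: {user_query}\n"
        "Reply with the tool name only."
    )
    return call_llm(model="large-general-model", prompt=prompt).strip()

def tool_agent(tool_name: str, user_query: str) -> dict:
    """Dedicated agent: generate the JSON input for the selected tool."""
    prompt = (
        f"Produce a JSON object with the arguments for the tool '{tool_name}' "
        f"needed to answer: {user_query}\nReply with JSON only."
    )
    return json.loads(call_llm(model="small-tool-use-model", prompt=prompt))
```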

@jan-janssen
Owner

I was wondering why separate agents are better than a single LLM, even when the agents use the same LLM in the background. I found the following paper, https://doi.org/10.1016/S0004-3702(99)00022-3, which explains it nicely with a soccer analogy. The general goal is to win the game, and you win the game by scoring goals, so naively you could say every player should aim to score goals; then the whole team is focused on scoring and that should lead to success. But the more effective strategy is for the different players to have dedicated roles: a goalkeeper, defenders, forwards, and so on. The paper shows that this implementation of roles helped the authors reach 3rd place in RoboCup, a simulated soccer competition for robots.

In the context of agents for materials science, I think this means it makes sense to have distinct agents with separate areas of expertise. If they can form a coherent picture from their different expertise, then we have most likely generated a new insight. If they disagree, for example if the experimental agent based on the literature comes to a different conclusion than the simulation agent, then the current information is not yet conclusive and they need to study more: run more simulations, read more papers, and so on.
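
As a very rough sketch of that last point (the `experiment_agent` and `simulation_agent` callables are placeholders for whatever expert agents we end up building):

```python
def reconcile(question, experiment_agent, simulation_agent, max_rounds=3):
    """Agreement between the expert agents counts as a coherent insight;
    disagreement triggers another round of evidence gathering."""
    for _ in range(max_rounds):
        exp_answer = experiment_agent(question)   # e.g. literature-based agent
        sim_answer = simulation_agent(question)   # e.g. simulation-based agent
        if exp_answer == sim_answer:
            return exp_answer  # coherent picture across both expertises
        # Not conclusive yet: feed the disagreement back so each agent
        # studies more (more papers, more simulations) before answering again.
        question = (
            f"{question}\nPrevious answers disagreed: experiment said "
            f"'{exp_answer}', simulation said '{sim_answer}'. "
            "Gather more evidence and answer again."
        )
    return None  # still inconclusive after max_rounds
```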
