Implementing KernelFactory interface in order to allow custom factories + Using SK function that supports external tools#64
Conversation
@dotnet-policy-service agree |
sfmskywalker
left a comment
There was a problem hiding this comment.
I like your PR in general, I only have a few small requests and some questions, which server more to increase my understanding than anything else. Thank you :)
src/modules/agents/Elsa.Agents.Activities/Elsa.Agents.Activities.csproj
Outdated
Show resolved
Hide resolved
| try | ||
| { | ||
| return JsonSerializer.Deserialize<JsonElement>(content!); | ||
| } | ||
| catch (JsonException ex) | ||
| { | ||
| throw new InvalidOperationException($"Error deserializing the message content as JSON:\n{content}", ex); | ||
| } |
There was a problem hiding this comment.
Out of curiosity, why is it better to explicitly handle a serialization exception and then wrapping it in an IOE?
There was a problem hiding this comment.
I added the explicit handling of JsonException and wrapped it in an InvalidOperationException mainly as a defensive programming measure.
Over the past few weeks, I’ve been doing a kind of reverse engineering to fully understand how the repository and the NuGet packages interact. I wanted to make sure I could trust and trace what's going on at every level while I was testing.
So this change was mostly precautionary, happy to revert or adjust it!
You probably have everything more aligned in your mind and clearly understand which parts interact and why. Since this is still 'new' to me, it's harder for me to reach those conclusions so quickly, haha. That's why I added that detail, more like a 'let me specify this for a moment, so if it breaks, I'll know what caused it.'
There was a problem hiding this comment.
Got it, makes sense 👍🏻 Perhaps it'a not a bad practice actually, especially if we throw a custom exception to make it extra clear what's happening. In this case, it's failing to parse the agent's response as JSON, so that would be a nice message to include with a custom exception class called e.g. FunctionCallException or AgentInvocationException. If you agree, please add an exception class with a name that you think makes sense in this context. Alternatively, I'm happy to do so myself once this is merged.
I have to leave right now, but I’ll try to answer your questions and correct the mistakes as soon as I can during the day. Thank you very much for your quick response and effort! |
|
I believe I’ve addressed all the comments, there are a couple I haven’t marked as “resolved” since I think they depend on your opinion haha. With these changes, for example, I’d already be able to implement MCPs in my custom KernelFactory, and they’d work because this method in AgentInvoker supports tools. Also, as a remember, by using chat-style methods, you can choose whether to store the history/context, which gives you more flexibility. Btw, as I mentioned before, there’s a ChatCompletionAgent that allows patterns like orchestration, handoff, among others. For now, though, this PR serves as a solid base implementation, I wouldn’t dive into implementing the new Agents features from Semantic Kernel without having this foundation in place first. I highly recommend checking out this microsoft course (I did hah) for the future too, it’s super relevant if we’re planning to implement full native Agent support using Semantic Kernel. They explain it in Python, but everything is already available in C# as well. Let me know if you need anything, I’m around! And sorry, I still owe you that email from a few weeks ago. But anyway, this was pretty much what I wanted to discuss: agents and Semantic Kernel. |
|
Great! Thank you for your detailed rundown, it helps me a lot that you know more about Semantic Kernel than I do. I appreciate the linked Microsoft course and will definitely check it! I noticed their Agent framework many months ago when it was still in preview (maybe it still is), and I couldn't quite figure out yet how it might or might not fit with my agent definition setup. However, I definitely think we want to implement fill native Agent support using SK, and the fact that you think with this PR we have a solid base increases my confidence level that we're on the right path with the way we're using SK. Thanks for your support! |
It should be possible to modify the kernel factory through the interface in order to access the Kernel or change the public implementations.
To give an example, in my case I'm currently trying to implement MCP servers in Elsa. This will take time and requires modifying many parts of the agents folder. It’s very likely that I’ll submit a PR so Elsa can support it natively, but I should be able to use my implementation through yours NuGets without needing to implement it directly in the repository.
Right now, I'm unable to implement my code properly without access to the kernel instantiated by the KernelFactory.
Furthermore, the current function used to send the request to the AI is creating a Semantic Kernel tool that renders the prompt, and then sends the result of this toot, that is, the already rendered prompt, to the AI as a query.
This method prevents loading your own external tools, such as tools from an external MCP server.
I used a Semantic Kernel NuGet package to access the engine that renders the prompt, and I render it separately just like the original function does. Then I call ChatCompletion, which is the standard method to interact with the AI and allows the use of external tools.
It also lets you manage a ChatHistory to retain the conversation context if desired. This way, it is indeed possible to use external tools in Elsa.
There is a more advanced SK method specialized for agents, but for now, I think this is the most viable option. Currently, without this approach, external tools cannot be used.
It’s worth noting that in Elsa Studio, the format of the prompt structure and its variables is practically the same, except you no longer need to use the $, so if you have a variable named variable1, you would write {{variable1}}.
It’s literally using the same engine that renders this in the original function, the difference is that it has been customized and uses the $ for invoking internal Semantic functions, which is now deprecated. Normally, you should use an MCP.