Anyone have experience with different return data formats from a Plugin? #9414
-
Hi. When using plugins, the samples normally return a .NET object (often a list of objects), which is then persisted in chat history as JSON. While this works, JSON is a relatively verbose format (not normally an issue, but we are dealing with tokens and token limits here). Has anyone experimented with returning other data formats, such as old-school CSV, BSON, MessagePack, or Protobuf, to reduce data size/tokens used, and can share experience on how well the LLM handles these formats compared to JSON?
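For reference, a minimal sketch of the kind of plugin I mean (the type and property names are just illustrative):

```csharp
// Sketch of a plugin whose function returns a .NET list; Semantic Kernel
// serializes the result to JSON before it lands in chat history.
using System.Collections.Generic;
using System.ComponentModel;
using Microsoft.SemanticKernel;

public sealed record Order(int Id, string Customer, decimal Total);

public sealed class OrderPlugin
{
    [KernelFunction, Description("Gets recent orders.")]
    public List<Order> GetRecentOrders() =>
        new()
        {
            new Order(1, "Contoso", 99.95m),
            new Order(2, "Fabrikam", 12.50m),
        };
    // The serialized result the model sees looks like:
    // [{"Id":1,"Customer":"Contoso","Total":99.95}, ...] - verbose but reliable.
}
```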
Replies: 2 comments
-
The additional data formats you mentioned are not types we support; instead, we recommend reducing chat history (https://learn.microsoft.com/en-us/semantic-kernel/concepts/ai-services/chat-completion/chat-history?pivots=programming-language-csharp) to cut token usage.
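For example, a minimal sketch, assuming a Semantic Kernel release that ships the ChatHistoryTruncationReducer (the target count below is illustrative):

```csharp
// Trim chat history with a reducer before each model call so verbose
// JSON tool results don't accumulate indefinitely.
using Microsoft.SemanticKernel.ChatCompletion;

var history = new ChatHistory();
history.AddSystemMessage("You are a helpful assistant.");
// ... turns accumulate here, including verbose JSON tool results ...

// Keep roughly the last N messages; ReduceAsync returns null when the
// history is already within the target, so only rebuild when needed.
var reducer = new ChatHistoryTruncationReducer(targetCount: 10);
var reduced = await reducer.ReduceAsync(history);
if (reduced is not null)
{
    history = new ChatHistory(reduced);
}
```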
-
@rwjdk In my experience, XML is the best format for LLMs because of its named closing tags. JSON is already an optimization, in the sense that it uses fewer tokens, but it can lose precision if the object is too complex. CSV is harder for an LLM to track, because a column name can be very far from its value and because escaped quotes confuse some models. In other words, the more the data is compressed, the less the model is able to reason about it.

One thing you could consider is managing the chat history content after each message: pass the model only the relevant context data and hold the entire context outside of the history, as in the sketch below.
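A hypothetical sketch of that last idea (the external store and key scheme are illustrative, not a Semantic Kernel API):

```csharp
// Keep full tool output outside the history; the model only ever sees
// a compact summary plus a lookup key.
using System;
using System.Collections.Generic;
using Microsoft.SemanticKernel.ChatCompletion;

var fullResults = new Dictionary<string, string>(); // full payloads live here
var history = new ChatHistory();

// Store the complete plugin output outside the history; return a short handle.
string StoreResult(string json)
{
    var key = Guid.NewGuid().ToString("N")[..8];
    fullResults[key] = json;
    return key;
}

// After a plugin call returns a large JSON payload:
var toolJson = "[{\"id\":1,\"name\":\"Contoso\"}]"; // stand-in for real output
var key = StoreResult(toolJson);

// Only the summary and the reference key enter the history.
history.AddAssistantMessage($"Tool returned 1 record (ref {key}).");
// If the model later asks for details, resolve fullResults[key] on demand.
```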