Commit 82401e6

TokenCount performance issues (microsoft#580)
### Motivation and Context

We encountered performance issues in `TokenCount` and implemented the following fix.

### Description

We updated the `TokenUtils` class in the Skills Web API to address these issues. The tokenizer used for encoding text is now created once, as a static field, and reused, instead of being created on every call to `TokenCount`. This makes token counting faster, which in turn gives users faster responses.

### Contribution Checklist

- [ ] The code builds clean without any errors or warnings
- [ ] The PR follows the [Contribution Guidelines](https://github.com/microsoft/chat-copilot/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/chat-copilot/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [ ] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄
1 parent f8f7d49 commit 82401e6

1 file changed: +2 -1 lines changed

webapi/Skills/Utils/TokenUtils.cs

@@ -17,6 +17,8 @@ namespace CopilotChat.WebApi.Skills.Utils;
 /// </summary>
 public static class TokenUtils
 {
+    private static SharpToken.GptEncoding tokenizer = SharpToken.GptEncoding.GetEncoding("cl100k_base");
+
     /// <summary>
     /// Semantic dependencies of ChatSkill.
     /// If you add a new semantic dependency, please add it here.
@@ -98,7 +100,6 @@ internal static void GetFunctionTokenUsage(SKContext result, SKContext chatConte
     /// <param name="text">The string to calculate the number of tokens in.</param>
     internal static int TokenCount(string text)
     {
-        var tokenizer = SharpToken.GptEncoding.GetEncoding("cl100k_base");
         var tokens = tokenizer.Encode(text);
         return tokens.Count;
     }
