Add Cerebras Integration #3574
@microsoft-github-policy-service agree company="Cerebras"
cc: @marklysze, seems like you're working in this area of the codebase :)
Moved to autogenhub#48
I see you closed this, did you still want us to review this?
Yes I still want it to be reviewed, but on Discord I was told to open the PR in the Autogen org instead
We'd be happy to review. Would you be able to reopen it so we're able to merge?
@jackgerrits it won't let me reopen this PR, but I opened another one here: #3585
Sounds good thanks!
Why are these changes needed?
Add an integration for Cerebras, which provides low-latency, high-throughput LLM inference. Currently, Llama 3.1 8B and 70B are supported.
This PR includes tool-calling examples, which also work with streaming. Token cost calculation is implemented as well.
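For reviewers, here is a minimal sketch of how a per-request token cost calculation can work. It is not the code from this PR: the `calculate_cost` helper, the `PRICES_PER_MILLION` table, the model identifiers, and the prices themselves are all assumptions chosen for illustration.

```python
# Hypothetical price table: USD per one million tokens as (input, output).
# The model IDs and prices are placeholders, not Cerebras' actual pricing.
PRICES_PER_MILLION = {
    "llama3.1-8b": (0.10, 0.10),
    "llama3.1-70b": (0.60, 0.60),
}


def calculate_cost(prompt_tokens: int, completion_tokens: int, model: str) -> float:
    """Return the estimated cost in USD for one request.

    Unknown models fall back to a cost of 0.0 so that a missing price entry
    does not break the calling code.
    """
    if model not in PRICES_PER_MILLION:
        return 0.0
    input_price, output_price = PRICES_PER_MILLION[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 1,200 prompt tokens and 300 completion tokens on the 70B model.
    print(calculate_cost(1200, 300, "llama3.1-70b"))
```

Keeping input and output prices separate lets the same helper cover models that bill the two differently, even though the placeholder values here happen to be equal.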
Checks