Cerebras Integration #48

Merged
merged 7 commits into from
Oct 1, 2024


henrytwo
Collaborator

Why are these changes needed?

Add an integration for Cerebras, which provides low-latency, high-speed LLM inference. Currently, Llama 3.1-8B and Llama 3.1-70B are supported.

This PR includes tool-calling examples, which also work with streaming. Token cost calculations have been implemented as well.
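
As a rough illustration of what a token cost calculation involves, here is a minimal sketch. It is not the code from this PR: the function name `calculate_cost`, the `PRICES_PER_MILLION` table, the model keys, and the prices themselves are all hypothetical placeholders, and real pricing should be taken from the provider.

```python
# Hypothetical per-million-token prices in USD as (input, output).
# Placeholder values for illustration only, not actual Cerebras pricing.
PRICES_PER_MILLION = {
    "llama3.1-8b": (0.10, 0.10),
    "llama3.1-70b": (0.60, 0.60),
}


def calculate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost in USD for one completion.

    Unknown models return 0.0 so that a missing price entry does not
    raise in the middle of a conversation (an assumed design choice).
    """
    if model not in PRICES_PER_MILLION:
        return 0.0
    input_price, output_price = PRICES_PER_MILLION[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 1,200 prompt tokens and 300 completion tokens.
    print(calculate_cost("llama3.1-70b", 1200, 300))
```

The prompt and completion token counts would typically come from the usage data returned with each response; keeping input and output prices separate allows for providers that charge them differently.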

Checks

Collaborator

@Hk669 Hk669 left a comment

I'm so amazed by the inference speed. Great work, Cerebras and @henrytwo. I've made a few suggestions; also, we don't need the autobuild notebook formatting in this PR. We can handle that in a separate PR.

cc @marklysze

@henrytwo henrytwo requested a review from Hk669 October 1, 2024 13:48
Collaborator

@Hk669 Hk669 left a comment

I've tested with the chess tool-call example, a few general examples, and marketing campaign use cases. The inference speed is super impressive. Thanks, @henrytwo.

Looks good to me.

@Hk669 Hk669 merged commit 3d59156 into autogenhub:main Oct 1, 2024
157 of 164 checks passed
odoochain pushed a commit to odoochain/autogen that referenced this pull request Nov 10, 2024
* Organize some more modules

* cleanup model_client
2 participants