Cerebras Integration #48
Conversation
I'm so amazed with the inference speed. Great work Cerebras, @henrytwo. I've made a few suggestions, and we don't need the AutoBuild notebook formatting in this; we can have a separate PR for that.
cc @marklysze
I've tested with the chess tool-call example, a few general examples, and marketing campaign use cases. The speed of the inference is super impressive. Thanks @henrytwo.
Looks good to me.
* Organize some more modules
* Clean up model_client
Why are these changes needed?
Add an integration for Cerebras, which provides very low-latency, high-speed LLM inference. Currently Llama 3.1-8B and Llama 3.1-70B are supported.
Tool-calling examples, which also work with streaming, are included in this PR. Additionally, token cost calculations have been implemented.
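For reviewers trying this out, a minimal sketch of what configuring the new backend might look like, assuming it follows AutoGen's usual OAI-style config list with a dedicated `api_type` value. The model name, the `"cerebras"` api_type string, and the `CEREBRAS_API_KEY` variable here are assumptions for illustration; check the docs added in this PR for the exact values.

```python
import os

# Hypothetical config-list entry for the Cerebras client added in this PR.
# Keys mirror AutoGen's standard OpenAI-style config format (assumption).
config_list = [
    {
        "model": "llama3.1-70b",  # assumed model id; Llama 3.1-8B/70B are supported per the PR
        "api_key": os.environ.get("CEREBRAS_API_KEY", "sk-placeholder"),
        "api_type": "cerebras",   # assumed value routing requests to the Cerebras client
    }
]

def validate_config(cfg: dict) -> bool:
    """Sanity-check that a config entry carries the keys a client would need."""
    return all(k in cfg for k in ("model", "api_key", "api_type"))

assert all(validate_config(c) for c in config_list)

# An agent would then be constructed roughly as usual, e.g.:
# assistant = autogen.AssistantAgent("assistant", llm_config={"config_list": config_list})
```

Tool calling and streaming should then work through the normal agent APIs, per the examples in this PR.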
Checks