[Feature Request] Add SmolLM model and WebLLM #850

Open · 1 of 2 tasks
lightaime opened this issue Aug 21, 2024 · 3 comments · May be fixed by #917
Comments

lightaime (Member) commented Aug 21, 2024

Required prerequisites

Motivation

Add the SmolLM model (https://huggingface.co/blog/smollm, https://huggingface.co/spaces/HuggingFaceTB/SmolLM-360M-Instruct-WebGPU), along with WebGPU support via WebLLM (https://webllm.mlc.ai/).

Solution

No response

Alternatives

No response

Additional context

No response

@lightaime lightaime added the enhancement New feature or request label Aug 21, 2024
@lightaime lightaime self-assigned this Aug 21, 2024
@lightaime lightaime added the P2 Task with low level priority label Aug 21, 2024
@Wendong-Fan Wendong-Fan added this to the Sprint 11 milestone Aug 23, 2024
Wendong-Fan (Member) commented

Hey @lightaime, this model is supported by Ollama; should we do a native integration? Reference: https://ollama.com/library/smollm
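For reference, Ollama serves a local HTTP API on its default port, so a native integration would roughly reduce to a call like the following minimal TypeScript sketch. The `/api/generate` endpoint and the `smollm` model tag come from Ollama's docs and library page; the `askSmolLM` helper itself is hypothetical:

```ts
// Minimal sketch: query a locally running Ollama server for a SmolLM completion.
// Assumes `ollama pull smollm` has been run and the server is on the default port.
async function askSmolLM(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream: false returns a single JSON object instead of NDJSON chunks.
    body: JSON.stringify({ model: "smollm", prompt, stream: false }),
  });
  const data = await res.json();
  return data.response; // Ollama puts the completion text in the "response" field.
}

askSmolLM("What is WebGPU?").then(console.log);
```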

@lightaime lightaime changed the title [Feature Request] Add SmolLM model [Feature Request] Add SmolLM model and WebGPU Sep 2, 2024
@lightaime lightaime changed the title [Feature Request] Add SmolLM model and WebGPU [Feature Request] Add SmolLM model and WebLLM Sep 2, 2024
koch3092 (Collaborator) commented Sep 4, 2024

The difference between WebLLM and an LLM web app:

Note: the browser's dependency on the GPU is ignored here.

[Image: comparison diagram of WebLLM vs. a conventional LLM web app]

koch3092 (Collaborator) commented Sep 4, 2024

Here are the key features of WebLLM:

  1. WebLLM leverages WebGPU on the user's local machine for hardware acceleration, enabling high-performance language model inference directly in the browser. This removes server dependencies, enhances privacy and personalization, and lowers operational costs.
  2. WebLLM natively supports a wide range of popular models, including Llama, Hermes, Phi, Gemma, RedPajama, Mistral, SmolLM, and Qwen, making it adaptable to various tasks.
  3. It is fully compatible with the OpenAI API, offering features like JSON mode, function calling, and streaming output, which simplifies integration for developers (see the sketch after this list).
  4. WebLLM allows the integration of custom models in MLC format to meet specific needs, enhancing flexibility in model deployment.
  5. As a standalone package, WebLLM can be quickly integrated into projects via NPM, Yarn, or CDN, and comes with comprehensive examples and a modular design that makes it easy to connect with UI components.
  6. It supports streaming output and real-time interaction, making it suitable for applications like chatbots and virtual assistants.
  7. By delegating computational tasks to Web Workers or Service Workers, WebLLM keeps the UI responsive and efficiently manages model lifecycles.
  8. It provides examples for building Chrome extensions, allowing users to extend browser functionality with ease.
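To make points 3 and 6 concrete, here is a minimal TypeScript sketch of WebLLM's OpenAI-compatible streaming API, following the `@mlc-ai/web-llm` README. The SmolLM model id is an assumption and should be checked against `prebuiltAppConfig.model_list` for the installed version:

```ts
// Minimal sketch: in-browser inference with WebLLM's OpenAI-style API.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Downloads the model weights and compiles WebGPU kernels on first run.
  const engine = await CreateMLCEngine("SmolLM-360M-Instruct-q4f16_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  // OpenAI-compatible chat completion with streaming output.
  const chunks = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
    stream: true,
  });

  let reply = "";
  for await (const chunk of chunks) {
    reply += chunk.choices[0]?.delta?.content ?? "";
  }
  console.log(reply);
}

main();
```

For point 7, the library also exposes a Web Worker variant (`CreateWebWorkerMLCEngine`) with the same interface, so inference can run off the main thread.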

@Wendong-Fan Wendong-Fan linked a pull request Sep 9, 2024 that will close this issue