
Added Hugging Face Inference API Endpoint Support #49

Merged · 3 commits · Apr 20, 2024
18 changes: 10 additions & 8 deletions README.md
@@ -1,7 +1,7 @@
## [Try MinimalGPT/MinimalClaude/MinimalLocal (Public Site)](https://minimalgpt.app/)

![Build Status](https://img.shields.io/badge/build-passing-brightgreen)
![Version](https://img.shields.io/badge/version-5.0.2-blue)
![Version](https://img.shields.io/badge/version-5.0.3-blue)
![License](https://img.shields.io/badge/license-MIT-green)

**MinimalChat** is an open-source LLM chat web app designed to be as self-contained as possible. All conversations are stored locally on the client's device; the only data sent to a server are the API calls to GPT or Claude (via a CORS proxy) when the user sends a message, or saves a conversation to generate its title.
@@ -22,8 +22,9 @@

To use MinimalGPT with the various language models, you'll need to obtain API keys from their respective providers:

- **OpenAI (GPT-3, GPT-4)**: Sign up for an API key at [OpenAI's website](https://beta.openai.com/signup/).
- **OpenAI (GPT-3, GPT-4)**: Sign up for an API key at [OpenAI website](https://beta.openai.com/signup/).
- **Anthropic Claude-3**: Request access to the Claude API by filling out the form on [Anthropic's website](https://www.anthropic.com/product).
- **Hugging Face**: Sign up for an API key at [Hugging Face website](https://huggingface.co/docs/api-inference/en/quicktour#get-your-api-token).

Once you have your API keys, input them in the app's settings to start using the corresponding language models.

@@ -66,7 +67,7 @@ On Android the process is basically the same except the name of the option is **
A: Yes, MinimalGPT is open-source and free to use. However, you'll need to provide your own API keys for the language models you want to use.

**Q: Can I use MinimalGPT without an internet connection?**
A: No, MinimalGPT requires an internet connection to communicate with the language model APIs.
A: Yes! If you use [LM Studio](https://lmstudio.ai/) to host an LLM locally, you can connect to and chat with any model supported by [LM Studio](https://lmstudio.ai/).

**Q: Are my conversations secure and private?**
A: Yes, all conversations are stored locally on your device and are not sent to any servers other than the necessary API calls to the language models.
@@ -86,11 +87,13 @@ A: Yes, MinimalGPT is designed to be responsive and works well on mobile devices. Y
- **Claude 3 Sonnet**
- **Claude 3 Haiku**
- **Claude Vision** activated by selecting the **Claude** model and starting a message with **vision::** followed by your prompt
- **Hugging Face Inference Endpoint**
- **Max Tokens** - Hugging Face models and their context windows can vary greatly. Use this setting to adjust the maximum number of tokens that can be generated in a response.
- **Local LLM Model (Via [LM Studio](https://lmstudio.ai/))** users configure the current model name and the [LM Studio](https://lmstudio.ai/) API endpoint URL in the settings panel.
- **Local Model Name**: The name of the model you are hosting locally
- **Example**: [This DeepSeek Coder Model](https://huggingface.co/LoneStriker/deepseek-coder-7b-instruct-v1.5-GGUF) has a model name of `LoneStriker/deepseek-coder-7b-instruct-v1.5-GGUF`. That is what should be entered into the **Local Model Name** field. This is also displayed directly in **[LM Studio](https://lmstudio.ai/)** for the user.
- **Local URL**: The API endpoint URL that **[LM Studio](https://lmstudio.ai/)** is running on
- **Example**: `http://192.168.0.45:1234`
- Switch models mid-conversation and maintain context
- Swipe Gestures for quick settings and conversations access
- Markdown Support
@@ -143,6 +146,7 @@ MinimalGPT is made possible thanks to the following libraries, frameworks, and r
- **[OpenAI API](https://openai.com/)**
- **[Anthropic Claude API](https://www.anthropic.com/)**
- **[LM Studio](https://lmstudio.ai/)**
- **[Hugging Face](https://huggingface.co/)**

## License

@@ -165,5 +169,3 @@ Also `npm run build` will output a dist folder with minified files etc...`npm ru
### Building/Bundling (WIP)

- Running `npm run build` will perform a dist build process that includes minification and cache busting (sort of) and output to the `dist` folder.
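The Hugging Face and LM Studio settings described above (temperature slider, max tokens) map onto an OpenAI-compatible `/v1/chat/completions` request. A minimal sketch of that payload for illustration; the key and the final `fetch` target are placeholders, not values from this repo:

```javascript
// Sketch of the OpenAI-compatible chat-completions payload that both a
// Hugging Face TGI endpoint and LM Studio accept. The API key is a
// placeholder for illustration.
const apiKey = "hf_xxx"; // placeholder token

function buildChatRequest(messages, attitude, maxTokens) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "tgi",                 // the model id the app sends for HF endpoints
      stream: true,                 // ask for a server-sent token stream
      messages,
      temperature: attitude * 0.01, // 0-100 slider mapped to 0.0-1.0
      max_tokens: maxTokens,
    }),
  };
}

const req = buildChatRequest([{ role: "user", content: "Hi" }], 50, 256);
console.log(JSON.parse(req.body).temperature); // 0.5
// The call itself would then be:
// fetch(`${endpoint}/v1/chat/completions`, req)
```

Mapping the 0-100 slider onto a 0.0-1.0 temperature keeps the UI simple while matching the range most OpenAI-compatible servers expect.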


4 changes: 4 additions & 0 deletions src/components/chat-header.vue
@@ -55,6 +55,10 @@ function onShowConversationsClick() {
href="https://github.com/fingerthief/minimal-gpt#try-minimalgpt" target="_blank" class="no-style-link">
MinimalLocal
</a>
<a v-show="props.selectedModel.includes('tgi')" href="https://github.com/fingerthief/minimal-gpt#try-minimalgpt"
target="_blank" class="no-style-link">
MinimalHugging
</a>
<a href="https://github.com/fingerthief/minimal-gpt#try-minimalgpt" target="_blank" class="no-style-link">
<Github :size="20" :stroke-width="2.5" class="header-icon" />
</a>
37 changes: 35 additions & 2 deletions src/components/settings-dialog.vue
@@ -7,25 +7,33 @@ const props = defineProps({
selectedModel: String,
localModelName: String,
localModelEndpoint: String,
huggingFaceEndpoint: String,
localSliderValue: Number,
gptKey: String,
hfKey: String,
sliderValue: Number,
claudeKey: String,
claudeSliderValue: Number,
hfSliderValue: Number,
selectedDallEImageCount: Number,
selectedDallEImageResolution: String,
selectedAutoSaveOption: String
selectedAutoSaveOption: String,
maxTokens: Number
});

const emit = defineEmits([
'update:maxTokens',
'update:model',
'update:localModelName',
'update:localModelEndpoint',
'update:localSliderValue',
'update:huggingFaceEndpoint',
'update:gptKey',
'update:hfKey',
'update:sliderValue',
'update:claudeKey',
'update:claudeSliderValue',
'update:hfSliderValue',
'update:selectedDallEImageCount',
'update:selectedDallEImageResolution',
'update:selectedAutoSaveOption',
@@ -52,7 +60,7 @@ function toggleSidebar() {
<span @click="reloadPage">
<RefreshCcw :size="23" :stroke-width="2" />
</span>
Settings | V5.0.2
Settings | V5.0.3
</h2>
</div>
<div class="sidebar-content-container">
@@ -66,6 +74,7 @@ function toggleSidebar() {
<option value="claude-3-opus-20240229">Claude 3 Opus</option>
<option value="claude-3-sonnet-20240229">Claude 3 Sonnet</option>
<option value="claude-3-haiku-20240307">Claude 3 Haiku</option>
<option value="tgi">Hugging Face</option>
<option value="lmstudio">Local Model (LM Studio) </option>
</select>
</div>
@@ -111,6 +120,30 @@ function toggleSidebar() {
@blur="update('claudeSliderValue', $event.target.value)">
<span>Creative</span>
</div>
<!-- Hugging Face Endpoint -->
<div class="api-key">
<label for="hf-endpoint">Hugging Face URL:</label>
<input id="hf-endpoint" :value="huggingFaceEndpoint"
@blur="update('huggingFaceEndpoint', $event.target.value)">
</div>
<!-- Hugging Face Key -->
<div class="api-key">
<label for="hf-key">Hugging Face Key:</label>
<input id="hf-key" :value="hfKey" @blur="update('hfKey', $event.target.value)">
</div>
<!-- Hugging Face max tokens param -->
<div class="api-key">
<label for="hf-max-tokens">Hugging Face max tokens:</label>
<input id="hf-max-tokens" :value="maxTokens" @blur="update('maxTokens', $event.target.value)">
</div>
<!-- Hugging Face Slider Value -->
<div class="slider-container">
<span>Serious</span>
<input type="range" min="0" max="100" :value="hfSliderValue"
@blur="update('hfSliderValue', $event.target.value)">
<span>Creative</span>
</div>

<!-- DALL-E Image Count -->
<div class="control select-dropdown">
<span>DALL-E Image Count: </span>
155 changes: 155 additions & 0 deletions src/libs/hugging-face-api-access.js
@@ -0,0 +1,155 @@
/* eslint-disable no-unused-vars */
import { showToast, sleep } from "./utils";

let hfStreamRetryCount = 0;
export async function fetchHuggingFaceModelResponseStream(conversation, attitude, model, huggingFaceEndpoint, updateUiFunction, apiKey, maxTokens) {
const gptMessagesOnly = filterMessages(conversation);

const requestOptions = {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": `Bearer ${apiKey}`,
},
body: JSON.stringify({
model: model,
stream: true,
messages: gptMessagesOnly,
temperature: attitude * 0.01,
max_tokens: parseInt(maxTokens)
}),
};

try {
const response = await fetch(`${huggingFaceEndpoint}/v1/chat/completions`, requestOptions);

const result = await readResponseStream(response, updateUiFunction);

hfStreamRetryCount = 0;
return result;
} catch (error) {
console.error("Error fetching Hugging Face Model response:", error);
hfStreamRetryCount++;

if (hfStreamRetryCount < 3) {
await sleep(1500);
return fetchHuggingFaceModelResponseStream(conversation, attitude, model, huggingFaceEndpoint, updateUiFunction, apiKey, maxTokens);
}

return "Error fetching response from Hugging Face Model";

}
}


let retryCount = 0;
export async function getConversationTitleFromHuggingFaceModel(messages, model, sliderValue, HuggingFaceModelEndpoint) {
try {
const apiKey = localStorage.getItem("hfKey");

let tempMessages = messages.slice(0);
tempMessages.push({ role: 'user', content: "Summarize my initial request or greeting in 5 words or less." });

const requestOptions = {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": `Bearer ${apiKey}`,
},
},
body: JSON.stringify({
model: model,
stream: true,
messages: tempMessages,
temperature: sliderValue * 0.01,
max_tokens: 500
}),
};

const response = await fetch(`${HuggingFaceModelEndpoint}/v1/chat/completions`, requestOptions);

const result = await readResponseStream(response);

retryCount = 0;
return result;
} catch (error) {

if (retryCount < 5) {
retryCount++;
return getConversationTitleFromHuggingFaceModel(messages, model, sliderValue, HuggingFaceModelEndpoint);
}

console.error("Error fetching Hugging Face Model response:", error);
return "An error occurred while generating the conversation title.";
}
}

async function readResponseStream(response, updateUiFunction) {
let decodedResult = "";

const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
while (true) {
const { done, value } = await reader.read();
if (done) {
return decodedResult;
}
const chunk = decoder.decode(value);
const parsedLines = parseHuggingFaceResponseChunk(chunk);
for (const parsedLine of parsedLines) {
const { choices } = parsedLine;
const { delta } = choices[0];
const { content } = delta;
if (content) {
decodedResult += content;

if (updateUiFunction) {
updateUiFunction(content);
}
}
}
}
}
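The reader loop above can be exercised without a network call. This standalone sketch (using Node 18+ globals; nothing here comes from the repo) shows the same `getReader()`/`TextDecoder` pattern against a synthetic stream:

```javascript
// Minimal sketch of reading a streaming body with getReader()/TextDecoder,
// mirroring readResponseStream above, but fed by a synthetic ReadableStream.
async function readAll(stream) {
  const reader = stream.getReader();
  const decoder = new TextDecoder("utf-8");
  let out = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) return out;
    // { stream: true } keeps multi-byte UTF-8 sequences that straddle
    // chunk boundaries intact.
    out += decoder.decode(value, { stream: true });
  }
}

const synthetic = new ReadableStream({
  start(controller) {
    controller.enqueue(new TextEncoder().encode("Hel"));
    controller.enqueue(new TextEncoder().encode("lo"));
    controller.close();
  },
});

readAll(synthetic).then(text => console.log(text)); // "Hello"
```

Passing `{ stream: true }` to `decode` is a small hardening over the per-chunk decode above, which can split a multi-byte character that arrives across two chunks.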

let buffer = ""; // Buffer to hold incomplete JSON data across chunks
function parseHuggingFaceResponseChunk(chunk) {
buffer += chunk; // Append new chunk to buffer
const lines = buffer.split("\n");

const completeLines = lines.slice(0, -1); // All lines except the last one
buffer = lines[lines.length - 1]; // Last line might be incomplete, keep it in buffer

const results = [];
for (const line of completeLines) {
let cleanedLine = line.trim();

// Check if the line contains the control message [DONE] and remove it
if (cleanedLine.includes("[DONE]")) {
cleanedLine = cleanedLine.replace("[DONE]", "").trim();
}

// Remove any "data: " prefix that might be present after cleaning
// Using regex to handle any case variations and extra spaces
cleanedLine = cleanedLine.replace(/^data:\s*/i, "").trim();

if (cleanedLine !== "") {
try {
const parsed = JSON.parse(cleanedLine);
results.push(parsed);
} catch (error) {
console.error("Failed to parse JSON:", cleanedLine, error);
}
}
}
return results;
}

function filterMessages(conversation) {
let lastMessageContent = "";
return conversation.filter(message => {
const isGPT = !message.content.trim().toLowerCase().startsWith("image::") &&
!lastMessageContent.startsWith("image::");
lastMessageContent = message.content.trim().toLowerCase();
return isGPT;
});
}
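The buffering in `parseHuggingFaceResponseChunk` exists because a network chunk can end in the middle of a JSON line. This standalone sketch reproduces that behavior so it can be tested without a live endpoint (names here are illustrative, not exports of the module above):

```javascript
// Standalone sketch of the buffered SSE-line parsing above. A chunk may end
// mid-line, so the trailing fragment is kept in `sseBuffer` until the next
// chunk completes it.
let sseBuffer = "";

function parseChunk(chunk) {
  sseBuffer += chunk;
  const lines = sseBuffer.split("\n");
  sseBuffer = lines.pop(); // possibly incomplete last line stays buffered
  const results = [];
  for (const line of lines) {
    // Drop the [DONE] control message and any "data: " prefix.
    const cleaned = line.replace("[DONE]", "").replace(/^data:\s*/i, "").trim();
    if (cleaned === "") continue;
    try {
      results.push(JSON.parse(cleaned));
    } catch {
      // ignore malformed fragments
    }
  }
  return results;
}

// A JSON line split across two chunks parses only once it is complete.
const first = parseChunk('data: {"choices":[{"delta":{"content":"Hel');
const second = parseChunk('lo"}}]}\n');
console.log(first.length);                       // 0 (still buffered)
console.log(second[0].choices[0].delta.content); // "Hello"
```

Keeping the last `split("\n")` element in the buffer rather than parsing it is what makes the parser safe against arbitrary chunk boundaries.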
2 changes: 1 addition & 1 deletion src/libs/utils.js
@@ -52,7 +52,7 @@ export async function getConversationTitleFromGPT(messages, model, sliderValue)

if (retryCount < 5) {
retryCount++;
self.getConversationTitleFromGPT(messages, model, sliderValue);
getConversationTitleFromGPT(messages, model, sliderValue);
}

console.error("Error fetching GPT response:", error);