feat: add tts & stt #5329

Closed
wants to merge 9 commits into from

Conversation


@DDMeaqua DDMeaqua commented Aug 27, 2024

💻 Change Type

  • feat
  • fix
  • refactor
  • perf
  • style
  • test
  • docs
  • ci
  • chore
  • build

🔀 Description of Change

image

📝 Additional Information

image

Summary by CodeRabbit

  • New Features

    • Introduced speech synthesis and transcription functionalities across various APIs.
    • Enhanced chat interface with voice command capabilities for improved interactivity.
    • Added support for text-to-speech (TTS) and speech-to-text (STT) configurations in localization files.
  • Bug Fixes

    • Resolved issues related to the initialization and handling of speech APIs.
  • Documentation

    • Updated localization files to include new keys for speech functionalities in English and Chinese.

@DDMeaqua DDMeaqua requested a review from Dogtiti August 27, 2024 11:57

vercel bot commented Aug 27, 2024

@DDMeaqua is attempting to deploy a commit to the NextChat Team on Vercel.

A member of the Team first needs to authorize it.


coderabbitai bot commented Aug 27, 2024

Walkthrough

The changes introduce extensive enhancements to the application, focusing on adding speech synthesis and transcription functionalities. New constants, interfaces, and methods are implemented across various files, enabling structured handling of speech options and configurations. Additionally, components for managing text-to-speech and speech-to-text settings are introduced, enhancing the overall capabilities of the application.

Changes

| File | Change summary |
| --- | --- |
| app/client/api.ts | Introduced constants and interfaces for speech options, added abstract methods for speech and transcription in LLMApi. |
| app/components/chat.tsx | Enhanced ChatActions component with speech recognition and text-to-speech functionalities, including state management for listening and transcription. |
| app/locales/en.ts | Updated English locale configuration with new string entries for TTS and STT functionalities, including options for enabling features and selecting voices. |
| app/locales/cn.ts | Added localization strings for TTS and STT features in Chinese, including configuration options for enabling services and selecting voices. |
| app/client/platforms/openai.ts | Added speech and transcription methods to ChatGPTApi class for handling audio processing and transcription requests. |
| app/constant.ts | Introduced constants for TTS and STT paths and default configurations for engines, models, and voices. |
| app/store/config.ts | Added types and validators for TTS and STT configurations, expanding the application's configuration capabilities. |
| app/layout.tsx | Reformatted <link> element in RootLayout for improved readability. |
| app/locales/index.ts | Introduced mapping for STT languages and a function to retrieve appropriate STT language settings. |
| package.json | Added a new dependency for converting Markdown files to plain text. |
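
As a rough illustration of the pattern the summary describes (abstract speech and transcription methods on LLMApi, with placeholder implementations in providers that do not support audio yet), a minimal sketch might look like the following. The option fields, return types, and the PlaceholderApi class are assumptions made for this example; the actual definitions in app/client/api.ts are not shown in this conversation.

```typescript
// Hypothetical option shapes; the real interfaces may carry different fields.
interface SpeechOptions {
  model: string;
  input: string;
  voice: string;
}

interface TranscriptionOptions {
  model?: string;
  file: ArrayBuffer;
}

// Abstract base: every provider client has to declare both audio methods.
abstract class LLMApi {
  abstract speech(options: SpeechOptions): Promise<ArrayBuffer>;
  abstract transcription(options: TranscriptionOptions): Promise<string>;
}

// Placeholder provider (hypothetical name): satisfies the abstract contract
// but fails loudly if either method is actually called.
class PlaceholderApi extends LLMApi {
  speech(_options: SpeechOptions): Promise<ArrayBuffer> {
    throw new Error("Method not implemented.");
  }

  transcription(_options: TranscriptionOptions): Promise<string> {
    throw new Error("Method not implemented.");
  }
}
```

With this shape, adding a new abstract method forces every provider file to be touched, which would be consistent with the one-hunk changes listed for the non-OpenAI platform files.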

Possibly related PRs

  • fix: safaLocalStorage #5386: This PR modifies the app/components/chat.tsx file, which is also altered in the main PR. Although the focus is on local storage management, both PRs involve changes to the chat component, indicating a potential overlap in functionality related to user input handling.
  • feat: add shortcut key #5396: This PR introduces significant changes to the app/components/chat.tsx file, adding keyboard shortcuts and enhancing user interaction, which aligns with the main PR's focus on expanding functionalities in the chat component.
  • feat: summarize model customization #5426: This PR also modifies app/components/chat.tsx, adding a reload button and enhancing chat session management, which relates to the overall improvements in user interaction and functionality introduced in the main PR.

Suggested reviewers

  • Dogtiti
  • lloydzhou
  • skymkmk

Poem

🐇✨
In a world where voices sing,
A rabbit hops, a joyful thing.
With speech and text, we now can play,
Hopping through changes, bright as day!
So let us cheer for TTS and STT,
A joyful leap, just wait and see! 🐰🎶




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b6bb167 and f86b220.

Files ignored due to path filters (4)
  • app/icons/speak-stop.svg is excluded by !**/*.svg
  • app/icons/speak.svg is excluded by !**/*.svg
  • app/icons/voice-white.svg is excluded by !**/*.svg
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
Files selected for processing (26)
  • app/client/api.ts (3 hunks)
  • app/client/platforms/alibaba.ts (1 hunks)
  • app/client/platforms/anthropic.ts (1 hunks)
  • app/client/platforms/baidu.ts (1 hunks)
  • app/client/platforms/bytedance.ts (1 hunks)
  • app/client/platforms/google.ts (1 hunks)
  • app/client/platforms/iflytek.ts (1 hunks)
  • app/client/platforms/moonshot.ts (1 hunks)
  • app/client/platforms/openai.ts (3 hunks)
  • app/client/platforms/tencent.ts (1 hunks)
  • app/components/chat.tsx (11 hunks)
  • app/components/settings.tsx (2 hunks)
  • app/components/stt-config.tsx (1 hunks)
  • app/components/stt.module.scss (1 hunks)
  • app/components/tts-config.tsx (1 hunks)
  • app/components/tts.module.scss (1 hunks)
  • app/constant.ts (2 hunks)
  • app/locales/cn.ts (3 hunks)
  • app/locales/en.ts (1 hunks)
  • app/locales/index.ts (1 hunks)
  • app/store/access.ts (2 hunks)
  • app/store/config.ts (3 hunks)
  • app/utils/audio.ts (1 hunks)
  • app/utils/ms_edge_tts.ts (1 hunks)
  • app/utils/speech.ts (1 hunks)
  • package.json (2 hunks)
Additional context used
Biome
app/utils/audio.ts

[error] 3-3: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)


[error] 16-16: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)

app/components/stt-config.tsx

[error] 23-23: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 35-37: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)

app/components/tts-config.tsx

[error] 28-28: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 53-55: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 74-76: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 96-98: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 121-123: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)

Gitleaks
app/utils/ms_edge_tts.ts

121-121: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

Additional comments not posted (71)
app/utils/audio.ts (2)

11-14: LGTM!

The function is correctly implemented.

The code changes are approved.


32-41: LGTM!

The function is correctly implemented.

The code changes are approved.

package.json (1)

33-33: LGTM!

The new dependency is correctly added.

The code changes are approved.

app/components/stt.module.scss (4)

2-5: LGTM!

The styles for the .plugin-page class are correctly implemented.

The code changes are approved.


7-9: LGTM!

The styles for the .plugin-page-body class are correctly implemented.

The code changes are approved.


11-17: LGTM!

The styles for the .plugin-filter class are correctly implemented.

The code changes are approved.


44-116: LGTM!

The styles for the .plugin-item class are correctly implemented.

The code changes are approved.

app/components/tts.module.scss (4)

2-5: LGTM!

The styles for the .plugin-page class are correctly implemented.

The code changes are approved.


7-9: LGTM!

The styles for the .plugin-page-body class are correctly implemented.

The code changes are approved.


11-17: LGTM!

The styles for the .plugin-filter class are correctly implemented.

The code changes are approved.


44-116: LGTM!

The styles for the .plugin-item class are correctly implemented.

The code changes are approved.

app/locales/index.ts (2)

141-162: LGTM!

The constants DEFAULT_STT_LANG and STT_LANG_MAP are correctly defined and enhance the application's localization capabilities.

The code changes are approved.


164-170: LGTM!

The getSTTLang function is correctly implemented with robust error handling.

The code changes are approved.

app/utils/speech.ts (1)

7-17: LGTM!

The SpeechApi abstract class is well-structured and provides a clear interface for transcription APIs.

The code changes are approved.

app/store/config.ts (9)

22-22: LGTM!

The type definition for TTSModelType is correct and improves type safety.

The code changes are approved.


23-23: LGTM!

The type definition for TTSVoiceType is correct and improves type safety.

The code changes are approved.


24-24: LGTM!

The type definition for TTSEngineType is correct and improves type safety.

The code changes are approved.


26-26: LGTM!

The type definition for STTEngineType is correct and improves type safety.

The code changes are approved.


100-100: LGTM!

The type definition for TTSConfig is correct and improves type safety.

The code changes are approved.


101-101: LGTM!

The type definition for STTConfig is correct and improves type safety.

The code changes are approved.


83-94: LGTM!

The updates to DEFAULT_CONFIG are correct and enhance the application's capabilities.

The code changes are approved.


116-129: LGTM!

The TTSConfigValidator is well-structured and ensures correct configurations.

The code changes are approved.


131-135: LGTM!

The STTConfigValidator is well-structured and ensures correct configurations.

The code changes are approved.

app/store/access.ts (2)

123-125: LGTM!

The addition of the edgeTTSVoiceName property to DEFAULT_ACCESS_STATE is correct and aligns with the new TTS functionality.

The code changes are approved.


138-142: LGTM! But verify the fetch method implementation.

The edgeVoiceName method is correctly implemented. However, ensure that the fetch method is correctly defined and does not introduce any side effects.

The code changes are approved.

Run the following script to verify the fetch method implementation:

Verification successful

The fetch method is correctly implemented without side effects.

The fetch method in app/store/access.ts manages its state using fetchState and logs errors appropriately. It does not introduce any side effects that would impact the edgeVoiceName method. The implementation is sound and aligns with the intended functionality.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the `fetch` method implementation.

# Test: Search for the `fetch` method. Expect: Correct implementation without side effects.
rg --type python -A 5 $'fetch()'

Length of output: 64


Script:

#!/bin/bash
# Description: Verify the `fetch` method implementation in TypeScript files.

# Test: Search for the `fetch` method in TypeScript files. Expect: Correct implementation without side effects.
rg --type ts -A 5 'fetch()'

Length of output: 46404

app/client/platforms/bytedance.ts (1)

80-85: LGTM! But implement the methods.

The speech and transcription methods are correctly added as placeholders. However, remember to implement these methods in the future.

The code changes are approved.

app/client/platforms/iflytek.ts (1)

56-61: LGTM! But implement the methods.

The speech and transcription methods are correctly added as placeholders. However, remember to implement these methods in the future.

The code changes are approved.

app/client/platforms/alibaba.ts (2)

86-88: LGTM!

The speech method is correctly added as a placeholder for future implementation.

The code changes are approved.


89-91: LGTM!

The transcription method is correctly added as a placeholder for future implementation.

The code changes are approved.

app/client/platforms/moonshot.ts (2)

69-71: LGTM!

The speech method is correctly added as a placeholder for future implementation.

The code changes are approved.


72-74: LGTM!

The transcription method is correctly added as a placeholder for future implementation.

The code changes are approved.

app/client/platforms/tencent.ts (2)

92-94: LGTM!

The speech method is correctly added as a placeholder for future implementation.

The code changes are approved.


95-97: LGTM!

The transcription method is correctly added as a placeholder for future implementation.

The code changes are approved.

app/client/platforms/baidu.ts (2)

78-79: LGTM!

The speech method is correctly added as a placeholder for future implementation.

The code changes are approved.


81-82: LGTM!

The transcription method is correctly added as a placeholder for future implementation.

The code changes are approved.

app/client/api.ts (5)

23-23: LGTM!

The TTSModels constant is correctly defined and provides identifiers for text-to-speech models.

The code changes are approved.


52-59: LGTM!

The SpeechOptions interface is correctly defined and provides structured configurations for handling speech synthesis requests.

The code changes are approved.


61-69: LGTM!

The TranscriptionOptions interface is correctly defined and provides structured configurations for handling audio transcription requests.

The code changes are approved.


103-103: LGTM!

The speech method is correctly added as an abstract method to the LLMApi class.

The code changes are approved.


104-104: LGTM!

The transcription method is correctly added as an abstract method to the LLMApi class.

The code changes are approved.

app/client/platforms/google.ts (2)

59-60: LGTM!

The speech method is correctly added as a placeholder for future implementation.

The code changes are approved.


62-63: LGTM!

The transcription method is correctly added as a placeholder for future implementation.

The code changes are approved.

app/client/platforms/anthropic.ts (2)

76-78: LGTM!

The speech method is correctly defined as a placeholder for future implementation.

The code changes are approved.


79-81: LGTM!

The transcription method is correctly defined as a placeholder for future implementation.

The code changes are approved.

app/constant.ts (2)

156-157: LGTM!

The new paths SpeechPath and TranscriptionPath are correctly added to OpenaiPath.

The code changes are approved.


261-277: LGTM!

The new constants related to TTS and STT functionalities are correctly defined and consistent with existing constants.

The code changes are approved.

app/locales/cn.ts (2)

46-47: LGTM!

The new keys Speech and StopSpeech are correctly added to the Actions section.

The code changes are approved.


488-517: LGTM!

The new sections TTS and STT with multiple sub-keys are correctly defined and consistent with existing sections.

The code changes are approved.

app/client/platforms/openai.ts (3)

82-82: LGTM!

The function is correctly implemented and the new parameter is used appropriately.

The code changes are approved.


183-222: Improve error handling.

The function is correctly implemented, but the error handling can be improved by providing more context in the error message.

Apply this diff to improve error handling:

-  console.log("[Request] failed to make a audio transcriptions request", e);
+  console.error("[Request] failed to make an audio transcription request", e);

Likely invalid or redundant comment.


145-181: Improve error handling.

The function is correctly implemented, but the error handling can be improved by providing more context in the error message.

Apply this diff to improve error handling:

-  console.log("[Request] failed to make a speech request", e);
+  console.error("[Request] failed to make a speech request", e);

Likely invalid or redundant comment.

app/utils/ms_edge_tts.ts (9)

11-19: LGTM!

The enum is correctly implemented.

The code changes are approved.


24-31: LGTM!

The enum is correctly implemented.

The code changes are approved.


36-43: LGTM!

The enum is correctly implemented.

The code changes are approved.


48-86: LGTM!

The enum is correctly implemented.

The code changes are approved.


88-96: LGTM!

The type is correctly implemented.

The code changes are approved.


98-117: LGTM!

The class is correctly implemented.

The code changes are approved.


150-159: LGTM!

The function is correctly implemented.

The code changes are approved.


161-224: LGTM!

The function is correctly implemented.

The code changes are approved.


226-233: LGTM!

The function is correctly implemented.

The code changes are approved.

app/locales/en.ts (1)

48-49: LGTM!

The entries for Speech and StopSpeech are correctly added.

The code changes are approved.

app/components/settings.tsx (2)

83-84: LGTM!

The import statements for TTSConfigList and STTConfigList are correctly added.

The code changes are approved.


1651-1672: LGTM!

The integration of TTSConfigList and STTConfigList within the Settings function is correctly implemented. The updateConfig function is appropriately used to update the respective configurations.

The code changes are approved.

app/components/chat.tsx (8)

13-13: LGTM!

The import statement for VoiceWhiteIcon is correct.

The code changes are approved.


19-20: LGTM!

The import statements for SpeakIcon and SpeakStopIcon are correct.

The code changes are approved.


80-80: LGTM!

The import statement for Locale, getLang, and getSTTLang is correct.

The code changes are approved.


97-99: LGTM!

The import statement for DEFAULT_STT_ENGINE, DEFAULT_TTS_ENGINE, and FIREFOX_DEFAULT_STT_ENGINE is correct.

The code changes are approved.


117-124: LGTM!

The import statements for ClientApi, createTTSPlayer, OpenAITranscriptionApi, SpeechApi, WebTranscriptionApi, MsEdgeTTS, and OUTPUT_FORMAT are correct.

The code changes are approved.


126-126: LGTM!

The constant ttsPlayer is correctly initialized.

The code changes are approved.


456-456: LGTM!

The new prop setUserInput is correctly added.

The code changes are approved.


548-584: Verify the dependency array in useEffect.

The state variables and functions are correctly implemented. However, the dependency array in the useEffect hook is empty. This might cause the effect to not re-run when config.sttConfig.engine changes.

Consider adding config.sttConfig.engine to the dependency array:

 useEffect(() => {
   if (isFirefox()) config.sttConfig.engine = FIREFOX_DEFAULT_STT_ENGINE;
   setSpeechApi(
     config.sttConfig.engine === DEFAULT_STT_ENGINE
       ? new WebTranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         )
       : new OpenAITranscriptionApi((transcription) =>
           onRecognitionEnd(transcription),
         ),
   );
- }, []);
+ }, [config.sttConfig.engine]);
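
The snippet above also assigns to config.sttConfig.engine inside the effect. One way to keep the engine choice testable and free of mutation is to compute it in a pure function and use the result both for constructing the API and as the effect dependency. The sketch below assumes two engine identifiers and a helper name that do not come from the PR; the real constants live in app/constant.ts and are not shown here.

```typescript
// Hypothetical engine identifiers, used only for this sketch.
type SttEngine = "web-speech" | "openai-whisper";

const DEFAULT_STT_ENGINE: SttEngine = "web-speech";
const FIREFOX_DEFAULT_STT_ENGINE: SttEngine = "openai-whisper";

// Pure decision: Firefox is forced onto its fallback engine, every other
// browser keeps the configured one. The config object is never mutated.
function resolveSttEngine(configured: SttEngine, isFirefox: boolean): SttEngine {
  return isFirefox ? FIREFOX_DEFAULT_STT_ENGINE : configured;
}

// Names the transcription backend the effect would construct for an engine.
function describeBackend(
  engine: SttEngine,
): "WebTranscriptionApi" | "OpenAITranscriptionApi" {
  return engine === DEFAULT_STT_ENGINE
    ? "WebTranscriptionApi"
    : "OpenAITranscriptionApi";
}
```

In the component, the resolved value could then serve as the single dependency of the effect, so the speech API would be rebuilt whenever the effective engine changes.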

Comment on lines +22 to +29
const buffer = await audioContext!.decodeAudioData(audioBuffer);
audioBufferSourceNode = audioContext!.createBufferSource();
audioBufferSourceNode.buffer = buffer;
audioBufferSourceNode.connect(audioContext!.destination);
audioContext!.resume().then(() => {
  audioBufferSourceNode!.start();
});
audioBufferSourceNode.onended = onended;

Add null checks for audioContext and audioBufferSourceNode.

Ensure that audioContext and audioBufferSourceNode are not null before using them to avoid potential runtime errors.

Apply this diff to fix the issue:

-    const buffer = await audioContext!.decodeAudioData(audioBuffer);
-    audioBufferSourceNode = audioContext!.createBufferSource();
+    if (!audioContext) throw new Error('AudioContext is not initialized');
+    const buffer = await audioContext.decodeAudioData(audioBuffer);
+    audioBufferSourceNode = audioContext.createBufferSource();

Committable suggestion was skipped due to low confidence.

Comment on lines +26 to +28
audioContext!.resume().then(() => {
  audioBufferSourceNode!.start();
});

Add error handling for Web Audio API operations.

Add error handling for audioContext.resume() and audioBufferSourceNode.start() to handle potential errors gracefully.

Apply this diff to fix the issue:

-    audioContext!.resume().then(() => {
-      audioBufferSourceNode!.start();
-    });
+    try {
+      await audioContext.resume();
+      audioBufferSourceNode.start();
+    } catch (error) {
+      console.error('Error during audio playback:', error);
+    }

Committable suggestion was skipped due to low confidence.
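
Taken together, the two suggestions on play (an explicit guard instead of non-null assertions, and awaiting resume() inside try/catch) could be combined roughly as follows. This is a sketch under stated assumptions: ContextLike and SourceLike are hypothetical stand-ins that mirror only the Web Audio members used in the reviewed snippet, so the example runs outside a browser, and playBuffer with its boolean result is not part of the PR.

```typescript
// Hypothetical structural stand-ins for the Web Audio objects used above.
interface SourceLike {
  buffer: unknown;
  onended: (() => void) | null;
  connect(destination: unknown): void;
  start(): void;
}

interface ContextLike {
  destination: unknown;
  decodeAudioData(data: ArrayBuffer): Promise<unknown>;
  createBufferSource(): SourceLike;
  resume(): Promise<void>;
}

// Resolves to true if playback started and false if decoding or starting
// failed. A missing context is a programming error, so it rejects instead.
async function playBuffer(
  ctx: ContextLike | null,
  data: ArrayBuffer,
  onended: (() => void) | null,
): Promise<boolean> {
  if (!ctx) throw new Error("AudioContext is not initialized");
  try {
    const buffer = await ctx.decodeAudioData(data);
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.onended = onended;
    source.connect(ctx.destination);
    await ctx.resume();
    source.start();
    return true;
  } catch (error) {
    console.error("Error during audio playback:", error);
    return false;
  }
}
```

Keeping the source node in a local constant removes the need for the non-null assertion on audioBufferSourceNode that the original then() callback relied on.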

@@ -0,0 +1,45 @@
type TTSPlayer = {
init: () => void;
play: (audioBuffer: ArrayBuffer, onended: () => void | null) => Promise<void>;

Replace null with undefined for onended.

Using undefined instead of null for onended avoids confusion and aligns with best practices.

Apply this diff to fix the issue:

-  play: (audioBuffer: ArrayBuffer, onended: () => void | null) => Promise<void>;
+  play: (audioBuffer: ArrayBuffer, onended: () => void | undefined) => Promise<void>;
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
play: (audioBuffer: ArrayBuffer, onended: () => void | null) => Promise<void>;
play: (audioBuffer: ArrayBuffer, onended: () => void | undefined) => Promise<void>;
Tools
Biome

[error] 3-3: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)

audioContext.suspend();
};

const play = async (audioBuffer: ArrayBuffer, onended: () => void | null) => {

Replace null with undefined for onended.

Using undefined instead of null for onended avoids confusion and aligns with best practices.

Apply this diff to fix the issue:

-  const play = async (audioBuffer: ArrayBuffer, onended: () => void | null) => {
+  const play = async (audioBuffer: ArrayBuffer, onended: () => void | undefined) => {
Committable suggestion

Suggested change
const play = async (audioBuffer: ArrayBuffer, onended: () => void | null) => {
const play = async (audioBuffer: ArrayBuffer, onended: () => void | undefined) => {
Tools
Biome

[error] 16-16: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)
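
A note on both onended comments: in TypeScript, `() => void | null` parses as a function whose return type is `void | null`, and the suggested `() => void | undefined` still has `void` inside a union, so it would likely keep triggering the same Biome rule. If the intent is "a callback, or nothing", parenthesizing the function type is one option. The sketch below is illustrative only; the type and function names are assumptions and do not appear in the PR.

```typescript
// Parses as: a function whose return type is `void | null`.
type ReturnsVoidOrNull = () => void | null;

// Parenthesized: either a callback or null. `void` is no longer in a union.
type NullableCallback = (() => void) | null;

// A value of the first type is always a function; it is never null itself.
const returnsNull: ReturnsVoidOrNull = () => null;

// Hypothetical consumer that treats a missing callback as a no-op.
function finish(onended: NullableCallback): string {
  if (onended === null) return "no callback";
  onended();
  return "callback invoked";
}
```

With the parenthesized form, callers can pass null explicitly and the implementation has to handle that case before invoking the callback.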

Comment on lines +35 to +37
(config.engine = STTConfigValidator.engine(
e.currentTarget.value,
)),

Avoid assignments in expressions.

The use of assignments in expressions is confusing. Move the assignment out of the expression to improve readability.

Apply this diff to fix the issue:

-                  (config.engine = STTConfigValidator.engine(
-                    e.currentTarget.value,
-                  )),
+                  {
+                    config.engine = STTConfigValidator.engine(
+                      e.currentTarget.value,
+                    );
+                  },
Committable suggestion

Suggested change
(config.engine = STTConfigValidator.engine(
e.currentTarget.value,
)),
{
config.engine = STTConfigValidator.engine(
e.currentTarget.value,
);
},
Tools
Biome

[error] 35-37: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)
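
For context, the lint rule is satisfied once the updater uses a block body, so the assignment becomes a statement rather than the arrow function's return value. A minimal sketch follows, assuming a simplified config shape and a stand-in updateConfig that applies the updater to a copy; the real store helper and STTConfig type are not shown in this review.

```typescript
// Hypothetical, simplified config shape.
type STTConfig = { enable: boolean; engine: string };

// Stand-in for the store's updateConfig: runs the updater against a copy
// and returns the result, leaving the input untouched.
function updateConfig(
  current: STTConfig,
  updater: (config: STTConfig) => void,
): STTConfig {
  const draft = { ...current };
  updater(draft);
  return draft;
}

const before: STTConfig = { enable: false, engine: "engine-a" };

// Block body: the assignment is a statement and nothing is returned.
const after = updateConfig(before, (config) => {
  config.engine = "engine-b";
});
```

Behaviour is unchanged as long as the caller ignores the updater's return value, which is what the `=> void` signature in this sketch expresses.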

Comment on lines +117 to +124
async start(): Promise<void> {
  this.listeningStatus = true;
  await this.recognitionInstance.start();
}

async stop(): Promise<void> {
  this.listeningStatus = false;
  await this.recognitionInstance.stop();

Improve error handling in the start and stop methods.

The start and stop methods should handle errors more gracefully. Consider adding try-catch blocks to handle potential errors.

Apply this diff to improve error handling:

 async start(): Promise<void> {
+  try {
     this.listeningStatus = true;
     await this.recognitionInstance.start();
+  } catch (error) {
+    console.error("Error starting speech recognition:", error);
+    this.listeningStatus = false;
+  }
 }

 async stop(): Promise<void> {
+  try {
     this.listeningStatus = false;
     await this.recognitionInstance.stop();
+  } catch (error) {
+    console.error("Error stopping speech recognition:", error);
+  }
 }
Committable suggestion

Suggested change
async start(): Promise<void> {
this.listeningStatus = true;
await this.recognitionInstance.start();
}
async stop(): Promise<void> {
this.listeningStatus = false;
await this.recognitionInstance.stop();
async start(): Promise<void> {
try {
this.listeningStatus = true;
await this.recognitionInstance.start();
} catch (error) {
console.error("Error starting speech recognition:", error);
this.listeningStatus = false;
}
}
async stop(): Promise<void> {
try {
this.listeningStatus = false;
await this.recognitionInstance.stop();
} catch (error) {
console.error("Error stopping speech recognition:", error);
}
}
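
A variant of the same idea sets the listening flag only after start() succeeds, so a thrown error never leaves the flag stuck at true. The browser's SpeechRecognition start() and stop() are synchronous and return nothing, so the sketch below drops the await. RecognitionLike and GuardedRecognizer are hypothetical names introduced for this example and are not part of the PR.

```typescript
// Hypothetical minimal interface covering the two calls used here.
interface RecognitionLike {
  start(): void;
  stop(): void;
}

class GuardedRecognizer {
  private listening = false;
  private readonly recognition: RecognitionLike;

  constructor(recognition: RecognitionLike) {
    this.recognition = recognition;
  }

  isListening(): boolean {
    return this.listening;
  }

  // Marks the recognizer as listening only if start() did not throw.
  start(): void {
    try {
      this.recognition.start();
      this.listening = true;
    } catch (error) {
      console.error("Error starting speech recognition:", error);
      this.listening = false;
    }
  }

  // Always ends in the "not listening" state, even if stop() throws.
  stop(): void {
    try {
      this.recognition.stop();
    } catch (error) {
      console.error("Error stopping speech recognition:", error);
    } finally {
      this.listening = false;
    }
  }
}
```

Whether errors should be swallowed and logged, as here, or surfaced to the UI is a design choice the review comment leaves open.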

checked={props.ttsConfig.enable}
onChange={(e) =>
props.updateConfig(
(config) => (config.enable = e.currentTarget.checked),

Avoid assignments in expressions.

The use of assignments in expressions should be avoided as it can be confusing. Consider refactoring the code to separate assignments from expressions.

Apply this diff to avoid assignments in expressions:

 props.updateConfig(
-  (config) => (config.enable = e.currentTarget.checked),
+  (config) => { config.enable = e.currentTarget.checked; },
 );

 props.updateConfig(
-  (config) => (config.engine = TTSConfigValidator.engine(e.currentTarget.value)),
+  (config) => { config.engine = TTSConfigValidator.engine(e.currentTarget.value); },
 );

 props.updateConfig(
-  (config) => (config.model = TTSConfigValidator.model(e.currentTarget.value)),
+  (config) => { config.model = TTSConfigValidator.model(e.currentTarget.value); },
 );

 props.updateConfig(
-  (config) => (config.voice = TTSConfigValidator.voice(e.currentTarget.value)),
+  (config) => { config.voice = TTSConfigValidator.voice(e.currentTarget.value); },
 );

 props.updateConfig(
-  (config) => (config.speed = TTSConfigValidator.speed(e.currentTarget.valueAsNumber)),
+  (config) => { config.speed = TTSConfigValidator.speed(e.currentTarget.valueAsNumber); },
 );

Also applies to: 53-55, 74-76, 96-98, 121-123

Tools
Biome

[error] 28-28: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)

Comment on lines +33 to +46
{/* <ListItem
title={Locale.Settings.TTS.Autoplay.Title}
subTitle={Locale.Settings.TTS.Autoplay.SubTitle}
>
<input
type="checkbox"
checked={props.ttsConfig.autoplay}
onChange={(e) =>
props.updateConfig(
(config) => (config.autoplay = e.currentTarget.checked),
)
}
></input>
</ListItem> */}

Remove or explain the commented-out code.

The commented-out code should be removed if it is no longer needed, or an explanation should be added to clarify its purpose.

Apply this diff to remove the commented-out code:

- {/* <ListItem
-   title={Locale.Settings.TTS.Autoplay.Title}
-   subTitle={Locale.Settings.TTS.Autoplay.SubTitle}
- >
-   <input
-     type="checkbox"
-     checked={props.ttsConfig.autoplay}
-     onChange={(e) =>
-       props.updateConfig(
-         (config) => (config.autoplay = e.currentTarget.checked),
-       )
-     }
-   ></input>
- </ListItem> */}

Comment on lines +119 to +123
export class MsEdgeTTS {
static OUTPUT_FORMAT = OUTPUT_FORMAT;
private static TRUSTED_CLIENT_TOKEN = "6A5AA1D4EAFF4E9FB37E23D68491D6F4";
private static VOICES_URL = `https://speech.platform.bing.com/consumer/speech/synthesize/readaloud/voices/list?trustedclienttoken=${MsEdgeTTS.TRUSTED_CLIENT_TOKEN}`;
private static SYNTH_URL = `wss://speech.platform.bing.com/consumer/speech/synthesize/readaloud/edge/v1?TrustedClientToken=${MsEdgeTTS.TRUSTED_CLIENT_TOKEN}`;

Potential security issue: Detected a Generic API Key.

TRUSTED_CLIENT_TOKEN is hardcoded in the class and interpolated into both VOICES_URL and SYNTH_URL, so it is visible to anyone who can read the source.

Consider loading the token from an environment variable or other configuration instead of committing the literal.

Tools
Gitleaks

121-121: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)
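
One way to act on this suggestion is sketched below. It is an illustration only: the environment variable name `EDGE_TTS_TRUSTED_CLIENT_TOKEN` and the helpers `getTrustedClientToken` and `buildSynthUrl` are hypothetical and do not appear in the PR. The base URL is copied from the SYNTH_URL constant quoted above.

```typescript
// Base endpoint taken from the SYNTH_URL constant shown in the review.
const SYNTH_BASE =
  "wss://speech.platform.bing.com/consumer/speech/synthesize/readaloud/edge/v1";

// A plain map of environment values, so the sketch does not depend on Node typings.
type Env = Record<string, string | undefined>;

// Hypothetical helper: read the token from configuration instead of a literal.
function getTrustedClientToken(env: Env): string {
  const token = env["EDGE_TTS_TRUSTED_CLIENT_TOKEN"]; // assumed variable name
  if (!token) {
    throw new Error("EDGE_TTS_TRUSTED_CLIENT_TOKEN is not set");
  }
  return token;
}

// Hypothetical helper: build the WebSocket URL at call time rather than in a static field.
function buildSynthUrl(env: Env): string {
  const token = encodeURIComponent(getTrustedClientToken(env));
  return `${SYNTH_BASE}?TrustedClientToken=${token}`;
}
```

In a Node context this could be called as `buildSynthUrl(process.env)`. Any value that ends up in a browser bundle is still readable by users, so this mainly keeps the literal out of the repository rather than making it secret.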

Comment on lines +1172 to +1216
const accessStore = useAccessStore();
const [speechStatus, setSpeechStatus] = useState(false);
const [speechLoading, setSpeechLoading] = useState(false);
async function openaiSpeech(text: string) {
if (speechStatus) {
ttsPlayer.stop();
setSpeechStatus(false);
} else {
var api: ClientApi;
api = new ClientApi(ModelProvider.GPT);
const config = useAppConfig.getState();
setSpeechLoading(true);
ttsPlayer.init();
let audioBuffer: ArrayBuffer;
const { markdownToTxt } = require("markdown-to-txt");
const textContent = markdownToTxt(text);
if (config.ttsConfig.engine !== DEFAULT_TTS_ENGINE) {
const edgeVoiceName = accessStore.edgeVoiceName();
const tts = new MsEdgeTTS();
await tts.setMetadata(
edgeVoiceName,
OUTPUT_FORMAT.AUDIO_24KHZ_96KBITRATE_MONO_MP3,
);
audioBuffer = await tts.toArrayBuffer(textContent);
} else {
audioBuffer = await api.llm.speech({
model: config.ttsConfig.model,
input: textContent,
voice: config.ttsConfig.voice,
speed: config.ttsConfig.speed,
});
}
setSpeechStatus(true);
ttsPlayer
.play(audioBuffer, () => {
setSpeechStatus(false);
})
.catch((e) => {
console.error("[OpenAI Speech]", e);
showToast(prettyObject(e));
setSpeechStatus(false);
})
.finally(() => setSpeechLoading(false));
}
}

Use dynamic import for markdown-to-txt.

The state variables and function are correctly implemented. However, the require statement for markdown-to-txt inside the function might cause issues in environments where require is not available, such as ES modules.

Consider using dynamic import for markdown-to-txt:

 async function openaiSpeech(text: string) {
   if (speechStatus) {
     ttsPlayer.stop();
     setSpeechStatus(false);
   } else {
     var api: ClientApi;
     api = new ClientApi(ModelProvider.GPT);
     const config = useAppConfig.getState();
     setSpeechLoading(true);
     ttsPlayer.init();
     let audioBuffer: ArrayBuffer;
-    const { markdownToTxt } = require("markdown-to-txt");
+    const { markdownToTxt } = await import("markdown-to-txt");
     const textContent = markdownToTxt(text);
     if (config.ttsConfig.engine !== DEFAULT_TTS_ENGINE) {
       const edgeVoiceName = accessStore.edgeVoiceName();
       const tts = new MsEdgeTTS();
       await tts.setMetadata(
         edgeVoiceName,
         OUTPUT_FORMAT.AUDIO_24KHZ_96KBITRATE_MONO_MP3,
       );
       audioBuffer = await tts.toArrayBuffer(textContent);
     } else {
       audioBuffer = await api.llm.speech({
         model: config.ttsConfig.model,
         input: textContent,
         voice: config.ttsConfig.voice,
         speed: config.ttsConfig.speed,
       });
     }
     setSpeechStatus(true);
     ttsPlayer
       .play(audioBuffer, () => {
         setSpeechStatus(false);
       })
       .catch((e) => {
         console.error("[OpenAI Speech]", e);
         showToast(prettyObject(e));
         setSpeechStatus(false);
       })
       .finally(() => setSpeechLoading(false));
   }
 }

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f86b220 and e9f90a4.

Files selected for processing (1)
  • app/locales/en.ts (2 hunks)
Additional comments not posted (3)
app/locales/en.ts (3)

48-49: LGTM!

The new entries for "Play" and "Stop" are correctly added under the Actions section.

The code changes are approved.


493-513: LGTM!

The new section for TTS functionalities is correctly added with appropriate entries for enabling TTS, autoplay, model, voice, speed, and engine.

The code changes are approved.


514-523: LGTM!

The new section for STT functionalities is correctly added with appropriate entries for enabling STT and selecting the engine.

The code changes are approved.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e9f90a4 and d8b1781.

Files selected for processing (8)
  • app/client/platforms/alibaba.ts (2 hunks)
  • app/client/platforms/anthropic.ts (2 hunks)
  • app/client/platforms/baidu.ts (2 hunks)
  • app/client/platforms/bytedance.ts (2 hunks)
  • app/client/platforms/google.ts (2 hunks)
  • app/client/platforms/iflytek.ts (2 hunks)
  • app/client/platforms/moonshot.ts (2 hunks)
  • app/client/platforms/tencent.ts (2 hunks)
Files skipped from review as they are similar to previous changes (7)
  • app/client/platforms/alibaba.ts
  • app/client/platforms/baidu.ts
  • app/client/platforms/bytedance.ts
  • app/client/platforms/google.ts
  • app/client/platforms/iflytek.ts
  • app/client/platforms/moonshot.ts
  • app/client/platforms/tencent.ts
Additional comments not posted (2)
app/client/platforms/anthropic.ts (2)

83-85: LGTM! Add a TODO comment.

The placeholder implementation is acceptable for now. Consider adding a TODO comment to indicate that the method needs to be implemented.

  speech(options: SpeechOptions): Promise<ArrayBuffer> {
+    // TODO: Implement speech method
     throw new Error("Method not implemented.");
  }

86-88: LGTM! Add a TODO comment.

The placeholder implementation is acceptable for now. Consider adding a TODO comment to indicate that the method needs to be implemented.

  transcription(options: TranscriptionOptions): Promise<string> {
+    // TODO: Implement transcription method
     throw new Error("Method not implemented.");
  }

Your build has completed!

Preview deployment

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d8b1781 and 318e098.

Files selected for processing (3)
  • app/client/api.ts (4 hunks)
  • app/components/chat.tsx (11 hunks)
  • app/components/tts-config.tsx (1 hunks)
Additional context used
Biome
app/components/tts-config.tsx

[error] 28-28: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 53-55: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 74-76: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 96-98: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


[error] 122-124: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)

Additional comments not posted (14)
app/components/tts-config.tsx (3)

1-1: Import statement is correct.

The import statement correctly imports necessary modules and constants.

The code changes are approved.


28-28: Avoid assignments in expressions.

The use of assignments in expressions should be avoided as it can be confusing. Consider refactoring the code to separate assignments from expressions.

Apply this diff to avoid assignments in expressions:

 props.updateConfig(
-  (config) => (config.enable = e.currentTarget.checked),
+  (config) => { config.enable = e.currentTarget.checked; },
 );

 props.updateConfig(
-  (config) => (config.engine = TTSConfigValidator.engine(e.currentTarget.value)),
+  (config) => { config.engine = TTSConfigValidator.engine(e.currentTarget.value); },
 );

 props.updateConfig(
-  (config) => (config.model = TTSConfigValidator.model(e.currentTarget.value)),
+  (config) => { config.model = TTSConfigValidator.model(e.currentTarget.value); },
 );

 props.updateConfig(
-  (config) => (config.voice = TTSConfigValidator.voice(e.currentTarget.value)),
+  (config) => { config.voice = TTSConfigValidator.voice(e.currentTarget.value); },
 );

 props.updateConfig(
-  (config) => (config.speed = TTSConfigValidator.speed(e.currentTarget.valueAsNumber)),
+  (config) => { config.speed = TTSConfigValidator.speed(e.currentTarget.valueAsNumber); },
 );

Also applies to: 53-55, 74-76, 96-98, 122-124

Tools
Biome

[error] 28-28: The assignment should not be in an expression.

The use of assignments in expressions is confusing.
Expressions are often considered as side-effect free.

(lint/suspicious/noAssignInExpressions)


33-46: Remove or explain the commented-out code.

The commented-out code should be removed if it is no longer needed, or an explanation should be added to clarify its purpose.

Apply this diff to remove the commented-out code:

- {/* <ListItem
-   title={Locale.Settings.TTS.Autoplay.Title}
-   subTitle={Locale.Settings.TTS.Autoplay.SubTitle}
- >
-   <input
-     type="checkbox"
-     checked={props.ttsConfig.autoplay}
-     onChange={(e) =>
-       props.updateConfig(
-         (config) => (config.autoplay = e.currentTarget.checked),
-       )
-     }
-   ></input>
- </ListItem> */}
app/client/api.ts (4)

23-23: LGTM!

The constant TTSModels is correctly defined.

The code changes are approved.


52-59: LGTM!

The interface SpeechOptions is correctly defined and provides structured configurations for handling speech requests.

The code changes are approved.


61-69: LGTM!

The interface TranscriptionOptions is correctly defined and provides structured configurations for handling transcription requests.

The code changes are approved.


103-104: LGTM!

The abstract methods speech and transcription are correctly defined and enhance the class's functionality.

The code changes are approved.

app/components/chat.tsx (7)

13-20: LGTM!

The new imports for various speech-related icons and APIs are correctly added and necessary for the new functionalities.

The code changes are approved.

Also applies to: 97-100, 117-124


548-550: LGTM!

The new state variables isListening, isTranscription, and speechApi are correctly implemented and necessary for managing the speech recognition process.

The code changes are approved.


552-563: LGTM!

The useEffect hook is correctly implemented and ensures the proper initialization of the speech API.

The code changes are approved.


565-585: LGTM!

The functions startListening, stopListening, and onRecognitionEnd are correctly implemented and necessary for managing the speech recognition lifecycle.

The code changes are approved.


1175-1216: Use dynamic import for markdown-to-txt.

The function is correctly implemented. However, the require statement for markdown-to-txt inside the function might cause issues in environments where require is not available, such as ES modules.

Consider using dynamic import for markdown-to-txt:

 async function openaiSpeech(text: string) {
   if (speechStatus) {
     ttsPlayer.stop();
     setSpeechStatus(false);
   } else {
     var api: ClientApi;
     api = new ClientApi(ModelProvider.GPT);
     const config = useAppConfig.getState();
     setSpeechLoading(true);
     ttsPlayer.init();
     let audioBuffer: ArrayBuffer;
-    const { markdownToTxt } = require("markdown-to-txt");
+    const { markdownToTxt } = await import("markdown-to-txt");
     const textContent = markdownToTxt(text);
     if (config.ttsConfig.engine !== DEFAULT_TTS_ENGINE) {
       const edgeVoiceName = accessStore.edgeVoiceName();
       const tts = new MsEdgeTTS();
       await tts.setMetadata(
         edgeVoiceName,
         OUTPUT_FORMAT.AUDIO_24KHZ_96KBITRATE_MONO_MP3,
       );
       audioBuffer = await tts.toArrayBuffer(textContent);
     } else {
       audioBuffer = await api.llm.speech({
         model: config.ttsConfig.model,
         input: textContent,
         voice: config.ttsConfig.voice,
         speed: config.ttsConfig.speed,
       });
     }
     setSpeechStatus(true);
     ttsPlayer
       .play(audioBuffer, () => {
         setSpeechStatus(false);
       })
       .catch((e) => {
         console.error("[OpenAI Speech]", e);
         showToast(prettyObject(e));
         setSpeechStatus(false);
       })
       .finally(() => setSpeechLoading(false));
   }
 }

811-819: LGTM!

The button for starting and stopping speech input is correctly implemented and enhances the component's functionality.

The code changes are approved.


1172-1174: LGTM!

The modifications to include the openaiSpeech function are correctly implemented and enhance the function's functionality.

The code changes are approved.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 318e098 and c5168c2.

Files selected for processing (1)
  • app/locales/en.ts (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • app/locales/en.ts

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c5168c2 and a9f39cc.

Files selected for processing (4)
  • app/api/common.ts (2 hunks)
  • app/client/api.ts (5 hunks)
  • app/client/platforms/openai.ts (3 hunks)
  • app/utils/speech.ts (1 hunks)
Files skipped from review as they are similar to previous changes (2)
  • app/client/platforms/openai.ts
  • app/utils/speech.ts
Additional comments not posted (8)
app/api/common.ts (3)

99-99: LGTM!

The code change to dynamically retrieve the Content-Type header from the request, defaulting to "application/json" if not specified, is approved. This enhances the flexibility of handling different content types in requests.


104-104: LGTM!

The code change to replace the hardcoded "Content-Type": "application/json" in the fetchOptions with the new contentType variable is approved. This ensures that the fetch request uses the correct content type as specified by the incoming request.


Line range hint 121-149: LGTM!

The code changes are approved:

  1. Updating the conditional check for processing the request body to verify that the contentType includes "json" before proceeding with the body processing logic ensures that the function only attempts to clone and send the request body if the content type is appropriate, thereby preventing potential errors when handling non-JSON payloads.

  2. Introducing a check to refuse requests for models that are not allowed based on the server configuration enhances security and control over the usage of models.
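
The content-type handling described in these comments can be sketched roughly as follows. This is not the code from app/api/common.ts: `HeaderReader`, `resolveContentType`, and `shouldInspectJsonBody` are hypothetical names, and the sketch only assumes a header object with a `get` method, as the standard `Headers` class provides.

```typescript
// Minimal shape of a header container; the standard Headers class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Use the incoming Content-Type, falling back to JSON when the header is absent.
function resolveContentType(headers: HeaderReader): string {
  return headers.get("Content-Type") || "application/json";
}

// Only JSON bodies are cloned and inspected (for example, for model checks);
// multipart audio uploads are passed through untouched.
function shouldInspectJsonBody(contentType: string): boolean {
  return contentType.includes("json");
}
```

With this split, a transcription upload sent as multipart/form-data skips the JSON-only body inspection, while ordinary chat requests behave as before.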

app/client/api.ts (5)

23-23: LGTM!

The code changes are approved.


52-59: LGTM!

The code changes are approved.


61-69: LGTM!

The code changes are approved.


103-103: LGTM!

The code changes are approved.


104-104: LGTM!

The code changes are approved.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Outside diff range and nitpick comments (1)
package.json (1)

35-35: Looks good! Consider running a vulnerability scan.

The addition of the markdown-to-txt dependency aligns with the PR objective of enhancing text processing capabilities. The version constraint ^2.0.1 is reasonable as it allows for bug fixes and minor enhancements while avoiding potentially breaking changes in major versions.

As a best practice, consider running a vulnerability scan on this new dependency to ensure it doesn't introduce any known security issues.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a9f39cc and 212605a.

Files ignored due to path filters (1)
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
Files selected for processing (12)
  • app/client/api.ts (4 hunks)
  • app/client/platforms/anthropic.ts (2 hunks)
  • app/client/platforms/moonshot.ts (2 hunks)
  • app/client/platforms/openai.ts (3 hunks)
  • app/components/chat.tsx (11 hunks)
  • app/constant.ts (3 hunks)
  • app/layout.tsx (1 hunks)
  • app/locales/cn.ts (3 hunks)
  • app/locales/en.ts (3 hunks)
  • app/locales/index.ts (1 hunks)
  • app/store/config.ts (3 hunks)
  • package.json (2 hunks)
Files skipped from review due to trivial changes (1)
  • app/layout.tsx
Additional comments not posted (44)
app/locales/index.ts (3)

137-159: LGTM!

The introduction of the DEFAULT_STT_LANG constant and the STT_LANG_MAP record enhances the application's support for speech-to-text functionality in multiple languages. The mappings ensure that the appropriate language is used for speech-to-text operations based on the user's selected language.


161-167: LGTM!

The getSTTLang() function provides a convenient and consistent way to retrieve the appropriate speech-to-text language based on the user's selected language in the application. The error handling ensures a default language is returned if the lookup fails, preventing potential issues.
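
As a rough illustration of the lookup-with-fallback behaviour described here, the sketch below maps an application language code to a speech-recognition locale. The map contents and the default value are assumptions chosen for the example, and unlike the function in the PR this version takes the current language as a parameter so that it is self-contained.

```typescript
// Assumed default; the actual value in app/locales/index.ts is not shown in this review.
const DEFAULT_STT_LANG = "en-US";

// Illustrative subset of a language map (app language code -> BCP 47 tag).
const STT_LANG_MAP: Record<string, string> = {
  cn: "zh-CN",
  en: "en-US",
  jp: "ja-JP",
};

// Return the mapped locale, or the default when the language is unknown.
function getSTTLang(appLang: string): string {
  // hasOwnProperty guards against inherited keys such as "toString".
  if (Object.prototype.hasOwnProperty.call(STT_LANG_MAP, appLang)) {
    return STT_LANG_MAP[appLang];
  }
  return DEFAULT_STT_LANG;
}
```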


138-139: This code segment has already been reviewed in a previous comment.

app/store/config.ts (8)

8-15: LGTM!

The addition of default constants for STT and TTS configurations is a good practice. It provides a centralized location for defining default values and promotes consistency across the codebase.


22-26: LGTM!

Defining new types for TTS models, voices, engines, and STT engines based on the default constants is an excellent approach. It ensures type safety and prevents invalid values from being assigned to these configurations.
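
The technique praised here, deriving a union type from a constant array, can be sketched as follows. The array contents and the names `TTS_VOICES`, `TTSVoice`, and `isTTSVoice` are illustrative assumptions, not the definitions in app/store/config.ts.

```typescript
// "as const" keeps the literal types instead of widening to string[].
const TTS_VOICES = ["alloy", "echo", "fable"] as const;

// Indexing the tuple type by number yields the union "alloy" | "echo" | "fable".
type TTSVoice = (typeof TTS_VOICES)[number];

// Runtime type guard that narrows an arbitrary string to the union.
function isTTSVoice(value: string): value is TTSVoice {
  return (TTS_VOICES as readonly string[]).indexOf(value) !== -1;
}
```

Because the type is derived from the array, adding a voice to the constant updates the type automatically, which is what keeps the two from drifting apart.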


84-96: LGTM!

The introduction of ttsConfig and sttConfig sections in the DEFAULT_CONFIG object is a great addition. It allows for fine-grained control over TTS and STT functionalities, enabling users to customize their experience based on their preferences.


102-103: LGTM!

Defining separate types TTSConfig and STTConfig based on the corresponding sections in the ChatConfig type is a good practice. It allows for type-safe access to the TTS and STT configuration sections and improves code readability.


118-131: LGTM!

The introduction of the TTSConfigValidator object is a great addition to ensure the validity of TTS configuration values. The validation functions provide type safety and enforce constraints on the assigned values, preventing invalid configurations from being used.
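
A validator object of this kind might look like the sketch below. It is written under assumptions: the engine names, the fallback behaviour, and the speed bounds of 0.25 to 4.0 are chosen for illustration, and `SketchTTSValidator` and `clamp` are hypothetical names rather than the PR's implementation.

```typescript
// Illustrative engine list; the real constants live in app/constant.ts.
const ENGINES = ["OpenAI-TTS", "Edge-TTS"] as const;
type Engine = (typeof ENGINES)[number];

// Clamp a number into [min, max]; return fallback for NaN, which is what
// valueAsNumber yields when a numeric input is empty.
function clamp(x: number, min: number, max: number, fallback: number): number {
  if (typeof x !== "number" || Number.isNaN(x)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, x));
}

const SketchTTSValidator = {
  // Accept only known engines; otherwise fall back to the first entry.
  engine(x: string): Engine {
    return (ENGINES as readonly string[]).indexOf(x) !== -1 ? (x as Engine) : ENGINES[0];
  },
  // Keep playback speed inside the assumed bounds.
  speed(x: number): number {
    return clamp(x, 0.25, 4.0, 1.0);
  },
};
```

Routing every UI update through such functions means a malformed select value or an empty range input cannot write an out-of-range setting into the stored config.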


133-138: LGTM!

The addition of the STTConfigValidator object with a validation function for STT engine is a good practice. It ensures type safety and prevents invalid values from being assigned to the STT engine configuration.


Line range hint 1-275: Excellent work!

The changes made in this file significantly enhance the application's capabilities in handling TTS and STT configurations. The introduction of default constants, types, and validation functions provides a robust and type-safe way to manage these configurations.

The code is well-structured, follows best practices, and aligns with the existing codebase. The AI-generated summary accurately summarizes the key aspects of the changes made.

Overall, these modifications lay a solid foundation for integrating TTS and STT functionalities into the application.


84-96: Verify the usage of validation functions.

The addition of ttsConfig and sttConfig sections in the DEFAULT_CONFIG object, along with the corresponding validation functions in TTSConfigValidator and STTConfigValidator, is a great approach to ensure the validity of TTS and STT configurations.

However, it's important to verify that the validation functions are being used consistently whenever TTS and STT configurations are modified or accessed throughout the codebase. This ensures that invalid values cannot be assigned and helps maintain the integrity of the configurations.

Run the following script to verify the usage of validation functions:

Also applies to: 118-138

Verification successful

Validation functions are used correctly and consistently.

After a thorough analysis of the codebase, I can confirm that the validation functions for TTS and STT configurations are being used appropriately and consistently. Here are the key findings:

  1. TTS Configuration:
    • TTSConfigValidator is used in the tts-config.tsx component to validate all configuration updates (engine, model, voice, and speed).
    • The validator is applied whenever the user modifies TTS settings through the UI.

  2. STT Configuration:
    • STTConfigValidator is used in the stt-config.tsx component to validate the engine configuration update.
    • The validator is applied when the user changes the STT engine through the UI.

  3. Both validators are properly defined in config.ts and imported where needed.

This implementation ensures that invalid values cannot be assigned to the configurations, maintaining the integrity of the TTS and STT settings throughout the application.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that validation functions are used when modifying or accessing TTS and STT configurations.

# Test: Search for usages of `ttsConfig` and `sttConfig` properties. 
# Expect: Validation functions from `TTSConfigValidator` and `STTConfigValidator` are used.
rg --type typescript -A 5 $'ttsConfig|sttConfig'

Length of output: 84


Script:

#!/bin/bash
# Description: Search for usages of TTS and STT configurations in TypeScript files

# Search for ttsConfig and its properties
echo "Searching for ttsConfig usage:"
rg -g '*.{ts,tsx}' -i '(ttsConfig|ttsConfig\.(enable|autoplay|engine|model|voice|speed))' -A 5

echo "\nSearching for sttConfig usage:"
rg -g '*.{ts,tsx}' -i '(sttConfig|sttConfig\.(enable|engine))' -A 5

echo "\nSearching for TTSConfigValidator usage:"
rg -g '*.{ts,tsx}' 'TTSConfigValidator' -A 5

echo "\nSearching for STTConfigValidator usage:"
rg -g '*.{ts,tsx}' 'STTConfigValidator' -A 5

Length of output: 16434

app/client/platforms/moonshot.ts (1)

29-30: LGTM!

The import statement for SpeechOptions and TranscriptionOptions types is correct. It indicates that the MoonshotApi class will likely use these types for speech and transcription related functionalities.

app/client/api.ts (5)

29-29: LGTM!

The constant declaration is clear and follows the existing naming convention.


58-65: LGTM!

The SpeechOptions interface provides a clear structure for specifying speech synthesis options. The property names are descriptive and the types are appropriate.


67-75: LGTM!

The TranscriptionOptions interface provides a clear structure for specifying audio transcription options. The property names are descriptive and the types are appropriate.


111-112: LGTM!

The speech and transcription abstract methods in the LLMApi class provide a clear contract for implementing speech synthesis and audio transcription functionality. The parameter and return types are appropriate.


231-240: LGTM!

The modifications to the getHeaders function provide flexibility in header management. The conditional check ensures that default headers are only included when necessary, and the constructed headers object follows the existing structure.
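
The conditional described here could look roughly like the following. This is a simplified sketch, not the function from app/client/api.ts: `buildHeaders`, its `ignoreDefaults` flag, and the `apiKey` parameter are hypothetical stand-ins.

```typescript
// Build request headers; JSON defaults are skipped when the caller opts out.
function buildHeaders(
  ignoreDefaults: boolean = false,
  apiKey?: string,
): Record<string, string> {
  let headers: Record<string, string> = {};
  if (!ignoreDefaults) {
    headers = {
      "Content-Type": "application/json",
      Accept: "application/json",
    };
  }
  // Authorization is added in both cases when a key is available.
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return headers;
}
```

Skipping the defaults matters for multipart uploads such as audio transcription: when Content-Type is left unset, fetch generates the multipart boundary itself, whereas a forced application/json header would mislabel the body.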

app/client/platforms/anthropic.ts (2)

90-92: LGTM!

The speech method is correctly added to the ClaudeApi class, following the LLMApi interface. The method is not implemented yet, which is fine for now. Please remember to implement the method in the future.


93-95: LGTM!

The transcription method is correctly added to the ClaudeApi class, following the LLMApi interface. The method is not implemented yet, which is fine for now. Please remember to implement the method in the future.

app/constant.ts (11)

155-155: LGTM!

The SpeechPath constant is correctly defined and follows the naming convention of other OpenAI API paths.


156-156: LGTM!

The TranscriptionPath constant is correctly defined and follows the naming convention of other OpenAI API paths.


263-263: LGTM!

The DEFAULT_TTS_ENGINE constant is correctly defined and clearly indicates the default TTS engine being used.


264-264: LGTM!

The DEFAULT_TTS_ENGINES constant is correctly defined as an array and includes the available TTS engines.


265-265: LGTM!

The DEFAULT_TTS_MODEL constant is correctly defined and clearly indicates the default TTS model being used.


266-266: LGTM!

The DEFAULT_TTS_VOICE constant is correctly defined and clearly indicates the default TTS voice being used.


267-267: LGTM!

The DEFAULT_TTS_MODELS constant is correctly defined as an array and includes the available TTS models.


268-275: LGTM!

The DEFAULT_TTS_VOICES constant is correctly defined as an array and includes the available TTS voice options.


277-277: LGTM!

The DEFAULT_STT_ENGINE constant is correctly defined and clearly indicates the default STT engine being used.


278-278: LGTM!

The DEFAULT_STT_ENGINES constant is correctly defined as an array and includes the available STT engines.


279-279: LGTM!

The FIREFOX_DEFAULT_STT_ENGINE constant is correctly defined and clearly indicates the default STT engine being used for Firefox.

app/client/platforms/openai.ts (3)

89-89: Verify the purpose and usage of the model parameter.

The function signature has been modified to include an optional model parameter. However, the model parameter is not being used within the function body.

Please clarify the intended purpose of the model parameter and ensure that it is being utilized correctly within the function implementation.


152-188: LGTM!

The speech function implementation looks good:

  • It correctly constructs the request payload based on the provided SpeechOptions.
  • The usage of AbortController and timeout mechanism ensures proper handling of request cancellation.
  • The function sends the request to the appropriate OpenAI Speech API endpoint and returns the response as an ArrayBuffer.

The function enhances the ChatGPTApi class with speech synthesis capabilities.
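
The three points above can be sketched roughly as below. This is an illustration under assumptions, not the PR's code: the interface `SpeechRequestOptions`, the helper names, the default values, and the 60-second timeout are all hypothetical, and the endpoint URL is passed in by the caller.

```typescript
// Assumed option shape, modeled on common speech-synthesis request fields.
interface SpeechRequestOptions {
  model: string;
  input: string;
  voice: string;
  response_format?: string;
  speed?: number;
}

// Build the JSON payload, filling in assumed defaults for optional fields.
function buildSpeechPayload(options: SpeechRequestOptions) {
  return {
    model: options.model,
    input: options.input,
    voice: options.voice,
    response_format: options.response_format ?? "mp3",
    speed: options.speed ?? 1,
  };
}

// POST the payload and return the audio bytes; abort if the timeout elapses.
async function requestSpeech(
  url: string,
  apiKey: string,
  options: SpeechRequestOptions,
  timeoutMs: number = 60_000,
): Promise<ArrayBuffer> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(buildSpeechPayload(options)),
      signal: controller.signal,
    });
    if (!res.ok) {
      throw new Error(`Speech request failed with status ${res.status}`);
    }
    return await res.arrayBuffer();
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

Keeping the payload builder separate from the network call makes the request shape easy to check without hitting the API.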


190-229: LGTM!

The transcription function implementation looks good:

  • It correctly constructs the FormData object based on the provided TranscriptionOptions.
  • The function appends the audio file and various optional parameters to the FormData object.
  • The usage of AbortController and timeout mechanism ensures proper handling of request cancellation.
  • The function sends the request to the appropriate OpenAI Transcription API endpoint and extracts the transcribed text from the response.

The function enhances the ChatGPTApi class with audio transcription capabilities.
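
A minimal sketch of the form construction described above, assuming a hypothetical `TranscriptionRequestOptions` shape and a fixed upload filename; the PR's actual field names and defaults may differ.

```typescript
// Assumed option shape; only `file` and `model` are required in this sketch.
interface TranscriptionRequestOptions {
  file: Blob;
  model: string;
  language?: string;
  prompt?: string;
  response_format?: string;
  temperature?: number;
}

// Append the audio file first, then only the optional fields that were set.
function buildTranscriptionForm(options: TranscriptionRequestOptions): FormData {
  const form = new FormData();
  form.append("file", options.file, "audio.wav"); // filename is an assumption
  form.append("model", options.model);
  if (options.language) form.append("language", options.language);
  if (options.prompt) form.append("prompt", options.prompt);
  if (options.response_format) form.append("response_format", options.response_format);
  // Compare against undefined so that a temperature of 0 is still sent.
  if (options.temperature !== undefined) {
    form.append("temperature", String(options.temperature));
  }
  return form;
}
```

When such a form is sent with `fetch`, the `Content-Type` header is normally left unset so the runtime can add the multipart boundary itself.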

app/locales/cn.ts (4)

49-50: LGTM!

The addition of the "Speech" and "StopSpeech" localization strings is consistent with the introduction of speech-related functionalities mentioned in the PR summary. The placement of these strings under the "Actions" section aligns with the existing structure.


84-85: LGTM!

The addition of the "StartSpeak" and "StopSpeak" localization strings aligns with the introduction of speech-related functionalities mentioned in the PR summary. The placement of these strings at the root level of the "Chat" section suggests that they are top-level actions related to speech, which is appropriate.


503-522: LGTM!

The addition of the "TTS" configuration section under the "Settings" section is a great enhancement. It provides users with comprehensive control over text-to-speech settings, including options for enabling TTS, autoplay, model selection, engine selection, voice selection, and speed adjustment. The changes align with the introduction of a dedicated TTS configuration object mentioned in the PR summary.


523-532: LGTM!

The addition of the "STT" configuration section under the "Settings" section is a valuable enhancement. It provides users with control over speech-to-text settings, including options for enabling STT and selecting the conversion engine. The changes align with the introduction of a dedicated STT configuration object mentioned in the PR summary.

app/locales/en.ts (3)

50-51: LGTM!

The added key-value pairs for Speech and StopSpeech actions are consistent with the existing naming convention and align with the PR objective of introducing TTS and STT functionalities.


85-86: LGTM!

The added key-value pairs for StartSpeak and StopSpeak are consistent with the existing naming convention and align with the PR objective of introducing TTS and STT functionalities.


509-539: Looks good!

The added TTS and STT objects provide comprehensive configuration options for the text-to-speech and speech-to-text functionalities. The properties within these objects are consistent with the existing naming convention and align perfectly with the PR objective and the AI-generated summary.

app/components/chat.tsx (4)

559-595: LGTM! The speech recognition feature is correctly implemented.

The speech recognition feature enhances the user experience by allowing voice input. The feature is correctly integrated with the existing chat input and user input state management. The implementation is compatible with different browsers and speech recognition engines.
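
Browser compatibility for voice input usually comes down to feature detection. The sketch below looks for the standard `SpeechRecognition` constructor and the prefixed `webkitSpeechRecognition` variant, and falls back to another engine when neither exists. The helper names and the engine labels passed to `chooseSttEngine` are hypothetical, not taken from the PR.

```typescript
type RecognitionCtor = new () => unknown;

// Return the recognition constructor exposed by the given scope, if any.
function pickRecognitionCtor(scope: Record<string, unknown>): RecognitionCtor | null {
  const ctor = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof ctor === "function" ? (ctor as RecognitionCtor) : null;
}

// Use the preferred (browser) engine when available, otherwise the fallback.
function chooseSttEngine(
  scope: Record<string, unknown>,
  preferred: string,
  fallback: string,
): string {
  return pickRecognitionCtor(scope) !== null ? preferred : fallback;
}
```

In a browser the scope would be `window`; passing it in as a parameter keeps the helpers usable outside a browser.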


1257-1298: LGTM! The text-to-speech feature is correctly implemented.

The text-to-speech feature enhances the user experience by allowing the chat messages to be read aloud. The feature is correctly integrated with the existing chat message rendering and configuration management. The implementation supports different TTS engines and configurations for flexibility and compatibility.


1839-1857: LGTM! The text-to-speech feature is correctly integrated with the chat message actions.

The text-to-speech feature is correctly integrated with the existing chat message actions and rendering. The feature enhances the user experience by allowing the user to start and stop speech for each chat message. The implementation is correctly integrated with the existing TTS configuration management.


1976-1976: LGTM! The setUserInput prop is correctly added to the ChatActions function.

The setUserInput prop is necessary to correctly integrate the speech recognition feature with the existing user input management. The prop is correctly passed down from the parent component and used to set the user input when the speech recognition ends.

Comment on lines +77 to +82
  speech(options: SpeechOptions): Promise<ArrayBuffer> {
    throw new Error("Method not implemented.");
  }
  transcription(options: TranscriptionOptions): Promise<string> {
    throw new Error("Method not implemented.");
  }

Implement the methods before merging.

The speech and transcription method signatures look good:

  • speech accepts SpeechOptions and returns Promise<ArrayBuffer>, which is suitable for speech synthesis.
  • transcription takes TranscriptionOptions and returns Promise<string>, which is appropriate for speech transcription.

However, please ensure that the methods are fully implemented before merging these changes to avoid unexpected errors in the application.

@Dogtiti Dogtiti closed this Sep 18, 2024
@DDMeaqua DDMeaqua deleted the tts-stt branch October 11, 2024 04:53
@DDMeaqua DDMeaqua restored the tts-stt branch October 11, 2024 04:53
@DDMeaqua DDMeaqua deleted the tts-stt branch October 11, 2024 04:53