Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More vector tests #9

Open
wants to merge 68 commits into
base: master
Choose a base branch
from
Open

More vector tests #9

wants to merge 68 commits into from

Conversation

ElishaKay
Copy link
Owner

@ElishaKay ElishaKay commented Sep 20, 2024

Description by Korbit AI

What change is being made?

Integrate a new Discord bot feature for AI-driven code assistance, enhance vector store capabilities, and update the server to handle additional parameters for repository and branch names.

Why are these changes being made?

These changes introduce a Discord bot to facilitate AI-driven code assistance, allowing users to interact with the AI for coding queries directly within Discord. The vector store enhancements improve data handling and retrieval efficiency, while the server updates ensure seamless integration with the new features by accommodating additional parameters necessary for repository management.

Is this description stale? Ask me to generate a new description by commenting /korbit-generate-pr-description

Copy link

korbit-ai bot commented Sep 20, 2024

My review is in progress 📖 - I will have feedback for you in a few minutes!

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary by Korbit AI

Code Execution Comments

  • Modify the 'ask' command to accept a question argument and correct the misspelling of asimilarity_search for proper functionality.
  • Use await in asynchronous methods to prevent unexpected behavior and race conditions.

Code Health Comments

  • Use specific exception types and logging instead of print() for better error tracking and consistent error management.
  • Remove redundant repo_name addition to prevent unexpected results.
  • Add error handling and timeout in sendWebhookMessage to avoid unresolved Promises.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Current Korbit Configuration

General Settings
Setting Value
Review Schedule Automatic excluding drafts
Max Issue Count 10
Automatic PR Descriptions
Issue Categories
Category Enabled
Naming
Database Operations
Documentation
Logging
Error Handling
Systems and Environment
Objects and Data Structures
Tests
Readability and Maintainability
Asynchronous Processing
Design Patterns
Third-Party Libraries
Performance
Security
Functionality

Feedback and Support

Comment on lines +3 to +10
module.exports = {
data: new SlashCommandBuilder()
.setName('ask')
.setDescription('Ask a question to the bot'),
async execute(interaction) {
await interaction.reply('Please provide your question.');
}
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Functionality severity potentially major

The 'ask' command does not allow users to input a question.

The current implementation of the 'ask' command doesn't actually allow users to ask a question. It only responds with a message asking for a question, but doesn't provide a way for users to input their query. To fix this, we should modify the command to accept a question as an argument. Here's how you can do that:

  1. Add a required string option to the SlashCommandBuilder for the question.
  2. In the execute function, retrieve the question from the interaction options and use it in the response.

This change will make the command functional and allow users to ask questions directly.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines 89 to 91
except Exception as e:
print(f"Error saving to vector store: {e}")
return None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Error Handling severity potentially major

Use specific exception types and implement logging.

Consider using a more specific exception type instead of catching all exceptions with except Exception. This will help in identifying and handling specific error scenarios more effectively. Additionally, printing the error to console might not be sufficient for proper error tracking in a production environment. Consider using a logging mechanism instead.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines 89 to 91
except Exception as e:
print(f"Error saving to vector store: {e}")
return None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Error Handling severity potentially major

Use a logging mechanism instead of print() for error messages.

Instead of using print() to output the error message, consider using a proper logging mechanism. This will provide more consistent error tracking and allow for better error management in different environments. For example, you could use Python's built-in logging module.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +130 to +146
try {
let gptrResponse = await sendWebhookMessage({ query, relevantFileNames, repoName, branchName });

if (gptrResponse && gptrResponse.rubber_ducker_thoughts) {
let rubberDuckerChunks = '';
let theGuidance = gptrResponse.rubber_ducker_thoughts;

try {
console.log('Original rubber_ducker_thoughts:', theGuidance);

// const repairedJson = jsonrepair(theGuidance);
// rubberDuckerChunks = splitMessage(JSON.parse(repairedJson).thoughts);
rubberDuckerChunks = splitMessage(theGuidance)
} catch (error) {
console.error('Error splitting messages:', error);
rubberDuckerChunks = splitMessage(typeof theGuidance === 'object' ? JSON.stringify(theGuidance) : theGuidance);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Error Handling severity potentially major

Log error details in the outer try-catch block.

While you're logging the error when splitting messages fails, it would be beneficial to also log the error details when catching the outer try-catch block. This will provide more context about any errors that occur during the entire process of fetching and processing the GPTR response. Consider adding a log statement in the catch block that includes the error details.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +10 to +17
if self.repo_name:
query += f" - the repo_name is: {self.repo_name}"

researcher = GPTResearcher(
query=query + " - the repo_name is: " + self.repo_name,
report_type="research_report",
report_source="web"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Functionality severity potentially major

Redundant addition of repo_name to the query.

There's a redundancy in adding the repo_name to the query. The repo_name is being added twice: once in the if statement and again when creating the GPTResearcher instance. This could lead to unexpected search results. To fix this, you should remove the redundant addition. Consider updating the code as follows:

query = state['query'] if isinstance(state, dict) else state
if self.repo_name:
    query += f" - the repo_name is: {self.repo_name}"
researcher = GPTResearcher(
    query=query,
    report_type="research_report",
    report_source="web"
)

This change ensures that the repo_name is added to the query only once, maintaining the intended functionality without duplication.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +44 to +76
async function sendWebhookMessage({query, relevantFileNames, repoName, branchName}) {
return new Promise((resolve, reject) => {
if (!socket || socket.readyState !== WebSocket.OPEN) {
initializeWebSocket();
}

const data = {
task: `${query}. The relevant files names are: ${relevantFileNames}`,
report_type: 'dev_team',
report_source: 'web',
tone: 'Objective',
headers: {},
repo_name: typeof repoName === 'undefined' || repoName === '' ? 'assafelovic/gpt-researcher' : repoName,
branch_name: typeof branchName === 'undefined' || branchName === '' ? 'master' : branchName
};

const payload = "start " + JSON.stringify(data);

responseCallback = (response) => {
resolve(response); // Resolve the promise with the WebSocket response
};

if (socket.readyState === WebSocket.OPEN) {
socket.send(payload);
console.log('Message sent:', payload);
} else {
socket.onopen = () => {
socket.send(payload);
console.log('Message sent after connection:', payload);
};
}
});
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Asynchronous Processing severity potentially major

Improve error handling in the sendWebhookMessage function.

The sendWebhookMessage function uses a Promise to handle the asynchronous WebSocket communication, but it doesn't properly handle errors. If an error occurs during the process, such as a connection failure or a timeout, the Promise will remain in a pending state indefinitely. This can lead to unresolved Promises and potential memory leaks.

To address this issue, consider the following:

  1. Add error handling to the Promise. If an error occurs during the WebSocket initialization or message sending process, reject the Promise with an appropriate error.

  2. Implement a timeout mechanism. If no response is received from the WebSocket within a certain time limit, reject the Promise with a timeout error. This will prevent the Promise from staying in a pending state indefinitely.

  3. Handle any errors that may occur during the WebSocket initialization or message sending process. Reject the Promise with an appropriate error if something goes wrong.

By properly handling errors and timeouts, you can ensure that the Promise resolves or rejects appropriately, avoiding potential issues related to unresolved Promises and improving the reliability of the code.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +44 to +48
async function sendWebhookMessage({query, relevantFileNames, repoName, branchName}) {
return new Promise((resolve, reject) => {
if (!socket || socket.readyState !== WebSocket.OPEN) {
initializeWebSocket();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Error Handling severity potentially major

Await the asynchronous call to initializeWebSocket()

The initializeWebSocket() function is called asynchronously, but its result is not being awaited. This could lead to race conditions where the socket might not be ready when you try to use it. Consider using await initializeWebSocket() and handling any potential errors that might occur during initialization.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +42 to +45
async def asimilarity_search(self, query, k, filter):
"""Return query by vector store"""
results = await self.vector_store.asimilarity_search(query=query, k=k, filter=filter)
return results
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Functionality severity potentially major

Method name misspelling.

The asimilarity_search method name appears to be misspelled. It should be asimilarity_search_with_score to match the Langchain VectorStore method. This misspelling will cause the method to fail when called. Please update the method name to ensure proper functionality.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

@@ -28,6 +29,7 @@ def __pretty_print_docs(self, docs):
for d in docs)

async def async_get_context(self, query, max_results=5):
"""Get relevant context from vector store"""
results = await self.vector_store.asimilarity_search(query=query, k=max_results, filter=self.filter)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Asynchronous Processing severity potentially major

Missing 'await' keyword in asynchronous method call.

In the async_get_context method of the VectorstoreCompressor class, you are calling the asynchronous method asimilarity_search from the VectorStoreWrapper class. However, you are not using the await keyword to wait for the asynchronous operation to complete. This can lead to unexpected behavior where the method continues execution before the search results are returned. To fix this, make sure to use await when calling asimilarity_search, like this: results = await self.vector_store.asimilarity_search(query=query, k=max_results, filter=self.filter).

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +21 to +47
try:
result = await run_dev_team_flow(repo_name=repo_name,
query=query,
branch_name=branch_name,
websocket=websocket,
stream_output=stream_output)

# print("result in trigger dev team flow", result)

# Remove the FAISS object from the result dictionary
if isinstance(result, dict):
result = {k: v for k, v in result.items() if k != 'vector_store'}

if websocket and stream_output:
try:
await stream_output("logs", "dev_team_result", result, websocket)
except Exception as e:
print(f"Error sending result over WebSocket: {e}")
# You might want to send an error message to the client here

return result
except Exception as e:
if websocket and stream_output:
try:
await stream_output("logs", "dev_team_result", result, websocket)
except Exception as e:
print(f"Error sending result over WebSocket: {e}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category Error Handling severity potentially major

Exceptions are not being logged in the trigger_dev_team_flow function.

In the trigger_dev_team_flow function, you're catching exceptions but not logging them properly. In the outer except block, you should log the exception e that was caught. This will help with debugging and understanding what went wrong when an exception occurs. Consider adding a logging statement like logging.error(f"Error in trigger_dev_team_flow: {e}") at the beginning of the except block.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

khoangothe and others added 27 commits September 22, 2024 03:23
…tructure as long as PGVECTOR_CONNECTION_STRING & GITHUB_TOKEN env vars are set
…cess to fetch full files directly from a github branch based on file names - these full files are then passed into the prompts of the Rubber Ducker & Tech Lead Agents
ElishaKay and others added 29 commits September 22, 2024 03:26
…: adding discord-bot env token to main .env in root & running 'docker-compose --profile discord run --rm discord-bot'
… relevant file names, repo name & branch name)
…d bot runs to completion with custom repo name & branch name
…js server should log errors without restarting
…nnels - also added a cool down logic so that it only advises about the /ask command every 30 minutes per channel
…me & embedding everything with metadata in gptr-compatible format
…ning to completion with relevant files fetched from gptr's __get_similar_content_by_query_with_vectorstore method
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants