NestBot AI Assistant Contexts #1891
Merged
Changes from 12 commits (57 commits total):

- rag tool for agent (Dishant1804, f254af8)
- code rabbit suggestions implemented (Dishant1804, a1bba29)
- Merge branch 'main' into RAG (Dishant1804, ad3f3b4)
- Merge branch 'main' into RAG (arkid15r, c1334a6)
- Merge branch 'main' into RAG (Dishant1804, c9d4a27)
- Merge branch 'main' into RAG (Dishant1804, ff45de1)
- suggestions implemented (Dishant1804, 4b38f5a)
- Merge remote-tracking branch 'upstream/main' into RAG (Dishant1804, b2c5b59)
- code rabbit suggestion (Dishant1804, 9b94aed)
- Merge branch 'main' into RAG (Dishant1804, 3038f32)
- Merge remote-tracking branch 'upstream/main' into RAG (Dishant1804, e120962)
- added context model (Dishant1804, f24453a)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, e876a0c)
- retrieving data from context model (Dishant1804, 981277a)
- removed try except (Dishant1804, 8b46f08)
- Suggestions implemented (Dishant1804, 16fabcf)
- code rabbit suggestion (Dishant1804, 532be09)
- Merge branch 'main' into context-model (Dishant1804, 77203b8)
- removed deafult (Dishant1804, 9e03b53)
- updated tests (Dishant1804, ed44239)
- Merge branch 'main' into context-model (Dishant1804, 41f8126)
- de coupled context and chunks (Dishant1804, c5aba9c)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, 697a406)
- update method for context (Dishant1804, 46cd884)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, a3255ff)
- major revamp and test cases (Dishant1804, 64c079a)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, 7affa22)
- code rabbit suggestions (Dishant1804, 55132d7)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, 3d7bd48)
- major revamp (Dishant1804, 7d0731b)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, ff3e61a)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, c709b9e)
- suggestions implemented (Dishant1804, 1c7fe1c)
- refactoring (Dishant1804, 948c529)
- more tests (Dishant1804, 1455083)
- Merge branch 'main' into context-model (Dishant1804, 1e8d65e)
- more refactoring (Dishant1804, 3f15d7a)
- Merge branch 'main' into context-model (Dishant1804, 742a15e)
- Merge branch 'main' into context-model (Dishant1804, bd8f280)
- suggestions implemented (Dishant1804, 8610dde)
- Merge branch 'main' into context-model (Dishant1804, a9da28b)
- chunk model update (Dishant1804, a0ed311)
- update logic and suggestions (Dishant1804, 9646366)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, 2d86dcb)
- code rabbit suggestions (Dishant1804, 011e843)
- before tests and question (Dishant1804, 466bca3)
- sugesstions and decoupling with tests (Dishant1804, 9c2556c)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, c9f260d)
- Merge branch 'main' into context-model (Dishant1804, 197c0ff)
- sugesstions implemented (Dishant1804, 4dc3800)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, 346d324)
- Update code (arkid15r, baae5eb)
- updated code (Dishant1804, f6bb1bd)
- spelling fixes (Dishant1804, 6c353d1)
- Merge remote-tracking branch 'upstream/main' into context-model (Dishant1804, 506ad46)
- test changes (Dishant1804, 871d266)
- Update tests (arkid15r)
Three of the added files are empty (no diff to display).
generator.py (new file, +120 lines):

```python
"""Generator for the RAG system."""

import logging
import os
from typing import Any

import openai

logger = logging.getLogger(__name__)


class Generator:
    """Generates answers to user queries based on retrieved context."""

    MAX_TOKENS = 2000
    SYSTEM_PROMPT = """
You are a helpful and professional AI assistant for the OWASP Foundation.
Your task is to answer user queries based ONLY on the provided context.
Follow these rules strictly:
1. Base your entire answer on the information given in the "CONTEXT" section. Do not use any
external knowledge unless it is about OWASP.
2. Do not mention the words "context", "based on context", "provided information",
"information given to me", or similar phrases in your responses.
3. Answer only questions related to OWASP and within the scope of OWASP.
4. Be concise and directly answer the user's query.
5. Provide the relevant link if the context contains a URL.
6. For location-based queries, look for latitude and longitude in the context and provide
the nearest OWASP chapter based on them.
7. You may ask for more information if the query is very personalized or user-centric.
8. After trying all of the above, if the context does not contain the information or the
query is out of scope for OWASP, you MUST state: "please ask question related to OWASP."
"""
    TEMPERATURE = 0.4

    def __init__(self, chat_model: str = "gpt-4o"):
        """Initialize the Generator.

        Args:
            chat_model (str): The name of the OpenAI chat model to use for generation.

        Raises:
            ValueError: If the OpenAI API key is not set.

        """
        if not (openai_api_key := os.getenv("DJANGO_OPEN_AI_SECRET_KEY")):
            error_msg = "DJANGO_OPEN_AI_SECRET_KEY environment variable not set"
            raise ValueError(error_msg)

        self.chat_model = chat_model
        self.openai_client = openai.OpenAI(api_key=openai_api_key)
        logger.info("Generator initialized with chat model: %s", self.chat_model)

    def prepare_context(self, context_chunks: list[dict[str, Any]]) -> str:
        """Format the list of retrieved context chunks into a single string for the LLM.

        Args:
            context_chunks: A list of chunk dictionaries from the retriever.

        Returns:
            A formatted string containing the context.

        """
        if not context_chunks:
            return "No context provided"

        formatted_context = []
        for i, chunk in enumerate(context_chunks):
            source_name = chunk.get("source_name", f"Unknown Source {i + 1}")
            text = chunk.get("text", "")

            context_block = f"Source Name: {source_name}\nContent: {text}"
            formatted_context.append(context_block)

        return "\n\n---\n\n".join(formatted_context)

    def generate_answer(self, query: str, context_chunks: list[dict[str, Any]]) -> str:
        """Generate an answer to the user's query using provided context chunks.

        Args:
            query: The user's query text.
            context_chunks: A list of context chunks retrieved by the retriever.

        Returns:
            The generated answer as a string.

        """
        formatted_context = self.prepare_context(context_chunks)

        user_prompt = f"""
- You are an assistant for question-answering tasks related to OWASP.
- Use the following pieces of retrieved context to answer the question.
- If the question is related to OWASP you may answer from your own knowledge; if you
don't know the answer, just say that you don't know.
- Keep the answer concise, but provide more detail when you think a longer response
will serve the user better.
- Ask for the current location if the query is related to location.
- Ask for the information you need if the query is very personalized or user-centric.
- Do not mention the words "context", "based on context", "provided information",
"information given to me", or similar phrases in your responses.
Question: {query}
Context: {formatted_context}
Answer:
"""

        try:
            response = self.openai_client.chat.completions.create(
                model=self.chat_model,
                messages=[
                    {"role": "system", "content": self.SYSTEM_PROMPT},
                    {"role": "user", "content": user_prompt},
                ],
                temperature=self.TEMPERATURE,
                max_tokens=self.MAX_TOKENS,
            )
            answer = response.choices[0].message.content.strip()
        except openai.OpenAIError:
            logger.exception("OpenAI API error")
            answer = "I'm sorry, I'm currently unable to process your request."

        return answer
```
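The chunk-joining behavior of `prepare_context` can be exercised in isolation. Below is a minimal standalone sketch of the same formatting logic, without the `Generator` class or the OpenAI client, so it runs with no API key; the chunk dictionaries are made-up examples, not real retriever output:

```python
from typing import Any


def prepare_context(context_chunks: list[dict[str, Any]]) -> str:
    """Standalone copy of the formatting logic in Generator.prepare_context."""
    if not context_chunks:
        return "No context provided"
    formatted_context = []
    for i, chunk in enumerate(context_chunks):
        # Missing source_name falls back to a numbered placeholder label.
        source_name = chunk.get("source_name", f"Unknown Source {i + 1}")
        text = chunk.get("text", "")
        formatted_context.append(f"Source Name: {source_name}\nContent: {text}")
    # Chunks are separated by a visible "---" divider for the LLM.
    return "\n\n---\n\n".join(formatted_context)


chunks = [
    {"source_name": "OWASP Top 10", "text": "A01: Broken Access Control."},
    {"text": "Chapters meet monthly."},  # no source_name -> fallback label
]
result = prepare_context(chunks)
print(result)
```

The second chunk is labeled "Unknown Source 2", showing how the fallback keeps every chunk attributable even when metadata is incomplete.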
The RagTool orchestrator (new file, +72 lines):

```python
"""A tool for orchestrating the components of the RAG process."""

import logging

from apps.ai.common.constants import DEFAULT_CHUNKS_RETRIEVAL_LIMIT, DEFAULT_SIMILARITY_THRESHOLD

from .generator import Generator
from .retriever import Retriever

logger = logging.getLogger(__name__)


class RagTool:
    """Main RAG tool that orchestrates the retrieval and generation process."""

    def __init__(
        self,
        embedding_model: str = "text-embedding-3-small",
        chat_model: str = "gpt-4o",
    ):
        """Initialize the RAG tool.

        Args:
            embedding_model (str, optional): The model to use for embeddings.
            chat_model (str, optional): The model to use for chat generation.

        Raises:
            ValueError: If the OpenAI API key is not set.

        """
        try:
            self.retriever = Retriever(embedding_model=embedding_model)
            self.generator = Generator(chat_model=chat_model)
        except Exception:
            logger.exception("Failed to initialize RAG tool")
            raise

    def query(
        self,
        question: str,
        limit: int = DEFAULT_CHUNKS_RETRIEVAL_LIMIT,
        similarity_threshold: float = DEFAULT_SIMILARITY_THRESHOLD,
        content_types: list[str] | None = None,
    ) -> str:
        """Process a user query using the complete RAG pipeline.

        Args:
            question (str): The user's question.
            limit (int): Maximum number of context chunks to retrieve.
            similarity_threshold (float): Minimum similarity score for retrieval.
            content_types (list[str] | None): Content types to filter by.

        Returns:
            str: The generated answer.

        """
        logger.info("Retrieving context for query")
        retrieved_chunks = self.retriever.retrieve(
            query=question,
            limit=limit,
            similarity_threshold=similarity_threshold,
            content_types=content_types,
        )

        generation_result = self.generator.generate_answer(
            query=question, context_chunks=retrieved_chunks
        )

        logger.info("Successfully processed RAG query")

        return generation_result
```

Note: the original docstring declared a `dict[str, Any]` return value, but the method returns the generated answer string directly; the docstring above reflects the actual behavior.
Review conversation:

- "Can you order the changes you add according to the existing ordering convention?"
- "Can you elaborate on this one? I am unable to understand it."
- "Your ContextAdmin class goes before the ChunkAdmin, and the same for register(). Compare them to the imports order, for example."
- "This is still not addressed for some reason 🤷‍♂️"
- "I have made the changes now."
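The reviewer's point is that the admin classes and their `register()` calls should follow the same alphabetical convention as the imports: Chunk before Context. A hypothetical sketch of that ordering, using dummy classes and a toy `register()` rather than the PR's actual Django admin code:

```python
# Dummy stand-ins for the ModelAdmin subclasses discussed in the review.
class ChunkAdmin:    # "Chunk" sorts before "Context", so this class comes first...
    pass


class ContextAdmin:  # ...and ContextAdmin follows, mirroring the import order.
    pass


registered = []


def register(model_name: str, admin_cls: type) -> None:
    """Toy register(): records registrations in call order."""
    registered.append((model_name, admin_cls.__name__))


# The register() calls mirror the alphabetical class ordering.
register("Chunk", ChunkAdmin)
register("Context", ContextAdmin)

names = [name for name, _ in registered]
assert names == sorted(names)  # ordering convention holds
print(registered)  # → [('Chunk', 'ChunkAdmin'), ('Context', 'ContextAdmin')]
```

Keeping definitions, registrations, and imports in one consistent order makes it trivial to spot a missing entry during review, which is presumably why the convention exists in this codebase.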