enhance: cap patch extra lines and update documentation with separato… #1222

Merged
merged 2 commits into from
Sep 12, 2024

10 changes: 9 additions & 1 deletion docs/docs/faq/index.md
@@ -20,7 +20,7 @@

Read more about this issue in our [blog](https://www.codium.ai/blog/understanding-the-challenges-and-pain-points-of-the-pull-request-cycle/)

___

??? note "Question: I received an incorrect or irrelevant suggestion. Why?"

@@ -38,22 +38,30 @@
- In addition, we recommend to use the [`extra_instructions`](https://pr-agent-docs.codium.ai/tools/improve/#extra-instructions-and-best-practices) field to guide the model to suggestions that are more relevant to the specific needs of the project.
- The interactive [PR chat](https://pr-agent-docs.codium.ai/chrome-extension/) also provides an easy way to get more tailored suggestions and feedback from the AI model.

___

??? note "Question: How can I get more tailored suggestions?"
#### Answer:<span style="display:none;">3</span>

See [here](https://pr-agent-docs.codium.ai/tools/improve/#extra-instructions-and-best-practices) for more information on how to use the `extra_instructions` and `best_practices` configuration options, to guide the model to more tailored suggestions.

___

??? note "Question: Will you store my code ? Are you using my code to train models?"
#### Answer:<span style="display:none;">4</span>

No. PR-Agent strict privacy policy ensures that your code is not stored or used for training purposes.

For a detailed overview of our data privacy policy, please refer to [this link](https://pr-agent-docs.codium.ai/overview/data_privacy/)

___

??? note "Question: Can I use my own LLM keys with PR-Agent?"
#### Answer:<span style="display:none;">5</span>

When you self-host, you use your own keys.

PR-Agent Pro with SaaS deployment is a hosted version of PR-Agent, where Codium AI manages the infrastructure and the keys.
For enterprise customers, on-prem deployment is also available. [Contact us](https://www.codium.ai/contact/#pricing) for more information.

___
3 changes: 3 additions & 0 deletions docs/docs/tools/review.md
@@ -8,6 +8,9 @@ The tool can be triggered automatically every time a new PR is [opened](../usage

Note that the main purpose of the `review` tool is to provide the **PR reviewer** with useful feedbacks and insights. The PR author, in contrast, may prefer to save time and focus on the output of the [improve](./improve.md) tool, which provides actionable code suggestions.

(Read more about the different personas in the PR process and how PR-Agent aims to assist them in our [blog](https://www.codium.ai/blog/understanding-the-challenges-and-pain-points-of-the-pull-request-cycle/))


## Example usage

### Manual triggering
4 changes: 2 additions & 2 deletions docs/docs/usage-guide/additional_configurations.md
@@ -92,8 +92,8 @@ patch_extra_lines_before=4
patch_extra_lines_after=2
```

-Increasing this number provides more context to the model, but will also increase the token budget.
-If the PR is too large (see [PR Compression strategy](https://github.com/Codium-ai/pr-agent/blob/main/PR_COMPRESSION.md)), PR-Agent automatically sets this number to 0, using the original git patch.
+Increasing this number provides more context to the model, but will also increase the token budget, and may overwhelm the model with too much information, unrelated to the actual PR code changes.
+If the PR is too large (see [PR Compression strategy](https://github.com/Codium-ai/pr-agent/blob/main/PR_COMPRESSION.md)), PR-Agent automatically may set this number to 0, and will use the original git patch.
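
To make the effect of these two settings concrete, here is a minimal sketch of how extra context lines widen a hunk. The helper `extend_hunk_range` and its arguments are hypothetical and are not PR-Agent's actual implementation; only the setting names `patch_extra_lines_before` and `patch_extra_lines_after` come from the configuration above.

```python
def extend_hunk_range(start: int, length: int, file_line_count: int,
                      extra_before: int, extra_after: int) -> tuple[int, int]:
    """Return (new_start, new_length) for a hunk widened with context lines.

    Lines are 1-indexed. Hypothetical helper, for illustration only.
    """
    end = start + length - 1                            # last line of the original hunk
    new_start = max(1, start - extra_before)            # never go above the first line
    new_end = min(file_line_count, end + extra_after)   # never run past the last line
    return new_start, new_end - new_start + 1


# A 3-line hunk starting at line 10 of a 100-line file, with
# patch_extra_lines_before=4 and patch_extra_lines_after=2:
print(extend_hunk_range(10, 3, 100, extra_before=4, extra_after=2))  # (6, 9)
```

In this sketch the hunk grows from 3 to 9 lines, which illustrates why larger values raise the token count for every hunk in the PR.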


## Editing the prompts
19 changes: 17 additions & 2 deletions pr_agent/algo/pr_processing.py
@@ -23,8 +23,15 @@

OUTPUT_BUFFER_TOKENS_SOFT_THRESHOLD = 1500
OUTPUT_BUFFER_TOKENS_HARD_THRESHOLD = 1000
MAX_EXTRA_LINES = 10


def cap_and_log_extra_lines(value, direction) -> int:
    if value > MAX_EXTRA_LINES:
        get_logger().warning(f"patch_extra_lines_{direction} was {value}, capping to {MAX_EXTRA_LINES}")
        return MAX_EXTRA_LINES
    return value
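
The behaviour of this helper can be tried in isolation. The sketch below is a standalone copy that assumes the standard-library `logging` module as a stand-in for PR-Agent's `get_logger()`; the logger name `cap_sketch` is made up for the example.

```python
import logging

MAX_EXTRA_LINES = 10
# Assumption: a plain stdlib logger stands in for PR-Agent's get_logger().
logger = logging.getLogger("cap_sketch")


def cap_and_log_extra_lines(value: int, direction: str) -> int:
    # Values above the cap are clamped, and a warning records the original value.
    if value > MAX_EXTRA_LINES:
        logger.warning(f"patch_extra_lines_{direction} was {value}, capping to {MAX_EXTRA_LINES}")
        return MAX_EXTRA_LINES
    return value


print(cap_and_log_extra_lines(4, "before"))   # 4 (unchanged)
print(cap_and_log_extra_lines(25, "after"))   # 10 (capped, warning logged)
```

Note that only the upper bound is enforced: a value of exactly 10 passes through without a warning, and a negative value is returned unchanged.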


def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler,
model: str,
@@ -38,6 +45,8 @@ def get_pr_diff(git_provider: GitProvider, token_handler: TokenHandler,
else:
    PATCH_EXTRA_LINES_BEFORE = get_settings().config.patch_extra_lines_before
    PATCH_EXTRA_LINES_AFTER = get_settings().config.patch_extra_lines_after
    PATCH_EXTRA_LINES_BEFORE = cap_and_log_extra_lines(PATCH_EXTRA_LINES_BEFORE, "before")
    PATCH_EXTRA_LINES_AFTER = cap_and_log_extra_lines(PATCH_EXTRA_LINES_AFTER, "after")

try:
    diff_files_original = git_provider.get_diff_files()
@@ -408,11 +417,17 @@ def get_pr_multi_diffs(git_provider: GitProvider,
for lang in pr_languages:
    sorted_files.extend(sorted(lang['files'], key=lambda x: x.tokens, reverse=True))

# Get the maximum number of extra lines before and after the patch
PATCH_EXTRA_LINES_BEFORE = get_settings().config.patch_extra_lines_before
PATCH_EXTRA_LINES_AFTER = get_settings().config.patch_extra_lines_after
PATCH_EXTRA_LINES_BEFORE = cap_and_log_extra_lines(PATCH_EXTRA_LINES_BEFORE, "before")
PATCH_EXTRA_LINES_AFTER = cap_and_log_extra_lines(PATCH_EXTRA_LINES_AFTER, "after")

# try first a single run with standard diff string, with patch extension, and no deletions
patches_extended, total_tokens, patches_extended_tokens = pr_generate_extended_diff(
    pr_languages, token_handler, add_line_numbers_to_hunks=True,
-    patch_extra_lines_before=get_settings().config.patch_extra_lines_before,
-    patch_extra_lines_after=get_settings().config.patch_extra_lines_after)
+    patch_extra_lines_before=PATCH_EXTRA_LINES_BEFORE,
+    patch_extra_lines_after=PATCH_EXTRA_LINES_AFTER)

# if we are under the limit, return the full diff
if total_tokens + OUTPUT_BUFFER_TOKENS_SOFT_THRESHOLD < get_max_tokens(model):
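
The budget check on the last line above can be sketched on its own. This is an illustration under stated assumptions, not the project's code: `fits_in_single_call` is a hypothetical name, and the `max_model_tokens` argument replaces the call to `get_max_tokens(model)`; the threshold value is the constant shown earlier in this file.

```python
OUTPUT_BUFFER_TOKENS_SOFT_THRESHOLD = 1500


def fits_in_single_call(total_tokens: int, max_model_tokens: int) -> bool:
    """True when the diff plus the reserved output buffer stays under the model limit."""
    # max_model_tokens is a hypothetical stand-in for get_max_tokens(model).
    return total_tokens + OUTPUT_BUFFER_TOKENS_SOFT_THRESHOLD < max_model_tokens


print(fits_in_single_call(6000, 8000))  # True: 7500 < 8000
print(fits_in_single_call(6600, 8000))  # False: 8100 is not below 8000
```

Because extra context lines count toward `total_tokens`, capping them makes it more likely that the extended diff passes this check.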