Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(iast): optimize iast by migrating should_iast_patch function to c #12774

Merged
merged 88 commits into from
Mar 24, 2025

Conversation

avara1986
Copy link
Member

@avara1986 avara1986 commented Mar 18, 2025

TL;DR: Before this PR, IAST startup time was x5,16 lower than an application without IAST startup time
After this PR, IAST startup time was x2,79

Description

In this PR, we aim to improve the startup times of an application when IAST (Interactive Application Security Testing) is enabled. The following improvements have been made:

Performance improvements

  • These changes aim to streamline the codebase, improve the functionality of the IAST module, and remove outdated or redundant code.When reading the source code of a module for transformation with AST, it is now done in binary mode, which is 30% faste as shown in this gist: https://gist.github.com/avara1986/8e31a23978b5eecd9dc8f2f3135060b0. The execution times are:

    • open("r"): 0.9285 secs
    • open("rb"): 0.6362 secs
  • The function should_iast_patch has been migrated to C. By comparing the original function with the new one using the following timeit command, is 40% faster.

    # refactor
    python -m timeit -n 5000000 -r 5 -u usec -s "from ddtrace.appsec._iast._ast import 
    iastpatch""iastpatch.should_iast_patch('mimodulo.submodulo')" 
    # main
    python -m timeit -n 5000000 -r 5 -u usec -s "from ddtrace.appsec._iast._ast.ast_patching import _should_iast_patch" 
    "_should_iast_patch('package.mypackage')"
    
  • I wanted to migrate exec to PyEval_EvalCode but the results are very revealing, exec is approximately 17% faster. The results of this gist https://gist.github.com/avara1986/ae2a65dca1b90afa762af67db680bc2b

    • exec: mean time of 0.083023 seconds
    • PyEval_EvalCode: mean time of 0.099550 seconds

By combining all these changes, the microbenchmarks for startup times show that the baseline takes 2.46 seconds, while this branch takes 1.24 seconds, making it 1.99x faster.

Performance analysis

The _should_iast_patch logs are under asm_config._iast_debug because Debug mode has a significant performance impact, making the function between 1.14x and 3.08x slower

Benchmark Results
--------------------------------------------------------------------------------
Module Name                    Debug Time (s)       No Debug Time (s)    Ratio     
--------------------------------------------------------------------------------
os                                    0.017021            0.005522       3.08x
requests                              0.023237            0.020418       1.14x
ddtrace.appsec._iast                  0.024511            0.012280       2.00x
non.existent.module                   0.023403            0.011170       2.10x

Branch deployed in rel-env and no leaks found:
image

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

avara1986 added 30 commits March 5, 2025 13:26
feat/fix/docs/refactor/ci(xxx): commit title here
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 0% with 58 lines in your changes missing coverage. Please review.

Project coverage is 3.67%. Comparing base (3c49582) to head (120a5d3).

Files with missing lines Patch % Lines
ddtrace/appsec/_iast/_ast/ast_patching.py 0.00% 20 Missing ⚠️
tests/appsec/iast/_ast/test_ast_patching.py 0.00% 18 Missing ⚠️
tests/appsec/iast/test_env_var.py 0.00% 10 Missing ⚠️
ddtrace/appsec/_iast/_loader.py 0.00% 7 Missing ⚠️
ddtrace/appsec/_iast/_ast/visitor.py 0.00% 1 Missing ⚠️
...s/appsec/iast/_ast/test_ast_patching_type_hints.py 0.00% 1 Missing ⚠️
tests/appsec/iast/test_loader.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #12774    +/-   ##
========================================
  Coverage    3.67%    3.67%            
========================================
  Files        1378     1370     -8     
  Lines      135941   135817   -124     
========================================
+ Hits         4993     4995     +2     
+ Misses     130948   130822   -126     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@avara1986 avara1986 changed the title chore(iast): iast startup time chore(iast): optimize iast by enhancing source code reading and migrating should_iast_patch function to C Mar 21, 2025
@avara1986 avara1986 changed the title chore(iast): optimize iast by enhancing source code reading and migrating should_iast_patch function to C chore(iast): optimize iast by enhancing source code reading and migrating should_iast_patch function to c Mar 21, 2025
@avara1986 avara1986 changed the title chore(iast): optimize iast by enhancing source code reading and migrating should_iast_patch function to c chore(iast): optimize iast by migrating should_iast_patch function to c Mar 21, 2025
@avara1986 avara1986 marked this pull request as ready for review March 21, 2025 18:04
@avara1986 avara1986 requested review from a team as code owners March 21, 2025 18:04
@avara1986 avara1986 requested a review from P403n1x87 March 21, 2025 18:04
@avara1986 avara1986 merged commit 9159b81 into main Mar 24, 2025
901 checks passed
@avara1986 avara1986 deleted the avara1986/APPSEC-56907-iast_improvements branch March 24, 2025 13:35
Copy link
Contributor

The backport to 2.21 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.21 2.21
# Navigate to the new working tree
cd .worktrees/backport-2.21
# Create a new branch
git switch --create backport-12774-to-2.21
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9159b811c7c08f133e960b9cb17d7b72d9e4c8aa
# Push it to GitHub
git push --set-upstream origin backport-12774-to-2.21
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.21

Then, create a pull request where the base branch is 2.21 and the compare/head branch is backport-12774-to-2.21.

avara1986 added a commit that referenced this pull request Mar 24, 2025
… c (#12774)

**TL;DR: Before this PR, IAST startup time was x5,16 lower than an
application without IAST startup time
After this PR, IAST startup time was x2,79**

In this PR, we aim to improve the startup times of an application when
IAST (Interactive Application Security Testing) is enabled. The
following improvements have been made:

 ### Performance improvements
- These changes aim to streamline the codebase, improve the
functionality of the IAST module, and remove outdated or redundant
code.When reading the source code of a module for transformation with
AST, it is now done in binary mode, which is **30% faste** as shown in
this gist:
https://gist.github.com/avara1986/8e31a23978b5eecd9dc8f2f3135060b0. The
execution times are:
  - `open("r")`: 0.9285 secs
  - `open("rb")`: 0.6362 secs
- The function `should_iast_patch` has been migrated to C. By comparing
the original function with the new one using the following `timeit`
command, is **40% faster**.
  ```
  # refactor
python -m timeit -n 5000000 -r 5 -u usec -s "from
ddtrace.appsec._iast._ast import
  iastpatch""iastpatch.should_iast_patch('mimodulo.submodulo')"
  # main
python -m timeit -n 5000000 -r 5 -u usec -s "from
ddtrace.appsec._iast._ast.ast_patching import _should_iast_patch"
  "_should_iast_patch('package.mypackage')"
  ```

- I wanted to migrate `exec` to `PyEval_EvalCode` but the results are
very revealing, exec is approximately **17% faster**. The results of
this gist
https://gist.github.com/avara1986/ae2a65dca1b90afa762af67db680bc2b
  - `exec`: mean time of 0.083023 seconds
  - `PyEval_EvalCode`: mean time of 0.099550 seconds

By combining all these changes, the microbenchmarks for startup times
show that the baseline takes 2.46 seconds, while this branch takes 1.24
seconds, making it **1.99x faster**.

 ### Performance analysis
The `_should_iast_patch` logs are under asm_config._iast_debug because
Debug mode has a significant performance impact, making the function
between 1.14x and 3.08x slower
```
Benchmark Results
--------------------------------------------------------------------------------
Module Name                    Debug Time (s)       No Debug Time (s)    Ratio
--------------------------------------------------------------------------------
os                                    0.017021            0.005522       3.08x
requests                              0.023237            0.020418       1.14x
ddtrace.appsec._iast                  0.024511            0.012280       2.00x
non.existent.module                   0.023403            0.011170       2.10x
```
Branch deployed in rel-env and no leaks found:

![image](https://github.com/user-attachments/assets/faaa6ead-6c48-4f05-944b-5d29ed338d9a)

- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

(cherry picked from commit 9159b81)
avara1986 added a commit that referenced this pull request Mar 25, 2025
… c [backport 2.21] (#12857)

Backport #12774 to 2.21.

**TL;DR: Before this PR, IAST startup time was x5,16 lower than an
application without IAST startup time
After this PR, IAST startup time was x2,79**

## Description
In this PR, we aim to improve the startup times of an application when
IAST (Interactive Application Security Testing) is enabled. The
following improvements have been made:

 ### Performance improvements
- These changes aim to streamline the codebase, improve the
functionality of the IAST module, and remove outdated or redundant
code.When reading the source code of a module for transformation with
AST, it is now done in binary mode, which is **30% faste** as shown in
this gist:
https://gist.github.com/avara1986/8e31a23978b5eecd9dc8f2f3135060b0. The
execution times are:
  - `open("r")`: 0.9285 secs
  - `open("rb")`: 0.6362 secs
- The function `should_iast_patch` has been migrated to C. By comparing
the original function with the new one using the following `timeit`
command, is **40% faster**.
  ```
  # refactor
python -m timeit -n 5000000 -r 5 -u usec -s "from
ddtrace.appsec._iast._ast import
  iastpatch""iastpatch.should_iast_patch('mimodulo.submodulo')" 
  # main
python -m timeit -n 5000000 -r 5 -u usec -s "from
ddtrace.appsec._iast._ast.ast_patching import _should_iast_patch"
  "_should_iast_patch('package.mypackage')"
  ```

- I wanted to migrate `exec` to `PyEval_EvalCode` but the results are
very revealing, exec is approximately **17% faster**. The results of
this gist
https://gist.github.com/avara1986/ae2a65dca1b90afa762af67db680bc2b
  - `exec`: mean time of 0.083023 seconds
  - `PyEval_EvalCode`: mean time of 0.099550 seconds

By combining all these changes, the microbenchmarks for startup times
show that the baseline takes 2.46 seconds, while this branch takes 1.24
seconds, making it **1.99x faster**.

 ### Performance analysis
The `_should_iast_patch` logs are under asm_config._iast_debug because
Debug mode has a significant performance impact, making the function
between 1.14x and 3.08x slower
```
Benchmark Results
--------------------------------------------------------------------------------
Module Name                    Debug Time (s)       No Debug Time (s)    Ratio     
--------------------------------------------------------------------------------
os                                    0.017021            0.005522       3.08x
requests                              0.023237            0.020418       1.14x
ddtrace.appsec._iast                  0.024511            0.012280       2.00x
non.existent.module                   0.023403            0.011170       2.10x
```
Branch deployed in rel-env and no leaks found:

![image](https://github.com/user-attachments/assets/faaa6ead-6c48-4f05-944b-5d29ed338d9a)


## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)


(cherry picked from commit 9159b81)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ASM Application Security Monitoring backport 2.21 changelog/no-changelog A changelog entry is not required for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants