test(fuzzing): add internal fuzzing infra support - CHAOSPLT-1355 #15685
Merged
42 commits
a3985b8
WIP Fuzzer
edznux-dd d669a17
onboard to internal fuzzing infra
edznux-dd 7b82628
fix rebase errors
edznux-dd aa636fa
long.cc ?
edznux-dd 88f1c5a
linter
edznux-dd 36d3fd8
format
edznux-dd 0570e89
format CMakeLists.txt
taegyunkim 8090d8c
Merge branch 'main' into edouard/add-base-fuzzing-setup
taegyunkim 32b5f22
add all echion cc files
taegyunkim 5aa5dd7
vm.cc needs to be removed as that defines copy_memory again, leading …
taegyunkim 14117e0
Add a comment on Python version
taegyunkim 4db133c
Add a comment on base image and using the same image as in .gitlab/fu…
taegyunkim 4523978
add a docs section on fuzzing
taegyunkim d351f67
Set owners for fuzzing related files
taegyunkim 92ba050
Add datadog internal docs link
taegyunkim 2321f3f
Add spelling wordlist
edznux-dd f372282
empty commit
edznux-dd d8b12d9
WIP Fuzzer
edznux-dd d23e8b2
onboard to internal fuzzing infra
edznux-dd 7baa02e
fix rebase errors
edznux-dd 08f708b
long.cc ?
edznux-dd cd1e2aa
linter
edznux-dd 72ca117
format
edznux-dd 2f4266e
format CMakeLists.txt
taegyunkim 4c7af01
add all echion cc files
taegyunkim 94414ff
vm.cc needs to be removed as that defines copy_memory again, leading …
taegyunkim 23ca79d
Add a comment on Python version
taegyunkim db106ea
Add a comment on base image and using the same image as in .gitlab/fu…
taegyunkim 3064e55
add a docs section on fuzzing
taegyunkim e0cc5aa
Set owners for fuzzing related files
taegyunkim 42ad6a1
Add datadog internal docs link
taegyunkim 7256187
Add spelling wordlist
edznux-dd 60e0dd8
empty commit
edznux-dd bf77a73
Merge branch 'main' into edouard/add-base-fuzzing-setup
edznux-dd 34c0817
PR comments
edznux-dd 47529f2
Merge branch 'edouard/add-base-fuzzing-setup' of github.com:DataDog/d…
edznux-dd 0f3bf5a
bad merge dup values...
edznux-dd 81180f3
Merge branch 'main' into edouard/add-base-fuzzing-setup
taegyunkim 51cb930
Remove scheduled pipeline trigger, only use nightly + manual
edznux-dd c586598
Merge branch 'main' into edouard/add-base-fuzzing-setup
edznux-dd 7db6d6e
Merge branch 'main' into edouard/add-base-fuzzing-setup
edznux-dd 5c7739e
Merge branch 'main' into edouard/add-base-fuzzing-setup
edznux-dd
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| variables: | ||
| REPO_LANG: python # "python" is used everywhere rather than "py" | ||
| # CI_DEBUG_SERVICES: "true" | ||
|
|
||
| fuzz_infra: | ||
| needs: [] | ||
| image: | ||
| name: registry.ddbuild.io/images/mirror/ubuntu:24.04 | ||
| tags: ["arch:amd64"] | ||
| stage: fuzz | ||
| timeout: 5m | ||
| allow_failure: true | ||
| rules: | ||
| # runs during nightly builds | ||
| - if: $NIGHTLY_BUILD == "true" | ||
| # Also allow manual run in branches for ease of debug / testing | ||
| - when: manual | ||
| before_script: | ||
| # Install build dependencies (same as docker/Dockerfile.fuzz) | ||
| # TODO(taegyunkim): Fuzz with all supported versions of Python (3.9 - 3.14). | ||
| # On the ubuntu:24.04 image, python3 defaults to 3.12.3, so fuzzing only | ||
| # covers binaries linked against that version of Python. | ||
| - apt-get update && apt-get install -y --no-install-recommends ca-certificates clang cmake git libclang-rt-dev lld make ninja-build python3 python3-dev python3-pip curl unzip | ||
|
||
| - python3 -m pip install requests --break-system-packages | ||
| # Install vault for fuzzing API authentication | ||
| - VAULT_VERSION=1.21.1 && curl -fsSL "https://releases.hashicorp.com/vault/${VAULT_VERSION}/vault_${VAULT_VERSION}_linux_amd64.zip" -o vault.zip && unzip vault.zip && mv vault /usr/local/bin/vault && rm vault.zip && chmod +x /usr/local/bin/vault | ||
| - git config --global --add safe.directory ${CI_PROJECT_DIR} | ||
| script: | ||
| - python3 .gitlab/scripts/fuzz_infra.py | ||
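When reproducing this job outside CI, it can help to confirm that the tools installed in `before_script` are actually on `PATH` before running the script. The sketch below is not part of the PR: `REQUIRED_TOOLS` and `missing_tools` are hypothetical names, and the tool list is an assumption derived from the `apt-get` and vault install steps above.

```python
import shutil
from typing import Iterable, List

# Executables the job is expected to provide, taken from the apt-get and vault
# install steps. The exact selection is an assumption; adjust it to your setup.
REQUIRED_TOOLS = ["clang", "cmake", "git", "make", "python3", "curl", "unzip", "vault"]


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools are available.")
```

Running it inside the same base image as the job gives a quick signal on whether a build failure comes from the environment or from the fuzz target itself.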
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,247 @@ | ||
| #!/usr/bin/env python3 | ||
|
||
|
|
||
| # This script enables "0 click onboarding" for new fuzzers in the dd-trace-py repository. | ||
| # Any new fuzzer should be detected automatically and run in the internal | ||
| # infrastructure with enrichments, reporting, triaging, auto-fix, etc. | ||
| # Reports are submitted via Slack, to the channel defined by SLACK_CHANNEL. | ||
| # | ||
| # Requirements: | ||
| # | ||
| # This script assumes that: | ||
| # - Each fuzz target lives in its own directory named `fuzz`, which contains a `build.sh` script that builds | ||
| # the target. | ||
| # - The build script appends the path of each built binary to MANIFEST_FILE, which is how this script | ||
| # discovers the fuzz targets. | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from dataclasses import dataclass | ||
| import glob | ||
| import os | ||
| import subprocess | ||
| import sys | ||
| from typing import List | ||
|
|
||
| import requests | ||
|
|
||
|
|
||
| # TODO: replace with the dd-trace-py ops Slack channel once initial onboarding is done | ||
| SLACK_CHANNEL = "fuzzing-ops" | ||
| TEAM_NAME = "profiling-python" | ||
| REPOSITORY_URL = "https://github.com/DataDog/dd-trace-py" | ||
| PROJECT_NAME = "dd-trace-py" | ||
| # We currently only support libfuzzer for this repository. | ||
| FUZZ_TYPE = "libfuzzer" | ||
| API_URL = "https://fuzzing-api.us1.ddbuild.io/api/v1" | ||
|
|
||
| # Paths and constants for script execution | ||
| REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..")) | ||
| FUZZER_BINARY_BASE_PATH = "/tmp/fuzz/build" | ||
| MANIFEST_FILE = os.path.join(FUZZER_BINARY_BASE_PATH, "fuzz_binaries.txt") | ||
| MAX_PKG_NAME_LENGTH = 50 | ||
| VAULT_PATH = "vault" | ||
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class FuzzBinary: | ||
| """Represents a built fuzz binary ready for upload.""" | ||
|
|
||
| pkgname: str | ||
| binary_name: str | ||
| binary_path: str | ||
|
|
||
|
|
||
| def build_and_upload_fuzz( | ||
| team: str = TEAM_NAME, | ||
| slack_channel: str = SLACK_CHANNEL, | ||
| repository_url: str = REPOSITORY_URL, | ||
| ) -> None: | ||
| git_sha = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip() | ||
|
|
||
| # Step 1: Discover and run all build scripts | ||
| build_scripts = discover_build_scripts(REPO_ROOT) | ||
| if not build_scripts: | ||
| print(f"❌ No fuzz build scripts found under {REPO_ROOT}") | ||
| return | ||
|
|
||
| # Clear any previous manifest file | ||
| if os.path.exists(MANIFEST_FILE): | ||
| os.remove(MANIFEST_FILE) | ||
|
|
||
| for build_script in build_scripts: | ||
| run_build_script(build_script) | ||
|
|
||
| # Step 2: Read the manifest file to discover built binaries | ||
| binaries = read_manifest(MANIFEST_FILE) | ||
| if not binaries: | ||
| print(f"❌ No fuzz binaries found in manifest {MANIFEST_FILE}") | ||
| return | ||
|
|
||
| # Step 3: Upload and create a fuzzer for each binary | ||
| for binary in binaries: | ||
| upload_binary(binary, git_sha) | ||
| create_fuzzer(binary, git_sha, team, slack_channel, repository_url) | ||
|
|
||
| print("✅ Fuzzing infrastructure setup completed successfully!") | ||
|
|
||
|
|
||
| def get_package_name(binary_name: str) -> str: | ||
| """ | ||
| Generate a package name for the fuzzing platform from a binary name. | ||
| It's prefixed with the repository name so it's easier to filter. | ||
| The package name is constrained by the k8s label format: at most 63 characters, alphanumeric characters and hyphens. | ||
| """ | ||
| return PROJECT_NAME + "-" + binary_name[:MAX_PKG_NAME_LENGTH].replace("_", "-") | ||
|
|
||
|
|
||
| def _is_executable(file_path: str) -> bool: | ||
| return os.path.isfile(file_path) and os.access(file_path, os.X_OK) | ||
|
|
||
|
|
||
| def discover_build_scripts(repo_root: str) -> List[str]: | ||
| """ | ||
| Discover fuzz build scripts by looking for '**/fuzz/build.sh' | ||
|
|
||
| This allows for "0 click onboarding" for new fuzz harnesses. | ||
| """ | ||
| build_scripts: List[str] = [] | ||
| for build_script in glob.glob(os.path.join(repo_root, "**/fuzz/build.sh"), recursive=True): | ||
| print(f"Found build script: {build_script}") | ||
| build_scripts.append(build_script) | ||
| return build_scripts | ||
|
|
||
|
|
||
| def run_build_script(build_script: str) -> None: | ||
| """Run a fuzz build script.""" | ||
| fuzz_dir = os.path.dirname(build_script) | ||
| print(f"Building fuzz directory: {fuzz_dir}") | ||
|
|
||
| if not os.path.isfile(build_script): | ||
| raise FileNotFoundError(build_script) | ||
|
|
||
| try: | ||
| result = subprocess.run( | ||
| [build_script], | ||
| cwd=fuzz_dir, | ||
| check=True, | ||
| stdout=subprocess.PIPE, | ||
| stderr=subprocess.PIPE, | ||
| text=True, | ||
| ) | ||
| print(result.stdout) | ||
| if result.stderr: | ||
| print(result.stderr) | ||
| except subprocess.CalledProcessError as e: | ||
| print(f"❌ Build script failed with exit code {e.returncode}") | ||
| print(f"Command: {e.cmd}") | ||
| if e.stdout: | ||
| print(f"stdout:\n{e.stdout}") | ||
| if e.stderr: | ||
| print(f"stderr:\n{e.stderr}") | ||
| raise | ||
|
|
||
| print(f"✅ Built fuzzers from {build_script}") | ||
|
|
||
|
|
||
| def read_manifest(manifest_path: str) -> List[FuzzBinary]: | ||
| """ | ||
| Read the manifest file created by build scripts to discover built binaries. | ||
|
|
||
| Each build script appends its binary path(s) to this file. | ||
| """ | ||
| binaries: List[FuzzBinary] = [] | ||
|
|
||
| if not os.path.isfile(manifest_path): | ||
| print(f"⚠️ No manifest file found at {manifest_path}") | ||
| return binaries | ||
|
|
||
| with open(manifest_path) as f: | ||
| for line in f: | ||
| binary_path = line.strip() | ||
| if not binary_path: | ||
| continue | ||
| if not os.path.isfile(binary_path): | ||
| print(f"⚠️ Binary listed in manifest not found: {binary_path}") | ||
| continue | ||
| if not _is_executable(binary_path): | ||
| print(f"⚠️ Binary listed in manifest is not executable: {binary_path}") | ||
| continue | ||
|
|
||
| binary_name = os.path.basename(binary_path) | ||
| print(f"Found fuzz binary: {binary_path}") | ||
| binaries.append( | ||
| FuzzBinary( | ||
| pkgname=get_package_name(binary_name), | ||
| binary_name=binary_name, | ||
| binary_path=binary_path, | ||
| ) | ||
| ) | ||
|
|
||
| return binaries | ||
|
|
||
|
|
||
| def create_fuzzer(binary: FuzzBinary, git_sha: str, team: str, slack_channel: str, repository_url: str) -> bool: | ||
| """Register a fuzzer with the fuzzing platform.""" | ||
| print(f"Starting fuzzer for {binary.pkgname} ({binary.binary_name})...") | ||
| run_payload = { | ||
| "app": binary.pkgname, | ||
| "debug": False, | ||
| "version": git_sha, | ||
| "type": FUZZ_TYPE, | ||
| "binary": binary.binary_name, | ||
| "team": team, | ||
| "slack_channel": slack_channel, | ||
| "repository_url": repository_url, | ||
| } | ||
| try: | ||
| response = requests.post( | ||
| f"{API_URL}/apps/{binary.pkgname}/fuzzers", headers=get_headers(), json=run_payload, timeout=30 | ||
| ) | ||
| response.raise_for_status() | ||
| print(f"✅ Started fuzzer for {binary.pkgname} ({binary.binary_name})") | ||
| print(response.json()) | ||
| except Exception as e: | ||
| print(f"❌ Failed to start fuzzer for {binary.pkgname} ({binary.binary_name}): {e}") | ||
| return False | ||
|
|
||
| return True | ||
|
|
||
|
|
||
| def upload_binary(binary: FuzzBinary, git_sha: str) -> bool: | ||
| """Upload a fuzz binary to the fuzzing platform.""" | ||
| try: | ||
| # Get presigned URL so we can use s3 uploading | ||
| print(f"Getting presigned URL for {binary.pkgname} ({binary.binary_name})...") | ||
| presigned_response = requests.post( | ||
| f"{API_URL}/apps/{binary.pkgname}/builds/{git_sha}/url", headers=get_headers(), timeout=30 | ||
| ) | ||
|
|
||
| presigned_response.raise_for_status() | ||
| presigned_url = presigned_response.json()["data"]["url"] | ||
|
|
||
| print(f"Uploading {binary.pkgname} ({binary.binary_name}) for {git_sha}...") | ||
| with open(binary.binary_path, "rb") as f: | ||
| upload_response = requests.put(presigned_url, data=f, timeout=300) | ||
| upload_response.raise_for_status() | ||
| print(f"✅ Uploaded {binary.binary_name}") | ||
| except Exception as e: | ||
| print(f"❌ Failed to upload binary for {binary.pkgname} ({binary.binary_name}): {e}") | ||
| return False | ||
| return True | ||
|
|
||
|
|
||
| def get_headers() -> dict[str, str]: | ||
| """Build the request headers, authenticating with a token read from Vault.""" | ||
| token = subprocess.check_output( | ||
| [VAULT_PATH, "read", "-field=token", "identity/oidc/token/security-fuzzing-platform"], text=True | ||
| ).strip() | ||
| return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"} | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| print("🚀 Starting fuzzing infrastructure setup...") | ||
| try: | ||
| build_and_upload_fuzz() | ||
| except Exception as e: | ||
| print(f"❌ Failed to set up fuzzing infrastructure: {e}") | ||
| sys.exit(1) | ||
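The manifest contract described in the script header can be illustrated with a small round trip. This is a minimal sketch, not code from the PR: `packages_from_manifest` and `demo` are hypothetical helpers, the binary name `fuzz_example_target` is made up, and the "binary" is just an executable stub file. `get_package_name` mirrors the naming rule in the script.

```python
import os
import stat
import tempfile

PROJECT_NAME = "dd-trace-py"
MAX_PKG_NAME_LENGTH = 50


def get_package_name(binary_name: str) -> str:
    # Same rule as the script: prefix, truncate, swap underscores for hyphens.
    return PROJECT_NAME + "-" + binary_name[:MAX_PKG_NAME_LENGTH].replace("_", "-")


def packages_from_manifest(manifest_path: str) -> dict:
    """Map each usable binary path listed in the manifest to its package name."""
    packages = {}
    with open(manifest_path) as f:
        for line in f:
            path = line.strip()
            # Blank lines, missing files and non-executable files are skipped.
            if path and os.path.isfile(path) and os.access(path, os.X_OK):
                packages[path] = get_package_name(os.path.basename(path))
    return packages


def demo() -> dict:
    with tempfile.TemporaryDirectory() as build_dir:
        # Hypothetical stand-in for a fuzz binary that build.sh would produce.
        binary = os.path.join(build_dir, "fuzz_example_target")
        with open(binary, "w") as f:
            f.write("#!/bin/sh\n")
        os.chmod(binary, os.stat(binary).st_mode | stat.S_IXUSR)

        manifest = os.path.join(build_dir, "fuzz_binaries.txt")
        # What a build.sh is expected to do: append one binary path per line.
        with open(manifest, "a") as f:
            f.write(binary + "\n")
            # A stale entry pointing at a file that was never built.
            f.write(os.path.join(build_dir, "missing_binary") + "\n")

        found = packages_from_manifest(manifest)
        return {os.path.basename(path): name for path, name in found.items()}


if __name__ == "__main__":
    print(demo())
```

With the 12-character `dd-trace-py-` prefix and a 50-character cap on the binary name, the generated package name is at most 62 characters long, which stays within the 63-character k8s label limit.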