sccache: integrate with Velox build #52

Merged
mattgara merged 20 commits into main from mattgara/sccache-velox
Oct 22, 2025

@mattgara
Contributor

* Add `sccache` integration with `build_velox.sh`

* Document how to set up and use `sccache`
@mattgara mattgara changed the title sccache: Integrate with Velox build sccache: integrate with Velox build Sep 22, 2025
Remove non-existent temporary directory removal.
First, set up authentication credentials:
```bash
cd velox-testing/velox/scripts
./setup_sccache_auth.sh
```
Contributor

Running this script results in the following error:

ERROR: failed to build: failed to solve: failed to read dockerfile: open sccache_auth.dockerfile: no such file or directory

Contributor Author

Oops, I think I forgot to include the Dockerfile in this PR, my bad.

Comment on lines +36 to +47
sccache Usage:
The --sccache option enables distributed compilation caching using the RAPIDS sccache infrastructure.

Setup authentication first:
./setup_sccache_auth.sh [output_dir]

Then use with build:
--sccache --sccache-auth-dir ~/.sccache-auth

Or set environment variable:
export SCCACHE_AUTH_DIR=~/.sccache-auth
./build_velox.sh --sccache
Contributor

Nit: Is this duplicating information in the README?

Comment on lines +61 to +62
$(basename "$0") --sccache --sccache-github-token ghp_xxx # Full distributed compilation
$(basename "$0") --sccache --sccache-show-stats # Build with sccache and show stats
Contributor

Are --sccache-github-token and --sccache-show-stats supported options?

Contributor Author

@mattgara Sep 23, 2025

Sorry, these were vestigial options I had at one point and forgot to remove from the docs.

# Add sccache build arguments
if [[ "$ENABLE_SCCACHE" == true ]]; then
    DOCKER_BUILD_OPTS+=(--build-arg ENABLE_SCCACHE="ON")
    # Copy auth files to build context
Contributor

Should this be cleaned up later?

Contributor Author

Sure, I put in a trap.
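
The trap-based cleanup mentioned here can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the function name `build_with_staged_auth`, its arguments, and the staging directory are hypothetical, and the build step is whatever command the caller passes in rather than a real `docker build`.

```bash
# Stage auth files, run a build command, and always remove the staged copies.
# The function body is a subshell, so the EXIT trap fires when it returns,
# whether the build command succeeded or failed.
build_with_staged_auth() (
  auth_dir="$1"      # directory holding github_token and aws_credentials
  staging_dir="$2"   # hypothetical stand-in for the Docker build context
  shift 2

  cleanup() {
    rm -f "$staging_dir/github_token" "$staging_dir/aws_credentials"
  }
  trap cleanup EXIT

  mkdir -p "$staging_dir"
  cp "$auth_dir/github_token" "$auth_dir/aws_credentials" "$staging_dir/"

  # Run the caller-supplied build command; its exit status becomes ours.
  "$@"
)
```

Tying the cleanup to `EXIT` rather than calling it at the end of the script means the copied secrets are removed even when the build step fails partway through.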

mkdir -p ~/.config/sccache ~/.aws

# Install AWS credentials
cp /sccache_auth/aws_credentials ~/.aws/credentials
Contributor

Will this override existing credential setup?

Contributor Author

This also only affects `~/.aws/credentials` in the Docker container that is used to build Velox; it does not affect the host.
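
If overwriting ever did become a concern (for example, if the same step were run outside a fresh container), one option is to skip the copy when the destination already exists. The sketch below is an assumption about how that could look, not something the PR does; `install_credentials` is a hypothetical helper name.

```bash
# Copy a credentials file into place only if nothing is there yet.
install_credentials() {
  src="$1"
  dest="$2"
  if [ -e "$dest" ]; then
    echo "Keeping existing $dest" >&2
    return 0
  fi
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
  # Restrict the copy to the owner, since it holds secrets.
  chmod 600 "$dest"
}
```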

else
    echo "Error: Distributed compilation not available, check connectivity"
    exit 1
fi
(No newline at end of file)
Contributor

Nit: Add newline to the end of file.

echo " ./build_velox.sh --sccache"
echo

echo -e "${GREEN}Setup complete!${NC}"
(No newline at end of file)
Contributor

Nit: Add newline to the end of file.
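
A quick way to catch this before review is to test the last byte of each script. The helper below is a small sketch; `ends_with_newline` is a hypothetical name and is not part of the PR.

```bash
# Return 0 if the file is empty or ends with a newline, 1 otherwise.
ends_with_newline() {
  file="$1"
  if [ ! -s "$file" ]; then
    return 0
  fi
  # Command substitution strips trailing newlines, so the captured last
  # byte is empty exactly when that byte is a newline.
  [ -z "$(tail -c 1 "$file")" ]
}
```

For example, `ends_with_newline setup_sccache_auth.sh || echo "missing final newline"` would flag the file discussed above.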

* Added missing dockerfile which was omitted in first commit
Comment on lines +27 to +28
echo "AWS credentials file preview:"
head -3 ~/.aws/credentials
Contributor

Should we be printing credentials?

Similarly, should this script be enabling tracing (i.e. set -x) by default?

Contributor Author

Hmm, I'm not sure why this is showing in GH; this section was deleted a while ago. Please double-check.

I've disabled `set -x` for this script now.
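
Where a script still needs to confirm that credentials were installed, it can report on the file without echoing any of its contents. The following is a sketch under that assumption; `describe_credentials` is a hypothetical helper, not code from this PR.

```bash
# Report whether a credentials file is present without printing key material.
describe_credentials() {
  file="$1"
  if [ -s "$file" ]; then
    # Count profile headers such as [default]; grep -c prints only the count.
    profiles="$(grep -c '^\[' "$file" || true)"
    echo "credentials file present: $file ($profiles profile(s))"
  else
    echo "credentials file missing or empty: $file"
  fi
}
```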

-j|--num-threads NUM Number of threads to use for building (default: 3/4 of CPU cores).
--benchmarks true|false Enable benchmarks and nsys profiling tools (default: true).
--sccache Enable sccache distributed compilation caching.
--sccache-auth-dir DIR Directory containing sccache authentication files (github_token, aws_credentials).
Contributor

Does this have to be an argument given that this directory is created by the setup script?

Contributor Author

I can remove this option and we can just rely on the relevant environment variable.

README.md Outdated

Then build Velox with sccache enabled:
```bash
./build_velox.sh --sccache --sccache-auth-dir ~/.sccache-auth
```
Contributor

Running this results in the following error:

 => [stage-0 9/9] RUN --mount=type=bind,source=velox,target=/workspace/velox,ro     set -euxo pipefail &&     if [ "ON" = "ON" ]; then   542.7s
 => => # ed-qualifiers        -Wno-implicit-fallthrough          -Wno-class-memaccess          -Wno-comment          -Wno-int-in-bool-context  
 => => #         -Wno-redundant-move          -Wno-array-bounds          -Wno-maybe-uninitialized          -Wno-unused-result          -Wno-for
 => => # mat-overflow          -Wno-strict-aliasing -Wno-restrict -Werror -O3 -DNDEBUG -std=gnu++20 -fPIC -fdiagnostics-color=always -ffp-contr
 => => # act=off -fPIC -MD -MT velox/buffer/CMakeFiles/velox.dir/__/dwio/dwrf/writer/ColumnWriter.cpp.o -MF velox/buffer/CMakeFiles/velox.dir/_
 => => # _/dwio/dwrf/writer/ColumnWriter.cpp.o.d -o velox/buffer/CMakeFiles/velox.dir/__/dwio/dwrf/writer/ColumnWriter.cpp.o -c /workspace/velo
 => => # x/velox/dwio/dwrf/writer/ColumnWriter.cpp                                                                                             
failed to execute bake: exit status 1

Contributor Author

@mattgara Sep 24, 2025

This looks like a compilation error unrelated to this PR. Can you post more of the logs to confirm?

Contributor

Build logs have been attached. This is using this velox commit: facebookincubator/velox@8875b4d. Note that the build succeeds when sccache is not used.

./build_velox.sh --no-cache --log velox_build.log
velox_build.log

./build_velox.sh --no-cache --sccache --sccache-auth-dir ~/.sccache-auth --log velox_build_sccache.log
velox_build_sccache.log

Contributor Author

@mattgara Oct 7, 2025

@paul-aiyedun Hmm, I've been able to build that commit (facebookincubator/velox@8875b4d) successfully using the command

./build_velox.sh --no-cache --sccache --sccache-auth-dir ~/.sccache-auth --log velox_build_sccache.log

Build log:
velox_build_sccache.log

Could you please retry with a clean checkout/env to further debug this?

Contributor Author

After several attempts I was able to reproduce the issue above. It looks like when distributed compilation is enabled with sccache, the compilers can emit warnings that would otherwise not be hit. Because warnings are treated as errors, this causes the build to fail.

To address this, I've made distributed compilation an opt-in flag and added a warning that the observable behaviour of the compilers may differ when it is enabled, which can result in failed compilation.
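
The opt-in behaviour described above might look something like the sketch below. The names are assumptions: `sccache_build_args` and the `SCCACHE_DIST` build argument are hypothetical, and only `ENABLE_SCCACHE="ON"` appears in the diff under review. The function prints one argument per line; a bash script could instead append the same values to an array such as `DOCKER_BUILD_OPTS`.

```bash
# Print the extra docker build arguments for the chosen sccache mode.
# Caching alone is the default; distributed compilation must be requested.
sccache_build_args() {
  enable_sccache="$1"   # true|false
  enable_dist="$2"      # true|false, hypothetical opt-in switch
  if [ "$enable_sccache" != true ]; then
    return 0
  fi
  echo "--build-arg"
  echo "ENABLE_SCCACHE=ON"
  if [ "$enable_dist" = true ]; then
    echo "WARNING: distributed compilation is enabled; compiler diagnostics may differ from a local build and can fail builds that treat warnings as errors." >&2
    echo "--build-arg"
    echo "SCCACHE_DIST=ON"   # hypothetical build-arg name
  fi
}
```

Keeping the warning on stderr leaves stdout clean for the argument list.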

exit 1
fi
SCCACHE_AUTH_DIR="$2"
shift 2
Contributor

;; needed

Contributor Author

Oops, this looks like an error introduced by the automated merge in Cursor/VS Code.

Add missing ;; lost in automated merge of conflict.
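
For context, each arm of a `case` statement that is followed by another pattern has to end with `;;`, otherwise the next pattern line fails to parse. The sketch below shows the shape of such a parser; it is an illustration assembled from the fragments quoted in this review, and `parse_args` and the error messages are assumptions rather than the script's actual code.

```bash
ENABLE_SCCACHE=false
SCCACHE_AUTH_DIR="$HOME/.sccache-auth"

parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --sccache)
        ENABLE_SCCACHE=true
        shift
        ;;
      --sccache-auth-dir)
        if [ "$#" -lt 2 ]; then
          echo "Error: --sccache-auth-dir requires a directory" >&2
          return 1
        fi
        SCCACHE_AUTH_DIR="$2"
        shift 2
        ;;  # the terminator that was lost in the merge
      *)
        echo "Error: unknown option $1" >&2
        return 1
        ;;
    esac
  done
}
```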
Contributor

Is this dockerfile based on an existing dockerfile or documentation?

Contributor Author

This dockerfile is largely based on documentation in a Slack channel, which I can link offline.

# Cleanup function to remove copied sccache auth files
cleanup_sccache_auth() {
    if [[ "$ENABLE_SCCACHE" == true && -d "../docker/sccache/sccache_auth/" ]]; then
        rm -f ../docker/sccache/sccache_auth/github_token ../docker/sccache/sccache_auth/aws_credentials
Contributor

Nit: Delete the ../docker/sccache/sccache_auth directory?

@mattgara mattgara force-pushed the mattgara/sccache-velox branch from 082ca6e to f7eb417 Compare October 9, 2025 22:02
Contributor

@paul-aiyedun left a comment

Looks good to me.

TREAT_WARNINGS_AS_ERRORS="${TREAT_WARNINGS_AS_ERRORS:-1}"
LOGFILE="./build_velox.log"
ENABLE_SCCACHE=false
SCCACHE_AUTH_DIR="$HOME/.sccache-auth"
Contributor

Make this accept the value from the SCCACHE_AUTH_DIR env var, or fall back to the default "$HOME/.sccache-auth".

Contributor Author

Good point, thanks for the catch.
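
The suggested fallback follows the same `${VAR:-default}` pattern the script already uses for `TREAT_WARNINGS_AS_ERRORS`. A minimal sketch, where `resolve_auth_dir` is a hypothetical wrapper added only so the behaviour is easy to exercise:

```bash
# ${VAR:-default} expands to $VAR when it is set and non-empty,
# and to the default otherwise.
resolve_auth_dir() {
  printf '%s\n' "${SCCACHE_AUTH_DIR:-$HOME/.sccache-auth}"
}

SCCACHE_AUTH_DIR="$(resolve_auth_dir)"
```

With this in place, `export SCCACHE_AUTH_DIR=/some/dir` before invoking the script overrides the default, and leaving it unset keeps `$HOME/.sccache-auth`.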

cat > "$HOME/.sccache-auth/aws_credentials" << EOF
[default]
aws_access_key_id = $SCCACHE_AWS_ACCESS_KEY_ID
aws_secret_access_key = $SCCACHE_AWS_SECRET_ACCESS_KEY
Contributor

Use the default AWS credentials available through GHA secrets instead of generating them every time. The same applies to the GitHub token.

Contributor Author

I don't think we want to make these changes in this PR; enabling sccache for CI is a separate work item that will have a follow-up PR.

@Avinash-Raj
Contributor

Relevant CI run. Can we fix the CI before merging?

@mattgara
Contributor Author

> Relevant CI run. Can we fix the CI before merging?

See above. In summary, I think this is a separate work item that should not block this PR.

@mattgara mattgara merged commit 8d52c99 into main Oct 22, 2025
@mattgara mattgara deleted the mattgara/sccache-velox branch October 22, 2025 19:35