Skip to content

Conversation

colin-ho
Copy link
Contributor

@colin-ho colin-ho commented Aug 20, 2025

Changes Made

Test the built wheel with tests/dataframe before publishing.

Successful run https://github.com/Eventual-Inc/Daft/actions/runs/17419058418

Related Issues

Checklist

  • Documented in API Docs (if applicable)
  • Documented in User Guide (if applicable)
  • If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

@github-actions github-actions bot added the ci label Aug 20, 2025
Comment on lines +133 to +175
name: Test built wheels
runs-on: ${{ matrix.os }}
needs: build
if: ${{ inputs.run_tests }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest]
runner: [native, ray]
arch: [x86_64]
python-version: ['3.9', '3.11']
env:
package-name: daft
steps:
- uses: actions/checkout@v4
- name: Download built wheels
uses: actions/download-artifact@v4
with:
pattern: wheels-*
merge-multiple: true
path: dist
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Setup Virtual Env
run: |
python -m venv venv
echo "$GITHUB_WORKSPACE/venv/bin" >> $GITHUB_PATH
- name: Install wheel and test dependencies
uses: nick-fields/retry@v3
with:
timeout_minutes: 10
max_attempts: 3
retry_wait_seconds: 10
command: |
source venv/bin/activate
pip install ray pytest pandas dist/*${{ matrix.arch }}*.whl --force-reinstall
rm -rf daft
- name: Run dataframe tests
run: |
source venv/bin/activate
DAFT_RUNNER=${{ matrix.runner }} pytest tests/dataframe --durations=50

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 8 days ago

To fix the problem, add an explicit permissions block at the top level of the workflow file. This block will establish the least privilege required for all jobs in the workflow unless overridden at the job level. For standard CI/CD workflows (build, test, upload/download artifacts, checkout code), contents: read is sufficient. Place the permissions block below the name: and run-name: and above on: on line 3. This provides security and avoids excessive privileges for the GITHUB_TOKEN. No need to modify any individual jobs, as the root-level block applies to all jobs unless overridden.

Suggested changeset 1
.github/workflows/build-wheel.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/build-wheel.yml b/.github/workflows/build-wheel.yml
--- a/.github/workflows/build-wheel.yml
+++ b/.github/workflows/build-wheel.yml
@@ -1,5 +1,7 @@
 name: Build Daft wheel
 run-name: 'Build Daft wheel for ${{ inputs.os }}-${{ inputs.arch }}-lts=${{ inputs.lts }}-lto=${{ inputs.use_lto }}'
+permissions:
+  contents: read
 
 on:
   workflow_call:
EOF
@@ -1,5 +1,7 @@
name: Build Daft wheel
run-name: 'Build Daft wheel for ${{ inputs.os }}-${{ inputs.arch }}-lts=${{ inputs.lts }}-lto=${{ inputs.use_lto }}'
permissions:
contents: read

on:
workflow_call:
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link

codecov bot commented Aug 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.61%. Comparing base (058c6cd) to head (2a10320).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5009      +/-   ##
==========================================
+ Coverage   76.47%   76.61%   +0.13%     
==========================================
  Files         934      936       +2     
  Lines      128764   128819      +55     
==========================================
+ Hits        98471    98692     +221     
+ Misses      30293    30127     -166     

see 21 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@colin-ho colin-ho marked this pull request as ready for review September 3, 2025 00:31
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR enhances the CI/CD pipeline by adding a quality gate that tests built wheels before publishing them to PyPI and S3. The changes implement optional testing functionality across multiple workflows and improve test robustness to handle missing optional dependencies.

The core enhancement is in the GitHub workflows: build-wheel.yml now includes a test-wheels job that can be optionally enabled via a run_tests parameter. When enabled, this job downloads the built wheels and runs the tests/dataframe test suite across Python 3.9 and 3.11 with both native and Ray runners on Ubuntu. The publish-pypi.yml workflow now sets run_tests: true by default, ensuring wheels are validated before reaching end users. The publish-dev-s3.yml workflow adds the same capability as an optional parameter.

To support reliable testing in CI environments, several test files were updated to handle optional dependencies more gracefully. Files like test_logical_type.py, test_repr.py, and others now use pytest.importorskip('PIL') instead of direct PIL imports, ensuring tests are skipped rather than failing when PIL is unavailable. Similarly, test_creation.py was updated to conditionally define PyExtType and skip tests when PyArrow's PyExtensionType is not available. The integration test configuration in conftest.py was optimized with lazy imports for boto3, numpy, s3fs, and PIL to improve startup performance.

Additional improvements include environment variables to disable analytics and progress bars during CI execution, and enhanced test conditions in test_aggregations.py to ensure proper runner-specific behavior validation. These changes integrate seamlessly with the existing build matrix across different operating systems and architectures while adding this crucial validation step.

Context used:

Rule - Import statements should be placed at the top of the file rather than inline within functions or methods. (link)

8 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link
Member

@kevinzwang kevinzwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the rationale for running the tests on the built wheel? Is it for checking dependencies? Additionally, wondering why only the dataframe tests.

@colin-ho
Copy link
Contributor Author

colin-ho commented Sep 3, 2025

Could you explain the rationale for running the tests on the built wheel? Is it for checking dependencies? Additionally, wondering why only the dataframe tests.

Yeah to make sure we didn't accidentally add dependencies. These tests run on a fresh env with pip installed daft.

A lot of the other tests have testing dependencies. I guess we can add 'importorskip' to all of them, but i thought just dataframe is good enough, better than nothing, for now.

@colin-ho colin-ho requested a review from kevinzwang September 3, 2025 18:26
@colin-ho colin-ho merged commit e36a830 into main Sep 3, 2025
52 of 54 checks passed
@colin-ho colin-ho deleted the colin/run-tests-before-publish branch September 3, 2025 19:26
venkateshdb pushed a commit to venkateshdb/Daft that referenced this pull request Sep 6, 2025
## Changes Made

Test the built wheel with `tests/dataframe` before publishing. 

Successful run
https://github.com/Eventual-Inc/Daft/actions/runs/17419058418

## Related Issues

<!-- Link to related GitHub issues, e.g., "Closes Eventual-Inc#123" -->

## Checklist

- [ ] Documented in API Docs (if applicable)
- [ ] Documented in User Guide (if applicable)
- [ ] If adding a new documentation page, doc is added to
`docs/mkdocs.yml` navigation
- [ ] Documentation builds and is formatted properly (tag @/ccmao1130
for docs review)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants