ci: Run tests before publish #5009

colin-ho · 2025-08-20T18:27:27Z

Changes Made

Test the built wheel with tests/dataframe before publishing.

Successful run https://github.com/Eventual-Inc/Daft/actions/runs/17419058418

Related Issues

Checklist

Documented in API Docs (if applicable)
Documented in User Guide (if applicable)
If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

.github/workflows/build-wheel.yml

+    name: Test built wheels
+    runs-on: ${{ matrix.os }}
+    needs: build
+    if: ${{ inputs.run_tests }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest]
+        runner: [native, ray]
+        arch: [x86_64]
+        python-version: ['3.9', '3.11']
+    env:
+      package-name: daft
+    steps:
+    - uses: actions/checkout@v4
+    - name: Download built wheels
+      uses: actions/download-artifact@v4
+      with:
+        pattern: wheels-*
+        merge-multiple: true
+        path: dist
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v5
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Setup Virtual Env
+      run: |
+        python -m venv venv
+        echo "$GITHUB_WORKSPACE/venv/bin" >> $GITHUB_PATH
+    - name: Install wheel and test dependencies
+      uses: nick-fields/retry@v3
+      with:
+        timeout_minutes: 10
+        max_attempts: 3
+        retry_wait_seconds: 10
+        command: |
+          source venv/bin/activate
+          pip install ray pytest pandas dist/*${{ matrix.arch }}*.whl --force-reinstall
+          rm -rf daft
+    - name: Run dataframe tests
+      run: |
+        source venv/bin/activate
+        DAFT_RUNNER=${{ matrix.runner }} pytest tests/dataframe --durations=50


To fix the problem, add an explicit permissions block at the top level of the workflow file. This block will establish the least privilege required for all jobs in the workflow unless overridden at the job level. For standard CI/CD workflows (build, test, upload/download artifacts, checkout code), contents: read is sufficient. Place the permissions block below the name: and run-name: and above on: on line 3. This provides security and avoids excessive privileges for the GITHUB_TOKEN. No need to modify any individual jobs, as the root-level block applies to all jobs unless overridden.

tests/dataframe/test_repr.py

tests/dataframe/test_logical_type.py

codecov · 2025-08-20T23:31:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.61%. Comparing base (058c6cd) to head (2a10320).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #5009      +/-   ##
==========================================
+ Coverage   76.47%   76.61%   +0.13%     
==========================================
  Files         934      936       +2     
  Lines      128764   128819      +55     
==========================================
+ Hits        98471    98692     +221     
+ Misses      30293    30127     -166

see 21 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

greptile-apps

Greptile Summary

This PR enhances the CI/CD pipeline by adding a quality gate that tests built wheels before publishing them to PyPI and S3. The changes implement optional testing functionality across multiple workflows and improve test robustness to handle missing optional dependencies.

The core enhancement is in the GitHub workflows: build-wheel.yml now includes a test-wheels job that can be optionally enabled via a run_tests parameter. When enabled, this job downloads the built wheels and runs the tests/dataframe test suite across Python 3.9 and 3.11 with both native and Ray runners on Ubuntu. The publish-pypi.yml workflow now sets run_tests: true by default, ensuring wheels are validated before reaching end users. The publish-dev-s3.yml workflow adds the same capability as an optional parameter.

To support reliable testing in CI environments, several test files were updated to handle optional dependencies more gracefully. Files like test_logical_type.py, test_repr.py, and others now use pytest.importorskip('PIL') instead of direct PIL imports, ensuring tests are skipped rather than failing when PIL is unavailable. Similarly, test_creation.py was updated to conditionally define PyExtType and skip tests when PyArrow's PyExtensionType is not available. The integration test configuration in conftest.py was optimized with lazy imports for boto3, numpy, s3fs, and PIL to improve startup performance.

Additional improvements include environment variables to disable analytics and progress bars during CI execution, and enhanced test conditions in test_aggregations.py to ensure proper runner-specific behavior validation. These changes integrate seamlessly with the existing build matrix across different operating systems and architectures while adding this crucial validation step.

Context used:

Rule - Import statements should be placed at the top of the file rather than inline within functions or methods. (link)

_{8 files reviewed, no comments}

_{Edit Code Review Bot Settings | Greptile}

kevinzwang

Could you explain the rationale for running the tests on the built wheel? Is it for checking dependencies? Additionally, wondering why only the dataframe tests.

colin-ho · 2025-09-03T05:46:00Z

Could you explain the rationale for running the tests on the built wheel? Is it for checking dependencies? Additionally, wondering why only the dataframe tests.

Yeah to make sure we didn't accidentally add dependencies. These tests run on a fresh env with pip installed daft.

A lot of the other tests have testing dependencies. I guess we can add 'importorskip' to all of them, but i thought just dataframe is good enough, better than nothing, for now.

## Changes Made Test the built wheel with `tests/dataframe` before publishing. Successful run https://github.com/Eventual-Inc/Daft/actions/runs/17419058418 ## Related Issues  ## Checklist - [ ] Documented in API Docs (if applicable) - [ ] Documented in User Guide (if applicable) - [ ] If adding a new documentation page, doc is added to `docs/mkdocs.yml` navigation - [ ] Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

colin-ho added 11 commits August 20, 2025 09:28

run tests before publish

8a9f4a1

run tests before publish

b09bd25

uv run + no slack notif

7a4256d

no mac

7454d79

source activate

79b7cdd

only install x86

0ac702a

modify tests

6c6355d

add native and ray

e47f350

add native and ray

3998741

fix

194d58f

propagate changes

2ae1d68

github-actions bot added the ci label Aug 20, 2025

colin-ho added 6 commits August 20, 2025 11:31

fix test

69ed66f

or

68eade8

optional for dev

3e78560

optional for dev

52118df

workflow dispatch

a567ab3

consolidate test in build

24406fe

github-advanced-security bot found potential problems Aug 20, 2025

View reviewed changes

style

fdec78f

colin-ho commented Aug 20, 2025

View reviewed changes

tests/dataframe/test_repr.py Outdated Show resolved Hide resolved

colin-ho commented Aug 20, 2025

View reviewed changes

tests/dataframe/test_logical_type.py Outdated Show resolved Hide resolved

colin-ho added 2 commits August 20, 2025 12:50

import or skip

23de8b3

fix py ext test

2a10320

colin-ho added 3 commits August 27, 2025 22:39

Merge branch 'main' into colin/run-tests-before-publish

343be5f

Merge branch 'main' into colin/run-tests-before-publish

bb41b02

Merge branch main into colin/run-tests-before-publish

4576842

colin-ho marked this pull request as ready for review September 3, 2025 00:31

greptile-apps bot reviewed Sep 3, 2025

View reviewed changes

colin-ho requested review from srilman and kevinzwang September 3, 2025 00:41

kevinzwang reviewed Sep 3, 2025

View reviewed changes

colin-ho added 2 commits September 3, 2025 11:16

Merge branch 'main' into colin/run-tests-before-publish

13a3be0

Merge branch main into colin/run-tests-before-publish

6e0dd0b

colin-ho requested a review from kevinzwang September 3, 2025 18:26

kevinzwang approved these changes Sep 3, 2025

View reviewed changes

colin-ho merged commit e36a830 into main Sep 3, 2025
52 of 54 checks passed

colin-ho deleted the colin/run-tests-before-publish branch September 3, 2025 19:26

@@ -1,5 +1,7 @@
             name: Build Daft wheel
             run-name: 'Build Daft wheel for ${{ inputs.os }}-${{ inputs.arch }}-lts=${{ inputs.lts }}-lto=${{ inputs.use_lto }}'
+            permissions:
+              contents: read
             on:
               workflow_call:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ci: Run tests before publish #5009

ci: Run tests before publish #5009

Uh oh!

colin-ho commented Aug 20, 2025 •

edited

Loading

Uh oh!

Check warning

Copilot Autofix

Uh oh!

Uh oh!

codecov bot commented Aug 20, 2025

Uh oh!

greptile-apps bot left a comment

Uh oh!

kevinzwang left a comment

Uh oh!

colin-ho commented Sep 3, 2025

Uh oh!

Uh oh!

Uh oh!

ci: Run tests before publish #5009

ci: Run tests before publish #5009

Uh oh!

Conversation

colin-ho commented Aug 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes Made

Related Issues

Checklist

Uh oh!

Check warning

Copilot Autofix

Uh oh!

Uh oh!

codecov bot commented Aug 20, 2025

Codecov Report

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Summary

Context used:

Uh oh!

kevinzwang left a comment

Choose a reason for hiding this comment

Uh oh!

colin-ho commented Sep 3, 2025

Uh oh!

Uh oh!

Uh oh!

colin-ho commented Aug 20, 2025 •

edited

Loading