Skip to content

Conversation

@charlesbluca
Copy link
Collaborator

@charlesbluca charlesbluca commented Jul 21, 2022

It looks like the recent release of Fugue 0.7.0 has bumped its qpd dependency to a version that only has python support up to 3.8. I'm not sure if this is the cause for the recent Fugue-related failures, but it does mean that at least for now, we should constrain to fugue<0.7.0, where 3.9+ support is guaranteed.

In the long run, we should probably see what the blockers are to allowing 3.9+ support on qpd again, cc @goodwanghan in case you have any additional context to provide here.

@charlesbluca charlesbluca changed the title Remove antlr4-python3-runtime constraint from 3.9+ test envs Constrain fugue<0.7.0 for 3.9+ test envs Jul 21, 2022
@charlesbluca charlesbluca changed the title Constrain fugue<0.7.0 for 3.9+ test envs Add max version constraint for fugue Jul 21, 2022
@goodwanghan
Copy link
Contributor

Hi, I apologize for the inconvenience, the qpd constraint should not exist. How about I make the change on the QPD side so that you don't need this PR?

@charlesbluca
Copy link
Collaborator Author

@goodwanghan that would be great! I'm not 100% sure if that's the cause of the failures we've been seeing in CI, but resolving that would certainly narrow things down a lot 🙂

@charlesbluca
Copy link
Collaborator Author

It looks like the failures here persist even with the pip package of qpd=0.3.1 (which doesn't have the python<3.9 constraint) 😕 so think for now we might want to go ahead and merge this in to unblock tests, and follow up later with a resolution for whatever the underlying issue is.

@ayushdg
Copy link
Collaborator

ayushdg commented Jul 22, 2022

It looks like the failures here persist even with the pip package of qpd=0.3.1 (which doesn't have the python<3.9 constraint) 😕 so think for now we might want to go ahead and merge this in to unblock tests, and follow up later with a resolution for whatever the underlying issue is.

I see so the issue isn't necessarily qpd and python 3.9+ related but still shows up only on python 3.9 and above right?

@goodwanghan
Copy link
Contributor

I think the problem is on qpd,

 classifiers=[
        # "3 - Alpha", "4 - Beta" or "5 - Production/Stable"
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Developers",
        "Topic :: Software Development :: Libraries :: Python Modules",
        "License :: OSI Approved :: Apache Software License",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3 :: Only",
    ],

I will go ahead adding the versions and publish 0.3.2 to unblock you guys

@goodwanghan
Copy link
Contributor

I just published qpd 0.3.2 can you bump up the version to see if that works for you?

@charlesbluca
Copy link
Collaborator Author

charlesbluca commented Jul 22, 2022

so the issue isn't necessarily qpd and python 3.9+ related but still shows up only on python 3.9 and above right?

@ayushdg yeah the issue doesn't seem to be tied to qpd, seems like a larger issue with fugue 0.7.0; for clarity this problem also impacts python 3.8, but we don't see those failures in our CI because our 3.8 testing is pinned to our minimum dependencies (#555) and doesn't pick up the latest fugue package.

I just published qpd 0.3.2 can you bump up the version to see if that works for you?

@goodwanghan looks like you published a new package to PyPI, but the problem here is with the package on conda-forge; to illustrate, looking at qpd 0.3.1 on PyPI, we can see that the python_requires is >=3.6, while the conda-forge recipe for 0.3.1 specifies python>=3.6,<3.9. We'll need to bump the recipe there and publish the new conda package to get those changes reflected here.

In either case, I tried running the tests on python 3.8 with fugue 0.7.0 and the conda package of qpd 0.3.1 and still ran into the same failures, so I think things here will still be broken once the new package is published; I'm happy to dig into this a little more over the next few days, but think a quick fix to at least unblock things in CI is to constrain to older fugue versions for now.

Copy link
Collaborator

@ayushdg ayushdg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion but lgtm!

@codecov-commenter
Copy link

codecov-commenter commented Jul 22, 2022

Codecov Report

Merging #639 (c66a095) into main (0db4506) will increase coverage by 0.14%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #639      +/-   ##
==========================================
+ Coverage   88.45%   88.60%   +0.14%     
==========================================
  Files          69       69              
  Lines        3500     3500              
  Branches      707      707              
==========================================
+ Hits         3096     3101       +5     
+ Misses        317      308       -9     
- Partials       87       91       +4     
Impacted Files Coverage Δ
dask_sql/_version.py 34.00% <0.00%> (+1.44%) ⬆️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@charlesbluca charlesbluca merged commit ddc26ee into dask-contrib:main Jul 22, 2022
goodwanghan pushed a commit to goodwanghan/dask-sql that referenced this pull request Jul 23, 2022
* Remove antlr4-python3-runtime constraint from 3.9+ test envs

* Revert "Remove antlr4-python3-runtime constraint from 3.9+ test envs"

This reverts commit ef30656.

* Add max version constraint for fugue in 3.9+ envs

* Constrain Fugue in remaining env/setup files

* Clarify fugue constraint comments

* Add pinning back to python 3.8 jdk11 tests

* More reversions to python 3.8 jdk11 testing env
charlesbluca added a commit that referenced this pull request Jul 26, 2022
* Set Dask-sql as the default Fugue Dask engine when installed

* Set Dask-sql as the default Fugue Dask engine when installed

* Add max version constraint for `fugue` (#639)

* Remove antlr4-python3-runtime constraint from 3.9+ test envs

* Revert "Remove antlr4-python3-runtime constraint from 3.9+ test envs"

This reverts commit ef30656.

* Add max version constraint for fugue in 3.9+ envs

* Constrain Fugue in remaining env/setup files

* Clarify fugue constraint comments

* Add pinning back to python 3.8 jdk11 tests

* More reversions to python 3.8 jdk11 testing env

* update

* update

* update

* fix tests

* update tests

* update a few things

* update

* fix

* conda install fugue in testing envs

* Remove diff from features notebook

* Alter documentation to mention automatic registration of execution engine

* Expand FugueSQL notebook

* Don't manually close client in simple statement test

Co-authored-by: Charles Blackmon-Luca <[email protected]>
charlesbluca added a commit that referenced this pull request Aug 2, 2022
* Add basic predicate-pushdown optimization (#433)

* basic predicate-pushdown support

* remove explict Dispatch class

* use _Frame.fillna

* cleanup comments

* test coverage

* improve test coverage

* add xfail test for dt accessor in predicate and fix test_show.py

* fix some naming issues

* add config and use assert_eq

* add logging events when predicate-pushdown bails

* move bail logic earlier in function

* address easier code review comments

* typo fix

* fix creation_info access bug

* convert any expression to DNF

* csv test coverage

* include IN coverage

* improve test rigor

* address code review

* skip parquet tests when deps are not installed

* fix bug

* add pyarrow dep to cluster workers

* roll back test skipping changes

Co-authored-by: Charles Blackmon-Luca <[email protected]>

* Add workflow to keep datafusion dev branch up to date (#440)

* Update gpuCI `RAPIDS_VER` to `22.06` (#434)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Bump black to 22.3.0 (#443)

* Check for ucx-py nightlies when updating gpuCI (#441)

* Simplify gpuCI updating workflow

* Add check for cuML nightly version

* Add handling for newer `prompt_toolkit` versions in cmd tests (#447)

* Add handling for newer prompt-toolkit version

* Place compatibility code in _compat

* Fix version for gha-find-replace (#446)

* Update versions of Java dependencies (#445)

* Update versions for java dependencies with cves

* Rerun tests

* Update jackson databind version (#449)

* Update versions for java dependencies with cves

* Rerun tests

* update jackson-databind dependency

* Disable SQL server functionality (#448)

* Disable SQL server functionality

* Update docs/source/server.rst

Co-authored-by: Ayush Dattagupta <[email protected]>

* Disable server at lowest possible level

* Skip all server tests

* Add tests to ensure server is disabled

* Fix CVE fix test

Co-authored-by: Ayush Dattagupta <[email protected]>

* Update dask pinnings for release (#450)

* Add Java source code to source distribution (#451)

* Bump `httpclient` dependency (#453)

* Revert "Disable SQL server functionality (#448)"

This reverts commit 37a3a61.

* Bump httpclient version

* Unpin Dask/distributed versions (#452)

* Unpin dask/distributed post release

* Remove dask/distributed version ceiling

* Add jsonschema to ci testing (#454)

* Add jsonschema to ci env

* Fix typo in config schema

* Switch tests from `pd.testing.assert_frame_equal` to `dd.assert_eq` (#365)

* Start moving tests to dd.assert_eq

* Use assert_eq in datetime filter test

* Resolve most resulting test failures

* Resolve remaining test failures

* Convert over tests

* Convert more tests

* Consolidate select limit cpu/gpu test

* Remove remaining assert_series_equal

* Remove explicit cudf imports from many tests

* Resolve rex test failures

* Remove some additional compute calls

* Consolidate sorting tests with getfixturevalue

* Fix failed join test

* Remove breakpoint

* Use custom assert_eq function for tests

* Resolve test failures / seg faults

* Remove unnecessary testing utils

* Resolve local test failures

* Generalize RAND test

* Avoid closing client if using independent cluster

* Fix failures on Windows

* Resolve black failures

* Make random test variables more clear

* Set max pin on antlr4-python-runtime  (#456)

* Set max pin on antlr4-python-runtime due to incompatibilities with fugue_sql

* update comment on antlr max pin version

* Move / minimize number of cudf / dask-cudf imports (#480)

* Move / minimize number of cudf / dask-cudf imports

* Add tests for GPU-related errors

* Fix unbound local error

* Fix ddf value error

* Use `map_partitions` to compute LIMIT / OFFSET (#517)

* Use map_partitions to compute limit / offset

* Use partition_info to extract partition_index

* Use `dev` images for independent cluster testing (#518)

* Switch to dask dev images

* Use mamba for conda installs in images

* Remove sleep call for installation

* Use timeout / until to wait for cluster to be initialized

* Add documentation for FugueSQL integrations (#523)

* Add documentation for FugueSQL integrations

* Minor nitpick around autodoc obj -> class

* Timestampdiff support (#495)

* added timestampdiff

* initial work for timestampdiff

* Added test cases for timestampdiff

* Update interval month dtype mapping

* Add datetimesubOperator

* Uncomment timestampdiff literal tests

* Update logic for handling interval_months for pandas/cudf series and scalars

* Add negative diff testcases, and gpu tests

* Update reinterpret and timedelta to explicitly cast to int64 instead of int

* Simplify cast_column_to_type mapping logic

* Add scalar handling to castOperation and reuse it for reinterpret

Co-authored-by: rajagurnath <[email protected]>

* Relax jsonschema testing dependency (#546)

* Update upstream testing workflows (#536)

* Use dask nightly conda packages for upstream testing

* Add independent cluster testing to nightly upstream CI [test-upstream]

* Remove unnecessary dask install [test-upstream]

* Remove strict channel policy to allow nightly dask installs

* Use nightly Dask packages in independent cluster test [test-upstream]

* Use channels argument to install Dask conda nightlies [test-upstream]

* Fix channel expression

* [test-upstream]

* Need to add mamba update command to get dask conda nightlies

* Use conda nightlies for dask-sql import test

* Add import test to upstream nightly tests

* [test-upstream]

* Make sure we have nightly Dask for import tests [test-upstream]

* Fix pyarrow / cloudpickle failures in cluster testing (#553)

* Explicitly install libstdcxx-ng in clusters

* Make pyarrow dependency consistent across testing

* Make libstdcxx-ng dep a min version

* Add cloudpickle to cluster dependencies

* cloudpickle must be in the scheduler environment

* Bump cloudpickle version

* Move cloudpickle install to workers

* Fix pyarrow constraint in cluster spec

* Use bash -l as default entrypoint for all jobs (#552)

* Constrain dask/distributed for release (#563)

* Unpin dask/distributed for development (#564)

* Unpin dask/distributed post release

* Remove dask/distributed version ceiling

* update dask-sphinx-theme (#567)

* Make sure scheduler has Dask nightlies in upstream cluster testing (#573)

* Make sure scheduler has Dask nightlies in upstream cluster testing

* empty commit to [test-upstream]

* Update gpuCI `RAPIDS_VER` to `22.08` (#565)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Modify test environment pinnings to cover minimum versions (#555)

* Remove black/isort deps as we prefer pre-commit

* Unpin all non python/jdk dependencies

* Minor package corrections for py3.9 jdk11 env

* Set min version constraints for all non-testing dependencies

* Pin all non-test deps for 3.8 testing

* Bump sklearn min version to 1.0.0

* Bump pyarrow min version to 1.0.1

* Fix pip notation for fugue

* Use unpinned deps for cluster testing for now

* Add fugue deps to environments, bump pandas to 1.0.2

* Add back antlr4 version ceiling

* Explicitly mark all fugue dependencies

* Alter test_analyze to avoid rtol

* Bump pandas to 1.0.5 to fix upstream numpy issues

* Alter datetime casting util to dodge panda casting failures

* Bump pandas to 1.1.0 for groupby dropna support

* Simplify string dtype check for get_supported_aggregations

* Add check_dtype=False back to test_group_by_nan

* Bump cluster to python 3.9

* Bump fastapi to 0.69.0, resolve remaining JDBC failures

* Typo - correct pandas version

* Generalize test_multi_case_when's dtype check

* Bump pandas to 1.1.1 to resolve flaky test failures

* Constrain mlflow for windows python 3.8 testing

* Selectors don't work for conda env files

* Problems seem to persist in 1.1.1, bump to 1.1.2

* Remove accidental debug changes

* [test-upstream]

* Use python 3.9 for upstream cluster testing [test-upstream]

* Updated missed pandas pinning

* Unconstrain mlflow to see if Windows failures persist

* Add min version for protobuf

* Bump pyarrow min version to allow for newer protobuf versions

* Don't move jar to local mvn repo (#579)

* Add max version constraint for `fugue` (#639)

* Remove antlr4-python3-runtime constraint from 3.9+ test envs

* Revert "Remove antlr4-python3-runtime constraint from 3.9+ test envs"

This reverts commit ef30656.

* Add max version constraint for fugue in 3.9+ envs

* Constrain Fugue in remaining env/setup files

* Clarify fugue constraint comments

* Add pinning back to python 3.8 jdk11 tests

* More reversions to python 3.8 jdk11 testing env

* Add environment file & documentation for GPU tests  (#633)

* Add gpuCI environment file

* Add documentation for GPU tests / environment

* Add GPU testing to docs page

* Validate UDF metadata (#641)

* initial

* improvements

* bugfixes

* Move UDF validation to registration, cache relevant info

Co-authored-by: Charles Blackmon-Luca <[email protected]>

* Set Dask-sql as the default Fugue Dask engine when installed (#640)

* Set Dask-sql as the default Fugue Dask engine when installed

* Set Dask-sql as the default Fugue Dask engine when installed

* Add max version constraint for `fugue` (#639)

* Remove antlr4-python3-runtime constraint from 3.9+ test envs

* Revert "Remove antlr4-python3-runtime constraint from 3.9+ test envs"

This reverts commit ef30656.

* Add max version constraint for fugue in 3.9+ envs

* Constrain Fugue in remaining env/setup files

* Clarify fugue constraint comments

* Add pinning back to python 3.8 jdk11 tests

* More reversions to python 3.8 jdk11 testing env

* update

* update

* update

* fix tests

* update tests

* update a few things

* update

* fix

* conda install fugue in testing envs

* Remove diff from features notebook

* Alter documentation to mention automatic registration of execution engine

* Expand FugueSQL notebook

* Don't manually close client in simple statement test

Co-authored-by: Charles Blackmon-Luca <[email protected]>

* Add Rust setup to upstream testing workflow

* Resolve style failures

* Bump fugue version in CI envs

* Add back scalar case for cast operation

* Resolve UDF failures

* Resolve UDF failures for windows

* Remove calcite-specific reinterpret

Co-authored-by: Richard (Rick) Zamora <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ayush Dattagupta <[email protected]>
Co-authored-by: rajagurnath <[email protected]>
Co-authored-by: Sarah Charlotte Johnson <[email protected]>
Co-authored-by: ksonj <[email protected]>
Co-authored-by: brandon-b-miller <[email protected]>
Co-authored-by: Han Wang <[email protected]>
@charlesbluca charlesbluca deleted the resolve-fugue-failures branch August 31, 2022 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants