Merged
245 commits
12da2f1
First pass at datafusion parsing
jdye64 Mar 2, 2022
c544794
updates
jdye64 Mar 8, 2022
7596144
updates
jdye64 Mar 8, 2022
1b17a6a
updates
jdye64 Mar 8, 2022
c5dbb96
DaskSchema implementation for Python in Rust
jdye64 Mar 9, 2022
5249808
updated mappings so that Python types map to PyArrow types which is t…
jdye64 Mar 9, 2022
b20704e
Add ability to add columns to an existing DaskTable
jdye64 Mar 9, 2022
dc98d7c
Add ability to tables to be added to the DaskSchema
jdye64 Mar 9, 2022
64d2688
Completion of _get_ral() function in dask-sql. Still does not actuall…
jdye64 Mar 9, 2022
5cf8eb0
Finished converting base class and DaskRelDataType and DaskRelDataTyp…
jdye64 Mar 9, 2022
5b4d169
Can make a very simple pass of a projection on a TableScan operation …
jdye64 Mar 9, 2022
02c257c
updates
jdye64 Mar 14, 2022
13cc875
Allow for the rough registration of Schemas to the DaskSQLContext
jdye64 Mar 15, 2022
ca410e4
pytest test_context.py working/checkpoint
jdye64 Mar 16, 2022
1eced89
all unit tests passing/checkpoint
jdye64 Mar 16, 2022
41ffe94
checkpoint
jdye64 Mar 16, 2022
45a568b
Update on test_select.py
jdye64 Mar 21, 2022
c66f0a3
Refactor setup.py
jdye64 Mar 23, 2022
172d3cc
Refactored Rust code to traverse the AST SQL parse tree
jdye64 Mar 24, 2022
55bf4c2
merge updates
jdye64 Mar 25, 2022
afeee32
Datafusion aggregate (#471)
jdye64 Apr 21, 2022
b695daa
Bump DataFusion version (#494)
andygrove Apr 21, 2022
914a48d
Basic DataFusion Select Functionality (#489)
jdye64 Apr 28, 2022
484951e
Allow for Cast parsing and logicalplan (#498)
jdye64 May 3, 2022
672821f
Minor code cleanup in row_type() (#504)
andygrove May 5, 2022
5932f35
Bump Rust version to 1.60 from 1.59 (#508)
jdye64 May 5, 2022
11b2db4
Improve code for getting column name from expression (#509)
andygrove May 9, 2022
1a04ee8
Update exceptions that are thrown (#507)
jdye64 May 10, 2022
346be12
add support for expr_to_field for Expr::Sort expressions (#515)
andygrove May 10, 2022
70d7fb8
reduce crate dependencies (#516)
andygrove May 10, 2022
75116de
Datafusion dsql explain (#511)
ayushdg May 12, 2022
64133e2
Port sort logic to the datafusion planner (#505)
ayushdg May 13, 2022
2c46de6
Add helper method to convert LogicalPlan to Python type (#522)
andygrove May 13, 2022
7030909
Support CASE WHEN and BETWEEN (#502)
jdye64 May 13, 2022
43f862d
Upgrade to DataFusion 8.0.0 (#533)
andygrove May 16, 2022
c1ebe53
Enable passing tests (#539)
jdye64 May 16, 2022
7272093
Datafusion crossjoin (#521)
jdye64 May 19, 2022
f7c57b3
Implement TryFrom for plans (#543)
andygrove May 19, 2022
e10b3f2
Support for LIMIT clause with DataFusion (#529)
jdye64 May 24, 2022
c3d8425
Support Joins using DataFusion planner/parser (#512)
jdye64 May 31, 2022
53f0d0b
Datafusion is not (#557)
jdye64 May 31, 2022
11dcf22
[REVIEW] Add support for `UNION` (#542)
galipremsagar Jun 1, 2022
41cb674
[REVIEW] Fix issue with duplicates in column renaming (#559)
galipremsagar Jun 1, 2022
8d2ac41
enable tests (#560)
galipremsagar Jun 1, 2022
0c30a7b
Add CODEOWNERS file (#562)
charlesbluca Jun 2, 2022
16282ed
Upgrade DataFusion version & support non-equijoin join conditions (#566)
andygrove Jun 6, 2022
2a2b5d1
Add ayushdg and galipremsagar to rust CODEOWNERS (#572)
charlesbluca Jun 7, 2022
d233b9d
Enable DataFusion CBO and introduce DaskSqlOptimizer (#558)
jdye64 Jun 7, 2022
89837d8
Only use the specific DataFusion crates that we need (#568)
andygrove Jun 8, 2022
a52dd7b
Fix some clippy warnings (#574)
andygrove Jun 14, 2022
230d726
Datafusion invalid projection (#571)
jdye64 Jun 14, 2022
453249e
Datafusion upstream merge (#576)
jdye64 Jun 17, 2022
cb1fe52
Datafusion filter (#581)
jdye64 Jun 22, 2022
00855e7
Table_scan column projection (#578)
ayushdg Jun 22, 2022
c9a23b3
Expose groupby agg configs to drop_duplicates (distinct) egg (#575)
ayushdg Jun 22, 2022
7d32f0f
Datafusion year & support for DaskSqlDialect (#585)
jdye64 Jun 23, 2022
2f289d1
Optimization rule to optimize out nulls for inner joins (#588)
jdye64 Jun 23, 2022
ba50d24
Push down null filters into TableScan (#595)
andygrove Jun 23, 2022
e77f339
Datafusion IndexError - Return fields from the lhs and rhs of a join …
jdye64 Jun 25, 2022
48acbea
Datafusion uncomment working filter tests (#601)
jdye64 Jun 25, 2022
e000032
Search all schemas when attempting to locate index by field name (#602)
jdye64 Jun 29, 2022
ac6f940
Fix join condition eval when joining on 3 or more columns (#603)
ayushdg Jun 29, 2022
8056edf
Add inList support (#604)
ayushdg Jun 30, 2022
2ccee5f
Enable Datafusion user defined functions UDFs (#605)
jdye64 Jul 8, 2022
bbeb05c
Datafusion empty relation (#611)
jdye64 Jul 11, 2022
33ecd4b
Add 'not like' rex operation (#615)
jdye64 Jul 11, 2022
8e4a3d1
Enable passing pytests and add simple '!=' operation mapping (#616)
jdye64 Jul 11, 2022
45ed48e
Fix bug when filtering on specific scalars. (#609)
ayushdg Jul 12, 2022
c0d05ac
Datafusion NULL & NOT NULL literals (#618)
jdye64 Jul 12, 2022
29e7bff
Fix the results from a subquery alias operation with optimizations en…
ayushdg Jul 14, 2022
f87aa36
Initial version of contributing guide (#600)
jdye64 Jul 14, 2022
d109af0
Add helper function for converting expression lists to Python (#631)
andygrove Jul 15, 2022
244884a
Plugins support multiply types (#636)
jdye64 Jul 18, 2022
71e2917
Consolidate limit/offset logic in partition func (#598)
charlesbluca Jul 19, 2022
47efa68
Datafusion version bump (#628)
jdye64 Jul 20, 2022
92e99a9
Expand getOperands support to cover all currently available Expr type…
jdye64 Jul 25, 2022
7cf2661
Introduce Inverse Rex Operation (#643)
jdye64 Jul 25, 2022
d3d8ce5
Remove code segment that was causing double the amount of columns to …
jdye64 Jul 25, 2022
2e90c19
Include Columns in Empty DataFrame (#645)
jdye64 Jul 25, 2022
5ac11ae
Bump setuptools-rust from 1.1.1 -> 1.4.1 (#646)
jdye64 Jul 26, 2022
2a7b482
Update with Issue-3002 changes to DataFusion
jdye64 Aug 1, 2022
17afd70
Update to point to Apache
jdye64 Aug 2, 2022
dcd6c83
Merge `main` into `datafusion-sql-planner` (#654)
charlesbluca Aug 2, 2022
475b1b1
Merge branch 'main' of github.com:dask-contrib/dask-sql into datafusi…
charlesbluca Aug 2, 2022
f948429
Port window logic to datafusion (#545)
ayushdg Aug 2, 2022
0dbd48d
Bump DataFusion version
jdye64 Aug 3, 2022
9cd751d
Add support for CREATE TABLE AS
charlesbluca Aug 3, 2022
01314d2
Actually commit the create_memory_table.rs addition
charlesbluca Aug 3, 2022
31cb060
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into i…
jdye64 Aug 3, 2022
e5e5db8
change 3.8 python environment to match 3.9 and 3.10
jdye64 Aug 3, 2022
70045b6
Update github actions to use Rust nightly toolchain
jdye64 Aug 3, 2022
81f230b
Add TODO for dialect-based if_not_exists handling
charlesbluca Aug 3, 2022
342ec48
Move shared logic between CTAS plugin and Context.sql to _compute_tab…
charlesbluca Aug 3, 2022
1977318
Add support for DROP TABLE
charlesbluca Aug 3, 2022
b305bb2
Adjust conda build file to use Rust 1.62.1
jdye64 Aug 4, 2022
3ff6f58
conda syntax adjustment
jdye64 Aug 4, 2022
4a50fc0
Conda testing
jdye64 Aug 4, 2022
d9877fd
Switch out Python-facing panics for PyErr
charlesbluca Aug 4, 2022
666adba
Correct return_futures default value
charlesbluca Aug 4, 2022
604f482
Reimplement original persistance behavior with a TODO
charlesbluca Aug 4, 2022
7a14b89
Replace format! calls with string clones
charlesbluca Aug 4, 2022
7b2509a
Merge branch 'df-create-table-as' into df-drop-table
charlesbluca Aug 4, 2022
f579633
Replace format! call with clone()
charlesbluca Aug 4, 2022
6045176
`COT` function (#657)
sarahyurick Aug 4, 2022
8a6802a
Use local errors/exceptions wherever possible
charlesbluca Aug 4, 2022
7ff73a6
Replace python-facing panics with PyErrs
charlesbluca Aug 4, 2022
5dfb254
Reduce repetition in get_expr_type
charlesbluca Aug 4, 2022
feefa4b
Math functions (#660)
sarahyurick Aug 4, 2022
dfaf58a
Clean up match statements in expressions
charlesbluca Aug 8, 2022
34ada7a
Clean up match statements
charlesbluca Aug 8, 2022
b66143a
Merge branch 'df-create-table-as' into df-drop-table
charlesbluca Aug 8, 2022
2be2012
use datafusion 85d53634c4d26a3b9e545878e377860c14d01d7d
andygrove Aug 8, 2022
fad2e03
Address review comments
jdye64 Aug 8, 2022
24afec8
Merge pull request #662 from charlesbluca/rust-error-string-cleanup
jdye64 Aug 8, 2022
2866fab
Merge pull request #653 from jdye64/invalid-crossjoin-in-plan
jdye64 Aug 8, 2022
c830cb5
Merge pull request #656 from charlesbluca/df-create-table-as
jdye64 Aug 8, 2022
91fd37c
Merge branch 'datafusion-sql-planner' of github.com:dask-contrib/dask…
charlesbluca Aug 8, 2022
6e7089c
Merge branch 'datafusion-sql-planner' into resolve-merge-conflicts
charlesbluca Aug 8, 2022
2b69f8d
Merge pull request #669 from dask-contrib/resolve-merge-conflicts
ayushdg Aug 8, 2022
40073a3
Datafusion expand scalarvalue catchall (#638)
jdye64 Aug 8, 2022
e3dbefa
Merge pull request #670 from dask-contrib/main
charlesbluca Aug 8, 2022
4b291ed
Merge pull request #658 from charlesbluca/df-drop-table
jdye64 Aug 8, 2022
accde51
Remove un-necessary sqlparser dependency and duplicate Dialect defini…
jdye64 Aug 9, 2022
b0c56c9
Point to my upstream fork
jdye64 Aug 9, 2022
b6ba05c
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into d…
jdye64 Aug 9, 2022
700c780
Fix faulty closure for UDF return types
charlesbluca Aug 9, 2022
5d917ac
updates
jdye64 Aug 9, 2022
3f3e054
updates
jdye64 Aug 10, 2022
4cf3aa9
updates
jdye64 Aug 10, 2022
a0f7ba6
Skip remaining failing test
charlesbluca Aug 10, 2022
d9551c1
Merge pull request #672 from charlesbluca/df-resolve-udf-failures
charlesbluca Aug 10, 2022
a8c2a53
introduce DaskParser
jdye64 Aug 11, 2022
29de442
Refactoring updates
jdye64 Aug 11, 2022
5afb461
Uncomment skipped rex pytests (#661)
ayushdg Aug 11, 2022
8aa4ceb
Merge branch 'main' of github.com:dask-contrib/dask-sql into merge-up…
charlesbluca Aug 11, 2022
66c0fea
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into l…
andygrove Aug 11, 2022
af26e58
bump datafusion version again, use arrow 20.0
andygrove Aug 11, 2022
9e53287
Merge pull request #677 from dask-contrib/merge-upstream-main
charlesbluca Aug 11, 2022
a10ebdd
fix some clippy issues (#679)
andygrove Aug 11, 2022
42c81b3
Update to use DFStatement
jdye64 Aug 11, 2022
c8259b9
Merge pull request #667 from andygrove/latest-df
jdye64 Aug 11, 2022
cb2288f
merge upstream
jdye64 Aug 11, 2022
e5eb4b6
updates
jdye64 Aug 12, 2022
9d4b9d1
Merge pull request #685 from dask-contrib/main
charlesbluca Aug 12, 2022
83d6352
updates
jdye64 Aug 12, 2022
08883a8
use latest datafusion
andygrove Aug 15, 2022
ccee927
test for andy
jdye64 Aug 16, 2022
de4c73b
Configure clippy to error on warnings
charlesbluca Aug 16, 2022
0c449c5
Updates for parsing CreateModel
jdye64 Aug 16, 2022
2910f2c
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into d…
jdye64 Aug 16, 2022
0f8b030
Refactoring for crete model parsing
jdye64 Aug 16, 2022
2f5d720
First pass at resolving clippy warnings
charlesbluca Aug 16, 2022
3ac3b59
Fix remaining warnings
charlesbluca Aug 16, 2022
9b83d77
Merge pull request #692 from charlesbluca/clippy-enforce-warnings
jdye64 Aug 16, 2022
9812f75
Add drop model parsing
jdye64 Aug 16, 2022
c1e446f
Merge with upstream/datafusion-sql-planner
jdye64 Aug 16, 2022
c9c6532
clippy warnings
jdye64 Aug 16, 2022
ec7f2a6
Merge pull request #691 from dask-contrib/main
charlesbluca Aug 16, 2022
a148f97
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into d…
jdye64 Aug 16, 2022
e3723d0
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into d…
andygrove Aug 16, 2022
9155194
remove unused import
andygrove Aug 16, 2022
4d25f0c
Updates to schema
jdye64 Aug 16, 2022
81d919e
Merge with upstream/datafusion-sql-planner
jdye64 Aug 16, 2022
3980ff0
Remove pytest-html results and added folder to .gitignore to prevent …
jdye64 Aug 16, 2022
670c1da
[REVIEW] - suggestions for handling panic, and if let statements
jdye64 Aug 16, 2022
bbaf9b3
Remove pytest-html files that should not have been added
jdye64 Aug 16, 2022
e5c85da
Peek for create token to prevent character from being eagerly consume…
jdye64 Aug 16, 2022
54386ad
Add optimizer rules to translate subqueries to joins (#680)
andygrove Aug 16, 2022
dd8399b
Merge pull request #689 from andygrove/df-upgrade-c0b4ba
jdye64 Aug 16, 2022
c2f1087
Merge remote-tracking branch 'upstream/datafusion-sql-planner' into d…
jdye64 Aug 16, 2022
903a770
clippy ignore this function since something is wrong with clippy in t…
jdye64 Aug 16, 2022
110f11c
refactor to use String to make pyo3 happy
jdye64 Aug 16, 2022
131d822
Add STDDEV, STDDEV_SAMP, and STDDEV_POP (#629)
ChrisJar Aug 16, 2022
4db3530
Merge with upstream/datafusion-sql-planner
jdye64 Aug 16, 2022
03045fc
Update drop to cover other conditions
jdye64 Aug 17, 2022
d7d95b7
Merge pull request #693 from jdye64/datafusion-predict-model
jdye64 Aug 17, 2022
86e1c79
Merge pull request #695 from jdye64/datafusion-drop-model
jdye64 Aug 17, 2022
a22cdb9
Support for parsing [or replace] with create [or replace] model (#700)
ayushdg Aug 18, 2022
9a537a8
Parsing logic for SHOW SCHEMAS (#697)
jdye64 Aug 18, 2022
26a0cbe
Support for parsing SHOW TABLES FROM grammar (#699)
jdye64 Aug 22, 2022
3be7323
Merge pull request #708 from dask-contrib/main
charlesbluca Aug 22, 2022
902c694
Enable passing pytests (#709)
jdye64 Aug 22, 2022
b324cd0
Merge branch 'datafusion-sql-planner' into merge-upstream-main
charlesbluca Aug 22, 2022
1339ea3
Introduce 'schema' to the DaskTable instance and modify context.fqn t…
jdye64 Aug 23, 2022
d3529b8
Merge pull request #711 from dask-contrib/merge-upstream-main
charlesbluca Aug 23, 2022
8d40822
Use `compiler` function in nightly recipe, pin to Rust 1.62.1 (#687)
charlesbluca Aug 23, 2022
a10f703
Add test queries to gpuCI checks (#650)
charlesbluca Aug 23, 2022
89b7a3d
Support for DISTRIBUTE BY (#715)
jdye64 Aug 23, 2022
fb26dbd
Datafusion create table with (#714)
jdye64 Aug 23, 2022
583fefa
Make comment for parse_create_table more accurate
charlesbluca Aug 24, 2022
ae3a4ad
Revert "Make comment for parse_create_table more accurate"
charlesbluca Aug 24, 2022
3c13afa
Bump DataFusion to rev 076b42 (#720)
andygrove Aug 24, 2022
f4a7348
[DF] Add support for `CREATE [OR REPLACE] TABLE [IF NOT EXISTS] WITH`…
charlesbluca Aug 24, 2022
b78ea3c
Stop overwriting aggregations on same column (#675)
ChrisJar Aug 24, 2022
6c4c5d3
[DF] Add `TypeCoercion` optimizer rule (#723)
andygrove Aug 25, 2022
c24f639
Support for SHOW COLUMNS syntax (#721)
ayushdg Aug 25, 2022
03f998b
Implment PREDICT parsing and python wiring (#722)
jdye64 Aug 26, 2022
d7d4363
Support all boolean operations (#719)
sarahyurick Aug 26, 2022
f079472
Resolve issue that crept in during code merge and caused build issues…
jdye64 Aug 26, 2022
ca504f4
[DF] Add handling for overloaded UDFs (#682)
charlesbluca Aug 26, 2022
32b4f2a
[DF] Minor quality of life updates to test_queries.py (#730)
charlesbluca Aug 30, 2022
b531341
[DF] Fix `PyExpr.index` bug where it returns `Ok(0)` instead of an `E…
andygrove Aug 30, 2022
2368109
Add Cargo.lock and bump DataFusion rev (#734)
andygrove Aug 30, 2022
dd77334
[DF] Implement `ANALYZE TABLE` (#733)
charlesbluca Aug 30, 2022
8480f06
Merge pull request #735 from dask-contrib/main
charlesbluca Aug 31, 2022
0c289b9
Switch out java dependencies for rust (#737)
charlesbluca Aug 31, 2022
eec1244
{CREATE | USE | DROP} Schema support (#727)
jdye64 Aug 31, 2022
ef8c1d7
Test function `test_aggregate_function` (#738)
sarahyurick Aug 31, 2022
0887e8d
Uncomment more test_model pytests (#728)
ChrisJar Aug 31, 2022
9c534f2
Unskip passing postgres test (#739)
jdye64 Sep 1, 2022
d091b30
Publish datafusion nightlies under dev_datafusion label (#729)
charlesbluca Sep 1, 2022
bbd4c89
Use latest DataFusion (#742)
andygrove Sep 2, 2022
6b52591
[DF] Resolve `test_aggregations` and `test_group_by_all` (#743)
charlesbluca Sep 3, 2022
e1d0538
Merge branch 'datafusion-sql-planner' into merge-upstream-main
charlesbluca Sep 6, 2022
081a102
Merge pull request #745 from dask-contrib/merge-upstream-main
charlesbluca Sep 6, 2022
9273e17
Upgrade to latest DataFusion (#744)
andygrove Sep 9, 2022
27a8c35
Uncomment passing pytests (#750)
ayushdg Sep 9, 2022
d564d0b
[DF] Update DataFusion to pick up SQL support for LIKE, ILIKE, SIMILA…
andygrove Sep 13, 2022
ccd8dbc
Use DataFusion 12.0.0 RC1 (#755)
andygrove Sep 13, 2022
4a133f2
Merge branch 'datafusion-sql-planner' into merge-upstream-main
charlesbluca Sep 13, 2022
329489f
[DF] Optimize away `COUNT DISTINCT` aggregate operations - eliminate_…
andygrove Sep 13, 2022
4d44ffb
Resolve xpassing queries
charlesbluca Sep 14, 2022
87a6681
Merge pull request #757 from dask-contrib/merge-upstream-main
charlesbluca Sep 14, 2022
7ec9812
Upgrade pyo (#762)
andygrove Sep 14, 2022
5da63a9
Merge branch 'datafusion-sql-planner' into merge-upstream-main
charlesbluca Sep 15, 2022
862b901
Merge pull request #764 from dask-contrib/merge-upstream-main
charlesbluca Sep 15, 2022
0577f8e
[DF] Switch back to architectured builds (#765)
charlesbluca Sep 15, 2022
6676574
Remove python constraint (#766)
charlesbluca Sep 15, 2022
da1b31f
[DF] Generalize `CREATE | PREDICT MODEL` to accept non-native `SELECT…
charlesbluca Sep 19, 2022
312b01b
Use DataFusion 12.0.0 (#767)
andygrove Sep 19, 2022
fc1507b
[DF] Use correct schema in TableProvider (#769)
andygrove Sep 19, 2022
a8241b8
Update docs (#768)
sarahyurick Sep 19, 2022
4a0fbc5
[DF] Add support for switching schema in DaskSqlContext (#770)
andygrove Sep 19, 2022
e683bab
Merge pull request #775 from dask-contrib/main
charlesbluca Sep 20, 2022
3cab31c
c.ipython_magic fix for Jupyter Lab (#772)
sarahyurick Sep 20, 2022
150e374
Remove PyPI release workflow
charlesbluca Sep 20, 2022
13db840
Merge pull request #776 from dask-contrib/remove-pypi-release
charlesbluca Sep 20, 2022
509a091
Merge pull request #778 from dask-contrib/main
charlesbluca Sep 20, 2022
243b247
[DF] Support complex queries with multiple DISTINCT aggregates (#759)
andygrove Sep 20, 2022
528108c
[DF] WIP: Upgrade to DataFusion rev 8ea59a (#779)
andygrove Sep 21, 2022
2af0b06
[DF] Add support for filtered groupby aggregations (#760)
charlesbluca Sep 21, 2022
0570b4b
[DF] Fix regressions in `EliminateAggDistinct`, run `cargo test` in C…
andygrove Sep 21, 2022
3c000b6
[DF] Fix remaining regressions in optimizer rule (#784)
andygrove Sep 21, 2022
8470413
Resolve `test_stats_aggregation` (#746)
sarahyurick Sep 21, 2022
033d2f4
[DF] Enable some distinct agg tests (#786)
andygrove Sep 21, 2022
5 changes: 5 additions & 0 deletions .github/CODEOWNERS
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# global codeowners
* @ayushdg @charlesbluca @galipremsagar

# rust codeowners
dask_planner/ @ayushdg @galipremsagar @jdye64
17 changes: 17 additions & 0 deletions .github/actions/setup-builder/action.yaml
@@ -0,0 +1,17 @@
name: Prepare Rust Builder
description: 'Prepare Rust Build Environment'
inputs:
  rust-version:
    description: 'version of rust to install (e.g. stable)'
    required: true
    default: 'stable'
runs:
  using: "composite"
  steps:
    - name: Setup Rust toolchain
      shell: bash
      run: |
        echo "Installing ${{ inputs.rust-version }}"
        rustup toolchain install ${{ inputs.rust-version }}
        rustup default ${{ inputs.rust-version }}
        rustup component add rustfmt
15 changes: 13 additions & 2 deletions .github/workflows/conda.yml
@@ -3,7 +3,18 @@ on:
  push:
    branches:
      - main
      - datafusion-sql-planner
  pull_request:
    paths:
      - setup.py
      - dask_planner/Cargo.toml
      - dask_planner/Cargo.lock
      - dask_planner/pyproject.toml
      - dask_planner/rust-toolchain.toml
      - continuous_integration/recipe/**
      - .github/workflows/conda.yml
  schedule:
    - cron: '0 0 * * 0'

# When this workflow is queued, automatically cancel any previous running
# or pending jobs from the same branch
@@ -49,12 +60,12 @@ jobs:
      - name: Upload conda package
        if: |
          github.event_name == 'push'
          && github.ref == 'refs/heads/main'
          && github.repository == 'dask-contrib/dask-sql'
        env:
          ANACONDA_API_TOKEN: ${{ secrets.DASK_CONDA_TOKEN }}
          LABEL: ${{ github.ref == 'refs/heads/datafusion-sql-planner' && 'dev_datafusion' || 'dev' }}
        run: |
          # install anaconda for upload
          mamba install anaconda-client

          anaconda upload --label dev noarch/*.tar.bz2
          anaconda upload --label $LABEL linux-64/*.tar.bz2
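
Reviewer note: the `LABEL` expression above uses the `cond && a || b` idiom, which GitHub Actions expressions use in place of a ternary operator. A minimal shell sketch of the same selection follows, for checking locally which label a given ref would publish under. `pick_label` is a hypothetical helper introduced only for illustration and is not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): mirrors
#   github.ref == 'refs/heads/datafusion-sql-planner' && 'dev_datafusion' || 'dev'
pick_label() {
  local ref="$1"
  if [ "$ref" = "refs/heads/datafusion-sql-planner" ]; then
    echo "dev_datafusion"
  else
    echo "dev"
  fi
}

# Print the label each ref would resolve to.
pick_label "refs/heads/datafusion-sql-planner"
pick_label "refs/heads/main"
```

Any ref other than the `datafusion-sql-planner` branch falls through to `dev`, which is the behaviour the `|| 'dev'` branch of the expression provides.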
72 changes: 72 additions & 0 deletions .github/workflows/rust.yml
@@ -0,0 +1,72 @@
name: Rust

on:
  # always trigger on PR
  push:
  pull_request:
  # manual trigger
  # https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
  workflow_dispatch:

jobs:
  # Check crate compiles
  linux-build-lib:
    name: cargo check
    runs-on: ubuntu-latest
    container:
      image: amd64/rust
      env:
        # Disable full debug symbol generation to speed up CI build and keep memory down
        # "1" means line tables only, which is useful for panic tracebacks.
        RUSTFLAGS: "-C debuginfo=1"
    steps:
      - uses: actions/checkout@v3
      - name: Cache Cargo
        uses: actions/cache@v3
        with:
          # these represent dependencies downloaded by cargo
          # and thus do not depend on the OS, arch nor rust version.
          path: /github/home/.cargo
          key: cargo-cache-
      - name: Setup Rust toolchain
        uses: ./.github/actions/setup-builder
        with:
          rust-version: stable
      - name: Check workspace in debug mode
        run: |
          cd dask_planner
          cargo check
      - name: Check workspace in release mode
        run: |
          cd dask_planner
          cargo check --release

  # test the crate
  linux-test:
    name: cargo test (amd64)
    needs: [linux-build-lib]
    runs-on: ubuntu-latest
    container:
      image: amd64/rust
      env:
        # Disable full debug symbol generation to speed up CI build and keep memory down
        # "1" means line tables only, which is useful for panic tracebacks.
        RUSTFLAGS: "-C debuginfo=1"
    steps:
      - uses: actions/checkout@v3
        with:
          submodules: true
      - name: Cache Cargo
        uses: actions/cache@v3
        with:
          path: /github/home/.cargo
          # this key equals the ones on `linux-build-lib` for re-use
          key: cargo-cache-
      - name: Setup Rust toolchain
        uses: ./.github/actions/setup-builder
        with:
          rust-version: stable
      - name: Run tests
        run: |
          cd dask_planner
          cargo test
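
Reviewer note: the two jobs above boil down to three cargo invocations run inside `dask_planner` with `RUSTFLAGS="-C debuginfo=1"`. The sketch below prints those commands so they can be reproduced outside CI. `ci_rust_commands` is a hypothetical helper name, and the default directory assumes the script is run from the repository root.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): prints the cargo commands that the
# rust.yml jobs run, in the same order (check, check --release, test).
ci_rust_commands() {
  local crate_dir="${1:-dask_planner}"
  local step
  for step in "cargo check" "cargo check --release" "cargo test"; do
    echo "(cd ${crate_dir} && RUSTFLAGS='-C debuginfo=1' ${step})"
  done
}

# Print the commands only; nothing is executed here.
ci_rust_commands

# To actually run them (requires a Rust toolchain on PATH):
#   ci_rust_commands | bash -e
```

Printing rather than executing keeps the helper safe to run on a machine without cargo; piping to `bash -e` stops at the first failing step, much as the CI job would.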
98 changes: 31 additions & 67 deletions .github/workflows/test-upstream.yml
@@ -10,58 +10,24 @@ defaults:
    shell: bash -l {0}

jobs:
  build:
    # This build step should be similar to the deploy build, to make sure we actually test
    # the future deployable
    name: Build the jar on ubuntu
    runs-on: ubuntu-latest
    if: github.repository == 'dask-contrib/dask-sql'
    steps:
      - uses: actions/checkout@v2
      - name: Cache local Maven repository
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
      - name: Set up Python
        uses: conda-incubator/setup-miniconda@v2
        with:
          miniforge-variant: Mambaforge
          use-mamba: true
          python-version: "3.8"
          channel-priority: strict
          activate-environment: dask-sql
          environment-file: continuous_integration/environment-3.8-jdk11-dev.yaml
      - name: Install dependencies and build the jar
        run: |
          python setup.py build_ext
      - name: Upload the jar
        uses: actions/upload-artifact@v1
        with:
          name: jar
          path: dask_sql/jar/DaskSQL.jar

  test-dev:
    name: "Test upstream dev (${{ matrix.os }}, java: ${{ matrix.java }}, python: ${{ matrix.python }})"
    needs: build
    name: "Test upstream dev (${{ matrix.os }}, python: ${{ matrix.python }})"
    runs-on: ${{ matrix.os }}
    if: github.repository == 'dask-contrib/dask-sql'
    env:
      CONDA_FILE: continuous_integration/environment-${{ matrix.python }}-jdk${{ matrix.java }}-dev.yaml
      CONDA_FILE: continuous_integration/environment-${{ matrix.python }}-dev.yaml
    defaults:
      run:
        shell: bash -l {0}
    strategy:
      fail-fast: false
      matrix:
        java: [8, 11]
        os: [ubuntu-latest, windows-latest]
        python: ["3.8", "3.9", "3.10"]
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0 # Fetch all history for all branches and tags.
      - name: Cache local Maven repository
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-v1-jdk${{ matrix.java }}-${{ hashFiles('**/pom.xml') }}
      - name: Set up Python
        uses: conda-incubator/setup-miniconda@v2
        with:
@@ -72,21 +38,21 @@ jobs:
          channels: dask/label/dev,conda-forge,nodefaults
          activate-environment: dask-sql
          environment-file: ${{ env.CONDA_FILE }}
      - name: Download the pre-build jar
        uses: actions/download-artifact@v1
      - name: Setup Rust Toolchain
        uses: actions-rs/toolchain@v1
        id: rust-toolchain
        with:
          name: jar
          path: dask_sql/jar/
          toolchain: stable
          override: true
      - name: Build the Rust DataFusion bindings
        run: |
          python setup.py build install
      - name: Install hive testing dependencies for Linux
        if: matrix.os == 'ubuntu-latest'
        run: |
          mamba install -c conda-forge sasl>=0.3.1
          docker pull bde2020/hive:2.3.2-postgresql-metastore
          docker pull bde2020/hive-metastore-postgresql:2.3.0
      - name: Set proper JAVA_HOME for Windows
        if: matrix.os == 'windows-latest'
        run: |
          echo "JAVA_HOME=${{ env.CONDA }}\envs\dask-sql\Library" >> $GITHUB_ENV
      - name: Install upstream dev Dask / dask-ml
        run: |
          mamba update dask
@@ -101,11 +67,6 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Cache local Maven repository
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
      - name: Set up Python
        uses: conda-incubator/setup-miniconda@v2
        with:
@@ -115,12 +76,16 @@ jobs:
          channel-priority: strict
          channels: dask/label/dev,conda-forge,nodefaults
          activate-environment: dask-sql
          environment-file: continuous_integration/environment-3.9-jdk11-dev.yaml
      - name: Download the pre-build jar
        uses: actions/download-artifact@v1
          environment-file: continuous_integration/environment-3.9-dev.yaml
      - name: Setup Rust Toolchain
        uses: actions-rs/toolchain@v1
        id: rust-toolchain
        with:
          name: jar
          path: dask_sql/jar/
          toolchain: stable
          override: true
      - name: Build the Rust DataFusion bindings
        run: |
          python setup.py build install
      - name: Install cluster dependencies
        run: |
          mamba install python-blosc lz4 -c conda-forge
@@ -151,23 +116,22 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Cache local Maven repository
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
      - name: Set up Python
        uses: conda-incubator/setup-miniconda@v2
        with:
          python-version: "3.8"
          mamba-version: "*"
          channels: dask/label/dev,conda-forge,nodefaults
          channel-priority: strict
      - name: Download the pre-build jar
        uses: actions/download-artifact@v1
      - name: Setup Rust Toolchain
        uses: actions-rs/toolchain@v1
        id: rust-toolchain
        with:
          name: jar
          path: dask_sql/jar/
          toolchain: stable
          override: true
      - name: Build the Rust DataFusion bindings
        run: |
          python setup.py build install
      - name: Install upstream dev Dask / dask-ml
        if: needs.detect-ci-trigger.outputs.triggered == 'true'
        run: |
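
Reviewer note: this page lost the diff's +/- markers, so the sketch below rests on an assumption: in the `test-dev` job of test-upstream.yml, the lines that mention `java` or `jdk` are the removed side, leaving a matrix of `os` by `python` only. Under that assumption, the loop enumerates the remaining combinations and the `CONDA_FILE` each one resolves to. `matrix_conda_files` is a hypothetical helper, not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR). Assumes the `java` axis was removed and
# CONDA_FILE is continuous_integration/environment-<python>-dev.yaml.
matrix_conda_files() {
  local os python
  for os in ubuntu-latest windows-latest; do
    for python in 3.8 3.9 3.10; do
      echo "${os} continuous_integration/environment-${python}-dev.yaml"
    done
  done
}

# One line per matrix job: "<os> <environment file>".
matrix_conda_files
```

With two operating systems and three Python versions this lists six jobs. If the old matrix also crossed these with `java: [8, 11]`, it would have had twelve.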