[WIP]: Datapipe ml opts: added dynamic index filtering #308

Open: wants to merge 51 commits into base: master

Commits (51)
809e812
Merge branch 'fix-join-of-aux-table' into fix-asc-desc-merge-fix-join…
bobokvsky Sep 26, 2023
912c19b
Merge branch 'add-kwargs-to-dbconn' into fix-asc-desc-merge-fix-join-…
bobokvsky Sep 26, 2023
17a6213
Merge branch 'add-kwargs-to-dbconn' into fix-asc-desc-merge-fix-join-…
bobokvsky Sep 26, 2023
effbc5a
add executor_config to DatatableBatchTransform
bobokvsky Oct 6, 2023
17b2d4d
Merge branch 'fix-join-of-aux-table' into datapipe-ml-opts
bobokvsky Oct 19, 2023
295eb89
refactor filters as list of LabelDict
bobokvsky Oct 23, 2023
43ad666
added Filters to str as some table support
bobokvsky Oct 23, 2023
4064147
added filters to cli
bobokvsky Oct 23, 2023
7daaae0
fix filter in transform tables
bobokvsky Oct 23, 2023
fdde324
fix or_ -> and_
bobokvsky Oct 23, 2023
abf3725
fix or_ -> and_
bobokvsky Oct 23, 2023
0df0f22
*
bobokvsky Nov 13, 2023
e7f9578
Merge branch 'v0.13' into datapipe-ml-opts
bobokvsky Jan 22, 2024
e4db5ea
fix delete_stale
bobokvsky Feb 1, 2024
9997d04
fix suffix problem
bobokvsky Apr 1, 2024
55849d0
fix bug when reading multiply suffixes
bobokvsky Jul 15, 2024
208ddf8
fix2
bobokvsky Jul 15, 2024
f46cc66
Merge branch 'master' into datapipe-ml-opts
bobokvsky Aug 12, 2024
b579061
*
bobokvsky Aug 12, 2024
3dc9de4
WIPg
bobokvsky Aug 14, 2024
526b1ab
fix typing
bobokvsky Aug 14, 2024
50c8f9f
mypy fixs + add IndexDF support
bobokvsky Aug 16, 2024
0008bf9
add tests, part 1
bobokvsky Aug 16, 2024
181cf40
fix tests
bobokvsky Aug 19, 2024
a775700
fix mypyg
bobokvsky Aug 19, 2024
fab83ca
sql filters change
bobokvsky Aug 19, 2024
a458dce
fix tests
bobokvsky Aug 19, 2024
9ec0675
fix test
bobokvsky Aug 19, 2024
220e344
fix tests
bobokvsky Aug 19, 2024
c4b97fb
fix filedir
bobokvsky Aug 19, 2024
18d3f93
revert changes
bobokvsky Aug 19, 2024
ecc68fd
refactoring filters
bobokvsky Aug 19, 2024
0838d0b
rename function
bobokvsky Aug 19, 2024
d2045ad
fix tests
bobokvsky Aug 19, 2024
7310152
rm print
bobokvsky Aug 19, 2024
267ca06
fix tests
bobokvsky Aug 19, 2024
a61aedb
add tests examples
bobokvsky Aug 19, 2024
38ba3a8
fix tests
bobokvsky Aug 19, 2024
255756c
Merge remote-tracking branch 'origin/master' into datapipe-ml-opts
elephantum Aug 20, 2024
d1f6a99
add test_complex_transform_with_filters2
bobokvsky Aug 20, 2024
586afc0
fix tests
bobokvsky Aug 20, 2024
33b482d
fix tests, added new ValueError
bobokvsky Aug 20, 2024
bf69bf8
add new tests
bobokvsky Aug 20, 2024
5cb433e
*
elephantum Aug 22, 2024
4c98525
Merge branch 'datapipe-ml-opts' of github.com:epoch8/datapipe into da…
elephantum Aug 22, 2024
195c808
Merge branch 'master' into datapipe-ml-opts-merge-v0.14.1-alpha.1
bobokvsky Sep 5, 2024
17cffbb
fix tests
bobokvsky Sep 5, 2024
d95b2af
fix mypy
bobokvsky Sep 5, 2024
97dd2d2
fix tests
bobokvsky Sep 6, 2024
b1de4b8
*
bobokvsky Sep 6, 2024
8055860
removed Keys from filters must be in transform_keys error
bobokvsky Sep 6, 2024
fix mypy
bobokvsky committed Sep 5, 2024

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
commit d95b2afe6791c6c1f642107a4fbddcf3b0e25a9c
8 changes: 4 additions & 4 deletions datapipe/meta/sql_meta.py
@@ -389,8 +389,8 @@ def get_stale_idx(
                 )
             )

-        sql = sql_apply_runconfig_filters_to_subquery(
-            sql, self.primary_keys, run_config
+        sql = sql_apply_runconfig_filters(
+            sql, self.sql_table, self.primary_keys, run_config
         )

         with self.dbconn.con.begin() as con:
@@ -649,8 +649,8 @@ def mark_all_rows_unprocessed(
             .where(self.sql_table.c.is_success == True)
         )

-        sql = sql_apply_runconfig_filters_to_subquery(
-            update_sql, self.primary_keys, run_config
+        sql = sql_apply_runconfig_filters(
+            update_sql, self.sql_table, self.primary_keys, run_config
         )

         # execute
7 changes: 4 additions & 3 deletions datapipe/sql_util.py
@@ -1,5 +1,5 @@
 from collections import defaultdict
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, cast

 import pandas as pd
 from sqlalchemy import Column, Integer, String, Table, column, tuple_
@@ -43,8 +43,9 @@ def sql_apply_runconfig_filters(
 ) -> Any:
     if run_config is not None:
         filters_idx = pd.DataFrame(run_config.filters)
-        keys = [key for key in table.c if key in keys]
-        sql = sql_apply_idx_filter_to_table(sql, table, keys, filters_idx)
+        primary_keys = [key for key in keys if key in table.c]
+        if len(filters_idx) > 0 and len(keys) > 0:
+            sql = sql_apply_idx_filter_to_table(sql, table, primary_keys, cast(IndexDF, filters_idx))

     return sql
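
Reviewer note: the idea behind this change seems to be "keep only the requested keys that exist as columns of the table, and skip filtering when there is nothing to filter on". Below is a minimal, library-free sketch of that idea using the standard-library `sqlite3` module instead of SQLAlchemy and pandas. Everything in it is hypothetical and not part of datapipe: the `build_filter_clause` helper, the `meta` table, and the assumption that each filter dict carries a value for every usable key. One difference is deliberate: the sketch guards on the filtered key list, whereas the diff checks `len(keys)` while passing `primary_keys` to the call, which may or may not be intended.

```python
import sqlite3
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical alias, echoing the "list of LabelDict" wording in the commit log.
LabelDict = Dict[str, Any]


def build_filter_clause(
    filters: List[LabelDict],
    keys: Sequence[str],
    table_columns: Sequence[str],
) -> Tuple[str, List[Any]]:
    """Return a WHERE fragment and its parameters, or ("", []) if nothing applies."""
    # Keep only requested keys that are real columns, preserving the order of `keys`.
    usable_keys = [key for key in keys if key in table_columns]
    if not filters or not usable_keys:
        return "", []

    groups: List[str] = []
    params: List[Any] = []
    for flt in filters:
        # Assumption: every filter dict has a value for each usable key.
        # Column names come from `table_columns`, so only values are parameterized.
        groups.append("(" + " AND ".join(f"{key} = ?" for key in usable_keys) + ")")
        params.extend(flt[key] for key in usable_keys)

    # A row matches if it matches any one of the filter dicts.
    return " OR ".join(groups), params


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE meta (item_id TEXT, part INTEGER, is_success INTEGER)")
con.executemany(
    "INSERT INTO meta VALUES (?, ?, ?)",
    [("a", 1, 1), ("a", 2, 1), ("b", 1, 1)],
)

# Column names as reported by SQLite (second field of each table_info row).
columns = [row[1] for row in con.execute("PRAGMA table_info(meta)")]

clause, params = build_filter_clause(
    [{"item_id": "a", "part": 2}],
    ["item_id", "part", "missing"],  # "missing" is not a column and is dropped
    columns,
)

sql = "SELECT item_id, part FROM meta"
if clause:
    sql += " WHERE " + clause
rows = con.execute(sql, params).fetchall()
```

With an empty filter list, or with keys that match no column, the helper returns an empty clause and the query runs unfiltered, which mirrors the early-exit guard added in this commit.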