Update README, licensing, build scripts #44

alexbaden · 2022-06-28T14:58:52Z

No description provided.

aregm · 2022-06-29T20:46:25Z

README.md

@@ -1,21 +1,105 @@
-# Heterogeneous Data Kernels
+# oneAPI Heterogeneous Data Kernels
+oneHDK is a low-level execution library for analytic data processing. 


Should this read "data analytics processing"? Swap the adjective.

aregm · 2022-06-29T20:48:01Z

README.md

-Cloning a project with submodules. Either use `git clone --recurse-submodules` to clone the repo, or clone as normal and then run:
+### Storage
+
+`ArrowStorage` is the default (and only available) HDK storage layer. `ArrowStorage` provides storage support for [Apache Arrow](https://github.com/apache/arrow) format data. The storage layer must be explicitly initialized:


aregm · 2022-06-29T20:48:21Z

README.md

+HDK is used as a fast execution backend in [Modin](https://github.com/intel-ai/modin). The HDK library provides a set of components for federating analytic queries to an execution backend based on [OmniSciDB](https://github.com/intel-ai/omniscidb). Currently, HDK targets OLAP-style queries expressed as relational algebra or SQL. Major and immediate project priorities include:
+- Introducing a HDK-specific IR and set of optimizations to reduce reliance on RelAlg and improve extensibility of the query API. 
+- Supporting heterogeneous device execution, where a query is split across a set of hardware devices (e.g. CPU and GPU) for best performance. We have developed an initial cost model for heterogeneous execution.
+- Improving performance of the CPU backend on Modin-specific queries and current-generation data science workstations and servers. 


aregm · 2022-06-29T20:48:44Z

README.md

-A low-level execution library for analytic data processing. 
+HDK is used as a fast execution backend in [Modin](https://github.com/intel-ai/modin). The HDK library provides a set of components for federating analytic queries to an execution backend based on [OmniSciDB](https://github.com/intel-ai/omniscidb). Currently, HDK targets OLAP-style queries expressed as relational algebra or SQL. Major and immediate project priorities include:
+- Introducing a HDK-specific IR and set of optimizations to reduce reliance on RelAlg and improve extensibility of the query API. 
+- Supporting heterogeneous device execution, where a query is split across a set of hardware devices (e.g. CPU and GPU) for best performance. We have developed an initial cost model for heterogeneous execution.


This should mention support for both Intel and NVIDIA GPUs.

aregm · 2022-06-29T20:49:11Z

README.md


-A low-level execution library for analytic data processing. 
+HDK is used as a fast execution backend in [Modin](https://github.com/intel-ai/modin). The HDK library provides a set of components for federating analytic queries to an execution backend based on [OmniSciDB](https://github.com/intel-ai/omniscidb). Currently, HDK targets OLAP-style queries expressed as relational algebra or SQL. Major and immediate project priorities include:


It can also be used standalone through pyHDK.

alexbaden · 2022-07-01T17:54:27Z

Comments addressed w/ latest push.

Add SPDX License Identifier to header files

7bde0b4

alexbaden force-pushed the alex/licensing branch from 0cd3c03 to 259a127 Compare June 28, 2022 19:46

alexbaden marked this pull request as ready for review June 28, 2022 19:46

alexbaden requested review from vlad-penkin and aregm June 28, 2022 19:47

aregm suggested changes Jun 30, 2022

aregm approved these changes Jul 1, 2022

alexbaden added 2 commits July 1, 2022 13:58

Update README

3c2d74a

Add Conda env file

a011892

alexbaden force-pushed the alex/licensing branch from 414128e to a011892 Compare July 1, 2022 18:58

vlad-penkin approved these changes Jul 1, 2022

alexbaden merged commit 506951c into main Jul 1, 2022

alexbaden deleted the alex/licensing branch July 1, 2022 19:23
