This repository was archived by the owner on May 9, 2024. It is now read-only.

Update README, licensing, build scripts #44

Merged: alexbaden merged 3 commits into main from alex/licensing on Jul 1, 2022

alexbaden
Contributor

No description provided.

@alexbaden alexbaden marked this pull request as ready for review June 28, 2022 19:46
@alexbaden alexbaden requested review from vlad-penkin and aregm June 28, 2022 19:47
README.md Outdated
@@ -1,21 +1,105 @@
-# Heterogeneous Data Kernels
+# oneAPI Heterogeneous Data Kernels
+oneHDK is a low-level execution library for analytic data processing.

data analytics processing? Swap the adjective.

README.md Outdated
Cloning a project with submodules. Either use `git clone --recurse-submodules` to clone the repo, or clone as normal and then run:
### Storage

`ArrowStorage` is the default (and only available) HDK storage layer. `ArrowStorage` provides storage support for [Apache Arrow](https://github.com/apache/arrow) format data. The storage layer must be explicitly initialized:

currently?

README.md Outdated
HDK is used as a fast execution backend in [Modin](https://github.com/intel-ai/modin). The HDK library provides a set of components for federating analytic queries to an execution backend based on [OmniSciDB](https://github.com/intel-ai/omniscidb). Currently, HDK targets OLAP-style queries expressed as relational algebra or SQL. Major and immediate project priorities include:
- Introducing a HDK-specific IR and set of optimizations to reduce reliance on RelAlg and improve extensibility of the query API.
- Supporting heterogeneous device execution, where a query is split across a set of hardware devices (e.g. CPU and GPU) for best performance. We have developed an initial cost model for heterogeneous execution.
- Improving performance of the CPU backend on Modin-specific queries and current-generation data science workstations and servers.

by >2x

A low-level execution library for analytic data processing.
HDK is used as a fast execution backend in [Modin](https://github.com/intel-ai/modin). The HDK library provides a set of components for federating analytic queries to an execution backend based on [OmniSciDB](https://github.com/intel-ai/omniscidb). Currently, HDK targets OLAP-style queries expressed as relational algebra or SQL. Major and immediate project priorities include:
- Introducing a HDK-specific IR and set of optimizations to reduce reliance on RelAlg and improve extensibility of the query API.
- Supporting heterogeneous device execution, where a query is split across a set of hardware devices (e.g. CPU and GPU) for best performance. We have developed an initial cost model for heterogeneous execution.

Supporting both Intel and Nvidia GPU

README.md Outdated

A low-level execution library for analytic data processing.
HDK is used as a fast execution backend in [Modin](https://github.com/intel-ai/modin). The HDK library provides a set of components for federating analytic queries to an execution backend based on [OmniSciDB](https://github.com/intel-ai/omniscidb). Currently, HDK targets OLAP-style queries expressed as relational algebra or SQL. Major and immediate project priorities include:

Also can be used standalone through pyHDK

@alexbaden
Contributor Author

Comments addressed w/ latest push.

@alexbaden alexbaden merged commit 506951c into main Jul 1, 2022
@alexbaden alexbaden deleted the alex/licensing branch July 1, 2022 19:23
3 participants