initial update of UDF sections #12
Merged
12 commits
efc74c3 initial update of UDF sections (mkcorneli)
c14fc41 Update doc/distributed_python/intro.rst (mkcorneli)
6028bf9 Update doc/distributed_python/advanced.rst (mkcorneli)
d7b287f Update doc/distributed_python/advanced.rst (mkcorneli)
c7dd56c Update doc/distributed_python/debugging.rst (mkcorneli)
0063db3 Update doc/distributed_python/usage.rst (mkcorneli)
6ffb177 Update doc/distributed_python/intro.rst (mkcorneli)
c699e4d Update doc/distributed_python/usage.rst (mkcorneli)
5f1cb9c Update doc/distributed_python/usage.rst (mkcorneli)
0b06701 added data science page and videos (mkcorneli)
9bb7a52 added data science page and videos (mkcorneli)
86a08f6 updated wording (mkcorneli)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| Data Science with Exasol | ||
| ------------------------- | ||
|
|
||
| Exasol has significant capabilities for implementing data science workflows - from classic machine learning to Gen AI and language model solutions. | ||
|
|
||
| The best way to get started is with `Exasol's AI Lab <https://github.com/exasol/ai-lab>`_. | ||
|
|
||
| This video walks through `getting started with AI Lab <https://www.youtube.com/watch?v=LkqdLlRF2Go>`_. | ||
|
|
||
| AI Lab includes various workbooks that you can run to load data into Exasol. | ||
| This video walks through `loading data <https://www.youtube.com/watch?v=-t1q6CeswJs&t=1s>`_ in more detail. | ||
|
|
||
| If you want to leverage Exasol to build Gen AI and LM-based solutions, we recommend starting with the Exasol `Transformers Extension <https://github.com/exasol/transformers-extension>`_. | ||
|
|
||
| This video showcases the potential `applications of the Exasol Transformers Extension <https://www.youtube.com/watch?v=sHSnCR71kyc>`_. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,14 @@ | ||
| Advanced | ||
| -------- | ||
| -------- | ||
|
|
||
| From a performance perspective, which programming language you should use in a UDF script depends on the purpose and context of the script, as specific elements may have different capacities in each language. For example, string processing can be faster in one language while XML parsing can be faster in another. This means that no single language performs best in all circumstances. However, if overall performance is the most important criterion, we recommend using Lua. Lua is integrated in Exasol in the most native way and therefore has the smallest process overhead. | ||
|
|
||
| During the processing of a SELECT statement, multiple virtual machines are started for each script and node. These virtual machines process the data independently. For scalar functions, the input rows are distributed across those virtual machines to achieve maximum parallelism. For SET input tuples, the virtual machines are used per group if you specify a GROUP BY clause. Otherwise, there will be only one group, which means only one node and virtual machine can process the data. | ||
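The grouping behaviour described above can be illustrated with a small pure-Python simulation. This is only a sketch and not Exasol's actual scheduler: the helper `partition_rows` and the sample rows are hypothetical names introduced for illustration. It shows that without a group key all rows fall into a single group, which is why only one instance can process them.

```python
from collections import defaultdict


def partition_rows(rows, group_key=None):
    """Split rows into the groups a SET script would be called with.

    Hypothetical helper: with no group key (no GROUP BY clause),
    every row falls into one single group.
    """
    groups = defaultdict(list)
    for row in rows:
        key = row[group_key] if group_key is not None else None
        groups[key].append(row)
    return dict(groups)


rows = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 7},
]

# With a group key, each group could be handled by a separate script instance.
grouped = partition_rows(rows, group_key="region")

# Without a group key there is one group, so nothing can run in parallel.
ungrouped = partition_rows(rows)

print(sorted(grouped))   # group keys found in the data
print(len(ungrouped))    # number of groups without GROUP BY
```

If a SET script is the bottleneck of a query, checking whether the call can be grouped is therefore usually the first thing to look at.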
|
|
||
| The following pages contain information about more advanced UDF functionality: | ||
|
|
||
| * `UDF Instance Limiting <https://docs.exasol.com/db/latest/database_concepts/udf_scripts/udf_instance_limit.htm>`_ | ||
|
|
||
| * `Hiding Access tokens and secrets <https://docs.exasol.com/db/latest/database_concepts/udf_scripts/hide_access_keys_passwords.htm>`_ | ||
|
|
||
| * `Managing Script Language Containers <https://docs.exasol.com/db/latest/database_concepts/udf_scripts/adding_new_packages_script_languages.htm>`_ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,4 @@ | ||
| Debugging | ||
| --------- | ||
| --------- | ||
|
|
||
| For Python versions 3.x, we recommend using `pyexasol <https://exasol.github.io/pyexasol/master/index.html>`_ and the `script output functionality <https://exasol.github.io/pyexasol/master/user_guide/udf_script_output.html>`_ to debug your UDFs. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,10 @@ | ||
| Intro to UDFs | ||
| ------------- | ||
| ------------- | ||
|
|
||
| UDF scripts allow you to program your own analysis, processing, and generation functions, and to execute these functions in parallel inside an Exasol cluster. | ||
| By using UDF scripts, you can solve problems that cannot be solved with SQL statements. | ||
|
|
||
| Exasol supports the programming languages Java, Lua, R, and Python in UDF scripts. These languages provide different functionalities (for example, statistical functions in R) and different libraries. | ||
|
|
||
| UDFs are the key to unlocking much of Exasol's AI, ML and Data Science potential, as well as customizing Exasol to suit your unique use cases. | ||
| UDFs are executed by Exasol's massively parallel query engine and scale across available hardware in the same way SQL queries do - this gives them significant performance potential. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,30 @@ | ||
| Creating and running UDFs | ||
| ------------------------- | ||
| ------------------------- | ||
|
|
||
| In the CREATE SCRIPT command, you must define the type of input and output values. | ||
| There are two types of UDF inputs (set and scalar) and two types of UDF outputs (returns and emits). | ||
| These can be combined as needed to suit your use case. | ||
|
|
||
| - Input values | ||
|
|
||
| - **SCALAR** Specifies that the script processes single input rows. The code is therefore called once per input row. | ||
|
|
||
| - **SET** Specifies that the processing refers to a set of input rows. Within the code, you can iterate through those rows. | ||
|
|
||
| - Output values | ||
|
|
||
| - **RETURNS** Specifies that the script returns a single value. | ||
|
|
||
| - **EMITS** Specifies that the script can create (emit) multiple result rows (tuples). | ||
|
|
||
| Each UDF script must contain the main function run(). This function is called with a parameter providing access to the input data of Exasol. If your script processes multiple input tuples (using SET), you can iterate through the single tuples using this parameter. | ||
| You can specify an ORDER BY clause either when creating a script or when calling it. This clause sorts the processing of the groups of SET input data. If it is necessary for the algorithm, you should specify this clause when creating the script to avoid wrong results due to misuse. | ||
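The difference between the input and output types can be sketched in plain Python. The `MockContext` class below is a hypothetical stand-in for the parameter that Exasol passes to `run()`, so the example runs without a database. Its `next()` and `emit()` methods are modelled on the Exasol Python UDF context as we understand it; check the official UDF reference before relying on them. In a real script each function would be named `run`; they carry different names here only so both fit in one file.

```python
class MockContext:
    """Hypothetical stand-in for the context object passed to run()."""

    def __init__(self, columns, rows):
        self._columns = list(columns)
        self._rows = list(rows)
        self._pos = 0
        self.emitted = []

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. column names.
        columns = self.__dict__.get("_columns", [])
        if name in columns:
            return self._rows[self._pos][columns.index(name)]
        raise AttributeError(name)

    def next(self):
        """Advance to the next row; return False when no rows are left."""
        if self._pos + 1 < len(self._rows):
            self._pos += 1
            return True
        return False

    def emit(self, *values):
        """Record one output row."""
        self.emitted.append(values)


def run_scalar(ctx):
    # SCALAR input with RETURNS: called once per row, returns one value.
    return ctx.amount * 2


def run_set(ctx):
    # SET input with EMITS: called once per group, iterates over the rows
    # and emits a running total, which depends on the row order.
    total = 0
    while True:
        total += ctx.amount
        ctx.emit(ctx.region, total)
        if not ctx.next():
            break


columns = ["region", "amount"]
rows = [("north", 10), ("north", 7), ("north", 3)]

scalar_results = [run_scalar(MockContext(columns, [row])) for row in rows]

set_ctx = MockContext(columns, rows)
run_set(set_ctx)

print(scalar_results)
print(set_ctx.emitted)
```

Because the running total changes when the rows arrive in a different order, `run_set` is the kind of algorithm for which the ORDER BY clause belongs in the script definition rather than in each call.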
|
||
|
|
||
| Input parameters in scripts are always case sensitive, just like the script code. This differs from SQL identifiers, which are only case sensitive if they are delimited. | ||
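A minimal sketch of what case sensitivity means inside the script code, assuming a script whose input parameter was declared as `price`. The `SimpleNamespace` object is a hypothetical stand-in for the real context, used only so the example runs without a database.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the context of a script declared with
# the input parameter "price" (lower case).
ctx = SimpleNamespace(price=9.5)


def run(ctx):
    # The attribute name must match the declared parameter name exactly.
    return ctx.price


def has_parameter(ctx, name):
    """Return True if the context exposes a parameter with this exact name."""
    return hasattr(ctx, name)


print(run(ctx))
print(has_parameter(ctx, "price"))
print(has_parameter(ctx, "PRICE"))
```

Accessing `ctx.PRICE` in this sketch raises an `AttributeError`, whereas an undelimited SQL identifier written as `price` or `PRICE` would refer to the same object.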
|
|
||
| You can use this `UDF Generator <https://htmlpreview.github.io/?https://github.com/EXASOL/script-languages/blob/master/udf-script-signature-generator/udf-script-signature-generator.html>`_ to help you get started building your own UDFs. | ||
|
|
||
| Examples | ||
| ^^^^^^^^^ | ||
|
|
||
| You can view examples of UDFs `here <https://docs.exasol.com/db/latest/database_concepts/udf_scripts/udf_examples.htm>`_. | ||