SAP Hana data source for LinkedIn's Datahub

Datahub SAP Hana Metadata source

Add your SAP Hana databases to your Linkedin Datahub!

Description

This Python package extracts metadata from a SAP Hana database and pushes it to Datahub. This includes:

  • Table schema with datatypes
  • View definitions
  • View lineages with support for cross-schema references and column-level lineages

The metadata from SAP Hana is extracted and parsed using SQLAlchemy (for table lineage) and sqlglot (for column lineage).

Recipe File

The recipe file for this source supports ingestion of both table and column lineage via the Datahub CLI. The recipe can also list schemas to include and exclude, which allows lineage to be created across different schemas in a SAP Hana database. Results can be viewed in the Datahub UI, printed to the console, or written to a file.
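
To illustrate how include and exclude lists usually interact, here is a minimal sketch. It assumes the recipe follows the allow/deny regex convention common to Datahub SQL sources: a schema is ingested when it matches at least one allow pattern and no deny pattern, with patterns matched case-insensitively from the start of the name. The function name `schema_is_ingested` and the schema names are hypothetical and not part of this package.

```python
import re
from typing import Sequence


def schema_is_ingested(
    schema: str,
    allow: Sequence[str] = (".*",),
    deny: Sequence[str] = (),
) -> bool:
    """Return True if `schema` matches any allow regex and no deny regex.

    Assumption: deny patterns take precedence over allow patterns, and
    matching is case-insensitive and anchored at the start of the name.
    """
    flags = re.IGNORECASE
    if any(re.match(pattern, schema, flags) for pattern in deny):
        return False
    return any(re.match(pattern, schema, flags) for pattern in allow)


if __name__ == "__main__":
    # Hypothetical schema names, for illustration only.
    schemas = ["SALES", "SALES_STAGING", "SYS"]
    kept = [
        s for s in schemas
        if schema_is_ingested(s, allow=["SALES.*"], deny=[".*_STAGING"])
    ]
    print(kept)
```

Checking deny patterns first means an exclusion cannot be overridden by a broad allow pattern such as `.*`.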

Installing

Pre-built wheels can be downloaded from the Releases page.

Otherwise, you must install from source.

Requirements

You need the following tools pre-installed:

  • pyenv
  • Poetry
  • Task (provides the task command used in the steps below)

Try it out

  1. Clone the project

    git clone git@github.com:contiamo/datahub-sap-hana.git
    cd datahub-sap-hana
  2. You will need Python 3.10 or higher

    Once you have pyenv and Poetry installed, you should run

    pyenv install 3.10.10
    pyenv local 3.10.10
    poetry config virtualenvs.in-project true
  3. Install the project and dependencies

    task setup
  4. Edit examples/hana_recipe.yaml to set the connection details for your SAP Hana database.

    If you just want to run a local test, SAP offers SAP Hana Express as a free trial version of Hana. We recommend the Docker image, which makes setup very easy.

  5. Run the test sync

    poetry run datahub ingest run -c examples/hana_recipe.yaml
  6. Inspect the contents of the hana_mces.json file that was created.
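
One way to get a quick overview of the output is to count the records by kind. The sketch below assumes that hana_mces.json is a JSON array in which each record either carries a `proposedSnapshot` object keyed by the snapshot class name or an `aspectName` field. The helper name `summarize_mces` is hypothetical and not part of this package.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_mces(path: str) -> Counter:
    """Count the records in a Datahub file-sink output by kind.

    Assumption: the file holds a JSON array of metadata events.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    counts: Counter = Counter()
    for record in records:
        if "proposedSnapshot" in record:
            # Snapshot records are keyed by a fully qualified class name;
            # keep only the last component, e.g. "DatasetSnapshot".
            snapshot_type = next(iter(record["proposedSnapshot"]), "unknown")
            counts[f"snapshot:{snapshot_type.rsplit('.', 1)[-1]}"] += 1
        else:
            # Other records are grouped by their aspect name, if present.
            counts[f"aspect:{record.get('aspectName', 'unknown')}"] += 1
    return counts
```

Calling `summarize_mces("hana_mces.json")` from the project directory returns a `Counter`, and its `most_common()` method lists the most frequent record kinds first.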

Docker image

A Docker image with Datahub and this package preinstalled is available from the GitHub Container Registry.

docker run -it --rm -v `pwd`:/opt \
   ghcr.io/contiamo/datahub-sap-hana:latest \
   ingest run -c /opt/examples/hana_recipe.yaml

Note that you may need to set the --network flag if you are using the Hana Express Docker image.

Development

Running the tests

To run the unit tests, use

task test

To run all of the tests, just use

task test -- -v