rdw-ingestion-tools

A DS team repository for shared data ingestion utilities.

setup

1. Ensure the necessary global environment variables are set in your Python environment.

For aaq:

AAQ_API_KEY="<secret>"
AAQ_API_BASE_URL="<url>"

For content_repo:

CONTENT_REPO_API_KEY="<secret>"
CONTENT_REPO_BASE_URL="<url>"

For flow_results:

FLOW_RESULTS_API_KEY="<secret>"
FLOW_RESULTS_API_BASE_URL="<url>"

For rapidpro:

RAPIDPRO_API_KEY="<secret>"
RAPIDPRO_API_BASE_URL="<url>"

For survey:

SURVEY_API_KEY="<secret>"
SURVEY_API_BASE_URL="<url>"

For turn (the original Data Export API):

TURN_API_KEY="<secret>"
TURN_API_BASE_URL="<url>"

For turn_bq (a newer addition):

TURN_BQ_API_KEY="<secret>"
TURN_BQ_API_BASE_URL="<url>"

To use the S3 utilities (which, among other things, read and write specific parquet files), the following variables should also be set:

S3_KEY="<key>"
S3_SECRET="<secret>"
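Before running an ingestion job, it can help to confirm that the variables for the service you need are actually set. The sketch below is one way to do that using only the standard library. The `REQUIRED_ENV_VARS` mapping and the `missing_env_vars` helper are hypothetical names used for illustration and are not part of `rdw-ingestion-tools`; the variable names are copied from the list above.

```python
import os

# Variable names copied from the setup list above. This mapping and the
# helper below are illustrative only; they are not part of the package.
REQUIRED_ENV_VARS = {
    "aaq": ["AAQ_API_KEY", "AAQ_API_BASE_URL"],
    "content_repo": ["CONTENT_REPO_API_KEY", "CONTENT_REPO_BASE_URL"],
    "flow_results": ["FLOW_RESULTS_API_KEY", "FLOW_RESULTS_API_BASE_URL"],
    "rapidpro": ["RAPIDPRO_API_KEY", "RAPIDPRO_API_BASE_URL"],
    "survey": ["SURVEY_API_KEY", "SURVEY_API_BASE_URL"],
    "turn": ["TURN_API_KEY", "TURN_API_BASE_URL"],
    "turn_bq": ["TURN_BQ_API_KEY", "TURN_BQ_API_BASE_URL"],
    "s3": ["S3_KEY", "S3_SECRET"],
}


def missing_env_vars(service, environ=None):
    """Return the variables for `service` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[service] if not env.get(name)]


if __name__ == "__main__":
    # Report the status of every service in one pass.
    for service in REQUIRED_ENV_VARS:
        missing = missing_env_vars(service)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{service}: {status}")
```

Only the services you plan to call need to report "ok"; the rest can stay unset.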

2. Install the `rdw-ingestion-tools` package

There are two ways of doing this.

Versioned install from github:

rdw-ingestion-tools is public! Replace `<version>` with the release tag you want to install:

pip3 install git+https://github.com/praekeltfoundation/rdw-ingestion-tools.git@<version>

From clone (with poetry). This is recommended:

git clone git@github.com:praekeltfoundation/rdw-ingestion-tools.git

cd rdw-ingestion-tools

poetry install

usage

For more examples of how to interact with particular API endpoints, see the examples directory. It contains examples for each supported third-party service and its associated endpoints.

For instance, to get flows from the Flow Results Specification API:

from api.flow_results import pyFlows

flows = pyFlows.flows.get_flows()

print(flows.keys())

To access some of the S3 utilities used in ingestion:

import os
from s3 import pyS3

bucket = os.environ["BUCKET_NAME"]
prefix = os.environ["PREFIX"]

pyS3.s3.get_filenames(bucket=bucket, prefix=prefix)

to-do

Add tests - yes, I am a bad developer for not having any yet.
