
A collection of tools for extracting FHIR resources and analytics services on top of that data.


jembi/fhir-data-pipes

 
 


What is this?

This repository includes pipelines that read FHIR-formatted data from a FHIR server (such as HAPI, a GCP FHIR store, or even OpenMRS) and write it either to a data warehouse based on Apache Parquet files or to another FHIR server. It also includes a Python query library that makes working with FHIR-based data warehouses simpler.
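As a rough illustration of the kind of transformation described above, the sketch below flattens nested FHIR Patient resources from a search Bundle into flat, analytics-friendly rows. It is not code from this repository: the sample bundle, the function names (`flatten_patient`, `bundle_to_rows`), and the chosen output columns are all hypothetical, and it uses only the Python standard library. The actual pipelines target Apache Parquet files, while this example stops at plain dictionaries.

```python
# Hypothetical sample data: a tiny FHIR "searchset" Bundle with two Patients.
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {
            "resource": {
                "resourceType": "Patient",
                "id": "p1",
                "gender": "female",
                "birthDate": "1990-04-12",
                "name": [{"family": "Doe", "given": ["Jane", "Q"]}],
            }
        },
        {
            "resource": {
                "resourceType": "Patient",
                "id": "p2",
                "gender": "male",
                "birthDate": "1985-11-02",
            }
        },
        # A non-Patient entry, to show that other resource types are skipped.
        {"resource": {"resourceType": "Observation", "id": "o1"}},
    ],
}


def flatten_patient(resource):
    """Map one nested Patient resource to a flat row (illustrative columns)."""
    names = resource.get("name") or [{}]
    first_name = names[0]  # Only the first listed name is used in this sketch.
    given = " ".join(first_name.get("given", []))
    return {
        "id": resource.get("id"),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
        "family_name": first_name.get("family"),
        "given_name": given or None,
    }


def bundle_to_rows(bundle):
    """Collect flat rows for every Patient resource found in a Bundle."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") == "Patient":
            rows.append(flatten_patient(resource))
    return rows


if __name__ == "__main__":
    for row in bundle_to_rows(SAMPLE_BUNDLE):
        print(row)
```

Rows in this shape can be handed to any tabular tool. Missing fields are kept as `None` rather than dropped, so every row has the same columns.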

These tools are intended to be generic and, eventually, to work with any FHIR-based data source and data warehouse. The main directories are listed below with a brief description of their contents:

  • pipelines/ *START HERE*: Batch and streaming pipelines to transform data from a FHIR-based source to an analytics-friendly data warehouse or another FHIR store.

  • docker/: Docker configurations for various servers/pipelines.

  • doc/: Documentation for project contributors. See the pipelines README and wiki for usage documentation.

  • utils/: Various artifacts for setting up an initial database, running pipelines, etc.

  • dwh/: Query library for working with distributed FHIR-based data warehouses.

  • bunsen/: A fork of a subset of the Bunsen project, used to transform FHIR JSON resources into Avro records with a SQL-on-FHIR schema.

  • e2e-tests/: Scripts for testing pipelines end-to-end.

NOTE: This project started as a collaboration between Google and the OpenMRS community.
