iptk/indexer-elasticsearch

Elasticsearch Metadata Indexer

This metadata indexer enumerates the metadata of every dataset in a single dataset store and exports it to an Elasticsearch host.
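As a rough illustration of the export step, the sketch below turns the metadata of one dataset into action dictionaries of the shape accepted by the bulk helper in the elasticsearch package (`_index`, `_id`, `_source`). The function name `build_actions`, the index name `iptk-metadata`, the `dataset_id:name` document ID scheme, and the idea that each dataset carries named metadata sets are assumptions made for this example; they are not taken from index.py or the iptk package.

```python
def build_actions(dataset_id, metadata_sets, index_name="iptk-metadata"):
    """Yield one bulk-index action per metadata set of a dataset.

    dataset_id, metadata_sets and index_name are hypothetical inputs:
    metadata_sets maps a metadata set name to a JSON-serializable dict.
    """
    for name, document in sorted(metadata_sets.items()):
        yield {
            "_index": index_name,            # assumed index name
            "_id": f"{dataset_id}:{name}",   # assumed document ID scheme
            "_source": document,             # the metadata itself
        }


if __name__ == "__main__":
    example = {"experiment": {"operator": "jane"}, "sample": {"species": "E. coli"}}
    for action in build_actions("0123abcd", example):
        print(action)
```

A generator of such dictionaries could then be passed to `elasticsearch.helpers.bulk` together with a client connected to the configured host.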

Usage

Either run the index.py script directly, which requires Python 3.6 and the elasticsearch and iptk packages, or use the provided Docker image.

Configuration

The Elasticsearch host can be configured through the ELASTICSEARCH_HOST environment variable. The dataset store location can be specified by DATASETS_PATH.
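A minimal sketch of how a script might read these two settings is shown below. The helper name `load_config` and both fallback values are assumptions for illustration only; the defaults actually used by index.py may differ.

```python
import os


def load_config(environ=os.environ):
    """Read the indexer settings from environment variables.

    The fallback values are hypothetical and not taken from index.py.
    """
    return {
        "elasticsearch_host": environ.get("ELASTICSEARCH_HOST", "localhost:9200"),
        "datasets_path": environ.get("DATASETS_PATH", "./datasets"),
    }


if __name__ == "__main__":
    print(load_config())
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.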

If SHUFFLE_DATASETS is set, the script first builds a list of all datasets, shuffles it, and indexes the datasets in that order. Building the list requires iterating through the file system up front, which adds startup time, but it ensures that every dataset has the same expected index time relative to the launch of the script. This is especially helpful when running multiple instances of the script: each instance works through the datasets in a different random order, which decreases the average index time.
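The trade-off described above can be sketched as follows. Without shuffling, datasets can be yielded lazily as the file system is traversed; with shuffling, the full list has to be materialized first. The function names and the assumption that every dataset is a direct subdirectory of the store are hypothetical; the real store layout and the logic in index.py may differ.

```python
import os
import random


def iter_datasets(store_path):
    """Lazily yield dataset directories in file-system order.

    Assumes (hypothetically) that each dataset is a direct subdirectory
    of the dataset store.
    """
    with os.scandir(store_path) as entries:
        for entry in entries:
            if entry.is_dir():
                yield entry.path


def datasets_in_index_order(store_path, shuffle, rng=None):
    """Return the datasets in the order they should be indexed."""
    datasets = iter_datasets(store_path)
    if not shuffle:
        # Indexing can start immediately; nothing is listed up front.
        return datasets
    # Shuffling needs the complete list, which costs startup time.
    listed = list(datasets)
    (rng or random).shuffle(listed)
    return listed


if __name__ == "__main__":
    store = os.environ.get("DATASETS_PATH", ".")
    shuffle = "SHUFFLE_DATASETS" in os.environ
    for path in datasets_in_index_order(store, shuffle):
        print(path)
```

Here the presence of SHUFFLE_DATASETS in the environment is treated as "set", regardless of its value; that interpretation is also an assumption.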
