Skip to content

ria-ee/X-Road-catalogue-collector

Repository files navigation

Collector for X-tee subsystems and methods catalogue

This application will collect data for X-tee subsystems and methods catalogue. Collector will output collected data to a directory that is ready to by served by web server (like Apache or Nginx). Subsequent executions will create new versions of catalogue while preserving old versions.

Configuration

Create a configuration file for your X-Road instance using an example configuration file: example-config.json. If you need to provide catalogue data for multiple X-Road instances then you will need separate configurations for each X-Road instance.

Configuration parameters:

  • output_path - output directory for collected data;

  • minio_url - address of your MinIO server;

  • minio_access_key - access key for MinIO;

  • minio_secret_key - secret key for MinIO;

  • minio_secure - boolean flag indicating if secure HTTPS connection is used for MinIO;

  • minio_ca_certs - CA certificate for validating MinIO certificate;

  • minio_bucket - MinIO bucket used for file storage;

  • minio_path - path inside MinIO bucket used for file storage;

  • server_url - address of your security server;

  • client - array of X-Road client identifiers;

  • instance - X-Road instance to collect data from;

  • timeout - X-Road query timeout;

  • server_cert - optional TLS certificate of your security server for verification;

  • client_cert - optional application TLS certificate for authentication with security server;

  • client_key - optional application key for authentication with security server;

  • thread_count - amount of parallel threads to use;

  • wsdl_replaces - replace metadata like creation timestamp in WSDLs to avoid duplicates;

  • excluded_member_codes - exclude certain members who are permanently in faulty state or should not be queried for any other reasons;

  • excluded_subsystem_codes - exclude certain members who are permanently in faulty state or should not be queried for any other reasons;

  • filtered_hours - amount of parallel threads to use;

  • filtered_days - amount of parallel threads to use;

  • filtered_months - amount of parallel threads to use;

  • cleanup_interval - interval in days when automatic removal of older reports will be performed. During the cleanup only the first report of each day is preserved and extra reports are deleted;

  • days_to_keep - amount of latest days to protect against cleanup;

  • logging-config - logging configuration passed to logging.config.dictConfig(). You can read more about Python3 logging here: https://docs.python.org/3/library/logging.config.html.

Installing python venv

Python virtual environment is an easy way to manage application dependencies. First You will need to install support for python venv:

sudo apt-get install python3-venv

Then install required python modules into venv:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Running

You can run the collector by issuing command (with activated venv):

python catalogue-collector.py config-instance1.json

Systemd timer

Systemd timer can be used as more advanced version of cron. You can use provided example timer and service definitions to perform scheduled collection of data from your instances.

Add service description systemd/catalogue-collector.service to /lib/systemd/system/catalogue-collector.service and timer description systemd/catalogue-collector.timer to /lib/systemd/system/catalogue-collector.timer.

Then start and enable automatic startup:

sudo systemctl daemon-reload
sudo systemctl start catalogue-collector.timer
sudo systemctl enable catalogue-collector.timer

Helper scripts

  • recreate_history.py - This script can be used to to update history.json file when it was corrupted or when some of the reports were deleted. Usage: python3 recreate_history.py <path to catalogue>
  • remove_unused.py - This script can be used to remove WSDL files that are no longer used in X-tee catalogue. For example due to deletion of older catalogue reports. Usage: python3 remove_unused.py <path to catalogue>. Or to simply list unused WSDLs: python3 remove_unused.py --only-list <path to catalogue>
  • clean_history.py - This script can be used to remove old JSON index files to free up disk space. Second parameter is an ammount of latest days that will not be cleaned. Older days will be cleaned so that only the first report of the day is kept. Usage: python3 clean_history.py <path to catalogue> <days to keep>

If after usage of remove_unused.py you need to also delete empty directories then execute the following command inside catalogue directory:

find . -type d -empty -delete

Minio storage

If minio_url is configured then collected data will be pushed to minio storage.

To test minio locally on linux machine execute the following commands (note that you should never use the default password for production):

sudo mkdir -p /mnt/data
docker run -d -p 9000:9000 --name minio1 -e "MINIO_ACCESS_KEY=minioadmin" -e "MINIO_SECRET_KEY=minioadmin" -v /mnt/data:/data minio/minio server /data
cd
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
~/mc config host add --quiet --api s3v4 cat http://localhost:9000 minioadmin minioadmin
~/mc mb cat/catalogue
~/mc policy set download cat/catalogue

To copy catalogue data to minio execute the following command:

~/mc cp -r EE cat/catalogue/

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages