This project is a fork of Pipelinewise's target-s3
where we experimented with adding new features. We are no longer maintaining this project and do not recommend its use in production. See more here: #1 (comment)
Singer target that uploads data to S3 following the Singer spec.
This is a Meltano compatible target connector.
The recommended method of running this target is to use it from Meltano.
If you want to run this Singer Target independently please read further.
First, make sure Python 3 is installed on your system or follow these installation instructions for Mac or Ubuntu.
It's recommended to use a virtualenv:
python3 -m venv venv
. venv/bin/activate
pip install target-s3
or
python3 -m venv venv
. venv/bin/activate
pip install --upgrade pip
pip install .
Run it like any other target that follows the Singer specification:
some-singer-tap | target-s3 --config [config.json]
It reads incoming messages from STDIN and uses the properties in config.json
to upload data to S3.
Note: To avoid version conflicts, run taps and targets in separate virtual environments.
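To illustrate what "reading incoming messages from STDIN" involves, here is a minimal sketch of consuming Singer messages. It is not this target's actual implementation; the function names `read_singer_messages` and `group_records_by_stream` are hypothetical. It assumes only what the Singer spec defines: one JSON object per line, with a `type` of `SCHEMA`, `RECORD`, or `STATE`.

```python
import io
import json


def read_singer_messages(stream):
    """Yield (type, message) pairs from newline-delimited JSON."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = json.loads(line)
        yield message["type"], message


def group_records_by_stream(stream):
    """Collect RECORD payloads per stream and keep the last STATE value."""
    records = {}
    state = None
    for msg_type, message in read_singer_messages(stream):
        if msg_type == "RECORD":
            records.setdefault(message["stream"], []).append(message["record"])
        elif msg_type == "STATE":
            state = message["value"]
        # SCHEMA messages are ignored in this sketch
    return records, state


if __name__ == "__main__":
    # Stand-in for the tap's output; a real target would pass sys.stdin.
    sample = io.StringIO(
        '{"type": "SCHEMA", "stream": "users", "schema": {}, "key_properties": ["id"]}\n'
        '{"type": "RECORD", "stream": "users", "record": {"id": 1}}\n'
        '{"type": "STATE", "value": {"users": 1}}\n'
    )
    print(group_records_by_stream(sample))
```

A real target would additionally validate records against the SCHEMA message and emit the STATE value on STDOUT once the corresponding data has been uploaded.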
Running the target connector requires a config.json
file. An example with the minimal settings:
{
"s3_bucket": "my_bucket"
}
Profile-based authentication is used by default, with the default profile. To use another profile, set the aws_profile parameter in config.json or set the AWS_PROFILE environment variable.
For non-profile-based authentication, set the aws_access_key_id, aws_secret_access_key and optionally the aws_session_token parameters in config.json. Alternatively, you can define them outside config.json by setting the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN environment variables.
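The lookup order described above (config.json first, then environment variables) can be sketched as follows. This is an illustration, not the target's actual code: the helper name `resolve_aws_session_kwargs` is hypothetical, and giving explicit keys precedence over a profile when both are set is an assumption. The returned keyword names match those accepted by `boto3.Session`.

```python
import os


def resolve_aws_session_kwargs(config, environ=None):
    """Build keyword arguments suitable for boto3.Session(**kwargs).

    Hypothetical helper: each value is taken from config first and
    falls back to the matching environment variable.
    """
    environ = os.environ if environ is None else environ

    key_id = config.get("aws_access_key_id") or environ.get("AWS_ACCESS_KEY_ID")
    secret = config.get("aws_secret_access_key") or environ.get("AWS_SECRET_ACCESS_KEY")

    if key_id and secret:
        # Assumption: explicit keys take precedence over a profile.
        kwargs = {"aws_access_key_id": key_id, "aws_secret_access_key": secret}
        token = config.get("aws_session_token") or environ.get("AWS_SESSION_TOKEN")
        if token:
            kwargs["aws_session_token"] = token
        return kwargs

    profile = config.get("aws_profile") or environ.get("AWS_PROFILE")
    # An empty dict lets boto3 fall back to the default profile.
    return {"profile_name": profile} if profile else {}


if __name__ == "__main__":
    print(resolve_aws_session_kwargs({"aws_profile": "dev"}, environ={}))
```

Passing `environ` explicitly keeps the function easy to test without modifying the real process environment.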
Full list of options in config.json:
Property | Type | Required? | Description |
---|---|---|---|
aws_access_key_id | String | No | S3 Access Key Id. If not provided, AWS_ACCESS_KEY_ID environment variable will be used. |
aws_secret_access_key | String | No | S3 Secret Access Key. If not provided, AWS_SECRET_ACCESS_KEY environment variable will be used. |
aws_session_token | String | No | AWS Session token. If not provided, AWS_SESSION_TOKEN environment variable will be used. |
aws_profile | String | No | AWS profile name for profile based authentication. If not provided, AWS_PROFILE environment variable will be used. |
s3_bucket | String | Yes | S3 Bucket name |
field_to_partition_by_time | String | Yes | The timestamp or date field (key) that will be parsed into year/month/day to create partitions for large event datasets. |
file_type | String | No | (Default: 'parquet') The type of file to upload to S3. The only supported option is parquet. The file extension is set automatically based on the file type. |
compression | String | No | The type of compression to apply before uploading. Supported options are none, snappy (default), gzip, and brotli. The file extension is set automatically based on the compression. |
add_metadata_columns | Boolean | No | (Default: False) Metadata columns add extra row-level information about data ingestion (e.g. when the row was read from the source, when it was inserted or deleted in the target). Metadata columns are created automatically by adding extra columns to the tables with the column prefix _SDC_. The column names follow the Stitch naming conventions documented at https://www.stitchdata.com/docs/data-structure/integration-schemas#sdc-columns. Enabling metadata columns flags deleted rows by setting the _SDC_DELETED_AT metadata column. Without the add_metadata_columns option, rows deleted by Singer taps will not be recognisable in the uploaded data. |
encryption_type | String | No | (Default: 'none') The type of encryption to use. Current supported options are: 'none' and 'KMS'. |
encryption_key | String | No | A reference to the encryption key to use for data encryption. For KMS encryption, this should be the name of the KMS encryption key ID (e.g. '1234abcd-1234-1234-1234-1234abcd1234'). This field is ignored if 'encryption_type' is none or blank. |
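To make the partitioning, file extension, and encryption options more concrete, here is a rough sketch of how they might combine into an S3 object key and upload arguments. The key layout (`year=/month=/day=`), the file name `data`, the suffix mapping, and all function names are hypothetical and are not taken from this project. The `ServerSideEncryption` and `SSEKMSKeyId` names are the standard boto3 upload `ExtraArgs` for KMS.

```python
from datetime import datetime

# Hypothetical mapping from the compression option to a file suffix.
COMPRESSION_SUFFIX = {"none": "", "snappy": ".snappy", "gzip": ".gz", "brotli": ".br"}


def partition_prefix(record, field):
    """Parse the configured timestamp/date field into year/month/day."""
    value = record[field]
    # fromisoformat() in older Python versions does not accept a trailing "Z".
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return f"year={parsed.year:04d}/month={parsed.month:02d}/day={parsed.day:02d}"


def object_key(stream, record, config):
    """Build an illustrative object key for one record's partition."""
    prefix = partition_prefix(record, config["field_to_partition_by_time"])
    file_type = config.get("file_type", "parquet")
    suffix = COMPRESSION_SUFFIX[config.get("compression", "snappy")]
    return f"{stream}/{prefix}/data.{file_type}{suffix}"


def upload_extra_args(config):
    """Return boto3 ExtraArgs for KMS; empty when encryption is none or blank."""
    if (config.get("encryption_type") or "none").lower() == "kms":
        return {
            "ServerSideEncryption": "aws:kms",
            "SSEKMSKeyId": config["encryption_key"],
        }
    return {}


if __name__ == "__main__":
    cfg = {"s3_bucket": "my_bucket", "field_to_partition_by_time": "created_at"}
    print(object_key("users", {"created_at": "2021-03-04T05:06:07Z"}, cfg))
    print(upload_extra_args(cfg))
```

The dictionary from `upload_extra_args` could be passed as `ExtraArgs` to a boto3 `upload_file` call; the actual key layout used by the target may differ.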
Apache License Version 2.0
See LICENSE for the full text.