Docker: offer to configure S3 storage by environment variables #450

Open
oupala opened this issue Jul 26, 2022 · 14 comments

Comments

@oupala

oupala commented Jul 26, 2022

It looks like it is not currently possible to configure S3 storage through environment variables. The only way to configure S3 storage is by editing the plikd.cfg config file.

When you are using a k8s cluster that has an S3 operator, the S3 bucket is created dynamically on startup and its credentials are made accessible through environment variables (mainly configMaps).

It would be great if plik could retrieve its S3 credentials from environment variables (aka configMaps) so that it can be dynamically linked to the S3 bucket. A setting in plikd.cfg would be ignored whenever the same setting is also set as an environment variable.

I think this requires a change in the plikd binary so that settings can be read from environment variables in addition to the configuration file. Am I right?

@camathieu
Member

Hello,

I think that this is already possible. You should be able to pass a JSON config to the PLIKD_DATA_BACKEND_CONFIG environment variable.
See : https://github.com/root-gg/plik#configuration-

@oupala
Author

oupala commented Aug 28, 2022

A config file is not really the same thing as an environment variable.

A config file provides all the variables at once, in a single file.

I was expecting to be able to pass each variable as its own environment variable. This is especially useful when the S3 bucket is provisioned by a K8S operator that exposes all credentials as environment variables.

@camathieu
Member

camathieu commented Aug 28, 2022

Each configuration parameter can be overridden using an environment variable, as follows:

One can specify a configuration parameter using an environment variable named after the parameter in screaming snake case, prefixed with PLIKD_:

PLIKD_DEBUG_REQUESTS=true ./plikd

Arrays and config maps must be provided in JSON format. Arrays are overridden but maps are merged:

PLIKD_DATA_BACKEND_CONFIG='{"Directory":"/var/files

Is having to pass the whole data backend config as JSON in a single environment variable an issue?
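
To illustrate the naming rule quoted above, here is a minimal sketch of how a top-level plikd.cfg parameter name could be turned into its environment variable name. The helper name `plik_env_name` is hypothetical, and the conversion rule is only an approximation of "screaming snake case"; it is not taken from Plik's source and may differ for parameter names where an acronym is followed by another word.

```shell
#!/bin/sh
# Hypothetical helper: derive the environment variable name for a top-level
# plikd.cfg parameter, e.g. DebugRequests -> PLIKD_DEBUG_REQUESTS.
# Assumption: word boundaries are lower-case/digit followed by upper-case.
plik_env_name() {
    # 1. insert "_" at each lower->upper boundary
    # 2. upper-case the result
    # 3. add the PLIKD_ prefix
    printf '%s\n' "$1" |
        sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' |
        tr '[:lower:]' '[:upper:]' |
        sed 's/^/PLIKD_/'
}

plik_env_name DebugRequests       # prints PLIKD_DEBUG_REQUESTS
plik_env_name DataBackendConfig   # prints PLIKD_DATA_BACKEND_CONFIG
```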

@camathieu
Member

If needed we could improve the environment variable parser to understand things like

PLIKD_DATA_BACKEND_CONFIG_DIRECTORY="/var/files"

@oupala
Author

oupala commented Aug 28, 2022

Here is an extract from plikd.cfg config file:

   DataBackend  = "s3"
   [DataBackendConfig]
       Endpoint = "127.0.0.1:9000"
       AccessKeyID = "access_key_id"
       SecretAccessKey = "access_key_secret"
       Bucket = "plik"
       Location = "us-east-1"
       Prefix = ""
       UseSSL = true
       PartSize = 16000000 // Chunk size when file size is not known. (default to 16MB)
                           // Multiply by 10000 to get the max upload file size (max upload file size 160GB)
       SSE = ""  // the following encryption methods are available :
                 //  - SSE-C: server-side-encryption with customer provided keys ( managed by Plik )
                 //  - S3:    server-side-encryption using S3 storage encryption ( managed by the S3 backend )

As far as I understand, some of these variables can be set via environment variables:

  • DataBackend => PLIKD_DATA_BACKEND

But for the following variables, I think an improvement of the data parser would be required:

  • [DataBackendConfig]
    • Endpoint => PLIKD_DATA_BACKEND_CONFIG_ENDPOINT
    • AccessKeyID => PLIKD_DATA_BACKEND_CONFIG_ACCESS_KEY_ID
    • SecretAccessKey => PLIKD_DATA_BACKEND_CONFIG_SECRET_ACCESS_KEY
    • Bucket => PLIKD_DATA_BACKEND_CONFIG_BUCKET
    • Location => PLIKD_DATA_BACKEND_CONFIG_LOCATION
    • Prefix => PLIKD_DATA_BACKEND_CONFIG_PREFIX
    • UseSSL => PLIKD_DATA_BACKEND_CONFIG_USE_SSL
    • PartSize => PLIKD_DATA_BACKEND_CONFIG_PART_SIZE
    • SSE => PLIKD_DATA_BACKEND_CONFIG_SSE

In fact, the S3 operator dynamically creates a few environment variables: the endpoint, the access key ID, the secret access key, and the bucket name. Passing a JSON document is not possible because the S3 operator does not provide one; it only provides individual environment variables.

@camathieu
Member

camathieu commented Aug 29, 2022 via email

@oupala
Author

oupala commented Aug 29, 2022

As of now you can already pass the data backend config as a JSON string (not a JSON file) with the data backend settings. I'll see if I can implement the improvement you described.

Is that behavior documented?

How would you do that?

@camathieu
Member

camathieu commented Aug 29, 2022

For example, this in the plikd.cfg config file:

DataBackend  = "s3"
[DataBackendConfig]
    Endpoint = "127.0.0.1:9000"
    AccessKeyID = "access_key_id"
    SecretAccessKey = "access_key_secret"
    Bucket = "plik"
    Location = "us-east-1"

Would look like this using environment variables:

export PLIKD_DATA_BACKEND="s3"
export PLIKD_DATA_BACKEND_CONFIG='{"Endpoint":"127.0.0.1:9000","AccessKeyID":"access_key_id","SecretAccessKey":"access_key_secret","Bucket":"plik","Location":"us-east-1"}'

As maps/dicts are merged, you could specify "safe" parameters like Endpoint or Bucket in the config file and pass only the "secret" parameters using the environment variable, like this:

export PLIKD_DATA_BACKEND_CONFIG='{"SecretAccessKey":"access_key_secret"}'

@oupala
Author

oupala commented Aug 29, 2022

Thanks for the quick reply.

It would be great if the content of the previous comment was pasted in the documentation.

There is still a use case where individual environment variables would be required for optimized automation with k8s.

@oupala
Author

oupala commented Sep 14, 2022

For example, when our S3 K8S operator is creating a new bucket, the operator creates a configmap and a secret that become available in the K8S namespace:

  • <bucket-name>-configmap:
    • BUCKET_HOST
    • BUCKET_NAME
    • BUCKET_PORT
    • BUCKET_REGION
    • BUCKET_SSL
    • BUCKET_SUBREGION
  • <bucket-name>-secret:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • USERNAME

The best solution would be for plik to pick up these predefined environment variables through the following deployment syntax:

- name: PLIKD_DATA_BACKEND_CONFIG_ENDPOINT
  valueFrom:
    configMapKeyRef:
      name: <bucket-name>-configmap
      key: BUCKET_HOST
- name: PLIKD_DATA_BACKEND_CONFIG_ACCESS_KEY_ID
  valueFrom:
    secretKeyRef:
      name: <bucket-name>-secret
      key: AWS_ACCESS_KEY_ID
[...]
# and so on for all other variables
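
Until per-field variables exist, one possible workaround is a small wrapper entrypoint that assembles the JSON value Plik already accepts from the operator's variables listed above. This is only a sketch under assumptions: the wrapper and the function names `json_escape` and `build_backend_config` are hypothetical and not part of Plik, `BUCKET_HOST` and `BUCKET_PORT` are assumed to combine into `Endpoint`, `BUCKET_REGION` is assumed to map to `Location`, and `BUCKET_SSL` is assumed to hold the literal string `true` when TLS is enabled.

```shell
#!/bin/sh
# Sketch of a wrapper entrypoint (hypothetical, not shipped with Plik).

# Escape backslashes and double quotes so a value can sit inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print the data backend config as one JSON object built from the variables
# exposed by the S3 operator's configmap and secret.
# Assumption: BUCKET_SSL is exactly "true" when TLS should be used.
build_backend_config() {
    use_ssl=false
    [ "${BUCKET_SSL:-}" = "true" ] && use_ssl=true
    printf '{"Endpoint":"%s:%s","AccessKeyID":"%s","SecretAccessKey":"%s","Bucket":"%s","Location":"%s","UseSSL":%s}\n' \
        "$(json_escape "${BUCKET_HOST:-}")" \
        "$(json_escape "${BUCKET_PORT:-}")" \
        "$(json_escape "${AWS_ACCESS_KEY_ID:-}")" \
        "$(json_escape "${AWS_SECRET_ACCESS_KEY:-}")" \
        "$(json_escape "${BUCKET_NAME:-}")" \
        "$(json_escape "${BUCKET_REGION:-}")" \
        "$use_ssl"
}

# Usage, as the last lines of a real entrypoint (left commented out here):
#   export PLIKD_DATA_BACKEND=s3
#   export PLIKD_DATA_BACKEND_CONFIG="$(build_backend_config)"
#   exec ./plikd
```

With this approach the deployment would inject the configmap and secret keys under their original names and run the wrapper as the container command, so no JSON has to be written by hand.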

@mattjhammond

I've been trying, unsuccessfully, to get an AWS S3 backend working. I'm not sure what the Endpoint should be set to. Should Prefix be set?

Endpoint = "127.0.0.1:9000"
AccessKeyID = "mykey"
SecretAccessKey = "mysecretkey"
Bucket = "mytestbucket"
Location = "us-east-1"
Prefix = ""
UseSSL = true
PartSize = 16000000 #// Chunk size when file size is not known. (default to 16MB)
                    #// Multiply by 10000 to get the max upload file size (max upload file size 160GB)
SSE = "S3"  #// the following encryption methods are available :

unable to start Plik server : unable to initialize data backend : unable to check if bucket mytestbucket exists : Get "https://127.0.0.1:9000/mytestbucket/?location=": dial tcp 127.0.0.1:9000: connect: connection refused

@bodji
Member

bodji commented Nov 8, 2022

The endpoint is the AWS URL.

You should find one corresponding to the zone you want here : https://docs.aws.amazon.com/general/latest/gr/s3.html
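
As a rough example, for a standard AWS region the regional endpoint host has the form `s3.<region>.amazonaws.com`, which replaces the local `127.0.0.1:9000` value. The helper name `aws_s3_endpoint` below is hypothetical, and the region is just the one from the config above; check the linked table for the exact endpoint of your region.

```shell
#!/bin/sh
# Hypothetical helper: regional S3 endpoint host for a standard AWS region.
aws_s3_endpoint() {
    printf 's3.%s.amazonaws.com\n' "$1"
}

# Only non-secret fields are set here. Since maps are merged, credentials
# and the bucket name can stay in plikd.cfg.
export PLIKD_DATA_BACKEND=s3
export PLIKD_DATA_BACKEND_CONFIG="{\"Endpoint\":\"$(aws_s3_endpoint us-east-1)\",\"Location\":\"us-east-1\",\"UseSSL\":true}"
```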

Let us know if you make any progress.

@mattjhammond

@bodji Thanks for the endpoint assistance, the service is working now!

@oupala
Author

oupala commented Dec 5, 2022

@mattjhammond Next time, please open a new issue as your issue is not the same as the one in the title. One thread = one subject => clean language.
