-
Notifications
You must be signed in to change notification settings - Fork 117
[AWS] Use credentials and config from AWS SDK file #1114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@JosepSampe please don't merge yet |
|
@JosepSampe ready for review and merge |
| temp = copy.deepcopy(config_data['aws_lambda']) | ||
| config_data['aws_lambda'].update(config_data['aws']) | ||
| config_data['aws_lambda'].update(temp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular reason to, instead of copying the aws keys inside aws_lambda, create another level of configuration inside aws_lambda and put aws section inside aws_lambda section? I think this new approach will breake a functionality explained below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the idea was to remove the "aws" section altogether when clients are created. Reverted
| if "secret_access_key" in config_data["aws_lambda"]["aws"] or "access_key_id" in config_data["aws_lambda"]["aws"]: | ||
| logger.warning('Using "secret_access_key" and "access_key_id" in lithops configuration is deprecated and ' | ||
| 'it will be removed in future releases ' | ||
| '- Use boto3 configuration with environment variables or config file in ~/.aws instead ' | ||
| '(https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular reason to deprecate access_key_id secret_access_key? IMO we should keep it as an option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the deprecation notice but kept the warning
| else: | ||
| logger.debug("Creating default boto3 client") | ||
| client_config = Config( | ||
| signature_version=UNSIGNED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, this line is necessary for accessing public buckets when no s3 config is provided in the lithops config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I will keep the default client as is, so that it can get credentials from the AWS role in the lambda runtime. If any user needs unsigned requests, it should explicitly create an s3 client or specify it in the s3 config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have an app that is analyzing an AWS S3 public bucket, where the client is automatically created by lithops because we put in iterdata a bucket address, like in this example but for AWS S3, so we should find a way to keep this unsigned flag in the case that a user does not have credentials at all in his machine. If the user has credentials in his machine, the unsigned flag is not necessary.
| elif 'region' not in config_data['aws_lambda']['aws']: | ||
| config_data['aws_lambda']['aws']['region'] = config_data['aws_lambda']['region'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, these lines where used to propagate the region from the aws_lambda section to the aws section, and later to aws_s3, in that case where someone puts a region in aws_lambda, but not inaws and aws_s3, this way the s3 backend would use the same region as the lambda backend. I think this new approach will break this propagation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Josep, now I understand... I'll add some comments to the code to explain this. What I was thinking is that all AWS configuration should be inherited from the standard AWS SDK ways, but I understand that is also important to enforce users deploy lambdas and buckets in the same region to avoid data transfer costs. In this case, the safest option is require to specifiy the "region" parameter the "aws" section, regardless of the AWS SDK config, and not have it repeated in "aws_s3" or "aws_lambda"?
| sts_client = self.aws_session.client('sts', region_name=self.region_name) | ||
| caller_id = sts_client.get_caller_identity() | ||
|
|
||
| self.user_key = caller_id["UserId"].split(":")[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested these particular lines, and in my case the caller_id["UserId"] does not contain :, so it fails. My UserID looks like: {'UserId': 'XIPOZEKLFKWLQSXXX7587', ...}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| if "secret_access_key" in config_data["aws"] or "access_key_id" in config_data["aws"]: | ||
| logger.warning( | ||
| 'using "secret_access_key" and "access_key_id" in the lithops configuration is deprecated and ' | ||
| 'they will be removed in future releases ' | ||
| '- Use boto3 configuration with environment variables or config file in ~/.aws instead ' | ||
| '(https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html)') | ||
|
|
||
| # Put "aws" section inside AWS backends, so we can access credentials at the backend class | ||
| # Remove from config_data to avoid storing secrets | ||
| if "aws" in config_data: | ||
| if "aws_lambda" in config_data: | ||
| config_data["aws_lambda"]["aws"] = config_data["aws"] | ||
| if "aws_s3" in config_data: | ||
| config_data["aws_s3"]["aws"] = config_data["aws"] | ||
| if "aws_batch" in config_data: | ||
| config_data["aws_batch"]["aws"] = config_data["aws"] | ||
| if "aws_ec2" in config_data: | ||
| config_data["aws_ec2"]["aws"] = config_data["aws"] | ||
| del config_data["aws"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments as in the lambda backend
| temp = copy.deepcopy(config_data['aws_s3']) | ||
| config_data['aws_s3'].update(config_data['aws']) | ||
| config_data['aws_s3'].update(temp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments as in the lambda backend
| # temp = copy.deepcopy(config_data["aws_lambda"]) | ||
| config_data["aws_lambda"].update(config_data["aws"]) | ||
| # config_data["aws_lambda"].update(temp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason of this roundtrip is to make sure that the most specific config is always applied. For example, if you set region both in aws and aws_lamda, the aws_lamda region must be applied. I don't know if there is a better way to make sure the most specific config is applied and not overwritten by the .update(), but by commenting those 2 lines, if you have a region set in your aws section of the config, for example us-east1, and then you do this in your code: fexec = lithops.FunctionExecutor(region="eu-west2"), the us-east1 region will always overwrite the region you explicitly set in the function executor.
| else: | ||
| logger.debug("Creating default boto3 client") | ||
| client_config = Config( | ||
| signature_version=UNSIGNED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have an app that is analyzing an AWS S3 public bucket, where the client is automatically created by lithops because we put in iterdata a bucket address, like in this example but for AWS S3, so we should find a way to keep this unsigned flag in the case that a user does not have credentials at all in his machine. If the user has credentials in his machine, the unsigned flag is not necessary.
|
@JosepSampe Hi Josep, all requests have been implemented. Please we should need this merged ASAP, we switched to an SSO-based account and the current implementation in main does not work well (and also to be ready for the next release #1137 ). Thanks! |
| if "region" not in config_data["aws_lambda"]: | ||
| raise Exception("\"region\" is mandatory under the \"aws_lambda\" or \"aws\" section of the configuration") | ||
| elif "region" not in config_data["aws"]: | ||
| config_data["aws"]["region"] = config_data["aws_lambda"]["region"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if region is set using options 1 (~/.aws/config) or 2 (env var)? Or is region mandatory in any case in the lithops config like in the documentation?
lithops:
backend: aws_lambda
aws_lambda:
execution_role: <EXECUTION_ROLE_ARN>
region: <REGION_NAME>There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if "secret_access_key" in config_data["aws"] or "access_key_id" in config_data["aws"]: | ||
| logger.warning("Using 'secret_access_key' and 'access_key_id' in lithops configuration is not recommended " | ||
| "- Use boto3 configuration file in ~/.aws or environment variables instead " | ||
| "(https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html)") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for any warning for now. We still have to decide whether we want to deprecate it or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't add comments in unmodifed code, but:
In lines 42-43: if in options 1 and 2 region is not mandatory, this should be fixed.
Now lithops supports the automatic creation of the storage bucket if it is not provided in the config. So lines 45-48 should be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now lithops supports the automatic creation of the storage bucket if it is not provided in the config. So lines 45-48 should be fixed.
Not sure how to proceed with this... With the SSO approach, we don't have a "fixed" key or ID to read from, contrary to the key pair approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the S3 instance shared between all the users in the SSO approach?
Or each SSO user has its own instance ?
Or the same insytance but s/he can only see his own buckets?
I wonder if we can simply use the config_profile name (or a hash of it) for the bucket name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple SSO users can access to a same account and share buckets, but each one will have a different profile name. Yes, using config profile is viable 👍
|
|
||
| Lithops needs at least `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION` environment variables set. | ||
|
|
||
| 3. Provide the credentials in the `aws` section of the Lithops config file **This option is not ideal and will be removed in future Lithops releases!**: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for any warning for now. We still have to decide whether we want to deprecate it or not
| if "secret_access_key" in config_data["aws"] or "access_key_id" in config_data["aws"]: | ||
| logger.warning("Using 'secret_access_key' and 'access_key_id' in lithops configuration is not recommended " | ||
| "- Use boto3 configuration file in ~/.aws or environment variables instead " | ||
| "(https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for any warning for now. We still have to decide whether we want to deprecate it or not
| aws_lambda: | ||
| execution_role: <EXECUTION_ROLE_ARN> | ||
| region: <REGION_NAME> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the execution_role mandatory in aws_lambda? if yes I would update this .md file and include it in all the parts where you put some lithops config example, to make it clearer. Is region mandatory in all the cases? I think this lithops config example is confusing here. I would remove it and put the config example in the next section, when necessary, with all the necessary parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, execution_role is mandatory. The user must specify which services can the lambda access to. We could automate this, but the user should have IAM permissions like create role... We can leave like this for now
| if "access_key_id" in payload["config"]["aws"] and "secret_access_key" in payload["config"]["aws"]: | ||
| del payload["config"]["aws"]["access_key_id"] | ||
| del payload["config"]["aws"]["secret_access_key"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary/convenient here to remove the session_token too?
I think here you can simply pop the keys instead of checking one by one if they exists
payload["config"]["aws"].pop("access_key_id", None)
payload["config"]["aws"].pop("secret_access_key", None)
payload["config"]["aws"].pop("session_token", None)|
My last comments are about the 2 other AWS backend (Batch & EC2).
|
| if 'access_key_id' in config_data['aws']: | ||
| key = config_data['aws_s3']['access_key_id'] | ||
| elif 'config_profile' in config_data['aws']: | ||
| key = hashlib.md5(config_data['aws']['config_profile'].encode("utf-8"), usedforsecurity=False).hexdigest() | ||
| else: | ||
| raise Exception("'access_key_id' or 'config_profile' is mandatory in 'aws' section of the configuration") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the option of AWS_ACCESS_KEY_ID in the env missing here?
| 1. Provide credentials via the `~/.aws/config` file. **This is the preferred option to configure AWS credentials for use with Lithops**: | ||
|
|
||
| You can run `aws configure` command if the AWS CLI is installed to setup the credentials. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How the ~/.aws/config looks like in this case? are the keys going into a default profile by defaut? or are the keys set in the file without a profile?
I mean, after calling aws configure, you get this:?
aws_access_key_id=XXXXXXXXX
aws_secret_access_key=XXXXXX
or something like this:?
[profile default]
aws_access_key_id=XXXXXXXXXXXXXXX
aws_secret_access_key=XXXXXXXXXXXXXXXX
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if in this case it makes sense to force the user to provide a profile_name with aws configure --profile my-unique-profile-name and then configure lithops like in the SSO approach, with:
lithops:
backend: aws_lambda
aws:
config_profile: my-unique-profile-name
aws_lambda:
execution_role: <EXECUTION_ROLE_ARN>
region: <REGION_NAME>| 2. Provide credentials via environment variables: | ||
|
|
||
| Lithops needs at least `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION` environment variables set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in this option you can put a config example (and maybe remove AWS_DEFAULT_REGION?):
lithops:
backend: aws_lambda
aws_lambda:
execution_role: <EXECUTION_ROLE_ARN>
region: <REGION_NAME>|
Closing for now, #1164 partially solves the issue described |
Fix for #1107
This pull request adds functionality to retrieve AWS SDK config and credentials from the standard config file (
~/.aws/configand~/.aws/credentials) or env vars (more info).Consequently, it deprecates using
aws_access_key_idandaws_secret_access_keyinawsLithops config section.This approach is not only more secure, as we avoid sending secrets to the runtime via payload, but also we support users with SSO-based accounts, which will need configure a profile in their
~/.aws/configfile and retrieve their session credentials dynamically. E.g.:Summary:
config_profile.Developer's Certificate of Origin 1.1