feat(glue): add L2 resources for Database and Table #1988
beautiful
    new glue.Table(stack, 'MyTable', {
      database: myDatabase,
      tableName: 'my_table',
      columns: [{
I am wondering, if `name` is unique, why not use a hash?
There are two semantics we want to model as strictly as we can: column uniqueness and ordering.
- A hash models uniqueness well, but it does not model ordering. In `node.js`, the order of keys is the order in which they are added to the object, but that is not the case for other languages like `java`, where a developer would have to know to use a `LinkedHashMap`.
- An array explicitly and intuitively defines the ordering in all languages, but it doesn't model column uniqueness.
I chose to statically model the ordering property with an `array` and check the uniqueness at runtime because then at least the experience is consistent for all consumers. Using a hash might create confusion for consumers: they would not receive an error, and the layout of their columns could just change arbitrarily.
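A minimal sketch of that trade-off, assuming a simplified `Column` shape and a hypothetical `validateColumns` helper (neither is taken from the PR's actual implementation): the array carries the ordering, and a runtime check rejects duplicate names.

```typescript
// Hypothetical, simplified column shape; the construct in the PR may differ.
interface Column {
  name: string;
  type: string;
}

// Returns the columns unchanged if every name is unique; throws otherwise.
// The array order is never modified, so the declared layout is preserved.
function validateColumns(columns: Column[]): Column[] {
  const seen = new Set<string>();
  for (const column of columns) {
    if (seen.has(column.name)) {
      throw new Error(`column names must be unique, but "${column.name}" is duplicated`);
    }
    seen.add(column.name);
  }
  return columns;
}

const ordered = validateColumns([
  { name: 'id', type: 'bigint' },
  { name: 'created_at', type: 'timestamp' },
]);

// The column layout follows the order in which the array was written.
const columnOrder = ordered.map((c) => c.name);
```

With this approach a duplicate name fails loudly at synthesis time in every target language, instead of silently reordering or overwriting columns.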
    /**
     * Storage type of the table's data.
     */
    storageType: StorageType;
Default to JSON?
I think it's best to ask customers to be explicit about their data format. Front-load the important questions: schema, file format, location and security.
packages/@aws-cdk/aws-glue/README.md
Outdated
    * [SSE-S3](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html) - Server side encryption (SSE) with an Amazon S3-managed key.
    ```ts
    new glue.Table(stack, 'MyTable', {
      encryption: glue.TableEncryption.SSE_S3
Enum names should be consistent with BucketEncryption
I would argue the other way around - the enum values are consistent with the S3, Athena, Glue and EMR documentation. What would I name `CSE-KMS` if I were copying `BucketEncryption`?
Fair enough, but I think we have a problem with ALL_CAPS when converting those member names to other languages. Can we find names that are PascalCase?
     */
    // CSE_KMS = 'CSE-KMS'
    CSE_KMS = 'CSE-KMS'
Proposed names that pass the jsii naming bar:
- `Unencrypted`
- `SSE_KMS` => `Kms`
- `SSE_KMS_MANAGED` => `KmsManaged`
- `SSE-S3` => `S3Managed`
- `CSE_KMS` => `ClientKms`
Your way distinguishes CSE with the prefix `Client` and implies SSE for the others. It's efficient. I think I'd prefer `ClientSideKms` over `ClientKms`, though.
Sounds good to me.
I am okay with `ServerSideXxx` as well, but then we'll have to also change it in other places ;-) and I favor consistency at this point.
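For illustration only, here is roughly what the enum could look like with the PascalCase names agreed on in this thread. The member names come from the proposal above (with `ClientSideKms` in place of `ClientKms`); every string value other than `'CSE-KMS'` is an assumption, and `isClientSide` is a hypothetical helper, not part of the PR.

```typescript
// Sketch of the renamed enum. Only the 'CSE-KMS' value appears in the diff;
// the remaining string values are assumed for the sake of the example.
enum TableEncryption {
  Unencrypted = 'Unencrypted',
  S3Managed = 'SSE-S3',
  Kms = 'SSE-KMS',
  KmsManaged = 'SSE-KMS-MANAGED',
  ClientSideKms = 'CSE-KMS',
}

// Hypothetical helper: the `ClientSide` prefix marks the only client-side
// mode, so every other encrypted member is implicitly server-side.
function isClientSide(encryption: TableEncryption): boolean {
  return encryption === TableEncryption.ClientSideKms;
}

const clientSide = isClientSide(TableEncryption.ClientSideKms);
const serverSide = isClientSide(TableEncryption.S3Managed);
```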
…ypted and CSE-KMS for an explicitly passed bucket
This change adds L2 resources for `Database` and `Table`.

Schemas are defined as an array of `Column`, each of which has a `name` and a `Type`.

Types
A table's schema is a collection of columns, each of which has a `name` and a `type`. Types are recursive structures, consisting of primitive and complex types:

Primitive
Numeric:
bigint
float
integer
smallint
tinyint
Date and Time:
date
timestamp
String Types:
string
decimal
char
varchar
Misc:
boolean
binary
Complex
- `array` - array of some other type.
- `map` - map of some primitive key type to any value type.
- `struct` - nested structure containing individually named and typed columns.

Pull Request Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.
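
The recursive type structure described in the PR description (primitives plus `array`, `map` and `struct`) could be modeled along the following lines. This is a sketch under assumptions: the `DataType` union, the `render` function and the `array<...>`-style output string are illustrative names and formats, not the API added by this PR, and the parameterized types (`decimal`, `char`, `varchar`) are left out for brevity.

```typescript
// Hypothetical model of the recursive type structure; not the PR's actual API.
type PrimitiveName =
  | 'bigint' | 'float' | 'integer' | 'smallint' | 'tinyint'
  | 'date' | 'timestamp' | 'string' | 'boolean' | 'binary';

interface PrimitiveType {
  kind: 'primitive';
  name: PrimitiveName;
}

type DataType =
  | PrimitiveType
  | { kind: 'array'; itemType: DataType }
  // Map keys are restricted to primitives; values may be any type.
  | { kind: 'map'; keyType: PrimitiveType; valueType: DataType }
  | { kind: 'struct'; fields: Array<{ name: string; type: DataType }> };

// Recursively renders a type as a single string (format assumed for the demo).
function render(type: DataType): string {
  switch (type.kind) {
    case 'primitive':
      return type.name;
    case 'array':
      return `array<${render(type.itemType)}>`;
    case 'map':
      return `map<${render(type.keyType)},${render(type.valueType)}>`;
    case 'struct':
      return `struct<${type.fields.map((f) => `${f.name}:${render(f.type)}`).join(',')}>`;
  }
}

const tags: DataType = { kind: 'array', itemType: { kind: 'primitive', name: 'string' } };
const counts: DataType = {
  kind: 'map',
  keyType: { kind: 'primitive', name: 'string' },
  valueType: { kind: 'primitive', name: 'bigint' },
};

// rendered === 'struct<tags:array<string>,counts:map<string,bigint>>'
const rendered = render({
  kind: 'struct',
  fields: [
    { name: 'tags', type: tags },
    { name: 'counts', type: counts },
  ],
});
```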