Document existing Hive properties #10831
Currently it just says "if you want faster, enable this", which isn't the whole truth.
The documentation should say when this is ok to set and when it's not.
@findepi can you help me come up with a better description here?
Maybe:
Allow the metastore to assume that partition keys are normalized for non-string partition keys. This can lead to performance improvements for affected tables. Note that ..
Changed to
Allow the metastore to assume that the values of partition
columns can be converted to string values. This can lead to
performance improvements in the queries which apply filters
on the partition columns.
Note that the partition keys of type `timestamp` do
not get canonicalized. Default is `false`.
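To make the "when is this OK to set" question concrete, here is a minimal Python sketch. It is not Trino or metastore code; it only shows how comparing partition keys as text differs from comparing them as typed values. The stored keys and helper names are hypothetical, and it assumes an integer partition column whose keys were written in a non-canonical form such as `01`.

```python
def matches_by_string(stored_key: str, wanted: int) -> bool:
    """Compare the way a purely text-based partition filter would."""
    return stored_key == str(wanted)


def matches_by_value(stored_key: str, wanted: int) -> bool:
    """Compare after parsing the stored key back into an integer."""
    return int(stored_key) == wanted


# Hypothetical keys for an integer partition column, stored as text.
# "01" is a non-canonical spelling of the value 1.
stored_keys = ["1", "01", "2"]

by_string = [k for k in stored_keys if matches_by_string(k, 1)]
by_value = [k for k in stored_keys if matches_by_value(k, 1)]

print(by_string)  # ['1']
print(by_value)   # ['1', '01']
```

If the text comparison is the one that gets applied, the `01` partition is silently skipped, so under this reading the setting would only be safe when every key is known to be written in canonical form.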
> The documentation should say when this is ok to set and when it's not.

I lack the knowledge in this area. For which types would it be a bad idea to set the parameter to `true`?
Is there any automated test showcasing the limitations?
Here and in other places below: use only one space character before code or between sentences.
In this specific case, `true` is part of the "Default" column and not the description sentence.
Is that true? Does the HMS actually delete the data itself? I thought it just manages the metadata...
For tables of type `MANAGED_TABLE`, the HMS is responsible for dropping the table's content as well when the table is dropped. Note that the AWS Glue metastore does not provide the same functionality when dropping tables.
URL of the SOCKS proxy to use for connecting to the Thrift Hive metastore.
Same for the various other ones like that below.
Related question, not to be answered here...
How do you add access to that class? Drop the related JARs into the Hive connector plugin folder?
If this is a common use case, we might need to document it specifically, or generically somewhere.
I'm assuming that this setting was meant for one of the many credential providers that come with the `aws-java-sdk-core` library.
If the class is not in `aws-java-sdk-core`, it would probably need to be added to the Hive plugin folder.
Related PR: #1363
@findinpath will you pick this up again and progress it towards merge? FYI, @m57lyra is leading a plan to heavily refactor the Hive connector docs.
@mosabua I think that this PR can be closed then. Sorry for keeping it open for so long. In the meantime I lost the context of the PR and would very likely need to start over from scratch.
@colebow can you harvest whatever is useful from this PR and still missing in our docs and do a lift and shift? It would be a shame to lose all the good work from @findinpath. To keep it simple, we could break it up into a bunch of small PRs that can be merged quickly and spread across the team.
No description provided.