FixIgnoreKey by vinishjail97 · Pull Request #11 · nsivabalan/hudi

vinishjail97 · 2022-01-24T06:59:25Z

[HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException (apache/hudi#3955)
[HUDI-2741] Fixing instantiating metadata table config in HoodieFileIndex (apache/hudi#3974)
[HUDI-2697] Minor changes about hbase index config. (apache/hudi#3927)
[HUDI-2472] Enabling metadata table in TestHoodieIndex and TestMergeOnReadRollbackActionExecutor (apache/hudi#3978)
[HUDI-2756] Fix flink parquet writer decimal type conversion (apache/hudi#3988)
[HUDI-2706] refactor spark-sql to make consistent with DataFrame api (apache/hudi#3936)
[HUDI-2589] Claiming RFC-37 for Metadata based bloom index feature. (apache/hudi#3995)
[HUDI-2758] remove redundant code in the HoodieRealtimeInputFormatUtils.getRealtimeSplits (apache/hudi#3994)
[MINOR] Fix typo in IntervalTreeBasedGlobalIndexFileFilter (apache/hudi#3993)
[HUDI-2744] Fix parsing of metadata table compaction timestamp when metrics are enabled (apache/hudi#3976)
[HUDI-2683] Parallelize deleting archived hoodie commits (apache/hudi#3920)
[HUDI-2712] Fixing a bug with rollback of partially failed commit which has new partitions ([HUDI-2712] Fixing list based rollback of partially failed commit which has new partitions apache/hudi#3947)
[HUDI-2769] Fix StreamerUtil#medianInstantTime for very near instant time (apache/hudi#4005)
[MINOR] Fixed checkstyle config to be based off Maven root-dir (requires Maven >=3.3.1 to work properly); (apache/hudi#4009)
[HUDI-2753] Ensure list based rollback strategy is used for restore (apache/hudi#3983)
[HUDI-2151] Part3 Enabling marker based rollback as default rollback strategy (apache/hudi#3950)
Check --source-avro-schema-path parameter ([HUDI-2760] Check --source-avro-schema-path parameter apache/hudi#3987)
[MINOR] Fix typo,'Hooide' corrected to 'Hoodie' (apache/hudi#4007)
[MINOR] Add the Schema for GooseFS to StorageSchemes (apache/hudi#3982)
[HUDI-2314] Add support for DynamoDb based lock provider (apache/hudi#3486)
[HUDI-2716] InLineFS support for S3FS logs (apache/hudi#3977)
[HUDI-2734] Setting default metadata enable as false for Java ([HUDI-2734] Setting default metadata enable per engine apache/hudi#4003)
[HUDI-2789] Flink batch upsert for non partitioned table does not work (apache/hudi#4028)
[HUDI-2790] Fix the changelog mode of HoodieTableSource (apache/hudi#4029)
[HUDI-2362] Add external config file support (apache/hudi#3416)
[HUDI-2641] Avoid deleting all inflight commits heartbeats while rolling back failed writes (apache/hudi#3956)
[HUDI-2791] Allows duplicate files for metadata commit (apache/hudi#4033)
[HUDI-2798] Fix flink query operation fields (apache/hudi#4041)
[HUDI-2731] Make clustering work regardless of whether there are base… (apache/hudi#3970)
[HUDI-2593] Virtual keys support for metadata table (apache/hudi#3968)
[HUDI-2472] Enabling metadata table for TestHoodieMergeOnReadTable and TestHoodieCompactor (apache/hudi#4023)
[HUDI-2796] Metadata table support for Restore action to first commit (apache/hudi#4039)
[HUDI-2242] Add configuration inference logic for few options (apache/hudi#3359)
Remove the aws packages from hudi flink bundle jar ([HUDI-2803]Remove the aws packages from hudi flink bundle jar apache/hudi#4050)
[HUDI-2742] Added S3 object filter to support multiple S3EventsHoodieIncrSources single S3 meta table (apache/hudi#4025)
[HUDI-2795] Add mechanism to safely update,delete and recover table properties (apache/hudi#4038)
[MINOR] Claim RFC number for RFC for debezium source for deltastreamer (apache/hudi#4047)
[MINOR] optimize in constructor of inputbatch class (apache/hudi#4040)
[HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration (apache/hudi#4059)
[HUDI-2804] Add option to skip compaction instants for streaming read (apache/hudi#4051)
[HUDI-2392] Make flink parquet reader compatible with decimal BINARY encoding (apache/hudi#4057)
[HUDI-1932] Update Hive sync timestamp when change detected (apache/hudi#3053)
[MINOR] Fix typos (apache/hudi#4053)
[HUDI-2799] Fix the classloader of flink write task (apache/hudi#4042)
[HUDI-1870] Add more Spark CI build tasks (apache/hudi#4022)
[HUDI-2533] New option for hoodieClusteringJob to check, rollback and re-execute the last failed clustering job (apache/hudi#3765)
[HUDI-2472] Enabling metadata table for TestHoodieIndex test case (apache/hudi#4045)
[MINOR] Fix instant parsing in HoodieClusteringJob (apache/hudi#4071)
[HUDI-2559] Converting commit timestamp format to millisecs (apache/hudi#4024)
[HUDI-2599] Make addFilesToview and fetchLatestBaseFiles public (apache/hudi#4066)
[HUDI-2550] Expand File-Group candidates list for appending for MOR tables (apache/hudi#3986)
[HUDI-2737] Use earliest instant by default for async compaction and clustering jobs (apache/hudi#3991)
[MINOR] Fix typo,'multipe' corrected to 'multiple' (apache/hudi#4068)
[HUDI-1937] Rollback unfinished replace commit to allow updates (apache/hudi#3869)
[MINOR] Add more configuration to Kafka setup script (apache/hudi#3992)
[HUDI-2743] Assume path exists and defer fs.exists() in AbstractTableFileSystemView (apache/hudi#4002)
[HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs (apache/hudi#4013)
[HUDI-2409] Using HBase shaded jars in Hudi presto bundle (apache/hudi#3623)
[HUDI-2332] Add clustering and compaction in Kafka Connect Sink (apache/hudi#3857)
[MINOR] Fix typo,rename 'HooodieAvroDeserializer' to 'HoodieAvroDeserializer' (apache/hudi#4064)
[HUDI-2325] Add hive sync support to kafka connect (apache/hudi#3660)
[HUDI-2831] Securing usages of SimpleDateFormat to be thread-safe (apache/hudi#4073)
[HUDI-2818] Fix 2to3 upgrade when set hoodie.table.keygenerator.class (apache/hudi#4077)
[HUDI-2838] refresh table after drop partition (apache/hudi#4084)
Revert "[HUDI-2799] Fix the classloader of flink write task (apache/hudi#4042)" (apache/hudi#4069)
[HUDI-2847] Flink metadata table supports virtual keys (apache/hudi#4096)
[HUDI-2759] extract HoodieCatalogTable to coordinate spark catalog table and hoodie table (apache/hudi#3998)
[HUDI-2688] Claim the next rfc 40 for Hudi connector for Trino (apache/hudi#4105)
[HUDI-2671] Fix kafka offset handling in Kafka Connect protocol (apache/hudi#4021)
[HUDI-2443] Hudi KVComparator for all HFile writer usages (apache/hudi#3889)
[HUDI-2788] Fixing issues w/ Z-order Layout Optimization (apache/hudi#4026)
[HUDI-2766] Cluster update strategy should not be fenced by write config (apache/hudi#4093)
[HUDI-2793] Fixing deltastreamer checkpoint fetch/copy over (apache/hudi#4034)
[HUDI-2853] Add JMX deps in hudi utilities and kafka connect bundles (apache/hudi#4108)
[HUDI-2844][CLI] Fixing archived Timeline crashing if timeline contains REPLACE_COMMIT (apache/hudi#4091)
[MINOR] Fix build failure due to checkstyle issues (apache/hudi#4111)
[HUDI-1290] [RFC-39] Deltastreamer avro source for Debezium CDC (apache/hudi#4048)
[HUDI-1290] Add Debezium Source for deltastreamer (apache/hudi#4063)
[HUDI-2792] Configure metadata payload consistency check (apache/hudi#4035)
[HUDI-2855] Change the default value of 'PAYLOAD_CLASS_NAME' to 'DefaultHoodieRecordPayload' (apache/hudi#4115)
[HUDI-2480] FileSlice after pending compaction-requested instant-time… (apache/hudi#3703)
[HUDI-1290] fixing mysql debezium source (apache/hudi#4119)
[HUDI-2800] Remove rdd.isEmpty() validation to prevent CreateHandle being called twice (apache/hudi#4121)
[HUDI-2794] Guarding table service commits within a single lock to commit to both data table and metadata table (apache/hudi#4037)
[HUDI-2671] Making error -> warn logs from timeline server with concurrent writers for inconsistent state (apache/hudi#4088)
[HUDI-2858] Fixing handling of cluster update reject exception in deltastreamer (apache/hudi#4120)
[HUDI-2841] Fixing lazy rollback for MOR with list based strategy (apache/hudi#4110)
[HUDI-2801] Add Amazon CloudWatch metrics reporter (apache/hudi#4081)
[HUDI-2840] Fixed DeltaStreaemer to properly respect configuration passed t/h properties file (apache/hudi#4090)
[HUDI-2005] Removing direct fs call in HoodieLogFileReader (apache/hudi#3865)
[HUDI-2851] Shade org.apache.hadoop.hive.ql.optimizer package for flink bundle jar (apache/hudi#4104)
[MINOR] Include hudi-aws in flink bundle jar (apache/hudi#4127)
[HUDI-2852] Table metadata returns empty for non-exist partition (apache/hudi#4117)
[HUDI-2863] Rename option 'hoodie.parquet.page.size' to 'write.parquet.page.size' (apache/hudi#4128)
[HUDI-2850] Fixing Clustering CLI - schedule and run command fixes to avoid NumberFormatException (apache/hudi#4101)
[HUDI-2814] Addressing issues w/ Z-order Layout Optimization (apache/hudi#4060)
[MINOR] Fixing test failure to fix CI build failure (apache/hudi#4132)
[HUDI-2861] Re-use same rollback instant time for failed rollbacks (apache/hudi#4123)
[HUDI-2767] Enabling timeline-server-based marker as default (apache/hudi#4112)
[HUDI-2845] Metadata CLI - files/partition file listing fix and new validate option (apache/hudi#4092)
[HUDI-2848] Excluse guava from hudi-cli pom ([HUDI-2848]When I run hudi-cli.sh using hadoop 3.2.1 , this is a error about guava class conflict apache/hudi#4100)
[HUDI-2864] Fix README and scripts with current limitations of hive sync (apache/hudi#4129)
[HUDI-2856] Bit cask disk map delete modified (apache/hudi#4116)
[MINOR] Follow ups from HUDI-2861 (re-use same rollback instant for failed rollback) (apache/hudi#4133)
[HUDI-2868] Fix skipped HoodieSparkSqlWriterSuite (apache/hudi#4125)
[HUDI-2475] [HUDI-2862] Metadata table creation and avoid bootstrapping race for write client & add locking for upgrade (apache/hudi#4114)
[HUDI-2102] Support hilbert curve for hudi (apache/hudi#3952)
Moving to 0.11.0-SNAPSHOT on master branch.
[MINOR] fix typo (apache/hudi#4140)
[MINOR] Fixing integ test suite for hudi-aws and archival validation (apache/hudi#4142)
Removing rfc from release package and fixing release validation script ([HUDI-2879] Removing rfc from release package and fixing release validation script apache/hudi#4147)
[MINOR] Fix syntax error in create_source_release.sh (apache/hudi#4150)
[MINOR] Fix typo,rename 'getUrlEncodePartitoning' to 'getUrlEncodePartitioning' (apache/hudi#4130)
[HUDI-2642] Add support ignoring case in update sql operation (apache/hudi#3882)
[HUDI-2891] Fix write configs for Java engine in Kafka Connect Sink (apache/hudi#4161)
Revert "[HUDI-2855] Change the default value of 'PAYLOAD_CLASS_NAME' to 'DefaultHoodieRecordPayload' (apache/hudi#4115)" ([HUDI-2898] Reverting "Change the default value of 'PAYLOAD_CLASS_NAME' to 'DefaultHoodieRecordPayload' (#4115)" apache/hudi#4169)
Revert "[HUDI-2856] Bit cask disk map delete modified (apache/hudi#4116)" (apache/hudi#4171)
[HUDI-2880] Fixing loading of props from default dir (apache/hudi#4167)
[HUDI-2881] Compact the file group with larger log files to reduce write amplification (apache/hudi#4152)
Fixed partitions produced by layout optimization in case order-by key is composed of a single column ([HUDI-2908] Fixed partitions produced by layout optimization in case order-by key is composed of a single column apache/hudi#4183)
[MINOR] Fix the wrong usage of timestamp length variable bug (Fix HoodieSqlUtils.formatQueryInstant timestamp variable bug apache/hudi#4179)
[HUDI-2904] Fix metadata table archival overstepping between regular writers and table services (apache/hudi#4186)
[HUDI-2914] Fix remote timeline server config for flink (apache/hudi#4191)
[minor] Refactor write profile to always generate fs view (apache/hudi#4198)
[HUDI-2924] Refresh the fs view on successful checkpoints for write profile (apache/hudi#4199)
[MINOR] use catalog schema if can not find table schema (apache/hudi#4182)
[HUDI-2902] Fixing populate meta fields with Hfile writers and Disabling virtual keys by default for metadata table (apache/hudi#4194)
[HUDI-2911] Removing default value for PARTITIONPATH_FIELD_NAME resulting in incorrect KeyGenerator configuration (apache/hudi#4195)
Revert "[HUDI-2495] Resolve inconsistent key generation for timestamp types by GenericRecord and Row (apache/hudi#3944)" (apache/hudi#4201)
[HUDI-2894][HUDI-2905] Metadata table - avoiding key lookup failures on base files over S3 (apache/hudi#4185)
Revert "[HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests"
[MINOR] Mitigate CI jobs timeout issues (apache/hudi#4173)
[HUDI-2933] DISABLE Metadata table by default (apache/hudi#4213)
[HUDI-2890] Kafka Connect: Fix failed writes and avoid table service concurrent operations (apache/hudi#4211)
[HUDI-2923] Fixing metadata table reader when metadata compaction is inflight (apache/hudi#4206)
[HUDI-2934] Optimize RequestHandler code style
[HUDI-2935] Remove special casing of clustering in deltastreamer checkpoint retrieval (apache/hudi#4216)
[HUDI-2877] Support flink catalog to help user use flink table conveniently (apache/hudi#4153)
[HUDI-2937] Introduce a pulsar implementation of hoodie write commit … (apache/hudi#4217)
[HUDI-2418] Support HiveSchemaProvider (apache/hudi#3671)
[HUDI-2916] Add IssueNavigationLink for IDEA (apache/hudi#4192)
[HUDI-2900] Fix corrupt block end position (apache/hudi#4181)
[HUDI-2876] for hive/presto hudi should remove the temp file which created by HoodieMergedLogRecordSanner when the query finished. (apache/hudi#4139)
[MINOR] Fix partition path formatting in error log (apache/hudi#4168)
[MINOR] Use maven-shade-plugin version for hudi-timeline-server-bundle from main pom.xml (apache/hudi#4209)
[MINOR] Remove redundant and conflicting spark-hive dependency (apache/hudi#4228)
[HUDI-2951] Disable remote view storage config for flink (apache/hudi#4237)
[HUDI-2942] add error message log in HoodieCombineHiveInputFormat (apache/hudi#4224)
[MINOR] Update DOAP with 0.10.0 Release (apache/hudi#4246)
[HUDI-2832][RFC-41] Proposal to integrate Hudi on Snowflake platform (apache/hudi#4074)
[HUDI-2964] Fixing aws lock configs to inherit from HoodieConfig (apache/hudi#4258)
[HUDI-2957] Shade kryo jar for flink bundle jar (apache/hudi#4251)
[HUDI-2665] Fix overflow of huge log file in HoodieLogFormatWriter (apache/hudi#3912)
[MINOR] Fix Compile broken (apache/hudi#4263)
[HUDI-2779] Cache BaseDir if HudiTableNotFound Exception thrown (apache/hudi#4014)
[HUDI-2966] Add TaskCompletionListener for HoodieMergeOnReadRDD to close logScaner when the query finished. (apache/hudi#4265)
[MINOR] FAQ link in SUPPORT_REQUEST template ([MINOR] Fix FAQ link in SUPPORT_REQUEST template apache/hudi#4266)
Claiming RFC for data skipping index for updated version ([HUDI-2973] Claiming RFC number for data skipping index (updated version) apache/hudi#4271)
Revert "Claiming RFC for data skipping index for updated version ([HUDI-2973] Claiming RFC number for data skipping index (updated version) apache/hudi#4271)" ([MINOR] Revert "Claiming RFC for data skipping index for updated version (#42… apache/hudi#4272)
[HUDI-2901] Fixed the bug clustering jobs cannot running in parallel (apache/hudi#4178)
[HUDI-2936] Add data count checks in async clustering tests (apache/hudi#4236)
[HUDI-2849] Improve SparkUI job description for write path (apache/hudi#4222)
[HUDI-2952] Fixing metadata table for non-partitioned dataset (apache/hudi#4243)
[HUDI-2912] Fix CompactionPlanOperator typo (apache/hudi#4187)
Adding verbose output for metadata validate files command ([MINOR] Adding verbose output for metadata validate files command apache/hudi#4166)
[HUDI-2892][BUG] Pending Clustering may stain the ActiveTimeLine and lead to incomplete query results (apache/hudi#4172)
[HUDI-2784] Add a hudi-trino-bundle for Trino (apache/hudi#4279)
[HUDI-2814] Make Z-index more generic Column-Stats Index (apache/hudi#4106)
[HUDI-2527] Multi writer test with conflicting async table services (apache/hudi#4046)
[HUDI-2974] Make the prefix for metrics name configurable (apache/hudi#4274)
[HUDI-2959] Fix the thread leak of cleaning service (apache/hudi#4252)
[HUDI-2985] Shade jackson for hudi flink bundle jar (apache/hudi#4284)
[HUDI-2906] Add a repair util to clean up dangling data and log files ([HUDI-2906] Add a repair util to clean up dangling base and log files apache/hudi#4278)
[HUDI-2984] Implement #close for AbstractTableFileSystemView (apache/hudi#4285)
[HUDI-2946] Upgrade maven plugins to be compatible with higher Java versions (apache/hudi#4232)
[HUDI-2938] Metadata table util to get latest file slices for readers/writers (apache/hudi#4218)
[HUDI-2990] Sync to HMS when deleting partitions (apache/hudi#4291)
[HUDI-2994] Add judgement to existed partitionPath in the catch code block for HU… (apache/hudi#4294)
[HUDI-2996] Flink streaming reader 'skip_compaction' option does not work (apache/hudi#4304)
[HUDI-2997] Skip the corrupt meta file for pending rollback action (apache/hudi#4296)
[HUDI-2995] Enabling metadata table by default (apache/hudi#4295)
[HUDI-3022] Fix NPE for isDropPartition method (apache/hudi#4319)
[HUDI-3024] Add explicit write handler for flink (apache/hudi#4329)
[HUDI-3025] Add additional wait time for namenode availability during IT tests initialization (apache/hudi#4328)
[HUDI-3028] Use blob storage to speed up CI downloads (apache/hudi#4331)
[HUDI-2998] claiming rfc number for consistent hashing index (apache/hudi#4303)
[HUDI-3015] Implement #reset and #sync for metadata filesystem view (apache/hudi#4307)
[Minor] Catch and ignore all the exceptions in quietDeleteMarkerDir ([Minor][HUDI-3109] Catch and ignore all the exceptions in quietDeleteMarkerDir apache/hudi#4301)
[HUDI-3001] Clean up the marker directory when finish bootstrap operation. (apache/hudi#4298)
[HUDI-3043] Revert async cleaner leak commit to unblock CI failure (apache/hudi#4343)
[HUDI-3037] Add back remote view storage config for flink (apache/hudi#4338)
[HUDI-3046] Claim RFC number for RFC for Compaction / Clustering Service (apache/hudi#4347)
[HUDI-2958] Automatically set spark.sql.parquet.writelegacyformat, when using bulkinsert to insert data which contains decimalType (apache/hudi#4253)
[HUDI-3043] Adding some test fixes to continuous mode multi writer tests (apache/hudi#4356)
[HUDI-2962] InProcess lock provider to guard single writer process with async table operations (apache/hudi#4259)
[HUDI-3043] De-coupling multi writer tests (apache/hudi#4362)
[HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions (apache/hudi#4363)
[HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions ([HUDI-3029] Transaction manager - fixing the transaction owner reset and unit tests apache/hudi#4373)
[HUDI-3064] Fixing a bug in TransactionManager and FileSystemTestLock (apache/hudi#4372)
[HUDI-3054] Fixing default lock configs for FileSystemBasedLock and fixing a flaky test (apache/hudi#4374)
[MINOR] Azure CI IT tasks clean up ([HUDI-3034] Add option for parallel mvn install apache/hudi#4337)
[HUDI-3052] Fix flaky testJsonKafkaSourceResetStrategy (apache/hudi#4381)
[minor] fix NetworkUtils#getHostname ([HUDI-3050] fix NetworkUtils#getHostname apache/hudi#4355)
[HUDI-2970] Adding tests for archival of replace commit actions (apache/hudi#4268)
[HUDI-3064][HUDI-3054] FileSystemBasedLockProviderTestClass tryLock fix and TestHoodieClientMultiWriter test fixes (apache/hudi#4384)
remove unused import ([MINOR] remove unused import in HoodieFileIndex apache/hudi#4349)
[MINOR] Remove unused method in HoodieActiveTimeline (apache/hudi#4401)
[MINOR] Increasing CI timeout to 90 mins (apache/hudi#4407)
[HUDI-3070] Add rerunFailingTestsCount for flakly testes ([HUDI-3070] Add rerunFailingTestsCount for flakly testes apache/hudi#4398)
[HUDI-2970] Add test for archiving replace commit ([HUDI-2970] Add test for archiving replace commit apache/hudi#4345)
[HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields
[HUDI-3027] Update hudi-examples README.md ([HUDI-3027] Update hudi-examples README.md apache/hudi#4330)
[HUDI-3032] Do not clean the log files right after compaction for metadata table ([HUDI-3032] Do not clean the log files right after compaction for met… apache/hudi#4336)
[HUDI-2547] Schedule Flink compaction in service ([HUDI-2547] Schedule Flink compaction in service apache/hudi#4254)
[HUDI-3011] Adding ability to read entire data with HoodieIncrSource with empty checkpoint ([HUDI-3011] Adding ability to read entire data with HoodieIncrSource with empty checkpoint apache/hudi#4334)
[HUDI-3060] drop table for spark sql ([HUDI-3060] drop table for spark sql apache/hudi#4364)
[MINOR] Fix DedupeSparkJob typo ([MINOR] Fix DedupeSparkJob typo apache/hudi#4418)
[HUDI-3014] Add table option to set utc timezone ([HUDI-3014] add table option to set utc timezone apache/hudi#4306)
[MINOR] Remove unused method in HoodieActiveTimeline ([Minor] remove unused method in HoodieActiveTimeline apache/hudi#4435)
[HUDI-3101] Excluding compaction instants from pending rollback info ([HUDI-3101] Excluding compaction instants from pending rollback info apache/hudi#4443)
[HUDI-3102] Do not store rollback plan in inflight instant ([HUDI-3102] Do not store rollback plan in inflight instant apache/hudi#4445)
[HUDI-3099] Purge drop partition for spark sql ([HUDI-3099] Purge drop partition for spark sql apache/hudi#4436)
[HUDI-2374] Fixing AvroDFSSource does not use the overridden schema to deserialize Avro binaries ([HUDI-2374] Fixing AvroDFSSource does not use the overridden schema t… apache/hudi#4353)
[HUDI-3093] fix spark-sql query table that write with TimestampBasedKeyGenerator ([HUDI-3093] fix spark-sql query table that write with TimestampBasedK… apache/hudi#4416)
[HUDI-3106] Fix HiveSyncTool not sync schema ([HUDI-3106] Fix HiveSyncTool not sync schema apache/hudi#4452)
[HUDI-2811] Support Spark 3.2 ([HUDI-2811] Support Spark 3.2 apache/hudi#4270)
Fixing dynamoDbLockConfig required prop check ([HUDI-3098] Fixing dynamoDbLockConfig required prop check apache/hudi#4422)
[HUDI-2983] Remove Log4j2 transitive dependencies ([HUDI-2983] Remove Log4j2 transitive dependencies apache/hudi#4281)
[MINOR] HoodieInstantTimeGenerator improve method used ([MINOR] HoodieInstantTimeGenerator improve method used apache/hudi#4462)
[HUDI-3108] Fix Purge Drop MOR Table Cause error ([HUDI-3108] Fix Purge Drop MOR Table Cause error apache/hudi#4455)
Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ([HUDI-3043] Revert async cleaner leak commit to unblock CI failure apache/hudi#4343)" ([HUDI-2959] Reverting previous revert of HUDI-2959 Original PR fixed a leak in async service. Reverted due to CI flakiness. apache/hudi#4465)
[HUDI-3083] Support component data types for flink bulk_insert ([HUDI-3083] Support component data types for flink bulk_insert apache/hudi#4470)
[HUDI-2675] Fix the exception 'Not an Avro data file' when archive and clean ([HUDI-2675] Fix the exception 'Not an Avro data file' when archive and clean apache/hudi#4016)
[HUDI-3124] Bootstrap when timeline have completed instant ([HUDI-3124] Bootstrap when timeline have completed instant apache/hudi#4467)
[HUDI-1951] Add bucket hash index, compatible with the hive bucket ([HUDI-1951] Add bucket hash index, compatible with the hive bucket apache/hudi#3173)
[HUDI-3120] Cache compactionPlan in buffer ([HUDI-3120] Cache compactionPlan in buffer apache/hudi#4463)
[HUDI-3095] abstract partition filter logic to enable code reuse ([HUDI-3095] abstract partition filter logic to enable code reuse apache/hudi#4454)
[HUDI-3107]Fix HiveSyncTool drop partitions using JDBC or hivesql or hms ([HUDI-3107]Fix HiveSyncTool drop partitions using JDBC or hivesql or hms apache/hudi#4453)
[HUDI-3040] Fix HoodieSparkBootstrapExample error info for usage ([HUDI-3040] Fix HoodieSparkBootstrapExample error info for usage apache/hudi#4341)
[HUDI-3134] Fix insert error after adding columns on Spark 3.2.0 ([HUDI-3134] Fix insert error after adding columns on Spark 3.2.0 apache/hudi#4488)
[HUDI-3136] Fix merge/insert/show partitions error on Spark3.2 ([HUDI-3136] Fix merge/insert/show partitions error on Spark3.2 apache/hudi#4490)
[HUDI-3131] fix ctas error in spark3.1.1 ([HUDI-3131] fix ctas error in spark3.1.1 and 3.2.0 apache/hudi#4476)
[HUDI-3138] Fix broken UT test for TestHiveSyncTool.testDropPartitions ([HUDI-3138] Fix broken UT test for TestHiveSyncTool apache/hudi#4493)
[MINOR] Update README.md ([MINOR] Update README.md apache/hudi#4492)
[HUDI-2558] Fixing Clustering w/ sort columns with null values fails ([HUDI-2558] Fixing Clustering w/ sort columns with null values fails apache/hudi#4404)
[HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 ([HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 apache/hudi#4498)
Adding tests to validate different key generators ([HUDI-2590] Adding tests to validate different key generators apache/hudi#4473)
[HUDI-2774] Handle duplicate instants when fetching pending clustering plans ([HUDI-2774] Handle duplicate instants while fetching pending clustering plans apache/hudi#4118)
[HUDI-3141] Metadata merged log record reader - avoiding NullPointerException when reading records by keys ([HUDI-3141] Metadata merged log record reader - avoiding NullPointerException when reading records by keys apache/hudi#4505)
[HUDI-3147] Add endpoint_url to dynamodb lock provider ([HUDI-3147] Add endpoint_url to dynamodb lock provider apache/hudi#4500)
[HUDI-2966] Closing LogRecordScanner in compactor ([HUDI-2966] Closing LogRecordScanner in compactor apache/hudi#4478)
[HUDI-3171] Sync empty table to hive metastore ([HUDI-3171] Sync empty table to hive metastore apache/hudi#4511)
[HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled ([HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled apache/hudi#4512)
[HUDI-3168] Fixing null schema with empty commit in incremental relation ([HUDI-3168] Fixing null schema with empty commit in incremental relation apache/hudi#4513)
[HUDI-3132] Minor fixes for HoodieCatalog
Update HiveIncrementalPuller to configure filesystem (Update HiveIncrementalPuller.java apache/hudi#4431)
[HUDI-44] Adding support to preserve commit metadata for compaction ([HUDI-44] Adding support to preserve commit metadata for compaction apache/hudi#4428)
[HUDI-52] Enabling savepoint and restore for MOR table ([HUDI-52] Enabling savepoint and restore for MOR table apache/hudi#4507)
[HUDI-3165] Enabling InProcessLockProvider for all multi-writer tests instead of FileSystemBasedLockProviderTestClass ([HUDI-3165] Enabling InProcessLockProvider for all multi-writer tests instead of FileSystemBasedLockProviderTestClass apache/hudi#4427)
[MINOR] Remove unused methods in HoodieColumnProjectionUtils ([MINOR] unused method in HoodieColumnProjectionUtils removed apache/hudi#4408)
[HUDI-3118] Add default HUDI_DIR in setupKafka.sh ([HUDI-3118] Add default HUDI_DIR in setupKafka.sh apache/hudi#4460)
[HUDI-3183] Wrong result of HoodieArchivedTimeline loadInstants with TimeRangeFilter ([HUDI-3183] Wrong result of HoodieArchivedTimeline loadInstants with TimeRangeFilter apache/hudi#4521)
[HUDI-3100] Add config for hive conditional sync ([HUDI-3100] Add config for hive conditional sync apache/hudi#4440)
[HUDI-3188] Update quick start guide for Kafka Connect Sink for Hudi ([HUDI-3188] Update quick start guide for Kafka Connect Sink for Hudi apache/hudi#4527)
[MINOR] fix typos in DDLExecutor ([MINOR] fix typos in DDLExecutor apache/hudi#4534)
[HUDI-2947] Fixing checkpoint fetch in deltastreamer ([HUDI-2947] Fixing checkpoint fetch in deltastreamer apache/hudi#4485)
[HUDI-3185] HoodieConfig#getBoolean should return false when default not set ([HUDI-3185] HoodieConfig#getBoolean should return false when default … apache/hudi#4536)
[HUDI-3192] Spark metastore schema evolution broken ([HUDI-2682] Spark schema not updated with new columns on hive sync apache/hudi#4533)
[HUDI-3195] optimize spark3 pom and modify build command ([HUDI-3195] optimize spark3 pom and modify build command apache/hudi#4538)
[HUDI-2909] Handle logical type in TimestampBasedKeyGenerator ([HUDI-2909] Handle logical type in TimestampBasedKeyGenerator apache/hudi#4203)
[HUDI-3139] Shade htrace and parquet-avro in presto bundle ([HUDI-3139] Shade htrace and parquet-avro in presto bundle apache/hudi#4495)
[HUDI-3178] Fixing metadata table compaction so as to not include uncommitted data ([HUDI-3178] Fixing metadata table compaction so as to not include uncommitted data apache/hudi#4530)
[HUDI-3104] Kafka-connect support of hadoop config environments and properties ([HUDI-3104] Kafka-connect support hadoop config environments and properties apache/hudi#4451)
[HUDI-3125] spark-sql write timestamp directly ([HUDI-3125] spark-sql write timestamp directly apache/hudi#4471)
[MINOR] Fix some code style issues based on check-style plugin ([Minor]Fix some code style based on check-sytle plugin apache/hudi#4532)
[HUDI-3157] Remove aws jars from hudi bundles ([HUDI-3157] Remove aws jars from hudi bundles apache/hudi#4542)
[HUDI-3009] making some fixes to S3 incremental source ([HUDI-3009] making some fixes to S3 incremental source apache/hudi#4517)
[HUDI-3112] Fix KafkaConnect cannot sync to Hive Problem ([HUDI-3112] Fix KafkaConnect can not sync to Hive Problem apache/hudi#4458)
Removing rollbacks instants from timeline for restore operation ([HUDI-2477] Removing rollbacks instants from timeline for restore operation apache/hudi#4518)
[HUDI-3030] InProcessLockPovider as default when any async servcies enabled with no lock provider override ([HUDI-3030] InProcessLockPovider as default when any async servcies enabled with no lock provider override apache/hudi#4406)
[HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi ([HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi apache/hudi#4544)
[HUDI-3180] Include files from completed commits while bootstrapping metadata table ([HUDI-3180] Include files from completed commits while bootstrapping metadata table apache/hudi#4519)
[MINOR] Fix port number in setupKafka.sh ([MINOR] Fix port number in setupKafka.sh apache/hudi#4546)
[HUDI-3148] Create pushgateway client based on port ([HUDI-3148] Create pushgateway client based on port apache/hudi#4497)
[HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization ([HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization apache/hudi#4234)
Removing extraneous warn logs in ClusteringUtils ([HUDI-3158] Removing extraneous warn logs in ClusteringUtils apache/hudi#4553)
[HUDI-3195] Fix spark 3 pom ([HUDI-3195] Fix spark 3 pom apache/hudi#4554)
[HUDI-3211] Claim RFC number for RFC for Hudi Connector for Presto ([HUDI-3211][RFC-44] Claim RFC number for RFC for Hudi Connector for Presto apache/hudi#4562)
[MINOR] Remove unused static var in HoodieAvroWriteSupport ([MINOR] refactor HoodieAvroWriteSupport with remove unused static var apache/hudi#4543)
[HUDI-3094] Unify Hive's InputFormat implementations to avoid duplication ([HUDI-3094] Unify Hive's InputFormat implementations to avoid duplication apache/hudi#4417)
[HUDI-485] Corrected the check for incremental sql ([HUDI-485]: corrected the check for incremental sql apache/hudi#2768)
[HUDI-3184] hudi-flink support timestamp-micros ([HUDI-3184] hudi-flink support timestamp-micros apache/hudi#4548)
[MINOR] Fix typos ([MINOR] Fix typos apache/hudi#4567)
[HUDI-3045] New clustering regex match config to choose partitions when building clustering plan ([HUDI-3045] New clustering regex match config to choose partitions when building clustering plan apache/hudi#4346)
[HUDI-3233] Make metadata commit synchronous for flink batch
[HUDI-3225] Claim RFC-45 for async metadata indexing ([HUDI-3225] Claim RFC-45 for async metadata indexing apache/hudi#4569)
[HUDI-3235] Fix ClassNotFoundException due to log4j-core dependency ([HUDI-3235] Fix ClassNotFoundException due to log4j-core dependency apache/hudi#4574)
[HUDI-3007] Fix issues in HoodieRepairTool ([HUDI-3007] Fix issues in HoodieRepairTool apache/hudi#4564)
[HUDI-3010] Unbundle parquet-avro and shade other dependencies in prsto bundle ([HUDI-3010] Unbundle parquet-avro and shade hbase in presto-bundle apache/hudi#4551)
[MINOR] Disable flaky tests to unlock CI ([MINOR] Disable flaky tests to unlock CI apache/hudi#4592)
[HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation ([HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation apache/hudi#4514)
[MINOR] Fix local flaky test in TestFSUtils ([MINOR] Fix local flaky test in TestFSUtils apache/hudi#4596)
[HUDI-2785] Add Trino setup in Docker Demo ([HUDI-2785] Add Trino setup in Docker Demo apache/hudi#4300)
[HUDI-3198] Improve Spark SQL create table from existing hudi table ([HUDI-3198] Improve Spark SQL create table from existing hudi table apache/hudi#4584)
[MINOR] Optimize variable names and logs ([MINOR] Optimize variable names and logs apache/hudi#4581)
[MINOR] Remove org.apache.directory.api.util.Strings import (Fix class references to specific version dependent package apache/hudi#4601)
[HUDI-2968] add UT for update/delete on non-pk condition ([HUDI-2968] add UT for update/delete on non-pk condition apache/hudi#4568)
[MINOR] Delete unused parameter in TablePathUtils ([MINOR] Fix delete unused parameter in TablePathUtils apache/hudi#4595)
[HUDI-3179] Extracted common AbstractHoodieTableFileIndex to be shared across engines ([HUDI-3179] Extracted common AbstractHoodieTableFileIndex to be shared across engines apache/hudi#4520)
[HUDI-3257] Excluding clustering instants from pending rollback info ([HUDI-3257] Excluding clustering instants from pending rollback info apache/hudi#4616)
[HUDI-3194] fix MOR snapshot query during compaction ([HUDI-3194] fix MOR snapshot query (HIVE) during compaction apache/hudi#4540)
[HUDI-3252] Avoid creating empty requestedReplaceCommit in the startCommit method ([HUDI-3252] Avoid creating empty requestedReplaceCommit in the startCommit method apache/hudi#4515)
[HUDI-1558] Struct Stream Source Support Spark3 ([HUDI-1558] Struct Stream Source Support Spark3 apache/hudi#4586)
[MINOR] Minor improvement in JsonkafkaSource ([MINOR] Minor improvement in JsonkafkaSource apache/hudi#4620)
[HUDI-3261] Read rt table by hive cli throw NoSuchMethodError ([HUDI-3261] Query rt table by hive cli throw NoSuchMethodError apache/hudi#4624)
[HUDI-3263] Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE ([HUDI-3263] Do not nullify members in HoodieTableFileSystemView#reset… apache/hudi#4625)
[HUDI-2903] get table schema from the last commit with data written ([HUDI-2903] get table schema from the last commit with data written apache/hudi#4180)
[HUDI-3245] Convert uppercase letters to lowercase in storage configs ([HUDI-3245] Convert uppercase letters to lowercase in storage configs apache/hudi#4602)
[HUDI-3191] Rebasing Hive's FileInputFormat onto AbstractHoodieTableFileIndex ([HUDI-3191] Rebasing Hive's FileInputFormat onto AbstractHoodieTableFileIndex apache/hudi#4531)
[HUDI-2833][Design] Merge small archive files instead of expanding indefinitely. ([HUDI-2833][Design] Merge small archive files instead of expanding indefinitely. apache/hudi#4078)
[HUDI-3277] Filter non-parquet files in bootstrap procedure ([HUDI-3277] Filter non-parquet files in bootstrap procedure apache/hudi#4639)
[MINOR] Add instructions to build and upload Docker Demo images ([MINOR] Add instructions to build and upload Docker Demo images apache/hudi#4612)
[HUDI-3236] use fields'comments persisted in catalog to fill in schema ([HUDI-3236] use fields'comments persisted in catalog to fill in schema apache/hudi#4587)
[HUDI-3283] Bootstrap support overwrite existing table ([HUDI-3283] Bootstrap support overwrite existing table apache/hudi#4647)
[MINOR] Fix typo in the doc of BULK_INSERT_SORT_MODE ([MINOR] Fix typo in the doc of BULK_INSERT_SORT_MODE apache/hudi#4652)
[HUDI-3285] Drop unused method SparkBootstrapCommitActionExecutor#handleMetadataBootstrap ([HUDI-3285] Drop unused method SparkBootstrapCommitActionExecutor#han… apache/hudi#4653)
[HUDI-3250] Upgrade Presto docker image ([HUDI-3250] Upgrade Presto docker image apache/hudi#4646)
[HUDI-3281][Performance]Tuning performance of getAllPartitionPaths API in FileSystemBackedTableMetadata ([HUDI-3281][Performance]Tuning performance of getAllPartitionPaths API in FileSystemBackedTableMetadata apache/hudi#4643)
[HUDI-3271] Code optimization and clean up unused code in HoodieSparkSqlWriter ([HUDI-3271] Code optimization and clean up unused code in HoodieSparkSqlWriter apache/hudi#4631)
[HUDI-3268] Fix NPE while reading table with Spark datasource ([HUDI-3268] Fix NPE while reading table with Spark datasource apache/hudi#4630)
[minor] Fix hive-exec scope of flink bundle jar ([minor] Fix hive-exec scope of flink bundle jar apache/hudi#4664)
[HUDI-2837] Add support for using database name in incremental query ([HUDI-2837] Add support for using database name in incremental query apache/hudi#4083)
[HUDI-3262] Fixing utilities and integ test suite bundle to include hudi spark datasource ([HUDI-3262] Fixing utilities and integ test suite bundle to include hudi spark datasources apache/hudi#4670)
[HUDI-1850][HUDI-3234] Fixing read of a empty table but with failed write ([HUDI-1850][HUDI-3234] Fixing read of a empty table but with failed write apache/hudi#2903)
[HUDI-3282] Fix delete exception for Spark SQL when sync Hive ([HUDI-3282] Fix delete exception for Spark SQL when sync Hive apache/hudi#4644)
Use default value as null for hoodie.deltastreamer.source.s3incr.ignore.key.prefix


…ssed t/h properties file (apache#4090)

* Rebased `DFSPropertiesConfiguration` to access Hadoop config in lieu of FS to avoid confusion
* Fixed `readConfig` to take Hadoop's `Configuration` instead of FS; Fixing usages
* Added test for local FS access
* Rebase to use `FSUtils.getFs`
* Combine properties provided as a file along w/ overrides provided from the CLI
* Added helper utilities to `HoodieClusteringConfig`; Make sure corresponding config methods fallback to defaults;
* Fixed DeltaStreamer usage to respect properly combined configuration; Abstracted `HoodieClusteringConfig.from` convenience utility to init Clustering config from `Properties`
* Tidying up
* `lint`
* Reverting changes to `HoodieWriteConfig`
* Tidying up
* Fixed incorrect merge of the props
* Converted `HoodieConfig` to wrap around `Properties` into `TypedProperties`
* Fixed compilation
* Fixed compilation

…4060)

* `ZCurveOptimizeHelper` > `ZOrderingIndexHelper`; Moved Z-index helper under `hudi.index.zorder` package
* Tidying up `ZOrderingIndexHelper`
* Fixing compilation
* Fixed index new/original table merging sequence to always prefer values from new index; Cleaned up `HoodieSparkUtils`
* Added test for `mergeIndexSql`
* Abstracted Z-index name composition w/in `ZOrderingIndexHelper`;
* Fixed `DataSkippingUtils` to interrupt pruning in case data filter contains non-indexed column reference
* Properly handle exceptions originating during pruning in `HoodieFileIndex`
* Make sure no errors are logged upon encountering `AnalysisException`
* Cleaned up Z-index updating sequence; Tidying up comments, java-docs;
* Fixed Z-index to properly handle changes of the list of clustered columns
* Tidying up
* `lint`
* Suppressing `JavaDocStyle` first sentence check
* Fixed compilation
* Fixing incorrect `DecimalType` conversion
* Refactored test `TestTableLayoutOptimization`
  - Added Z-index table composition test (against fixtures)
  - Separated out GC test; Tidying up
* Fixed tests re-shuffling column order for Z-Index table `DataFrame` to align w/ the one loaded from JSON
* Scaffolded `DataTypeUtils` to do basic checks of Spark types; Added proper compatibility checking b/w old/new index-tables
* Added test for Z-index tables merging
* Fixed import being shaded by creating internal `hudi.util` package
* Fixed packaging for `TestOptimizeTable`
* Revised `updateMetadataIndex` seq to provide Z-index updating process w/ source table schema
* Make sure existing Z-index table schema is sync'd to source table's one
* Fixed shaded refs
* Fixed tests
* Fixed type conversion of Parquet provided metadata values into Spark expected schemas
* Fixed `composeIndexSchema` utility to propose proper schema
* Added more tests for Z-index:
  - Checking that Z-index table is built correctly
  - Checking that Z-index tables are merged correctly (during update)
* Fixing source table
* Fixing tests to read from Parquet w/ proper schema
* Refactored `ParquetUtils` utility reading stats from Parquet footers
* Fixed incorrect handling of Decimals extracted from Parquet footers
* Worked around issues in javac failing to compile stream's collection
* Fixed handling of `Date` type
* Fixed handling of `DateType` to be parsed as `LocalDate`
* Updated fixture; Make sure test loads Z-index fixture using proper schema
* Removed superfluous schema adjusting when reading from Parquet, since Spark is actually able to perfectly restore schema (given Parquet was previously written by Spark as well)
* Fixing race-condition in Parquet's `DateStringifier` trying to share `SimpleDateFormat` object which is inherently not thread-safe
* Tidying up
* Make sure schema is used upon reading to validate input files are in the appropriate format; Tidying up;
* Worked around javac (1.8) inability to infer expression type properly
* Updated fixtures; Tidying up
* Fixing compilation after rebase
* Assert clustering have in Z-order layout optimization testing
* Tidying up exception messages
* XXX
* Added test validating Z-index lookup filter correctness
* Added more test-cases; Tidying up
* Added tests for string expressions
* Fixed incorrect Z-index filter lookup translations
* Added more test-cases
* Added proper handling on complex negations of AND/OR expressions by pushing NOT operator down into inner expressions for appropriate handling
* Added `-target:jvm-1.8` for `hudi-spark` module
* Adding more tests
* Added tests for non-indexed columns
* Properly handle non-indexed columns by falling back to a re-write of containing expression as `TrueLiteral` instead
* Fixed tests
* Removing the parquet test files and disabling corresponding tests

Co-authored-by: Vinoth Chandar <vinoth@apache.org>

…4112)

- Changes the default config of marker type (HoodieWriteConfig.MARKERS_TYPE or hoodie.write.markers.type) from DIRECT to TIMELINE_SERVER_BASED for Spark Engine.
- Adds engine-specific marker type configs: Spark -> TIMELINE_SERVER_BASED, Flink -> DIRECT, Java -> DIRECT.
- Uses DIRECT markers as well for Spark structured streaming, because the timeline server is only available for the first mini-batch.
- Fixes the marker creation method for non-partitioned table in TimelineServerBasedWriteMarkers.
- Adds the fallback to direct markers even when TIMELINE_SERVER_BASED is configured, in WriteMarkersFactory: when HDFS is used, or embedded timeline server is disabled, the fallback to direct markers happens.
- Fixes the closing of timeline service.
- Fixes tests that depend on markers, mainly by starting the timeline service for each test.

…ync (apache#4129)

* Fix README with current limitations of hive sync
* Fix README with current limitations of hive sync
* Fix dep issue
* Fix Copy on Write flow

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>

umehrot2 and others added 30 commits November 25, 2021 13:33

[HUDI-2801] Add Amazon CloudWatch metrics reporter (apache#4081)

e0125a7

[HUDI-2005] Removing direct fs call in HoodieLogFileReader (apache#3865)

8340ccb

[HUDI-2851] Shade org.apache.hadoop.hive.ql.optimizer package for fli…

38585e4

…nk bundle jar (apache#4104)

[MINOR] Include hudi-aws in flink bundle jar (apache#4127)

f5da9b5

HUDI-2801 makes this jar as required.

[HUDI-2852] Table metadata returns empty for non-exist partition (apa…

e554c7f

…che#4117) * [HUDI-2852] Table metadata returns empty for non-exist partition * add unit test * fix code checkstyle Co-authored-by: wangminchao <wangminchao@asinking.com>

[HUDI-2863] Rename option 'hoodie.parquet.page.size' to 'write.parque…

e9efbdb

…t.page.size' (apache#4128)

[HUDI-2850] Fixing Clustering CLI - schedule and run command fixes to…

3d75aca

… avoid NumberFormatException (apache#4101)

[MINOR] Fixing test failure to fix CI build failure (apache#4132)

a88691f

[HUDI-2861] Re-use same rollback instant time for failed rollbacks (a…

f8e0176

…pache#4123)

[HUDI-2845] Metadata CLI - files/partition file listing fix and new v…

445208a

…alidate option (apache#4092) - Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>

[HUDI-2848] Excluse guava from hudi-cli pom (apache#4100)

8402cac

[HUDI-2856] Bit cask disk map delete modified (apache#4116)

257a6a7

* modified BitCaskDiskMap_close_function * change iterators location to finally * Update BitCaskDiskMap.java

[MINOR] Follow ups from HUDI-2861 (re-use same rollback instant for f…

9c059ef

…ailed rollback) (apache#4133)

[HUDI-2868] Fix skipped HoodieSparkSqlWriterSuite (apache#4125)

3a8d64e

- Co-authored-by: Yann Byron <biyan900116@gmail.com>

[HUDI-2475] [HUDI-2862] Metadata table creation and avoid bootstrappi…

2c7656c

…ng race for write client & add locking for upgrade (apache#4114) Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>

[HUDI-2102] Support hilbert curve for hudi (apache#3952)

780a2ac

Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>

Moving to 0.11.0-SNAPSHOT on master branch.

a1d0ff4

[MINOR] fix typo (apache#4140)

eca1693

[MINOR] Fixing integ test suite for hudi-aws and archival validation (a…

52aae36

…pache#4142)

Removing rfc from release package and fixing release validation script (

38e75ea

apache#4147)

[MINOR] Fix syntax error in create_source_release.sh (apache#4150)

536af4b

[MINOR] Fix typo,rename 'getUrlEncodePartitoning' to 'getUrlEncodePar…

3433f00

…titioning' (apache#4130)

[HUDI-2642] Add support ignoring case in update sql operation (apache…

a398aad

…#3882)

[HUDI-2891] Fix write configs for Java engine in Kafka Connect Sink (a…

ea009b5

…pache#4161)

Revert "[HUDI-2855] Change the default value of 'PAYLOAD_CLASS_NAME' …

24380c2

…to 'DefaultHoodieRecordPayload' (apache#4115)" (apache#4169) This reverts commit 88067f5.

Revert "[HUDI-2856] Bit cask disk map delete modified (apache#4116)" (a…

9b254b6

…pache#4171) This reverts commit 257a6a7.

Timzhang01 and others added 29 commits January 16, 2022 22:24

[MINOR] Delete unused parameter in TablePathUtils (apache#4595)

ed92c21

Co-authored-by: zhangxiaotian13 <zhangxiaotian13@jd.com>

[HUDI-3179] Extracted common AbstractHoodieTableFileIndex to be sha…

75caa7d

…red across engines (apache#4520)

[HUDI-3257] Excluding clustering instants from pending rollback info (a…

36a9f63

…pache#4616)

[HUDI-3194] fix MOR snapshot query during compaction (apache#4540)

d365337

[HUDI-3252] Avoid creating empty requestedReplaceCommit in the startC…

20e7983

…ommit method (apache#4515)

[HUDI-1558] Struct Stream Source Support Spark3 (apache#4586)

f184474

Co-authored-by: Hui An <hui.an@shopee.com>

[MINOR] Minor improvement in JsonkafkaSource (apache#4620)

3d93e85

[HUDI-3261] Read rt table by hive cli throw NoSuchMethodError (apache…

3b56320

…#4624)

[HUDI-3263] Do not nullify members in HoodieTableFileSystemView#reset…

45f054f

…ViewState to avoid NPE (apache#4625)

[HUDI-2903] get table schema from the last commit with data written (a…

a09c231

…pache#4180)

[HUDI-3245] Convert uppercase letters to lowercase in storage configs (…

caeea94

…apache#4602)

[HUDI-3191] Rebasing Hive's FileInputFormat onto `AbstractHoodieTable…

4bea758

…FileIndex` (apache#4531)

[HUDI-2833][Design] Merge small archive files instead of expanding in…

7647562

…definitely. (apache#4078) Co-authored-by: yuezhang <yuezhang@freewheel.tv>

[HUDI-3277] Filter non-parquet files in bootstrap procedure (apache#4639

db93ad2

)

[MINOR] Add instructions to build and upload Docker Demo images (apac…

a08a2b7

…he#4612) * [MINOR] Add instructions to build and upload Docker Demo images * Add local test instruction

[HUDI-3236] use fields'comments persisted in catalog to fill in schema (

31b57a2

apache#4587)

[HUDI-3283] Bootstrap support overwrite existing table (apache#4647)

b7a79aa

[MINOR] Fix typo in the doc of BULK_INSERT_SORT_MODE (apache#4652)

14d08bb

[HUDI-3285] Drop unused method SparkBootstrapCommitActionExecutor#han…

a66004a

…dleMetadataBootstrap (apache#4653)

[HUDI-3250] Upgrade Presto docker image (apache#4646)

2071e3b

[HUDI-3281][Performance]Tuning performance of getAllPartitionPaths AP…

79bf6ab

…I in FileSystemBackedTableMetadata (apache#4643) Co-authored-by: yuezhang <yuezhang@freewheel.tv>

[HUDI-3271] Code optimization and clean up unused code in HoodieSpark…

8547f11

…SqlWriter (apache#4631)

[HUDI-3268] Fix NPE while reading table with Spark datasource (apache…

4b90850

…#4630)

[minor] Fix hive-exec scope of flink bundle jar (apache#4664)

64b1426

[HUDI-2837] Add support for using database name in incremental query (a…

56cd8ff

…pache#4083)

[HUDI-3262] Fixing utilities and integ test suite bundle to include h…

e72553a

…udi spark datasource (apache#4670)

[HUDI-1850][HUDI-3234] Fixing read of a empty table but with failed w…

f7a7796

…rite (apache#2903)

[HUDI-3282] Fix delete exception for Spark SQL when sync Hive (apache…

cfde45b

…#4644)

Use default value as null for hoodie.deltastreamer.source.s3incr.igno…

3b1e11b

…re.key.prefix

vinishjail97 closed this Jan 24, 2022

vinishjail97 wants to merge 337 commits into nsivabalan:master from vinishjail97:FixIgnoreKey

20 participants
