Closed
51 commits
29a53c6
Implement RockDbBasedMap as an alternate to DiskBasedMap in SpillableMap
Jun 19, 2021
e41f13f
[MINOR] Put Azure cache tasks first (#3118)
xushiyan Jun 20, 2021
429e9fb
[HUDI-1248] Increase timeout for deltaStreamerTestRunner in TestHoodi…
codope Jun 21, 2021
adf1679
[HUDI-2049] StreamWriteFunction should wait for the next inflight ins…
danny0405 Jun 21, 2021
f8d9242
[HUDI-2050] Support rollback inflight compaction instances for batch …
swuferhong Jun 21, 2021
4fd8a88
[HUDI-1776] Support AlterCommand For Hoodie (#3086)
Jun 21, 2021
93dad05
Fix iterator for rocks db
Jun 21, 2021
cb5cd35
[HUDI-2043] HoodieDefaultTimeline$filterPendingCompactionTImeline() m…
swuferhong Jun 22, 2021
7bd517a
[HUDI-2031] JVM occasionally crashes during compaction when spark spe…
marin-ma Jun 22, 2021
5db37c2
[HUDI-2047] Ignore FileNotFoundException in WriteProfiles #getWritePa…
yuzhaojing Jun 22, 2021
69c0d9e
[HUDI-1883] Support Truncate Table For Hoodie (#3098)
Jun 22, 2021
062d5ba
[HUDI-2013] Removed option to fallback to file listing when Metadata …
prashantwason Jun 22, 2021
11e64b2
[HUDI-1717] Metadata Reader should merge all the un-synced but comple…
prashantwason Jun 22, 2021
3fb59dd
[HUDI-1988] FinalizeWrite() been executed twice in AbstractHoodieWrit…
swuferhong Jun 23, 2021
2687eab
[HUDI-2054] Remove the duplicate name for flink write pipeline (#3135)
danny0405 Jun 23, 2021
43b9c1f
[HUDI-1826] Add ORC support in HoodieSnapshotExporter (#3130)
vaibhav-sinha Jun 23, 2021
380518e
[HUDI-2038] Support rollback inflight compaction instances for Compac…
yuzhaojing Jun 23, 2021
dd248a3
Fix checkstyle issues
Jun 23, 2021
e039e0f
[HUDI-2064] Fix TestHoodieBackedMetadata#testOnlyValidPartitionsAdded…
leesf Jun 23, 2021
7e50f9a
[HUDI-2061] Incorrect Schema Inference For Schema Evolved Table (#3137)
Jun 24, 2021
84dd3ca
[HUDI-2053] Insert Static Partition With DateType Return Incorrect P…
Jun 24, 2021
b328555
[HUDI-2069] Fix KafkaAvroSchemaDeserializer to not rely on reflection…
sbernauer Jun 24, 2021
218f2a6
[HUDI-2062] Catch FileNotFoundException in WriteProfiles #getCommitMe…
yuzhaojing Jun 25, 2021
e64fe55
[HUDI-2068] Skip the assign state for SmallFileAssign when the state …
danny0405 Jun 25, 2021
0fb8556
Add ability to provide multi-region (global) data consistency across …
s-sanjay Jun 25, 2021
23dbc09
[MINOR] Removing un-used files and references (#3150)
n3nash Jun 25, 2021
ed1a5da
[HUDI-2060] Added tests for KafkaOffsetGen (#3136)
veenaypatil Jun 25, 2021
f73bedd
[MINOR] Remove unused methods (#3152)
wangxianghu Jun 26, 2021
e99a6b0
[HUDI-2073] Fix the bug of hoodieClusteringJob never quit (#3157)
zhangyue19921010 Jun 27, 2021
d24341d
[HUDI-2074] Use while loop instead of recursive call in MergeOnReadIn…
danny0405 Jun 28, 2021
9e61dad
[MINOR] Drop duplicate keygenerator class configuration setting (#3167)
wangxianghu Jun 28, 2021
34fc8a8
[HUDI-2067] Sync FlinkOptions config to FlinkStreamerConfig (#3151)
veenaypatil Jun 28, 2021
039aeb6
[HUDI-1910] Commit Offset to Kafka after successful Hudi commit (#3092)
veenaypatil Jun 28, 2021
37b7c65
[HUDI-2084] Resend the uncommitted write metadata when start up (#3168)
yuzhaojing Jun 29, 2021
0749cc8
[HUDI-2081] Move schema util tests out from TestHiveSyncTool (#3166)
xushiyan Jun 29, 2021
b8a8f57
[HUDI-2094] Supports hive style partitioning for flink writer (#3178)
danny0405 Jun 29, 2021
5a7d1b3
[HUDI-2097] Fix Flink unable to read commit metadata error (#3180)
swuferhong Jun 29, 2021
f665db0
[HUDI-2085] Support specify compaction paralleism and compaction targ…
swuferhong Jun 29, 2021
6d4b556
Address reviewer comments
Jun 29, 2021
72a3594
Address reviewer comments
Jun 30, 2021
202887b
[HUDI-2092] Fix NPE caused by FlinkStreamerConfig#writePartitionUrlEn…
wangxianghu Jun 30, 2021
5564c7e
[HUDI-2006] Adding more yaml templates to test suite (#3073)
nsivabalan Jun 30, 2021
1cbf43b
[HUDI-2103] Add rebalance before index bootstrap (#3185)
yuzhaojing Jun 30, 2021
94f0f40
[HUDI-1944] Support Hudi to read from committed offset (#3175)
veenaypatil Jun 30, 2021
07e93de
[HUDI-2052] Support load logFile in BootstrapFunction (#3134)
yuzhaojing Jun 30, 2021
91427f3
Implement RockDbBasedMap as an alternate to DiskBasedMap in SpillableMap
Jun 19, 2021
26ed037
Fix iterator for rocks db
Jun 21, 2021
6ca64ba
Fix checkstyle issues
Jun 23, 2021
4eb25d0
Address reviewer comments
Jun 29, 2021
e45577b
Address reviewer comments
Jun 30, 2021
231b679
Merge branch 'rm_rocks_db' of https://github.com/rmahindra123/hudi in…
Jun 30, 2021
12 changes: 6 additions & 6 deletions azure-pipelines.yml
@@ -34,8 +34,6 @@ stages:
jobs:
- job: unit_tests_spark_client
steps:
- script: |
mvn $(MAVEN_OPTS) clean install -DskipTests
- task: Cache@2
inputs:
key: 'maven | "$(Agent.OS)" | **/pom.xml'
@@ -44,6 +42,8 @@ stages:
maven
path: $(MAVEN_CACHE_FOLDER)
displayName: Cache Maven local repo
- script: |
mvn $(MAVEN_OPTS) clean install -DskipTests
- task: Maven@3
inputs:
mavenPomFile: 'pom.xml'
@@ -58,8 +58,6 @@ stages:
mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
- job: unit_tests_utilities
steps:
- script: |
mvn $(MAVEN_OPTS) clean install -DskipTests
- task: Cache@2
inputs:
key: 'maven | "$(Agent.OS)" | **/pom.xml'
@@ -68,6 +66,8 @@ stages:
maven
path: $(MAVEN_CACHE_FOLDER)
displayName: Cache Maven local repo
- script: |
mvn $(MAVEN_OPTS) clean install -DskipTests
- task: Maven@3
inputs:
mavenPomFile: 'pom.xml'
@@ -82,8 +82,6 @@ stages:
mavenOptions: '-Xmx2g $(MAVEN_OPTS)'
- job: unit_tests_other_modules
steps:
- script: |
mvn $(MAVEN_OPTS) clean install -DskipTests
- task: Cache@2
inputs:
key: 'maven | "$(Agent.OS)" | **/pom.xml'
@@ -92,6 +90,8 @@ stages:
maven
path: $(MAVEN_CACHE_FOLDER)
displayName: Cache Maven local repo
- script: |
mvn $(MAVEN_OPTS) clean install -DskipTests
- task: Maven@3
inputs:
mavenPomFile: 'pom.xml'
Original file line number Diff line number Diff line change
@@ -13,6 +13,8 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# yaml to test clustering
dag_name: NAME-clustering.yaml
dag_rounds: clustering_num_iterations
dag_intermittent_delay_mins: clustering_delay_in_mins
Original file line number Diff line number Diff line change
@@ -13,8 +13,11 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Long-running test suite that cleans up input after every round of a DAG, so validation
# covers only one round of the DAG each time (the input has already been cleaned up).
dag_name: NAME-long-running-multi-partitions.yaml
-dag_rounds: num_iterations
+dag_rounds: long_num_iterations
dag_intermittent_delay_mins: delay_in_mins
dag_content:
first_insert:
@@ -82,7 +85,7 @@ dag_content:
deps: second_hive_sync
last_validate:
config:
-execute_itr_count: 50
+execute_itr_count: long_num_iterations
validate_clean: true
validate_archival: true
type: ValidateAsyncOperations
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Long running test suite which validates entire input after every dag. Input accumulates and so validation
# happens for entire dataset.
dag_name: NAME-long-running-multi-partitions.yaml
dag_rounds: medium_num_iterations
dag_intermittent_delay_mins: delay_in_mins
dag_content:
first_insert:
config:
record_size: 1000
num_partitions_insert: 5
repeat_count: 1
num_records_insert: 1000
type: InsertNode
deps: none
second_insert:
config:
record_size: 1000
num_partitions_insert: 50
repeat_count: 1
num_records_insert: 10000
deps: first_insert
type: InsertNode
third_insert:
config:
record_size: 1000
num_partitions_insert: 2
repeat_count: 1
num_records_insert: 300
deps: second_insert
type: InsertNode
first_hive_sync:
config:
queue_name: "adhoc"
engine: "mr"
type: HiveSyncNode
deps: third_insert
first_validate:
config:
validate_hive: true
type: ValidateDatasetNode
deps: first_hive_sync
first_upsert:
config:
record_size: 1000
num_partitions_insert: 2
num_records_insert: 300
repeat_count: 1
num_records_upsert: 100
num_partitions_upsert: 1
type: UpsertNode
deps: first_validate
first_delete:
config:
num_partitions_delete: 50
num_records_delete: 8000
type: DeleteNode
deps: first_upsert
second_hive_sync:
config:
queue_name: "adhoc"
engine: "mr"
type: HiveSyncNode
deps: first_delete
second_validate:
config:
validate_hive: true
delete_input_data: false
type: ValidateDatasetNode
deps: second_hive_sync
last_validate:
config:
execute_itr_count: medium_num_iterations
validate_clean: true
validate_archival: true
type: ValidateAsyncOperations
deps: second_validate
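Reviewer note: the `deps` fields in the template above form a single linear chain from `first_insert` to `last_validate`. The sketch below is only an illustration of how that ordering can be checked; it is not how the Hudi test suite schedules nodes. The `TEMPLATE_DEPS` mapping and the `execution_order` helper are hypothetical names, copied or invented here for the example, and the code assumes each node lists exactly one dependency or `none`, as in this template.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Node name -> value of its `deps` field, copied by hand from the template above.
TEMPLATE_DEPS = {
    "first_insert": "none",
    "second_insert": "first_insert",
    "third_insert": "second_insert",
    "first_hive_sync": "third_insert",
    "first_validate": "first_hive_sync",
    "first_upsert": "first_validate",
    "first_delete": "first_upsert",
    "second_hive_sync": "first_delete",
    "second_validate": "second_hive_sync",
    "last_validate": "second_validate",
}


def execution_order(deps_by_node):
    """Return node names in an order that respects every `deps` entry.

    Assumption: each node has a single dependency, or the string "none".
    """
    graph = {
        node: ([] if dep == "none" else [dep])
        for node, dep in deps_by_node.items()
    }
    # TopologicalSorter expects a mapping of node -> predecessors.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(" -> ".join(execution_order(TEMPLATE_DEPS)))
```

A misspelled `deps` value would show up here as an extra, unexpected node in the output, and a circular dependency would raise `graphlib.CycleError`.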
83 changes: 83 additions & 0 deletions docker/demo/config/test-suite/templates/sanity.yaml.template
@@ -0,0 +1,83 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Sanity yaml to test simple operations.
dag_name: NAME-sanity.yaml
dag_rounds: 1
dag_intermittent_delay_mins: delay_in_mins
dag_content:
first_insert:
config:
record_size: 1000
num_partitions_insert: 5
repeat_count: 1
num_records_insert: 1000
type: InsertNode
deps: none
second_insert:
config:
record_size: 1000
num_partitions_insert: 50
repeat_count: 1
num_records_insert: 10000
deps: first_insert
type: InsertNode
third_insert:
config:
record_size: 1000
num_partitions_insert: 2
repeat_count: 1
num_records_insert: 300
deps: second_insert
type: InsertNode
first_hive_sync:
config:
queue_name: "adhoc"
engine: "mr"
type: HiveSyncNode
deps: third_insert
first_validate:
config:
validate_hive: true
type: ValidateDatasetNode
deps: first_hive_sync
first_upsert:
config:
record_size: 1000
num_partitions_insert: 2
num_records_insert: 300
repeat_count: 1
num_records_upsert: 100
num_partitions_upsert: 1
type: UpsertNode
deps: first_validate
first_delete:
config:
num_partitions_delete: 50
num_records_delete: 8000
type: DeleteNode
deps: first_upsert
second_hive_sync:
config:
queue_name: "adhoc"
engine: "mr"
type: HiveSyncNode
deps: first_delete
second_validate:
config:
validate_hive: true
type: ValidateDatasetNode
deps: second_hive_sync
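Reviewer note: the templates in this diff use bare placeholder tokens (`NAME`, `delay_in_mins`, `long_num_iterations`, `medium_num_iterations`) that have to be filled in before the YAML is usable. The diff does not show how they are rendered, so the following is only a minimal sketch under that assumption. `render_template` and the sample values are hypothetical. Whole-word matching is used so that a token such as `num_iterations` would not be replaced inside `long_num_iterations`.

```python
import re


def render_template(text, values):
    """Replace whole-word placeholder tokens in `text` with their values.

    `values` maps a placeholder token to its replacement. Longer tokens are
    tried first, and \\b keeps a token from matching inside a longer
    identifier (underscores count as word characters).
    """
    tokens = sorted(values, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in tokens) + r")\b")
    return pattern.sub(lambda match: str(values[match.group(1)]), text)


# First lines of the sanity template above, used as sample input.
SANITY_HEADER = (
    "dag_name: NAME-sanity.yaml\n"
    "dag_rounds: 1\n"
    "dag_intermittent_delay_mins: delay_in_mins\n"
)

if __name__ == "__main__":
    # "demo" and 10 are made-up example values.
    print(render_template(SANITY_HEADER, {"NAME": "demo", "delay_in_mins": 10}))
```

Note that the key `dag_intermittent_delay_mins` is left alone: it does not contain the token `delay_in_mins` as a whole word.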