Skip to content

Conversation

@yuzhaojing
Copy link
Contributor

@yuzhaojing yuzhaojing commented Jun 21, 2022

prep step #6732

@vinothchandar
Copy link
Member

vinothchandar commented Jul 20, 2022

@yuzhaojing - I have given @prasannarajaperumal full context on this. and he will take over this. and I ll check-in/chime in as well as needed

@yuzhaojing
Copy link
Contributor Author

@yuzhaojing - I have given @prasannarajaperumal full context on this. and he will take over this. and I ll check-in/chime in as well as needed

Ok, will communicate with him.

@yuzhaojing
Copy link
Contributor Author

yuzhaojing commented Aug 2, 2022

@prasannarajaperumal Can you take a look for this pr?

@xushiyan xushiyan self-assigned this Aug 15, 2022
Copy link
Contributor

@prasannarajaperumal prasannarajaperumal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a more general note - how are these API calls triggered? TableManagementService would ideally monitor all the Hudi datasets and keep track of the various stats and have configurations to trigger each of these table management actions. Are you planning on doing this as a follow up?


package org.apache.hudi.table.management.entity;

public enum InstanceStatus {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be we can split State (SCHEDULED, RUNNING, COMPLETED) and Status (SUCESS, FAIL). Completed state can either be a success or fail.

break;
case FLINK:
default:
throw new IllegalStateException("Unexpected value: " + instance.getExecutionEngine());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Engine {0} not supported


@Override
public Map<String, String> getJobParams(Instance instance) {
Map<String, String> sparkParams = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could possibly register a execution engine and config with a seperate API endpoint and use that engineConfig as an input to run a table service. ServiceConfig is not generalized to any engine today - would rather get this as a input through the rest API.

@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Instance {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name for this would TableManagementAction? Also rename Action as ActionType


import java.util.List;

public class RelationDBBasedStore implements MetadataStore {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we want this to be built over the HudiTimeline persistence of RFC-36 (Metastore server). These 2 RFC's have persistence layer shared. I am okay to track this as an action item and proceed here.

<version>1.7.25</version>
</dependency>

<dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check on the license of this third-party? Seems like apache 2.0 (https://github.com/brettwooldridge/HikariCP/blob/dev/LICENSE) but want to make sure.

@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Instance {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaDoc and comments on the fields used here will be helpful.

@yihua yihua added priority:critical Production degraded; pipelines stalled and removed priority:high Significant impact; potential bugs labels Sep 13, 2022
@yuzhaojing yuzhaojing force-pushed the HUDI-3475 branch 3 times, most recently from 4fbf616 to 9c63087 Compare January 5, 2023 13:56
Copy link
Member

@xushiyan xushiyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roughly went through the code base and commented about some improvements. will continue review tmr

import java.util.Calendar;
import java.util.Date;

public class DateTimeUtils {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we consolidate this with org.apache.hudi.common.util.DateTimeUtils ? and pls add some UTs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

import java.sql.PreparedStatement;
import java.util.stream.Collectors;

public class SqlSessionFactoryUtil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we consolidate this with org.apache.hudi.metaserver.store.jdbc.SqlSessionFactoryUtils ? and more UTs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the follow-up PR, hudi-platform-common will be extracted for unification, let us follow up.

}

public static void main(String[] args) throws Exception {
System.out.println("SPARK_HOME = " + System.getenv("SPARK_HOME"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you review all occurrence of system.out/err.print and replace with logger ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@Override
public void handle(@NotNull Context context) throws Exception {
boolean success = true;
long beginTs = System.currentTimeMillis();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use org.apache.hudi.common.util.HoodieTimer as a standard for compute code execution time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

import org.apache.ibatis.session.RowBounds;
import org.apache.ibatis.session.SqlSession;

public class JdbcMapper {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should standardize jdbc interaction, at least for metaserver and TSM. We can have a follow up jira for this - having a module hudi-platform-service/hudi-platform-common for common components & classes maybe ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

public void updateExecutionInfo(Instance instance) {
int retryNum = 0;
try {
while (retryNum++ < 3) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use retryhelper here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update it.


void init();

void startService();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just call it start() ? . :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.


import java.util.Map;

public abstract class ExecutionEngine {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please review all new classes for TSM models and add javadoc to explain the use case ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add doc.


import lombok.Getter;

import java.util.Date;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls avoid all java.util.Date, which is not thread safe. let's change all to java.time.* instead. Also for timestamp, can you see if better with all Long type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@Parameter(names = {"-instance-submit-timeout-sec"}, description = "Instance Submit Timeout Sec")
public Integer instanceSubmitTimeoutSec = 600;

@Parameter(names = {"-spark-submit-jar-path"}, description = "Spark Submit Jar Path")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is spark-specific? it's under common/CommandConfig so it better be engine agnostic.

@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

<result column="update_time" property="updateTime" javaType="java.util.Date"/>
</resultMap>

<update id="createInstance">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be misunderstood with saveInstance below, maybe we can call it createInstanceTable?

@Parameter(names = {"-instance-submit-timeout-sec"}, description = "Instance Submit Timeout Sec")
public Integer instanceSubmitTimeoutSec = 600;

@Parameter(names = {"-spark-submit-jar-path"}, description = "Spark Submit Jar Path")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does spark-submit-jar refer to hudi-cli? Its parameters seem to be

sparkLauncher.addAppArgs("COMPACT_RUN", master, sparkMemory, instance.getBasePath(), 
    instance.getTableName(), instance.getInstant(), parallelism, "", maxRetryNum, "")

sparkLauncher.setConf(entry.getKey(), entry.getValue());
}

sparkLauncher.addSparkArg("--queue", instance.getQueue());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use spark.yarn.queue and put it in the jobParams above?

// sparkLauncher.addAppArgs(SparkCommand.COMPACT_RUN.toString(), master, sparkMemory, client.getBasePath(),
// client.getTableConfig().getTableName(), compactionInstantTime, parallelism, schemaFilePath,
// retry, propsFilePath);
sparkLauncher.addAppArgs("COMPACT_RUN", master, sparkMemory, instance.getBasePath(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COMPACT_RUN might be set as a parameter


private int maxRetry;

private Date queryStartTime = DateTimeUtils.addDay(-3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this -3 mean ~

throw new RuntimeException("Instance not exist: " + instance);
}
// 2. update status
metadataStore.updateStatus(instance);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like no changes have been made here

<resultMap type="org.apache.hudi.table.service.manager.entity.Instance" id="InstanceMapping">
<result column="id" property="id" javaType="java.lang.Long"/>
<result column="db_name" property="dbName"/>
<result column="table_name" property="tableName"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the tbl table canceled? I noticed that tbl_id is used here in rfc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

big-needle-movers priority:critical Production degraded; pipelines stalled size:XL PR with lines of changes > 1000

Projects

Status: 🏗 Under discussion

Development

Successfully merging this pull request may close these issues.

7 participants