Skip to content

Conversation

@ghost
Copy link

@ghost ghost commented Oct 28, 2021

Fixes #1720

Motivation

Introduce the basic framework for a workflow based on flinksql.

Modifications

Add basic flinksql executor.

Verifying this change

  • Make sure that the change passes the CI checks.

Demo

Try a demo, please run org.apache.inlog.sort.examples.BasicDemoExample.

One source, insert into two sink.

image

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@codecov-commenter
Copy link

Codecov Report

Merging #1721 (88a2d75) into master (fe45b53) will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1721      +/-   ##
============================================
- Coverage     12.21%   12.19%   -0.03%     
- Complexity     1048     1049       +1     
============================================
  Files           392      392              
  Lines         32755    32755              
  Branches       5159     5159              
============================================
- Hits           4001     3993       -8     
- Misses        27989    27997       +8     
  Partials        765      765              
Impacted Files Coverage Δ
.../java/org/apache/flume/sink/tubemq/TubemqSink.java 51.42% <0.00%> (-4.00%) ⬇️
...bemq/server/common/heartbeat/HeartbeatManager.java 36.36% <0.00%> (-2.03%) ⬇️
.../producer/qltystats/DefaultBrokerRcvQltyStats.java 45.70% <0.00%> (+0.39%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe45b53...88a2d75. Read the comment docs.

@gosonzhang
Copy link
Contributor

This PR can make Sort's flow information more obvious, but I personally think it might be better to put it under the sort module as a sub-module of Sort.

How about this PR? @ifndef-SleePy

@ghost
Copy link
Author

ghost commented Oct 29, 2021

hi @gosonzhang @dockerzhang, thanks for review

The key point is we need use high flink version(e.g flink-1.13.x)

  • Flink1.9.X can not parse sql, we need custom the sql parser ourself.
  • Flink1.9.X api can not compatible with flink-1.13.x
  • Before flink-1.13.x, we can not assign different parallelism for different sink
  • and so on

In general, if we use old flink version, we will spend most time on flink framework instead of flinksql workflow job, on the other hand, we will not expose flink to end users.

And, I had tried put the basic framework to the sort module and upgrade the flink version in sort module, but met many errors.

@dockerzhang
Copy link
Contributor

@leo65535 Flink will not be updated (from 1.9.x->1.13.x) recently, but we have noted this potential problem. I think you could continue to complete your task about this issue at your branch, we could move forward to merge after flink be updated in the future.

@dockerzhang
Copy link
Contributor

dockerzhang commented Nov 1, 2021

@leo65535 how about using inlong-sort-sql instead of inlong-sort-enhance, and inlong-sort-sql be a submodule of inlong-sort?
too many modules for InLong is not a good choice, it will be difficult to manage.

@ghost
Copy link
Author

ghost commented Nov 1, 2021

how about using inlong-sort-sql instead of inlong-sort-enhance

Make sense.

Flink will not be updated (from 1.9.x->1.13.x) recently, but we have noted this potential problem.

If not update the flink version, we can not go on, because flink1.9.x has many limits, like

  • can not parse flinksql statement
  • can not support sink parallelism
  • can not work well with iceberg/hudi
  • ...

We can refer to other projects, already upgraded the flink version.

apache/hudi#3291
apache/iceberg#3116

Can we use a new module first with new flink version, and keep working on it, so we can support flinksql workflow quickly.
And we can try to upgrade the flink old version in origin sort module at the same time?

@ghost
Copy link
Author

ghost commented Nov 1, 2021

Flink will not be updated (from 1.9.x->1.13.x) recently, but we have noted this potential problem. I think you could continue to complete your task about this issue at your branch, we could move forward to merge after flink be updated in the future.

ok, close first.

@ghost ghost closed this Nov 1, 2021
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Introduce the basic framework for a workflow based on flinksql

3 participants