[Feature][Connector-v2][Mongodb]Refactor mongodb connector #4620
I will commit the changes to the release-note.md file after the PR review is completed, in order to prevent conflicts that may cause the CI to not run.
cc @TyrantLucifer @hailin0 PTAL, thanks.
@@ -106,75 +83,6 @@ public void initConnection() {
        client = MongoClients.create(url);
    }

Add more test cases to verify that the refactor makes sense.
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

import static java.net.HttpURLConnection.HTTP_OK;
import static java.net.HttpURLConnection.HTTP_UNAUTHORIZED;

@Slf4j
@DisabledOnContainer(
        value = {},
        type = {EngineType.FLINK, EngineType.SPARK})
Why disable Spark and Flink?
@TyrantLucifer Sorry for the inconvenience. The Flink 1.13 image cannot run on a Mac M1, so I have had trouble running the Flink CI locally, even though I have successfully validated the connector with Zeta, Flink, and Spark. When I push the code directly to GitHub, the CI often fails for various reasons. My plan is to use this annotation to limit the scope first and get Flink running smoothly. After that, I will remove the annotation.
c_string = string
c_boolean = boolean
c_tinyint = tinyint
c_smallint = smallint
c_tinyint = int
Why change it? The connector should support every type that SeaTunnel defines. Even though MongoDB does not distinguish between them, you should add the conversion logic in your connector.
@TyrantLucifer Please take a look at the documentation of the MongoDB connector. MongoDB data types and SeaTunnel data types are not fully mapped.
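For illustration only, here is a minimal sketch of the kind of conversion logic the review asks for. It assumes small integers come back from MongoDB as 32-bit ints and narrows them to TINYINT and SMALLINT with a range check. The class and method names (`NarrowingSketch`, `toTinyInt`, `toSmallInt`) are hypothetical and are not taken from this PR.

```java
// Hypothetical helper, not part of the connector: narrows a 32-bit integer
// (assumed to be how MongoDB returns small integers) to smaller SeaTunnel types.
final class NarrowingSketch {

    private NarrowingSketch() {}

    // Narrow to TINYINT (byte); reject values that would silently overflow.
    static byte toTinyInt(int value) {
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("Value out of TINYINT range: " + value);
        }
        return (byte) value;
    }

    // Narrow to SMALLINT (short) with the same kind of range check.
    static short toSmallInt(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new IllegalArgumentException("Value out of SMALLINT range: " + value);
        }
        return (short) value;
    }
}
```

Failing loudly on out-of-range values is one possible choice here; a connector could also decide to log and skip such records.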

    private GenericContainer<?> mongodbContainer;
    private MongoClient client;

    private static final Random random = new Random();
Static fields should be declared at the top of the class.
rules =
  {
    row_rules = [
      {
Add more check rules.
Force-pushed from 8949a6f to c75e0ab.
database = "test_db"
collection = "source_table"
collection = "sink_table"
no-timeout = true
Unify the parameter naming style.
}

source {
  MongoDB {
    uri = "mongodb://e2e_mongodb:27017/test_db?retryWrites=true&writeConcern=majority"
  Mongodb {
Why change this name?
Force-pushed from 0a39149 to 09ee942.
Force-pushed from 372c9e4 to cd535dc.
Force-pushed from ba17d01 to 5c98098.
Force-pushed from b1f2c1d to 22df9ab.
Force-pushed from f1be51b to ca51c48.
Force-pushed from e5b4977 to bc06ecf.
MongodbConfig.COLLECTION,
MongodbConfig.DATABASE,
MongodbConfig.COLLECTION,
Duplicate config COLLECTION?
Missing uri?
...c/main/java/org/apache/seatunnel/connectors/seatunnel/mongodb/sink/writer/MongodbWriter.java
...java/org/apache/seatunnel/connectors/seatunnel/mongodb/sink/config/MongodbWriterOptions.java
...main/java/org/apache/seatunnel/connectors/seatunnel/mongodb/source/MongodbSourceFactory.java
...main/java/org/apache/seatunnel/connectors/seatunnel/mongodb/source/reader/MongodbReader.java
@hailin0 PTAL, thanks.
            doBulkWrite();
        }
    }

    @Override
    public Optional<Void> prepareCommit() {
        doBulkWrite();
        return Optional.empty();
    }
@hailin0 Does your review mean what is described in the picture above?
        }
    }

    void doBulkWrite() throws IOException {
-    void doBulkWrite() throws IOException {
+    synchronized void doBulkWrite() throws IOException {
Force-pushed from ac75c9c to 84bd528.
@hailin0 @EricJoy2048 PTAL, thanks.
LGTM, PTAL @hailin0 @TyrantLucifer
LGTM
Please provide multi-split read e2e examples.
            case ROW:
                return createRowConverter((SeaTunnelRowType) type);
            default:
                throw new UnsupportedOperationException("Not support to parse type: " + type);
Throw MongoDBConnectorException here instead.
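A minimal sketch of what this could look like, assuming the connector-specific exception carries an error code plus a message. The `ConnectorException` class, the `UNSUPPORTED_DATA_TYPE` code string, and the `FieldType` enum below are hypothetical stand-ins so the example compiles on its own; they are not the actual SeaTunnel classes.

```java
// Hypothetical sketch: replace a generic UnsupportedOperationException with a
// connector-specific exception that carries an error code.
final class UnsupportedTypeSketch {

    // Stand-in for a connector exception type; not the real SeaTunnel class.
    static final class ConnectorException extends RuntimeException {
        final String errorCode;

        ConnectorException(String errorCode, String message) {
            super("[" + errorCode + "] " + message);
            this.errorCode = errorCode;
        }
    }

    // Stand-in for the data types a converter factory switches on.
    enum FieldType { STRING, INT, ROW, MAP }

    // Returns a label for the converter; unsupported types raise the connector exception.
    static String converterFor(FieldType type) {
        switch (type) {
            case STRING:
                return "string-converter";
            case INT:
                return "int-converter";
            case ROW:
                return "row-converter";
            default:
                throw new ConnectorException(
                        "UNSUPPORTED_DATA_TYPE", "Unsupported data type: " + type);
        }
    }
}
```

Carrying an error code lets callers and logs distinguish connector failures from generic runtime errors.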

    private static final long serialVersionUID = 1L;

    private static String[] upsertKey;
Why static?
                .withDatabase(database)
                .withCollection(collection)
                .withFlushSize(
                        pluginConfig.hasPath(MongodbConfig.BUFFER_FLUSH_MAX_ROWS.key())
Use if-else instead.
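One way to read this suggestion is to resolve the optional value with a plain if-else before the builder chain, rather than nesting a ternary inside it. The sketch below uses a `java.util.Map` as a stand-in for the plugin config; the key string `buffer-flush.max-rows` and the default of 1000 are assumptions made for the example, not values confirmed by this PR.

```java
import java.util.Map;

// Hypothetical sketch: resolve an optional setting with if-else up front.
final class FlushSizeSketch {

    // Assumed key name and default value, for illustration only.
    static final String BUFFER_FLUSH_MAX_ROWS_KEY = "buffer-flush.max-rows";
    static final int DEFAULT_FLUSH_SIZE = 1000;

    private FlushSizeSketch() {}

    static int resolveFlushSize(Map<String, Object> config) {
        int flushSize;
        if (config.containsKey(BUFFER_FLUSH_MAX_ROWS_KEY)) {
            // Use the configured value when the user supplied one.
            flushSize = ((Number) config.get(BUFFER_FLUSH_MAX_ROWS_KEY)).intValue();
        } else {
            // Otherwise fall back to the default.
            flushSize = DEFAULT_FLUSH_SIZE;
        }
        return flushSize;
    }
}
```

The resolved value can then be passed to the builder as a single argument, which keeps the chained calls short.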
        boolean useSimpleTextSchema = CatalogTableUtil.buildSimpleTextSchema().equals(rowType);
        return new MongodbSinkWriter(rowType, useSimpleTextSchema, params);
        return new MongodbWriter(
                new RowDataDocumentSerializer(
Add the context to the writer; it will be used to collect dirty records.
                        rowType.getFieldNames(),
                        rowType,
                        pluginConfig.hasPath(MongodbConfig.FLAT_SYNC_STRING.key())
                                ? pluginConfig.getBoolean(MongodbConfig.FLAT_SYNC_STRING.key())
The same as above.
# limitations under the License.
#
######
###### This config file is a demonstration of streaming processing in seatunnel config
The same as above.
# limitations under the License.
#
######
###### This config file is a demonstration of streaming processing in seatunnel config
The same as above.
# limitations under the License.
#
######
###### This config file is a demonstration of streaming processing in seatunnel config
The same as above.
# limitations under the License.
#
######
###### This config file is a demonstration of streaming processing in seatunnel config
The same as above.
# See the License for the specific language governing permissions and
# limitations under the License.
#
######
The same as above.
/** A builder class for creating {@link MongodbClientProvider}. */
public class MongodbCollectionProvider {

    public static Builder getBuilder() {
-    public static Builder getBuilder() {
+    public static Builder builder() {
IMO, since we will implement DDL changes in the future, the write logic of all connectors should not use async writes; the write action should be bound to the checkpoint and preCommit stages. cc @hailin0 @EricJoy2048
@TyrantLucifer PTAL, thanks.
Force-pushed from 98d82d8 to 7d76fd0.
@@ -67,36 +58,145 @@
@Slf4j
public class MongodbIT extends TestSuiteBase implements TestResource {

    private static final String MONGODB_IMAGE = "mongo:6.0.5";
    private static final Random random = new Random();
-    private static final Random random = new Random();
+    private static final Random RANDOM = new Random();
LGTM
Purpose of this pull request
close:#4280
Check list
New License Guide
release-note