[Feature][Connector-V2][OceanBase] Support vector types on OceanBase #7375
Could you add a test case for inserting vectors into and reading vectors from OceanBase?
I added the test case, and the test on GitHub was successful.
Please merge from dev, then use the fake source to produce vector data to write into OceanBase.
Trying to fix it in #7435.
Please merge from dev. Thanks.
Done.
No, maybe it's unstable. Please retry the failed CI.
value = {},
type = {EngineType.SPARK, EngineType.FLINK},
disabledReason = "Currently SPARK and FLINK not support adapt")
@Disabled("oceanbase vector and milvus takes up too much memory")
Please remove @Disabled from the fake-to-OceanBase case.
At present, the OceanBase vector image does not shut down, which leaves a process running. This image problem will be dealt with in the future, but the functionality itself is fine. SeaTunnel's executeJob logic has an assertion that checks whether an image is still running. In testing, replacing it with a MySQL image or another version of the image does not produce this error; only the current OB vector image has this problem. So I added @Disabled, leaving the original test logic unmodified.
Oceanbase's vector image will not be closed, resulting in a process still running
Do you know the reason? And is there any problem if we don't solve this problem and enable the test case at the same time?
Please remove @Disabled from the fake-to-OceanBase case.
A follow-up PR will be submitted that modifies the OB-related testcontainers and updates the test container to the latest version. @Hisoka-X, what do you think?
Oceanbase's vector image will not be closed, resulting in a process still running
Do you know the reason? And is there any problem if we don't solve this problem and enable the test case at the same time?
The problem is with the OB vector container. I am sure there is no problem with the test case itself: the synchronization results are printed correctly at the moment, but the test container still has threads running and cannot be shut down.
ok.
At present, the OceanBase vector image does not shut down, which leaves a process running. This image problem will be dealt with in the future, but the functionality itself is fine. SeaTunnel's executeJob logic has an assertion that checks whether an image is still running. In testing, replacing it with a MySQL image or another version of the image does not produce this error; only the current OB vector image has this problem. So I added @Disabled, leaving the original test logic unmodified.
This is the assert logic provided by SeaTunnel to ensure that there are no abnormal threads in the engine. I believe this test case did not pass our assert logic. The JNA cleaner thread comes from the OceanBase client. This is a bug in the JNA tool, fixed by java-native-access/jna@e8182b2 in version 5.14.0.
So we should do this:
- Enable the test case and add the JNA cleaner thread to isIssueWeAlreadyKnow (Line 407 in 570bbb3):
  private boolean isIssueWeAlreadyKnow(String threadName) {
- Wait for the OceanBase client to update its JNA version to 5.14.0, then remove the thread from isIssueWeAlreadyKnow.
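As a rough illustration of the first step, the check could look something like the sketch below. Only the method name and its String parameter come from the snippet quoted above; the class name, the list of name fragments, the case-insensitive substring match, and the exact thread name are assumptions made for this example and may differ from the real SeaTunnel code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the test-side thread check; not the actual SeaTunnel class.
final class KnownThreadIssues {

    // Assumed allowlist of thread-name fragments that are tolerated after a job ends.
    // "jna cleaner" is the thread discussed in this review; it is reported to come
    // from the OceanBase client through JNA versions older than 5.14.0.
    private static final List<String> KNOWN_FRAGMENTS = Arrays.asList("jna cleaner");

    private KnownThreadIssues() {}

    // Returns true when the leftover thread is a known issue that the
    // "no abnormal threads" assertion should ignore.
    static boolean isIssueWeAlreadyKnow(String threadName) {
        if (threadName == null) {
            return false;
        }
        String normalized = threadName.toLowerCase(Locale.ROOT);
        for (String fragment : KNOWN_FRAGMENTS) {
            if (normalized.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

Keeping the tolerated names in one list would make the second step a one-line removal once the OceanBase client ships with JNA 5.14.0.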
cc @whhe as well.
LGTM. Thanks @xxsc0529 !
Purpose of this pull request
Adds support for vector types in OceanBase. Closes #7290
Does this PR introduce any user-facing change?
Yes
How was this patch tested?
The test data was created in Milvus in the default database, in a collection named simple_example. A sample of the data is shown in the following figure.
The test script is as follows:
env {
  job.mode = "BATCH"
}
source {
  Milvus {
    url = "http://localhost:19530"
    token = "root:Milvus"
    database = "default"
    collection = "simple_example"
  }
}
transform {
}
sink {
  jdbc {
    url = "jdbc:oceanbase://localhost:2881/test"
    driver = "com.oceanbase.jdbc.Driver"
    user = "root"
    password = ""
    generate_sink_sql = true
    compatible_mode = "mysql"
    database = "test"
    table = "simple_example"
  }
}
Since oceanbase-client does not support vector parsing, the OceanBase catalog does not support automatic creation of vector tables, so the table has to be created manually.
The OceanBase DDL is as follows:
CREATE TABLE IF NOT EXISTS simple_example
(
    book_id    int NOT NULL,
    book_intro vector(4) DEFAULT NULL,
    book_title varchar(64) DEFAULT NULL,
    primary key (book_id)
);
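For readers who want to write a row into this table by hand, one possible approach is to render the float array as bracketed text and bind it as an ordinary string parameter. The sketch below is not taken from this PR: the class and method names are hypothetical, and it assumes OceanBase accepts a text literal such as [0.1,0.2,0.3,0.4] for a vector(4) column, which should be checked against the OceanBase version in use.

```java
// Hypothetical helper, not part of the connector code in this PR.
final class VectorLiterals {

    private VectorLiterals() {}

    // Renders values as "[v1,v2,...]" after checking the expected dimension,
    // e.g. 4 for the book_intro vector(4) column in the DDL above.
    static String toVectorLiteral(float[] values, int dimension) {
        if (values.length != dimension) {
            throw new IllegalArgumentException(
                    "expected " + dimension + " elements but got " + values.length);
        }
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]); // same text as Float.toString
        }
        return sb.append(']').toString();
    }
}
```

The resulting string could then be passed to PreparedStatement.setString for the book_intro placeholder, assuming the server converts the text to a vector on insert.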
The job ran successfully.