[MySQL] VECTOR data type support #114

Open
fanyang01 opened this issue Nov 5, 2024 · 3 comments

Comments

@fanyang01
Collaborator

No description provided.

@NewtonVan

Hi @fanyang01

I’ve written a simple go-duckdb test that uses the VSS (Vector Similarity Search) extension, and I can confirm that vector similarity search through DuckDB's VSS extension is feasible. However, I’ve run into two main issues:

  1. Lack of Support for Vector Type Syntax Parsing: It seems myduck doesn’t yet support syntax parsing related to vector types.

  2. Scan Operation Error with Vector-Type Query Results: When querying for vector-type results, the scan operation throws the following error:

    database/sql/driver: unsupported data type: ARRAY: index: 0
    

My Proposed Approach

To address these, my current approach includes:

  1. Enabling syntax parsing for vector types.
  2. Loading the VSS plugin during initialization.
  3. Supporting the return of vector-type results.

I’d really appreciate any suggestions or feedback you might have on this approach, or if you see any potential improvements. Thanks so much in advance for your help!

By the way, I really appreciate the amazing quack you have made!

@fanyang01
Collaborator Author

Hi @NewtonVan,

Nice to see you here! 😄

> Enabling syntax parsing for vector types

This should be contributed to the upstream project dolthub/vitess. I'm surprised that in pull request #365, only VECTOR INDEX was supported, but not VECTOR(n).
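
Once the parser accepts `VECTOR(n)`, the type still has to be translated to something DuckDB understands. A minimal sketch of that mapping, assuming `VECTOR(n)` corresponds to DuckDB's fixed-size array type `FLOAT[n]`; the helper name `duckDBVectorType` is hypothetical and is not part of vitess or this project:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// vectorTypeRE matches a MySQL-style VECTOR(n) type name, case-insensitively.
var vectorTypeRE = regexp.MustCompile(`(?i)^\s*VECTOR\s*\(\s*(\d+)\s*\)\s*$`)

// duckDBVectorType is a hypothetical helper that maps "VECTOR(n)" to the
// DuckDB fixed-size array type "FLOAT[n]". The mapping itself is an
// assumption, not something the project has settled on.
func duckDBVectorType(mysqlType string) (string, error) {
	m := vectorTypeRE.FindStringSubmatch(mysqlType)
	if m == nil {
		return "", fmt.Errorf("not a VECTOR(n) type: %q", mysqlType)
	}
	n, err := strconv.Atoi(m[1])
	if err != nil || n <= 0 {
		return "", fmt.Errorf("invalid vector dimension in %q", mysqlType)
	}
	return fmt.Sprintf("FLOAT[%d]", n), nil
}

func main() {
	t, err := duckDBVectorType("VECTOR(3)")
	fmt.Println(t, err)
}
```

A real implementation would work on the parsed column type rather than on a string; the regular expression only stands in for that step here.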

> Supporting the return of vec type results.

Yes, it's a good idea to add support for vector return types. Fortunately, go-duckdb nicely supports nested/structured types.
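
A rough sketch of what the conversion could look like, under two assumptions that should be checked: that go-duckdb hands back a `FLOAT[n]` value as `[]any` holding `float32` elements, and that a MySQL client expects a VECTOR value as consecutive little-endian 4-byte floats. The helper names `toFloat32Slice` and `encodeVector` are hypothetical:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// toFloat32Slice converts a scanned nested value into []float32.
// Assumption: the driver returns array/list columns as []any.
func toFloat32Slice(v any) ([]float32, error) {
	items, ok := v.([]any)
	if !ok {
		return nil, fmt.Errorf("expected []any, got %T", v)
	}
	out := make([]float32, len(items))
	for i, it := range items {
		switch x := it.(type) {
		case float32:
			out[i] = x
		case float64:
			out[i] = float32(x)
		default:
			return nil, fmt.Errorf("element %d: unsupported type %T", i, it)
		}
	}
	return out, nil
}

// encodeVector packs the floats as little-endian IEEE 754 singles.
// Assumption: this matches the binary layout of a MySQL VECTOR value.
func encodeVector(vec []float32) []byte {
	buf := make([]byte, 4*len(vec))
	for i, f := range vec {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

func main() {
	vec, err := toFloat32Slice([]any{float32(1), float32(2), float32(3)})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%X\n", encodeVector(vec)) // prints 0000803F0000004000004040
}
```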

> Loading the VSS plugin during initialization.

Sure. Although the VSS plugin currently comes with several limitations, such as being serialized as a whole for every checkpoint, it's nice to load it and let potential users explore it.

@NewtonVan

Thank you very much for your guidance; it helped me pinpoint the issue more quickly.

Regarding the parsing logic, I’ll look further into how Vitess handles this as I continue exploring possible solutions. I also noted some of the issues highlighted in DuckDB’s blog post on the VSS plugin. Fortunately, adding the plugin only involves minimal changes to the project’s logic, so we can offer basic support for it as a lightweight, experimental feature. I’ll also be exploring this interesting little feature further.

Thanks again for your valuable insights!

@fanyang01 fanyang01 changed the title VECTOR data type support [MySQL] VECTOR data type support Dec 19, 2024