Skip to content

Conversation

wenym1
Copy link
Contributor

@wenym1 wenym1 commented Sep 9, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Example

create table items (id int primary key, name string, embedding vector(3));
create table events (event_id int primary key, time timestamp, embedding vector(3));
select 
  event_id, time, array(
    select row(id, name)
    from items
    order by events.embedding <=> items.embedding
    limit 3
  )
as related_info from events;

logical plan

LogicalProject { exprs: [events.event_id, events.time, $expr3] }
└─LogicalApply { type: LeftOuter, on: true, correlated_id: 1, max_one_row: true }
  ├─LogicalScan { table: events, columns: [events.event_id, events.time, events.embedding, events._rw_timestamp] }
  └─LogicalProject { exprs: [Coalesce(array_agg($expr1 order_by($expr2 ASC)), ARRAY[]:List(Struct(StructType { fields: [("f1", Int32), ("f2", Varchar)] }))) as $expr3] }
    └─LogicalAgg { aggs: [array_agg($expr1 order_by($expr2 ASC))] }
      └─LogicalTopN { order: [$expr2 ASC], limit: 3, offset: 0 }
        └─LogicalProject { exprs: [Row(items.id, items.name) as $expr1, CosineDistance(CorrelatedInputRef { index: 2, correlated_id: 1 }, items.embedding) as $expr2] }
          └─LogicalScan { table: items, columns: [items.id, items.name, items.embedding, items._rw_timestamp] }

convert to

LogicalProject { exprs: [events.event_id, events.time, array] }
└─LogicalCorrelatedVectorSearch { distance_type: Cosine, top_n: 3, base_vector: events.embedding:Vector(3), lookup_vector: items.embedding, lookup_output_columns: [items.id:Int32, items.name:Varchar] }
  ├─LogicalScan { table: events, columns: [events.event_id, events.time, events.embedding] }
  └─LogicalScan { table: items, columns: [items.id, items.name, items.embedding, items._rw_timestamp] }

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • I have checked the Release Timeline and Currently Supported Versions to determine which release branches I need to cherry-pick this PR into.

Documentation

  • My PR needs documentation updates.
Release note

Copy link
Contributor Author

wenym1 commented Sep 9, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions github-actions bot added the type/feature Type: New feature. label Sep 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature Type: New feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant