Commit 396aa70

update readme (#79)
* update readme
* update readme
* add hyper links
1 parent 13dec20 commit 396aa70

8 files changed

+49
-16
lines changed

.github/workflows/extension_ci.yml

+2
@@ -11,11 +11,13 @@ on:
       - main
     paths-ignore:
       - 'README.md'
+      - docs/**
   push:
     branches:
       - main
     paths-ignore:
       - 'README.md'
+      - docs/**
 
   release:
     types:

.github/workflows/extension_upgrade.yml

+6-3
@@ -9,12 +9,15 @@ on:
   pull_request:
     branches:
       - main
+    paths:
+      - ".github/workflows/extension_upgrade.yml"
+      - "./extension"
   push:
     branches:
       - main
-  release:
-    types:
-      - created
+    paths:
+      - ".github/workflows/extension_upgrade.yml"
+      - "./extension"
 
 jobs:
   test:

.gitignore

+2
@@ -10,3 +10,5 @@ pgvector
 test.sql
 META.json
 /vectorize-*
+site/
+poetry.lock

README.md

+14-8
@@ -21,6 +21,9 @@ This project relies heavily on the work by [pgvector](https://github.com/pgvecto
 [![PGXN version](https://badge.fury.io/pg/vectorize.svg)](https://pgxn.org/dist/vectorize/)
 [![OSSRank](https://shields.io/endpoint?url=https://ossrank.com/shield/3815)](https://ossrank.com/p/3815)
 
+
+pg_vectorize powers the [VectorDB Stack](https://tembo.io/docs/tembo-stacks/vector-db) on [Tembo Cloud](https://cloud.tembo.io/) and is available in all hobby tier instances.
+
 **API Documentation**: https://tembo-io.github.io/pg_vectorize/
 
 **Source**: https://github.com/tembo-io/pg_vectorize
@@ -38,7 +41,7 @@ This project relies heavily on the work by [pgvector](https://github.com/pgvecto
 - [Installation](#installation)
 - [Vector Search Example](#vector-search-example)
 - [RAG Example](#rag-example)
-- [Trigger based updates](#trigger-based-updates)
+- [Updating Embeddings](#updating-embeddings)
 - [Try it on Tembo Cloud](#try-it-on-tembo-cloud)
 
 ## Installation
@@ -128,7 +131,8 @@ SELECT vectorize.table(
     "table" => 'products',
     primary_key => 'product_id',
     columns => ARRAY['product_name', 'description'],
-    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1'
+    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1',
+    schedule => 'realtime'
 );
 ```
 
@@ -201,9 +205,15 @@ SELECT vectorize.rag(
 "A pencil is an item that is commonly used for writing and is known to be most effective on paper."
 ```
 
-## Trigger based updates
+## Updating Embeddings
+
+When the source text data is updated, how and when the embeddings are updated is determined by the value of the `schedule` parameter in `vectorize.table` and `vectorize.init_rag`.
+
+The default behavior is `schedule => '* * * * *'`, which means the background worker process checks for changes every minute and updates the embeddings accordingly. This method requires setting the `update_col` value to point to a column on the table indicating the time that the input text columns were last changed. `schedule` can be set to any cron-like value.
 
-When vectorize job is set up as `realtime` (the default behavior, via `vectorize.table(..., schedule => 'realtime')`), vectorize will create triggers on your table that will keep your embeddings up to date. When the text inputs are updated or if new rows are inserted, the triggers handle creating a background job that updates the embeddings. Since the transformation is executed in a background job and the transformer model is invoked in a separate container, there is minimal impact on the performance of the update or insert statement.
+Alternatively, `schedule => 'realtime'` creates triggers on the source table and updates embeddings any time new records are inserted into the source table or existing records are updated.
+
+The statements below will result in new embeddings being generated either immediately (`schedule => 'realtime'`) or on the cron schedule set in the `schedule` parameter.
 
 ```sql
 INSERT INTO products (product_id, product_name, description)
@@ -213,7 +223,3 @@ UPDATE products
 SET description = 'sling made of fabric, rope, or netting, suspended between two or more points, used for swinging, sleeping, or resting'
 WHERE product_name = 'Hammock';
 ```
-
-## Try it on Tembo Cloud
-
-Try it for yourself! Install with a single click on a Vector DB Stack (or any other instance) in [Tembo Cloud](https://cloud.tembo.io/) today.

docs/api/search.md

+4-4
@@ -17,8 +17,8 @@ vectorize."table"(
     "update_col" TEXT DEFAULT 'last_updated_at',
     "transformer" TEXT DEFAULT 'text-embedding-ada-002',
     "search_alg" vectorize.SimilarityAlg DEFAULT 'pgv_cosine_similarity',
-    "table_method" vectorize.TableMethod DEFAULT 'append',
-    "schedule" TEXT DEFAULT 'realtime'
+    "table_method" vectorize.TableMethod DEFAULT 'join',
+    "schedule" TEXT DEFAULT '* * * * *'
 ) RETURNS TEXT
 ```
 
@@ -33,8 +33,8 @@ vectorize."table"(
 | update_col | text | Column specifying the last time the record was updated. Required for cron-like schedule. Defaults to `last_updated_at` |
 | transformer | text | The name of the transformer to use for the embeddings. Defaults to 'text-embedding-ada-002'. |
 | search_alg | SimilarityAlg | The name of the search algorithm to use. Defaults to 'pgv_cosine_similarity'. |
-| table_method | TableMethod | The method to use for the table. Defaults to 'append', which adds a column to the existing table. |
-| schedule | text | 'realtime' by default for trigger based updates. accepts a cron-like input for a cron based updates. |
+| table_method | TableMethod | `join` to store embeddings in a new table in the vectorize schema. `append` to create columns for embeddings on the source table. Defaults to `join`. |
+| schedule | text | Accepts a cron-like input for cron-based updates, or `realtime` to set up a trigger. |
 
 ### Sentence-Transformer Examples
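
As a rough illustration of the `table_method` change in this hunk, the sketch below opts back into the previous `append` behavior explicitly instead of relying on the new `join` default. The `products` table and its columns are borrowed from the README example in this commit. The `job_name` parameter and its value are assumptions, since the start of the signature is not visible in this hunk.

```sql
-- Sketch only: assumes pg_vectorize is installed and a `products` table exists
-- with a `last_updated_at` column (the documented default for update_col).
SELECT vectorize.table(
    job_name     => 'product_search_append',  -- hypothetical; parameter not shown in this hunk
    "table"      => 'products',
    primary_key  => 'product_id',
    columns      => ARRAY['product_name', 'description'],
    transformer  => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1',
    table_method => 'append'  -- embedding columns on the source table; omit to get 'join'
);
```

Leaving `schedule` unset here falls back to the cron default shown in the signature above, so the `update_col` column needs to be maintained by the application.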

docs/examples/scheduling.md

+18
@@ -0,0 +1,18 @@
+# Scheduling Embedding Updates
+
+When the source text data is updated, how and when the embeddings are updated is determined by the value of the `schedule` parameter in `vectorize.table` and `vectorize.init_rag`.
+
+The default behavior is `schedule => '* * * * *'`, which means the background worker process checks for changes every minute and updates the embeddings accordingly. This method requires setting the `update_col` value to point to a column on the table indicating the time that the input text columns were last changed. `schedule` can be set to any cron-like value.
+
+Alternatively, `schedule => 'realtime'` creates triggers on the source table and updates embeddings any time new records are inserted into the source table or existing records are updated.
+
+The statements below will result in new embeddings being generated either immediately (`schedule => 'realtime'`) or on the cron schedule set in the `schedule` parameter.
+
+```sql
+INSERT INTO products (product_id, product_name, description)
+VALUES (12345, 'pizza', 'dish of Italian origin consisting of a flattened disk of bread');
+
+UPDATE products
+SET description = 'sling made of fabric, rope, or netting, suspended between two or more points, used for swinging, sleeping, or resting'
+WHERE product_name = 'Hammock';
+```
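
The two scheduling modes described in this new page could be set up roughly as follows. This is a minimal sketch, not taken from the commit: the `job_name` parameter and both job names are assumptions, the five-minute cron expression is an arbitrary example, and the timestamp parameter is written as `update_col` to match the `vectorize."table"` signature in docs/api/search.md.

```sql
-- Sketch only: assumes pg_vectorize is installed and `products` has a
-- `last_updated_at` timestamp column that changes whenever the text changes.

-- Option A: cron-based. The background worker checks for changed rows
-- every five minutes, using update_col to find them.
SELECT vectorize.table(
    job_name    => 'product_search_cron',      -- hypothetical job name
    "table"     => 'products',
    primary_key => 'product_id',
    columns     => ARRAY['product_name', 'description'],
    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1',
    update_col  => 'last_updated_at',
    schedule    => '*/5 * * * *'
);

-- Option B: trigger-based. Triggers on the source table queue an embedding
-- update on every INSERT or UPDATE; no timestamp column is consulted.
SELECT vectorize.table(
    job_name    => 'product_search_realtime',  -- hypothetical job name
    "table"     => 'products',
    primary_key => 'product_id',
    columns     => ARRAY['product_name', 'description'],
    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1',
    schedule    => 'realtime'
);
```

The two calls are meant as alternatives. A cron schedule trades update latency for fewer embedding jobs, while `realtime` keeps embeddings current at the cost of a trigger on every write.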

docs/examples/sentence_transformers.md

+2-1
@@ -57,7 +57,8 @@ SELECT vectorize.table(
     "table" => 'products',
     primary_key => 'product_id',
     columns => ARRAY['product_name', 'description'],
-    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1'
+    transformer => 'sentence-transformers/multi-qa-MiniLM-L6-dot-v1',
+    schedule => 'realtime'
 );
 ```
 


mkdocs.yml

+1
@@ -23,6 +23,7 @@ nav:
   - Search:
     - 'examples/sentence_transformers.md'
     - 'examples/openai_embeddings.md'
+    - 'examples/scheduling.md'
 markdown_extensions:
   - toc:
       permalink: true
