server/src/handlers/chunk_handler.rs (+17 -4)
@@ -59,6 +59,15 @@ pub struct SemanticBoost {
     pub distance_factor: f32,
 }
 
+/// Scoring options provides ways to modify the sparse or dense vector created for the query in order to change how potential matches are scored. If not specified, this defaults to no modifications.
+    /// Full text boost is useful for when you want to boost certain phrases in the fulltext (SPLADE) and BM25 search results. For example, it can make sure that the listing for AirBNB itself ranks higher than companies who make software for AirBNB hosts by boosting the in-document-frequency of the AirBNB token (AKA word) for its official listing. Conceptually it multiplies the in-document-importance second value in the tuples of the SPLADE or BM25 sparse vector of the chunk_html innerText for all tokens present in the boost phrase by the boost factor like so: (token, in-document-importance) -> (token, in-document-importance*boost_factor).
+    pub fulltext_boost: Option<FullTextBoost>,
+    /// Semantic boost is useful for moving the embedding vector of the chunk in the direction of the distance phrase. For example, you can push a chunk with a chunk_html of "iphone" 25% closer to the term "flagship" by using the distance phrase "flagship" and a distance factor of 0.25. Conceptually it's drawing a line (Euclidean/L2 distance) between the vector for the innerText of the chunk_html and distance_phrase then moving the vector of the chunk_html distance_factor*L2Distance closer to or away from the distance_phrase point along the line between the two points.
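The two boosts described in the doc comments above can be sketched in a few lines. This is only an illustration of the stated arithmetic, not the code in this PR: the `SparseVector` alias, the function names, and the naive lowercase whitespace tokenization are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical sparse vector: (token, in-document-importance) pairs.
type SparseVector = Vec<(String, f32)>;

/// Multiply the importance of every token that appears in the boost phrase:
/// (token, importance) -> (token, importance * boost_factor).
/// Tokenization is naive lowercase whitespace splitting (an assumption).
fn apply_fulltext_boost(sparse: &mut SparseVector, boost_phrase: &str, boost_factor: f32) {
    let boosted: HashSet<String> = boost_phrase
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect();
    for (token, importance) in sparse.iter_mut() {
        if boosted.contains(&token.to_lowercase()) {
            *importance *= boost_factor;
        }
    }
}

/// Move `chunk` along the straight line toward `phrase` by `distance_factor`
/// of the L2 distance between them. A negative factor moves it away.
fn apply_semantic_boost(chunk: &[f32], phrase: &[f32], distance_factor: f32) -> Vec<f32> {
    chunk
        .iter()
        .zip(phrase.iter())
        .map(|(c, p)| c + distance_factor * (p - c))
        .collect()
}

/// Euclidean (L2) distance between two vectors of equal length.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn main() {
    // Full text boost: only the "airbnb" token is scaled.
    let mut sparse = vec![("airbnb".to_string(), 0.5), ("hosts".to_string(), 0.25)];
    apply_fulltext_boost(&mut sparse, "AirBNB", 2.0);
    println!("{:?}", sparse);

    // Semantic boost: the chunk starts 4.0 away from the phrase and is
    // moved 25% of that gap toward it.
    let chunk = [0.0_f32, 0.0];
    let phrase = [4.0_f32, 0.0];
    let moved = apply_semantic_boost(&chunk, &phrase, 0.25);
    println!("{:?} remaining distance: {}", moved, l2_distance(&moved, &phrase));
}
```

With a `distance_factor` of 0.25, the moved vector ends up at 75% of the original distance from the phrase vector, which matches the "25% closer" wording in the doc comment.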
     /// Sort Options lets you specify different methods to rerank the chunks in the result set. If not specified, this defaults to the score of the chunks.
     pub sort_options: Option<SortOptions>,
+    /// Scoring options provides ways to modify the sparse or dense vector created for the query in order to change how potential matches are scored. If not specified, this defaults to no modifications.
+    pub scoring_options: Option<ScoringOptions>,
     /// Highlight Options lets you specify different methods to highlight the chunks in the result set. If not specified, this defaults to the score of the chunks.
     pub highlight_options: Option<HighlightOptions>,
-    /// Set score_threshold to a float to filter out chunks with a score below the threshold for cosine distance metric
-    /// For Manhattan Distance, Euclidean Distance, and Dot Product, it will filter out scores above the threshold distance
-    /// This threshold applies before weight and bias modifications. If not specified, this defaults to no threshold
-    /// A threshold of 0 will default to no threshold
+    /// Set score_threshold to a float to filter out chunks with a score below the threshold for cosine distance metric. For Manhattan Distance, Euclidean Distance, and Dot Product, it will filter out scores above the threshold distance. This threshold applies before weight and bias modifications. If not specified, this defaults to no threshold. A threshold of 0 will default to no threshold.
     pub score_threshold: Option<f32>,
     /// Set slim_chunks to true to avoid returning the content and chunk_html of the chunks. This is useful for when you want to reduce amount of data over the wire for latency improvement (typically 10-50ms). Default is false.
     pub slim_chunks: Option<bool>,
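The `score_threshold` rule in the doc comment above is direction-dependent, which is easy to misread. The following is a minimal sketch of that rule as written, not the handler's actual filtering code: the `DistanceMetric` enum, its variant names, and `passes_threshold` are hypothetical names introduced for the example.

```rust
/// Hypothetical metric enum; variant names are assumptions for this sketch.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum DistanceMetric {
    Cosine,
    Manhattan,
    Euclidean,
    DotProduct,
}

/// Returns true when a chunk with `score` should be kept.
/// - No threshold, or a threshold of 0, keeps everything.
/// - Cosine: chunks scoring below the threshold are filtered out.
/// - Manhattan, Euclidean, Dot Product: chunks scoring above it are filtered out.
fn passes_threshold(score: f32, metric: DistanceMetric, score_threshold: Option<f32>) -> bool {
    match score_threshold {
        None => true,
        Some(t) if t == 0.0 => true,
        Some(t) => match metric {
            DistanceMetric::Cosine => score >= t,
            DistanceMetric::Manhattan
            | DistanceMetric::Euclidean
            | DistanceMetric::DotProduct => score <= t,
        },
    }
}

fn main() {
    let scores = [0.9_f32, 0.4, 0.7];

    // Cosine keeps scores at or above the threshold.
    let cosine_kept: Vec<f32> = scores
        .iter()
        .copied()
        .filter(|s| passes_threshold(*s, DistanceMetric::Cosine, Some(0.5)))
        .collect();
    println!("cosine: {:?}", cosine_kept);

    // Euclidean keeps scores at or below the threshold distance.
    let euclid_kept: Vec<f32> = scores
        .iter()
        .copied()
        .filter(|s| passes_threshold(*s, DistanceMetric::Euclidean, Some(0.5)))
        .collect();
    println!("euclidean: {:?}", euclid_kept);
}
```

Per the doc comment, this check applies before any weight and bias modifications, so the sketch operates on the raw score only.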
@@ -977,6 +985,7 @@ impl Default for SearchChunksReqPayload {
     /// Sort Options lets you specify different methods to rerank the chunks in the result set. If not specified, this defaults to the score of the chunks.
     pub sort_options: Option<SortOptions>,
+    /// Scoring options provides ways to modify the sparse or dense vector created for the query in order to change how potential matches are scored. If not specified, this defaults to no modifications.
+    pub scoring_options: Option<ScoringOptions>,
     /// Highlight Options lets you specify different methods to highlight the chunks in the result set. If not specified, this defaults to the score of the chunks.
     pub highlight_options: Option<HighlightOptions>,
     /// Set score_threshold to a float to filter out chunks with a score below the threshold. This threshold applies before weight and bias modifications. If not specified, this defaults to 0.0.
@@ -1356,6 +1367,7 @@ impl From<AutocompleteReqPayload> for SearchChunksReqPayload {