Query Results Inconsistent Across Nodes for an index #2779
Hello @mohdmsl This is an interesting case. It's not clear why |
Also, if you provide your snapshotted tables where the number of docs is the same but the WHERE queries return different results, it may also help us figure out what's going on. |
hi @sanikolaev The schema consists of document_id and vectors with a dimension of 768. The objective is to insert identical data into multiple indexes on a single instance and then compare the resulting outputs. Queries used:
Results: 5 DEC 2024
Note: The above result was obtained by inserting 100 documents repeatedly in a loop 100 times. |
Hello @mohdmsl I also had to do Here's what I saw while running the load script:
but nothing was inserted into Manticore:
perhaps because something was wrong with the HTTP JSON queries, since I saw this in the log:
Please fix. |
Okay, I will check. Which Python version did you use? |
@sanikolaev |
Thanks. I got this:
but I don't understand what it means and how it's related to what you said initially:
I.e. the problem was (as I understood it) that on different nodes you had different counts, but in your demo there's only one node and multiple tables. What conclusion should I draw from this? |
Yes, the original question was related to a multi-cluster setup. However, I attempted to reproduce the issue on a single-node cluster, as I noticed that even after a full load of data, queries return inconsistent results. What I am observing is that the query results vary after each data load, whereas I expect them to remain consistent if the data has not changed. This behaviour causes problems when I backfill the same data in different environments (dev/stage). |
Thanks for the extra details. I've looked into the issue, and the main reason for the count difference is a slight variation in the number of documents stored in the disk chunks and the RAM chunk:
etc. This happens because of the adaptive RAM chunk size (you can read more about the "rate" here: https://manual.manticoresearch.com/Creating_a_table/Local_tables/Plain_and_real-time_table_settings#rt_mem_limit). While the tables might look identical, there are small differences in how data is stored. This can affect how certain queries are executed. To eliminate any differences and make the count "accurate", you can merge everything into a single disk chunk. For example:
mysql> flush ramchunk lisdocument1; optimize table lisdocument1 option sync=1, cutoff=1;
mysql> flush ramchunk lisdocument2; optimize table lisdocument2 option sync=1, cutoff=1;
mysql> SELECT count(*) as total FROM lisdocument1 WHERE knn (vector, 100, ... , 2000 ) OPTION cutoff = 0, boolean_simplify = 1, max_matches = 1000;
+-------+
| total |
+-------+
| 100 |
+-------+
mysql> SELECT count(*) as total FROM lisdocument2 WHERE knn (vector, 100, ... , 2000 ) OPTION cutoff = 0, boolean_simplify = 1, max_matches = 1000;
+-------+
| total |
+-------+
| 100 |
+-------+
The 20K+ count occurs because the It is recommended to use However, I noticed something odd. For example, this query sometimes returned only 409 results, which is lower than expected, provided all the limits (
I'll create a separate issue about it. As for your original query:
this is a different case. If you notice a count difference with a query like this, please share a test case so we can investigate further. |
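The chunk-layout explanation above can be illustrated with a toy model. This is only a sketch under a stated assumption: each chunk of a table answers a KNN query independently and contributes up to k candidates, so the total count depends on how the same documents happen to be split into chunks. It uses exact nearest-neighbour search on one-dimensional values instead of HNSW, and the function name and chunk sizes are hypothetical, not taken from Manticore.

```python
import random


def per_chunk_knn_count(values, query, k, chunk_sizes):
    """Count matches when every chunk returns its own k nearest values.

    Toy model only: exact search on 1-D values, one pass per chunk.
    """
    if sum(chunk_sizes) != len(values):
        raise ValueError("chunk sizes must cover all values exactly")
    total = 0
    start = 0
    for size in chunk_sizes:
        chunk = values[start:start + size]
        start += size
        # Each chunk contributes at most k of its own nearest neighbours.
        nearest = sorted(chunk, key=lambda v: abs(v - query))[:k]
        total += len(nearest)
    return total


random.seed(0)
docs = [random.random() for _ in range(1000)]

# Same 1000 documents, three different (hypothetical) chunk layouts.
split_a = per_chunk_knn_count(docs, 0.5, 100, [400, 400, 200])
split_b = per_chunk_knn_count(docs, 0.5, 100, [950, 50])
merged = per_chunk_knn_count(docs, 0.5, 100, [1000])

print(split_a, split_b, merged)  # prints: 300 150 100
```

In this model the count only becomes a stable function of the data once everything lives in a single chunk, which matches the effect of the flush and optimize steps shown earlier.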
Thanks for your answer @sanikolaev
|
I added The purpose of this parameter is to guarantee there is only one disk chunk at any given time, but that behavior is not being observed. |
Yes, merging can take some time, which is why it usually happens in the background, so you don’t need to worry about it. But I’m still not sure I understand your goal. If it’s |
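If a single disk chunk is needed at a known point in time (for example, right before comparing counts across environments), the statements quoted earlier in this thread can be generated for a list of tables and run through any MySQL-protocol client. The helper below is a minimal sketch: it only builds the SQL strings, the function name is hypothetical, and it assumes that `sync=1` makes `OPTIMIZE TABLE` wait for the merge instead of running it in the background.

```python
import re

# Conservative check so that table names can be embedded in SQL safely.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def compaction_statements(tables):
    """Build the flush + optimize statements from this thread for each table."""
    statements = []
    for table in tables:
        if not _IDENTIFIER.fullmatch(table):
            raise ValueError(f"unexpected table name: {table!r}")
        # Move the RAM chunk to disk, then merge down to one disk chunk.
        statements.append(f"FLUSH RAMCHUNK {table}")
        statements.append(f"OPTIMIZE TABLE {table} OPTION sync=1, cutoff=1")
    return statements


for statement in compaction_statements(["lisdocument1", "lisdocument2"]):
    print(statement + ";")
```

Running the generated statements after each backfill, and only then issuing the count queries, keeps the comparison independent of how the background merge happens to be progressing.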
|
We have a use case where we want to show count of records having |
Alright, that changes things, so what you are looking for is just not supported yet. Here's an MRE:
mysql> drop table if exists t; create table t(v float_vector knn_type='hnsw' knn_dims='1' hnsw_similarity='l2'); insert into t values(1, (0.1)),(2, (0.6)); select count(*) from t where knn(v, 5, (0.3)) and knn_dist() < 0.05;
--------------
drop table if exists t
--------------
Query OK, 0 rows affected (0.00 sec)
--------------
create table t(v float_vector knn_type='hnsw' knn_dims='1' hnsw_similarity='l2')
--------------
Query OK, 0 rows affected (0.00 sec)
--------------
insert into t values(1, (0.1)),(2, (0.6))
--------------
Query OK, 2 rows affected (0.00 sec)
--------------
select count(*) from t where knn(v, 5, (0.3)) and knn_dist() < 0.05
--------------
ERROR 1064 (42000): P01: syntax error, unexpected '(' near '() < 0.05'
Feel free to create a separate feature request about it. |
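Until filtering on `knn_dist()` in the WHERE clause is supported, one possible workaround is to select the distance and apply the threshold on the client. The sketch below assumes the rows come from a query such as `SELECT id, knn_dist() FROM t WHERE knn(v, 5, (0.3))` and arrive as `(id, distance)` pairs; the function name and the sample distances are hypothetical, not actual server output.

```python
def count_within_distance(rows, max_distance):
    """Count (id, distance) rows whose distance is below the threshold."""
    return sum(1 for _doc_id, distance in rows if distance < max_distance)


# Hypothetical (id, knn_dist) pairs standing in for a real result set.
sample_rows = [(1, 0.04), (2, 0.09)]

print(count_within_distance(sample_rows, 0.05))  # prints: 1
```

Note that this only counts among the candidates the KNN query actually returned, so the result is bounded by k and by the chunk layout discussed earlier in the thread.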
Bug Description:
Set up a Manticore cluster with three nodes (search-01, search-02, search-03) and synchronize the index lisdocument_20241122_2991.
Ensure all nodes have the same total document count in the index:
SELECT COUNT(*) FROM lisdocument_20241122_2991;
Run the following query on each node:
search-01:
search-02:
search-03:
Additional Information:
The total document count in the index is consistent across nodes:
Returns the same result on all nodes.
Schema of index is:
Cluster State:

Manticore Search Version:
6.3.6
Operating System Version:
linux
Have you tried the latest development version?
None