-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serializer_refactor - MOD-4196 #262
Conversation
b0a99a0
to
9755881
Compare
Codecov ReportBase: 95.35% // Head: 95.35% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## main #262 +/- ##
=======================================
Coverage 95.35% 95.35%
=======================================
Files 58 58
Lines 3254 3254
=======================================
Hits 3103 3103
Misses 151 151 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
@@ -331,7 +331,7 @@ VecSimIndexInfo BruteForceIndex<DataType, DistType>::info() const { | |||
info.bfInfo.indexSize = this->count; | |||
info.bfInfo.indexLabelCount = this->indexLabelCount(); | |||
info.bfInfo.blockSize = this->blockSize; | |||
info.bfInfo.memory = this->allocator->getAllocationSize(); | |||
info.bfInfo.memory = this->getAllocationSize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getAllocationSize() was added to vecsim_interface
496083c
to
3ad3cad
Compare
static const char *hnsw_index_file; | ||
static inline std::string GetSerializedIndexLocation(const char *file_name) { | ||
return std::string(getenv("ROOT")) + "/" + file_name; | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializer already includes stream opening logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string(getenv("ROOT"))
throws an exception when ROOT is not defined (and getenv
returns null). Consider passing a default value instead ("." or calling pwd
).
tests/unit/test_hnsw_multi.cpp
Outdated
|
||
VecSimIndex_Free(index); | ||
|
||
// Create new index, set it into the serializer and extract the data to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ill fix the comment along with the review fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! See the comments and let me know what you think. While most of the comments are minor, I do want us to discuss if we should change how we load index entirety, so that loading will be an alternative constructor for the index.
src/VecSim/utils/serializer.h
Outdated
} EncodingVersion; | ||
|
||
// Persist index into a file in the specified location. | ||
void saveIndex(const std::string &location, EncodingVersion version = EncodingVersion_V1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the default EncodingVersion
to EncodingVersion_V2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i forgot to change this 😄
src/VecSim/utils/serializer.h
Outdated
// Restore the index from the file in the specified location. | ||
void loadIndex(const std::string &location); | ||
// Check if the serialized index is valid. | ||
virtual bool serializingIsValid() const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
virtual bool serializingIsValid() const = 0; | |
virtual bool checkIndexIntergrity() const = 0; |
@@ -33,7 +33,7 @@ struct VecSimAllocator { | |||
void operator delete(void *p, size_t size); | |||
void operator delete[](void *p, size_t size); | |||
|
|||
int64_t getAllocationSize(); | |||
int64_t getAllocationSize() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is that related to this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checkIntegrity() uses this
as const so i cant call getAllocationSize() if its not const...
src/VecSim/vec_sim_interface.h
Outdated
@@ -154,4 +155,6 @@ struct VecSimIndexInterface : public VecsimBaseObject { | |||
inline static void setTimeoutCallbackFunction(timeoutCallbackFunction callback) { | |||
VecSimIndexInterface::timeoutCallback = callback; | |||
} | |||
|
|||
inline int64_t getAllocationSize() const { return this->allocator->getAllocationSize(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this addition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is just a shortcut
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider moving this to VecsimBaseObject
src/VecSim/algorithms/hnsw/hnsw.h
Outdated
void HNSWIndex<DataType, DistType>::saveIndexIMP(std::ofstream &output, | ||
EncodingVersion version) const { | ||
// We already checked in the serializer that this is a valid version number. | ||
// Now checking the version number to decide which data to write. | ||
if (version != EncodingVersion_V1) { | ||
saveIndexFields_v2(output); | ||
} | ||
|
||
this->saveIndexFields(output); | ||
this->saveGraph(output); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General comment for serialization and versioning - when we introduce a new version, we don't need to keep old versions of encoding functions, (unlike the old decoding functions that we do keep). That is since if we save an index using this code, we always want it to be in the latest version. However, if we load a serialized index, we wan't to allow it whether it was serialized with an older versions as well.
tests/unit/test_allocator.cpp
Outdated
@@ -360,16 +360,15 @@ TYPED_TEST(IndexAllocatorTest, test_hnsw_reclaim_memory) { | |||
ASSERT_EQ(allocator->getAllocationSize(), initial_memory_size + accumulated_mem_delta); | |||
// Also validate that there are no unidirectional connections (these add memory to the incoming | |||
// edges sets). | |||
// auto serializer = HNSWIndexSerializer(hnswIndex); | |||
// ASSERT_EQ(serializer.checkIntegrity().unidirectional_connections, 0); | |||
// ASSERT_EQ(hnswIndex->checkIntegrity().unidirectional_connections, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know why was this commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was moved to a separate test, not templated
tests/unit/test_common.cpp
Outdated
this->file_name = std::string(getenv("ROOT")) + "/tests/unit/data/test_common_hnsw"; | ||
ASSERT_EQ(this->GetFileSize(), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a simpler way to validate that a file under this name doesn't exist? Or am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i put this line here by mistake 😅
tests/unit/test_common.cpp
Outdated
VecSimIndex *index = test_utils::CreateNewIndex(params, VecSimType_FLOAT32); | ||
HNSWIndex<float, float> *hnsw_index = reinterpret_cast<HNSWIndex<float, float> *>(index); | ||
|
||
this->file_name = std::string(getenv("ROOT")) + "/tests/unit/data/test_common_hnsw"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this->file_name = std::string(getenv("ROOT")) + "/tests/unit/data/test_common_hnsw"; | |
this->file_name = std::string(getenv("ROOT")) + "/tests/unit/data/bad_index.hnsw |
tests/unit/test_hnsw.cpp
Outdated
// Save index. | ||
EXPECT_THROW(hnsw_index->saveIndex(this->file_name, Serializer::EncodingVersion_NOT_VALID), | ||
std::runtime_error); | ||
hnsw_index->saveIndex(this->file_name, Serializer::EncodingVersion_V2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have this check already in test_common no?
tests/unit/test_hnsw.cpp
Outdated
index = this->CreateNewIndex(other_params); | ||
|
||
VecSimIndexInfo other_index_info = VecSimIndex_Info(index); | ||
// Make sure that this index is different | ||
ASSERT_NE(serialized_index_info.hnswInfo.dim, other_index_info.hnswInfo.dim); | ||
ASSERT_NE(serialized_index_info.hnswInfo.blockSize, other_index_info.hnswInfo.blockSize); | ||
ASSERT_NE(serialized_index_info.hnswInfo.M, other_index_info.hnswInfo.M); | ||
ASSERT_NE(serialized_index_info.hnswInfo.efConstruction, | ||
other_index_info.hnswInfo.efConstruction); | ||
ASSERT_NE(serialized_index_info.hnswInfo.efRuntime, other_index_info.hnswInfo.efRuntime); | ||
ASSERT_NE(serialized_index_info.hnswInfo.epsilon, other_index_info.hnswInfo.epsilon); | ||
|
||
hnsw_index = this->CastToHNSW(index); | ||
hnsw_index->loadIndex(this->file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this test makes me think that we should use the loadIndex
as an alternative constructor. Creating an index and then loading it from file was the temporary solution at the time, but now it looks awkward to use loadIndex
to override index data. Let's discuss it...
9125357
to
74b8879
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job! :)
See some comments and questions that I had. Also, I would add some more tests such as - serializing and loading an empty index, searching before and after serialization (only in v2...)
src/VecSim/utils/serializer.cpp
Outdated
void Serializer::saveIndex(const std::string &location, EncodingVersion version) { | ||
// Only V1 and V2 are supported | ||
if (version >= EncodingVersion_INVALID) { | ||
throw std::runtime_error("Cannot load index: bad encoding version"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always save with EncodingVersion_V2
no?
src/VecSim/vec_sim_index.h
Outdated
@@ -42,6 +42,12 @@ struct VecSimIndexAbstract : public VecSimIndexInterface { | |||
spaces::SetDistFunc(metric, dim, &dist_func); | |||
} | |||
|
|||
#ifdef BUILD_TESTS | |||
VecSimIndexAbstract(std::shared_ptr<VecSimAllocator> allocator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a comment here that explains what this constructor exists for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i didn't use it eventually
HNSWIndexMetaData checkIntegrity() const; | ||
|
||
// Index memory size might be changed during index saving. | ||
virtual void saveIndexIMP(std::ofstream &output, EncodingVersion version) override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not virtual, right? we wouldn't override it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is pure virtual in the serializer class so it can call it from saveIndex function
@@ -8,7 +8,7 @@ namespace HNSWFactory { | |||
template <typename DataType, typename DistType = DataType> | |||
inline VecSimIndex *NewIndex_ChooseMultiOrSingle(const HNSWParams *params, | |||
std::shared_ptr<VecSimAllocator> allocator) { | |||
// check if single and return new bf_index | |||
// check if single and return new hnsw_index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch :)
case Serializer::EncodingVersion_V1: { | ||
assert(params); | ||
file_params.type = params->type; | ||
file_params.dim = params->dim; | ||
file_params.metric = params->metric; | ||
file_params.multi = params->multi; | ||
file_params.blockSize = params->blockSize; | ||
file_params.epsilon = params->epsilon; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit picking: what if the last 3 params are empty? What will be the default behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question
maybe epsilon and block size can have default values because they don't affect the loading process, but all the others must be the same as the loaded index since we cant get them from the file.
input.ignore(sizeof(unsigned long)); | ||
} | ||
|
||
// for V2 and up we don't serialize the level generator, so we just returb and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo return
@@ -29,7 +29,8 @@ def get_data_set(dataset_name): | |||
|
|||
|
|||
# Create an HNSW index from dataset based on specific params. | |||
def create_hnsw_index(dataset, ef_construction, M): | |||
def load_or_create_hnsw_index(dataset, index_file_name, ef_construction, M): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you ran this script locally to see that these changes work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes i ran make benchmark
|
||
std::string file_name; | ||
}; | ||
TEST_F(SerializerTest, HNSWSerialzer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TEST_F(SerializerTest, HNSWSerialzer) { | |
TEST_F(SerializerTest, HNSWSerialzerErrors) { |
Also, try to see if you can increase the coverage in some additional cases where we send invalid params
to the new constructor (bad M
, bad type etc...), and test the error message itself in addition to the exception type
tests/unit/test_hnsw.cpp
Outdated
serializer.saveIndex(file_name); | ||
VecSimIndex_Free(new_index); | ||
// Load the index from the file. | ||
VecSimIndex *serialized_index = HNSWFactory::NewIndex(file_name, ¶ms); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why sending params
? I thought that for V2 you don't need them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remvoed
size_t ef = 10; | ||
double epsilon = 0.004; | ||
size_t blockSize = 1; | ||
bool is_multi[] = {false, true}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have multi
in v1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theoretically we can have
// Check if we found an invalid neighbor. | ||
if (data[j] > this->max_id || data[j] == i) { | ||
res.valid_state = false; | ||
return res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was never covered
// Check if a certain neighbor appeared more than once. | ||
if (s.size() != size) { | ||
res.valid_state = false; | ||
return res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was never covered
: 0; | ||
if (incoming_edges_sets_sizes + double_connections != connections_checked) { | ||
res.valid_state = false; | ||
return res; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was never covered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You did it! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done!
regarding adding invalid values to enums, IMO we should avoid it
src/VecSim/vec_sim_interface.h
Outdated
@@ -154,4 +155,6 @@ struct VecSimIndexInterface : public VecsimBaseObject { | |||
inline static void setTimeoutCallbackFunction(timeoutCallbackFunction callback) { | |||
VecSimIndexInterface::timeoutCallback = callback; | |||
} | |||
|
|||
inline int64_t getAllocationSize() const { return this->allocator->getAllocationSize(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider moving this to VecsimBaseObject
#include "VecSim/memory/vecsim_base.h" | ||
#include "info_iterator_struct.h" | ||
|
||
#include <stddef.h> | ||
#include <stdexcept> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why include <stdexcept>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't remember but I removed it and it's ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i don't why it doesn't crush locally but the default flow it says that the it doesn't know std::runtime_error
src/VecSim/utils/serializer.cpp
Outdated
// Only V1 and V2 are supported | ||
if (version >= EncodingVersion_INVALID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about version 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i thought it is no longer supported
@alonre24 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right @meiravgri. But we don't return an error if version==0
. You can just add this validation to this if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add EncodingVersion_V0 and it won't throw because it is smaller than EncodingVersion_INVALID
typedef enum EncodingVersion { EncodingVersion_V1 = 1, EncodingVersion_V2 = 2, EncodingVersion_INVALID, // This should always be last. } EncodingVersion;
but it will return when we try to initialize params. So I don't quite understand how to handle v0 without returning an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you SHOULD return an error for version==0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agrees.
// Only V1 and V2 are supported | |
if (version >= EncodingVersion_INVALID) { | |
// Only V1 and V2 are supported | |
if (version <= 0 || version >= EncodingVersion_INVALID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh got it 😅
virtual void AddToLabelLookup(labelType label, idType id) override { | ||
if (label_lookup_.find(label) == label_lookup_.end()) { | ||
label_lookup_.emplace(label, vecsim_stl::vector<idType>{this->allocator}); | ||
} | ||
setVectorId(label, id); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding this function complicates things for no reason (my bad). I think it is better to move this check to setVectorId
itself and remove it from addVector
. we call this function once anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is called by the parent class so if we dont want to affect performance when we are not in tests mode we need the virtual function wrapper anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won’t effect performance because it is called once every insertion anyway
@@ -0,0 +1,307 @@ | |||
#pragma once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider calling this file hnsw_serializer.h
}; | ||
TEST_F(SerializerTest, HNSWSerialzer) { | ||
|
||
this->file_name = std::string(getenv("ROOT")) + "/tests/unit/data/bad_index.hnsw"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string(getenv("ROOT"))
throws an exception when ROOT is not defined (and getenv
returns null). Consider passing a default value instead ("." or calling pwd
).
// Set index type. | ||
params.multi = is_multi[i]; | ||
|
||
auto file_name = std::string(getenv("ROOT")) + "/tests/unit/data/1k-d4-L2-M8-ef_c10_" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string(getenv("ROOT"))
throws an exception when ROOT is not defined (and getenv
returns null). Consider passing a default value instead ("." or calling pwd
).
@@ -105,3 +105,18 @@ inline double GetInfVal(VecSimType type) { | |||
throw std::invalid_argument("This type is not supported"); | |||
} | |||
} | |||
|
|||
// Test a specific exception type is thrown and prints the right message. | |||
#define ASSERT_EXCEPTION_MESSAGE(VALUE, EXCEPTION_TYPE, MESSAGE) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
VALUE; \ | ||
FAIL() << "exception '" << MESSAGE << "' not thrown at all!"; \ | ||
} catch (const EXCEPTION_TYPE &e) { \ | ||
EXPECT_EQ(std::string(MESSAGE), e.what()) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a macro for comparing c-strings if you prefer.
static const char *hnsw_index_file; | ||
static inline std::string GetSerializedIndexLocation(const char *file_name) { | ||
return std::string(getenv("ROOT")) + "/" + file_name; | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string(getenv("ROOT"))
throws an exception when ROOT is not defined (and getenv
returns null). Consider passing a default value instead ("." or calling pwd
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done!
regarding adding invalid values to enums, IMO we should avoid it
c3c9611
to
e64f1ca
Compare
…ing, added multi test
added hnsw ctor for the serializer(required also new ctor for VecSimIndexAbstract) and factory functions to call this ctor, added VecSimAlgo_INVALID, removed friend class SerializerTest_HNSWSerialzer_Test, added serilized index v1 to unit data and the v1 only loads it
put ser test in the begining of hnsw, commented some in the end commented ser in hnsw multi tests fixed common test commented out all the unit tests except hnsw and allocator in hnsw::resize chnaged visited_nodes_handler from unique ptr to shared
… version = v2, hnsw ctor to load from file get params as argument for v1, also gets the version to intialize serializer level_generator is not saved from V2 and up, and skipped whin lodaing v1, new files for v2 unit tests datasets in bm support loading from files if existed
fixed test common
…ex arguments (always saving version 2), added default value to the v1 unit test
removed AddToLabelLookup (calling directly set vector id) checking if the label already exists or should be emplaced from add vector to set vector id
0971995
to
e79690c
Compare
Describe the changes in the pull request
In this PR we refactored the serializer class. it is no longer holding an index instance as a data member.
If an index is supposed to be serialized it should inherit the serializer and implement:
to avoid adding unnecessary functions and memory, the inheritance and serializing related functions can be declared and implemented under the BUILD_TESTS preprocessor flag.
-When using encoding version 1 we need to explicitly generate the HNSWParams struct with the same attributes as the loaded index.
VecSimIndexAbstract
class data members (dim, vector data type, metric, etc.) so we can initialize HNSWParams for the ctor from the file.Which issues this PR fixes
Main objects this PR modified
Mark if applicable