The framework represents CBIR as several steps:
- Transformation (computing descriptors, any array manipulations)
- Sampling (a frequent step before quantization)
- Quantization
- Search
- Search evaluation, plotting
All steps (except plotting) follow a pipeline design: each step has an input data store and an output data store, and in the transformation step you can even imitate a real pipeline by chaining transformers.
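The step/data-store contract described above can be sketched as follows. This is an illustrative sketch only: the names `run_transformation_step`, `ListDataStore` and `ScaleTransformer` are assumptions, not the framework's actual API.

```python
class ListDataStore:
    """Toy in-memory data store (a stand-in for SQLiteDataStore etc.)."""

    def __init__(self, items=None):
        self._items = dict(items or {})

    def get_items_sorted_by_ids(self):
        return sorted(self._items.items())

    def save_items(self, items):
        self._items.update(items)


class ScaleTransformer:
    """Toy transformer: multiplies every value in a vector by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def transform_item(self, item):
        return [x * self.factor for x in item]


def run_transformation_step(input_store, output_store, transformers):
    """One pipeline step: read all items from the input store, push each
    item through a chain of transformers, save results to the output store."""
    results = []
    for item_id, item in input_store.get_items_sorted_by_ids():
        for transformer in transformers:
            item = transformer.transform_item(item)
        results.append((item_id, item))
    output_store.save_items(results)
```

Chaining two `ScaleTransformer`s in one step imitates two pipeline stages without materializing the intermediate result in a store.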
To add a new type of descriptor, add your own <transformer.py> implementing a method like 'transform_item'; you can then pass it to the transformation step and compute descriptors of the new type you need.
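A new descriptor type might look like this. A minimal sketch, assuming the transformation step only requires 'transform_item'; the class name and the grayscale-histogram descriptor are illustrative, not part of the framework.

```python
import numpy as np


class HistogramTransformer:
    """Hypothetical new descriptor type: a normalized grayscale histogram."""

    def __init__(self, n_bins=16):
        self.n_bins = n_bins

    def transform_item(self, image):
        # image: 2-D array of pixel intensities in [0, 255]
        hist, _ = np.histogram(image, bins=self.n_bins, range=(0, 256))
        # normalize so descriptors of differently sized images are comparable
        return hist.astype(np.float64) / hist.sum()
```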
The situation is similar for data stores. There are already implementations like SQLiteDataStore, CSVFileDataStore and NumpyDataStore. You can add your own <data_store.py> implementing several methods like 'get_items_sorted_by_ids' and then pass it to any step in your cbir pipeline.
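A user-provided data store could look like this, analogous to CSVFileDataStore but backed by a JSON file. This sketch assumes a 'get_items_sorted_by_ids' / 'save_items' contract; the framework's real method set may differ.

```python
import json


class JsonFileDataStore:
    """Hypothetical user-provided data store backed by a JSON file."""

    def __init__(self, path):
        self.path = path

    def save_items(self, items):
        # JSON object keys must be strings, so ids are stringified on disk
        with open(self.path, "w") as f:
            json.dump({str(item_id): item for item_id, item in items}, f)

    def get_items_sorted_by_ids(self):
        try:
            with open(self.path) as f:
                data = json.load(f)
        except FileNotFoundError:
            return []
        return sorted((int(item_id), item) for item_id, item in data.items())
```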
(Note that there is considerable mess in the way data is fetched, processed and saved. The intention was to process data in a stream-like style, but this led to unpleasant restrictions and performance issues.)
The framework depends on Python modules from the inverted_multi_index project. It uses them for fast (I hope) vector operations, building the inverted multi-index, performing inverted multi-index search, and exhaustive search with SDC and ADC distance computations.
The major steps, with examples, follow.
- global descriptors
- local descriptors
- finding centroids
- quantizing global descriptors to pq codes
- finding centroids pairwise distances
It's often enough to quantize only a sample of the descriptors.
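The quantization step above can be sketched as product quantization: split each descriptor into subspaces and assign every sub-vector to its nearest centroid. The function name `pq_encode` and the array layout are assumptions for illustration, not the framework's actual API; centroids would typically be trained (e.g. by k-means) on a sample of the descriptors, as noted above.

```python
import numpy as np


def pq_encode(descriptors, centroids):
    """Quantize descriptors to product-quantization codes.

    descriptors: (n, d) array.
    centroids:   (m, k, d/m) array -- m subspaces of k centroids each.
    Returns an (n, m) array of centroid indices (the pq codes).
    """
    n, d = descriptors.shape
    m, k, sub = centroids.shape
    assert d == m * sub
    parts = descriptors.reshape(n, m, sub)
    codes = np.empty((n, m), dtype=np.uint8)
    for j in range(m):
        # squared distance of every sub-vector to every centroid in subspace j
        dists = ((parts[:, j, None, :] - centroids[j][None, :, :]) ** 2).sum(axis=2)
        codes[:, j] = dists.argmin(axis=1)
    return codes
```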
4 types of search are supported:
- exhaustive search
  requires: global descriptors
- ADC search
  requires: pq codes, centroids
- SDC search
  requires: pq codes, centroids pairwise distances
- IMI (inverted multi-index) search
  requires: pq codes, centroids
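The SDC and ADC distance computations that these search types rely on can be sketched as follows. This is only an illustration of the idea; the actual implementations live in the inverted_multi_index modules, and the function names here are made up.

```python
import numpy as np


def sdc_distance(code_q, code_db, centroid_dists):
    """Symmetric distance: both query and database vectors are quantized,
    so the distance is a sum of precomputed centroid-to-centroid distances.

    centroid_dists: (m, k, k) array of squared pairwise centroid distances.
    """
    m = len(code_q)
    return sum(centroid_dists[j, code_q[j], code_db[j]] for j in range(m))


def adc_distance(query, code_db, centroids):
    """Asymmetric distance: the raw (unquantized) query is compared against
    the centroids that the database code points to.

    centroids: (m, k, sub) array; query: flat (m*sub,) vector.
    """
    m, k, sub = centroids.shape
    parts = query.reshape(m, sub)
    return sum(((parts[j] - centroids[j, code_db[j]]) ** 2).sum() for j in range(m))
```

SDC needs the precomputed centroids pairwise distances, while ADC needs the centroids themselves, which matches the per-search requirements above.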
A step to evaluate search performance. It can:
- compare descriptors for exhaustive search
- compare memory footprints of descriptors for exhaustive search
- compare quantization parameters for pq search techniques (adc, sdc, imi)
- compare pq search types (adc, sdc, imi)
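One common way to compare search types like these is recall@R against an exhaustive-search ground truth. A sketch of such a metric (the name `recall_at_r` is illustrative, not the framework's evaluation API):

```python
def recall_at_r(retrieved, ground_truth, r):
    """Fraction of queries whose true nearest neighbor appears among the
    first r retrieved ids.

    retrieved:    per-query lists of retrieved ids, best first.
    ground_truth: per-query id of the true nearest neighbor
                  (e.g. from exhaustive search).
    """
    hits = sum(gt in row[:r] for row, gt in zip(retrieved, ground_truth))
    return hits / len(ground_truth)
```

Sweeping r (and quantization parameters) and plotting the resulting curves is the kind of comparison this step produces.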
For a UI for finding images by example, a client-server architecture was implemented. The server is configured with one of the "searchers" (as in the examples) and then listens for requests on localhost. A client with a GUI connects to the server on startup; the user can choose images in the filesystem and then search for the most similar images. (Here file paths are transmitted to the server, because all communication happens on one machine, but it would not be difficult to change this to send/receive image bytes.) When the server receives a search request, it searches using the searcher it was configured with and sends back to the client the file paths of the nearest images.
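The request/response exchange can be sketched with plain sockets. A toy version under stated assumptions: `serve_once`, `search_request` and the searcher's `find_nearest` method are all illustrative names, and a real searcher would run one of the search types above instead of `DummySearcher`.

```python
import json
import socket
import threading


def _recv_all(conn):
    """Read until the peer closes its sending side."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


class DummySearcher:
    """Stand-in for a configured searcher; a real one would run
    exhaustive/ADC/SDC/IMI search over the indexed images."""

    def find_nearest(self, query_path):
        return [query_path + ".nn1", query_path + ".nn2"]


def serve_once(searcher, host="127.0.0.1"):
    """Accept one request carrying an image path and answer with the
    file paths of the nearest images (as JSON). Returns the bound port."""
    srv = socket.socket()
    srv.bind((host, 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def handle():
        conn, _ = srv.accept()
        with conn:
            query_path = _recv_all(conn).decode()
            nearest = searcher.find_nearest(query_path)
            conn.sendall(json.dumps(nearest).encode())
        srv.close()

    threading.Thread(target=handle, daemon=True).start()
    return port


def search_request(port, query_path, host="127.0.0.1"):
    """Client side: send a path, receive the nearest images' file paths."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(query_path.encode())
        conn.shutdown(socket.SHUT_WR)  # signal end of request
        return json.loads(_recv_all(conn).decode())
```

Switching to sending image bytes instead of paths would only change what the client writes and what the server hands to the searcher.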