This application is part of the ReInHerit Toolkit, implementing the system presented in the paper cited below. It shows how to implement smart retrieval using image-to-image and text-to-image retrieval in the Cultural Heritage domain. The dataset used is NoisyArt, a dataset designed for webly-supervised recognition of artworks that supports multi-modality learning and zero-shot learning. The dataset consists of more than 80,000 webly-supervised images from 3120 classes, plus a subset of 200 classes with more than 1300 verified images.
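At its core, both retrieval modes reduce to ranking the dataset items by cosine similarity between embedding vectors (in the real system, CLIP embeddings of images and of text queries). A minimal, library-free sketch of that ranking step, using hypothetical toy vectors in place of CLIP features:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, gallery):
    """Return (name, score) pairs sorted by descending similarity to the query."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in gallery.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Toy 3-d "embeddings" standing in for CLIP features (hypothetical values).
gallery = {
    "girl_asleep.jpg": [0.9, 0.1, 0.0],
    "intertwining.jpg": [0.1, 0.8, 0.2],
}
query = [0.85, 0.15, 0.05]  # embedding of a text query or a query image
print(rank(query, gallery)[0][0])  # most similar artwork
```

The same ranking works for image-to-image and text-to-image retrieval because CLIP projects both modalities into a shared embedding space.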
To get a local copy up and running follow these simple steps.
We strongly recommend using the Anaconda package manager to avoid dependency/reproducibility problems. A conda installation guide for Linux systems can be found here
- Clone the repo
git clone https://github.com/ReInHerit/SmartRetrievalArtDemo
- Install Python dependencies
conda create -n clip4art -y python=3.8
conda activate clip4art
conda install -y -c pytorch pytorch=1.7.1 torchvision=0.8.2
pip install flask==2.0.2
pip install git+https://github.com/openai/CLIP.git
- Download NoisyArt dataset
Here's a brief description of each file and folder in the repo:
- utils.py: utils file
- data_utils.py: dataset loading and preprocessing utils file
- extract_features.py: feature extraction file
- app.py: Flask server file
- static: Flask static files folder
- templates: Flask templates folder
To work properly with the codebase, the NoisyArt dataset should have the following structure:
project_base_path
└─── noisyart_dataset
└─── noisyart
| metadata.json
└─── splits
└─── trainval_3120
| boot.txt
| train.txt
| trainval.txt
| val.txt
└─── trainval_3120
└─── 0000_http:^^dbpedia.org^resource^'X'_Intertwining
| google_000.jpg
| google_001.jpg
| ...
└─── 0001_http:^^dbpedia.org^resource^1618_in_art
| flickr_000.jpg
| flickr_001.jpg
| ...
└─── ...
└─── trainval_200
└─── 0015_http:^^dbpedia.org^resource^A_Converted_British_Family_Sheltering_a_Christian_Missionary_from_the_Persecution_of_the_Druids
| flickr_000.jpg
| flickr_001.jpg
| ...
└─── 0022_http:^^dbpedia.org^resource^A_Girl_Asleep
| flickr_000.jpg
| flickr_001.jpg
| ...
└─── ...
└─── test_200
└─── 0015_http:^^dbpedia.org^resource^A_Converted_British_Family_Sheltering_a_Christian_Missionary_from_the_Persecution_of_the_Druids
| 11263104_10152876313375669_7205129171364455318_n.jpg
| Afmolean1.jpg
| ...
└─── 0022_http:^^dbpedia.org^resource^A_Girl_Asleep
| 5th-vermeer-in-met-a.jpg
| A_Maid_Asleep_-_Painting_of_Vermeer,_with_frame.jpg
| ...
└─── ...
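As the tree above shows, each class folder encodes the artwork's DBpedia URI, with `/` replaced by `^` so the URI can serve as a directory name. Assuming that convention holds throughout the dataset, a small helper can recover the class index and URI from a folder name:

```python
def parse_class_folder(folder_name):
    """Split a NoisyArt class folder name into (class index, DBpedia URI).

    Assumes the '^' character stands in for '/' in the encoded URI,
    as in '0022_http:^^dbpedia.org^resource^A_Girl_Asleep'.
    """
    index, _, encoded_uri = folder_name.partition("_")
    return int(index), encoded_uri.replace("^", "/")

idx, uri = parse_class_folder("0022_http:^^dbpedia.org^resource^A_Girl_Asleep")
print(idx, uri)  # 22 http://dbpedia.org/resource/A_Girl_Asleep
```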
Before launching the demo, it is necessary to extract the features using the following command:
python extract_features.py
Start the server and run the demo using the following command:
python app.py
By default, the server runs on port 5000 of the localhost address: http://127.0.0.1:5000/
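Once the server is up, you can open that address in a browser. If you want to script requests against it instead, the query URL can be built with the standard library. Note that the `search` endpoint and the `q` parameter below are hypothetical placeholders, not the app's documented API; check the routes defined in app.py for the actual paths:

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "http://127.0.0.1:5000/"

def build_query_url(text):
    """Build a GET URL for a text-to-image query.

    The endpoint path 'search' and parameter name 'q' are hypothetical;
    see app.py for the routes the Flask server actually exposes.
    """
    return urljoin(BASE_URL, "search") + "?" + urlencode({"q": text})

print(build_query_url("a girl asleep"))
# http://127.0.0.1:5000/search?q=a+girl+asleep
```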
- Initially, users can either enter a description of the artwork they are seeking or select an image from the artwork collection.
- Once users have entered a description, they are shown the artworks that best match it.
- Clicking on an artwork displays the artwork together with its description.
This work was partially supported by the European Commission under European Horizon 2020 Programme, grant number 101004545 - ReInHerit.
If you use this software in your work please cite:
@inproceedings{HeriTech-2022,
  booktitle = {Communications in Computer and Information Science - Proc. of International Conference Florence Heri-tech: the Future of Heritage Science and Technologies},
  doi = {10.1007/978-3-031-20302-2_11},
  pages = {140--149},
  publisher = {Springer},
  title = {Exploiting {CLIP}-based Multi-modal Approach for Artwork Classification and Retrieval},
  volume = {1645},
  year = {2022}}