A web scraper written with Selenium
that uses ChromeDriver to scrape the IKEA US website for four product categories.
- Selenium
- BeautifulSoup -
pip install bs4
- Pyexcel -
pip install pyexcel
- Keras
- Tensorflow
- OpenCV
- Scikit-Learn
- Split-Folders -
pip install split-folders
- Check your Google Chrome browser version with this link.
- Based on that version, download the matching ChromeDriver from here.
- Place the downloaded ChromeDriver in the folder for your OS.
Example:
On Windows, the final location is: IKEA-master/Driver/Windows/Chromedriver.exe
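As a rough illustration of the per-OS layout described above, the driver location could be resolved as in the sketch below. Only the Windows path comes from this README; the `Linux` and `Mac` folder names, the `chromedriver` binary name, and the `driver_path` helper are assumptions made for the example and may not match the repository.

```python
import platform
from pathlib import Path

# Maps platform.system() values to (folder, binary name) under Driver/.
# Only the "Windows" entry is taken from the README; the rest are assumed.
DRIVER_LOCATIONS = {
    "Windows": ("Windows", "Chromedriver.exe"),
    "Linux": ("Linux", "chromedriver"),
    "Darwin": ("Mac", "chromedriver"),
}


def driver_path(project_root, system=None):
    """Return the expected ChromeDriver path for the given (or current) OS."""
    system = system or platform.system()
    if system not in DRIVER_LOCATIONS:
        raise ValueError(f"No driver folder configured for {system!r}")
    folder, binary = DRIVER_LOCATIONS[system]
    return Path(project_root) / "Driver" / folder / binary


print(driver_path("IKEA-master", "Windows").as_posix())
```

Checking that the returned path exists before starting the browser gives a clearer error than a failed driver launch.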
The first step is to scrape the website, which populates the folders inside the Downloads folder:
python scrapper.py
By default, four classes are downloaded: Bed, Chair, Lamp, and Wardrobe. To change them, edit the keys variable at line 22. Keywords to ignore can be added to the ignore variable at line 23.
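To make the role of the two variables concrete, here is a minimal sketch of how an ignore list might be applied to scraped product names. The `keys` values are the defaults named above; the `ignore` entries, the sample product names, and the `filter_products` helper are hypothetical, and the actual matching logic in the scraper may differ.

```python
keys = ["Bed", "Chair", "Lamp", "Wardrobe"]  # default classes from the README
ignore = ["cover", "cushion"]                # hypothetical ignore keywords


def is_wanted(product_name, ignore_words):
    """True if no ignore keyword appears in the name (case-insensitive)."""
    name = product_name.lower()
    return not any(word.lower() in name for word in ignore_words)


def filter_products(product_names, ignore_words):
    """Keep only the product names that contain none of the ignore keywords."""
    return [name for name in product_names if is_wanted(name, ignore_words)]


scraped = ["Oak bed frame", "Chair cushion", "Office chair"]
print(filter_products(scraped, ignore))
```

Filtering at scrape time keeps accessories such as cushions out of the Chair class and reduces the cleaning needed later.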
Junk images may be downloaded. To clean these up and split the data into train, test, and val sets (generated in Dataset):
python cleaner.py
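The splitting step can be pictured with the standard-library sketch below. It is not the code in cleaner.py, which presumably relies on the split-folders package listed in the requirements. The 80/10/10 ratio, the seed, and the `split_items` helper are assumptions made for the example.

```python
import random


def split_items(items, ratio=(0.8, 0.1, 0.1), seed=42):
    """Shuffle items reproducibly and split them into train/val/test lists.

    ratio is (train, val, test); test receives whatever remains after
    the train and val shares are taken.
    """
    shuffled = sorted(items)               # fixed starting order
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_train = int(len(shuffled) * ratio[0])
    n_val = int(len(shuffled) * ratio[1])
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


images = [f"img_{i}.jpg" for i in range(100)]
parts = split_items(images)
print({name: len(files) for name, files in parts.items()})
```

Splitting each class separately keeps the class balance similar across the three sets.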
An Xception base network with a few extra layers is used to build the model. To train:
python trainer.py
The CNN trains for 30 epochs by default with a batch size of 32. These can be changed at line 22 and line 23, respectively, of the trainer.py script.
After training, the model can be evaluated with the predictor.py script, which also plots the confusion matrix:
python predictor.py
By default, the evaluation is done on the “Test” set; this can be changed to “Val” at line 10 of predictor.py.
Accuracy: 93.75%
| | Bed | Chair | Lamp | Wardrobe |
|---|---|---|---|---|
| Bed | 17 | 0 | 0 | 0 |
| Chair | 1 | 20 | 0 | 0 |
| Lamp | 1 | 0 | 19 | 1 |
| Wardrobe | 1 | 1 | 0 | 19 |
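
The reported accuracy can be checked against the confusion matrix: the diagonal holds the correctly classified images, so accuracy is the diagonal sum divided by the total count. The sketch below copies the values from the table; the `accuracy` helper is written for this example and is not part of predictor.py.

```python
labels = ["Bed", "Chair", "Lamp", "Wardrobe"]
matrix = [
    [17, 0, 0, 0],
    [1, 20, 0, 0],
    [1, 0, 19, 1],
    [1, 1, 0, 19],
]


def accuracy(cm):
    """Overall accuracy: correctly classified samples over all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))  # diagonal entries
    total = sum(sum(row) for row in cm)
    return correct / total


print(f"{accuracy(matrix):.2%}")  # 75 correct out of 80 -> 93.75%
```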