SearchMart

SearchEngine for scraped pages from walmart.com

Contains the webscraper used to scrape data from walmart.com and the scraped dataset. Each '_products.txt' file contains unique products in that category and the '_links.txt' file holds the links between different products. The 'all_products.txt' and 'all_links.txt' files contain all the unique products and all their unique links. In total we had 21,532 unique products.

To run the web scraper, you need to provide three command-line arguments: the url where it should start scraping, the max number of products it should crawl to, and the name of the department you are scraping.

All required modules must be in their standard versions except you must install scikit-learn version 0.22.2.post1 (pip install -Iv scikit-learn==0.22.2.post1)

To use the server for the backend, run an Apache server with index.html in ServerPython/, and also run the apy.py REST API in RestAPI/.

Link to our machine learning models are here. To load them into our server, make sure they are in a folder titled 'MachineLearningModels' that is one directory above the directory the server is in.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
JupyerNotebooks		JupyerNotebooks
RestAPI		RestAPI
ScrapeData		ScrapeData
ServerPython		ServerPython
WalmartWebScraper		WalmartWebScraper
formatted_data		formatted_data
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SearchMart

About

Releases

Packages

Languages

themorlock/SearchMart

Folders and files

Latest commit

History

Repository files navigation

SearchMart

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages