Deep Web Extractor (DWX)

The Deep Web Extractor (DWX) system uses statistical machine learning models to crawl and discover data on the Deep Web (i.e., the massive, high-quality portion of the World Wide Web) in order to build knowledge-base databases.

The main objectives of the system are:

To discover and extract quality Deep Web content for web searchers.
To find automated means of identifying searchable web form interfaces and directing queries to them to dig out information.
To build domain-specific data repositories (e.g., real estate, newspapers, health) for purposeful analysis and for building knowledge-base databases.
To handle complex queries, such as queries over ranges of values, that traditional search engines do not support.
To help law enforcement agencies detect fraudulent web users.
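
As a rough illustration of the second and fourth objectives, the sketch below finds candidate search forms in a page and builds a range query against one of them. It is not the DWX implementation: DWX uses statistical machine learning models, while this example swaps in a simple hand-written rule (`is_searchable`) so that it stays self-contained. The names `FormFinder`, `is_searchable`, `find_searchable_forms`, and `build_query_url`, the sample HTML, and the `example.com` URL are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormFinder(HTMLParser):
    """Collect every <form> on a page with its action, method, and fields."""

    def __init__(self):
        super().__init__()
        self.forms = []       # forms that have been fully parsed
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
        elif self._current is not None and tag in ("input", "select", "textarea"):
            # For <input>, record its type; otherwise record the tag name.
            kind = (attrs.get("type") or "text").lower() if tag == "input" else tag
            self._current["fields"].append((attrs.get("name") or "", kind))

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def is_searchable(form):
    """Assumed rule of thumb (not the DWX classifier): a form with a
    free-text or select field and no password field is a search form."""
    kinds = [kind for _, kind in form["fields"]]
    if "password" in kinds:
        return False
    return any(kind in ("text", "search", "select") for kind in kinds)


def find_searchable_forms(html):
    """Return the forms in `html` that pass the is_searchable rule."""
    parser = FormFinder()
    parser.feed(html)
    parser.close()
    return [form for form in parser.forms if is_searchable(form)]


def build_query_url(page_url, form, values):
    """Build the GET URL that submitting `form` with `values` would request."""
    return urljoin(page_url, form["action"]) + "?" + urlencode(values)


# Hypothetical page with a login form and a real-estate search form.
SAMPLE_HTML = """
<form action="/login" method="post">
  <input type="text" name="user"><input type="password" name="pw">
</form>
<form action="/listings" method="get">
  <input type="text" name="city">
  <input type="number" name="min_price"><input type="number" name="max_price">
</form>
"""

forms = find_searchable_forms(SAMPLE_HTML)
url = build_query_url(
    "http://example.com/",
    forms[0],
    {"city": "Lahore", "min_price": 100000, "max_price": 500000},
)
print(url)
```

The login form is skipped because it contains a password field, so only the listings form is queried. A learned classifier over form features would replace `is_searchable` in a system like the one described here.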

The proposed architecture of the Deep Web Extractor (DWX) system is shown in the figure:

