Deep Web Extractor (DWX)

The Deep Web Extractor (DWX) system uses statistical machine learning models for crawling and data discovery on the Deep Web (i.e., the massive, high-quality portion of the World Wide Web hidden behind search interfaces) in order to build knowledge-base databases.

The main objectives of the system are:

  1. To discover and extract high-quality Deep Web content for web searchers.
  2. To develop automated means of identifying searchable web form interfaces and directing queries to them to dig out hidden information.
  3. To build domain-specific data repositories (e.g., real estate, newspapers, health) for purposeful analysis and for building knowledge-base databases.
  4. To handle complex queries, such as queries containing different range values, that traditional search engines do not support.
  5. To facilitate law-enforcement agencies in detecting fraudulent web users.
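As a minimal sketch of objective 2, searchable-form detection can be framed as a binary classification over simple features of each HTML form. The features, weights, and the `is_searchable` helper below are illustrative assumptions for demonstration, not the actual DWX model (which would learn its weights from labeled forms):

```python
# Hypothetical sketch: deciding whether an HTML <form> is a searchable
# query interface. Feature set and weights are illustrative assumptions.
import math
from html.parser import HTMLParser

class FormFeatureExtractor(HTMLParser):
    """Collects simple features from <form> elements in an HTML page."""
    def __init__(self):
        super().__init__()
        self.in_form = False
        self.features = {"text_inputs": 0, "password_inputs": 0,
                         "selects": 0, "has_search_hint": 0}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
            # "search" in the action URL is a strong positive hint
            if "search" in (attrs.get("action") or "").lower():
                self.features["has_search_hint"] = 1
        elif self.in_form and tag == "input":
            itype = (attrs.get("type") or "text").lower()
            if itype == "text":
                self.features["text_inputs"] += 1
            elif itype == "password":  # password fields suggest login, not search
                self.features["password_inputs"] += 1
        elif self.in_form and tag == "select":
            self.features["selects"] += 1

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False

# Illustrative logistic-regression weights (in practice, learned from data).
WEIGHTS = {"text_inputs": 0.8, "password_inputs": -2.5,
           "selects": 0.6, "has_search_hint": 2.0}
BIAS = -1.0

def is_searchable(html: str, threshold: float = 0.5) -> bool:
    """Score a page's form features and threshold the sigmoid probability."""
    parser = FormFeatureExtractor()
    parser.feed(html)
    score = BIAS + sum(WEIGHTS[k] * v for k, v in parser.features.items())
    prob = 1.0 / (1.0 + math.exp(-score))  # sigmoid
    return prob >= threshold

search_form = ('<form action="/search"><input type="text" name="q">'
               '<select name="city"></select></form>')
login_form = ('<form action="/login"><input type="text" name="user">'
              '<input type="password" name="pw"></form>')
```

Here `is_searchable(search_form)` returns `True` while `is_searchable(login_form)` returns `False`, since password fields push the score toward the "non-searchable" class.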

The proposed architecture of the Deep Web Extractor (DWX) system is shown in the figure below:

[Figure: DWX system architecture]
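To illustrate the kind of complex range query targeted by objective 4, the sketch below filters a domain-specific repository (here, hypothetical real-estate listings) by an inclusive value range. The schema and data are illustrative assumptions, not DWX's actual repository format:

```python
# Hypothetical sketch: answering a range query against a domain-specific
# repository, the kind of query traditional keyword search engines reject.
listings = [
    {"city": "Lahore", "price": 150_000},
    {"city": "Lahore", "price": 450_000},
    {"city": "Karachi", "price": 300_000},
]

def range_query(records, field, low, high):
    """Return records whose `field` value falls in the inclusive range [low, high]."""
    return [r for r in records if low <= r[field] <= high]

matches = range_query(listings, "price", 200_000, 400_000)
# keeps only the listing priced 300,000
```

A full system would push such range constraints down into the queries it submits to discovered web forms, rather than filtering after the fact.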
