Find the shortest way from one Wikipedia page to another

benediktwerner/WikiLinkAnalyzer

Wikipedia link-chain analyzer

This is a program to find the shortest link chains between two Wikipedia pages.
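
As an illustration of what "shortest link chain" means, here is a minimal sketch of a breadth-first search over a link graph. It is not taken from the analyzer's source; the function name `shortest_chain` and the representation (pages as indices, `links[p]` listing the pages that `p` links to) are assumptions made for this example.

```rust
use std::collections::VecDeque;

/// Hypothetical helper: returns the shortest chain of page indices from
/// `start` to `goal`, where `links[p]` lists the pages that page `p` links to.
/// Assumes every index stored in `links` is smaller than `links.len()`.
fn shortest_chain(links: &[Vec<usize>], start: usize, goal: usize) -> Option<Vec<usize>> {
    if start >= links.len() || goal >= links.len() {
        return None;
    }
    // parent[p] is the page from which p was first reached;
    // usize::MAX marks pages that have not been reached yet.
    let mut parent = vec![usize::MAX; links.len()];
    let mut queue = VecDeque::new();
    parent[start] = start;
    queue.push_back(start);

    while let Some(page) = queue.pop_front() {
        if page == goal {
            // Walk the parent pointers back to `start`, then reverse.
            let mut chain = vec![page];
            let mut current = page;
            while current != start {
                current = parent[current];
                chain.push(current);
            }
            chain.reverse();
            return Some(chain);
        }
        for &next in &links[page] {
            if parent[next] == usize::MAX {
                parent[next] = page;
                queue.push_back(next);
            }
        }
    }
    None // `goal` cannot be reached from `start`
}
```

Because breadth-first search visits pages in order of increasing distance, the first time `goal` is taken from the queue the recovered chain is one of the shortest ones.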

It can also be used to find pages that need the longest link-chains to a given page and more.
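
One possible way to find the pages with the longest link chains to a given page is to reverse every link and run a single breadth-first search from the target. The sketch below works under that assumption; `farthest_pages` and the index-based graph layout are hypothetical and may differ from what the analyzer actually does. Pages that cannot reach the target at all are ignored.

```rust
use std::collections::VecDeque;

/// Hypothetical helper: returns the largest shortest-chain length to `target`
/// and the pages whose shortest chain to `target` has that length.
/// `links[p]` lists the pages that page `p` links to; every stored index
/// is assumed to be smaller than `links.len()`.
fn farthest_pages(links: &[Vec<usize>], target: usize) -> (usize, Vec<usize>) {
    if target >= links.len() {
        return (0, Vec::new());
    }
    // Reverse the graph: incoming[p] lists the pages that link to p.
    let mut incoming: Vec<Vec<usize>> = vec![Vec::new(); links.len()];
    for (from, outgoing) in links.iter().enumerate() {
        for &to in outgoing {
            incoming[to].push(from);
        }
    }
    // dist[p] is the length of the shortest chain from p to `target`;
    // usize::MAX marks pages that have not been reached.
    let mut dist = vec![usize::MAX; links.len()];
    let mut queue = VecDeque::new();
    dist[target] = 0;
    queue.push_back(target);
    let mut max_dist = 0;

    while let Some(page) = queue.pop_front() {
        // Pages leave the queue in non-decreasing distance order,
        // so the last value assigned here is the maximum.
        max_dist = dist[page];
        for &prev in &incoming[page] {
            if dist[prev] == usize::MAX {
                dist[prev] = dist[page] + 1;
                queue.push_back(prev);
            }
        }
    }
    let farthest = (0..links.len()).filter(|&p| dist[p] == max_dist).collect();
    (max_dist, farthest)
}
```

For example, with links 0→1, 1→2, 3→2 and 4→0 and target page 2, page 4 is the farthest page at distance 3.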

Building and Installing

Building and installing requires a Rust installation.

To install:

$ git clone https://github.com/benediktwerner/WikiLinkAnalyzer
$ cargo install --path WikiLinkAnalyzer
$ wiki-analyzer --version

To build:

$ git clone https://github.com/benediktwerner/WikiLinkAnalyzer
$ cd WikiLinkAnalyzer
$ cargo build
$ ./target/debug/wiki-analyzer

Setup

  • Go to https://dumps.wikimedia.org/
  • Choose a wiki, e.g. "enwiki" for English Wikipedia or "dewiki" for German Wikipedia
  • Download the table dumps for the tables page, pagelinks and redirect as .sql.gz archives
  • Place them in the data directory (relative to the directory you run the command in)
  • The analyzer will automatically extract and preprocess these files on the first run
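
To make the download step concrete, the following sketch builds the URLs of the three table dumps for a chosen wiki. It assumes the `latest` directory layout and file naming that dumps.wikimedia.org used at the time of writing (`<wiki>/latest/<wiki>-latest-<table>.sql.gz`). The helper `dump_urls` is hypothetical, and which file names the analyzer expects inside the data directory is not specified here.

```rust
/// The three tables named in the setup steps.
const TABLES: [&str; 3] = ["page", "pagelinks", "redirect"];

/// Hypothetical helper: builds the download URL of each table dump for `wiki`
/// (e.g. "enwiki" or "dewiki"). The `latest` path and the file naming are
/// assumptions about the layout of dumps.wikimedia.org.
fn dump_urls(wiki: &str) -> Vec<String> {
    TABLES
        .iter()
        .map(|table| {
            format!(
                "https://dumps.wikimedia.org/{0}/latest/{0}-latest-{1}.sql.gz",
                wiki, table
            )
        })
        .collect()
}
```

Calling `dump_urls("dewiki")` yields three URLs, one per table, which can then be downloaded into the data directory with any tool.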

Debugging tips

You can use the Wikipedia API to find an article from its ID: https://de.wikipedia.org/w/api.php?action=query&prop=info&pageids=3034015&inprop=url.
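
When looking up many IDs while debugging, it can help to generate these URLs. The small sketch below mirrors the URL above; the function name `page_info_url` and its parameters are made up for this example and are not part of the analyzer.

```rust
/// Hypothetical helper: builds the Wikipedia API URL that returns info
/// (including the page URL) for the page with the given ID.
/// `lang` is the language subdomain, e.g. "de" or "en".
fn page_info_url(lang: &str, page_id: u64) -> String {
    format!(
        "https://{}.wikipedia.org/w/api.php?action=query&prop=info&pageids={}&inprop=url",
        lang, page_id
    )
}
```

`page_info_url("de", 3034015)` reproduces the example URL; open it in a browser or fetch it with any HTTP client.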

Table documentation:
