Wikigraph

The file wikigraph.py implements classes for finding paths between Wikipedia articles, along with related helper functions, using the Wikimedia API. A path is built by following the links each article contains, just like the Wikipedia game. See the blog post https://winstonjay.github.io/posts/homunculus for more on the project's motivations.
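To make the idea of "linking articles by the links they contain" concrete, here is a minimal sketch of a breadth-first search over a link graph. It is not taken from wikigraph.py: the function name `shortest_link_path` and the `TOY_LINKS` dictionary are hypothetical, and the toy data stands in for real API responses.

```python
from collections import deque

# Hypothetical toy link graph: title -> titles it links to.
# These are made-up links for illustration, not real Wikipedia data.
TOY_LINKS = {
    "Car": ["Road", "Engine"],
    "Road": ["City"],
    "Engine": ["Fuel"],
    "City": ["Home"],
}


def shortest_link_path(links, start, end):
    """Return the shortest list of titles from start to end, or None."""
    if start == end:
        return [start]
    parents = {start: None}  # title -> title we reached it from
    queue = deque([start])
    while queue:
        title = queue.popleft()
        for neighbour in links.get(title, ()):
            if neighbour in parents:
                continue  # already visited
            parents[neighbour] = title
            if neighbour == end:
                # Walk back through the parents to rebuild the path.
                path = [end]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


path = shortest_link_path(TOY_LINKS, "Car", "Home")
print(" -> ".join(path))
```

Breadth-first search is used here because it returns a path with the fewest link hops; whether the project uses the same strategy is an assumption.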

Basic Use

Install the requirements (see requirements.txt), then run:

python findpath.py --start="Car" --end="Home"

Example session:

The main method, find_path, is best run in a shell session or as part of a batch collection: it memoizes results as it runs, which speeds up later searches and reduces requests to the Wikimedia API.

>>> import wikigraph
>>> w = wikigraph.WikiGraph()
>>> path = w.find_path(start="Tom Hanks", end="Kevin Bacon")
>>> print(path)
<wikigraph.Path: Tom Hanks -> Kevin Bacon>
>>> print(path.info)
Path:
        Path:        Tom Hanks -> Kevin Bacon
        Separation:  1 steps
        Time Taken:  0.578131 seconds
        Requests:    2

>>> path.data
{'start': 'Tom Hanks', 'end': 'Kevin Bacon', 'path': 'Tom Hanks->Kevin Bacon', 'degree': 1}
>>> print(path.json(indent=2))
{
  "start": "Tom Hanks",
  "end": "Kevin Bacon",
  "path": "Tom Hanks->Kevin Bacon",
  "degree": 1
}
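The effect of memoization described for find_path can be illustrated with a small sketch. It uses `functools.lru_cache` from the standard library as a stand-in; the project's own memo.py may work differently. `fetch_links`, `TOY_LINKS`, and `stats` are hypothetical names, and the lookup replaces what would be a network request in practice.

```python
import functools

# Hypothetical stand-in for API responses: title -> linked titles.
TOY_LINKS = {
    "Tom Hanks": ("Kevin Bacon",),
    "Kevin Bacon": ("Tom Hanks",),
}

# Counts how many times the (simulated) request is actually made.
stats = {"requests": 0}


@functools.lru_cache(maxsize=None)
def fetch_links(title):
    """Simulate one API request; repeated titles are served from the cache."""
    stats["requests"] += 1
    return TOY_LINKS.get(title, ())


fetch_links("Tom Hanks")    # first call: counted as a request
fetch_links("Tom Hanks")    # repeat call: answered from the cache
fetch_links("Kevin Bacon")  # new title: counted as a request
print(stats["requests"])
```

Because the cache lives in the running process, it only helps while the session or batch stays alive, which is why a long-running shell session benefits more than repeated one-off runs.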

collectbatch.py

For a given sample of start articles, find a path from each to a central end article and save the output to a given CSV file. If no start list is specified, the program defaults to a random sample of k articles generated by the Wikimedia API. For more info, see the command-line argument details below.
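The batch idea can be sketched as a loop that finds one path per start article and writes one CSV row per result. This is an illustration only: `collect_batch` and `fake_find_path` are hypothetical, the column names are borrowed from the `path.data` keys shown in the example session, and the real script's CSV layout may differ.

```python
import csv
import io

# Assumed column names, taken from the keys of path.data in the example session.
FIELDS = ["start", "end", "path", "degree"]


def collect_batch(starts, center, find_path, outfile):
    """Write one CSV row per start title, each searched towards `center`."""
    writer = csv.DictWriter(outfile, fieldnames=FIELDS)
    writer.writeheader()
    for start in starts:
        writer.writerow(find_path(start, center))


def fake_find_path(start, end):
    """Hypothetical stand-in for a real search; always reports a direct link."""
    return {"start": start, "end": end, "path": f"{start}->{end}", "degree": 1}


# An in-memory buffer is used here instead of a real output file.
buffer = io.StringIO()
collect_batch(["Tom Hanks", "Car"], "Kevin Bacon", fake_find_path, buffer)
print(buffer.getvalue())
```

Passing the search function in as an argument keeps the sketch runnable without network access; a real run would supply a function backed by the Wikimedia API.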

usage:

-h, --help            show this help message and exit
-o OUTFILE, --outfile OUTFILE
                        Filename to save the results to.
-x CENTER, --center CENTER
                        Title of valid wiki page to center all nodes on
-k SAMPLE_SIZE, --sample_size SAMPLE_SIZE
                        Sample size of k pages to search from. (Only applies
                        when sample source is not given)
-s SAMPLE_SOURCE, --sample_source SAMPLE_SOURCE
                        Filename containing newline delimited list of valid
                        wiki article titles if not specified sample defaults
                        to random selection from wikimedia api.
-v                    add to display titles of page requests made.
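The options above could be declared with `argparse` roughly as follows. This is a reconstruction from the help text, not the script's actual source: the `int` type for the sample size, the `verbose` destination name, and the absence of defaults are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented collectbatch.py options."""
    parser = argparse.ArgumentParser(
        description="Find paths from a sample of articles to a central article."
    )
    parser.add_argument("-o", "--outfile",
                        help="Filename to save the results to.")
    parser.add_argument("-x", "--center",
                        help="Title of valid wiki page to center all nodes on")
    # type=int is an assumption; the help text only says "k pages".
    parser.add_argument("-k", "--sample_size", type=int,
                        help="Sample size of k pages to search from.")
    parser.add_argument("-s", "--sample_source",
                        help="File with a newline-delimited list of titles.")
    # dest="verbose" is an assumed name for the -v flag.
    parser.add_argument("-v", action="store_true", dest="verbose",
                        help="Display titles of page requests made.")
    return parser


# Example invocation with hypothetical argument values.
args = build_parser().parse_args(
    ["-o", "results.csv", "-x", "Kevin Bacon", "-k", "10"]
)
print(args.outfile, args.center, args.sample_size, args.verbose)
```

Leaving `--sample_source` unset yields `None` in this sketch, which is the natural signal for falling back to a random sample.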

Requirements: requests
