This is a series of scripts for extracting and processing data from Strat-O-Matic game files. The game data is stored in three types of files: dailies, scorebooks, and standings. This pipeline handles only dailies and standings files, because I have very few scorebook files compared with the other two types.
The scripts use the filesystem as a datastore.
Create the virtual environment
$ cd strat-manager
$ virtualenv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
This is an optional script for extracting dailies, scorebooks, and standings files from an mbox file. It exists because I had a large number of these files stored as reports in a Gmail account, and the easiest way to get at them was to export the mail as an mbox file using Gmail's export tools.
$ mkdir ./data-files
$ ./extractor.py --stash=./data-files --use-db sample/gmail.mbox
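For readers curious what pulling report files out of an mbox involves, here is a minimal sketch using only the Python standard library. It is not the code in extractor.py: the function name `extract_attachments` is hypothetical, and it simply saves every named attachment it finds, without any of the filtering or database bookkeeping the real script may do.

```python
import mailbox
import os


def extract_attachments(mbox_path, out_dir):
    """Save every named attachment in an mbox file to out_dir.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            # walk() visits the message itself and every nested MIME part
            for part in message.walk():
                filename = part.get_filename()
                if not filename:
                    continue  # body text or a multipart container
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue  # nothing decodable in this part
                # Drop directory components so a filename cannot escape out_dir
                target = os.path.join(out_dir, os.path.basename(filename))
                with open(target, "wb") as handle:
                    handle.write(payload)
                written.append(target)
    finally:
        box.close()
    return written
```

Note that two attachments with the same filename would overwrite each other in this sketch; a real extractor would need some scheme for keeping them apart.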
The parsers depend on a parser module generated by TatSu. A pre-generated copy is included as a convenience, so you only need to regenerate it if you change the grammar. To generate a new version:
$ python -m tatsu -m GameReport GameReport.ebnf >GameReport.py
You can process a single report file into an AST as follows:
$ mkdir ./raw-asts
$ ./parse-report.py --stash=./raw-asts --league=blb ./sample/league-daily.report
$ ./parse-report.py --stash=./raw-asts --league=blb ./sample/game-daily.report
This will generate an AST from the HTML report file and store it in a file called ./raw-asts/file-ast.dat.
This is a series of scripts to import team and statistical data into a database. It uses a SQLite database for now, called blb.db.
The easiest way to import the data is to use the provided initialization script:
$ ./initialize-db.sh blb.db
If this succeeds, you can skip the Team and Fangraphs importers.
Generate a table of baseball teams. This table is a prerequisite for the Fangraphs importer, found below.
$ ./mlb-importer.py ./fixtures/mlb.csv
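As a rough illustration of what a team import amounts to, the sketch below loads a CSV into a SQLite table with the standard library. It is not the code in mlb-importer.py: the function name `import_teams`, the `teams` table, and the `abbreviation` and `name` columns are all assumptions made for the example, and the real layout of mlb.csv may differ.

```python
import csv
import sqlite3


def import_teams(csv_path, db_path):
    """Load team rows from a CSV file into a teams table; return the row count.

    Hypothetical helper: table and column names are assumptions.
    """
    with open(csv_path, newline="") as handle:
        # Assumed header row: abbreviation,name
        rows = [(r["abbreviation"], r["name"]) for r in csv.DictReader(handle)]

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS teams ("
                "abbreviation TEXT PRIMARY KEY, name TEXT NOT NULL)"
            )
            # INSERT OR REPLACE keeps the import idempotent across re-runs
            conn.executemany(
                "INSERT OR REPLACE INTO teams (abbreviation, name) VALUES (?, ?)",
                rows,
            )
        return conn.execute("SELECT COUNT(*) FROM teams").fetchone()[0]
    finally:
        conn.close()
```

Loading the teams first matters because later imports can then refer to each team by its key instead of repeating the team name in every statistics row.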
Import a CSV from Fangraphs. See the Batting and Pitching models for a list of the supported statistical categories. It's easy to add more, but these are the ones I found most interesting.
$ ./fg-importer.py ./fixtures/fg-batting-2008.csv
$ ./fg-importer.py ./fixtures/fg-pitching-2008.csv
Launch IPython and load the query file
$ ipython
In [1]: %load sql-bootstrap.py
Get all PlayerSeason records for a Player with id = 1
In [5]: player_seasons = session.query(FGPlayerSeason).filter(FGPlayerSeason.player_id == 1)
In [7]: for ps in player_seasons:
   ...:     print(ps)
   ...:
<FGPlayerSeason(FGPlayer=Alfredo Amezaga, Season=2008)>
<FGPlayerSeason(FGPlayer=Alfredo Amezaga, Season=2009)>
<FGPlayerSeason(FGPlayer=Alfredo Amezaga, Season=2011)>
Launch IPython
$ ipython
Prepare Store
In [21]: %load store-bootstrap.py
Query the Store
In [21]: players = store.get_players_by_year('2009')
In [21]: p = players.result() # resolve Future
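The `.result()` call above blocks until the Future has a value. To show the pattern, here is a minimal sketch of a filesystem-backed store whose queries return Futures, built on `concurrent.futures` from the standard library. It is not the store loaded by store-bootstrap.py: the `FileStore` class, the one-JSON-file-per-year layout, and the choice of a thread pool are all assumptions for the example, and the real store may use a different Future implementation.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


class FileStore:
    """Hypothetical store: one JSON file per year under root, e.g. root/2009.json."""

    def __init__(self, root, max_workers=2):
        self.root = root
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _read_players(self, year):
        path = os.path.join(self.root, "{}.json".format(year))
        if not os.path.exists(path):
            return []  # no data stored for that year
        with open(path) as handle:
            return json.load(handle)

    def get_players_by_year(self, year):
        # submit() returns a Future immediately; the read runs on a worker thread
        return self._pool.submit(self._read_players, year)

    def close(self):
        # Wait for outstanding reads, then release the worker threads
        self._pool.shutdown(wait=True)
```

With this sketch, `FileStore(root).get_players_by_year('2009')` returns right away, and calling `.result()` on the returned Future waits for the file read to finish and hands back the list of players.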
Launch the web server
$ PYTHONPATH=./ python web/server.py