Converting Sentencing Commission Files into CSVs

The good news: The United States Sentencing Commission makes very detailed files available about sentencing in the US. 🎉

The bad news: They are in a crazy fixed-width format, and the only loaders included are SAS and SPSS scripts that read them into those programs and those programs alone. 😱

So what can we do about it? Well, we can write a little converter that converts them all! The scripts in this repository will do that for you.
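To make the idea concrete, here is a minimal sketch of turning fixed-width lines into CSV rows. It is not the code in convert.py: the LAYOUT table, its column names, and its offsets are made up for illustration, and the real offsets would have to come from the SAS/SPSS scripts that ship with the data.

```python
import csv
import io

# Hypothetical layout: (column name, start, end), zero-based and end-exclusive.
# The real names and offsets are defined by the Commission's SAS/SPSS scripts.
LAYOUT = [("ID", 0, 6), ("AGE", 6, 9), ("DISTRICT", 9, 11)]


def fixed_width_to_csv(lines, layout, out):
    """Write each fixed-width line as one CSV row, stripping the padding."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _, _ in layout])
    for line in lines:
        line = line.rstrip("\n")
        # Slice every field out by position; blank fields become empty cells.
        writer.writerow([line[start:end].strip() for _, start, end in layout])


# Two made-up records; the second has a blank AGE field.
sample = ["000001 34 7\n", "000002   12\n"]
buffer = io.StringIO()
fixed_width_to_csv(sample, LAYOUT, buffer)
print(buffer.getvalue())
```

Blank fields come out as empty cells, which is why the resulting CSVs are so sparse.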

I just want the data

It turns out the data is small enough that you can upload it to GitHub! However, the files are lzma-compressed. Here's how you can decompress them.

First, a warning

These files are filled with tons of nulls. The typical file size is around 10MB compressed and around 1.5GB uncompressed. So so so many blank fields. Loading one directly into pandas on a small box will probably make your box sad. Instead, you should really look at the usecols kwarg of pd.read_csv.
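As a rough sketch of the usecols approach, assuming pandas is installed: the helper name load_columns, the file path, and the column names in the usage comment are placeholders, so check the codebook and the CSV header for the real ones.

```python
import pandas as pd


def load_columns(path, columns):
    """Read only the named columns from a CSV file.

    pandas infers xz compression from a ".xz" extension, so the
    compressed file can be read without unpacking it first.
    """
    return pd.read_csv(path, usecols=columns)


# Hypothetical usage; the path and column names are placeholders:
# df = load_columns("data/opafy14nid.csv.xz", ["AGE", "DISTRICT"])
```

Only the requested columns are kept in memory, which is what keeps a small box happy.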

Mac

If you're using homebrew, just do

$ brew install xz

Then you can open the files by doing

$ unxz [FILENAME].xz

Debian/Ubuntu

First install xz-utils

$ sudo apt update && sudo apt install xz-utils

Then you should be able to open files thus

$ xz -d [FILENAME].xz

Windows

Both 7-Zip and WinZip will open these files for you. Download and install either one at your leisure.
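If you would rather not install anything, Python's standard lzma module can do the same job on any platform. This is a small sketch; the helper name decompress_xz and the paths in the usage comment are placeholders.

```python
import lzma
import shutil


def decompress_xz(src, dest):
    """Stream-decompress an .xz file to dest without loading it all into memory."""
    with lzma.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)


# Hypothetical usage; substitute the real file names:
# decompress_xz("data/opafy14nid.csv.xz", "data/opafy14nid.csv")
```

Streaming with shutil.copyfileobj matters here, since the uncompressed file can be over a gigabyte.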

Requirements

This script has only been tested with Python 3, and it assumes you have click installed. click is only used for progress bars, so you can comment out those lines if you want.
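An alternative to commenting lines out is to make the progress bar optional. The sketch below is not how convert.py is written; with_progress is a hypothetical helper that falls back to plain iteration when click is not installed.

```python
try:
    import click
except ImportError:  # click is optional in this sketch
    click = None


def with_progress(iterable, label="Converting"):
    """Yield items from iterable, showing a click progress bar when available."""
    if click is None:
        yield from iterable
        return
    with click.progressbar(iterable, label=label) as bar:
        yield from bar


# Hypothetical usage:
# for line in with_progress(lines):
#     ...
```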

Usage

First you'll need to get the data from the Sentencing Commission. The script getdata.sh gives examples, and will itself download FY08-20's data files.

Next you'll need to point the script convert.py at the file. For instance,

$ python3 convert.py data/opafy14nid.zip

This will leave you with a file called data/opafy14nid.csv, in the same folder as the zip.

Be warned, these files end up being quite large, so you may want to compress them with gzip or xz.
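If you want to do the compression from Python rather than the command line, something like the following should work. It is a sketch using only the standard library; compress_xz is a made-up helper name, and swapping lzma.open for gzip.open would produce a .gz file instead.

```python
import lzma
import shutil


def compress_xz(src, dest=None):
    """Stream-compress src into an .xz file and return the output path."""
    dest = dest or src + ".xz"
    with open(src, "rb") as raw, lzma.open(dest, "wb") as compressed:
        shutil.copyfileobj(raw, compressed)
    return dest


# Hypothetical usage:
# compress_xz("data/opafy14nid.csv")  # writes data/opafy14nid.csv.xz
```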

License

MIT
