immimaps: United States immigration statistics on map

This repository contains a Python package for visualizing United States immigration statistics on U.S. (and later, world) maps. Currently, only PERM data is supported. PERM data covers employer-sponsored applications for lawful permanent resident status, also known as employment-based green card applications.

Downloading the data

The PERM data is publicly available at the U.S. Department of Labor website at: https://www.dol.gov/agencies/eta/foreign-labor/performance

The downloadable Microsoft Excel (.xlsx) files are listed in the "Disclosure Data" section under "PERM Program". Each fiscal year has its own file, and the files for fiscal years 2008 through 2022 total about 620 megabytes. If you prefer, you can download only a subset of the available data.

If you are using bash and wget, you can also download all supported files automatically by running the script download.bash, which is located in the folder data/dol_perm in this repository. The downloaded files are placed in that same folder.

Getting started with the code

This package depends on the cartopy, pandas, and openpyxl packages. For example, if you are using conda, the Python environment can be created and activated as follows:

conda create --name immimaps cartopy pandas openpyxl
conda activate immimaps
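To confirm that the environment contains the three dependencies before running any of the package code, a small check along these lines may help. This is only a sketch: the helper name missing_packages is hypothetical and is not part of immimaps.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment.

    Hypothetical helper for illustration; it is not part of immimaps.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["cartopy", "pandas", "openpyxl"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

An empty result means that all three packages can be located by the active Python interpreter.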

If the downloaded data files are stored in the default location data/dol_perm, data preprocessing can be performed by running

python3 -m immimaps.preprocessing

from the repository root folder. This will create a pickle file data/dol_perm/perm.pkl which will contain the most relevant PERM application information from all available fiscal years in a single pandas.DataFrame object. Applications that are denied or withdrawn are excluded from this file. The preprocessing step will also output several intermediate files that may or may not be of interest.
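Once preprocessing has finished, the pickle file can be loaded with pandas. The sketch below wraps pandas.read_pickle in a hypothetical helper, load_perm, which is not part of immimaps and only adds a clearer error message when the file has not been created yet. The default path assumes that the script is run from the repository root.

```python
from pathlib import Path

import pandas as pd


def load_perm(path="data/dol_perm/perm.pkl"):
    """Load the preprocessed PERM DataFrame from a pickle file.

    Hypothetical helper for illustration; it is not part of immimaps.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run 'python3 -m immimaps.preprocessing' first"
        )
    # Only unpickle files that you created yourself: loading a pickle
    # can execute arbitrary code.
    return pd.read_pickle(path)
```

Calling load_perm() and then inspecting the result with data.shape or data.columns is a quick way to see what the preprocessing step produced.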

Example

As an example, we can show the percentage of new immigrants in each U.S. state who hold a doctoral degree. A simplified Python script looks like this:

import matplotlib.pyplot as plt
import pandas as pd
import immimaps.cartography

datafile = '/path/to/perm.pkl'
data = pd.read_pickle(datafile)

# Percentage of applications in each state whose education level is 'DOCTORATE'
doctorate_ratio = data.groupby('job_state')['worker_education_level'].\
    apply(lambda x: 100 * (x=='DOCTORATE').sum() / x.count())

ax, sm = immimaps.cartography.draw_us_map(doctorate_ratio.to_dict())
# add title etc...

plt.show()

The output would look similar to this:

A full example script is available in the examples subfolder.
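To see what the groupby step in the example computes without downloading any data, the same expression can be run on a small hand-made DataFrame. The rows below are invented for illustration. Only the column names and the 'DOCTORATE' value come from the example above; the other education-level strings are placeholders and may not match the real data.

```python
import pandas as pd

# Made-up rows for illustration; not real PERM data.
toy = pd.DataFrame({
    "job_state": ["CA", "CA", "CA", "CA", "NY", "NY", "TX"],
    "worker_education_level": [
        "DOCTORATE", "MASTER'S", "BACHELOR'S", "DOCTORATE",
        "DOCTORATE", None, "MASTER'S",
    ],
})


def doctorate_percentage(data):
    """Percentage of rows per state whose education level is 'DOCTORATE'."""
    levels = data.groupby("job_state")["worker_education_level"]
    # x.count() ignores missing values, so rows without an education
    # level do not contribute to the denominator.
    return levels.apply(lambda x: 100 * (x == "DOCTORATE").sum() / x.count())


result = doctorate_percentage(toy)
print(result)
```

Because count() skips missing values, the NY row without an education level is left out of the denominator, so NY comes out at 100 percent rather than 50. If missing entries should count as "no doctorate", len(x) could be used in place of x.count().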
