My description+ mock solution of a challenge for AIHack 2018, a student-hackathon with 300 expected attendees

Eirikalb/AIHack18Challenge


AIHack18Challenge

My description and mock solution of a challenge for AIHack 2018, a student hackathon with 300 expected attendees.

Challenge

The goal of this challenge is to look for interesting correlations within the dataset. With features describing everything from average income to education, age distribution, and living arrangements, there are a lot of opportunities, and we want you to find insights that you think could be used to improve the lives of the people living in the area. Specifically, we want you to pick one or two variables from the dataset that you think would be valuable to predict, and then train a model to predict those variables using either the entire dataset or subsets of it.
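One possible first step, sketched here under assumptions, is to rank the other features by how strongly they correlate with the variable you want to predict. The column names and values below are toy placeholders rather than real dataset fields, and `pearson` and `rank_features` are hypothetical helper names. Pearson correlation only captures linear relationships, so treat the ranking as a starting point for exploration, not as a finished model.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, target):
    """Rank every non-target column by absolute correlation with the target."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Toy data with hypothetical column names, not taken from the real dataset.
toy = {
    "income": [10.0, 20.0, 30.0, 40.0],
    "education": [1.0, 2.0, 3.0, 4.0],
    "noise": [3.0, 1.0, 4.0, 1.0],
}
ranking = rank_features(toy, "income")
```

Here `ranking` is a list of `(feature, correlation)` pairs, strongest first. The top few entries are reasonable candidates for the subset of features to train on.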

Dataset

The dataset is 23123x7730, with each instance being a block group, a statistical unit used by the US Census Bureau, labeled with 7730 features. The features describe various socioeconomic traits of each block group and are rooted in a 10-question questionnaire, issued by the US Census Bureau, that every single American citizen should have answered. The questions ask for sex, age, gender, annual income, civil status, education, and employment status. The dataset restructures these answers into anonymous features that describe the averages of some answers as well as the count of people fitting certain characteristics. One variable could be "PER CAPITA INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS): Total: Total population -- (Estimate)". This would also have a corresponding variable with the same name, just ending in "(Margin of Error)"; as the name suggests, it gives you the margin of error for that specific feature. So for every actual feature there are two variables: one estimate and one margin of error. (It might be a good idea to remove all margin-of-error variables before you try to fit your models.)
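The margin-of-error cleanup suggested above could look roughly like this. It is a minimal sketch that assumes the table has already been loaded as a header list plus a list of rows, and that every margin-of-error column name ends in "(Margin of Error)" as described. `drop_margin_of_error` is a hypothetical helper, and the numeric values are made up for illustration.

```python
MOE_SUFFIX = "(Margin of Error)"

def drop_margin_of_error(header, rows):
    """Return the header and rows with every margin-of-error column removed."""
    # Indices of the columns to keep: everything not ending in the suffix.
    keep = [
        i for i, name in enumerate(header)
        if not name.strip().endswith(MOE_SUFFIX)
    ]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows

base = ("PER CAPITA INCOME IN THE PAST 12 MONTHS "
        "(IN 2016 INFLATION-ADJUSTED DOLLARS): Total: Total population -- ")
header = [base + "(Estimate)", base + "(Margin of Error)"]
rows = [[31000, 2500], [45000, 4100]]  # made-up values for two block groups

clean_header, clean_rows = drop_margin_of_error(header, rows)
```

After this step only the estimate columns remain, which roughly halves the number of features before any model is fitted.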
