A simple regression project for predicting house prices in Tehran.
We have information on almost 4000 apartments in Tehran. All of the data is real. Your task is to estimate the price, in dollars or Tomans, using the features of the data set described below. The data is stored in the dataset.csv file or can be downloaded from here.
- House size in square meters (Area)
- Number of bedrooms
- Is there a parking lot or not?
- Does it have a warehouse or not?
- Does it have an elevator or not?
- An approximate address in Tehran (Address)
- Price in Tomans
- Price in dollars (Price(USD))
In this dataset, some houses do not have an address, and the size of some houses is entered incorrectly (as a very large value). You should handle these rows by removing them from your dataset.
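The cleaning step can be sketched with pandas roughly as follows. This is a minimal sketch, not the original solution: the column names `Area` and `Address` are taken from the feature list, while the helper `clean_listings`, the 1000 m² cutoff, and the sample rows are assumptions made up for illustration. The sketch also assumes that `Area` may be stored as text with thousands separators, so it converts the column to numbers first.

```python
import pandas as pd

# Assumed cutoff for a plausible apartment size; pick a value that suits your data.
MAX_PLAUSIBLE_AREA = 1000


def clean_listings(df: pd.DataFrame, max_area: float = MAX_PLAUSIBLE_AREA) -> pd.DataFrame:
    """Drop rows with a missing address or an implausibly large area."""
    out = df.copy()
    # Strip thousands separators (if any) and coerce unparsable values to NaN.
    area_text = out["Area"].astype(str).str.replace(",", "", regex=False)
    out["Area"] = pd.to_numeric(area_text, errors="coerce")
    # Remove rows without an address or without a usable area.
    out = out.dropna(subset=["Address", "Area"])
    # Remove rows whose area is far too large to be real.
    out = out[out["Area"] <= max_area]
    return out.reset_index(drop=True)


# Made-up sample rows, only to show the behaviour of the helper.
sample = pd.DataFrame({
    "Area": ["63", "60", "3,310,000,000", "95"],
    "Address": ["Shahran", None, "Pardis", "Punak"],
})
cleaned = clean_listings(sample)
print(cleaned)
```

On the real data you would load the file with `pd.read_csv("dataset.csv")` and pass the result to `clean_listings`. A fixed cutoff is the simplest option; a quantile-based threshold is a reasonable alternative.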
My model achieved an accuracy of about 80% using polynomial regression on the Area and Address features. Of course, there are better models with higher accuracy available on Kaggle :)
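One possible way to set up such a model with scikit-learn is sketched below. It is not necessarily the setup behind the figure above: the one-hot encoding of `Address`, the polynomial degree of 2, the helper `build_model`, and the synthetic toy data are all assumptions for illustration. Note that `score` returns the R² value, which may or may not be the metric used for the 80% figure.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures


def build_model(degree: int = 2):
    """Polynomial terms for Area, one-hot columns for Address, then a linear fit."""
    features = ColumnTransformer([
        ("area_poly", PolynomialFeatures(degree=degree, include_bias=False), ["Area"]),
        # Addresses unseen during training are encoded as all zeros.
        ("address", OneHotEncoder(handle_unknown="ignore"), ["Address"]),
    ])
    return make_pipeline(features, LinearRegression())


# Synthetic, noise-free toy data (NOT the real data set), only to show the API.
rng = np.random.default_rng(0)
area = rng.uniform(40, 200, size=200)
address = rng.choice(["A", "B", "C"], size=200)
offset = pd.Series(address).map({"A": 0.0, "B": 20000.0, "C": 50000.0}).to_numpy()
toy = pd.DataFrame({
    "Area": area,
    "Address": address,
    "Price(USD)": 500 * area + 2 * area**2 + offset,
})

X, y = toy[["Area", "Address"]], toy["Price(USD)"]
model = build_model()
model.fit(X, y)
r2 = model.score(X, y)  # R² on the training data
print(f"R^2 on toy data: {r2:.3f}")
```

For the real data set, fit on a training split and report the score on a held-out split (for example with `sklearn.model_selection.train_test_split`), since a score on the training data is optimistic.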