Using Supervised Machine Learning to Predict House Prices

Authors

  • Alexander Tsai, Hunter College High School
  • Hieu Nguyen

DOI:

https://doi.org/10.47611/jsrhs.v11i4.3151

Keywords:

Computer Science, Machine Learning, Regression, Artificial Intelligence, Supervised Machine Learning

Abstract

Given the recent surge in housing prices, determining a fair price for a house is of great interest to homebuyers and sellers alike. In this project, various machine learning models are used to predict the price of a house from physical features and characteristics such as lot size and neighborhood. Extensive data preprocessing and feature engineering were employed to improve the models’ performance relative to other available models. The best models predict U.S. house prices with a root mean squared error (RMSE) of $23,000, where the mean price of a house in the dataset is $180,000. In future research, this approach could be applied to other locations within the U.S., and additional features could improve performance further.
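
To make the reported error metric concrete, the following is a minimal sketch of how RMSE is computed and how the abstract's figures translate into a relative error. The helper name `rmse` and the three sample prices are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices and model predictions in dollars (illustrative only).
actual_prices = [150_000, 180_000, 210_000]
predicted_prices = [160_000, 170_000, 230_000]
example_rmse = rmse(actual_prices, predicted_prices)

# Figures reported in the abstract: an RMSE of $23,000 against a mean price
# of $180,000 in the dataset.
reported_relative_error = 23_000 / 180_000

print(f"example RMSE: ${example_rmse:,.0f}")
print(f"reported RMSE as a share of mean price: {reported_relative_error:.1%}")
```

Read this way, the reported RMSE corresponds to roughly 13% of the mean house price in the dataset. Because RMSE squares each error before averaging, a few large misses can weigh heavily on the figure.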

References or Bibliography

https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/data

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.

Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67. https://doi.org/10.1080/00401706.1970.10488634

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

Krauss, C., Do, X. A., & Huck, N. (2017). Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500. European Journal of Operational Research, 259(2), 689-702. https://doi.org/10.1016/j.ejor.2016.10.031

Huang, W., Lai, K. K., Nakamori, Y., & Wang, S. (2004). Forecasting foreign exchange rates with artificial neural networks: A review. International Journal of Information Technology & Decision Making, 3(01), 145-165. https://doi.org/10.1142/S0219622004000969

Published

11-30-2022

How to Cite

Tsai, A., & Nguyen, H. (2022). Using Supervised Machine Learning to Predict House Prices. Journal of Student Research, 11(4). https://doi.org/10.47611/jsrhs.v11i4.3151

Issue

Vol. 11 No. 4 (2022)

Section

HS Research Projects