• RecordNumber
    4033
  • Author

    Farahzadi, Mehdi

  • Crop_Body
    Mehdi Farahzadi, Rahman Farnoosh and Mohammad Hassan Behzadi
  • Title of Article

    Machine Learning Models for Housing Prices Forecasting using Registration Data

  • Title Of Journal
    Journal of Statistical Research of Iran (مجله پژوهشهاي آماري ايران)
  • PublishInfo
    Statistical Research and Training Center (پژوهشكده آمار)
  • Publication Year
    2020
  • Volume
    17
  • Issue Number
    1
  • Page
    191-214
  • Keywords
    Housing price forecasting, nearest neighbor regression, random forest regression, support vector regression, long short-term memory neural network, extreme gradient boosting regression
  • Abstract
    This article aims to identify the most accurate machine learning model for forecasting housing prices, that is, the model with the lowest prediction error. Five machine learning algorithms are compared: Nearest Neighbor Regression (KNNR), Support Vector Regression (SVR), Random Forest Regression (RFR), Extreme Gradient Boosting Regression (XGBR), and the Long Short-Term Memory neural network (LSTM). The study uses data from the Statistics Center of Iran on purchases and sales of residential units in Tehran from 2014 to 2020, comprising 998,299 transactions and 11 features. Preprocessing steps, including missing-data handling, batch data conversion, and normalization, are applied to the housing data set to obtain a final, error-free data set. The data are divided into training and test sets by K-fold cross-validation, chosen for its simplicity, effectiveness, and wide acceptance. The models are compared on several evaluation criteria: MSE, RMSE, MAE, ME, and R². Across all evaluation criteria and all K-fold subsets, the Extreme Gradient Boosting Regression model is the most stable and performs best.
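
The evaluation procedure described in the abstract (K-fold cross-validation with per-fold error metrics) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code. The data are synthetic stand-ins for the registration data, a one-feature nearest neighbor regressor stands in for the five models, and all function names (`k_fold_indices`, `knn_predict`, `regression_metrics`, `cross_validate`) are hypothetical. ME is left out because the abstract does not expand the abbreviation.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Nearest neighbor regression: average the targets of the closest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = nearest[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


def cross_validate(xs, ys, k=5, n_neighbors=3):
    """Hold out each fold in turn, fit on the rest, and score the held-out fold."""
    folds = k_fold_indices(len(xs), k)
    results = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        preds = [knn_predict(train_x, train_y, xs[i], n_neighbors) for i in test_idx]
        results.append(regression_metrics([ys[i] for i in test_idx], preds))
    return results


# Synthetic stand-in data (assumption): price grows linearly with floor area, plus noise.
rng = random.Random(42)
areas = [rng.uniform(40, 200) for _ in range(100)]
prices = [3.0 * a + rng.gauss(0, 10) for a in areas]

fold_scores = cross_validate(areas, prices, k=5)
for fold_number, scores in enumerate(fold_scores, start=1):
    print(fold_number, {name: round(value, 3) for name, value in scores.items()})
```

Comparing several models under this scheme would mean running each one through the same folds and checking whether one model leads on every metric in every fold, which is the sense in which the abstract reports stability.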