Predictability of Belgian residential real estate rents using tree-based ML models and IML techniques

Authors: Lenaers, Ian
Boudt, Kris
De Moor, Lieven
Document type: Journal article
Publication date: 2023
Keywords: Business and Economics / General Economics / Econometrics and Finance / Rent prediction / Residential real estate / Machine learning / Black box / Interpretable machine learning / SHapley Additive exPlanations / Mass appraisal
Language: English
Permalink: https://search.fid-benelux.de/Record/base-26917080
Data source: BASE; original catalog
Link(s) : https://biblio.ugent.be/publication/01HHQGWGR372B5AP17MNYSHQ4T

Purpose: The purpose is twofold. First, this study aims to establish that black-box tree-based machine learning (ML) models have better predictive performance than a standard linear regression (LR) hedonic model for rent prediction. Second, it shows the added value of analyzing tree-based ML models with interpretable machine learning (IML) techniques.

Design/methodology/approach: Data on Belgian residential rental properties were collected. Two tree-based ML models, random forest regression and eXtreme gradient boosting regression, were applied to derive rent prediction models, and their predictive performance was compared with that of an LR model. The tree-based models were interpreted, with respect to the factors that matter most in predicting rent, using SHapley Additive exPlanations (SHAP) feature importance (FI) plots and SHAP summary plots.

Findings: Results indicate that tree-based models perform better than an LR model for Belgian residential rent prediction. The SHAP FI plots agree that asking price, cadastral income, livable surface area, number of bedrooms, number of bathrooms and variables measuring the proximity to points of interest are dominant predictors. The direction of the relationships between rent and its factors is determined with SHAP summary plots. In addition to linear relationships, nonlinear relationships emerge.

Originality/value: Rent prediction using ML is less studied than house price prediction. In addition, studying prediction models using IML techniques is relatively new in real estate economics. Moreover, to the best of the authors' knowledge, this study is the first to derive insights into the driving determinants of predicted rents from SHAP FI and SHAP summary plots.
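
To make the SHAP quantities mentioned in the abstract concrete, the sketch below computes exact Shapley values for a small, made-up rent predictor and aggregates them into a mean-absolute-value feature importance, which is the quantity a SHAP FI plot ranks. Everything in it is an assumption for illustration: the feature names, the coefficients in `toy_rent_model`, the baseline dwelling and the sample rows are hypothetical and do not come from the study. It also enumerates all feature coalitions against a single baseline, which is feasible only for a handful of features; it is not the tree-specific algorithm that SHAP tooling typically uses for random forests or gradient boosting.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["livable_surface_m2", "bedrooms", "distance_to_station_km"]


def toy_rent_model(x):
    """Made-up nonlinear rent predictor (EUR/month); NOT the paper's model."""
    surface, bedrooms, distance = x
    rent = 400.0 + 6.0 * surface + 50.0 * bedrooms
    # Nonlinear proximity effect: the distance penalty saturates at 5 km.
    rent -= 40.0 * min(distance, 5.0)
    # Interaction: a premium for large dwellings with at least 3 bedrooms.
    if surface > 100 and bedrooms >= 3:
        rent += 100.0
    return rent


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def mean_abs_importance(model, rows, baseline):
    """Mean |Shapley value| per feature over a set of observations."""
    totals = [0.0] * len(baseline)
    for x in rows:
        for i, value in enumerate(shapley_values(model, x, baseline)):
            totals[i] += abs(value)
    return [total / len(rows) for total in totals]


if __name__ == "__main__":
    baseline = [80, 2, 2]                           # hypothetical reference dwelling
    rows = [[120, 3, 1], [60, 1, 6], [95, 2, 0.5]]  # hypothetical listings
    importance = mean_abs_importance(toy_rent_model, rows, baseline)
    for name, score in sorted(zip(FEATURES, importance), key=lambda p: -p[1]):
        print(f"{name}: {score:.1f}")
```

For any single listing, the Shapley values sum to the difference between the model's prediction for that listing and its prediction for the baseline, so each value can be read as that feature's signed contribution in EUR per month. The sign is what a SHAP summary plot uses to show the direction of a relationship, while the mean absolute value gives the FI ranking.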