Cost overruns are common in construction projects and significantly affect investment efficiency and project management. As projects grow more complex, accurate predictive models become essential for effective cost risk control. This study proposes a prediction approach that combines the Random Forest (RF) algorithm with SHapley Additive exPlanations (SHAP) for model interpretability. The dataset consists of 1,000 simulated observations describing key project characteristics: project scale, estimated cost, material cost, schedule pressure, delay risk, and design changes, together with economic factors such as the material price index and inflation. Model hyperparameters are tuned with randomized search, and the model is evaluated with 5-fold cross-validation using mean absolute error (MAE) and the coefficient of determination (R²). The results show high predictive accuracy and identify the key factors that influence cost overruns. SHAP analysis makes the model's predictions interpretable, providing useful insights for cost management and decision-making in construction projects.
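The workflow summarized above (Random Forest, randomized hyperparameter search, 5-fold cross-validation, MAE and R²) might be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the study's implementation: the feature names, the synthetic data generator, the target formula, and the search space are all assumptions made up for the example, and the printed scores say nothing about the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, cross_validate

rng = np.random.default_rng(0)
n_samples = 1000  # matches the dataset size stated in the abstract

# Hypothetical feature names, loosely following the abstract's list.
feature_names = [
    "project_scale", "estimated_cost", "material_cost", "schedule_pressure",
    "delay_risk", "design_changes", "material_price_index", "inflation",
]
# Assumption: features drawn uniformly from [0, 1) purely for illustration.
X = rng.random((n_samples, len(feature_names)))

# Assumption: an arbitrary synthetic target standing in for cost overrun (%).
y = (
    10.0 * X[:, 3]              # schedule pressure
    + 8.0 * X[:, 4] * X[:, 5]   # delay risk interacting with design changes
    + 5.0 * X[:, 6]             # material price index
    + rng.normal(0.0, 1.0, n_samples)
)

# Assumed search space; the study's actual ranges are not given.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10, 20],
    "min_samples_leaf": [1, 2, 5],
    "max_features": ["sqrt", 1.0],
}

# Randomized search samples n_iter configurations and scores each with 5-fold CV.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions,
    n_iter=5,
    cv=5,
    scoring="neg_mean_absolute_error",
    random_state=0,
)
search.fit(X, y)

# Re-evaluate the selected configuration with 5-fold CV on both metrics.
scores = cross_validate(
    search.best_estimator_, X, y, cv=5,
    scoring=("neg_mean_absolute_error", "r2"),
)
mae = -scores["test_neg_mean_absolute_error"].mean()
r2 = scores["test_r2"].mean()

print("best params:", search.best_params_)
print(f"5-fold MAE: {mae:.3f}, R^2: {r2:.3f}")
```

Scoring the tuned model on the same folds used for tuning tends to give a slightly optimistic estimate; a held-out test set or nested cross-validation would be a stricter check. For the interpretability step, if the `shap` package is installed, `shap.TreeExplainer(search.best_estimator_).shap_values(X)` should return per-sample, per-feature attributions that can be ranked by mean absolute value.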