Abstract: To forecast the cost of power transmission and substation projects more accurately and to strengthen investment control, this paper proposes a prediction model that combines feature optimization with Stacking ensemble learning. First, a two-stage feature screening process, which combines a theoretical analysis of engineering cost composition with feature importance quantified by Random Forest (RF), is used to construct a key feature set, making the model's input more representative. Second, because single models are structurally limited in handling the heterogeneous cost data of transmission and substation projects, a Stacking ensemble framework is constructed with Support Vector Regression (SVR), a BP Neural Network (BPNN), and RF as heterogeneous base learners and Ridge Regression as the meta-learner; meta-features are generated through five-fold cross-validation so that the strengths of each base learner are integrated. Experiments on 120 actual project records from Province J show that the Stacking ensemble model achieves a Root Mean Square Error (RMSE) of 865.382 million RMB and a Mean Absolute Percentage Error (MAPE) of 9.231%. The model significantly outperforms the single models and also outperforms mainstream parameter-optimized models such as PSO-SVR and GA-BP. Further case studies show that the model overcomes the inherent weaknesses of single models on typical industry challenges, namely sparse technical schemes, complex non-linear paths, and extreme boundary conditions, which indicates stronger domain adaptability and generalization ability for transmission and substation project cost forecasting.
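The pipeline summarized above can be illustrated with a minimal sketch, assuming scikit-learn and synthetic data. Nothing here reproduces the paper's dataset, features, or hyperparameters: the 120 synthetic samples, the median importance threshold, `MLPRegressor` standing in for the BPNN, and all parameter values are assumptions chosen for illustration. Only the RF-importance stage of the feature screening is shown; the theory-based stage depends on domain knowledge and is omitted.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for 120 project records (NOT the paper's data).
X, y = make_regression(n_samples=120, n_features=12, n_informative=6,
                       noise=10.0, random_state=0)
y = y - y.min() + 100.0  # keep targets positive so MAPE is well defined

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# RF-based screening: keep features whose importance is at least the median
# (the threshold is an assumption, not the paper's criterion).
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=200, random_state=0),
    threshold="median")

# Heterogeneous base learners; SVR and the neural network get scaled inputs.
base_learners = [
    ("svr", make_pipeline(StandardScaler(), SVR(C=100.0))),
    ("bpnn", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16,), solver="lbfgs",
                     max_iter=2000, random_state=0))),
    ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
]

# cv=5: the Ridge meta-learner is trained on out-of-fold predictions
# of the base learners from five-fold cross-validation.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=Ridge(alpha=1.0), cv=5)

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)
pred = model.predict(X_test)

rmse = float(np.sqrt(np.mean((y_test - pred) ** 2)))
mape = float(np.mean(np.abs((y_test - pred) / y_test)) * 100.0)
print(f"RMSE: {rmse:.3f}  MAPE: {mape:.3f}%")
```

Training the meta-learner on out-of-fold predictions rather than in-sample predictions keeps it from simply rewarding whichever base learner overfits the training set most, which is the usual motivation for generating meta-features by cross-validation.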