Abstract:
As global warming intensifies, climate change affects every aspect of the occurrence, transmission, and variation of infectious diseases, and the adverse health effects of weather-related infectious diseases have become a major public concern. Timely prevention and intervention against hand-foot-and-mouth disease (HFMD) require accurate and reliable forecasts of daily HFMD cases. To address the low accuracy and poor interpretability of existing HFMD incidence prediction models, this study proposes an interpretable prediction model, GWO-LSTM-GA-XGBoost, which combines multiple meteorological factors with Long Short-Term Memory (LSTM), eXtreme Gradient Boosting (XGBoost), the Grey Wolf Optimizer (GWO), the Genetic Algorithm (GA), and SHapley Additive exPlanations (SHAP). First, missing values in the data are imputed, and the key meteorological factors influencing HFMD incidence are identified through grey relational analysis. Next, a model is constructed to capture the relationship between HFMD incidence, meteorological conditions, and temporal factors. GWO is used to adaptively optimize the key parameters of the LSTM. The global search capability of the GA is then used to optimize the parameters of XGBoost and compensate for its slow convergence. The improved LSTM and XGBoost models are fused using the reciprocal error method to improve prediction accuracy. Finally, SHAP is used to attribute the model's predictions to individual features and analyze their importance, making the model interpretable. Comparative experiments on daily HFMD incidence and meteorological monitoring data from a southern city between 2014 and 2019 were conducted to evaluate the model's performance in predicting HFMD incidence.
The results show that the proposed model achieves higher prediction accuracy than the other machine learning prediction models compared, enabling precise forecasting of HFMD incidence and efficient identification of features associated with HFMD. Notably, temperature emerges as the most critical factor influencing HFMD incidence.
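
The fusion step named in the abstract, the reciprocal error method, can be illustrated with a minimal sketch. It assumes the common formulation in which each model's weight is proportional to the inverse of its validation error, so that the more accurate model contributes more to the combined forecast. The abstract does not state which error metric is used; mean absolute error is an assumption here, and the function names and numbers are made up for illustration rather than taken from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def reciprocal_error_weights(errors):
    """Weights proportional to 1/error, normalised so they sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse_predictions(predictions, weights):
    """Weighted sum of the per-model predictions, one value per time step."""
    return [
        sum(w * p for w, p in zip(weights, step))
        for step in zip(*predictions)
    ]


# Hypothetical validation data (not from the paper).
actual = [10.0, 12.0, 15.0, 11.0]
lstm_pred = [11.0, 13.0, 14.0, 12.0]  # stands in for the GWO-tuned LSTM
xgb_pred = [12.0, 10.0, 17.0, 13.0]   # stands in for the GA-tuned XGBoost

errors = [
    mean_absolute_error(actual, lstm_pred),
    mean_absolute_error(actual, xgb_pred),
]
weights = reciprocal_error_weights(errors)
fused = fuse_predictions([lstm_pred, xgb_pred], weights)

print("weights:", weights)
print("fused forecast:", fused)
```

In this toy case the first model has half the error of the second and therefore receives twice the weight. In practice the weights would be computed on a held-out validation period and then applied to forecasts for later dates.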