Abstract:
The rapid development of information technology and the growing penetration of Internet applications have steadily worsened the cybersecurity landscape. Frequent cyberattacks seriously threaten national security and economic interests, making accurate cybersecurity situation prediction an important and urgent research task. Existing prediction methods suffer from insufficient accuracy owing to the fixed input length of traditional models and the nonstationary nature of the data. To address this issue, a cybersecurity situational awareness prediction method using step adaptation and similarity-based correction is proposed. First, variational mode decomposition is introduced to extract the main modal components. Second, for periodic modal components, the fast Fourier transform is used to determine the period, which sets the input length of the prediction model; for nonperiodic modal components, the decreasing Lempel–Ziv complexity criterion is used to determine the input length adaptively. Third, for each modal component, a support vector machine submodel is constructed using the training dataset. Finally, based on the cosine variance similarity index, subsets similar to the test set are searched for in the training dataset. The above submodels are used to obtain initial predictions for these similar subsets, and the similar subsets together with their initial predictions serve as the final inputs to the support vector machine prediction model. Experiments conducted on the standard cybersecurity dataset NSL-KDD demonstrate the following. First, with the input length determined by the proposed step-adaptive strategy, the coefficient of determination (R²) remains higher than with the other input lengths. Second, the proposed similarity-based correction mechanism has a more pronounced effect in multistep prediction. In four-step prediction, the mean absolute error (MAE), mean squared error (MSE), and mean absolute percentage error (MAPE) decrease by 29.00%, 53.69%, and 36.53%, respectively, while
R² increases by 5.03%. Finally, in terms of overall performance, the proposed adaptive-similarity-based cybersecurity threat prediction method achieves an MSE of 1.75×10⁻⁴, MAE of 1.07×10⁻², MAPE of 5.61×10⁻², and R² of 0.984 in single-step prediction. Compared with backpropagation (BP), long short-term memory (LSTM), and temporal convolutional networks (TCN), the proposed method demonstrates superior prediction performance. Two-step prediction yields an MSE of 2.22×10⁻⁴, MAE of 0.122, MAPE of 6.59×10⁻², and R² of 0.979. Three-step prediction yields an MSE of 3.41×10⁻⁴, MAE of 0.149, MAPE of 8.44×10⁻², and R² of 0.968. Four-step prediction yields an MSE of 4.14×10⁻⁴, MAE of 0.164, MAPE of 9.33×10⁻², and R² of 0.961. The prediction accuracy of the method based on step adaptation and similarity-based correction is thus significantly superior to that of traditional shallow learning, deep learning, and original support vector machine methods. To further verify the generalization ability of the proposed method, data from the National Computer Network Emergency Response Technical Team were used for validation. The results confirm that the method also achieves the best prediction performance on this dataset. The proposed method therefore addresses the insufficient prediction accuracy caused by the fixed input length of traditional models and data nonstationarity.
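
As an illustration of two steps named above, the following minimal Python sketch estimates the input length of a periodic component from its dominant FFT period and then searches a set of training windows for those most similar to a query window. It is a sketch under stated assumptions, not the authors' implementation: the function names `dominant_period` and `top_k_similar`, the synthetic sine series, and the choice of k are hypothetical, and plain cosine similarity is substituted for the "cosine variance similarity index", whose definition is not given in this abstract.

```python
import numpy as np


def dominant_period(x):
    """Estimate the dominant period (in samples) of a 1-D series via the FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the mean so bin 0 does not dominate
    spectrum = np.abs(np.fft.rfft(x))
    spectrum[0] = 0.0                      # ignore any residual DC component
    k = int(np.argmax(spectrum))           # index of the strongest frequency bin
    if k == 0:                             # flat series: no oscillation detected
        return len(x)
    return int(round(len(x) / k))          # period = series length / bin index


def top_k_similar(train_windows, query, k=3):
    """Return indices of the k training windows most cosine-similar to `query`.

    Assumption: plain cosine similarity stands in for the paper's index.
    """
    train_windows = np.asarray(train_windows, dtype=float)
    query = np.asarray(query, dtype=float)
    norms = np.linalg.norm(train_windows, axis=1) * np.linalg.norm(query)
    sims = (train_windows @ query) / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-sims)[:k]          # largest similarity first
    return order, sims


# Hypothetical periodic component: a sine wave with a period of 20 samples.
t = np.arange(200)
series = np.sin(2 * np.pi * t / 20)

input_len = dominant_period(series)        # input length taken from the period

# Sliding training windows of that length, and the most recent window as query.
windows = np.array([series[i:i + input_len]
                    for i in range(len(series) - input_len)])
query = series[-input_len:]
similar_idx, sims = top_k_similar(windows, query, k=3)
```

In a full pipeline, the windows selected by `similar_idx` would be passed to the trained submodel, and their initial predictions would be used alongside the windows themselves as inputs to the final prediction model.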