Machine-learning-based model and simulation analysis of PM2.5 concentration prediction in Beijing
Abstract
In recent years, air quality in China has become a matter of serious concern. Among the available indicators for evaluating air quality, PM2.5 is one of the most important. It comprises a complex mixture of extremely small particles and liquid droplets suspended in the air, with diameters of no more than 2.5 μm. Environments with a high PM2.5 index are extremely harmful to human health: once inhaled, these particles can affect the heart and lungs and cause serious health problems. Air pollution is closely related to meteorological conditions such as wind speed, wind direction, atmospheric stability, temperature, and air humidity. With the development of machine learning methods, deep learning models based on neural networks are increasingly applied in air pollution research. In this study, temperature, humidity, and wind velocity data at different pressure altitudes from eight locations around Beijing, together with the average PM2.5 data for Beijing, were analyzed and normalized. Such multi-dimensional data are well suited to machine learning methods. Three neural network models were built, namely the back propagation (BP), convolutional neural network (CNN), and long short-term memory (LSTM) models, and were trained using the meteorological and PM2.5 data. The results indicate that the accuracies of the BP and CNN models in predicting the PM2.5 pollution level in the next hour are much lower than that of the LSTM model. The PM2.5 pollution index predicted for the next hour by the LSTM model is very close to the actual value. This result reveals a strong relationship between the PM2.5 pollution index of Beijing and the local meteorological conditions.
The long short-term memory model was also trained using meteorological data from different pressure altitudes, and it was found to be more accurate in predicting pollution levels when using near-surface meteorological data than when using data from multiple altitudes.
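As a rough illustration of the data preparation implied above, the following minimal sketch normalizes each meteorological feature and pairs a window of past hours with the PM2.5 value of the following hour. It is not the authors' implementation: min-max scaling is an assumed choice (the abstract says only that the data were normalized), and the function names `min_max_normalize` and `make_windows`, the `lookback` parameter, and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Scale one feature column to [0, 1] (assumed scheme, not from the paper)."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - lo) / span for value in column]


def make_windows(
    features: Sequence[Sequence[float]],
    target: Sequence[float],
    lookback: int,
) -> List[Tuple[List[List[float]], float]]:
    """Pair each run of `lookback` hourly feature rows with the target
    value of the hour immediately after that run (next-hour prediction)."""
    if len(features) != len(target):
        raise ValueError("features and target must have the same length")
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    samples = []
    for end in range(lookback, len(features)):
        window = [list(row) for row in features[end - lookback:end]]
        samples.append((window, target[end]))
    return samples


# Hypothetical hourly values, for illustration only.
temperature = min_max_normalize([10.0, 20.0, 30.0, 20.0, 10.0])
humidity = min_max_normalize([40.0, 60.0, 80.0, 60.0, 40.0])
pm25 = [35.0, 50.0, 80.0, 65.0, 40.0]

# One row per hour, one column per normalized feature.
rows = [list(pair) for pair in zip(temperature, humidity)]
samples = make_windows(rows, pm25, lookback=2)
```

Each element of `samples` holds a `lookback`-hour window of normalized features and the PM2.5 value for the next hour, which is the input/target shape a sequence model such as an LSTM would typically consume.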