Air Quality Prediction Based on Long Short-Term Memory (LSTM) and Clustering K-Means in Andahuaylas, Peru

Herwin Alayn Huillcen Baca, Flor de Luz Palomino Valdivia, Manuel J. Ibarra, Mario Aquino Cruz, Melvin Edward Huillcen Baca

Research output: Chapter in book/report/conference proceeding; Conference contribution; Peer-reviewed

1 Citation (Scopus)


Air pollution is a global problem that directly affects the health of living beings; the World Health Organization (WHO) estimates that about 7 million people die each year from exposure to polluted air. A prediction model for air pollutants is therefore an essential source of information for protecting health and life. Many methods and models exist for predicting air quality, but almost all of them focus on large cities. No such models exist for cities that are considered underdeveloped and have high air pollution. To address this gap, the present project implemented a prediction model for three air pollutants (PM2.5, NO2, and O3). The proposed method combines a recurrent neural network architecture, LSTM, with feature augmentation through a K-means clustering process. The performance of the model was evaluated with the mean absolute error (MAE) and the root mean square error (RMSE) and compared with other machine learning algorithms (Linear Regression, K-Nearest Neighbors, Random Forest, Decision Tree, and LSTM). The proposed model (LSTM K-means) performed better than the traditional machine learning algorithms for regression: for particulate matter (PM2.5) it obtained an MAE of 1.5 and an RMSE of 2.39, for nitrogen dioxide (NO2) an MAE of 0.05 and an RMSE of 0.06, and for ozone (O3) an MAE of 7.5 and an RMSE of 9.81. These are the lowest values among the algorithms compared.
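The two ingredients named in the abstract that can be shown compactly are the K-means feature augmentation and the MAE/RMSE evaluation. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes the cluster index is appended to each input row as one extra feature, it uses a plain Lloyd's-algorithm K-means in pure Python, all function names are hypothetical, the readings are synthetic, and the LSTM itself is omitted.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def kmeans_labels(rows, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per row."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)  # k distinct rows as initial centroids
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def augment_with_cluster(rows, k):
    """Append the cluster index to each row as an additional feature (assumed scheme)."""
    labels = kmeans_labels(rows, k)
    return [list(row) + [lab] for row, lab in zip(rows, labels)]


# Synthetic two-feature readings (hypothetical values, not from the paper).
readings = [[1.0, 0.10], [1.2, 0.20], [0.9, 0.15],
            [8.0, 0.90], [8.3, 1.00], [7.9, 0.95]]
augmented = augment_with_cluster(readings, k=2)

# Error metrics on a toy pair of true and predicted values.
toy_mae = mae([3.0, 5.0], [2.0, 7.0])
toy_rmse = rmse([3.0, 5.0], [2.0, 7.0])
```

In a pipeline of the kind the abstract describes, the augmented rows would be windowed into sequences and passed to the LSTM, and `mae` and `rmse` would be computed on held-out predictions for each pollutant.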

Original language: English
Title of host publication: Advances in Information and Communication - Proceedings of the 2021 Future of Information and Communication Conference, FICC
Editors: Kohei Arai
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 13
ISBN (print): 9783030731021
Status: Published - 2021
Published externally
Event: Future of Information and Communication Conference, FICC 2021 - Virtual, Online
Duration: 29 Apr 2021 to 30 Apr 2021

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 1364 AISC
ISSN (print): 2194-5357
ISSN (electronic): 2194-5365


Conference: Future of Information and Communication Conference, FICC 2021
City: Virtual, Online

