Predictive Modeling of Air Quality Index Using Ensemble Learning and Multivariate Analysis

Anggina Primanita, Hadipurnawan Satria

Abstract

Breathing polluted air can cause multiple health problems, so it is important to understand and predict air quality in the environment. The Air Quality Index (AQI) is a measure used to report the level of air pollutants. In Indonesia, this value is measured and published regularly by the Meteorological, Climatological, and Geophysical Agency. In this research, four commonly used regression algorithms were used to analyze AQI data: Random Forest, Decision Tree, K-Nearest Neighbors, and AdaBoost. All models were developed on 1,096 AQI records. The Mean Squared Error of each model was computed as the basis for comparison. Random Forest was found to be the best-performing algorithm: it generalizes well without overfitting the data set.
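The comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, default hyperparameters, an 80/20 train/test split, and a synthetic stand-in data set. None of these come from the paper, which does not state its features, split, or settings here. The feature matrix, the target formula, and the helper name `compare_models` are hypothetical. Reporting both training and test Mean Squared Error is one common way to check whether a model overfits.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# ASSUMPTION: synthetic stand-in data, not the BMKG measurements.
# Five hypothetical pollutant readings per record; the target is a noisy
# function of the largest reading, chosen only for illustration.
rng = np.random.default_rng(0)
n_samples = 1096  # matches the record count given in the abstract
X = rng.uniform(0, 100, size=(n_samples, 5))
y = X.max(axis=1) + rng.normal(0, 5, size=n_samples)

# ASSUMPTION: an 80/20 hold-out split; the paper's split is not stated here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The four regressors named in the abstract, with default settings.
models = {
    "Random Forest": RandomForestRegressor(random_state=0),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "K-Nearest Neighbors": KNeighborsRegressor(),
    "AdaBoost": AdaBoostRegressor(random_state=0),
}


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its training and test MSE (hypothetical helper)."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = {
            "train_mse": mean_squared_error(y_train, model.predict(X_train)),
            "test_mse": mean_squared_error(y_test, model.predict(X_test)),
        }
    return results


results = compare_models(models, X_train, y_train, X_test, y_test)

# Print models ordered by test MSE, lowest (best) first.
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["test_mse"]):
    print(f"{name:>20}: train MSE = {scores['train_mse']:.2f}, "
          f"test MSE = {scores['test_mse']:.2f}")
```

A large gap between a model's training and test MSE suggests overfitting, while similar values suggest the model generalizes. The ranking on this synthetic data says nothing about the ranking on the real AQI data.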


