Machine learning meteorological normalization models for trend analysis of air quality time series

Roberta Valentina Gagliardi; Claudio Andenna

doi:10.2495/ei-v4-n4-375-389

Prediction and Forecasting of Air Quality Index in Chennai using Regression and ARIMA time series models

Journal of Engineering Research ◽

10.36909/jer.10253 ◽

2021 ◽

Vol 9 ◽

Author(s):

Geetha Mani ◽

Joshi Kumar Viswanadhapalli ◽

Albert Alexander Stonie ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Air Quality ◽

Linear Regression ◽

Quality Index ◽

Air Quality Index ◽

Model Parameters ◽

Sensor Output ◽

Model Accuracy ◽

Life On Earth

Air is one of the most fundamental constituents for sustaining life on earth. Meteorological conditions, traffic, consumption of non-renewable energy sources, and industrial activity are steadily increasing air pollution. These factors affect the welfare and prosperity of life on earth; therefore, air quality needs to be monitored continuously. The Air Quality Index (AQI), which indicates air quality, is influenced by several individual factors such as the accumulation of NO2, CO, O3, PM2.5, SO2, and PM10. This research paper aims to predict and forecast the AQI with machine learning (ML) techniques, namely linear regression and time series analysis. First, a Multi Linear Regression (MLR) model, a supervised machine learning method, is developed to predict the AQI. NO2, ozone (O3), PM2.5, and SO2 sensor outputs collected from the Central Pollution Control Board (CPCB) for the Chennai region, India, serve as input features, and the optimized AQI calculated from the sensor outputs is the target used to train the regression model. The obtained model parameters are validated with new, unseen sensor output. Key Performance Indices (KPIs) such as the coefficient of determination, root mean square error, and mean absolute error were calculated to validate the model accuracy. K-fold cross-validation on the MLR test data gave a score of around 92%. Second, the Auto-Regressive Integrated Moving Average (ARIMA) time series model is applied to forecast the AQI. The obtained model parameters were validated with unseen timestamped data. The forecasted AQI values for the next 15 days lie within the 95% confidence interval. The model accuracy on test data was more than 80%.
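The three KPIs named in this abstract (coefficient of determination, root mean square error, mean absolute error) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code; the function names and the AQI values are invented for the example.

```python
from math import sqrt
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the mean squared prediction error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical AQI values, invented for illustration (not data from the paper).
observed_aqi = [100.0, 150.0, 200.0, 250.0]
predicted_aqi = [110.0, 140.0, 210.0, 240.0]

mae = mean_absolute_error(observed_aqi, predicted_aqi)
rmse = root_mean_square_error(observed_aqi, predicted_aqi)
r2 = r_squared(observed_aqi, predicted_aqi)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.3f}")
```

For these invented values, every prediction is off by exactly 10, so MAE and RMSE both come to 10 and R² to 0.968. In a k-fold setting, the same functions would be evaluated on each held-out fold and the scores averaged.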

MLAir (v1.0) – a tool to enable fast and flexible machine learning on air data time series

10.5194/gmd-2020-332 ◽

2020 ◽

Author(s):

Lukas H. Leufen ◽

Felix Kleinert ◽

Martin G. Schultz

Keyword(s):

Machine Learning ◽

Time Series ◽

Air Quality ◽

Graphics Processing Units ◽

Ease Of Use ◽

Software Environment ◽

Flexible Machine ◽

Code Base ◽

Graphics Processing ◽

Scientific Questions

Abstract. With MLAir (Machine Learning on Air data) we created a software environment that simplifies and accelerates the exploration of new machine learning (ML) models for the analysis and forecasting of meteorological and air quality time series. Thereby MLAir is not developed as an abstract workflow, but hand in hand with actual scientific questions. It thus addresses scientists with either a meteorological or an ML background. Due to their relative ease of use and spectacular results in other application areas, neural networks and other ML methods are also gaining enormous momentum in the weather and air quality research communities. Even though there are already many books and tutorials describing how to conduct an ML experiment, there are many stumbling blocks for a newcomer. In contrast, people familiar with ML concepts and technology often have difficulties understanding the nature of atmospheric data. With MLAir we have addressed a number of these pitfalls so that it becomes easier for scientists of both domains to rapidly start off their ML application. MLAir has been developed in such a way that it is easy to use and is designed from the very beginning as a standalone, fully functional experiment. Due to its flexible, modular code base, code modifications are easy and personal experiment schedules can be quickly derived. The package also includes a set of simple validation tools to facilitate the evaluation of ML results using standard meteorological statistics. MLAir can easily be ported onto different computing environments from desktop workstations to high-end supercomputers with or without graphics processing units (GPUs).

Farming systems monitoring using machine learning and trend analysis methods based on fitted NDVI time series data in a semi-arid region of Morocco

Remote Sensing for Agriculture, Ecosystems, and Hydrology XXI ◽

10.1117/12.2532928 ◽

2019 ◽

Cited By ~ 2

Author(s):

Youssef Lebrini ◽

Tarik Benabdelouahab ◽

Abdelghani Boudhar ◽

Abdelaziz Htitiou ◽

Rachid Hadria ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Trend Analysis ◽

Arid Region ◽

Time Series Data ◽

Farming Systems ◽

Series Data ◽

Ndvi Time Series ◽

Semi Arid Region ◽

Semi Arid

MLAir (v1.0) – a tool to enable fast and flexible machine learning on air data time series

Geoscientific Model Development ◽

10.5194/gmd-14-1553-2021 ◽

2021 ◽

Vol 14 (3) ◽

pp. 1553-1574

Author(s):

Lukas Hubert Leufen ◽

Felix Kleinert ◽

Martin G. Schultz

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Time Series ◽

Air Quality ◽

Graphics Processing Units ◽

Ease Of Use ◽

Software Environment ◽

Flexible Machine ◽

Code Base ◽

Graphics Processing

Abstract. With MLAir (Machine Learning on Air data) we created a software environment that simplifies and accelerates the exploration of new machine learning (ML) models, specifically shallow and deep neural networks, for the analysis and forecasting of meteorological and air quality time series. Thereby MLAir is not developed as an abstract workflow, but hand in hand with actual scientific questions. It thus addresses scientists with either a meteorological or an ML background. Due to their relative ease of use and spectacular results in other application areas, neural networks and other ML methods are also gaining enormous momentum in the weather and air quality research communities. Even though there are already many books and tutorials describing how to conduct an ML experiment, there are many stumbling blocks for a newcomer. In contrast, people familiar with ML concepts and technology often have difficulties understanding the nature of atmospheric data. With MLAir we have addressed a number of these pitfalls so that it becomes easier for scientists of both domains to rapidly start off their ML application. MLAir has been developed in such a way that it is easy to use and is designed from the very beginning as a stand-alone, fully functional experiment. Due to its flexible, modular code base, code modifications are easy and personal experiment schedules can be quickly derived. The package also includes a set of validation tools to facilitate the evaluation of ML results using standard meteorological statistics. MLAir can easily be ported onto different computing environments from desktop workstations to high-end supercomputers with or without graphics processing units (GPUs).

Deep Learning for text in limited data settings

10.36227/techrxiv.12100692 ◽

2020 ◽

Author(s):

Pathikkumar Patel ◽

Bhargav Lad ◽

Jinan Fiaidhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Sentiment Analysis ◽

Transfer Learning ◽

Text Classification ◽

State Of The Art ◽

Time Series Forecasting ◽

Text Data ◽

Performance Levels

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.

A Machine Learning Approach to Biodiversity Time Series Analysis

SSRN Electronic Journal ◽

10.2139/ssrn.3520735 ◽

2020 ◽

Author(s):

Rajarshi Paul ◽

Th. Shanta Kumar

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Analysis ◽

Learning Approach ◽

Series Analysis ◽

Machine Learning Approach

Impact of Near-Time Information for Prediction on Microeconomic Balanced Time Series Data using Different Machine Learning Methods

SSRN Electronic Journal ◽

10.2139/ssrn.3559645 ◽

2020 ◽

Author(s):

Frederik Collin ◽

Martin Kies

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Series Data ◽

Learning Methods ◽

Machine Learning Methods ◽

Time Information

Development of A Drug Early Warning System Model for Cardiac Arrest Using Deep Learning: Retrospective Cohort Study (Preprint)

10.2196/preprints.26783 ◽

2020 ◽

Author(s):

Hsiao-Ko Chang ◽

Hui-Chih Wang ◽

Chih-Fen Huang ◽

Feipei Lai

Keyword(s):

Machine Learning ◽

Time Series ◽

Cardiac Arrest ◽

Early Warning ◽

Time Series Data ◽

Predictive Accuracy ◽

Vital Signs ◽

Warning System ◽

Series Data ◽

Dynamic Time

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA).

OBJECTIVE We seek to develop a Drug Early Warning System Model (DEWSM) that includes drug injections and vital signs as its important features. We use it to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion.

METHODS We propose this new model for detecting cardiac arrest via drug classification and by using a sliding window; we apply learning-based algorithms to time-series data for a DEWSM. By treating drug features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model, we use the area under the receiver operating characteristic curve (AUROC).

RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits (intravenous therapy), and replenishers and regulators of water and electrolytes (fluid and electrolyte supplement). The best AUROC of bits is 85%, meaning that with the expert-suggested drug feature bits, which affects the vital signs, the model’s classification of patients with CPR reaches 85%; that of replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves the accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for the drugs significantly affects prediction. (3) We use a better model: In addition to traditional machine learning, this study adds a new AI technology: the long short-term memory (LSTM) model with the best time-series accuracy, comparable to the traditional random forest (RF) model; the two AUROC measures are 85%. It can be seen that the new AI technology achieves results currently comparable in accuracy to the traditional, commonly used RF, and the LSTM model can be adjusted in the future to obtain better results. (4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event.

CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. The National Early Warning Score (NEWS) only focuses on the score of vital signs, and does not include factors related to drug injections. In this study, the experimental results with drug injections added are better than those with vital signs only. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, we use traditional machine learning methods and deep learning (with LSTM as the main method for processing time-series data) as the basis for comparison in this research. The proposed DEWSM, which offers 4-hour predictions, is better than the NEWS in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.
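The sliding-window idea with a prediction lead time can be illustrated with a small sketch. It is not the authors' pipeline: the function name `make_windows`, the parameters `width` and `lead`, and the sample readings are all assumptions made for the example. Each window of consecutive readings is paired with the event label observed `lead` steps after the window ends, which mirrors predicting an event some hours ahead.

```python
from typing import List, Sequence, Tuple


def make_windows(
    values: Sequence[float],
    labels: Sequence[int],
    width: int,
    lead: int,
) -> List[Tuple[List[float], int]]:
    """Pair each window of `width` readings with the label `lead` steps later."""
    samples = []
    last_start = len(values) - width - lead
    for start in range(last_start + 1):
        end = start + width                  # exclusive end of the window
        target_index = end - 1 + lead        # label observed `lead` steps after the window
        samples.append((list(values[start:end]), labels[target_index]))
    return samples


# Hypothetical hourly heart-rate readings and event flags (invented values).
heart_rate = [80, 82, 85, 90, 120, 130]
event_flag = [0, 0, 0, 0, 1, 1]

samples = make_windows(heart_rate, event_flag, width=3, lead=1)
for window, label in samples:
    print(window, "->", label)
```

With hourly data, `lead=4` would correspond to the 4-hour horizon mentioned in the abstract. The resulting (window, label) pairs could then be fed to any classifier, such as a random forest or an LSTM.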

Forecasting PV Panel Output Using Prophet Time Series Machine Learning Model

2020 IEEE REGION 10 CONFERENCE (TENCON) ◽

10.1109/tencon50793.2020.9293751 ◽

2020 ◽

Author(s):

Md. Mehedi Hasan Shawon ◽

Sumaiya Akter ◽

Md. Kamrul Islam ◽

Sabbir Ahmed ◽

Md. Mosaddequr Rahman

Keyword(s):

Machine Learning ◽

Time Series ◽

Learning Model ◽

Pv Panel ◽

Machine Learning Model

Color Trend Analysis using Machine Learning with Fashion Collection Images

Clothing and Textiles Research Journal ◽

10.1177/0887302x21995948 ◽

2021 ◽

pp. 0887302X2199594

Author(s):

Ahyoung Han ◽

Jihoon Kim ◽

Jaehong Ahn

Keyword(s):

Machine Learning ◽

Trend Analysis ◽

Image Data ◽

Fashion Industry ◽

Color Palette ◽

Web Scraping ◽

Design Variables ◽

Sales Organizations ◽

Fashion Designers ◽

Selection Of

Fashion color trends are an essential marketing element that directly affects brand sales. Organizations such as Pantone have global authority over professional color standards by annually forecasting color palettes. However, the question remains whether fashion designers apply these colors in fashion shows that guide seasonal fashion trends. This study analyzed image data from fashion collections through machine learning to obtain measurable results by web-scraping catwalk images, separating body and clothing elements via machine learning, defining a selection of color chips using k-means algorithms, and analyzing the similarity between the Pantone color palette (16 colors) and the analysis color chips. The gap between the Pantone trends and the colors used in fashion collections was quantitatively analyzed and found to be significant. This study indicates the potential of machine learning within the fashion industry to guide production and suggests that further research expand on other design variables.
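The color-chip step described above (k-means clustering of pixel colors, followed by a comparison against a reference palette) might look roughly like the following sketch. It rests on several assumptions that are not in the abstract: colors are treated as RGB triples, similarity is measured by Euclidean distance, and the centroids are initialized from the first k pixels to keep the example deterministic. The pixel and palette values are invented.

```python
from math import dist
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def kmeans(points: Sequence[Color], k: int, iterations: int = 20) -> List[Color]:
    """Plain k-means on color triples; returns the k centroid colors."""
    # Assumption: deterministic initialization from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters: List[List[Color]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # Mean of each channel over the cluster members.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids


def nearest_palette_distance(chip: Color, palette: Sequence[Color]) -> float:
    """Distance from one color chip to the closest reference palette color."""
    return min(dist(chip, ref) for ref in palette)


# Hypothetical clothing pixels and a two-color reference palette (invented values).
pixels = [(250, 0, 0), (0, 0, 250), (240, 10, 10), (10, 10, 240)]
palette = [(255, 0, 0), (0, 0, 255)]

chips = kmeans(pixels, k=2)
gaps = [nearest_palette_distance(chip, palette) for chip in chips]
print(chips, gaps)
```

A perceptual color space such as CIELAB would likely be a better basis for similarity than raw RGB; RGB is used here only to keep the sketch free of external dependencies.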
