Analysis of Changes in Pollutant Concentrations Levels Using a Meteorological Normalisation Technique Based on a Machine Learning Algorithm

2021, Vol 8 (1), pp. 16
Author(s): Roberta Valentina Gagliardi, Claudio Andenna

In this study, a methodological procedure is developed to analyse changes in pollutant concentration levels. It combines a meteorological normalisation technique, based on a random forest algorithm, with trend analysis and change point detection in air quality time series. Air pollutant and meteorological data, collected over the period 2013–2019 in a rural area affected by anthropic sources of air pollutants, are used to test the procedure. The results appear promising: the procedure robustly reveals changes in pollutant levels that are not clearly observable in the original data.
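The two downstream steps named in the abstract, trend analysis and change point detection, can be illustrated on a series that has already been meteorologically normalised. The following is a minimal sketch only: the abstract does not say which trend estimator or change point method the authors used, so the Theil-Sen slope and the least-squares single-split detector below are assumptions chosen for simplicity, and the `series` values are hypothetical, not data from the study.

```python
from statistics import mean, median


def theil_sen_slope(values):
    """Robust trend estimate: median of all pairwise slopes (per time step)."""
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i in range(len(values))
        for j in range(i + 1, len(values))
    ]
    return median(slopes)


def squared_error(segment):
    """Sum of squared deviations of a segment from its own mean."""
    centre = mean(segment)
    return sum((v - centre) ** 2 for v in segment)


def single_change_point(values, min_segment=2):
    """Index k that best splits the series into values[:k] and values[k:],
    assuming each segment has a constant mean (least total squared error)."""
    best_k, best_cost = None, float("inf")
    for k in range(min_segment, len(values) - min_segment + 1):
        cost = squared_error(values[:k]) + squared_error(values[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical normalised concentrations (illustrative values only).
series = [10.0, 10.2, 9.9, 10.1, 10.0, 8.1, 7.9, 8.0, 8.2, 8.0]

slope = theil_sen_slope(series)          # negative: levels decrease over time
change_at = single_change_point(series)  # index where the mean level shifts
print(slope, change_at)
```

The sketch handles a single shift in the mean; a series with several shifts would need a multiple change point method.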

2021, Vol 8 (3), pp. 209-221
Author(s): Li-Li Wei, Yue-Shuai Pan, Yan Zhang, Kai Chen, Hao-Yu Wang, ...

Objective: To study the application of a machine learning algorithm for predicting gestational diabetes mellitus (GDM) in early pregnancy.
Methods: Indicators related to GDM were identified through a literature review and expert discussion. Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis, and the collected indicators were analyzed retrospectively. The indicators were classified and modeled in Python using a random forest regression algorithm, and the performance of the prediction model was analyzed.
Results: A total of 4806 analyzable records were obtained from 1625 pregnant women. Of these, 3265 samples with all 67 indicators were used to establish data set F1, and 4806 samples with 38 identical indicators were used to establish data set F2. F1 and F2 were each used to train the random forest algorithm. For the F1 model, the overall predictive accuracy was 93.10%, the area under the receiver operating characteristic curve (AUC) was 0.66, and the predictive accuracy for GDM-positive cases was 37.10%. The corresponding values for the F2 model were 88.70%, 0.87, and 79.44%. The F2 prediction model thus performed better than the F1 model. To explore the impact of the sacrificed indicators on GDM prediction, the F3 data set was established using the 3265 samples of F1 restricted to the 38 indicators of F2. After training, the overall predictive accuracy of the F3 model was 91.60%, the AUC was 0.58, and the predictive accuracy for positive cases was 15.85%.
Conclusions: A model for predicting GDM from several input variables (e.g., physical examination, past history, personal history, family history, and laboratory indicators) was established using a random forest regression algorithm. The trained prediction model performed well and is valuable as a reference for predicting GDM in women at an early stage of pregnancy. In addition, applying the random forest algorithm to the early prediction of GDM places certain requirements on the proportions of negative and positive cases in the sample data sets.
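The gap the abstract reports between overall accuracy and accuracy on GDM-positive cases is what one would expect when positive cases are rare. A minimal sketch of the arithmetic follows. The function name `classification_rates` and the confusion-matrix counts are hypothetical and chosen only to illustrate the effect; they are not taken from the study, which does not report its confusion matrices in the abstract.

```python
def classification_rates(tp, fn, tn, fp):
    """Overall accuracy and per-class accuracy from confusion-matrix counts.

    tp/fn: positive cases predicted correctly / missed
    tn/fp: negative cases predicted correctly / flagged wrongly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "positive_accuracy": tp / (tp + fn),  # also called recall or sensitivity
        "negative_accuracy": tn / (tn + fp),  # also called specificity
    }


# Hypothetical imbalanced sample: 1000 cases, only 50 of them positive.
rates = classification_rates(tp=10, fn=40, tn=940, fp=10)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

With these illustrative counts the model is right on 95% of all cases while finding only 20% of the positive ones, which is why the proportion of positive and negative cases matters when judging such a model by overall accuracy alone.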


This paper proposes a solution for identifying whether a transaction is fraudulent. Because most transactions now take place online, they can be stolen in transit and cause problems for users. The paper therefore focuses on several machine learning algorithms, for example random forest, decision tree, logistic regression, support vector machine, k-nearest neighbours, and XGBoost, that aim to solve this kind of real-world problem.


2021, Vol 13 (2), pp. 447
Author(s): Ping Wang, Xuran He, Hongyinping Feng, Guisheng Zhang, Chenglu Rong

PM2.5 concentration prediction is an important task in atmospheric environment research, and many prediction models have been established for it, including machine learning algorithms, which show remarkable generalization ability. Time series of PM2.5 concentration have implicit structural characteristics, such as sequence characteristics in the time dimension and high-dimensional characteristics in dynamic-mode space, which distinguish them from other research data. However, when a machine learning algorithm is applied to PM2.5 time series prediction, these structural characteristics cannot be fully reflected because of the way the input data are composed. In our study, a neighbor structural information extraction algorithm based on dynamic decomposition is proposed to represent the structural characteristics of the time series, and a new hybrid prediction system is established that uses the extracted neighbor structural information to improve the accuracy of PM2.5 concentration prediction. To extract the neighbor structural information, the original PM2.5 concentration series is decomposed into a finite number of dynamic modes according to the neighborhood data, which reflect the structural characteristics of the time series. The hybrid model integrates the neighbor structural information in the form of an input vector, which ensures that the information can be applied and retains the composition form of the original prediction system. Experimental results for six cities show that the hybrid prediction systems integrating neighbor structural information are significantly superior to the traditional models, and confirm that the neighbor structural information extraction algorithm can capture effective structural information from the time series.
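The idea of keeping the usual input composition and attaching extra structural information to each input vector can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm. The dynamic decomposition itself is not described in the abstract, so the two appended features (window mean and net change across the window) are hypothetical stand-ins, as are the function names and the `pm25` values.

```python
def lagged_inputs(series, n_lags):
    """Sliding-window composition: each input holds n_lags consecutive
    values and the target is the value that follows them."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


def append_window_features(inputs):
    """Append stand-in structural features to every input vector.

    Hypothetical placeholders for the paper's neighbor structural
    information: the window mean and the net change across the window.
    """
    return [
        window + [sum(window) / len(window), window[-1] - window[0]]
        for window in inputs
    ]


# Hypothetical hourly PM2.5 readings (illustrative values only).
pm25 = [35.0, 40.0, 38.0, 50.0, 47.0, 52.0]

X, y = lagged_inputs(pm25, n_lags=3)
X_aug = append_window_features(X)  # same rows, two extra columns each
print(X_aug[0], y[0])
```

Because the extra information enters only as additional columns, any model that accepts the original lagged vectors can be trained on the augmented ones without changing its structure.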

