Data Mining for Source Apportionment of Trace Elements in Water and Solid Matrix

Author(s):  
Yao Shan ◽  
Jianjun Shi

Trace elements migrate among different environmental media through natural geochemical reactions and are also affected by human industrial, agricultural, and civil activities. High loads of trace elements in water, river and lake sediment, soil, and air particles pose potential risks to human health and to ecological systems. To control this environmental impact, source apportionment is a meaningful but challenging task. Traditional source apportionment methods are usually based on geochemical techniques or univariate analysis. In recent years, multivariate analysis and the related fields of data mining, machine learning, and big data have developed rapidly, providing a novel route that combines geochemical and data mining techniques. These methods have proved successful in addressing source apportionment. This chapter reviews the data mining methods used for this topic and their implementations in recent years. The basic methods include principal component analysis, factor analysis, cluster analysis, positive matrix factorization, decision trees, Bayesian networks, and artificial neural networks. Source apportionment of trace elements in surface water, groundwater, river and lake sediment, soil, air particles, and dust is discussed.
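As an illustration of how principal component analysis can hint at a shared source, the sketch below extracts the first principal component from a small table of element concentrations. It is not taken from the chapter: the concentrations are synthetic, the element selection is an assumption, and the helper names (`standardize`, `correlation_matrix`, `first_component`) are hypothetical. Power iteration is used only to keep the example free of third-party libraries.

```python
import math

# Synthetic, illustrative concentrations (NOT measured data).
# Pb, Zn and Cu are constructed to rise together; Fe varies independently.
ELEMENTS = ["Pb", "Zn", "Cu", "Fe"]
SAMPLES = [
    [11, 22, 5, 30],
    [19, 41, 11, 10],
    [31, 58, 14, 15],
    [40, 83, 21, 25],
    [52, 99, 24, 12],
    [59, 121, 31, 28],
]


def standardize(rows):
    """Convert each column to z-scores (zero mean, unit sample variance)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and its unit eigenvector."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue, v


corr = correlation_matrix(standardize(SAMPLES))
eigenvalue, loadings = first_component(corr)
# The trace of a correlation matrix equals the number of variables.
explained = eigenvalue / len(ELEMENTS)

for name, loading in zip(ELEMENTS, loadings):
    print(f"{name}: {loading:+.2f}")
print(f"variance explained by PC1: {explained:.0%}")
```

In this synthetic case the elements that co-vary receive large coefficients of the same sign on the first component while Fe receives a small one, which is the kind of pattern usually read as a common source. Real studies would typically rely on an established statistics package and interpret components alongside geochemical evidence.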






2009 ◽  
Vol 147-149 ◽  
pp. 588-593 ◽  
Author(s):  
Marcin Derlatka ◽  
Jolanta Pauk

This paper proposes a procedure for processing biomechanical data. It consists of selecting suitable noise-free data, preprocessing the data by means of model identification and Kernel Principal Component Analysis, and then classifying it with a decision tree. The classification results for three groups (normal gait and two selected gait pathologies, Spina Bifida and Cerebral Palsy) were very good.



2021 ◽  
Vol 15 (6) ◽  
pp. 1812-1819
Author(s):  
Azita Yazdani ◽  
Ramin Ravangard ◽  
Roxana Sharifian

The new coronavirus has been spreading since the beginning of 2020, and many efforts have been made to develop vaccines to help patients recover. It is now clear that the world needs a rapid solution to curb the spread of COVID-19 using non-clinical approaches such as data mining, enhanced intelligence, and other artificial intelligence techniques. These approaches can be effective in reducing the burden on the health care system and in providing the best possible way to diagnose and predict the COVID-19 epidemic. In this study, data mining models for the early detection of COVID-19 were developed using an epidemiological dataset of patients and individuals suspected of having COVID-19 in Iran. C4.5, support vector machine, Naive Bayes, logistic regression, random forest, and k-nearest neighbor algorithms were applied directly to the dataset using RapidMiner to develop the models. Given a patient's clinical signs, the model assesses the risk of contracting COVID-19. Evaluation of the models showed that the support vector machine, with 93.41% accuracy, was the best of the developed models for diagnosing patients with COVID-19.
Keywords: COVID-19, Data mining, Machine Learning, Artificial Intelligence, Classification
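For readers unfamiliar with how an accuracy figure like the one above is obtained, the following minimal sketch derives accuracy, sensitivity, and specificity from a binary confusion matrix. The counts are hypothetical and are not the study's results; the function name `classification_metrics` is likewise an assumption made for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Summary metrics for a binary classifier from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of all cases that were labeled correctly.
        "accuracy": (tp + tn) / total,
        # Share of truly positive cases that were detected.
        "sensitivity": tp / (tp + fn),
        # Share of truly negative cases that were correctly cleared.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for 100 test cases (NOT taken from the study).
metrics = classification_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Accuracy alone can hide an imbalance between missed cases and false alarms, which is why sensitivity and specificity are usually reported alongside it in diagnostic settings.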



2016 ◽  
Author(s):  
Zhaolian Ye ◽  
Jiashu Liu ◽  
Aijun Gu ◽  
Feifei Feng ◽  
Yuhai Liu ◽  
...  

Abstract. Knowledge of aerosol chemistry in densely populated regions is critical for reducing air pollution, yet such studies have not been conducted in Changzhou, an important manufacturing base and polluted city in the Yangtze River Delta (YRD), China. This work, for the first time, performed a thorough chemical characterization of fine particulate matter (PM2.5) samples collected from July 2015 to April 2016 across four seasons in Changzhou. A suite of analytical techniques was employed to characterize organic carbon / elemental carbon (OC / EC), water-soluble organic carbon (WSOC), water-soluble inorganic ions (WSIIs), trace elements, and polycyclic aromatic hydrocarbons (PAHs) in PM2.5; in particular, an Aerodyne soot particle aerosol mass spectrometer (SP-AMS) was deployed to probe the chemical properties of water-soluble organic aerosols (WSOA). The average PM2.5 concentration was 108.3 μg m⁻³, and all identified species together reconstructed ~ 80 % of the PM2.5 mass. The WSIIs accounted for about half of the PM2.5 mass (~ 52.1 %), with SO₄²⁻, NO₃⁻ and NH₄⁺ as the major ions. On average, nitrate concentrations dominated over sulfate (mass ratio of 1.21), indicating influences from traffic emissions. OC and EC correlated well with each other, and the highest OC / EC ratio (5.16) occurred in winter, suggesting complex OC sources that likely include both secondarily formed and primarily emitted OA. Eight trace elements (Mn, Zn, Al, B, Cr, Cu, Fe, Pb) together contributed up to 6.0 % of PM2.5 during winter. PAH concentrations were also high in winter (140.25 ng m⁻³) and were dominated by medium/high molecular weight PAHs with 5 and 6 rings. Organic matter, including both water-soluble and water-insoluble species, accounted for ~ 20 % of the PM2.5 mass. SP-AMS determined that the WSOA had average atomic oxygen-to-carbon (O / C), hydrogen-to-carbon (H / C), nitrogen-to-carbon (N / C) and organic matter-to-organic carbon (OM / OC) ratios of 0.36, 1.54, 0.11, and 1.74, respectively. Source apportionment of WSOA further identified two secondary OA (SOA) factors (a less oxidized and a more oxidized OA) and two primary OA (POA) factors (a nitrogen-enriched hydrocarbon-like traffic OA and a cooking-related OA). On average, the POA contribution outweighed that of SOA (55 % vs. 45 %), indicating the important role of local anthropogenic emissions in aerosol pollution in Changzhou. Our measurements also show an abundance of organic nitrogen species in WSOA, and the source analyses suggest that these species are likely associated with traffic emissions, which warrants further investigation of PM samples from other locations.



2005 ◽  
Vol 3 (S1) ◽  
pp. 301-308 ◽  
Author(s):  
Luciano Favretto ◽  
Luciana Gabrielli Favretto


2021 ◽  
Vol 35 (3) ◽  
pp. 209-215
Author(s):  
Pratibha Verma ◽  
Vineet Kumar Awasthi ◽  
Sanat Kumar Sahu

Data mining techniques are combined with ensemble learning and deep learning for classification. The classification methods used are a single C5.0 tree (C5.0), Classification and Regression Tree (CART), kernel-based Support Vector Machine (SVM) with a linear kernel, an ensemble (CART, SVM, C5.0), a single-hidden-layer neural network (NN), a neural network with Principal Component Analysis (PCA-NN), the deep learning-based H2OBinomialModel-Deeplearning (HBM-DNN), and Enhanced H2OBinomialModel-Deeplearning (EHBM-DNN). In this study, experiments were conducted on pre-processed datasets using R and 10-fold cross-validation. The findings show that the ensemble model (CART, SVM and C5.0) and EHBM-DNN classify more accurately than the other methods.
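The two generic ideas named in this abstract, 10-fold cross-validation and combining several classifiers into an ensemble, can be sketched as follows. This is written in Python rather than the R used by the authors, it is not their implementation, and majority voting is only one assumed way of combining models (the abstract does not state the combination rule). The function names and the prediction lists are hypothetical.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one.

    Indices are taken in order. Real experiments would normally shuffle
    (and often stratify) the samples before splitting.
    """
    indices = list(range(n_samples))
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def majority_vote(predictions):
    """Combine per-model label lists by taking the most common label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


folds = list(k_fold_indices(25, k=10))

# Hypothetical predictions from three models on five samples.
cart_pred = [1, 0, 1, 1, 0]
svm_pred = [1, 1, 0, 1, 0]
c50_pred = [0, 0, 1, 1, 1]
ensemble = majority_vote([cart_pred, svm_pred, c50_pred])
print(ensemble)
```

With an odd number of models and two classes, a majority always exists, so no tie-breaking rule is needed in this small example.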



Author(s):  
Alfonso Palmer ◽  
Rafael Jimenez ◽  
Elena Gervill

