Machine learning approach for simulation of heavy metal concentration in river water: the Crimean peninsula case study

This study proposes an approach for simulating heavy metal concentrations in river water using machine learning techniques. A regression model was built that captures the relationship between the concentration of heavy metals and metalloids (HMM) and several characteristics of the studied catchment. Machine learning techniques made it possible to simulate the annual variability of HMM concentrations. The approach also allows the impact of different factors on the studied processes to be explored.

A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v8i1.521 ◽

2018 ◽

Vol 8 (1) ◽

pp. 14

Author(s):

Padmavathi S. ◽

M. Chidambaram

Keyword(s):

Machine Learning ◽

Text Classification ◽

Fixed Number ◽

Machine Learning Techniques ◽

Online Information ◽

Rule Based ◽

Learning Techniques ◽

Machine Learning Approach ◽

Rule Based Approach

Text classification has grown more significant in managing and organizing text data due to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning approach, the classification rules (the classifier) are derived automatically from example documents; this approach offers higher recall and faster processing. This paper surveys text classification using different machine learning techniques.

Selecting optimal SpMV realizations for GPUs via machine learning

The International Journal of High Performance Computing Applications ◽

10.1177/1094342021990738 ◽

2021 ◽

pp. 109434202199073

Author(s):

Ernesto Dufrechou ◽

Pablo Ezzatti ◽

Enrique S Quintana-Ortí

Keyword(s):

Machine Learning ◽

Sparse Matrix ◽

Machine Learning Techniques ◽

Optimal Method ◽

Learning Techniques ◽

General Rules ◽

Machine Learning Approach ◽

The Matrix ◽

Time And Energy ◽

Matrix Vector

More than 10 years of research on efficient GPU routines for the sparse matrix-vector product (SpMV) have led to several realizations, each with its own strengths and weaknesses. In this work, we review some of the most relevant efforts on the subject, evaluate a few prominent, publicly available routines using more than 3000 matrices from different applications, and apply machine learning techniques to anticipate which SpMV realization will perform best for each sparse matrix on a given parallel platform. Our numerical experiments confirm that the methods behave so differently depending on the matrix structure that identifying general rules to select the optimal method for a given matrix becomes extremely difficult, though some useful strategies (heuristics) can be defined. Using a machine learning approach, we show that it is possible to obtain inexpensive classifiers that predict the best method for a given sparse matrix with over 80% accuracy, demonstrating that this approach can deliver important reductions in both execution time and energy consumption.

Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach

INFORMS Journal on Computing ◽

10.1287/ijoc.2020.1019 ◽

2021 ◽

Author(s):

Gediminas Adomavicius ◽

Yaqiong Wang

Keyword(s):

Machine Learning ◽

General Purpose ◽

Reliability Estimation ◽

Machine Learning Techniques ◽

Data Sets ◽

Real World Data ◽

Learning Techniques ◽

Reliability Indicator ◽

Machine Learning Approach ◽

Prediction Reliability

Numerical predictive modeling is widely used in different application domains. Although many modeling techniques have been proposed, and a number of different aggregate accuracy metrics exist for evaluating the overall performance of predictive models, other important aspects, such as the reliability (or confidence and uncertainty) of individual predictions, have been underexplored. We propose to use estimated absolute prediction error as the indicator of individual prediction reliability, which has the benefits of being intuitive and providing highly interpretable information to decision makers, as well as allowing for more precise evaluation of reliability estimation quality. Equally important, the proposed reliability indicator allows reliability estimation itself to be reframed as a canonical numeric prediction problem, which makes the proposed approach general-purpose (i.e., it can work in conjunction with any outcome prediction model), alleviates the need for distributional assumptions, and enables the use of advanced, state-of-the-art machine learning techniques to learn individual prediction reliability patterns directly from data. Extensive experimental results on multiple real-world data sets show that the proposed machine learning-based approach can significantly improve individual prediction reliability estimation as compared with a number of baselines from prior work, especially in more complex predictive scenarios.
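
As a rough illustration of the reframing described in this abstract, the sketch below treats reliability estimation as a second regression problem: an outcome model is fitted first, and a separate model is then fitted to its absolute errors on held-out points. This is not the authors' implementation. The synthetic data, the simple least-squares line used for both models, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: reliability estimation recast as a second
# regression problem. All data and helper names here are assumptions.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b for 1-D inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def predict(model, x):
    """Evaluate a fitted line (a, b) at x."""
    a, b = model
    return a * x + b


# Synthetic data: the noise magnitude grows with x, so predictions
# at large x should be flagged as less reliable.
xs = [float(i) for i in range(1, 41)]
ys = [2.0 * x + (0.5 * x if i % 2 == 0 else -0.5 * x)
      for i, x in enumerate(xs)]

# Split so that both halves contain positive and negative noise.
train = [i for i in range(len(xs)) if i % 4 < 2]
held_out = [i for i in range(len(xs)) if i % 4 >= 2]

# Step 1: fit the outcome model.
outcome_model = fit_line([xs[i] for i in train], [ys[i] for i in train])

# Step 2: the absolute errors of the outcome model on held-out points
# become the target of a second, ordinary regression problem.
abs_errors = [abs(ys[i] - predict(outcome_model, xs[i])) for i in held_out]
error_model = fit_line([xs[i] for i in held_out], abs_errors)


def estimated_abs_error(x):
    """Reliability indicator: predicted absolute error at input x."""
    return predict(error_model, x)
```

In this toy setting a larger value of `estimated_abs_error(x)` marks a less reliable prediction. Any regression learner could stand in for `fit_line` in either step, which is the sense in which the abstract calls the approach general-purpose.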

INVESTIGATION INTO HEAVY METAL CONCENTRATION BY THE GRAVEL ROADSIDES / SUNKIŲJŲ METALŲ KONCENTRACIJŲ ŽVYRKELIŲ PAKELIŲ DIRVOŽEMIUOSE VERTINIMAS / ОЦЕНКА КОНЦЕНТРАЦИЙ ТЯЖЕЛЫХ МЕТАЛЛОВ В ПОЧВЕ ОБОЧИН ГРАВИЙНЫХ ДОРОГ

Journal of Environmental Engineering and Landscape Management ◽

10.3846/16486897.2011.557474 ◽

2011 ◽

Vol 19 (1) ◽

pp. 89-100 ◽

Cited By ~ 9

Author(s):

Audronė Mikalajunė ◽

Lina Jakučionytė

Keyword(s):

Heavy Metals ◽

Heavy Metal ◽

Metal Concentration ◽

Road Dust ◽

Heavy Metal Concentration ◽

Concentration Limit ◽

Roadside Soils ◽

Oil Emulsion ◽

Gravel Roads ◽

The Impact

Vehicles release large amounts of heavy metals into the environment. Many investigations have analysed the distribution of heavy metals in soils near heavily trafficked regional roads. However, there is a lack of investigations into the impact of low-traffic gravel roads on roadside soil contamination with heavy metals. The object of this investigation is four gravel roads of local significance connecting small villages. Traffic intensity on these roads is very low. The gravel roads were chosen according to the dust-minimizing materials applied to them, for example CaCl2 and oil emulsion. According to our results, none of the soil samples exceeded the heavy metal concentration limit. In addition, heavy metal concentrations decreased with increasing distance from the road. We can assume that road dust-minimizing materials do not have a significant impact on heavy metal distribution in roadside soils. The major factors governing the distribution of heavy metal pollution in roadside soils are traffic intensity, roadside trenches, and topographic conditions.

Summary (translated from Lithuanian): The operation of motor vehicles releases large amounts of heavy metals into the environment. Many studies have examined the spread of heavy metals in soil next to busy main roads, but low-traffic roads have received little study in this respect. Four gravel roads were chosen for study: roads of local significance connecting small settlements. Traffic intensity on these roads is low. The gravel roads were selected according to the dust-suppression measures applied: two of the gravel roads studied were treated with CaCl2 and the other two with oil emulsion. In no sample did heavy metal concentrations exceed the maximum permissible concentration (DLK), and concentrations were lower with increasing distance from the carriageway. It can be assumed that the road-treatment materials used for dust suppression do not significantly affect the distribution of heavy metals in roadside soil; the distribution is determined by traffic flow intensity, roadside ditches, and relief conditions.

Summary (translated from Russian): When motor vehicles are operated, large amounts of heavy metals enter the environment. Many studies have analysed the spread of heavy metals in the roadside soil of heavily used main roads, but few studies currently address the same problems for low-traffic roads. In this work, four gravel-surfaced roads of local significance connecting small settlements were chosen as the object of study. Traffic intensity on the roads is low. The gravel roads were selected with regard to their dust-suppression treatment: two roads were treated with CaCl2 and the other two with oil emulsion. No sample showed heavy metal concentrations exceeding the permissible limits. Heavy metal concentrations decreased with distance from the carriageway. The study supports the conclusion that the materials used to reduce road dust have no major effect on the spread of heavy metals in roadside soil. The spread of heavy metals in roadside soil is influenced by traffic flow intensity, roadside ditches, and relief conditions.

A two point machine learning method for spatial prediction for soil: overcoming the spatially heterogeneous distribution and relationship of soil heavy metal concentration

10.5194/ismc2021-37 ◽

2021 ◽

Author(s):

Gao Bingbo ◽

Alfred Stein ◽

Wang Jinfeng

Keyword(s):

Machine Learning ◽

Heavy Metal ◽

Metal Concentration ◽

Prediction Accuracy ◽

Spatial Prediction ◽

Heavy Metal Concentration ◽

Machine Learning Method ◽

Learning Method ◽

Soil Heavy Metal ◽

The Difference

Soil heavy metal contamination has become a serious problem worldwide. Accurately predicting soil heavy metal concentrations at unsampled locations from a small sample remains a challenge, because many natural and human factors produce a complex, heterogeneous pattern, and the relationships between influencing factors are also not homogeneous. To overcome these heterogeneities and improve prediction accuracy, this paper proposes a two-point machine learning method that fully leverages the spatial relationship and the similarity relationship of high-dimensional ancillary variables. It first models the difference between paired points using a machine learning model, then predicts the concentration differences between the sampling points and the unsampled points, and finally uses the predicted differences to choose near neighbors and obtain the final concentration prediction. The method introduces an innovative way of searching for near neighbors for the local model, based on the difference in the response variable, to overcome the curse of dimensionality. Its performance was illustrated in two diverse case studies, which demonstrated that the proposed method can dramatically improve prediction accuracy for soil heavy metals. Besides spatial prediction of soil pollution, it can also be applied to spatial prediction of other elements of the earth system. Furthermore, the machine learning model used in this paper can be replaced with any other supervised learning model according to the specific situation.
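
The three steps named in this abstract (model pairwise differences, predict differences to the unsampled point, pick near neighbors from those predicted differences) can be sketched on toy data as follows. This is a loose illustration, not the authors' method. The single ancillary variable, the linear difference model, the choice of `k`, and the final averaging rule are all assumptions introduced here.

```python
# Illustrative sketch only: a toy version of the "two-point" idea.
# The data, the linear difference model, and k are all assumptions.

# Sampled points: (ancillary variable z, measured concentration c).
samples = [(1.0, 4.0), (2.0, 7.0), (3.0, 10.0), (4.0, 13.0), (6.0, 19.0)]

# Step 1: learn how concentration differences depend on differences in z,
# using every ordered pair of distinct sampled points (least squares
# through the origin).
pairs = [(zi - zj, ci - cj)
         for i, (zi, ci) in enumerate(samples)
         for j, (zj, cj) in enumerate(samples) if i != j]
slope = sum(dz * dc for dz, dc in pairs) / sum(dz * dz for dz, _ in pairs)


def predict_concentration(z_target, k=2):
    """Predict the concentration at an unsampled point with value z_target."""
    # Step 2: predicted difference (target minus sample) for each sample.
    diffs = [(slope * (z_target - z), c) for z, c in samples]
    # Step 3: the k samples with the smallest predicted |difference|
    # are treated as near neighbors.
    neighbors = sorted(diffs, key=lambda d: abs(d[0]))[:k]
    # Final estimate: average of neighbor value plus predicted difference
    # (this aggregation rule is an assumption, not taken from the paper).
    return sum(c + d for d, c in neighbors) / k
```

Here neighbors are ranked by predicted difference in the response rather than by distance in the space of ancillary variables, which is how the abstract describes avoiding the curse of dimensionality. With many ancillary variables, the linear difference model would be replaced by any supervised learner.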

Machine learning analysis of lifeguard flag decisions and recorded rescues

Natural Hazards and Earth System Science ◽

10.5194/nhess-19-2541-2019 ◽

2019 ◽

Vol 19 (11) ◽

pp. 2541-2549

Author(s):

Chris Houser ◽

Jacob Lehner ◽

Nathan Cherry ◽

Phil Wernette

Keyword(s):

Machine Learning ◽

The United States ◽

Machine Learning Techniques ◽

Public Health Issue ◽

Rip Currents ◽

Effective Strategies ◽

Wave Forcing ◽

Learning Techniques ◽

Yellow Flag ◽

The Impact

Abstract. Rip currents and other surf hazards are an emerging public health issue globally. Lifeguards, warning flags, and signs are important and, to varying degrees, effective strategies to minimize risk to beach users. In the United States and other jurisdictions around the world, lifeguards use coloured flags (green, yellow, and red) to indicate whether the danger posed by the surf and rip hazard is low, moderate, or high, respectively. The choice of flag depends on the lifeguard(s) monitoring the changing surf conditions along the beach and over the course of the day using both regional surf forecasts and careful observation. The chosen flag may not be consistent with beach users' perception of the risk, which may increase the potential for rescues or drownings. In this study, machine learning is used to determine the potential for error in the flags used at Pensacola Beach and the impact of that error on the number of rescues. Results of a decision tree analysis indicate that the flag colour chosen by the lifeguards differed from what the model predicted on 35 % of days between 2004 and 2008 (n=396/1125). Days with a difference between the predicted and posted flag colour represent only 17 % of all rescue days, but those days are associated with ∼60 % of all rescues between 2004 and 2008. Further analysis reveals that the largest number of rescue days and total number of rescues are associated with days where the flag deployed over-estimated the surf and hazard risk, such as a red or yellow flag flying when the model predicted a green flag would be more appropriate based on the wind and wave forcing alone. While it is possible that the lifeguards were overly cautious, it is argued that they most likely identified a rip forced by a transverse-bar and rip morphology common at the study site. Regardless, the results suggest that beach users may be discounting lifeguard warnings if the flag colour is not consistent with how they perceive the surf hazard or the regional forecast. Results suggest that machine learning techniques have the potential to support lifeguards and thereby reduce the number of rescues and drownings.

Monitoring the Impact of Air Quality on the COVID-19 Fatalities in Delhi, India: Using Machine Learning Techniques

Disaster Medicine and Public Health Preparedness ◽

10.1017/dmp.2020.372 ◽

2020 ◽

pp. 1-8

Author(s):

Jasleen Kaur Sethi ◽

Mamta Mittal

Keyword(s):

Machine Learning ◽

Air Quality ◽

Air Pollutants ◽

Machine Learning Techniques ◽

Environmental Restoration ◽

The Novel ◽

Ozone Pollution ◽

Learning Techniques ◽

Novel Coronavirus ◽

The Impact

ABSTRACT Objective: The focus of this study is to monitor the effect of the coronavirus disease (COVID-19) lockdown on various air pollutants and to identify the pollutants that affect COVID-19 fatalities, so that measures to control the pollution can be enforced. Methods: Various machine learning techniques (Decision Trees, Linear Regression, and Random Forest) have been applied to correlate air pollutants and COVID-19 fatalities in Delhi. Furthermore, a comparison between the concentration of various air pollutants and the air quality index during the lockdown period and the previous two years, 2018 and 2019, has been presented. Results: From the experimental work, it has been observed that the pollutants ozone and toluene increased during the lockdown period. It has also been deduced that the pollutants that may impact the mortalities due to COVID-19 are ozone, NH3, NO2, and PM10. Conclusions: The novel coronavirus has led to environmental restoration due to lockdown. However, there is a need to impose measures to control ozone pollution, as there has been a significant increase in its concentration and it also impacts the COVID-19 mortality rate.

Analyzing the impact of red-edge band on land use land cover classification using multispectral RapidEye imagery and machine learning techniques

Journal of Applied Remote Sensing ◽

10.1117/1.jrs.13.044511 ◽

2019 ◽

Vol 13 (04) ◽

pp. 1

Author(s):

Rashmi Saini ◽

Sanjay K. Ghosh

Keyword(s):

Machine Learning ◽

Land Use ◽

Land Cover ◽

Land Cover Classification ◽

Machine Learning Techniques ◽

Land Use Land Cover ◽

Red Edge ◽

Learning Techniques ◽

The Impact ◽

Edge Band

A Hybrid Machine Learning Approach for Freeway Traffic Speed Estimation

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198120935875 ◽

2020 ◽

Vol 2674 (10) ◽

pp. 68-78

Author(s):

Zhao Zhang ◽

Yun Yuan ◽

Xianfeng (Terry) Yang

Keyword(s):

Machine Learning ◽

Flow Model ◽

Hybrid Approach ◽

Traffic Monitoring ◽

Machine Learning Techniques ◽

Learning Approach ◽

Freeway Traffic ◽

Learning Techniques ◽

Machine Learning Approach ◽

Traffic Speed Estimation

Accurate and timely estimation of freeway traffic speeds by short segments plays an important role in traffic monitoring systems. In the literature, the ability of machine learning techniques to capture the stochastic characteristics of traffic has been demonstrated. Also, the deployment of intelligent transportation systems (ITSs) has provided enriched traffic data, which enables the adoption of a variety of machine learning methods to estimate freeway traffic speeds. However, limitations in data quality and coverage remain a big challenge in current traffic monitoring systems. To overcome this problem, this study aims to develop a hybrid machine learning approach, by creating a new training variable based on the second-order traffic flow model, to improve the accuracy of traffic speed estimation. Grounded on a novel integrated framework, the estimation is performed using three machine learning techniques, that is, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN). All three models are trained with the integrated dataset including the traffic flow model estimates and the iPeMS and PeMS data from the Utah Department of Transportation (DOT). Using the PeMS data as the ground truth for model evaluation, the comparisons between the hybrid approach and pure machine learning models show that the hybrid approach can effectively capture the time-varying pattern of the traffic and help improve the estimation accuracy.

Delivering Precision Medicine to Patients with Spinal Cord Disorders; Insights into Applications of Bioinformatics and Machine Learning from Studies of Degenerative Cervical Myelopathy

10.5772/intechopen.98713 ◽

2021 ◽

Author(s):

Kalum J. Ost ◽

David W. Anderson ◽

David W. Cadotte

Keyword(s):

Machine Learning ◽

Precision Medicine ◽

New Technologies ◽

Machine Learning Techniques ◽

Massive Datasets ◽

Learning Framework ◽

Learning Techniques ◽

Machine Learning Approach ◽

Spinal Cord Disorders ◽

Degenerative Cervical Myelopathy

With the common adoption of electronic health records and new technologies capable of producing an unprecedented scale of data, a shift must occur in how we practice medicine in order to utilize these resources. We are entering an era in which the capacity of even the cleverest human doctor is simply insufficient. As such, realizing “personalized” or “precision” medicine requires new methods that can leverage the massive amounts of data now available. Machine learning techniques provide one important toolkit in this venture, as they are fundamentally designed to deal with (and, in fact, benefit from) massive datasets. The clinical applications for such machine learning systems are still in their infancy, however, and the field of medicine presents a unique set of design considerations. In this chapter, we will walk through how we selected and adjusted the “Progressive Learning framework” to account for these considerations in the case of Degenerative Cervical Myelopathy. We additionally compare a model designed with these techniques to similar static models run in “perfect world” scenarios (free of the clinical issues addressed), and we use simulated clinical data acquisition scenarios to demonstrate the advantages of our machine learning approach in providing personalized diagnoses.
