Towards spatial machine learning to reveal hidden patterns and relationships in national and international geochemical databases

Author(s):  
Chaosheng Zhang

Environmental geochemistry plays an increasingly important role in mineral exploration, environmental management, agricultural practices, and links with health. With rapidly growing databases available at regional, national, and global scales, environmental geochemistry faces the challenges of the “big data” era. One of the main challenges is to extract useful information hidden in large volumes of data, given that spatial variation exists at all scales: global, regional (square kilometers), field (square meters) and micro (square centimeters). Meanwhile, rapidly developing machine learning techniques have become useful tools for classification, identification of clusters/patterns, identification of relationships, and prediction. This presentation demonstrates the potential uses of a few practical spatial machine learning techniques (spatial analyses) in environmental geochemistry: neighborhood statistics, hot spot analysis and geographically weighted regression.

Neighborhood (local) statistics are calculated using data within a neighborhood such as a moving window. In this way, spatial variation at the local level can be quantified and more detail is revealed. Hot spot analysis techniques are capable of revealing hidden spatial patterns. The hot spot analysis techniques, including the local indicator of spatial association (LISA) and Getis-Ord Gi*, are investigated using examples from geochemical databases in Ireland, China, the UK and the USA. Geographically weighted regression (GWR) explores the relationships between geochemical parameters and their influencing factors at the local level and is effective in identifying complex spatially varying relationships. Machine learning techniques are expected to play increasingly important roles in environmental geochemistry, and challenges for more effective “data analytics” are emerging in the era of “big data”.
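As an illustration of the hot spot analysis described above, the following is a minimal sketch of a Getis-Ord Gi* computation with the Python packages libpysal and esda; the coordinates and the "Cu" concentration column are synthetic stand-ins rather than data from the databases discussed in the presentation.

```python
# Getis-Ord Gi* hot spot analysis on synthetic point geochemical data.
import numpy as np
import pandas as pd
from libpysal.weights import KNN
from esda.getisord import G_Local

# Synthetic sample locations and element concentrations (illustrative only)
df = pd.DataFrame({
    "x": np.random.uniform(0, 100, 200),
    "y": np.random.uniform(0, 100, 200),
    "Cu": np.random.lognormal(3, 1, 200),
})

# Neighborhood defined by the 8 nearest sample points (a simple "moving window")
w = KNN.from_array(df[["x", "y"]].values, k=8)
w.transform = "r"  # row-standardize the weights

# star=True includes the focal point in its own neighborhood (Gi* rather than Gi)
gi_star = G_Local(df["Cu"].values, w, star=True)

# Samples with significantly high local values are flagged as hot spots
df["z_score"] = gi_star.Zs
df["hot_spot"] = (gi_star.p_sim < 0.05) & (gi_star.Zs > 0)
print(df[df["hot_spot"]].head())
```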

2020, Vol. 9 (3), pp. 164
Author(s):  
Yunyi Xiao

The study uses geographically weighted regression (GWR) and emerging hot spot analysis (EHSA) methodologies to examine the impact of immigrant, ethnic, and racial concentration on patterns of aggravated assault and larceny in Miami.
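A minimal sketch of the GWR step, assuming the Python mgwr package and synthetic tract-level data in place of the Miami crime and census variables used in the study:

```python
# Geographically weighted regression with synthetic stand-in data.
import numpy as np
from mgwr.gwr import GWR
from mgwr.sel_bw import Sel_BW

n = 300
coords = np.random.uniform(0, 50, (n, 2))        # tract centroids (synthetic)
X = np.random.normal(size=(n, 2))                # e.g. immigrant and minority shares
y = (1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1]
     + np.random.normal(scale=0.5, size=n)).reshape(-1, 1)   # e.g. assault rate

# Select an adaptive bandwidth, then fit GWR: one local coefficient
# surface per predictor instead of a single global slope.
bw = Sel_BW(coords, y, X).search()
results = GWR(coords, y, X, bw).fit()
print(results.params.shape)   # (n, 3): local intercept plus two local slopes
```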


Author(s):  
Augusto Cerqua,
Roberta Di Stefano,
Marco Letta,
Sara Miccoli

Estimates of the real death toll of the COVID-19 pandemic have proven to be problematic in many countries, Italy being no exception. Mortality estimates at the local level are even more uncertain as they require stringent conditions, such as granularity and accuracy of the data at hand, which are rarely met. The “official” approach adopted by public institutions to estimate the “excess mortality” during the pandemic draws on a comparison between observed all-cause mortality data for 2020 and averages of mortality figures in the past years for the same period. In this paper, we apply the recently developed machine learning control method to build a more realistic counterfactual scenario of mortality in the absence of COVID-19. We demonstrate that supervised machine learning techniques outperform the official method by substantially improving the prediction accuracy of the local mortality in “ordinary” years, especially in small- and medium-sized municipalities. We then apply the best-performing algorithms to derive estimates of local excess mortality for the period between February and September 2020. Such estimates allow us to provide insights about the demographic evolution of the first wave of the pandemic throughout the country. To help improve diagnostic and monitoring efforts, our dataset is freely available to the research community.
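A hedged sketch of the general idea described above (not the authors' exact pipeline): fit a supervised model on pre-pandemic municipal mortality, predict a counterfactual 2020 baseline, and take excess deaths as observed minus predicted. The file name, predictor columns, and the gradient boosting choice are assumptions for illustration.

```python
# Counterfactual excess-mortality estimation on a hypothetical municipal panel.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Assumed panel: one row per municipality-year, with lagged deaths,
# population and age structure as predictors and "deaths" as the outcome.
panel = pd.read_csv("municipal_mortality_panel.csv")
features = ["deaths_lag1", "deaths_lag2", "population", "share_over_65"]

train = panel[panel["year"] < 2020]    # "ordinary" years used for training
test = panel[panel["year"] == 2020]    # pandemic year to be predicted

model = GradientBoostingRegressor().fit(train[features], train["deaths"])
counterfactual = model.predict(test[features])

# Excess mortality = observed deaths minus the model's counterfactual baseline
excess = test["deaths"].values - counterfactual
print(excess[:10])
```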


2006
Author(s):
Christopher Schreiner,
Kari Torkkola,
Mike Gardner,
Keshu Zhang

2020, Vol. 12 (2), pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, we show that prediction accuracy is improved.
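As a concrete illustration, here is a minimal sketch of one of the three models named above (SVR) with OHLCV-style predictors forecasting the next day's closing price; the CSV file and its column names are illustrative assumptions, not the paper's data.

```python
# Next-day close-price prediction with Support Vector Regression.
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

data = pd.read_csv("stock_history.csv")   # assumed daily price history
cols = ["Open", "High", "Low", "Close", "Adj Close", "Volume"]
X = data[cols].values[:-1]
y = data["Close"].values[1:]               # target: next-day closing price

split = int(0.8 * len(X))                  # chronological train/test split
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[:split], y[:split])
print("test R^2:", model.score(X[split:], y[split:]))
```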


Diabetes, 2020, Vol. 69 (Supplement 1), pp. 389-P
Author(s):
Satoru Kodama,
Mayuko H. Yamada,
Yuta Yaguchi,
Masaru Kitazawa,
Masanori Kaneko,
...  

Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of disease in a patient is critical, as it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may provide different accuracies, so it is imperative to use the method that yields the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.
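The sketch below shows how such a comparison might look with scikit-learn counterparts (DecisionTreeClassifier standing in for Weka's J48, MLPClassifier for the neural network) on the built-in breast cancer dataset; the exact tools and datasets used in the paper may differ.

```python
# Cross-validated accuracy comparison of four classifiers on one dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

classifiers = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree (J48-like)": DecisionTreeClassifier(),
    "Neural Network": MLPClassifier(max_iter=1000),
}

# 5-fold cross-validated accuracy for each classifier
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f}")
```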


Author(s):  
Padmavathi S.,
M. Chidambaram

Text classification has become increasingly significant in managing and organizing text data due to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two approaches to text classification: rule based and machine learning based. In the rule-based approach, documents are classified according to manually defined rules. In the machine-learning-based approach, classification rules or a classifier are learned automatically from example documents, which gives higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
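To make the machine learning approach concrete, the following sketch builds a TF-IDF plus linear-classifier pipeline on the 20 Newsgroups corpus; the corpus and the logistic regression classifier are assumptions for illustration, since the paper surveys several techniques rather than a single setup.

```python
# Machine-learning text classification: learn a classifier from example documents.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

categories = ["sci.space", "rec.autos"]          # two predefined categories
train = fetch_20newsgroups(subset="train", categories=categories)
test = fetch_20newsgroups(subset="test", categories=categories)

# TF-IDF features plus a linear classifier, fitted on the example documents
pipeline = make_pipeline(TfidfVectorizer(stop_words="english"),
                         LogisticRegression(max_iter=1000))
pipeline.fit(train.data, train.target)

print("accuracy:", accuracy_score(test.target, pipeline.predict(test.data)))
```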


Author(s):  
Feidu Akmel,
Ermiyas Birihanu,
Bahir Siraj

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, insurance and so on. Software quality is a means of measuring how software is designed and how well it conforms to that design. Some of the attributes we look for in software quality are correctness, product quality, scalability, completeness and absence of bugs. However, the quality standard used by one organization differs from that of another, so it is better to apply software metrics to measure the quality of software. Attributes gathered from source code through software metrics can serve as input to a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we examine the application of machine learning to software defect data gathered from previous research works.
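A brief sketch of what metrics-based defect prediction can look like: static code metrics as features and a binary "defective" label as the target. The metric names, the file, and the random forest classifier are illustrative assumptions rather than the study's actual setup.

```python
# Defect prediction from source-code metrics on a hypothetical dataset.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Assumed dataset: one row per module, McCabe/Halstead-style metrics plus a label
data = pd.read_csv("defect_metrics.csv")
features = ["loc", "cyclomatic_complexity", "num_methods", "coupling"]

X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["defective"], test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```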

