DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction

Machine learning models trained with SGD often converge slowly and unstably because gradients estimated from random samples have high variance. To speed up convergence and improve stability, this study proposes DisSAGD, a distributed SGD algorithm based on variance reduction. DisSAGD corrects the gradient estimate at each iteration using the gradient variance of historical iterations, without full gradient computation or additional storage; that is, it reduces the mean variance of historical gradients in order to reduce the error in parameter updates. We implemented DisSAGD on distributed clusters, training a machine learning model by sharing parameters among nodes through an asynchronous communication protocol. We also propose an adaptive learning rate strategy and a sampling strategy to address the update lag of the overall parameter distribution, which helps convergence when the parameters deviate from the optimal value: a working node that is faster than another has more time to compute its local gradient and can draw more samples for the next iteration. Our experiments show that DisSAGD significantly reduces waiting times during loop iterations, converges faster than traditional methods, and achieves speedups on distributed clusters.
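
The abstract does not give the DisSAGD update rule, so the following is background only: a minimal sketch of the standard SVRG-style corrected gradient on a one-dimensional least-squares problem. It is not the authors' method. In particular, SVRG computes a full gradient at every snapshot, which DisSAGD is said to avoid. The function name, data, and hyperparameters are all illustrative assumptions.

```python
import random

def svrg_least_squares(xs, ys, lr=0.05, epochs=30, inner_steps=None, seed=0):
    """SVRG for 1-D least squares: minimise mean(0.5 * (w*x - y)^2) over w."""
    rng = random.Random(seed)
    n = len(xs)
    inner_steps = inner_steps or n
    w = 0.0

    def grad_i(w, i):
        # Gradient of the i-th sample loss 0.5 * (w*x_i - y_i)^2.
        return (w * xs[i] - ys[i]) * xs[i]

    for _ in range(epochs):
        w_snap = w
        # Full gradient at the snapshot (the step DisSAGD is said to avoid).
        mu = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Corrected estimate: unbiased, with variance that shrinks
            # as w and w_snap both approach the optimum.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= lr * g
    return w

# Illustrative data lying exactly on y = 2x, so the optimum is w = 2.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.0, 2.0, 3.0, 4.0]
w_hat = svrg_least_squares(xs, ys)
```

The correction term subtracts the sampled gradient at the snapshot and adds back its mean, which keeps the estimate unbiased while cancelling much of the per-sample noise.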

Mapping the Groundwater Level and Soil Moisture of a Montane Peat Bog Using UAV Monitoring and Machine Learning

Remote Sensing ◽

10.3390/rs13050907 ◽

2021 ◽

Vol 13 (5) ◽

pp. 907

Author(s):

Theodora Lendzioch ◽

Jakub Langhammer ◽

Lukáš Vlček ◽

Robert Minařík

Keyword(s):

Machine Learning ◽

Soil Moisture ◽

Spatial Data ◽

Groundwater Level ◽

Vegetation Index ◽

Normalized Difference Vegetation Index ◽

Sampling Strategy ◽

Peat Bog ◽

Strong Impact ◽

Ground Truth Data

A key precondition for adequate monitoring of peat bog ecosystems is the collection, processing, and analysis of unique spatial data to understand peat bog dynamics. Over two seasons, we sampled groundwater level (GWL) and soil moisture (SM) ground truth data at two diverse locations at the Rokytka Peat bog within the Sumava Mountains, Czechia. These data served as reference data and were modeled with a suite of potential variables derived from digital surface models (DSMs) and RGB, multispectral, and thermal orthoimages reflecting topomorphometry, vegetation, and surface temperature information generated from drone mapping. We used 34 predictors to feed the random forest (RF) algorithm. The predictor selection, hyperparameter tuning, and performance assessment were performed with the target-oriented leave-location-out (LLO) spatial cross-validation (CV) strategy combined with forward feature selection (FFS) to avoid overfitting and to predict on unknown locations. The spatial CV performance statistics showed low (R² = 0.12) to high (R² = 0.78) model predictions. The predictor importance was used for model interpretation, where temperature had a strong impact on GWL and SM, and we found significant contributions of other predictors, such as Normalized Difference Vegetation Index (NDVI), Normalized Difference Index (NDI), Enhanced Red-Green-Blue Vegetation Index (ERGBVE), Shape Index (SHP), Green Leaf Index (GLI), Brightness Index (BI), Coloration Index (CI), Redness Index (RI), Primary Colours Hue Index (HI), Overall Hue Index (HUE), SAGA Wetness Index (TWI), Plan Curvature (PlnCurv), Topographic Position Index (TPI), and Vector Ruggedness Measure (VRM).
Additionally, we estimated the area of applicability (AOA) by presenting maps showing where the prediction model yielded high-quality results and where predictions were highly uncertain, because machine learning (ML) models can make predictions far beyond the sampling locations, in environments for which they have no sampling data. The AOA method is uniquely well suited to planning and decision-making about the best sampling strategy, most notably when data are limited.
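
As an illustration of the leave-location-out idea described above, here is a minimal sketch of a splitter that holds out every sample from one location per fold, together with the R² statistic used to report performance. The function names and the location labels are hypothetical and do not come from the paper, which does not publish its code in this listing.

```python
from collections import defaultdict

def leave_location_out_splits(locations):
    """Yield (held_out, train_idx, test_idx), holding out one location per fold."""
    by_location = defaultdict(list)
    for idx, loc in enumerate(locations):
        by_location[loc].append(idx)
    for held_out, test_idx in by_location.items():
        # Train only on samples from the other locations.
        train_idx = [i for i, loc in enumerate(locations) if loc != held_out]
        yield held_out, train_idx, test_idx

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical sample-to-location labels (not from the paper).
locations = ["A", "A", "B", "B", "B", "C"]
folds = list(leave_location_out_splits(locations))
```

Because no test sample shares a location with any training sample, the resulting score estimates performance at unseen locations instead of at points interleaved with the training data.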

A UMDA-Based Discretization Method for Continuous Attributes

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.403-408.1834 ◽

2011 ◽

Vol 403-408 ◽

pp. 1834-1838

Author(s):

Jing Zhao ◽

Chong Zhao Han ◽

Bin Wei ◽

De Qiang Han

Keyword(s):

Machine Learning ◽

Data Mining ◽

Evolutionary Algorithms ◽

Marginal Distribution ◽

Convergence Speed ◽

Fast Convergence ◽

Experimental Results ◽

Discretization Method ◽

Bottom Up ◽

Global Dynamic

Discretization of continuous attributes has played an important role in machine learning and data mining. It can not only improve the performance of the classifier, but also reduce storage space. The Univariate Marginal Distribution Algorithm is a modified evolutionary algorithm that has some advantages over classical evolutionary algorithms, such as fast convergence and few parameters to tune. In this paper, we propose a bottom-up, global, dynamic, and supervised discretization method based on the Univariate Marginal Distribution Algorithm. The experimental results show that the proposed method can effectively improve classifier accuracy.
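
For readers unfamiliar with the Univariate Marginal Distribution Algorithm, the following is a minimal sketch of the generic binary UMDA on the OneMax toy problem (maximise the number of 1-bits). It is not the discretization method of the paper, whose encoding and fitness function are not described in the abstract; the function name, population sizes, and the clamping of the marginals are illustrative assumptions.

```python
import random

def umda_onemax(n_bits=20, pop_size=60, n_select=30, generations=40, seed=1):
    """Minimal binary UMDA maximising the number of 1-bits (OneMax)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits                      # independent marginals P(bit = 1)
    lo, hi = 1.0 / n_bits, 1.0 - 1.0 / n_bits   # clamp bounds for the marginals
    best = None
    for _ in range(generations):
        # Sample a population from the product of univariate marginals.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=sum, reverse=True)
        if best is None or sum(pop[0]) > sum(best):
            best = pop[0]
        selected = pop[:n_select]               # truncation selection
        # Re-estimate each marginal from the selected individuals,
        # clamped away from 0 and 1 so no bit value is lost for good.
        probs = [min(hi, max(lo, sum(ind[j] for ind in selected) / n_select))
                 for j in range(n_bits)]
    return best, probs

best, probs = umda_onemax()
```

The only model parameters are the per-bit marginals, which reflects the "few parameters" property mentioned in the abstract: there are no crossover or mutation operators to tune.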

A Field Experiment on Adaptive Learning Applying a Machine Learning Algorithm

AEA Randomized Controlled Trials ◽

10.1257/rct.7637-1.0 ◽

2021 ◽

Author(s):

Isabell Zipperle

Keyword(s):

Machine Learning ◽

Field Experiment ◽

Adaptive Learning ◽

Learning Algorithm ◽

Machine Learning Algorithm

VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2018.2878765 ◽

2020 ◽

Vol 32 (1) ◽

pp. 188-202 ◽

Cited By ~ 5

Author(s):

Fanhua Shang ◽

Kaiwen Zhou ◽

Hongying Liu ◽

James Cheng ◽

Ivor W. Tsang ◽

...

Keyword(s):

Machine Learning ◽

Variance Reduction ◽

Reduction Method

Entropy-Penalized Semidefinite Programming

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/157 ◽

2019 ◽

Cited By ~ 2

Author(s):

Mikhail Krechetov ◽

Jakub Marecek ◽

Yury Maximov ◽

Martin Takac

Keyword(s):

Machine Learning ◽

Time Complexity ◽

Optimization Problems ◽

Linear Time ◽

Broad Class ◽

Low Rank ◽

Learning Problems ◽

Unified Framework ◽

Gradient Computation ◽

Machine Learning Applications

Low-rank methods for semi-definite programming (SDP) have gained a lot of interest recently, especially in machine learning applications. Their analysis often involves determinant-based or Schatten-norm penalties, which are difficult to implement in practice due to high computational efforts. In this paper, we propose Entropy-Penalized Semi-Definite Programming (EP-SDP), which provides a unified framework for a broad class of penalty functions used in practice to promote a low-rank solution. We show that EP-SDP problems admit an efficient numerical algorithm, having (almost) linear time complexity of the gradient computation; this makes it useful for many machine learning and optimization problems. We illustrate the practical efficiency of our approach on several combinatorial optimization and machine learning problems.

Analysis of Best Sampling Strategy in Credit Card Fraud Detection Using Machine Learning

2021 6th International Conference on Intelligent Information Technology ◽

10.1145/3460179.3460186 ◽

2021 ◽

Author(s):

Hanbing Zou

Keyword(s):

Machine Learning ◽

Credit Card ◽

Fraud Detection ◽

Sampling Strategy ◽

Credit Card Fraud

A distributed computing framework based on lightweight variance reduction method to accelerate machine learning training on blockchain

China Communications ◽

10.23919/jcc.2020.09.007 ◽

2020 ◽

Vol 17 (9) ◽

pp. 77-89

Author(s):

Zhen Huang ◽

Feng Liu ◽

Mingxing Tang ◽

Jinyan Qiu ◽

Yuxing Peng

Keyword(s):

Machine Learning ◽

Distributed Computing ◽

Variance Reduction ◽

Reduction Method ◽

Computing Framework

Men in maternal health: an analysis of men’s views and knowledge on, and challenges to, involvement in antenatal care services in a Tanzanian community in Dodoma Region

Journal of Biosocial Science ◽

10.1017/s0021932020000541 ◽

2020 ◽

pp. 1-14

Author(s):

Nyasiro Sophia Gibore ◽

Ainory Peter Gesase

Keyword(s):

Maternal Health ◽

Antenatal Care ◽

Cultural Practices ◽

Waiting Times ◽

Community Outreach ◽

Sampling Strategy ◽

Peer Pressure ◽

Cross Sectional ◽

Antenatal Clinics ◽

Men’s Involvement

Promoting men’s involvement in antenatal care (ANC) requires an understanding of their views on how they ought to be involved. Their involvement in ANC services can help in reducing delay in deciding to seek care and facilitate women’s access to skilled antenatal services. This study sought to determine men’s views and knowledge on, and challenges to, involvement in ANC services in Tanzania. The cross-sectional study was carried out in four districts of Dodoma Region in November 2014 and June 2016. A multi-stage sampling strategy was used to select the study respondents. Data were collected by interviewing 966 men using a structured questionnaire. Univariate, bivariate and multivariate logistic regression analyses were used to examine the association between men’s involvement in ANC services and their background characteristics. About 63.4% of respondents accompanied their partners to ANC services. Men’s view was that they can be involved through accompanying their partner to ANC clinics and providing money for health services. Men who had poor knowledge on ANC services were two times less likely to be involved in ANC services. Similarly, long waiting times at the antenatal clinics decreased the likelihood of service utilization by their partners. Men from a two-income household were more likely to be involved in ANC services than men from households where the men’s earnings were the only source of income. Challenges encountered by men during attendance at ANC services included: perception of antenatal clinics as places only for women, financial difficulties, influence of peer pressure and lack of time due to occupational demands. There is a need to establish community outreach ANC services that offer couple-friendly services in Tanzania. Also, it is crucial to have a policy for men’s involvement in maternal health care that addresses cultural practices that hinder men’s involvement in ANC services.

Designing adaptive learning support through machine learning techniques

2016 IST-Africa Week Conference ◽

10.1109/istafrica.2016.7530676 ◽

2016 ◽

Author(s):

Robert O. Oboko ◽

Elizaphan M. Maina ◽

Peter W. Waiganjo ◽

Elijah I. Omwenga ◽

Ruth D. Wario

Keyword(s):

Machine Learning ◽

Adaptive Learning ◽

Machine Learning Techniques ◽

Learning Support ◽

Learning Techniques

Optimal Kernel Selection Based on GPR for Adaptive Learning of Mean Throughput Rates in LTE Networks

Journal of Technological Advancements ◽

10.4018/jta.290350 ◽

2021 ◽

Vol 1 (1) ◽

pp. 1-21

Author(s):

Joseph Isabona ◽

Agbotiname Lucky Imoize

Keyword(s):

Machine Learning ◽

Adaptive Learning ◽

Gaussian Process Regression ◽

Kernel Functions ◽

Computationally Efficient ◽

LTE Networks ◽

Evaluation Indexes ◽

Kernel Selection ◽

Optimal Kernel ◽

4G LTE

Machine learning models and algorithms have been employed in various applications, including prognostic analysis, learning and revealing patterns in data, knowledge extraction, and knowledge deduction. One promising, computationally efficient, and adaptive machine learning method is Gaussian Process Regression (GPR). An essential ingredient for tuning GPR performance is the kernel (covariance) function. GPR models have been widely employed for diverse regression and function approximation purposes. However, knowing how to set up GPR training so as to examine the impact of the kernel functions on performance during implementation remains a challenge. To address this problem, a stepwise approach for optimal kernel selection is presented for adaptive optimal prognostic regression learning of throughput data acquired over 4G LTE networks. The resultant learning accuracy was statistically quantified using four evaluation indexes. Results indicate that GPR training with the matern52 kernel function achieved the best user throughput data learning among the ten contending kernel functions.
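
To make the kernel named in the result concrete, here is a minimal sketch of the Matérn 5/2 covariance function in its standard textbook form, k(r) = σ² (1 + √5·r/ℓ + 5r²/(3ℓ²)) exp(−√5·r/ℓ). The parameter names `length_scale` and `signal_std`, the helper `kernel_matrix`, and the sample inputs are illustrative assumptions; the abstract does not give the authors' hyperparameters or tooling.

```python
import math

def matern52(x1, x2, length_scale=1.0, signal_std=1.0):
    """Matern 5/2 covariance between two scalar inputs."""
    r = abs(x1 - x2)
    s = math.sqrt(5.0) * r / length_scale
    # s*s/3 equals 5*r^2 / (3*length_scale^2), matching the textbook form.
    return signal_std ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)

def kernel_matrix(xs, **kernel_params):
    """Covariance matrix of the inputs xs under the Matern 5/2 kernel."""
    return [[matern52(a, b, **kernel_params) for b in xs] for a in xs]

# Illustrative inputs: covariance decays as points move further apart.
K = kernel_matrix([0.0, 0.5, 2.0])
```

Comparing kernels, as the study does, amounts to swapping this function for another covariance function and refitting the GPR with each candidate.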
