DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction

Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5124
Author(s):  
Haijie Pan ◽  
Lirong Zheng

Machine learning models often converge slowly and unstably when SGD uses a sample-based gradient estimate, because the variance of randomly sampled data is large. To speed up convergence and improve stability, this study proposes DisSAGD, a distributed SGD algorithm based on variance reduction. DisSAGD corrects the gradient estimate at each iteration by using the gradient variance of historical iterations, without full gradient computation or additional storage; that is, it reduces the mean variance of historical gradients in order to reduce the error in parameter updates. We implemented DisSAGD on distributed clusters to train a machine learning model, sharing parameters among nodes through an asynchronous communication protocol. We also propose an adaptive learning rate strategy and a sampling strategy to address the update lag of the overall parameter distribution, which helps to improve convergence speed when the parameters deviate from the optimal value: when one working node is faster than another, it has more time to compute the local gradient and can draw more samples for the next iteration. Our experiments demonstrate that DisSAGD significantly reduces waiting times during loop iterations, converges faster than traditional methods, and achieves speedups on distributed clusters.
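The gradient-correction idea behind variance-reduced SGD can be illustrated with a minimal single-process sketch. The code below uses the well-known SVRG update as a stand-in; it is not DisSAGD itself, which, per the abstract, avoids full gradient computation and runs asynchronously across nodes. The function name, toy data, and hyperparameters are assumptions chosen for illustration.

```python
import random


def svrg_least_squares(xs, ys, lr=0.05, epochs=30, inner_steps=20, seed=0):
    """Fit y ~ w * x with an SVRG-style variance-reduced SGD loop (illustrative)."""
    rng = random.Random(seed)
    n = len(xs)

    def grad_i(w, i):
        # Gradient of 0.5 * (w * x_i - y_i)^2 with respect to w.
        return (w * xs[i] - ys[i]) * xs[i]

    w = 0.0
    for _ in range(epochs):
        # Snapshot the parameter and compute the full gradient once per epoch.
        w_snap = w
        full_grad = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Corrected estimate: the noisy sample gradient is re-centred
            # using the same sample's gradient at the snapshot.
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= lr * g
    return w


# Toy data generated by y = 2 * x (an assumption for this sketch).
w_hat = svrg_least_squares([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(w_hat)
```

The corrected estimate stays unbiased while its variance shrinks as the iterate approaches the snapshot, which is the general mechanism that variance-reduction schemes rely on.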

2021 ◽  
Vol 13 (5) ◽  
pp. 907
Author(s):  
Theodora Lendzioch ◽  
Jakub Langhammer ◽  
Lukáš Vlček ◽  
Robert Minařík

A key precondition for adequate monitoring of peat bog ecosystems is the collection, processing, and analysis of unique spatial data to understand peat bog dynamics. Over two seasons, we sampled groundwater level (GWL) and soil moisture (SM) ground truth data at two diverse locations at the Rokytka peat bog within the Sumava Mountains, Czechia. These data served as reference data and were modeled with a suite of potential variables derived from digital surface models (DSMs) and RGB, multispectral, and thermal orthoimages reflecting topomorphometry, vegetation, and surface temperature information generated from drone mapping. We used 34 predictors to feed the random forest (RF) algorithm. The predictor selection, hyperparameter tuning, and performance assessment were performed with the target-oriented leave-location-out (LLO) spatial cross-validation (CV) strategy combined with forward feature selection (FFS) to avoid overfitting and to predict on unknown locations. The spatial CV performance statistics showed low (R² = 0.12) to high (R² = 0.78) model performance. The predictor importance was used for model interpretation, where temperature had a strong impact on GWL and SM, and we found significant contributions of other predictors, such as Normalized Difference Vegetation Index (NDVI), Normalized Difference Index (NDI), Enhanced Red-Green-Blue Vegetation Index (ERGBVE), Shape Index (SHP), Green Leaf Index (GLI), Brightness Index (BI), Coloration Index (CI), Redness Index (RI), Primary Colours Hue Index (HI), Overall Hue Index (HUE), SAGA Wetness Index (TWI), Plan Curvature (PlnCurv), Topographic Position Index (TPI), and Vector Ruggedness Measure (VRM).
Additionally, we estimated the area of applicability (AOA) by presenting maps that show where the prediction model yielded high-quality results and where predictions were highly uncertain, because machine learning (ML) models can make predictions far beyond the sampling locations, in environments for which they have no training data. The AOA method is well suited for planning and decision-making about the best sampling strategy, most notably when data are limited.
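The leave-location-out split described in the abstract can be sketched in a few lines. This is a minimal illustration of the splitting logic only, assuming each sample carries a location label; the helper name and the site labels are hypothetical, and the abstract does not state which software the authors used.

```python
from collections import defaultdict


def leave_location_out(locations):
    """Yield (held_out, train_idx, test_idx), holding out one location per fold."""
    by_location = defaultdict(list)
    for idx, loc in enumerate(locations):
        by_location[loc].append(idx)
    for held_out in sorted(by_location):
        # All samples from the held-out location form the test set.
        test_idx = by_location[held_out]
        train_idx = [i for i in range(len(locations)) if locations[i] != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical sample labels: each sample belongs to one sampling site.
sites = ["site_1", "site_1", "site_2", "site_2", "site_2"]
for held_out, train_idx, test_idx in leave_location_out(sites):
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because every fold tests on a site the model never saw during training, the resulting scores estimate performance at unknown locations rather than at resampled known ones. In scikit-learn, `LeaveOneGroupOut` provides the same kind of split.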


2011 ◽  
Vol 403-408 ◽  
pp. 1834-1838
Author(s):  
Jing Zhao ◽  
Chong Zhao Han ◽  
Bin Wei ◽  
De Qiang Han

Discretization of continuous attributes plays an important role in machine learning and data mining. It can not only improve the performance of the classifier, but also reduce storage space. The Univariate Marginal Distribution Algorithm is a modified evolutionary algorithm, which has some advantages over classical evolutionary algorithms, such as fast convergence and few parameters to tune. In this paper, we propose a bottom-up, global, dynamic, and supervised discretization method based on the Univariate Marginal Distribution Algorithm. The experimental results show that the proposed method can effectively improve classifier accuracy.
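For readers unfamiliar with the Univariate Marginal Distribution Algorithm, the generic loop can be sketched as follows. The abstract does not say how candidate cut points are encoded or scored, so this sketch uses the OneMax toy objective (count of 1-bits) instead; the function name, population sizes, and probability clamping are assumptions for illustration only.

```python
import random


def umda_onemax(n_bits=12, pop_size=60, n_select=30, generations=40, seed=0):
    """Run a basic UMDA on OneMax and return (best_fitness, best_bits)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent per-bit probabilities of sampling a 1
    best_fitness, best_bits = -1, None
    for _ in range(generations):
        # Sample a population from the current univariate marginals.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        population.sort(key=sum, reverse=True)
        if sum(population[0]) > best_fitness:
            best_fitness, best_bits = sum(population[0]), population[0]
        # Re-estimate each marginal from the selected (fittest) individuals.
        selected = population[:n_select]
        for j in range(n_bits):
            freq = sum(ind[j] for ind in selected) / n_select
            probs[j] = min(0.95, max(0.05, freq))  # clamp to keep diversity
    return best_fitness, best_bits


fitness, bits = umda_onemax()
print(fitness, bits)
```

The only model state is one probability per bit, which is why the algorithm has few parameters to tune compared with crossover-based evolutionary algorithms.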


2020 ◽  
Vol 32 (1) ◽  
pp. 188-202 ◽  
Author(s):  
Fanhua Shang ◽  
Kaiwen Zhou ◽  
Hongying Liu ◽  
James Cheng ◽  
Ivor W. Tsang ◽  
...  

Author(s):  
Mikhail Krechetov ◽  
Jakub Marecek ◽  
Yury Maximov ◽  
Martin Takac

Low-rank methods for semi-definite programming (SDP) have gained a lot of interest recently, especially in machine learning applications. Their analysis often involves determinant-based or Schatten-norm penalties, which are difficult to implement in practice due to their high computational cost. In this paper, we propose Entropy-Penalized Semi-Definite Programming (EP-SDP), which provides a unified framework for a broad class of penalty functions used in practice to promote a low-rank solution. We show that EP-SDP problems admit an efficient numerical algorithm, with (almost) linear time complexity of the gradient computation; this makes it useful for many machine learning and optimization problems. We illustrate the practical efficiency of our approach on several combinatorial optimization and machine learning problems.


2020 ◽  
pp. 1-14
Author(s):  
Nyasiro Sophia Gibore ◽  
Ainory Peter Gesase

Promoting men’s involvement in antenatal care (ANC) requires an understanding of their views on how they ought to be involved. Their involvement in ANC services can help in reducing delay in deciding to seek care and facilitate women’s access to skilled antenatal services. This study sought to determine men’s views and knowledge on, and challenges to, involvement in ANC services in Tanzania. The cross-sectional study was carried out in four districts of Dodoma Region in November 2014 and June 2016. A multi-stage sampling strategy was used to select the study respondents. Data were collected by interviewing 966 men using a structured questionnaire. Univariate, bivariate and multivariate logistic regression analyses were used to examine the association between men’s involvement in ANC services and their background characteristics. About 63.4% of respondents accompanied their partners to ANC services. Men’s view was that they can be involved through accompanying their partner to ANC clinics and providing money for health services. Men who had poor knowledge on ANC services were two times less likely to be involved in ANC services. Similarly, long waiting times at the antenatal clinics decreased the likelihood of service utilization by their partners. Men from a two-income household were more likely to be involved in ANC services than men from households where the men’s earnings were the only source of income. Challenges encountered by men during attendance at ANC services included: perception of antenatal clinics as places only for women, financial difficulties, influence of peer pressure and lack of time due to occupational demands. There is a need to establish community outreach ANC services that offer couple-friendly services in Tanzania. Also, it is crucial to have a policy for men’s involvement in maternal health care that addresses cultural practices that hinder men’s involvement in ANC services.


Author(s):  
Robert O. Oboko ◽  
Elizaphan M. Maina ◽  
Peter W. Waiganjo ◽  
Elijah I. Omwenga ◽  
Ruth D. Wario

2021 ◽  
Vol 1 (1) ◽  
pp. 1-21
Author(s):  
Joseph Isabona ◽  
Agbotiname Lucky Imoize

Machine learning models and algorithms have been employed in various applications, including prognostic analysis, learning and revealing patterns in data, knowledge extraction, and knowledge deduction. One promising, computationally efficient, and adaptive machine learning method is Gaussian Process Regression (GPR). An essential ingredient for tuning GPR performance is the kernel (covariance) function. GPR models have been widely employed for diverse regression and function approximation purposes. However, determining the right GPR training procedure for examining the impact of the kernel functions on performance during implementation remains a challenge. To address this problem, a stepwise approach for optimal kernel selection is presented for adaptive optimal prognostic regression learning of throughput data acquired over 4G LTE networks. The resulting learning accuracy was statistically quantified using four evaluation indexes. Results indicate that GPR training with the matern52 (Matérn 5/2) kernel function achieved the best user throughput data learning among the ten contending kernel functions.
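The best-performing kernel named in the abstract appears to be the Matérn covariance with smoothness 5/2. As a minimal sketch, assuming scalar inputs and the standard textbook form of that kernel, it can be written as below; the function and parameter names are illustrative and not taken from the paper.

```python
import math


def matern52(x1, x2, length_scale=1.0, signal_var=1.0):
    """Matern 5/2 covariance between two scalar inputs.

    k(r) = s2 * (1 + sqrt(5) r / l + 5 r^2 / (3 l^2)) * exp(-sqrt(5) r / l)
    """
    r = abs(x1 - x2) / length_scale  # distance scaled by the length scale
    s = math.sqrt(5.0) * r
    return signal_var * (1.0 + s + s * s / 3.0) * math.exp(-s)


# Covariance decays smoothly as the inputs move apart.
for distance in (0.0, 0.5, 1.0, 2.0):
    print(distance, matern52(0.0, distance))
```

A stepwise kernel selection like the one the abstract describes would fit a GPR model with each candidate kernel in turn and compare held-out error; the kernel function is the only part that changes between candidates.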

