accelerate convergence
Recently Published Documents



2021 ◽  
pp. 1-47
Author(s):  
Yeonjong Shin

Deep neural networks have been used in various machine learning applications and have achieved tremendous empirical success. However, training deep neural networks is a challenging task. Many alternatives to end-to-end back-propagation have been proposed. Layer-wise training is one of them: it trains a single layer at a time rather than training all layers simultaneously. In this paper, we study layer-wise training using block coordinate gradient descent (BCGD) for deep linear networks. We establish a general convergence analysis of BCGD and find the optimal learning rate, which results in the fastest decrease in the loss. We identify the effects of depth, width, and initialization. When the orthogonal-like initialization is employed, we show that the width of intermediate layers plays no role in gradient-based training beyond a certain threshold. In addition, we find that the use of deep networks can drastically accelerate convergence compared with a depth-1 network, even when the computational cost is taken into account. Numerical examples are provided to justify our theoretical findings and demonstrate the performance of layer-wise training by BCGD.
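
To make the layer-wise idea concrete, the following is a minimal sketch of block coordinate gradient descent on a toy deep linear network. It is not the authors' implementation: every layer is a single scalar weight (width 1), the data, the fixed learning rate `lr`, and the function names are all assumptions chosen for illustration, and the paper's optimal learning rate is not reproduced here.

```python
def product(weights):
    """Return the end-to-end scalar map w_L * ... * w_1 of the linear network."""
    result = 1.0
    for w in weights:
        result *= w
    return result


def loss(weights, xs, ys):
    """Loss 0.5 * mean((w_L ... w_1 * x - y)^2) over the data set."""
    p = product(weights)
    return 0.5 * sum((p * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def bcgd(weights, xs, ys, lr=0.1, sweeps=200):
    """Layer-wise training: update one layer at a time, holding the others fixed."""
    weights = list(weights)
    for _ in range(sweeps):
        for k in range(len(weights)):
            p = product(weights)
            # Product of all layers except layer k (the blocks held fixed).
            others = product(weights[:k] + weights[k + 1:])
            # Gradient of the loss with respect to layer k only.
            grad_k = sum((p * x - y) * others * x for x, y in zip(xs, ys)) / len(xs)
            weights[k] -= lr * grad_k
    return weights


# Hypothetical data generated by the linear map x -> 2x.
xs = [1.0, -1.0, 2.0, -2.0]
ys = [2.0 * x for x in xs]
initial = [1.0, 1.0, 1.0]  # depth-3 network, all weights start at 1
trained = bcgd(initial, xs, ys)
```

After training, `product(trained)` should be close to the target slope 2, even though no single layer is ever updated jointly with another.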


2021 ◽  
Vol 55 (1) ◽  
pp. 109-123
Author(s):  
Harry Oviedo

This paper addresses the positive semidefinite Procrustes problem (PSDP). The PSDP is a least-squares problem over the set of symmetric positive semidefinite matrices. Problems of this kind arise in many applications, such as structure analysis and signal processing. A non-monotone spectral projected gradient algorithm is proposed to obtain a numerical solution of the PSDP. The proposed algorithm combines the non-monotone technique of Zhang and Hager with the Barzilai–Borwein step size to accelerate convergence. Some theoretical results are presented. Finally, numerical experiments are performed to demonstrate the effectiveness and efficiency of the proposed method, and comparisons are made with other state-of-the-art algorithms.


2021 ◽  
Vol 8 (4) ◽  
pp. 635-644
Author(s):  
Safaa M. Aljassas ◽  
Dhuha Abdulameer Kadhim ◽  
Eman Yahea Habeeb

The main goal of this research is to numerically evaluate triple integrals with continuous integrands using two composite rules. The first rule, denoted MSuM, applies the midpoint method in the third dimension Z and the first dimension X and a suggested method in the second dimension Y. The second rule, denoted SuMSu, applies the suggested method in the third dimension Z and the first dimension X and the midpoint method in the second dimension Y. The number of subintervals is the same in all three dimensions. The study presents two theorems, with proofs, that derive these rules and their correction terms (error terms). Moreover, to accelerate convergence and obtain better results, Romberg acceleration is applied to both rules. The resulting rules are denoted RO(MSuM) and RO(SuMSu), respectively; they yield highly accurate results with relatively few subintervals and shorter run times.
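
The effect of Romberg acceleration on a composite rule can be sketched as follows. This is an illustration under stated assumptions, not the MSuM or SuMSu rules themselves: the abstract does not define the "suggested method", so the sketch uses the midpoint rule in all three dimensions, and the integrand, the unit-cube domain, and the function names are hypothetical. For smooth integrands the midpoint error expands in even powers of the step size, so combining results for n and 2n subintervals cancels the leading h^2 term.

```python
import math


def midpoint_triple(f, n):
    """Composite midpoint rule on the unit cube with n subintervals per dimension."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for x in mids:
        for y in mids:
            for z in mids:
                total += f(x, y, z)
    return total * h ** 3


def romberg_step(coarse, fine):
    """One Romberg (Richardson) step; `fine` must use half the step size of `coarse`."""
    return (4.0 * fine - coarse) / 3.0


def integrand(x, y, z):
    # Hypothetical smooth test integrand with a known exact integral.
    return math.exp(x + y + z)


exact = (math.e - 1.0) ** 3           # integral of exp(x+y+z) over [0,1]^3
coarse = midpoint_triple(integrand, 4)
fine = midpoint_triple(integrand, 8)
accelerated = romberg_step(coarse, fine)
```

Comparing `abs(fine - exact)` with `abs(accelerated - exact)` shows the gain from extrapolation at no extra integrand evaluations beyond the two midpoint sums.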


Author(s):  
Mingkun Xu ◽  
Yujie Wu ◽  
Lei Deng ◽  
Faqiang Liu ◽  
Guoqi Li ◽  
...  

Biological spiking neurons with intrinsic dynamics underlie the powerful representation and learning capabilities of the brain for processing multimodal information in complex environments. Despite tremendous recent progress in spiking neural networks (SNNs) for handling Euclidean-space tasks, it remains challenging to exploit SNNs for processing non-Euclidean-space data represented by graph data, mainly due to the lack of an effective modeling framework and useful training techniques. Here we present a general spike-based modeling framework that enables the direct training of SNNs for graph learning. Through spatial-temporal unfolding for spiking data flows of node features, we incorporate graph convolution filters into spiking dynamics and formalize a synergistic learning paradigm. Considering the unique features of spike representation and spiking dynamics, we propose a spatial-temporal feature normalization (STFN) technique suitable for SNNs to accelerate convergence. We instantiate our methods in two spiking graph models, graph convolution SNNs and graph attention SNNs, and validate their performance on three node-classification benchmarks: Cora, Citeseer, and Pubmed. Our model achieves performance comparable to state-of-the-art graph neural network (GNN) models at much lower computational cost, demonstrating great benefits for execution on neuromorphic hardware and prompting neuromorphic applications in graphical scenarios.


Author(s):  
Owe Axelsson ◽  
János Karátson

The paper is devoted to Krylov-type modifications of the Uzawa method on the operator level for the Stokes problem in order to accelerate convergence. First, block preconditioners and their effect on convergence are studied. Then it is shown that a Krylov–Uzawa iteration produces superlinear convergence on smooth domains, and an estimate of its speed is given.


2021 ◽  
Vol 10 (2) ◽  
Author(s):  
Joe Zimmerman ◽  
Brian Wheaton

Export-led growth is an economic hypothesis that links the level of a nation’s exports to economic growth in that country. Seen primarily as a model for low-income, developing nations to accelerate convergence as China began to do in the 1980s, the hypothesis theoretically still stands for developed nations. However, there exists significant discussion and doubt as to the strength and causality of the relationship between exports and growth, especially after a nation has industrialized and established itself as a major exporter. This paper examines and compares the effect of exports, imports, and net exports on economic growth for a set of low-income nations (Sub-Saharan Africa) and a country that has already undergone a significant economic transformation (China, at the provincial level). I regress the share of exports, imports, and net exports against GDP growth for Sub-Saharan African nations and Chinese provinces, and use instrumental variables to check for robustness. I find that while in Sub-Saharan Africa the share of exports and net exports exhibit a positive relationship with economic growth, higher shares of exports and net exports in China are associated with lower economic growth. This suggests that export-led growth is valid in Sub-Saharan Africa, but no longer is in China. I pose two potential explanations for this outcome in China: inefficient trade with low-income nations or decreasing trade with high-income nations. Regressions of China’s exports to these two types of economies over time indicate that the latter is the primary cause of the distinction in the effect of exports.


2021 ◽  
Vol 15 ◽  
Author(s):  
Xue Yang ◽  
Yin Lyu ◽  
Yang Sun ◽  
Chen Zhang

At present, some people are in a state of sub-health, and more people are paying attention to physical exercise. Dance is a relatively simple and popular activity that has attracted wide attention. Traditional action recognition methods are easily affected by action speed, illumination, occlusion, and complex backgrounds, which leads to poor robustness of the recognition results. To solve these problems, an improved residual dense neural network method is used to study the automatic recognition of dance action images. First, based on the residual model, the features of dance actions are extracted using convolution and pooling layers. Then, the exponential linear unit (ELU) activation function, batch normalization (BN), and Dropout are used to improve and optimize the model in order to mitigate vanishing gradients, prevent over-fitting, accelerate convergence, and enhance the model's generalization ability. Finally, the dense connection network (DenseNet) is introduced to make the extracted dance action features richer and more effective. Comparison experiments are carried out on two public databases and one self-built database. The results show that the recognition rates of the proposed method on the three databases are 99.98, 97.95, and 97.96%, respectively. This shows that the new method can effectively improve the performance of dance action recognition.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Leiming Tang ◽  
Xunjie Cao ◽  
Weiyang Chen ◽  
Changbo Ye

In this paper, a low-complexity tensor completion (LTC) scheme is proposed to improve the efficiency of tensor completion. On the one hand, a matrix factorization model is established to reduce complexity by incorporating matrix factorization into the low-rank tensor completion model. On the other hand, we introduce smoothness through total variation regularization and framelet regularization to guarantee the completion performance. Accordingly, given the proposed smooth matrix factorization (SMF) model, an alternating direction method of multipliers (ADMM)-based solution is further proposed to realize efficient and effective tensor completion. Additionally, we employ a novel tensor initialization approach to accelerate convergence. Finally, simulation results are presented to confirm the gain of the proposed LTC scheme in both efficiency and effectiveness.


2020 ◽  
Vol 8 ◽  
Author(s):  
Chao Yang ◽  
Jiri Brabec ◽  
Libor Veis ◽  
David B. Williams-Young ◽  
Karol Kowalski

We describe the use of the Newton–Krylov method to solve the coupled cluster equation. The method uses a Krylov iterative method to compute the Newton correction to the approximate coupled cluster amplitude. The multiplication of the Jacobian with a vector, which is required in each step of a Krylov iterative method such as the generalized minimal residual (GMRES) method, is carried out through a finite difference approximation and requires an additional residual evaluation. The overall cost of the method is determined by the sum of the inner Krylov and outer Newton iterations. We discuss the termination criterion used for the inner iteration and show how to apply preconditioners to accelerate convergence. We also examine the use of a regularization technique to improve the stability of convergence and compare the method with the widely used direct inversion of the iterative subspace (DIIS) method through numerical examples.
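
The finite-difference Jacobian-vector product mentioned above can be sketched as follows. This is only an illustration of the general idea: the two-equation `residual` function is a hypothetical stand-in for the coupled cluster residual, and the step-size scaling and the default `eps` are assumptions, not the authors' choices. Note that when the residual at the current iterate is already known, each product costs exactly one additional residual evaluation.

```python
import math


def residual(x):
    # Hypothetical nonlinear system standing in for the coupled cluster residual.
    return [x[0] ** 2 + x[1] - 3.0, x[0] + x[1] ** 2 - 5.0]


def jacobian_vector_product(F, x, v, F_x=None, eps=1e-7):
    """Approximate J(x) v by a forward difference without forming the Jacobian J."""
    if F_x is None:
        F_x = F(x)  # reuse the current residual when the caller already has it
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(F_x)
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    # Assumed scaling: perturbation size relative to the sizes of x and v.
    h = eps * max(1.0, norm_x) / norm_v
    shifted = [xi + h * vi for xi, vi in zip(x, v)]
    F_shifted = F(shifted)  # the one additional residual evaluation
    return [(fs - f0) / h for fs, f0 in zip(F_shifted, F_x)]


x0 = [1.0, 2.0]
v0 = [1.0, -1.0]
jv = jacobian_vector_product(residual, x0, v0)
# Analytic check: J(x0) = [[2, 1], [1, 4]], so J(x0) v0 = [1, -3].
```

A Krylov solver such as GMRES only needs this matrix-free product, which is why the Jacobian never has to be stored explicitly.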


2020 ◽  
Vol 61 (10) ◽  
Author(s):  
M. Edwards ◽  
R. Theunissen ◽  
C. B. Allen ◽  
D. J. Poole

This paper presents a method which allows for a reduced portion of a particle image velocimetry (PIV) image to be analysed, without introducing numerical artefacts near the edges of the reduced region. Based on confidence intervals of statistics of interest, such a region can be determined automatically depending on user-imposed confidence requirements, allowing for already satisfactorily converged regions of the field of view to be neglected in further analysis, offering significant computational benefits. Temporal fluctuations of the flow are unavoidable even for very steady flows, and the magnitude of such fluctuations will naturally vary over the domain. Moreover, the non-linear modulation effects of the cross-correlation operator exacerbate the perceived temporal fluctuations in regions of strong spatial displacement gradients. It follows, therefore, that steady, uniform, flow regions will require fewer contributing images than their less steady, spatially fluctuating, counterparts within the same field of view, and hence the further analysis of image pairs may be solely driven by small, isolated, non-converged regions. In this paper, a methodology is presented which allows these non-converged regions to be identified and subsequently analysed in isolation from the rest of the image, while ensuring that such localised analysis is not adversely affected by the reduced analysis region, i.e. does not introduce boundary effects, thus accelerating the analysis procedure considerably. Via experimental analysis, it is shown that under typical conditions a 44% reduction in the required number of correlations for an ensemble solution is achieved, compared to conventional image processing routines while maintaining a specified level of confidence over the domain.

