Theoretical analysis of skip connections and batch normalization from generalization and optimization perspectives

Author(s):  
Yasutaka Furusho ◽  
Kazushi Ikeda

Abstract: Deep neural networks (DNNs) have the same structure as the neocognitron proposed in 1979 but perform much better, because they incorporate many heuristic techniques such as pre-training, dropout, skip connections, batch normalization (BN), and stochastic depth. However, why these techniques improve performance is not fully understood. Recently, two tools for theoretical analysis have been proposed. One evaluates the generalization gap, defined as the difference between the expected loss and the empirical loss, by calculating the algorithmic stability; the other evaluates the convergence rate by calculating the eigenvalues of the Fisher information matrix of DNNs. This overview paper briefly introduces both tools and demonstrates their usefulness by showing why skip connections and BN improve performance.
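The generalization gap mentioned in this abstract can be written out as a short formula. This is only a sketch in assumed notation: the symbols f (trained network), ℓ (loss function), D (data distribution), and (x_i, y_i), i = 1..n (training sample) are not taken from the paper itself.

```latex
% Generalization gap = expected loss - empirical loss.
% All symbols here (f, \ell, \mathcal{D}, n) are assumed notation, not the paper's.
\[
  \mathrm{gap}(f)
  \;=\;
  \underbrace{\mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\ell(f(x),y)\bigr]}_{\text{expected loss}}
  \;-\;
  \underbrace{\frac{1}{n}\sum_{i=1}^{n}\ell\bigl(f(x_i),y_i\bigr)}_{\text{empirical loss}}
\]
```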

2014 ◽  
Vol 940 ◽  
pp. 333-335
Author(s):  
You Jie Ma ◽  
De Xiang Wang ◽  
Xue Song Zhou

Power electronics products are widely used in industrial control, and power quality requirements have become more demanding. How to improve voltage quality and ensure system stability is therefore an important and urgent issue. This paper briefly discusses the development of STATCOM, including how it differs from other compensation devices, its characteristics, the current state of research, its key technologies, and future trends.


Author(s):  
Ping Luo

Deep neural networks (DNNs) are difficult to train and prone to overfitting. We address these two issues by introducing EigenNet, an architecture that not only accelerates training but also adjusts the number of hidden neurons to reduce overfitting. Both are achieved by whitening the information flows of DNNs and removing those eigenvectors that may capture noise. The former improves the conditioning of the Fisher information matrix, whilst the latter increases generalization capability. These appealing properties of EigenNet can benefit many recent DNN structures, such as network in network and inception, by wrapping their hidden layers into the layers of EigenNet. The modeling capacities of the original networks are preserved. Compared to stochastic gradient descent, EigenNet reduces both the training wall-clock time and the number of updates on various datasets, including MNIST, CIFAR-10, and CIFAR-100.


Author(s):  
Nihat Ay

Abstract: We study the natural gradient method for learning in deep Bayesian networks, including neural networks. There are two natural geometries associated with such learning systems consisting of visible and hidden units. One geometry is related to the full system, the other to the visible sub-system. These two geometries imply different natural gradients. As a first step, we demonstrate a great simplification of the natural gradient with respect to the first geometry, due to locality properties of the Fisher information matrix. This simplification does not directly translate to a corresponding simplification with respect to the second geometry. We develop the theory for studying the relation between the two versions of the natural gradient and outline a method for simplifying the natural gradient with respect to the second geometry based on the first one. This method suggests incorporating a recognition model as an auxiliary model for the efficient application of the natural gradient method in deep networks.


1973 ◽  
Vol 29 (02) ◽  
pp. 490-498 ◽  
Author(s):  
Hiroh Yamazaki ◽  
Itsuro Kobayashi ◽  
Tadahiro Sano ◽  
Takio Shimamoto

Summary: The authors previously reported a transient decrease in adhesive platelet count and an enhancement of blood coagulability after administration of a small amount of adrenaline (0.1-1 µg per kg, i.v.) in man and rabbit. In such circumstances, the sensitivity of platelets to aggregation induced by ADP was studied by an optical density method. Five minutes after i.v. injection of 1 µg per kg of adrenaline in 10 rabbits, the intensity of platelet aggregation increased to 115.1 ± 4.9% (mean ± S.E.) by 10⁻⁵ molar, 121.8 ± 7.8% by 3 × 10⁻⁶ molar and 129.4 ± 12.8% by 10⁻⁶ molar ADP, relative to the value before the injection. The difference was statistically significant (P<0.01-0.05). The above change was not observed in any group of rabbits injected with saline, 1 µg per kg of 1-noradrenaline, or 0.1 and 10 µg per kg of adrenaline. It was also prevented by oral administration of 10 mg per kg of phenoxybenzamine, propranolol, aspirin or pyridinolcarbamate 3 hours before the challenge. On the other hand, the enhancement of ADP-induced platelet aggregation was not observed in vitro when 10⁻⁵, 3 × 10⁻⁶ or 10⁻⁶ molar ADP was added to citrated platelet rich plasma (CPRP) of rabbit after incubation at 37°C for 30 seconds with 0.01, 0.1, 1, 10 or 100 µg per ml of adrenaline or noradrenaline. These results suggest an important interaction between the endothelial surface and platelets in connection with the enhancement of ADP-induced platelet aggregation by adrenaline in vivo.


Author(s):  
Philip Isett

This chapter presents the equations and calculations for energy approximation. It establishes the estimates (261) and (262) of the Main Lemma (10.1) for continuous solutions; these estimates state that we are able to accurately prescribe the energy that the correction adds to the solution, as well as bound the difference between the time derivatives of these two quantities. The chapter also introduces the proposition for prescribing energy, followed by the relevant computations. Each integral contributing to the other term can be estimated. Another proposition for estimating control over the rate of energy variation is given. Finally, the coarse scale material derivative is considered.


Metahumaniora ◽  
2017 ◽  
Vol 7 (3) ◽  
pp. 378
Author(s):  
Vincentia Tri Handayani

Abstract: Folklore, which produces oral tradition, is a cultural manifestation born of the experience of community groups. One form of oral tradition is the expression, which contains elements of local culture in its construction that other cultures do not possess. Idiomatic expressions give color to a language through mental imagery. In French, an expression can take the form of a locution or an expression. Differences in the referential motifs of an expression can be seen in the influence of the culture of the language's users. A lexeme is not always defined through a minimal element, nor through words, whether simple or complex, but can be defined through frozen words whose meaning is fixed. The analogical relation among the additional meanings of a lexeme arises from the identification of a shared sememe. That sememe points to the associated terms, which are enriched through context (in expressions, the cultural context).
Keywords: folklore, expressions, structure, idiomatic meaning, culture


Author(s):  
Michel Meyer

Rhetoric has always been torn between the rhetoric of figures and the rhetoric of conflicts or arguments, as if rhetoric were exclusively one or the other. This is a false dilemma. Both types of rhetoric hinge on the same structure. A common formula is provided in Chapter 3 which unifies rhetoric stricto sensu and rhetoric as argumentation as two distinct but related strategies adopted according to the level of problematicity of the questions at stake, thereby giving unity to the field called “Rhetoric.” Highly problematic questions require arguments to justify their answers; non-divisive ones can be treated rhetorically through their answers as if they were self-evident. Another classic problem is how to understand the difference between logic and rhetoric. The difference between the two is due to the presence of questions explicitly answered in the premises in logic and only suggested (or remaining indeterminate) in rhetoric.


Author(s):  
D. T. Gauld ◽  
J. E. G. Raymont

The respiratory rates of three species of planktonic copepods, Acartia clausi, Centropages hamatus and Temora longicornis, were measured at four different temperatures. The relationship between respiratory rate and temperature was found to be similar to that previously found for Calanus, although the slope of the curves differed in the different species. The observations on Centropages at 13 and 17° C. can be divided into two groups and it is suggested that the differences are due to the use of copepods from two different generations. The relationship between the respiratory rates and lengths of Acartia and Centropages agreed very well with that previously found for other species. That for Temora was rather different: the difference is probably due to the distinct difference in the shape of the body of Temora from those of the other species. The application of these measurements to estimates of the food requirements of the copepods is discussed.

