linear networks Latest Research Papers

Abstract: Direct Feedback Alignment (DFA) is emerging as an efficient and biologically plausible alternative to backpropagation for training deep neural networks. Despite relying on random feedback weights for the backward pass, DFA successfully trains state-of-the-art models such as Transformers. On the other hand, it notoriously fails to train convolutional networks. An understanding of the inner workings of DFA to explain these diverging results remains elusive. Here, we propose a theory of feedback alignment algorithms. We first show that learning in shallow networks proceeds in two steps: an alignment phase, where the model adapts its weights to align the approximate gradient with the true gradient of the loss function, is followed by a memorisation phase, where the model focuses on fitting the data. This two-step process has a degeneracy breaking effect: out of all the low-loss solutions in the landscape, a network trained with DFA naturally converges to the solution which maximises gradient alignment. We also identify a key quantity underlying alignment in deep linear networks: the conditioning of the alignment matrices. The latter enables a detailed understanding of the impact of data structure on alignment, and suggests a simple explanation for the well-known failure of DFA to train convolutional neural networks. Numerical experiments on MNIST and CIFAR10 clearly demonstrate degeneracy breaking in deep non-linear networks and show that the align-then-memorize process occurs sequentially from the bottom layers of the network to the top.
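
The alignment idea in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code or experimental setup: the two-layer tanh network, the synthetic teacher data, the zero initialisation, the layer sizes, and the function name `train_dfa` are all assumptions made for illustration. The hidden layer is updated with a fixed random feedback matrix `B` in place of the transpose of the output weights, and the cosine between the DFA signal and the true backpropagation signal is recorded at every step.

```python
import numpy as np

def train_dfa(steps=300, lr=0.05, seed=0):
    """Train a two-layer tanh network with DFA; return (loss, alignment) per step."""
    rng = np.random.default_rng(seed)
    n, d_in, d_hid, d_out = 64, 10, 16, 3               # illustrative sizes (assumed)
    X = rng.standard_normal((n, d_in))
    T = np.tanh(X @ rng.standard_normal((d_in, d_out)))  # synthetic teacher targets
    W1 = np.zeros((d_in, d_hid))                         # zero init (assumed)
    W2 = np.zeros((d_hid, d_out))
    # Fixed random feedback matrix: never trained, replaces W2.T in the backward pass.
    B = rng.standard_normal((d_out, d_hid)) / np.sqrt(d_hid)

    history = []
    for _ in range(steps):
        H = np.tanh(X @ W1)                              # hidden activations
        E = H @ W2 - T                                   # output error
        loss = 0.5 * np.mean(np.sum(E**2, axis=1))

        bp_signal = E @ W2.T                             # what backprop would send down
        dfa_signal = E @ B                               # what DFA sends down
        denom = np.linalg.norm(bp_signal) * np.linalg.norm(dfa_signal)
        cosine = float(np.sum(bp_signal * dfa_signal) / denom) if denom > 0 else 0.0
        history.append((float(loss), cosine))

        # The output layer uses the true gradient; the hidden layer uses the DFA signal.
        W2 -= lr * H.T @ E / n
        W1 -= lr * X.T @ (dfa_signal * (1.0 - H**2)) / n
    return history
```

Calling `train_dfa()` and plotting the second component of each pair against the step index gives a rough view of how the alignment between the two signals evolves while the loss falls. In a network this shallow DFA coincides with plain feedback alignment, so the sketch only covers the shallow-network part of the abstract.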

Simulating macroscopic quantum correlations in linear networks

Physics Letters A ◽ 10.1016/j.physleta.2021.127911 ◽ 2021 ◽ pp. 127911

Author(s): A. Dellios ◽ Peter D. Drummond ◽ Bogdan Opanchuk ◽ Run Yan Teh ◽ Margaret D. Reid

Keyword(s): Quantum Correlations ◽ Macroscopic Quantum ◽ Linear Networks

Noise and Linear Networks

10.1002/9781119859390.ch4 ◽ 2021 ◽ pp. 45-58

Keyword(s): Linear Networks

IoT Multi-Hop Facilities via LoRa Modulation and LoRa WAN Protocol within Thin Linear Networks

10.1109/sas51076.2021.9530117 ◽ 2021

Author(s): Federico Basili ◽ Stefano Parrino ◽ Giacomo Peruzzi ◽ Alessandro Pozzebon

Keyword(s): Linear Networks

On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/355 ◽ 2021

Author(s): Wei Huang ◽ Weitao Du ◽ Richard Yi Da Xu

Keyword(s): Neural Networks ◽ Empirical Investigation ◽ Linear Regime ◽ Nonlinear Networks ◽ Linear Networks ◽ Learning Speed ◽ Deep Networks ◽ Speed Up ◽ Fully Connected ◽ Fully Connected Networks

The prevailing view is that orthogonal weights are crucial to enforcing dynamical isometry and speeding up training. The increase in learning speed that results from orthogonal initialization in linear networks is well established. However, while the same is believed to hold for nonlinear networks when the dynamical isometry condition is satisfied, the training dynamics behind this contention have not been thoroughly explored. In this work, we study the dynamics of ultra-wide networks across a range of architectures, including Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs), with orthogonal initialization via the neural tangent kernel (NTK). Through a series of propositions and lemmas, we prove that two NTKs, one corresponding to Gaussian weights and one to orthogonal weights, are equal when the network width is infinite. Further, during training, the NTK of an orthogonally initialized infinite-width network should theoretically remain constant. This suggests that orthogonal initialization cannot speed up training in the NTK (lazy training) regime, contrary to the prevailing view. To explore the circumstances under which orthogonality can accelerate training, we conduct a thorough empirical investigation outside the NTK regime. We find that when the hyper-parameters are set to achieve a linear regime in the nonlinear activation, orthogonal initialization can improve the learning speed with a large learning rate or large depth.
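
The background claim about linear networks can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the paper's method: the helper names `orthogonal` and `product_singular_values`, the square layer shapes, and the 1/sqrt(n) Gaussian scaling are choices made here. The end-to-end map of a deep linear network is the product of its weight matrices, so dynamical isometry corresponds to all singular values of that product staying near 1.

```python
import numpy as np

def orthogonal(n, rng):
    """Sample an n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))

def product_singular_values(depth, n, init, seed=0):
    """Singular values of the end-to-end map of a depth-layer linear network."""
    rng = np.random.default_rng(seed)
    prod = np.eye(n)
    for _ in range(depth):
        if init == "orthogonal":
            W = orthogonal(n, rng)
        else:
            # Gaussian weights with variance 1/n (an assumed, common scaling).
            W = rng.standard_normal((n, n)) / np.sqrt(n)
        prod = W @ prod
    return np.linalg.svd(prod, compute_uv=False)
```

Comparing `product_singular_values(10, 32, "orthogonal")` with `product_singular_values(10, 32, "gaussian")` shows the difference: a product of orthogonal matrices is itself orthogonal, so every singular value equals 1, while the Gaussian product has a widely spread spectrum. This only covers the finite-width linear case; it says nothing about the infinite-width NTK result stated in the abstract.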

MIXED CHARACTERISTIC OF S-PARAMETERS OF DIFFERENTIAL STRUCTURES

ВЕСТНИК ВОРОНЕЖСКОГО ГОСУДАРСТВЕННОГО ТЕХНИЧЕСКОГО УНИВЕРСИТЕТА ◽ 10.36622/vstu.2021.17.1.011 ◽ 2021 ◽ pp. 74-78

Author(s): Т.С. Глотова ◽ Д.В. Журавлёв ◽ В.В. Глотов

Keyword(s): Transmission Lines ◽ Mixed Mode ◽ Mode Conversion ◽ Vector Network Analyzer ◽ Network Analyzer ◽ Circuit Performance ◽ Reflected Waves ◽ Linear Networks ◽ S Parameter ◽ S Parameters

Various types of microwave devices can be described using incident and reflected waves that propagate in the transmission lines connected to them. The relationship between these waves is described by the scattering matrix, or S-parameter matrix. Evaluating differential structures is necessary to ensure optimal circuit performance. Combined differential and common-mode (mixed-mode) scattering parameters (S-parameters) are well suited to accurate measurements of linear networks at radio frequencies. We present the transformation between standard S-parameters and mixed-mode S-parameters, together with a graphical comparison of standard and mixed-mode S-parameter losses. The mixed-mode S-parameters obtained with the described method agree well when the stimulus and the response are in the same mode (common or differential) and show little variation when the modes differ. We fabricated a differential structure and measured it with a two-port vector network analyzer and a four-port mixed-mode network analyzer. Mode conversion can be used to predict the behavior of mixed-mode parameters using a traditional two-port vector network analyzer, but a four-port mixed-mode network analyzer is still required to accurately measure the effect of mode conversion in real integrated differential test structures.
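
The transformation between standard and mixed-mode S-parameters can be sketched with the widely used similarity transform S_mm = M S M^-1. The abstract does not give the paper's formulas, so the details here are assumptions: single-ended ports 1 and 2 are taken to form differential port 1, ports 3 and 4 form differential port 2, the output is ordered as the blocks [dd, dc; cd, cc], and the function names are hypothetical.

```python
import numpy as np

# Assumed port pairing: (1, 2) -> mixed-mode port 1, (3, 4) -> mixed-mode port 2.
# Rows produce, in order: differential 1, differential 2, common 1, common 2.
M = np.array([
    [1, -1, 0,  0],
    [0,  0, 1, -1],
    [1,  1, 0,  0],
    [0,  0, 1,  1],
]) / np.sqrt(2)

def to_mixed_mode(S):
    """Convert a 4x4 single-ended S-matrix to mixed-mode form [dd, dc; cd, cc]."""
    S = np.asarray(S, dtype=complex)
    # M is real and orthogonal, so its inverse is its transpose.
    return M @ S @ M.T

def to_single_ended(S_mm):
    """Inverse transform: recover the single-ended S-matrix."""
    S_mm = np.asarray(S_mm, dtype=complex)
    return M.T @ S_mm @ M
```

As a sanity check, consider two identical, uncoupled, matched lines running from port 1 to port 3 and from port 2 to port 4 with transmission coefficient t. The differential and common-mode transmission terms both equal t and the mode-conversion blocks (dc and cd) vanish. Making the two lines unequal produces nonzero mode-conversion terms.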