A Simple and Efficient Tensor Calculus

Sören Laue; Matthias Mitterreiter; Joachim Giesen

doi:10.1609/aaai.v34i04.5881

A Simple and Efficient Tensor Calculus

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5881 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4527-4534

Author(s):

Sören Laue ◽

Matthias Mitterreiter ◽

Joachim Giesen

Keyword(s):

Machine Learning ◽

Efficient Method ◽

Automatic Differentiation ◽

State Of The Art ◽

Higher Order ◽

Tensor Representation ◽

Tensor Calculus ◽

Correct Algorithm ◽

Previous State ◽

Derivatives Of

Computing derivatives of tensor expressions, also known as tensor calculus, is a fundamental task in machine learning. A key concern is the efficiency of evaluating the expressions and their derivatives, which hinges on the representation of these expressions. Recently, an algorithm for computing higher order derivatives of tensor expressions like Jacobians or Hessians has been introduced that is a few orders of magnitude faster than previous state-of-the-art approaches. Unfortunately, the approach is based on Ricci notation and hence cannot be incorporated into automatic differentiation frameworks like TensorFlow, PyTorch, autograd, or JAX that use the simpler Einstein notation. This leaves two options: either change the underlying tensor representation in these frameworks or develop a new, provably correct algorithm based on Einstein notation. Obviously, the first option is impractical. Hence, we pursue the second option. Here, we show that using Ricci notation is not necessary for an efficient tensor calculus and develop an equally efficient method for the simpler Einstein notation. It turns out that turning to Einstein notation enables further improvements that lead to even better efficiency.
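As a small illustration of what a tensor expression and its derivatives look like in index (Einstein) notation, the sketch below writes the quadratic form f = x_i A_ij x_j, its gradient, and its Hessian with explicit index sums, and checks the gradient against finite differences. This is not the algorithm from the paper; the function names (`f`, `grad_f`, `hess_f`, `numeric_grad`) and the test data are assumptions chosen only for this example.

```python
def f(A, x):
    # f = x_i A_ij x_j: the repeated indices i and j are summed over.
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def grad_f(A, x):
    # df/dx_k = A_kj x_j + x_i A_ik (k is the free index; i and j are summed).
    n = len(x)
    return [
        sum(A[k][j] * x[j] for j in range(n)) + sum(x[i] * A[i][k] for i in range(n))
        for k in range(n)
    ]

def hess_f(A):
    # d^2 f / (dx_k dx_l) = A_kl + A_lk, which does not depend on x.
    n = len(A)
    return [[A[k][l] + A[l][k] for l in range(n)] for k in range(n)]

def numeric_grad(func, x, eps=1e-6):
    # Central finite differences, used only as an independent check.
    grad = []
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += eps
        xm[k] -= eps
        grad.append((func(xp) - func(xm)) / (2 * eps))
    return grad

# Hypothetical test data.
A = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 2.0]]
x = [1.0, -2.0, 0.5]

analytic = grad_f(A, x)
numeric = numeric_grad(lambda v: f(A, v), x)
max_error = max(abs(a - b) for a, b in zip(analytic, numeric))
print(analytic, hess_f(A), max_error)
```

The index comments map one-to-one onto the loops: summed indices become `sum(...)` ranges, and free indices become the positions of the returned list.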


A Simple and Efficient Tensor Calculus for Machine Learning

Fundamenta Informaticae ◽

10.3233/fi-2020-1984 ◽

2020 ◽

Vol 177 (2) ◽

pp. 157-179

Author(s):

Sören Laue ◽

Matthias Mitterreiter ◽

Joachim Giesen

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Efficient Method ◽

Automatic Differentiation ◽

State Of The Art ◽

Tensor Representation ◽

Tensor Calculus ◽

Online Tool ◽

Previous State ◽

Derivatives Of

Computing derivatives of tensor expressions, also known as tensor calculus, is a fundamental task in machine learning. A key concern is the efficiency of evaluating the expressions and their derivatives, which hinges on the representation of these expressions. Recently, an algorithm for computing higher order derivatives of tensor expressions like Jacobians or Hessians has been introduced that is a few orders of magnitude faster than previous state-of-the-art approaches. Unfortunately, the approach is based on Ricci notation and hence cannot be incorporated into automatic differentiation frameworks from deep learning like TensorFlow, PyTorch, autograd, or JAX that use the simpler Einstein notation. This leaves two options: either change the underlying tensor representation in these frameworks or develop a new, provably correct algorithm based on Einstein notation. Obviously, the first option is impractical. Hence, we pursue the second option. Here, we show that using Ricci notation is not necessary for an efficient tensor calculus and develop an equally efficient method for the simpler Einstein notation. It turns out that turning to Einstein notation enables further improvements that lead to even better efficiency. The methods that are described in this paper for computing derivatives of matrix and tensor expressions have been implemented in the online tool www.MatrixCalculus.org.


From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00025 ◽

2018 ◽

Vol 6 ◽

pp. 343-356 ◽

Cited By ~ 2

Author(s):

Egoitz Laparra ◽

Dongfang Xu ◽

Steven Bethard

Keyword(s):

Neural Network ◽

Machine Learning ◽

Comparative Analysis ◽

State Of The Art ◽

Learning Approaches ◽

Semantic Parsing ◽

Time Intervals ◽

Semantic Composition ◽

Previous State ◽

New Scoring

This paper presents the first model for time normalization trained on the SCATE corpus. In the SCATE schema, time expressions are annotated as a semantic composition of time entities. This novel schema favors machine learning approaches, as it can be viewed as a semantic parsing task. In this work, we propose a character level multi-output neural network that outperforms previous state-of-the-art built on the TimeML schema. To compare predictions of systems that follow both SCATE and TimeML, we present a new scoring metric for time intervals. We also apply this new metric to carry out a comparative analysis of the annotations of both schemes in the same corpus.


Higher Order Automatic Differentiation with Dual Numbers

Periodica Polytechnica Electrical Engineering and Computer Science ◽

10.3311/ppee.16341 ◽

2020 ◽

Author(s):

László Szirmay-Kalos

Keyword(s):

Gaussian Curvature ◽

Automatic Differentiation ◽

Arbitrary Order ◽

Higher Order ◽

Dual Numbers ◽

Computer Implementation ◽

Particular Solution ◽

Derivatives Of ◽

Curve Fairing ◽

Curvature Computation

In engineering applications, we often need the derivatives of functions defined by a program. The approach chosen for derivative computation must be algebraic to allow computer implementation. A particular solution to obtain first derivatives is the application of dual numbers. This paper proposes simple and compact generalizations of this idea to obtain derivatives of arbitrary order for single or multi-variate functions and the automatic handling of 0/0 ambiguities in the calculations. We also provide the C++ code that takes advantage of operator overloading and recursion. The method is demonstrated by path animation, Gaussian curvature computation, and curve fairing.
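The first-derivative case mentioned in this abstract can be sketched with a small dual-number class. This is a minimal Python illustration of the general idea (numbers of the form a + b·ε with ε² = 0, propagated by operator overloading), not the paper's C++ code and not its arbitrary-order generalization; the names `Dual`, `sin`, `derivative`, and `g` are assumptions made for this example.

```python
import math

class Dual:
    """Number of the form val + der * eps, where eps**2 == 0."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(other):
        # Treat plain numbers as constants (zero derivative part).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        # Product rule: (a + a'eps)(b + b'eps) = ab + (a'b + ab')eps.
        other = Dual.lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

    def __truediv__(self, other):
        # Quotient rule for the derivative part.
        other = Dual.lift(other)
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

def sin(x):
    # Chain rule: d/dx sin(u) = cos(u) * u'.
    x = Dual.lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

def derivative(func, x0):
    # Seed the derivative part with 1 and read it back from the result.
    return func(Dual(x0, 1.0)).der

def g(x):
    return x * x * x + 2 * x + sin(x)

slope = derivative(g, 1.5)  # analytically 3 * 1.5**2 + 2 + cos(1.5)
print(slope)
```

Because `g` is written with ordinary arithmetic, the same function body works for plain floats and for `Dual` values, which is the convenience that operator overloading provides.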


Perturbation confusion in forward automatic differentiation of higher-order functions

Journal of Functional Programming ◽

10.1017/s095679681900008x ◽

2019 ◽

Vol 29 ◽

Cited By ~ 1

Author(s):

Oleksandr Manzyuk ◽

Barak A. Pearlmutter ◽

Alexey Andreyevich Radul ◽

David R. Rush ◽

Jeffrey Mark Siskind

Keyword(s):

Automatic Differentiation ◽

Computer Programs ◽

Higher Order ◽

The Other ◽

Order Function ◽

Accumulation Mode ◽

The Creation ◽

One To One ◽

Derivatives Of ◽

Higher Order Functions

Automatic differentiation (AD) is a technique for augmenting computer programs to compute derivatives. The essence of AD in its forward accumulation mode is to attach perturbations to each number, and propagate these through the computation by overloading the arithmetic operators. When derivatives are nested, the distinct derivative calculations, and their associated perturbations, must be distinguished. This is typically accomplished by creating a unique tag for each derivative calculation and tagging the perturbations. We exhibit a subtle bug, present in fielded implementations which support derivatives of higher-order functions, in which perturbations are confused despite the tagging machinery, leading to incorrect results. The essence of the bug is as follows: a unique tag is needed for each derivative calculation, but in existing implementations unique tags are created when taking the derivative of a function at a point. When taking derivatives of higher-order functions, these need not correspond! We exhibit a simple example: a higher-order function f whose derivative at a point x, namely f′(x), is itself a function which calculates a derivative. This situation arises naturally when taking derivatives of curried functions. Two potential solutions are presented, and their deficiencies discussed. One uses eta expansion to delay the creation of fresh tags in order to put them into one-to-one correspondence with derivative calculations. The other wraps outputs of derivative operators with tag substitution machinery. Both solutions seem very difficult to implement without violating the desirable complexity guarantees of forward AD.
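The role of tags in nested forward-mode derivatives can be sketched as follows. This toy example shows only the basic perturbation confusion that tagging is meant to prevent; it does not reproduce the subtler higher-order-function bug the paper exhibits, nor either of its proposed fixes. All names (`Tagged`, `naive_derivative`, `tagged_derivative`, `nested`) are assumptions made for this illustration.

```python
import itertools

_tags = itertools.count()   # source of fresh tags: 0, 1, 2, ...
SHARED_TAG = -1             # one tag for every call, i.e. effectively no tagging

def _parts(x, tag):
    # Split x into (primal, perturbation) with respect to one tag.
    if isinstance(x, Tagged) and x.tag == tag:
        return x.val, x.der
    return x, 0.0

class Tagged:
    """Value val + der * eps_tag; val and der may carry other tags."""

    def __init__(self, tag, val, der):
        self.tag, self.val, self.der = tag, val, der

    def _split(self, other):
        # Decompose both operands with respect to the newest tag involved.
        tag = max(self.tag, other.tag) if isinstance(other, Tagged) else self.tag
        return tag, _parts(self, tag), _parts(other, tag)

    def __add__(self, other):
        tag, (a, da), (b, db) = self._split(other)
        return Tagged(tag, a + b, da + db)
    __radd__ = __add__

    def __mul__(self, other):
        tag, (a, da), (b, db) = self._split(other)
        return Tagged(tag, a * b, da * b + a * db)
    __rmul__ = __mul__

def naive_derivative(f, x):
    # Every derivative calculation shares one perturbation.
    return _parts(f(Tagged(SHARED_TAG, x, 1.0)), SHARED_TAG)[1]

def tagged_derivative(f, x):
    # Each derivative calculation gets its own fresh tag.
    tag = next(_tags)
    return _parts(f(Tagged(tag, x, 1.0)), tag)[1]

def nested(derivative):
    # d/dx [ x * (d/dy (x + y) at y = 1) ] at x = 1; the true value is 1.
    return derivative(lambda x: x * derivative(lambda y: x + y, 1.0), 1.0)

confused = nested(naive_derivative)   # inner call also picks up x's perturbation
correct = nested(tagged_derivative)   # inner call sees only y's perturbation
print(confused, correct)
```

In the untagged variant the inner derivative cannot tell the perturbation of `x` from that of `y`, so it reports 2 for d/dy (x + y) and the outer result is wrong; with fresh tags the inner result is 1 and the outer derivative is 1.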


Joint sparsity-biased variational graph autoencoders

The Journal of Defense Modeling and Simulation Applications Methodology Technology ◽

10.1177/1548512921996828 ◽

2021 ◽

pp. 154851292199682

Author(s):

Lane Lawley ◽

Will Frey ◽

Patrick Mullen ◽

Alexander D Wissner-Gross

Keyword(s):

Machine Learning ◽

Link Prediction ◽

State Of The Art ◽

Reconstruction Accuracy ◽

Sparse Graphs ◽

Joint Sparsity ◽

Network Topologies ◽

Previous State ◽

Classification Tasks ◽

Node Classification

To bring the full benefits of machine learning to defense modeling and simulation, it is essential to first learn useful representations for sparse graphs consisting of both key entities (vertices) and their relationships (edges). Here, we present a new model, the Joint Sparsity-Biased Variational Graph AutoEncoder (JSBVGAE), capable of learning embedded representations of nodes from which both sparse network topologies and node features can be jointly and accurately reconstructed. We show that our model outperforms the previous state of the art on standard link-prediction and node-classification tasks, and achieves significantly higher whole-network reconstruction accuracy, while reducing the number of trained parameters.


Multi-Resolution Autoregressive Graph-to-Graph Translation for Molecules

10.26434/chemrxiv.8266745.v1 ◽

2019 ◽

Author(s):

Wengong Jin ◽

Regina Barzilay ◽

Tommi S Jaakkola

Keyword(s):

Drug Discovery ◽

State Of The Art ◽

Molecular Graph ◽

Biochemical Properties ◽

Large Margin ◽

Previous State ◽

Translation Methods ◽

Atom Level ◽

Precursor Molecules ◽

Prior State

The problem of accelerating drug discovery relies heavily on automatic tools to optimize precursor molecules to give them better biochemical properties. Our work in this paper substantially extends the prior state of the art in graph-to-graph translation methods for molecular optimization. In particular, we realize coherent multi-resolution representations by interweaving trees over substructures with the atom-level encoding of the original molecular graph. Moreover, our graph decoder is fully autoregressive, and interleaves each step of adding a new substructure with the process of resolving its connectivity to the emerging molecule. We evaluate our model on multiple molecular optimization tasks and show that our model outperforms previous state-of-the-art baselines by a large margin.


Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.31232/osf.io/4pxq2 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Ferdinand Filip ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Science ◽

State Of The Art ◽

Science Methods ◽

Learning Models ◽

Diverse Range ◽

Hybrid Machine ◽

Economics Research

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis was performed on novel data science methods in four individual classes: deep learning models, hybrid deep learning models, hybrid machine learning, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancements of sophisticated hybrid deep learning models.


Deep Learning for text in limited data settings

10.36227/techrxiv.12100692 ◽

2020 ◽

Author(s):

Pathikkumar Patel ◽

Bhargav Lad ◽

Jinan Fiaidhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Sentiment Analysis ◽

Transfer Learning ◽

Text Classification ◽

State Of The Art ◽

Time Series Forecasting ◽

Text Data ◽

Performance Levels

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.


Multi-hop assortativities for network classification

Journal of Complex Networks ◽

10.1093/comnet/cny034 ◽

2018 ◽

Vol 7 (4) ◽

pp. 603-622 ◽

Cited By ~ 1

Author(s):

Leonardo Gutiérrez-Gómez ◽

Jean-Charles Delvenne

Keyword(s):

Machine Learning ◽

Scientific Collaboration ◽

State Of The Art ◽

Medical Engineering ◽

Research Field ◽

Classification Task ◽

Collaboration Network ◽

Structural Patterns ◽

Art Methods

Several social, medical, engineering and biological challenges rely on discovering the functionality of networks from their structure and node metadata, when it is available. For example, in chemoinformatics one might want to detect whether a molecule is toxic based on structure and atomic types, or discover the research field of a scientific collaboration network. Existing techniques rely on counting or measuring structural patterns that are known to show large variations from network to network, such as the number of triangles, or the assortativity of node metadata. We introduce the concept of multi-hop assortativity, which captures the similarity of the nodes situated at the extremities of a randomly selected path of a given length. We show that multi-hop assortativity unifies various existing concepts and offers a versatile family of ‘fingerprints’ to characterize networks. These fingerprints in turn allow the functionalities of a network to be recovered with the help of the machine learning toolbox. Our method is evaluated empirically on established social and chemoinformatic network benchmarks. Results reveal that our assortativity-based features are competitive, providing highly accurate results and often outperforming state-of-the-art methods for the network classification task.


Advances in the Application of Machine Learning Techniques for Power System Analytics: A Survey

Energies ◽

10.3390/en14164776 ◽

2021 ◽

Vol 14 (16) ◽

pp. 4776

Author(s):

Seyed Mahdi Miraftabzadeh ◽

Michela Longo ◽

Federica Foiadelli ◽

Marco Pasetti ◽

Raul Igual

Keyword(s):

Machine Learning ◽

Power Systems ◽

Smart Grids ◽

State Of The Art ◽

Smart Cities ◽

Power Grids ◽

Machine Learning Techniques ◽

Learning Techniques ◽

New Research ◽

Traditional Approaches

The recent advances in computing technologies and the increasing availability of large amounts of data in smart grids and smart cities are generating new research opportunities in the application of Machine Learning (ML) for improving the observability and efficiency of modern power grids. However, as the number and diversity of ML techniques increase, questions arise about their performance and applicability, and on the most suitable ML method depending on the specific application. Trying to answer these questions, this manuscript presents a systematic review of the state-of-the-art studies implementing ML techniques in the context of power systems, with a specific focus on the analysis of power flows, power quality, photovoltaic systems, intelligent transportation, and load forecasting. The survey investigates, for each of the selected topics, the most recent and promising ML techniques proposed by the literature, by highlighting their main characteristics and relevant results. The review revealed that, when compared to traditional approaches, ML algorithms can handle massive quantities of data with high dimensionality, by allowing the identification of hidden characteristics of (even) complex systems. In particular, even though very different techniques can be used for each application, hybrid models generally show better performances when compared to single ML-based models.
