Augmenting Transformers with KNN-Based Composite Memory for Dialog

Author(s):  
Angela Fan ◽  
Claire Gardent ◽  
Chloé Braud ◽  
Antoine Bordes

Various machine learning tasks can benefit from access to external information of different modalities, such as text and images. Recent work has focused on learning architectures with large memories capable of storing this knowledge. We propose augmenting generative Transformer neural networks with KNN-based Information Fetching (KIF) modules. Each KIF module learns a read operation to access fixed external knowledge. We apply these modules to generative dialog modeling, a challenging task where information must be flexibly retrieved and incorporated to maintain the topic and flow of conversation. We demonstrate the effectiveness of our approach by identifying relevant knowledge required for knowledgeable but engaging dialog from Wikipedia, images, and human-written dialog utterances, and show that leveraging this retrieved information improves model performance, measured by automatic and human evaluation.
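As a rough illustration of the kind of read operation a KIF module performs (the paper's modules learn the query encoder end-to-end; the function and memory names below are hypothetical), a KNN read scores a query embedding against a fixed external memory and returns the nearest entries:

```python
import math

def knn_read(query_vec, memory, k=2):
    """Illustrative sketch of a KIF-style read: score a query vector against
    a fixed external memory by cosine similarity and return the top-k keys.
    Names, shapes, and the similarity choice are assumptions, not the
    paper's implementation."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    # Rank memory entries by similarity to the query, best first.
    scored = sorted(memory.items(), key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [key for key, _ in scored[:k]]

# Toy memory of knowledge embeddings (hypothetical keys and vectors).
memory = {
    "wiki:dogs": [1.0, 0.1, 0.0],
    "wiki:cats": [0.9, 0.2, 0.1],
    "wiki:cars": [0.0, 0.1, 1.0],
}
print(knn_read([1.0, 0.0, 0.0], memory, k=2))
```

In the paper's setting, the retrieved entries (Wikipedia sentences, image features, or past utterances) are then fed to the Transformer rather than simply returned.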

Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are usually left unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
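A minimal sketch of the graph-construction step the abstract describes, under simplifying assumptions (cosine similarity and a fixed threshold; the paper's LGG construction may weight or normalize edges differently):

```python
import math

def latent_geometry_graph(batch_reps, threshold=0.5):
    """Illustrative sketch: build a similarity graph over a batch of
    intermediate representations, connecting the pairs whose cosine
    similarity exceeds a threshold. Returns the edge set over batch indices."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    n = len(batch_reps)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(batch_reps[i], batch_reps[j]) > threshold:
                edges.add((i, j))
    return edges

# Toy batch of 2-D intermediate representations: the first two inputs are
# nearly parallel, the third is orthogonal to both.
reps = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(latent_geometry_graph(reps))
```

A constraint such as (i) teacher mimicking would then compare the student's edge structure (or similarity matrix) at a given layer with the teacher's, penalizing disagreement.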


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Jamie Miles ◽  
Janette Turner ◽  
Richard Jacques ◽  
Julia Williams ◽  
Suzanne Mason

Abstract
Background: The primary objective of this review is to assess the accuracy of machine learning methods in triaging the acuity of patients presenting in the Emergency Care System (ECS). The population is patients who have contacted the ambulance service or presented at the Emergency Department. The index test is a machine-learning algorithm that aims to stratify the acuity of incoming patients at initial triage, compared against either an existing decision support tool, clinical opinion, or, in the absence of these, no comparator. The outcomes of this review are calibration, discrimination, and classification statistics.
Methods: Only derivation studies (with or without internal validation) were included. MEDLINE, CINAHL, PubMed and the grey literature were searched on 14 December 2019. Risk of bias was assessed using the PROBAST tool and data were extracted using the CHARMS checklist. Discrimination (C-statistic) was a commonly reported model performance measure, and these statistics were therefore represented as a range within each machine learning method. The majority of studies had poorly reported outcomes, so a narrative synthesis of results was performed.
Results: A total of 92 models (from 25 studies) were included in the review. There were two main triage outcomes: hospitalisation (56 models) and critical care need (25 models). For hospitalisation, neural networks and tree-based methods both had a median C-statistic of 0.81 (IQR 0.80-0.84 and 0.79-0.82, respectively). Logistic regression had a median C-statistic of 0.80 (0.74-0.83). For critical care need, neural networks had a median C-statistic of 0.89 (0.86-0.91), tree-based methods 0.85 (0.84-0.88), and logistic regression 0.83 (0.79-0.84).
Conclusions: Machine-learning methods appear accurate in triaging undifferentiated patients entering the Emergency Care System. There was no clear benefit of using one technique over another; however, models derived by logistic regression were more transparent in reporting model performance. Future studies should adhere to reporting guidelines and use them at the protocol design stage.
Registration and funding: This systematic review is registered on the International Prospective Register of Systematic Reviews (PROSPERO) and can be accessed online at: https://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42020168696. This study was funded by the NIHR as part of a Clinical Doctoral Research Fellowship.
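The C-statistic the review compares across models is, for a binary outcome, equivalent to the AUC: the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. A direct pairwise computation on toy data:

```python
def c_statistic(scores, labels):
    """C-statistic (equivalently AUC) for a binary outcome: the fraction of
    positive/negative pairs where the positive case receives the higher
    predicted risk, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy triage scores: higher score = higher predicted acuity.
# A model that ranks every admitted patient above every discharged one
# achieves a C-statistic of 1.0.
print(c_statistic([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))
```

A C-statistic of 0.5 corresponds to random ranking, which is why the 0.80-0.89 medians reported above indicate useful, though imperfect, discrimination.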


2011 ◽  
pp. 81-104 ◽  
Author(s):  
G. Camps-Valls ◽  
J. F. Guerrero-Martinez

In this chapter, we review the vast field of application of artificial neural networks in cardiac pathology discrimination based on electrocardiographic signals. We discuss advantages and drawbacks of neural and adaptive systems in cardiovascular medicine and catch a glimpse of forthcoming developments in machine learning models for the real clinical environment. Some problems are identified in the learning tasks of beat detection, feature selection/extraction, and classification, and some proposals and suggestions are given to alleviate the problems of interpretability, overfitting, and adaptation. These have become important problems in recent years and will surely constitute the basis of some investigations in the immediate future.


Author(s):  
Niall Rooney

The concept of ensemble learning has its origins in research from the late 1980s/early 1990s into combining a number of artificial neural networks (ANNs) models for regression tasks. Ensemble learning is now a widely deployed and researched topic within the area of machine learning and data mining. Ensemble learning, as a general definition, refers to the concept of being able to apply more than one learning model to a particular machine learning problem using some method of integration. The desired goal of course is that the ensemble as a unit will outperform any of its individual members for the given learning task. Ensemble learning has been extended to cover other learning tasks such as classification (refer to Kuncheva, 2004 for a detailed overview of this area), online learning (Fern & Givan, 2003) and clustering (Strehl & Ghosh, 2003). The focus of this article is to review ensemble learning with respect to regression, where by regression, we refer to the supervised learning task of creating a model that relates a continuous output variable to a vector of input variables.
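The integration step described above can be as simple as averaging. A minimal sketch for regression, with hypothetical toy models standing in for trained learners (weighted averaging and stacking are common refinements):

```python
def ensemble_predict(models, x):
    """Minimal sketch of ensemble integration for regression: each model is
    any callable mapping an input vector to a continuous prediction; the
    simplest integration method averages their outputs."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# Two toy linear models with different coefficients (illustrative only):
# the ensemble's error can be lower than either member's when their
# individual errors partially cancel.
m1 = lambda x: 2.0 * x[0] + 1.0
m2 = lambda x: 1.0 * x[0] + 3.0
print(ensemble_predict([m1, m2], [2.0]))
```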


2021 ◽  
pp. 1-18
Author(s):  
Ilenna Simone Jones ◽  
Konrad Paul Kording

Abstract Physiological experiments have highlighted how the dendrites of biological neurons can nonlinearly process distributed synaptic inputs. However, it is unclear how aspects of a dendritic tree, such as its branched morphology or its repetition of presynaptic inputs, determine neural computation beyond this apparent nonlinearity. Here we use a simple model where the dendrite is implemented as a sequence of thresholded linear units. We manipulate the architecture of this model to investigate the impacts of binary branching constraints and repetition of synaptic inputs on neural computation. We find that models with such manipulations can perform well on machine learning tasks, such as Fashion MNIST or Extended MNIST. We find that model performance on these tasks is limited by binary tree branching and dendritic asymmetry and is improved by the repetition of synaptic inputs to different dendritic branches. These computational experiments further neuroscience theory on how different dendritic properties might determine neural computation of clearly defined tasks.
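The binary-tree arrangement of thresholded linear units described above can be sketched as follows (ReLU as the threshold nonlinearity and a flat weight layout are assumptions here; the paper's exact parameterization may differ):

```python
def relu(z):
    return max(0.0, z)

def binary_dendrite(inputs, weights, bias=0.0):
    """Illustrative sketch of a dendritic tree as a binary tree of thresholded
    linear units: leaves are weighted synaptic inputs, and each internal node
    sums its two children and applies a threshold nonlinearity. The input
    length must be a power of two for the tree to close at a single root."""
    layer = [w * x for w, x in zip(weights, inputs)]
    while len(layer) > 1:
        # Pair adjacent branches and apply the node nonlinearity.
        layer = [relu(layer[i] + layer[i + 1] + bias)
                 for i in range(0, len(layer), 2)]
    return layer[0]

print(binary_dendrite([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]))
```

Repetition of synaptic inputs, as studied in the paper, would correspond to presenting the same input value at several leaves with independently learned weights.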


2021 ◽  
pp. 43-53
Author(s):  
Adnan Mohsin Abdulazeez

Due to many new medical applications, the demand for accurate ECG classification is high. Several Machine Learning (ML) algorithms are currently available for ECG data processing and classification. The key limitation of these ML studies, however, is their use of heuristic, hand-crafted, or engineered features with shallow learning architectures. The difficulty lies in the risk of not selecting the most suitable features to achieve good classification accuracy on this ECG problem. One suggested choice is to use deep learning algorithms, in which the first layer of a CNN acts as a feature extractor. This paper summarizes some of the key machine learning approaches to ECG classification, assessing them in terms of the features they use, the classification accuracy they achieve on key physiological ECG biomarkers, and their use of statistical modeling and supporting simulation.
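The idea of a first CNN layer acting as a learned feature extractor can be illustrated with a single 1-D convolution over raw samples (the kernel values here are hand-picked for illustration; in a trained network they are learned):

```python
def conv1d(signal, kernel):
    """Sketch of a first convolutional layer on a raw 1-D signal: sliding a
    kernel over the samples produces a feature map, replacing hand-crafted
    features with learned ones."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A difference kernel responds strongly to sharp deflections, such as the
# rising and falling edges around an R-peak in a toy ECG trace.
ecg = [0.0, 0.1, 1.0, 0.2, 0.0]
print(conv1d(ecg, [-1.0, 1.0]))
```

In a real deep model, many such kernels are learned jointly with the classifier, so the network discovers which morphological features matter rather than relying on engineered ones.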


2019 ◽  
Author(s):  
Jakub M. Bartoszewicz ◽  
Anja Seidel ◽  
Robert Rentzsch ◽  
Bernhard Y. Renard

Abstract
Motivation: We expect novel pathogens to arise due to their fast-paced evolution, and new species to be discovered thanks to advances in DNA sequencing and metagenomics. What is more, recent developments in synthetic biology raise concerns that some strains of bacteria could be modified for malicious purposes. Traditional approaches to open-view pathogen detection depend on databases of known organisms, limiting their performance on unknown, unrecognized, and unmapped sequences. In contrast, machine learning methods can infer pathogenic phenotypes from single NGS reads even though the biological context is unavailable. However, modern neural architectures treat DNA as a simple character string and may predict conflicting labels for a given sequence and its reverse-complement. This undesirable property may impact model performance.
Results: We present DeePaC, a Deep Learning Approach to Pathogenicity Classification. It includes a universal, extensible framework for neural architectures ensuring identical predictions for any given DNA sequence and its reverse-complement. We implement reverse-complement convolutional neural networks and LSTMs, which outperform the state-of-the-art methods based on both sequence homology and machine learning. Combining a reverse-complement architecture with integrating the predictions for both mates in a read pair results in cutting the error rate almost in half in comparison to the previous state-of-the-art.
Availability: The code and the models are available at: https://gitlab.com/rki_bioinformatics/DeePaC
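The invariance property the abstract describes can be made concrete. DeePaC builds it into the network layers themselves; a simpler route to the same guarantee, shown here for illustration, wraps an arbitrary scoring function so a read and its reverse-complement always receive the same prediction:

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Reverse-complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def rc_symmetric_predict(model, seq):
    """Symmetrized prediction: averaging a model's scores on a sequence and
    its reverse-complement guarantees identical outputs for both strands.
    This wrapper is an illustrative stand-in for DeePaC's reverse-complement
    layers, which enforce the property architecturally."""
    return 0.5 * (model(seq) + model(reverse_complement(seq)))

# Toy "model": GC fraction, a stand-in for a trained network's score.
gc = lambda s: sum(b in "GC" for b in s) / len(s)
print(rc_symmetric_predict(gc, "ATGCG"))
```

The architectural version is preferable in practice because it shares parameters between the two strand orientations rather than doubling inference cost.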


2020 ◽  
Vol 8 (4) ◽  
pp. 551-573 ◽  
Author(s):  
Hao Yin ◽  
Austin R. Benson ◽  
Johan Ugander

Abstract
Recent work studying triadic closure in undirected graphs has drawn attention to the distinction between measures that focus on the “center” node of a wedge (i.e., length-2 path) versus measures that focus on the “initiator,” a distinction with considerable consequences. Existing measures in directed graphs, meanwhile, have all been center-focused. In this work, we propose a family of eight directed closure coefficients that measure the frequency of triadic closure in directed graphs from the perspective of the node initiating closure. The eight coefficients correspond to different labeled wedges, where the initiator and center nodes are labeled, and we observe dramatic empirical variation in these coefficients on real-world networks, even in cases when the induced directed triangles are isomorphic. To understand this phenomenon, we examine the theoretical behavior of our closure coefficients under a directed configuration model. Our analysis illustrates an underlying connection between the closure coefficients and moments of the joint in- and out-degree distributions of the network, offering an explanation of the observed asymmetries. We also use our directed closure coefficients as predictors in two machine learning tasks. We find interpretable models with AUC scores above 0.92 in class-balanced binary prediction, substantially outperforming models that use traditional center-focused measures.
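One member of such a family can be sketched directly: among out-out wedges u → v → w initiated at u, count the fraction closed by an edge u → w. (The other seven coefficients vary the edge directions at the initiator and center; the exact labeling conventions follow the paper, which this simplified version may not match in every detail.)

```python
def out_out_closure_coefficient(edges, u):
    """Initiator-focused closure for out-out wedges: among length-2 directed
    paths u -> v -> w starting at u (with w != u), the fraction that are
    closed by a directed edge u -> w."""
    edge_set = set(edges)
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    wedges = closed = 0
    for v in succ.get(u, ()):
        for w in succ.get(v, ()):
            if w == u:
                continue  # returning to the initiator is not a wedge
            wedges += 1
            closed += (u, w) in edge_set
    return closed / wedges if wedges else 0.0

# Toy digraph: node 0 initiates two wedges (0->1->2 and 0->1->3),
# and only the first is closed by the edge 0->2.
edges = [(0, 1), (1, 2), (0, 2), (1, 3)]
print(out_out_closure_coefficient(edges, 0))
```

A center-focused measure would instead condition on v, which is why the two families can diverge sharply on the same network.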


Earth ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 174-190
Author(s):  
Sujan Pal ◽  
Prateek Sharma

Machine learning (ML), as an artificial intelligence tool, has driven significant progress in data-driven research in Earth sciences. Land Surface Models (LSMs) are important components of climate models, which help to capture the water, energy, and momentum exchange between the land surface and the atmosphere, providing lower boundary conditions to the atmospheric models. The objectives of this review paper are to highlight the areas of improvement in land modeling using ML and discuss the crucial ML techniques in detail. Literature searches were conducted using relevant keywords to obtain an extensive list of articles. The bibliographic lists of these articles were also considered. To date, ML-based techniques have been able to upgrade the performance of LSMs and reduce uncertainties by improving the estimation of evapotranspiration and heat fluxes, parameter optimization, crop yield prediction, and model benchmarking. Widely used ML techniques for these purposes include Artificial Neural Networks and Random Forests. We conclude that further improvements in land modeling are possible in terms of high-resolution data preparation, parameter calibration, uncertainty reduction, efficient model performance, and data assimilation using ML. In addition to the traditional techniques, convolutional neural networks, long short-term memory, and other deep learning methods can be implemented.


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Jayakrishnan Ajayakumar ◽  
Andrew J. Curtis ◽  
Vanessa Rouzier ◽  
Jean William Pape ◽  
Sandra Bempah ◽  
...  

Abstract
Background: The health burden in developing-world informal settlements often coincides with a lack of spatial data that could be used to guide intervention strategies. Spatial video (SV) has proven to be a useful tool to collect environmental and social data at a granular scale, though the effort required to turn these spatially encoded video frames into maps limits sustainability and scalability. In this paper we explore the use of convolutional neural networks (CNNs) to solve this problem by automatically identifying disease-related environmental risks in a series of SV collected from Haiti. Our objective is to determine the potential of machine learning in health risk mapping for these environments by assessing the challenges faced in adequately training the required classification models.
Results: We show that SV can be a suitable source for automatically identifying and extracting health risk features using machine learning. While well-defined objects such as drains, buckets, tires and animals can be efficiently classified, more amorphous masses such as trash or standing water are difficult to classify. Our results further show that variations in the number of image frames selected, the image resolution, and combinations of these can be used to improve the overall model performance.
Conclusion: Machine learning in combination with spatial video can be used to automatically identify environmental risks associated with common health problems in informal settlements, though there are likely to be variations in the type of data needed for training based on location. Success is also likely to vary geographically depending on the risk type being identified. However, we are confident in identifying a series of best practices for data collection, model training and performance in these settings. We also discuss the next step of testing these findings in other environments, and how adding in the simultaneously collected geographic data could be used to create an automatic health risk mapping tool.

