TAPAS: Train-Less Accuracy Predictor for Architecture Search

Proceedings of the AAAI Conference on Artificial Intelligence, 2019, Vol 33, pp. 3927-3934
DOI: 10.1609/aaai.v33i01.33013927
Cited By ~ 4
Author(s): R. Istrate, F. Scheidegger, G. Mariani, D. Nikolopoulos, C. Bekas, ...
Keyword(s): Neural Network, Network Architecture, Large Scale, State Of The Art, Classification Performance, Neural Network Architecture, Network Information, Topological Network, Computational Resources

In recent years, an increasing number of researchers and practitioners have proposed algorithms for large-scale neural network architecture search: genetic algorithms, reinforcement learning, learning curve extrapolation, and accuracy predictors. None of them, however, has demonstrated high performance on unseen datasets without training new experiments. We propose a new deep neural network accuracy predictor that estimates, in fractions of a second, classification performance for unseen input datasets, without training. In contrast to previously proposed approaches, our prediction is calibrated not only on the topological network information but also on a characterization of the dataset difficulty, which allows us to re-tune the prediction without any training. Our predictor achieves a throughput exceeding 100 networks per second on a single GPU, thus creating the opportunity to perform large-scale architecture search within a few minutes. We present results of two searches performed in 400 seconds on a single GPU. Our best discovered networks reach 93.67% accuracy for CIFAR-10 and 81.01% for CIFAR-100, verified by training. These networks are competitive with other automatically discovered state-of-the-art networks; however, we needed only a small fraction of the time to solution and computational resources.
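
As a back-of-the-envelope reading of the throughput figures above, the following Python sketch multiplies the quoted lower-bound rate by the quoted search time. It assumes, purely for illustration, that the whole 400-second budget is spent on predictor queries; the abstract does not say how the time is split, and the function name is hypothetical.

```python
def candidates_scored(rate_per_second: int, budget_seconds: int) -> int:
    """Number of networks a predictor can score at a given rate and time budget."""
    return rate_per_second * budget_seconds


# Figures quoted in the abstract: more than 100 networks/s, 400 s per search.
# Assumption: the full budget goes to predictor queries.
lower_bound = candidates_scored(100, 400)
print(lower_bound)  # 40000
```

Under that assumption, a single search could score on the order of tens of thousands of candidate networks, which is the scale the abstract contrasts with approaches that train each candidate.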

SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

ACM Transactions on Graphics, 2021, Vol 40 (3), pp. 1-13
DOI: 10.1145/3450284
Author(s): Lumin Yang, Jiajie Zhuang, Hongbo Fu, Xiangzhi Wei, Kun Zhou, ...
Keyword(s): Neural Network, Neural Networks, Network Architecture, Large Scale, State Of The Art, Semantic Segmentation, Structure Information, Graph Neural Networks, Node Labels, Point Level

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph, with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels: point level, stroke level, and sketch level. SketchGNN significantly improves on the accuracy of state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
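
The graph construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that "stroke structure" means an edge between each pair of consecutive sampled points within a stroke, and the function and variable names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def sketch_to_graph(strokes: List[List[Point]]):
    """Turn a stroke-based sketch into (nodes, edges, stroke_ids).

    nodes: every sampled point, in stroke order.
    edges: index pairs linking consecutive points of the same stroke
           (assumed encoding of the stroke structure).
    stroke_ids: for each node, the index of the stroke it belongs to.
    """
    nodes: List[Point] = []
    edges: List[Tuple[int, int]] = []
    stroke_ids: List[int] = []
    for stroke_id, stroke in enumerate(strokes):
        start = len(nodes)
        nodes.extend(stroke)
        stroke_ids.extend([stroke_id] * len(stroke))
        # Link neighbours along the stroke; no edge crosses stroke boundaries.
        for i in range(start, start + len(stroke) - 1):
            edges.append((i, i + 1))
    return nodes, edges, stroke_ids


# Two toy strokes: a 3-point horizontal line and a 2-point vertical line.
strokes = [[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [(0.0, 1.0), (0.0, 2.0)]]
nodes, edges, stroke_ids = sketch_to_graph(strokes)
print(edges)       # [(0, 1), (1, 2), (3, 4)]
print(stroke_ids)  # [0, 0, 0, 1, 1]
```

A per-node label predictor would then operate on these nodes and edges; the stroke index is kept because the abstract mentions stroke-level features.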

IPOMDP-Net: A Deep Neural Network for Partially Observable Multi-Agent Planning Using Interactive POMDPs

Proceedings of the AAAI Conference on Artificial Intelligence, 2019, Vol 33, pp. 6062-6069
DOI: 10.1609/aaai.v33i01.33016062
Cited By ~ 1
Author(s): Yanlin Han, Piotr Gmytrasiewicz
Keyword(s): Neural Network, Network Architecture, State Of The Art, Neural Computing, Neural Network Architecture, Markov Decision, Planning Algorithm, Multi Agent, Partially Observable, Multi Agent Planning

This paper introduces the IPOMDP-net, a neural network architecture for multi-agent planning under partial observability. It embeds an interactive partially observable Markov decision process (I-POMDP) model and a QMDP planning algorithm that solves the model in a neural network architecture. The IPOMDP-net is fully differentiable and allows for end-to-end training. In the learning phase, we train an IPOMDP-net on various fixed and randomly generated environments in a reinforcement learning setting, assuming observable reinforcements and unknown (randomly initialized) model functions. In the planning phase, we test the trained network on new, unseen variants of the environments under the planning setting, using the trained model to plan without reinforcements. Empirical results show that our model-based IPOMDP-net outperforms the other state-of-the-art model-free network and generalizes better to larger, unseen environments. Our approach provides a general neural computing architecture for multi-agent planning using I-POMDPs. It suggests that, in a multi-agent setting, having a model of other agents benefits our decision-making, resulting in a policy of higher quality and better generalizability.
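
For readers unfamiliar with QMDP, the following Python sketch shows the standard single-agent form of the approximation: solve the underlying fully observable MDP by value iteration, then weight the resulting Q-values by the current belief. It is not the IPOMDP-net itself, which embeds the planner in a differentiable network and reasons over models of other agents; the two-state toy problem and all names are assumptions made for illustration.

```python
def mdp_q_values(T, R, gamma=0.95, sweeps=500):
    """Value iteration on the underlying MDP.

    T[a][s][s2] is the transition probability, R[s][a] the reward.
    """
    n_states, n_actions = len(R), len(R[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        V = [max(row) for row in Q]
        Q = [[R[s][a] + gamma * sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
    return Q


def qmdp_action(belief, Q):
    """QMDP rule: pick the action maximising sum_s belief[s] * Q[s][a]."""
    n_actions = len(Q[0])
    scores = [sum(b * Q[s][a] for s, b in enumerate(belief)) for a in range(n_actions)]
    best = max(range(n_actions), key=lambda a: scores[a])
    return best, scores


# Toy problem: action 0 stays put, action 1 swaps the state;
# staying in state 1 is the only rewarded choice.
T = [[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
     [[0.0, 1.0], [1.0, 0.0]]]   # action 1: swap
R = [[0.0, 0.0],                 # state 0
     [1.0, 0.0]]                 # state 1
Q = mdp_q_values(T, R)

print(qmdp_action([0.9, 0.1], Q)[0])  # 1: probably in state 0, so swap
print(qmdp_action([0.1, 0.9], Q)[0])  # 0: probably in state 1, so stay
```

QMDP assumes that all state uncertainty disappears after one step, which keeps planning cheap at the cost of never choosing actions purely to gather information.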

Robustness of Neural Network Emulations of Radiative Transfer Parameterizations in a State-of-the-Art General Circulation Model

2021
DOI: 10.5194/gmd-2021-114
Author(s): Alexei Belochitski, Vladimir Krasnopolsky
Keyword(s): Neural Network, Radiative Transfer, Network Architecture, General Circulation, State Of The Art, Circulation Model, Set Design, Neural Network Architecture, Parametric Change, Model Components

Abstract. The ability of machine-learning (ML) based model components to generalize to previously unseen inputs, and the resulting stability of the models that use these components, has been receiving a lot of recent attention, especially when it comes to ML-based parameterizations. At the same time, ML-based emulators of existing parameterizations can be stable, accurate, and fast when used in the model they were specifically designed for. In this work we show that shallow-neural-network-based emulators of radiative transfer parameterizations developed almost a decade ago for a state-of-the-art GCM are robust with respect to substantial structural and parametric change in the host model: when used in two seven-month-long experiments with the new model, they not only remain stable but also generate realistic output. Aspects of neural network architecture and training set design potentially contributing to the stability of ML-based model components are discussed.

Deep learning regression model for antimicrobial peptide design

2019
DOI: 10.1101/692681
Cited By ~ 3
Author(s): Jacob Witten, Zack Witten
Keyword(s): Neural Network, Antimicrobial Peptides, Network Architecture, State Of The Art, Machine Learning Techniques, Neural Network Architecture, Antibiotic Resistant, E Coli, Naturally Occurring, Learning Techniques

Abstract. Antimicrobial peptides (AMPs) are naturally occurring or synthetic peptides that show promise for treating antibiotic-resistant pathogens. Machine learning techniques are increasingly used to identify naturally occurring AMPs, but there is a dearth of purely computational methods to design novel effective AMPs, which would speed AMP development. We collected a large database, Giant Repository of AMP Activities (GRAMPA), containing AMP sequences and associated MICs. We designed a convolutional neural network to perform combined classification and regression on peptide sequences to quantitatively predict AMP activity against Escherichia coli. Our predictions outperformed the state of the art at AMP classification and were also effective at regression, for which there were no publicly available comparisons. We then used our model to design novel AMPs and experimentally demonstrated activity of these AMPs against the pathogens E. coli, Pseudomonas aeruginosa, and Staphylococcus aureus. Data, code, and neural network architecture and parameters are available at https://github.com/zswitten/Antimicrobial-Peptides.

Image Classification for Vehicle Type Dataset Using State-of-the-art Convolutional Neural Network Architecture

Proceedings of the 2018 Artificial Intelligence and Cloud Computing Conference - AICCC '18, 2018
DOI: 10.1145/3299819.3299822
Author(s): Yian Seo, Kyung-shik Shin
Keyword(s): Neural Network, Convolutional Neural Network, Image Classification, Network Architecture, State Of The Art, Neural Network Architecture, Vehicle Type

Robust Negative Sampling for Network Embedding

Proceedings of the AAAI Conference on Artificial Intelligence, 2019, Vol 33, pp. 3191-3198
DOI: 10.1609/aaai.v33i01.33013191
Cited By ~ 3
Author(s): Mohammadreza Armandpour, Patrick Ding, Jianhua Huang, Xia Hu
Keyword(s): Neural Network, Network Architecture, Large Scale, Embedding Problem, Network Embedding, Neural Network Architecture, Suitable Candidate, Computationally Expensive, Large Scale Networks, Method R

Many recent network embedding algorithms use negative sampling (NS) to approximate a variant of the computationally expensive Skip-Gram neural network architecture (SGA) objective. In this paper, we provide theoretical arguments that reveal how NS can fail to properly estimate the SGA objective, and why it is not a suitable candidate for the network embedding problem as a distinct objective. We show NS can learn undesirable embeddings, as the result of the “Popular Neighbor Problem.” We use the theory to develop a new method “R-NS” that alleviates the problems of NS by using a more intelligent negative sampling scheme and careful penalization of the embeddings. R-NS is scalable to large-scale networks, and we empirically demonstrate the superiority of R-NS over NS for multi-label classification on a variety of real-world networks including social networks and language networks.
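
As background for the discussion of NS, the Python sketch below computes the widely used skip-gram negative-sampling loss for a single (node, neighbor) pair. It illustrates plain NS only, not the R-NS method, whose sampling scheme and penalization are not specified in this abstract; the function names and toy vectors are assumptions.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))


def ns_loss(node, neighbor, negatives) -> float:
    """Negative-sampling loss for one observed pair and k sampled negatives.

    loss = -log sigmoid(node . neighbor) - sum_k log sigmoid(-node . negative_k)
    """
    loss = -log_sigmoid(dot(node, neighbor))
    for negative in negatives:
        loss -= log_sigmoid(-dot(node, negative))
    return loss


# Embeddings that agree with the observed pair give a lower loss
# than embeddings that contradict it.
good = ns_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
bad = ns_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
print(good < bad)  # True
```

Each term involves only the sampled negatives rather than every node in the network, which is what makes NS cheaper than the full skip-gram objective it approximates.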

ThriftyNets: Convolutional Neural Networks with Tiny Parameter Budget

IoT, 2021, Vol 2 (2), pp. 222-235
DOI: 10.3390/iot2020012
Author(s): Guillaume Coiffier, Ghouthi Boukli Hacene, Vincent Gripon
Keyword(s): Neural Network, Machine Learning, Neural Networks, Convolutional Neural Network, Spatial Resolution, Network Architecture, Deep Neural Networks, State Of The Art, Feature Maps, Neural Network Architecture

Deep neural networks are state of the art in a large number of machine learning challenges. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In addition, normalization, non-linearities, downsampling, and shortcuts ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
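
The parameter saving from reusing one convolutional layer can be illustrated with a simple count. The Python sketch below compares a stack of distinct layers with a single layer applied recursively; the channel width, kernel size, and depth are hypothetical values chosen for the example, not the configurations reported for ThriftyNet, and biases and normalization parameters are ignored.

```python
def conv_params(in_channels: int, out_channels: int, kernel: int) -> int:
    """Weights in one 2D convolution with a square kernel and no bias."""
    return in_channels * out_channels * kernel * kernel


def stacked_params(channels: int, kernel: int, depth: int) -> int:
    """Conventional design: `depth` distinct layers, each with its own weights."""
    return depth * conv_params(channels, channels, kernel)


def shared_params(channels: int, kernel: int, depth: int) -> int:
    """Recursive design: one layer reused `depth` times, so depth adds no weights."""
    return conv_params(channels, channels, kernel)


# Hypothetical setting: 32 channels, 3x3 kernels, 10 applications.
print(stacked_params(32, 3, 10))  # 92160
print(shared_params(32, 3, 10))   # 9216
```

The number of multiply-accumulate operations is the same in both designs for a given feature-map size, which is consistent with the abstract's remark that the approach saves parameters rather than computation.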

Conservation laws in a neural network architecture: Enforcing the atom balance of a Julia-based photochemical model (v0.2.0)

2021
DOI: 10.5194/gmd-2021-402
Author(s): Patrick Obin Sturm, Anthony S. Wexler
Keyword(s): Neural Network, Conservation Laws, Network Architecture, Neural Network Architecture, Machine Precision, Emissions Scenarios, Hidden Layer, Computational Resources, Atmospheric Phenomena, Insight Into

Abstract. Models of atmospheric phenomena provide insight into climate, air quality, and meteorology, and provide a mechanism for understanding the effect of future emissions scenarios. To accurately represent atmospheric phenomena, these models consume vast quantities of computational resources. Machine learning (ML) techniques such as neural networks have the potential to emulate compute-intensive components of these models to reduce their computational burden. However, such ML surrogate models may lead to nonphysical predictions that are difficult to uncover. Here we present a neural network architecture that enforces conservation laws. Instead of simply predicting properties of interest, a physically interpretable hidden layer within the network predicts fluxes between properties which are subsequently related to the properties of interest. As an example, we design a physics-constrained neural network surrogate model of photochemistry using this approach and find that it conserves atoms as they flow between molecules to machine precision, while outperforming a naïve neural network in terms of accuracy and non-negativity of concentrations.
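
The idea of predicting fluxes rather than concentrations directly can be sketched as follows. In this Python illustration, a fixed stoichiometry table maps any vector of reaction fluxes to concentration changes, so atom totals cannot drift regardless of what values the fluxes take. The two-reaction toy mechanism and all names are assumptions for illustration and are not taken from the paper's photochemical model; in the described architecture the fluxes would come from a hidden layer rather than being set by hand.

```python
SPECIES = ["NO2", "NO", "O", "O2", "O3"]

# One row per reaction, one column per species (toy mechanism, assumed):
#   reaction 0: NO2 -> NO + O
#   reaction 1: O + O2 -> O3
STOICHIOMETRY = [
    [-1, 1, 1, 0, 0],
    [0, 0, -1, -1, 1],
]

# Atoms of each element contained in one molecule of each species.
ATOMS = {
    "N": [1, 1, 0, 0, 0],
    "O": [2, 1, 1, 2, 3],
}


def concentration_change(fluxes):
    """Map reaction fluxes to per-species concentration changes."""
    return [sum(f * row[i] for f, row in zip(fluxes, STOICHIOMETRY))
            for i in range(len(SPECIES))]


def atom_totals(amounts):
    """Total amount of each element held in the given species amounts."""
    return {element: sum(n * a for n, a in zip(counts, amounts))
            for element, counts in ATOMS.items()}


# Any flux values, e.g. the output of a hidden layer, conserve atoms.
delta = concentration_change([0.3, 0.1])
drift = atom_totals(delta)
print(all(abs(v) < 1e-12 for v in drift.values()))  # True
```

Because every row of the stoichiometry table is itself atom-balanced, conservation holds by construction up to floating-point rounding, which matches the abstract's claim of conservation to machine precision.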

Fast Pre-Diagnosis of Neoplastic Changes in Cytology Images Using Machine Learning

Applied Sciences, 2021, Vol 11 (16), pp. 7181
DOI: 10.3390/app11167181
Author(s): Jakub Caputa, Daria Łukasik, Maciej Wielgosz, Michał Karwatowski, Rafał Frączek, ...
Keyword(s): Neural Network, Machine Learning, Network Architecture, State Of The Art, Prototype Model, Model Parameters, Round Cell, Neural Network Architecture, Original Dataset, Automated Tools

We present experimental results on using the YOLOv3 neural network architecture to automatically detect tumor cells in cytological samples taken from the skin of canines. A rich dataset of 1219 smeared sample images with 28,149 objects was gathered and annotated by a veterinarian to perform the experiments. It covers three types of common round cell neoplasms: mastocytoma, histiocytoma, and lymphoma. The dataset is thoroughly described in the paper and is publicly available. The YOLOv3 neural network architecture was trained using various schemes involving modifications of the original dataset and different model parameters. The experiments showed that the prototype model achieved 0.7416 mAP, which outperforms state-of-the-art machine learning and human-estimated results. We also provide a series of analyses that may facilitate ML-based solutions by casting more light on some aspects of their performance, and we present the main discrepancies between ML-based and human-based diagnoses. This outline may help depict the scenarios in which automated tools may support the diagnosis process.

Deep Neural Network Architecture Search for Wearable Heart Rate Estimations

Studies in Health Technology and Informatics - Public Health and Informatics, 2021
DOI: 10.3233/shti210366
Author(s): Daniel Ray, Tim Collins, Prasad Ponnapalli
Keyword(s): Neural Network, Heart Rate, Deep Learning, Network Architecture, Deep Neural Network, State Of The Art, Data Driven, Network Architectures, Learning Approaches, Neural Network Architecture

Extracting accurate heart rate estimations from wrist-worn photoplethysmography (PPG) devices is challenging because the signal contains artifacts from several sources. Deep learning approaches have shown very promising results, outperforming classical methods with improvements of 21% and 31% on two state-of-the-art datasets. This paper provides an analysis of several data-driven methods for creating deep neural network architectures in the hope of further improvements.
