scholarly journals TAPAS: Train-Less Accuracy Predictor for Architecture Search

Author(s):  
R. Istrate ◽  
F. Scheidegger ◽  
G. Mariani ◽  
D. Nikolopoulos ◽  
C. Bekas ◽  
...  

In recent years an increasing number of researchers and practitioners have been suggesting algorithms for large-scale neural network architecture search: genetic algorithms, reinforcement learning, learning curve extrapolation, and accuracy predictors. None of them, however, demonstrated highperformance without training new experiments in the presence of unseen datasets. We propose a new deep neural network accuracy predictor, that estimates in fractions of a second classification performance for unseen input datasets, without training. In contrast to previously proposed approaches, our prediction is not only calibrated on the topological network information, but also on the characterization of the dataset-difficulty which allows us to re-tune the prediction without any training. Our predictor achieves a performance which exceeds 100 networks per second on a single GPU, thus creating the opportunity to perform large-scale architecture search within a few minutes. We present results of two searches performed in 400 seconds on a single GPU. Our best discovered networks reach 93.67% accuracy for CIFAR-10 and 81.01% for CIFAR-100, verified by training. These networks are performance competitive with other automatically discovered state-of-the-art networks however we only needed a small fraction of the time to solution and computational resources.

2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN , a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, our SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract the features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric over a large-scale challenging SPG dataset) and has magnitudes fewer parameters than both image-based and sequence-based methods.


Author(s):  
Yanlin Han ◽  
Piotr Gmytrasiewicz

This paper introduces the IPOMDP-net, a neural network architecture for multi-agent planning under partial observability. It embeds an interactive partially observable Markov decision process (I-POMDP) model and a QMDP planning algorithm that solves the model in a neural network architecture. The IPOMDP-net is fully differentiable and allows for end-to-end training. In the learning phase, we train an IPOMDP-net on various fixed and randomly generated environments in a reinforcement learning setting, assuming observable reinforcements and unknown (randomly initialized) model functions. In the planning phase, we test the trained network on new, unseen variants of the environments under the planning setting, using the trained model to plan without reinforcements. Empirical results show that our model-based IPOMDP-net outperforms the other state-of-the-art modelfree network and generalizes better to larger, unseen environments. Our approach provides a general neural computing architecture for multi-agent planning using I-POMDPs. It suggests that, in a multi-agent setting, having a model of other agents benefits our decision-making, resulting in a policy of higher quality and better generalizability.


2021 ◽  
Author(s):  
Alexei Belochitski ◽  
Vladimir Krasnopolsky

Abstract. The ability of Machine-Learning (ML) based model components to generalize to the previously unseen inputs, and the resulting stability of the models that use these components, has been receiving a lot of recent attention, especially when it comes to ML-based parameterizations. At the same time, ML-based emulators of existing parameterizations can be stable, accurate, and fast when used in the model they were specifically designed for. In this work we show that shallow-neural-network-based emulators of radiative transfer parameterizations developed almost a decade ago for a state-of-the-art GCM are robust with respect to the substantial structural and parametric change in the host model: when used in two seven month-long experiments with the new model, they not only remain stable, but generate realistic output. Aspects of neural network architecture and training set design potentially contributing to stability of ML-based model components are discussed.


2019 ◽  
Author(s):  
Jacob Witten ◽  
Zack Witten

AbstractAntimicrobial peptides (AMPs) are naturally occurring or synthetic peptides that show promise for treating antibiotic-resistant pathogens. Machine learning techniques are increasingly used to identify naturally occurring AMPs, but there is a dearth of purely computational methods to design novel effective AMPs, which would speed AMP development. We collected a large database, Giant Repository of AMP Activities (GRAMPA), containing AMP sequences and associated MICs. We designed a convolutional neural network to perform combined classification and regression on peptide sequences to quantitatively predict AMP activity against Escherichia coli. Our predictions outperformed the state of the art at AMP classification and were also effective at regression, for which there were no publicly available comparisons. We then used our model to design novel AMPs and experimentally demonstrated activity of these AMPs against the pathogens E. coli, Pseudomonas aeruginosa, and Staphylococcus aureus. Data, code, and neural network architecture and parameters are available at https://github.com/zswitten/Antimicrobial-Peptides.


Author(s):  
Mohammadreza Armandpour ◽  
Patrick Ding ◽  
Jianhua Huang ◽  
Xia Hu

Many recent network embedding algorithms use negative sampling (NS) to approximate a variant of the computationally expensive Skip-Gram neural network architecture (SGA) objective. In this paper, we provide theoretical arguments that reveal how NS can fail to properly estimate the SGA objective, and why it is not a suitable candidate for the network embedding problem as a distinct objective. We show NS can learn undesirable embeddings, as the result of the “Popular Neighbor Problem.” We use the theory to develop a new method “R-NS” that alleviates the problems of NS by using a more intelligent negative sampling scheme and careful penalization of the embeddings. R-NS is scalable to large-scale networks, and we empirically demonstrate the superiority of R-NS over NS for multi-label classification on a variety of real-world networks including social networks and language networks.


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lay in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network at its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsamplings and shortcut ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameters budget, exceeding 91% accuracy on CIFAR-10 with less than 40 k parameters in total, 74.3% on CIFAR-100 with less than 600 k parameters, and 67.1% On ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.


2021 ◽  
Author(s):  
Patrick Obin Sturm ◽  
Anthony S. Wexler

Abstract. Models of atmospheric phenomena provide insight into climate, air quality, and meteorology, and provide a mechanism for understanding the effect of future emissions scenarios. To accurately represent atmospheric phenomena, these models consume vast quantities of computational resources. Machine learning (ML) techniques such as neural networks have the potential to emulate compute-intensive components of these models to reduce their computational burden. However, such ML surrogate models may lead to nonphysical predictions that are difficult to uncover. Here we present a neural network architecture that enforces conservation laws. Instead of simply predicting properties of interest, a physically interpretable hidden layer within the network predicts fluxes between properties which are subsequently related to the properties of interest. As an example, we design a physics-constrained neural network surrogate model of photochemistry using this approach and find that it conserves atoms as they flow between molecules to machine precision, while outperforming a naïve neural network in terms of accuracy and non-negativity of concentrations.


2021 ◽  
Vol 11 (16) ◽  
pp. 7181
Author(s):  
Jakub Caputa ◽  
Daria Łukasik ◽  
Maciej Wielgosz ◽  
Michał Karwatowski ◽  
Rafał Frączek ◽  
...  

We present the experiment results to use the YOLOv3 neural network architecture to automatically detect tumor cells in cytological samples taken from the skin in canines. A rich dataset of 1219 smeared sample images with 28,149 objects was gathered and annotated by the vet doctor to perform the experiments. It covers three types of common round cell neoplasms: mastocytoma, histiocytoma, and lymphoma. The dataset has been thoroughly described in the paper and is publicly available. The YOLOv3 neural network architecture was trained using various schemes involving original dataset modification and the different model parameters. The experiments showed that the prototype model achieved 0.7416 mAP, which outperforms the state-of-the-art machine learning and human estimated results. We also provided a series of analyses that may facilitate ML-based solutions by casting more light on some aspects of its performance. We also presented the main discrepancies between ML-based and human-based diagnoses. This outline may help depict the scenarios and how the automated tools may support the diagnosis process.


Author(s):  
Daniel Ray ◽  
Tim Collins ◽  
Prasad Ponnapalli

Extracting accurate heart rate estimations from wrist-worn photoplethysmography (PPG) devices is challenging due to the signal containing artifacts from several sources. Deep Learning approaches have shown very promising results outperforming classical methods with improvements of 21% and 31% on two state-of-the-art datasets. This paper provides an analysis of several data-driven methods for creating deep neural network architectures with hopes of further improvements.


Sign in / Sign up

Export Citation Format

Share Document