IPOMDP-Net: A Deep Neural Network for Partially Observable Multi-Agent Planning Using Interactive POMDPs

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016062 ◽

2019 ◽

Vol 33 ◽

pp. 6062-6069 ◽

Cited By ~ 1

Author(s):

Yanlin Han ◽

Piotr Gmytrasiewicz

Keyword(s):

Neural Network ◽

Network Architecture ◽

State Of The Art ◽

Neural Computing ◽

Neural Network Architecture ◽

Markov Decision ◽

Planning Algorithm ◽

Multi Agent ◽

Partially Observable ◽

Multi Agent Planning

This paper introduces the IPOMDP-net, a neural network architecture for multi-agent planning under partial observability. It embeds an interactive partially observable Markov decision process (I-POMDP) model and a QMDP planning algorithm that solves the model in a neural network architecture. The IPOMDP-net is fully differentiable and allows for end-to-end training. In the learning phase, we train an IPOMDP-net on various fixed and randomly generated environments in a reinforcement learning setting, assuming observable reinforcements and unknown (randomly initialized) model functions. In the planning phase, we test the trained network on new, unseen variants of the environments under the planning setting, using the trained model to plan without reinforcements. Empirical results show that our model-based IPOMDP-net outperforms a state-of-the-art model-free network and generalizes better to larger, unseen environments. Our approach provides a general neural computing architecture for multi-agent planning using I-POMDPs. It suggests that, in a multi-agent setting, having a model of other agents benefits our decision-making, resulting in a policy of higher quality and better generalizability.
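As a rough illustration of the QMDP step named in this abstract (not the authors' network, which learns its model end to end), the sketch below runs value iteration on a tiny hand-made MDP and then scores each action by weighting the resulting Q-values with a belief over states. The two-state model, its numbers, and all function names are hypothetical.

```python
def value_iteration(T, R, gamma=0.9, iters=200):
    """Return Q[s][a] for a fully observable MDP.

    T[s][a][s2] is a transition probability, R[s][a] an immediate reward.
    """
    n_s, n_a = len(R), len(R[0])
    V = [0.0] * n_s
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        Q = [[R[s][a] + gamma * sum(T[s][a][s2] * V[s2] for s2 in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        V = [max(row) for row in Q]
    return Q


def qmdp_action(belief, Q):
    """QMDP rule: pick the action with the best belief-weighted MDP Q-value."""
    n_a = len(Q[0])
    scores = [sum(b * Q[s][a] for s, b in enumerate(belief)) for a in range(n_a)]
    best = max(range(n_a), key=scores.__getitem__)
    return best, scores


# Hypothetical toy model: the state never changes, and action i pays 1 in state i.
T = [[[1.0, 0.0], [1.0, 0.0]],
     [[0.0, 1.0], [0.0, 1.0]]]
R = [[1.0, 0.0],
     [0.0, 1.0]]

Q = value_iteration(T, R)
action_mostly_s0, _ = qmdp_action([0.8, 0.2], Q)
action_mostly_s1, _ = qmdp_action([0.1, 0.9], Q)
```

QMDP assumes all state uncertainty disappears after one step, which is why it only needs the MDP Q-values and the current belief. That makes it cheap, at the cost of never choosing actions purely to gather information.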


Towards Heterogeneous Multi-Agent Reinforcement Learning with Graph Neural Networks

10.5753/eniac.2020.12161 ◽

2020 ◽

Author(s):

Douglas Meneghetti ◽

Reinaldo Bianchi

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Communication Channels ◽

Neural Network Architecture ◽

Graph Representations ◽

Labeled Graph ◽

Multiple Agent ◽

Multi Agent ◽

Graph Neural Networks

This work proposes a neural network architecture that learns policies for multiple agent classes in a heterogeneous multi-agent reinforcement setting. The proposed network uses directed labeled graph representations for states, encodes feature vectors of different sizes for different entity classes, uses relational graph convolution layers to model different communication channels between entity types and learns distinct policies for different agent classes, sharing parameters wherever possible. Results have shown that specializing the communication channels between entity classes is a promising step to achieve higher performance in environments composed of heterogeneous entities.
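To make the idea of relation-specific communication channels concrete, here is a minimal sketch of one relational graph convolution step. It is a simplification under stated assumptions: node features are single floats rather than vectors of class-dependent size, each relation has one scalar weight, and the entity names, relation labels, and weights are all hypothetical.

```python
def rgcn_layer(features, edges, rel_weights, self_weight):
    """One relational graph convolution step on scalar node features.

    features:    {node: float}
    edges:       list of (source, target, relation) directed, labeled edges
    rel_weights: {relation: float}, one weight per edge label
    self_weight: float applied to the node's own feature
    """
    out = {}
    for node, h in features.items():
        total = self_weight * h
        for rel, w in rel_weights.items():
            incoming = [features[s] for s, d, r in edges if d == node and r == rel]
            if incoming:
                # Mean over neighbours reached through this relation only.
                total += w * sum(incoming) / len(incoming)
        out[node] = max(0.0, total)  # ReLU
    return out


# Hypothetical scene: two agents and one obstacle.
features = {"agent_a": 1.0, "agent_b": 2.0, "obstacle": 4.0}
edges = [
    ("agent_b", "agent_a", "agent_to_agent"),
    ("obstacle", "agent_a", "entity_to_agent"),
]
rel_weights = {"agent_to_agent": 0.5, "entity_to_agent": 0.25}

updated = rgcn_layer(features, edges, rel_weights, self_weight=1.0)
```

Because each edge label has its own weight, messages from another agent and messages from an obstacle are transformed differently before being summed, which is the sense in which the channels are specialized.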


Robustness of Neural Network Emulations of Radiative Transfer Parameterizations in a State-of-the-Art General Circulation Model

10.5194/gmd-2021-114 ◽

2021 ◽

Author(s):

Alexei Belochitski ◽

Vladimir Krasnopolsky

Keyword(s):

Neural Network ◽

Radiative Transfer ◽

Network Architecture ◽

General Circulation ◽

State Of The Art ◽

Circulation Model ◽

Set Design ◽

Neural Network Architecture ◽

Parametric Change ◽

Model Components

Abstract. The ability of machine-learning (ML) based model components to generalize to previously unseen inputs, and the resulting stability of the models that use these components, has been receiving a lot of recent attention, especially when it comes to ML-based parameterizations. At the same time, ML-based emulators of existing parameterizations can be stable, accurate, and fast when used in the model they were specifically designed for. In this work we show that shallow-neural-network-based emulators of radiative transfer parameterizations developed almost a decade ago for a state-of-the-art GCM are robust with respect to substantial structural and parametric changes in the host model: when used in two seven-month-long experiments with the new model, they not only remain stable but also generate realistic output. Aspects of neural network architecture and training set design that potentially contribute to the stability of ML-based model components are discussed.


Deep learning regression model for antimicrobial peptide design

10.1101/692681 ◽

2019 ◽

Cited By ~ 3

Author(s):

Jacob Witten ◽

Zack Witten

Keyword(s):

Neural Network ◽

Antimicrobial Peptides ◽

Network Architecture ◽

State Of The Art ◽

Machine Learning Techniques ◽

Neural Network Architecture ◽

Antibiotic Resistant ◽

E Coli ◽

Naturally Occurring ◽

Learning Techniques

Abstract. Antimicrobial peptides (AMPs) are naturally occurring or synthetic peptides that show promise for treating antibiotic-resistant pathogens. Machine learning techniques are increasingly used to identify naturally occurring AMPs, but there is a dearth of purely computational methods to design novel effective AMPs, which would speed AMP development. We collected a large database, Giant Repository of AMP Activities (GRAMPA), containing AMP sequences and associated MICs. We designed a convolutional neural network to perform combined classification and regression on peptide sequences to quantitatively predict AMP activity against Escherichia coli. Our predictions outperformed the state of the art at AMP classification and were also effective at regression, for which there were no publicly available comparisons. We then used our model to design novel AMPs and experimentally demonstrated activity of these AMPs against the pathogens E. coli, Pseudomonas aeruginosa, and Staphylococcus aureus. Data, code, and neural network architecture and parameters are available at https://github.com/zswitten/Antimicrobial-Peptides.


Image Classification for Vehicle Type Dataset Using State-of-the-art Convolutional Neural Network Architecture

Proceedings of the 2018 Artificial Intelligence and Cloud Computing Conference - AICCC '18 ◽

10.1145/3299819.3299822 ◽

2018 ◽

Author(s):

Yian Seo ◽

Kyung-shik Shin

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Image Classification ◽

Network Architecture ◽

State Of The Art ◽

Neural Network Architecture ◽

Vehicle Type


ThriftyNets: Convolutional Neural Networks with Tiny Parameter Budget

IoT ◽

10.3390/iot2020012 ◽

2021 ◽

Vol 2 (2) ◽

pp. 222-235

Author(s):

Guillaume Coiffier ◽

Ghouthi Boukli Hacene ◽

Vincent Gripon

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Convolutional Neural Network ◽

Spatial Resolution ◽

Network Architecture ◽

Deep Neural Networks ◽

State Of The Art ◽

Feature Maps ◽

Neural Network Architecture

Deep Neural Networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsamplings and shortcuts ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
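The core idea of reusing a single convolutional layer recursively can be sketched in a few lines. This is a toy 1-D version under stated assumptions, not the ThriftyNet implementation: it keeps only the shared kernel, a ReLU, and a shortcut connection, and omits the normalization and downsampling the abstract mentions. The kernel values and function names are hypothetical.

```python
def conv1d_same(x, kernel):
    """1-D sliding-window filter with zero padding (odd kernel length),
    so the output has the same length as the input."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]


def relu(x):
    return [max(0.0, v) for v in x]


def thrifty_forward(x, kernel, iterations):
    """Apply the SAME kernel `iterations` times with a shortcut connection."""
    h = list(x)
    for _ in range(iterations):
        update = relu(conv1d_same(h, kernel))
        h = [a + b for a, b in zip(h, update)]  # shortcut: add input to update
    return h


kernel = [0.25, 0.5, 0.25]  # the only weights in this toy model
x = [0.0, 1.0, 0.0, 0.0]

shallow = thrifty_forward(x, kernel, iterations=1)
deep = thrifty_forward(x, kernel, iterations=8)
n_params = len(kernel)  # unchanged however many iterations are unrolled
```

Unrolling more iterations increases effective depth and computation but leaves the parameter count fixed, which matches the trade-off the abstract describes: a tiny parameter budget paid for with more computation.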


ModGNN: Expert Policy Approximation in Multi-Agent Systems with a Modular Graph Neural Network Architecture

10.1109/icra48506.2021.9561386 ◽

2021 ◽

Author(s):

Ryan Kortvelesy ◽

Amanda Prorok

Keyword(s):

Neural Network ◽

Network Architecture ◽

Multi Agent Systems ◽

Neural Network Architecture ◽

Agent Systems ◽

Multi Agent


Fast Pre-Diagnosis of Neoplastic Changes in Cytology Images Using Machine Learning

Applied Sciences ◽

10.3390/app11167181 ◽

2021 ◽

Vol 11 (16) ◽

pp. 7181

Author(s):

Jakub Caputa ◽

Daria Łukasik ◽

Maciej Wielgosz ◽

Michał Karwatowski ◽

Rafał Frączek ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Network Architecture ◽

State Of The Art ◽

Prototype Model ◽

Model Parameters ◽

Round Cell ◽

Neural Network Architecture ◽

Original Dataset ◽

Automated Tools

We present experimental results on using the YOLOv3 neural network architecture to automatically detect tumor cells in cytological samples taken from canine skin. A rich dataset of 1219 smeared sample images with 28,149 objects was gathered and annotated by a veterinarian to perform the experiments. It covers three types of common round cell neoplasms: mastocytoma, histiocytoma, and lymphoma. The dataset is thoroughly described in the paper and is publicly available. The YOLOv3 neural network architecture was trained using various schemes involving modifications of the original dataset and different model parameters. The experiments showed that the prototype model achieved 0.7416 mAP, which outperforms the state-of-the-art machine learning and human estimated results. We also provide a series of analyses that may facilitate ML-based solutions by casting more light on some aspects of their performance, and we present the main discrepancies between ML-based and human-based diagnoses. This outline may help depict the scenarios in which automated tools can support the diagnosis process.


Deep Neural Network Architecture Search for Wearable Heart Rate Estimations

Studies in Health Technology and Informatics - Public Health and Informatics ◽

10.3233/shti210366 ◽

2021 ◽

Author(s):

Daniel Ray ◽

Tim Collins ◽

Prasad Ponnapalli

Keyword(s):

Neural Network ◽

Heart Rate ◽

Deep Learning ◽

Network Architecture ◽

Deep Neural Network ◽

State Of The Art ◽

Data Driven ◽

Network Architectures ◽

Learning Approaches ◽

Neural Network Architecture

Extracting accurate heart rate estimations from wrist-worn photoplethysmography (PPG) devices is challenging due to the signal containing artifacts from several sources. Deep Learning approaches have shown very promising results outperforming classical methods with improvements of 21% and 31% on two state-of-the-art datasets. This paper provides an analysis of several data-driven methods for creating deep neural network architectures with hopes of further improvements.


Predicting Retrosynthetic Reaction using Self-Corrected Transformer Neural Networks

10.26434/chemrxiv.8427776.v1 ◽

2019 ◽

Author(s):

Shuangjia Zheng ◽

Jiahua Rao ◽

Zhongyue Zhang ◽

Jun Xu ◽

Yuedong Yang

Keyword(s):

Neural Network ◽

Network Architecture ◽

State Of The Art ◽

Neural Network Architecture ◽

Training Set ◽

Computer Aided ◽

Synthesis Planning ◽

Target Molecules ◽

Template Free ◽

Synthetic Routes

Synthesis planning is the process of recursively decomposing target molecules into available precursors. Computer-aided retrosynthesis can potentially assist chemists in designing synthetic routes, but at present it is cumbersome and provides results of unsatisfactory quality. In this study, we develop a template-free self-corrected retrosynthesis predictor (SCROP) to perform a retrosynthesis prediction task, trained using the Transformer neural network architecture. In this method, retrosynthesis planning is cast as a machine translation problem between the molecular linear notations of reactants and products. Coupled with a neural network-based syntax corrector, our method achieves an accuracy of 59.0% on a standard benchmark dataset, which is an increase of >21% over other deep learning methods and >6% over template-based methods. More importantly, our method shows an accuracy 1.7 times higher than other state-of-the-art methods for compounds not appearing in the training set.


TAPAS: Train-Less Accuracy Predictor for Architecture Search

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013927 ◽

2019 ◽

Vol 33 ◽

pp. 3927-3934 ◽

Cited By ~ 4

Author(s):

R. Istrate ◽

F. Scheidegger ◽

G. Mariani ◽

D. Nikolopoulos ◽

C. Bekas ◽

...

Keyword(s):

Neural Network ◽

Network Architecture ◽

Large Scale ◽

State Of The Art ◽

Classification Performance ◽

Neural Network Architecture ◽

Network Information ◽

Topological Network ◽

Computational Resources

In recent years an increasing number of researchers and practitioners have been suggesting algorithms for large-scale neural network architecture search: genetic algorithms, reinforcement learning, learning curve extrapolation, and accuracy predictors. None of them, however, demonstrated high performance without training new experiments in the presence of unseen datasets. We propose a new deep neural network accuracy predictor that estimates, in fractions of a second, classification performance for unseen input datasets without training. In contrast to previously proposed approaches, our prediction is calibrated not only on the topological network information but also on a characterization of the dataset difficulty, which allows us to re-tune the prediction without any training. Our predictor achieves a performance which exceeds 100 networks per second on a single GPU, thus creating the opportunity to perform large-scale architecture search within a few minutes. We present results of two searches performed in 400 seconds on a single GPU. Our best discovered networks reach 93.67% accuracy for CIFAR-10 and 81.01% for CIFAR-100, verified by training. These networks are competitive in performance with other automatically discovered state-of-the-art networks; however, we needed only a small fraction of the time to solution and computational resources.
