Multi-Stage Meta-Learning for Few-Shot with Lie Group Network Constraint

Author(s):  
Fang Dong ◽  
Fanzhang Li

Deep learning has achieved many successes across fields, but when trainable samples are extremely limited, deep models often underfit or overfit the few available samples. Meta-learning was proposed to address the difficulties of few-shot learning and fast adaptation. A meta-learner learns to retain common knowledge by training on a large number of tasks sampled from a given data distribution, so that it can generalize when facing unseen tasks. Because samples are limited, most approaches use only shallow neural networks to avoid overfitting and to ease training, which wastes much additional information when adapting to unseen tasks. Euclidean space-based gradient descent also makes the meta-learner's updates inaccurate. These issues make it hard for many meta-learning models to extract features from samples and to update network parameters. In this paper, we propose a novel method that uses a multi-stage joint training approach to address the bottleneck in the adaptation process. To accelerate adaptation, we also constrain the network to the Stiefel manifold, so that the meta-learner can perform more stable gradient descent in a limited number of steps. Experiments on mini-ImageNet show that our method reaches better accuracy under 5-way 1-shot and 5-way 5-shot conditions.
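To make the Stiefel-manifold constraint concrete, here is a minimal NumPy sketch of one constrained gradient step. It is not the authors' implementation: the tangent-space projection, the QR retraction, the function name `stiefel_step`, the learning rate, and the random toy matrices are all assumptions chosen for illustration.

```python
import numpy as np

def stiefel_step(W, G, lr=0.1):
    """One gradient step that keeps W on the Stiefel manifold {W : W^T W = I}.

    W  : (n, p) matrix with orthonormal columns (current weights)
    G  : (n, p) Euclidean gradient of the loss with respect to W
    lr : step size (hypothetical value)
    """
    # Project the Euclidean gradient onto the tangent space at W.
    sym = (W.T @ G + G.T @ W) / 2.0
    riem_grad = G - W @ sym
    # Take the step, then retract back onto the manifold via QR.
    Q, R = np.linalg.qr(W - lr * riem_grad)
    # Fix column signs so that diag(R) > 0 and the retraction is unique.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

# Toy data (assumed): a random orthonormal start and a random gradient.
rng = np.random.default_rng(0)
W0, _ = np.linalg.qr(rng.standard_normal((8, 3)))
G = rng.standard_normal((8, 3))
W1 = stiefel_step(W0, G)
```

After the step, the columns of `W1` remain orthonormal, which is the property the constraint is meant to preserve across the few adaptation steps.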

Entropy ◽  
2020 ◽  
Vol 22 (6) ◽  
pp. 625
Author(s):  
Fang Dong ◽  
Li Liu ◽  
Fanzhang Li

Deep learning has achieved many successes in different fields but can overfit when labeled samples are insufficient. To address learning with limited training data, meta-learning was proposed to retain common knowledge by leveraging a large number of similar few-shot tasks and learning how to adapt a base-learner to a new task for which only a few labeled samples are available. Current meta-learning approaches typically use Shallow Neural Networks (SNNs) to avoid overfitting, thus wasting much information when adapting to a new task. Moreover, the Euclidean space-based gradient descent in existing meta-learning approaches always leads to an inaccurate update of meta-learners, which makes it hard for meta-learning models to extract features from samples and update network parameters. In this paper, we propose a novel meta-learning model called Multi-Stage Meta-Learning (MSML) to address the bottleneck in the adaptation process. The proposed method constrains a network to a Stiefel manifold so that a meta-learner can perform more stable gradient descent in a limited number of steps, which accelerates adaptation. An experiment on mini-ImageNet demonstrates that the proposed method reaches better accuracy under 5-way 1-shot and 5-way 5-shot conditions.


2020 ◽  
Vol 498 (4) ◽  
pp. 5620-5628
Author(s):  
Y Su ◽  
Y Zhang ◽  
G Liang ◽  
J A ZuHone ◽  
D J Barnes ◽  
...  

ABSTRACT The origin of the diverse population of galaxy clusters remains an unexplained aspect of large-scale structure formation and cluster evolution. We present a novel method of using X-ray images to identify cool core (CC), weak cool core (WCC), and non-cool core (NCC) clusters of galaxies that are defined by their central cooling times. We employ a convolutional neural network, ResNet-18, which is commonly used for image analysis, to classify clusters. We produce mock Chandra X-ray observations for a sample of 318 massive clusters drawn from the IllustrisTNG simulations. The network is trained and tested with low-resolution mock Chandra images covering a central 1 Mpc square for the clusters in our sample. Without any spectral information, the deep learning algorithm is able to identify CC, WCC, and NCC clusters, achieving balanced accuracies (BAcc) of 92 per cent, 81 per cent, and 83 per cent, respectively. The performance is superior to classification by conventional methods using central gas densities, with an average BAcc = 81 per cent, or surface brightness concentrations, giving BAcc = 73 per cent. We use class activation mapping to localize discriminative regions for the classification decision. From this analysis, we observe that the network has utilized regions from cluster centres out to r ≈ 300 kpc and r ≈ 500 kpc to identify CC and NCC clusters, respectively. It may have recognized features in the intracluster medium that are associated with AGN feedback and disruptive major mergers.
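The abstract reports a balanced accuracy per class but does not define it. A common definition, assumed here, is the mean of sensitivity and specificity when one class is treated as positive and the other two as negative. The sketch below applies that definition to made-up labels; the function name and the toy predictions are illustrative only and are not data from the paper.

```python
def balanced_accuracy(y_true, y_pred, positive):
    """One-vs-rest balanced accuracy: (sensitivity + specificity) / 2."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2

# Made-up labels for six clusters (not results from the paper).
y_true = ["CC", "CC", "WCC", "WCC", "NCC", "NCC"]
y_pred = ["CC", "WCC", "WCC", "WCC", "NCC", "CC"]

bacc = {c: balanced_accuracy(y_true, y_pred, c) for c in ("CC", "WCC", "NCC")}
print(bacc)
```

Averaging the two rates keeps a rare class from being swamped by the majority, which matters when CC, WCC, and NCC clusters are not equally represented.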


2021 ◽  
Author(s):  
Qihang Wang ◽  
Feng Liu ◽  
Guihong Wan ◽  
Ying Chen

Abstract Monitoring the depth of unconsciousness during anesthesia is useful both in clinical settings and in neuroscience investigations of brain mechanisms. The electroencephalogram (EEG) has been used as an objective means of characterizing, in real time, the altered states of arousal and/or cognition induced by anesthetics. Different general anesthetics affect cerebral electrical activity in different ways. However, the performance of conventional machine learning models on EEG data is unsatisfactory because of the low signal-to-noise ratio (SNR) of EEG signals, especially in the office-based anesthesia EEG setting. Deep learning models have been widely used in the field of brain-computer interfaces (BCI) for classification and pattern recognition tasks because they generalize well and handle noise. Compared with other BCI applications, where deep learning has demonstrated encouraging results, deep learning for classifying brain consciousness states under anesthesia has been much less investigated. In this paper, we propose a new framework based on meta-learning with deep neural networks, named Anes-MetaNet, to classify brain states under anesthetics. Anes-MetaNet is composed of Convolutional Neural Networks (CNN) to extract power spectrum features, a time-sequence model based on Long Short-Term Memory (LSTM) networks to capture temporal dependencies, and a meta-learning framework to handle large cross-subject variability. We used a multi-stage training paradigm to improve performance, which is justified by visualizing the high-level feature mapping. Experiments on an office-based anesthesia EEG dataset demonstrate the effectiveness of the proposed Anes-MetaNet in comparison with existing methods.


2021 ◽  
Author(s):  
Christian Bergler ◽  
Manuel Schmitt ◽  
Andreas Maier ◽  
Helena Symonds ◽  
Paul Spong ◽  
...  

GigaScience ◽  
2021 ◽  
Vol 10 (6) ◽  
Author(s):  
Sen Li ◽  
Zeyu Du ◽  
Xiangjie Meng ◽  
Yang Zhang

Abstract Motivation: Malaria, a mosquito-borne infectious disease affecting humans and other animals, is widespread in tropical and subtropical regions. Microscopy is the most common method for diagnosing the malaria parasite from stained blood smear samples. However, this technique is time consuming and must be performed by a well-trained professional, yet it remains prone to errors. Distinguishing the multiple growth stages of parasites remains an especially challenging task. Results: In this article, we develop a novel deep learning approach for the recognition of malaria parasites of various stages in blood smear images using a deep transfer graph convolutional network (DTGCN). To our knowledge, this is the first application of a graph convolutional network (GCN) to multi-stage malaria parasite recognition in such images. The proposed DTGCN model is based on unsupervised learning by transferring knowledge learnt from source images that contain the discriminative morphology characteristics of multi-stage malaria parasites. This transferred information guarantees the effectiveness of the target parasite recognition. This approach first learns the identical representations from the source to establish topological correlations between source class groups and the unlabelled target samples. At this stage, the GCN is implemented to extract graph feature representations for multi-stage malaria parasite recognition. The proposed method showed higher accuracy and effectiveness on publicly available microscopic images of multi-stage malaria parasites compared to a wide range of state-of-the-art approaches. Furthermore, this method is also evaluated on a large-scale dataset of unseen malaria parasites and the Babesia dataset. Availability: Code and dataset are available at https://github.com/senli2018/DTGCN_2021 under an MIT license.


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0249318
Author(s):  
Sung-Bae Cho ◽  
Jin-Young Kim

Urban mobility is a vital aspect of any city and often influences its physical shape as well as its level of economic and social development. A thorough analysis of mobility patterns in urban areas can provide various benefits, such as the prediction of traffic flow and public transportation usage. In particular, based on its exceptional ability to extract patterns from complex large-scale data, embedding based on deep learning is a promising method for analyzing the mobility patterns of urban residents. However, as urban mobility becomes increasingly complex, it becomes difficult to embed patterns into a single vector because of its limited capacity. In this paper, we propose a novel method for analyzing urban mobility based on deep learning. The proposed method involves clustering mobility patterns and embedding them to capture their implicit meaning. Clustering groups mobility patterns based on their spatiotemporal characteristics, and embedding provides meaningful information regarding both individual residents (i.e., personalized mobility) and all residents as a whole, enabling a more effective analysis of mobility patterns. Experiments were performed to predict the successive points of interest (POIs) based on transportation data collected from 1.5 million citizens in a large metropolitan city; the results demonstrate that the proposed method achieves top-1, 3, and 5 accuracies of 73.64%, 88.65%, and 91.54%, respectively, which are much higher than those of the conventional method (59.48%, 75.85%, and 80.1%, respectively). We also demonstrate that the proposed method facilitates the analysis of urban mobility through arithmetic operations between POI vectors.
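The top-1, top-3, and top-5 figures above count a prediction as correct when the true next POI is among the k highest-scoring candidates. A minimal NumPy sketch of that metric follows; the function name, the score matrix, and the target indices are made-up illustrations, not the paper's data or code.

```python
import numpy as np

def top_k_accuracy(scores, targets, k):
    """Fraction of rows whose true index is among the k highest scores."""
    # Sort each row by descending score and keep the first k indices.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == targets[:, None]).any(axis=1)
    return hits.mean()

# Made-up scores for 3 trips over 4 candidate POIs (one row per trip).
scores = np.array([
    [0.10, 0.60, 0.25, 0.05],
    [0.50, 0.05, 0.30, 0.15],
    [0.20, 0.30, 0.10, 0.40],
])
targets = np.array([1, 2, 0])  # index of the true next POI for each trip

acc1 = top_k_accuracy(scores, targets, 1)
acc2 = top_k_accuracy(scores, targets, 2)
acc3 = top_k_accuracy(scores, targets, 3)
```

In this toy case the true POI is ranked first, second, and third across the three trips, so the accuracy rises from 1/3 to 2/3 to 1 as k grows, mirroring the ordering top-1 ≤ top-3 ≤ top-5 in the reported results.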


2008 ◽  
Vol 59 (11) ◽  
Author(s):  
Iulia Lupan ◽  
Sergiu Chira ◽  
Maria Chiriac ◽  
Nicolae Palibroda ◽  
Octavian Popescu

Amino acids are obtained by bacterial fermentation, extraction from natural protein, or enzymatic synthesis from specific substrates. With the introduction of recombinant DNA technology, it has become possible to apply more rational approaches to the enzymatic synthesis of amino acids. Aspartase (L-aspartate ammonia-lyase) catalyzes the reversible deamination of L-aspartic acid to yield fumaric acid and ammonia. It is one of the most important industrial enzymes used to produce L-aspartic acid on a large scale. Here we describe a novel method for [15N] L-aspartic acid synthesis from fumarate and ammonia (15NH4Cl) using a recombinant aspartase.


2020 ◽  
Author(s):  
Anusha Ampavathi ◽  
Vijaya Saradhi T

Big data approaches are generally helpful in the healthcare and biomedical sectors for predicting disease. For minor symptoms, it is difficult to see a doctor at the hospital at any time. Big data can therefore provide essential information about diseases on the basis of a patient’s symptoms. For many medical organizations, disease prediction is important for making the best feasible health care decisions. However, the conventional medical care model takes structured input, which requires more accurate and consistent prediction. This paper develops multi-disease prediction using an improved deep learning concept. Datasets pertaining to diabetes, hepatitis, lung cancer, liver tumor, heart disease, Parkinson’s disease, and Alzheimer’s disease are gathered from the benchmark UCI repository for the experiments. The proposed model involves three phases: (a) data normalization, (b) weighted normalized feature extraction, and (c) prediction. First, the dataset is normalized to bring each attribute's range to a common level. Next, weighted feature extraction is performed, in which each attribute value is multiplied by a weight function to produce large-scale deviation. The weight function is optimized using a combination of two meta-heuristic algorithms, termed the Jaya Algorithm-based Multi-Verse Optimization algorithm (JA-MVO). The optimally extracted features are fed to a hybrid of deep learning algorithms, a Deep Belief Network (DBN) and a Recurrent Neural Network (RNN). As a modification to the hybrid deep learning architecture, the weights of both the DBN and the RNN are optimized using the same hybrid optimization algorithm. Finally, a comparative evaluation of the proposed prediction against existing models confirms its effectiveness across various performance measures.
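The first two phases, normalization followed by weighting, can be sketched in a few lines of NumPy. The abstract does not say which normalization is used, so min-max scaling is an assumption here, and the fixed weight vector is a hypothetical stand-in for the weights that the paper obtains with JA-MVO.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each attribute (column) to the range [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def weight_features(X_norm, weights):
    """Multiply each normalized attribute by its weight."""
    return X_norm * weights

# Made-up records with two attributes on different scales.
X = np.array([[1.0, 10.0],
              [3.0, 30.0],
              [5.0, 20.0]])
weights = np.array([2.0, 0.5])  # hypothetical; the paper optimizes these

X_norm = min_max_normalize(X)
X_weighted = weight_features(X_norm, weights)
```

The weighted matrix would then be passed to the prediction phase; the DBN and RNN themselves are outside the scope of this sketch.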


2017 ◽  
Vol 14 (9) ◽  
pp. 1513-1517 ◽  
Author(s):  
Rodrigo F. Berriel ◽  
Andre Teixeira Lopes ◽  
Alberto F. de Souza ◽  
Thiago Oliveira-Santos

Author(s):  
Mathieu Turgeon-Pelchat ◽  
Samuel Foucher ◽  
Yacine Bouroubi
