Information-Theoretic Generalization Bounds for Meta-Learning and Applications

Entropy ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. 126
Author(s):  
Sharu Theresa Jose ◽  
Osvaldo Simeone

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model-agnostic meta-learning (MAML), or joint within-task training and test sets, like Reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and the corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual-task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
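For reference, the conventional-learning mutual-information bound that this class of results extends can be sketched as follows; the notation (output hypothesis W, n-sample data set S, σ-sub-Gaussian loss, population loss L_μ and empirical loss L_S) is a standard formulation, not the paper's own meta-learning bound:

```latex
% Sketch of the single-task MI generalization bound that the meta-learning
% bounds generalize: W is the learner's output, S = (Z_1,...,Z_n) the data set,
% and the loss is assumed sigma-sub-Gaussian under the data distribution.
\left| \, \mathbb{E}\big[ L_{\mu}(W) - L_{S}(W) \big] \, \right|
  \;\le\; \sqrt{ \frac{2\sigma^{2}}{n} \, I(W; S) }
```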

Author(s):  
Usman Ahmed ◽  
Jerry Chun-Wei Lin ◽  
Gautam Srivastava

Deep learning methods have led to state-of-the-art medical applications such as image classification and segmentation, and data-driven deep learning applications can help stakeholders collaborate. However, the limited availability of labelled data restricts the ability of a deep learning algorithm to generalize from one domain to another. Meta-learning addresses this problem by learning from a small set of data. We propose a meta-learning-based image segmentation model that combines the learning of a state-of-the-art model and then uses it to achieve domain adaptation and high accuracy. We also propose a preprocessing algorithm to increase the usability of the segmented parts and to remove noise from new test images. The proposed model achieves 0.94 precision and 0.92 recall, an improvement of 3.3% over state-of-the-art algorithms.
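As an illustration of how the reported precision and recall could be evaluated for binary segmentation masks, here is a minimal Python sketch; the function name and the pixel-wise formulation are assumptions, not the authors' evaluation code:

```python
import numpy as np

def precision_recall(pred_mask: np.ndarray, true_mask: np.ndarray):
    """Pixel-wise precision and recall for binary segmentation masks (illustrative)."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    tp = np.logical_and(pred, true).sum()    # predicted foreground that is correct
    fp = np.logical_and(pred, ~true).sum()   # predicted foreground that is wrong
    fn = np.logical_and(~pred, true).sum()   # missed foreground
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall
```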


2020 ◽  
pp. 609-623
Author(s):  
Arun Kumar Beerala ◽  
Gobinath R. ◽  
Shyamala G. ◽  
Siribommala Manvitha

Water is the most valuable natural resource for all living things and the ecosystem. The quality of groundwater changes with changes in the ecosystem, industrialisation, urbanisation, etc. In this study, 60 samples were taken and analysed for various physico-chemical parameters. The sampling locations were recorded using the global positioning system (GPS), and samples were taken for two consecutive years, 2016-2017 and 2017-2018, in two different seasons: monsoon (Nov-Dec) and post-monsoon (Jan-Mar). pH, EC, and TDS were measured in the field. Hardness and chloride were determined using the titration method, while nitrate and sulphate were determined using a spectrophotometer. Machine learning techniques were used to train on the data set and to predict unknown values. The dominant ions in the groundwater are Ca²⁺ and Mg²⁺ among cations and Cl⁻, SO₄²⁻, and NO₃⁻ among anions. The regression value for the training data set was found to be 0.90596, and for the entire network it was 0.81729. The best performance, 0.0022605, was observed at epoch 223.
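As a rough illustration of training a regression model on such physico-chemical measurements, the following Python sketch uses a small neural network from scikit-learn; the feature layout and placeholder arrays are assumptions and do not reproduce the study's network or data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Placeholder arrays standing in for the 60 samples: columns would be measured
# parameters (pH, EC, TDS, hardness, chloride, sulphate, ...), target a predicted ion.
X = np.random.rand(60, 7)
y = np.random.rand(60)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print("R^2 on training set:", model.score(X_train, y_train))
print("R^2 on test set:", model.score(X_test, y_test))
```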


2018 ◽  
Vol 7 (04) ◽  
pp. 871-888 ◽  
Author(s):  
Sophie J. Lee ◽  
Howard Liu ◽  
Michael D. Ward

Improving geolocation accuracy in text data has long been a goal of automated text processing. We depart from the conventional method and introduce a two-stage supervised machine-learning algorithm that classifies each location mention as either correct or incorrect. We extract contextual information from the texts, i.e., N-gram patterns for location words, mention frequency, and the context of the sentences containing location words. We then estimate model parameters using a training data set and use this model to predict whether a location word in the test data set accurately represents the location of an event. We demonstrate these steps by constructing customized geolocation event data at the subnational level using news articles collected from around the world. The results show that the proposed algorithm outperforms existing geocoders even in a case added post hoc to test the generality of the developed algorithm.
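The classification stage of such a pipeline can be sketched in Python as follows; the features here (bag of word/bigram context around each mention) and the toy examples are simplifications for illustration, not the authors' feature set or implementation:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data: the sentence context around each location mention,
# labelled 1 if the mention matches the true event location, else 0.
contexts = [
    "clashes erupted near the central market in the provincial capital on Friday",
    "the report on the incident was later written at the agency headquarters",
]
labels = [1, 0]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(contexts, labels)
print(clf.predict(["protests were reported in the district center yesterday"]))
```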


2011 ◽  
Vol 21 (03) ◽  
pp. 247-263 ◽  
Author(s):  
J. P. FLORIDO ◽  
H. POMARES ◽  
I. ROJAS

In function approximation problems, one of the most common ways to evaluate a learning algorithm is to partition the original data set (input/output data) into two sets: a learning set, used for building models, and a test set, used for genuine out-of-sample evaluation. When the partition into learning and test sets does not take into account the variability and geometry of the original data, it can lead to unbalanced and unrepresentative learning and test sets and, thus, to wrong conclusions about the accuracy of the learning algorithm. How the partitioning is made is therefore a key issue, and it becomes more important when the data set is small, owing to the need to reduce the pessimistic effects caused by removing instances from the original data set. In this work, we propose a deterministic data mining approach for distributing a data set (input/output data) into two representative and balanced sets of roughly equal size, taking the variability of the data set into consideration, with the purpose of allowing both a fair evaluation of the learning algorithm's accuracy and reproducible machine learning experiments, which are usually based on random partitions. The sets are generated by combining a clustering procedure, especially suited for function approximation problems, with a distribution algorithm that splits the data within each cluster into the two sets using a nearest-neighbor approach. In the experiments section, the performance of the proposed methodology is reported in a variety of situations through an ANOVA-based statistical study of the results.
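A simplified analogue of the cluster-then-distribute idea can be sketched in Python as follows; KMeans, the pairing rule, and the function name are assumptions used for illustration, not the paper's exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.spatial.distance import cdist

def balanced_split(X, n_clusters=5, random_state=0):
    """Split X into two representative halves: cluster the data, then within each
    cluster repeatedly pair an unassigned point with its nearest unassigned
    neighbour and send one point of the pair to each half (illustrative sketch)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit_predict(X)
    set_a, set_b = [], []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0].tolist()
        while len(idx) >= 2:
            i = idx.pop(0)
            dists = cdist(X[[i]], X[idx])[0]
            j = idx.pop(int(np.argmin(dists)))
            set_a.append(i)
            set_b.append(j)
        if idx:                        # odd leftover point in this cluster
            set_a.append(idx.pop())
    return np.array(set_a), np.array(set_b)
```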


2012 ◽  
Vol 461 ◽  
pp. 818-821
Author(s):  
Shi Hu Zhang

Real estate prices are a current focus of public concern. The support vector machine (SVM) is a machine learning algorithm with excellent generalization performance and unique advantages in learning from small samples, and it is now used in many areas. Determining real estate prices is a complicated problem due to its non-linearity and the small quantity of training data. In this study, the SVM is proposed to forecast real estate prices in China. The experimental results indicate that the SVM method can achieve greater accuracy than the grey model and an artificial neural network under the circumstance of small training data. It was also found that the predictive ability of the SVM outperformed those of some traditional pattern recognition methods for the data set used here.
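A minimal sketch of SVM regression on a small training set, in the spirit of the described forecasting setup, is shown below; the features and numbers are placeholders, not the study's data:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Placeholder features per year (e.g. GDP index, population, interest rate, cost index).
X_train = np.array([[1.2, 8.1, 5.3, 2.0],
                    [1.4, 8.3, 5.1, 2.2],
                    [1.6, 8.6, 4.9, 2.5]])
y_train = np.array([4500.0, 4800.0, 5200.0])   # average price per square metre (placeholder)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X_train, y_train)
print(model.predict([[1.8, 8.8, 4.7, 2.7]]))   # forecast for a new year
```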


2012 ◽  
Vol 7 (1) ◽  
Author(s):  
Youness El Hamzaoui ◽  
J.A Hernandez ◽  
Abraham Gonzalez Roman ◽  
José Alfredo Rodríguez Ramírez

The aim of this study is to compare an artificial neural network (ANN) and an adaptive neuro-fuzzy inference system (ANFIS) for predicting the coefficient of performance (COP) of a water purification process integrated in an absorption heat transformer system with energy recycling. The ANN and ANFIS models take into account the input and output temperatures of each of the four components (absorber, generator, evaporator, and condenser), as well as two pressures and the LiBr+H2O concentrations. Experimental results are used to verify the ANN and ANFIS approaches. For the network, a feedforward architecture with one hidden layer, a Levenberg-Marquardt learning algorithm, a hyperbolic tangent sigmoid transfer function, and a linear transfer function were used. The best fit to the training data set was obtained with three neurons in the hidden layer. On the validation data set, simulations and experimental test data were in good agreement (R² > 0.9980). The ANFIS model was developed using the same input variables, and the statistical values are given in tables. A comparison between the two models shows that the ANN provides better results than the ANFIS. Finally, this paper shows the appropriateness of ANN and ANFIS for quantitative modeling with reasonable accuracy.
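The described network (one hidden layer of three neurons, tanh hidden activation, linear output) can be approximated with scikit-learn as sketched below; note that scikit-learn has no Levenberg-Marquardt solver, so L-BFGS is used as a stand-in, and the input layout and placeholder data are assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder inputs: in/out temperatures of absorber, generator, evaporator and
# condenser, two pressures, and LiBr+H2O concentrations (feature count assumed).
X = np.random.rand(100, 12)
y = np.random.rand(100)          # COP values (placeholder)

ann = MLPRegressor(hidden_layer_sizes=(3,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
ann.fit(X, y)
print("R^2 on training data:", ann.score(X, y))
```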


2012 ◽  
Vol 66 (2) ◽  
pp. 239-246
Author(s):  
Xu Hua ◽  
Xue Hengxin ◽  
Chen Zhiguo

To overcome the tendency of traditional TSK (Takagi-Sugeno-Kang) fuzzy inference training to become trapped in local minima, this paper considers a TSK fuzzy system modeling approach based on the principles of the visual system and Weber's law. This approach not only exploits the strong object-identification capability of the human eye, but also takes the distribution structure of the training data set into account during parameter regulation. To overcome the slow convergence of the gradient-based learning algorithm it adopts, a novel visual TSK fuzzy system model based on evolutionary learning is proposed by introducing the particle swarm optimization algorithm. The main advantages of this method are its good optimization behaviour, strong noise immunity, and good interpretability. The new method is applied to long-term hydrological forecasting examples. The simulation results show that the method is feasible and effective: it not only inherits the advantages of traditional visual TSK fuzzy models but also achieves better global convergence and accuracy than the traditional model.
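As a sketch of the evolutionary component, a minimal particle swarm optimizer is shown below in Python, applied to a toy quadratic loss standing in for the TSK parameter-fitting objective; the hyperparameters and objective are illustrative assumptions:

```python
import numpy as np

def pso(objective, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimal particle swarm optimizer: `objective` maps a parameter vector to a loss."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())

# Toy usage: minimize a quadratic standing in for a fuzzy-model fitting loss.
best_params, best_loss = pso(lambda p: float(np.sum((p - 1.0) ** 2)), dim=4)
print(best_params, best_loss)
```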


Author(s):  
S. Spiegel ◽  
J. Chen

Abstract. Deep neural networks (DNNs) and convolutional neural networks (CNNs) have demonstrated greater robustness and accuracy in classifying two-dimensional images and three-dimensional point clouds compared to more traditional machine learning approaches. However, their main drawback is the need for large quantities of semantically labeled training data, which are often out of reach for those with resource constraints. In this study, we evaluated the use of simulated 3D point clouds for training a CNN learning algorithm to segment and classify 3D point clouds of real-world urban environments. The simulation involved collecting light detection and ranging (LiDAR) data using a simulated 16-channel laser scanner within the CARLA (Car Learning to Act) autonomous vehicle gaming environment. We used this labeled data to train the Kernel Point Convolution (KPConv) segmentation network for point clouds (KP-FCNN), which we tested on real-world LiDAR data from the NPM3D benchmark data set. Our results showed that high accuracy can be achieved using data collected in a simulator.
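Point-cloud segmentation results on benchmarks such as NPM3D are commonly summarized with per-class intersection-over-union; the short Python sketch below shows that computation as an illustration, not necessarily the exact metric reported here:

```python
import numpy as np

def per_class_iou(pred_labels: np.ndarray, true_labels: np.ndarray, n_classes: int):
    """Per-class IoU for point-wise semantic labels (illustrative evaluation sketch)."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred_labels == c, true_labels == c).sum()
        union = np.logical_or(pred_labels == c, true_labels == c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return np.array(ious)
```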


2020 ◽  
Vol 8 (6) ◽  
pp. 4684-4688

According to statistics from the BBC, the figures vary for every earthquake recorded to date. Up to thousands of people are killed, about 50,000 are injured, and around 1-3 million are displaced, while a significant number go missing or are left homeless. Almost 100% structural damage is experienced, and economic losses range from 10 to 16 million dollars. Earthquakes of magnitude 5 and above are classified as the deadliest. The most devastating earthquake recorded to date took place in Indonesia, where about 3 million people were killed, 1-2 million were injured, and structural damage reached 100%. The consequences of an earthquake are therefore devastating and are not limited to loss of, and damage to, living beings and property; they also cause significant changes in surroundings, lifestyle, and the economy. All of these factors motivate earthquake forecasting. With a couple of minutes' notice, individuals can act to shield themselves from injury and death, harm and monetary losses can be reduced, and property and natural assets can be protected. In this work, an accurate forecaster is designed and developed: a system that forecasts the catastrophe by detecting early signs of an earthquake using machine learning algorithms. The system follows the basic steps of developing learning systems along with the data science life cycle. Data sets for the Indian subcontinent and the rest of the world are collected from government sources. Pre-processing of the data is followed by construction of a stacking model that combines Random Forest and Support Vector Machine algorithms. The algorithms build this mathematical model from the training data set; the model looks for patterns that precede a catastrophe and incorporates them during training, so that it can make decisions and forecasts without being explicitly programmed for the task. After a forecast, the message is broadcast to government officials and across various platforms. The key information to obtain is represented by three factors: time, locality, and magnitude.
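The stacking of Random Forest and Support Vector Machine base learners described above can be expressed directly with scikit-learn; the synthetic data, feature count, and meta-learner below are assumptions for illustration:

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the seismic training data (features such as latitude,
# longitude, depth and past-event statistics are assumptions, not the real set).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)
print("held-out accuracy:", stack.score(X_test, y_test))
```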


2018 ◽  
Author(s):  
Hamid Mohamadlou ◽  
Saarang Panchavati ◽  
Jacob Calvert ◽  
Anna Lynn-Palevsky ◽  
Christopher Barton ◽  
...  

Purpose: This study evaluates a machine-learning-based mortality prediction tool.
Materials and Methods: We conducted a retrospective study with data drawn from three academic health centers. Inpatients of at least 18 years of age and with at least one observation of each vital sign were included. Predictions were made at 12, 24, and 48 hours before death. Models fit to training data from each institution were evaluated on hold-out test data from the same institution and on data from the remaining institutions. Predictions were compared to those of qSOFA and MEWS using the area under the receiver operating characteristic curve (AUROC).
Results: For training and testing on data from a single institution, machine learning predictions averaged AUROCs of 0.97, 0.96, and 0.95 across institutional test sets for 12-, 24-, and 48-hour predictions, respectively. When trained and tested on data from different hospitals, the algorithm achieved AUROCs of up to 0.95, 0.93, and 0.91 for 12-, 24-, and 48-hour predictions, respectively. MEWS and qSOFA had average 48-hour AUROCs of 0.86 and 0.82, respectively.
Conclusion: This algorithm may help identify patients in need of increased levels of clinical care.
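The AUROC comparison described above corresponds to the standard area-under-the-ROC-curve computation; a minimal sketch with placeholder labels and risk scores (not the study's data) is:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Placeholder arrays: 1 = death within the prediction horizon, 0 = survival;
# scores are a model's predicted risk at 12, 24 or 48 hours before death.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.10, 0.30, 0.80, 0.70, 0.20, 0.90])
print("AUROC:", roc_auc_score(y_true, y_score))
```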

