Dataset2Vec: learning dataset meta-features

Author(s):  
Hadi S. Jomaa ◽  
Lars Schmidt-Thieme ◽  
Josif Grabocka

Abstract Meta-learning, or learning to learn, is a machine learning approach that utilizes prior learning experiences to expedite the learning process on unseen tasks. As a data-driven approach, meta-learning requires meta-features that represent the primary learning tasks or datasets, and these are traditionally estimated as engineered dataset statistics that require expert domain knowledge tailored for every meta-task. In this paper, first, we propose a meta-feature extractor called Dataset2Vec that combines the versatility of engineered dataset meta-features with the expressivity of meta-features learned by deep neural networks. Primary learning tasks or datasets are represented as hierarchical sets, i.e., as a set of sets, specifically as a set of predictor/target pairs, and a DeepSet architecture is then employed to regress meta-features on them. Second, we propose a novel auxiliary meta-learning task with abundant data, called dataset similarity learning, that aims to predict whether two batches stem from the same dataset or from different ones. In an experiment on a large-scale hyperparameter optimization task for 120 UCI datasets with varying schemas as a meta-learning task, we show that the meta-features of Dataset2Vec outperform expert-engineered meta-features, demonstrating for the first time the usefulness of learned meta-features for datasets with varying schemas.
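To make the construction concrete, here is a minimal sketch, not the authors' code, of a DeepSet-style extractor over a dataset viewed as a set of (predictor, target) pairs; the class name `Dataset2VecSketch` and all layer sizes are illustrative assumptions. Because pooling runs over instances, features, and targets, the same network handles datasets with varying schemas; training it with the dataset-similarity objective would push two batches from the same dataset toward nearby meta-feature vectors.

```python
import torch
import torch.nn as nn

class Dataset2VecSketch(nn.Module):
    def __init__(self, hidden=64, out_dim=32):
        super().__init__()
        # phi encodes each scalar (predictor, target) pair
        self.phi = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        # rho maps the pooled set representation to meta-features
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, X, y):
        # X: (n_instances, n_features), y: (n_instances, n_targets)
        n, f = X.shape
        t = y.shape[1]
        # the dataset as a set of sets: every (x_ij, y_ik) pair
        pairs = torch.stack([X.unsqueeze(2).expand(n, f, t),
                             y.unsqueeze(1).expand(n, f, t)], dim=-1)
        h = self.phi(pairs)          # encode each pair: (n, f, t, hidden)
        h = h.mean(dim=(0, 1, 2))    # permutation-invariant pooling
        return self.rho(h)           # fixed-size meta-feature vector
```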

2021 ◽  
Author(s):  
Francesco Poli ◽  
Tommaso Ghilardi ◽  
Rogier B. Mars ◽  
Max Hinne ◽  
Sabine Hunnius

Infants learn to navigate the complexity of the physical and social world at an outstanding pace, but how they accomplish this learning is still unknown. Recent advances in human and artificial intelligence research propose that a key feature to achieve quick and efficient learning is meta-learning, the ability to make use of prior experiences to optimize how future information is acquired. Here we show that 8-month-old infants successfully engage in meta-learning within very short timespans. We developed a Bayesian model that captures how infants attribute informativity to incoming events, and how this process is optimized by the meta-parameters of their hierarchical models over the task structure. We fitted the model using infants’ gaze behaviour during a learning task. Our results reveal that infants do not simply accumulate experiences, but actively use them to generate new inductive biases that allow learning to proceed faster in the future.
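As a rough illustration of the modelling idea, and not the authors' model, the sketch below scores the informativity of an incoming event as Bayesian surprise (the KL divergence between posterior and prior) for a Dirichlet-categorical learner; the prior concentration `alpha0` plays the role of a meta-parameter that could itself be tuned across tasks so that later learning proceeds faster.

```python
import numpy as np
from scipy.special import gammaln, digamma

def dirichlet_kl(a, b):
    """Closed-form KL( Dir(a) || Dir(b) )."""
    a0, b0 = a.sum(), b.sum()
    return (gammaln(a0) - gammaln(a).sum()
            - gammaln(b0) + gammaln(b).sum()
            + ((a - b) * (digamma(a) - digamma(a0))).sum())

def informativity(counts, event, alpha0=1.0):
    """Bayesian surprise of observing `event` given past `counts`:
    KL(posterior || prior) for a Dirichlet-categorical learner.
    alpha0 is the (assumed) meta-parameter tuned across tasks."""
    prior = counts + alpha0
    post = prior.copy()
    post[event] += 1.0
    return dirichlet_kl(post, prior)

# e.g. an event at a never-before-seen location is highly surprising:
# informativity(np.zeros(4), event=2)
```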


Author(s):  
Taisei Sugiyama ◽  
Nicolas Schweighofer ◽  
Jun Izawa

Abstract Reinforcement learning enables the brain to learn optimal action selection, such as go or no-go, by forming state-action and action-outcome associations. Does this mechanism also optimize the brain's willingness to learn, such as learn or not learn? Learning to learn by rewards, i.e., reinforcement meta-learning, is a crucial mechanism for machines to develop flexibility in learning, and it has also been hypothesized in the brain, but without empirical examination. Here, we show that humans learn to learn, or not to learn, to maximize rewards in visuomotor learning tasks. We also show that this regulation of learning is not a motivational bias but the result of an instrumental, active process that takes the learning-outcome structure into account. Our results thus demonstrate the existence of reinforcement meta-learning in the human brain. Because motor learning is a process of minimizing sensory errors, our findings uncover an essential mechanism of interaction between reward and error.
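A toy simulation, purely an assumption for illustration rather than the paper's model, of what reinforcement meta-learning can look like: the learning rate applied to sensory errors is itself adjusted by reward, so the agent regulates whether to learn, not just what to do.

```python
import numpy as np

def simulate(trials=200, perturb=1.0, meta_lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x, eta = 0.0, 0.2            # motor command and its learning rate
    prev_reward = None
    for _ in range(trials):
        error = perturb - x + rng.normal(0.0, 0.05)   # sensory error
        reward = -abs(error)                          # task reward
        if prev_reward is not None:
            # meta-update: if learning from errors has been paying off
            # (reward improving), reinforce the willingness to learn;
            # if errors were irrelevant to reward, eta would drift down
            eta = float(np.clip(eta + meta_lr * (reward - prev_reward),
                                0.0, 1.0))
        prev_reward = reward
        x += eta * error          # ordinary error-driven motor update
    return x, eta
```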


2020 ◽  
Author(s):  
Aman Gupta ◽  
Yadul Raghav

Meta-learning, the ability to learn to learn, helps train a model to learn quickly across a variety of learning tasks; adapting to a new environment from a minimal number of examples speeds up both training and performance. It addresses a shortcoming of the traditional machine learning paradigm, in which a vast dataset is needed to learn a task when training a model from scratch. Much work has been done on meta-learning in various learning environments, including reinforcement learning, regression, and classification with image and other datasets, but it has yet to be explored in the time-series domain. In this work, we aim to understand the effectiveness of meta-learning algorithms for time-series classification on multivariate time-series datasets. We present the algorithms' performance on a time-series archive, where the results show that meta-learning leads to faster convergence with fewer iterations than the non-meta-learning equivalent.
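One plausible choice of meta-learning algorithm in this setting is MAML; the sketch below shows a single second-order MAML meta-update using `torch.func.functional_call`, where the model, loss, and few-shot task batches are placeholders to be supplied by the reader.

```python
import torch
from torch.func import functional_call

def maml_step(model, tasks, loss_fn, meta_opt, inner_lr=0.01):
    """One meta-update over a batch of few-shot tasks, each given as
    ((x_support, y_support), (x_query, y_query))."""
    meta_opt.zero_grad()
    for (x_sup, y_sup), (x_qry, y_qry) in tasks:
        params = dict(model.named_parameters())
        # inner loop: one gradient step of adaptation on the support set
        sup_loss = loss_fn(functional_call(model, params, (x_sup,)), y_sup)
        grads = torch.autograd.grad(sup_loss, params.values(),
                                    create_graph=True)
        fast = {n: p - inner_lr * g
                for (n, p), g in zip(params.items(), grads)}
        # outer loop: the query loss of the adapted weights drives the
        # meta-gradient back into the original parameters
        loss_fn(functional_call(model, fast, (x_qry,)), y_qry).backward()
    meta_opt.step()
```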


Information ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 14
Author(s):  
Aluizio Rocha Neto ◽  
Thiago P. Silva ◽  
Thais Batista ◽  
Flávia C. Delicato ◽  
Paulo F. Pires ◽  
...  

In smart city scenarios, the huge proliferation of monitoring cameras scattered in public spaces has posed many challenges to network and processing infrastructure. A few dozen cameras are enough to saturate the city's backbone. In addition, most smart city applications require a real-time response from the system in charge of processing such large-scale video streams. Finding a missing person using facial recognition technology is one such application, requiring immediate action at the place where that person is. In this paper, we tackle these challenges by presenting a distributed system for video analytics designed to leverage edge computing capabilities. Our approach encompasses architecture, methods, and algorithms for: (i) dividing the burdensome processing of large-scale video streams into various machine learning tasks; and (ii) deploying these tasks as a workflow of data processing in edge devices equipped with hardware accelerators for neural networks. We also propose reusing nodes that run tasks shared by multiple applications, e.g., facial recognition, thus improving the system's processing throughput. Simulations showed that, with our algorithm to distribute the workload, the time to process a workflow is about 33% faster than with a naive approach.
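A hedged sketch of the node-reuse idea (the paper's actual algorithm is not reproduced here): a greedy scheduler first tries to place a workflow task on a node already running it for another application, and only otherwise deploys it to the node with the most free capacity. All names and the unit-capacity cost model are assumptions.

```python
def assign(workflow, nodes, running):
    """workflow: list of task names; nodes: {node: free_capacity};
    running: {node: set of tasks already deployed there}."""
    placement = {}
    for task in workflow:
        # 1) reuse: a node already running this task costs nothing extra
        reuse = [n for n, tasks in running.items() if task in tasks]
        if reuse:
            placement[task] = reuse[0]
            continue
        # 2) otherwise deploy on the node with the most free capacity
        best = max(nodes, key=nodes.get)
        nodes[best] -= 1
        running.setdefault(best, set()).add(task)
        placement[task] = best
    return placement
```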


Entropy ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. 126
Author(s):  
Sharu Theresa Jose ◽  
Osvaldo Simeone

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model-agnostic meta-learning (MAML), or joint within-task training and test sets, like Reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and the corresponding dataset to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
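Schematically, and with notation assumed for illustration rather than copied from the paper, the separate within-task train/test bound has the familiar Xu–Raginsky form:

```latex
% meta-learner output U, N meta-training task datasets D_{1:N},
% sigma^2-sub-Gaussian loss (all notation assumed for illustration)
\[
  \bigl|\mathbb{E}[\Delta_{\mathrm{meta}}]\bigr|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}}{N}\, I\!\bigl(U;\, D_{1:N}\bigr)}
\]
% For joint within-task train/test sets, an additional per-task
% mutual-information term between the per-task learner's output and
% its dataset appears, capturing within-task uncertainty.
```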


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 154
Author(s):  
Marcus Walldén ◽  
Masao Okita ◽  
Fumihiko Ino ◽  
Dimitris Drikakis ◽  
Ioannis Kokkinakis

Increasing processing capabilities and input/output constraints of supercomputers have increased the use of co-processing approaches, i.e., visualizing and analyzing data sets of simulations on the fly. We present a method that evaluates the importance of different regions of simulation data and a data-driven approach that uses the proposed method to accelerate in-transit co-processing of large-scale simulations. We use the importance metrics to simultaneously employ multiple compression methods on different data regions to accelerate the in-transit co-processing. Our approach strives to adaptively compress data on the fly and uses load balancing to counteract memory imbalances. We demonstrate the method’s efficiency through a fluid mechanics application, a Richtmyer–Meshkov instability simulation. The results show that the proposed method can expeditiously identify regions of interest, even when using multiple metrics. Our approach achieved a speedup of 1.29× in a lossless scenario, and the data decompression time was sped up by 2× compared to using a single compression method uniformly.
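As a toy rendering of the selection-plus-balancing step (the names and relative codec costs are assumptions, not the paper's implementation): each region's importance score picks its codec, and jobs are then spread across workers to even out the load.

```python
def plan_compression(blocks, scores, threshold=0.5, n_workers=4):
    """blocks: list of data regions; scores: importance in [0, 1].
    Important regions keep a lossless codec; the rest get a cheaper
    lossy codec to raise in-transit throughput."""
    plan = [("lossless" if s >= threshold else "lossy", b)
            for s, b in zip(scores, blocks)]
    # naive load balancing: hand the costliest jobs out first,
    # always to the currently least-loaded worker
    cost = {"lossless": 2.0, "lossy": 1.0}   # assumed relative costs
    load = [0.0] * n_workers
    assignments = []
    for codec, block in sorted(plan, key=lambda p: -cost[p[0]]):
        w = load.index(min(load))
        load[w] += cost[codec]
        assignments.append((w, codec, block))
    return assignments
```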


2021 ◽  
Vol 10 (1) ◽  
pp. e001087
Author(s):  
Tarek F Radwan ◽  
Yvette Agyako ◽  
Alireza Ettefaghian ◽  
Tahira Kamran ◽  
Omar Din ◽  
...  

A quality improvement (QI) scheme was launched in 2017, covering a large group of 25 general practices working with a deprived registered population. The aim was to improve the measurable quality of care in a population where type 2 diabetes (T2D) care had previously proved challenging. A complex set of QI interventions was co-designed by a team of primary care clinicians, educationalists, and managers. These interventions included organisation-wide goal setting, using a data-driven approach, ensuring staff engagement, implementing an educational programme for pharmacists, facilitating web-based QI learning at scale and using methods which ensured sustainability. This programme was used to optimise the management of T2D through improving the eight care processes and three treatment targets which form part of the annual national diabetes audit for patients with T2D. With the implemented improvement interventions, there was significant improvement in all care processes and all treatment targets for patients with diabetes. Achievement of all eight care processes improved by 46.0% (p<0.001) while achievement of all three treatment targets improved by 13.5% (p<0.001). The QI programme provides an example of a data-driven, large-scale, multicomponent intervention delivered in primary care in ethnically diverse and socially deprived areas.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Yihui Quek ◽  
Stanislav Fort ◽  
Hui Khoon Ng

Abstract Current algorithms for quantum state tomography (QST) are costly both on the experimental front, requiring measurement of many copies of the state, and on the classical computational front, needing a long time to analyze the gathered data. Here, we introduce neural adaptive quantum state tomography (NAQT), a fast, flexible machine-learning-based algorithm for QST that adapts measurements and provides orders of magnitude faster processing while retaining state-of-the-art reconstruction accuracy. As in other adaptive QST schemes, measurement adaptation makes use of the information gathered from previously measured copies of the state to perform a targeted sensing of the next copy, maximizing the information gathered from that next copy. Our NAQT approach allows for a rapid and seamless integration of measurement adaptation and statistical inference, using a neural-network replacement of the standard Bayes’ update, to obtain the best estimate of the state. Our algorithm, which falls into the machine learning subfield of “meta-learning” (in effect “learning to learn” about quantum states), does not require any ansatz about the form of the state to be estimated. Despite this generality, it can be retrained within hours on a single laptop for a two-qubit situation, which suggests a feasible time-cost when extended to larger systems and potential speed-ups if provided with additional structure, such as a state ansatz.
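For contrast, here is a minimal sketch of the standard Bayes' update that NAQT replaces with a neural network: posterior weights over a bank of candidate density matrices are re-weighted by the Born-rule likelihood of each observed outcome. The candidate-bank representation is an illustrative assumption.

```python
import numpy as np

def bayes_update(weights, rhos, povm_effect):
    """weights: prior over candidate density matrices `rhos`;
    povm_effect: the POVM element observed in this measurement shot.
    Returns the normalized posterior over the candidate bank."""
    # Born rule: p(outcome | rho) = Tr(rho @ E)
    likelihood = np.array([np.real(np.trace(r @ povm_effect))
                           for r in rhos])
    posterior = weights * likelihood
    return posterior / posterior.sum()
```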


Author(s):  
Alessandro Achille ◽  
Giovanni Paolini ◽  
Glen Mbeng ◽  
Stefano Soatto

Abstract We introduce an asymmetric distance in the space of learning tasks and a framework to compute their complexity. These concepts are foundational for the practice of transfer learning, whereby a parametric model is pre-trained for a task, and then fine-tuned for another. The framework we develop is non-asymptotic, captures the finite nature of the training dataset and allows distinguishing learning from memorization. It encompasses, as special cases, classical notions from Kolmogorov complexity and Shannon and Fisher information. However, unlike some of those frameworks, it can be applied to large-scale models and real-world datasets. Our framework is the first to measure complexity in a way that accounts for the effect of the optimization scheme, which is critical in deep learning.
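One schematic way to render the asymmetry in symbols (notation assumed for illustration, not the paper's definition): if C(t) denotes the complexity of learning task t and C(t_2 | t_1) the complexity of reaching t_2 by fine-tuning from t_1, then

```latex
\[
  d(t_1 \rightarrow t_2) \;=\; \frac{C(t_2 \mid t_1)}{C(t_2)}
\]
% is asymmetric: pre-training on a rich task t_1 can make t_2 cheap
% to reach, while fine-tuning in the reverse direction may stay costly.
```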


Author(s):  
Weida Zhong ◽  
Qiuling Suo ◽  
Abhishek Gupta ◽  
Xiaowei Jia ◽  
Chunming Qiao ◽  
...  

With the popularity of smartphones, large-scale road sensing data is being collected to perform traffic prediction, which is an important task in modern society. Due to the nature of the roving sensors on smartphones, the collected traffic data, which is in the form of multivariate time series, is often temporally sparse and unevenly distributed across regions. Moreover, different regions can have different traffic patterns, which makes it challenging to adapt models learned from regions with sufficient training data to target regions. Given that many regions may have very sparse data, it is also impractical to build individual models for each region separately. In this paper, we propose a meta-learning based framework named MetaTP to overcome these challenges. MetaTP has two key parts: a basic traffic prediction network (the base model) and meta-knowledge transfer. In the base model, a two-layer interpolation network is employed to map original time series onto uniformly spaced reference time points, so that temporal prediction can be effectively performed in the reference space. The meta-learning framework is employed to transfer knowledge from source regions with a large amount of data to target regions with only a few data examples via fast adaptation, in order to improve model generalizability on target regions. Moreover, we use two memory networks to capture the global patterns of spatial and temporal information across regions. We evaluate the proposed framework on two real-world datasets, and experimental results show the effectiveness of the proposed framework.
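A hedged sketch of the base model's interpolation idea (the kernel choice and all names are assumptions, not the paper's network): irregular observations are mapped onto a uniform reference grid with a softmax-weighted RBF kernel, after which a standard sequence model can operate in the reference space.

```python
import torch

def interpolate(t_obs, v_obs, t_ref, bandwidth=1.0):
    """t_obs: (n,) irregular timestamps; v_obs: (n, d) observed values;
    t_ref: (m,) uniform reference grid -> (m, d) interpolated values."""
    # squared distances between every reference and observed timestamp
    d2 = (t_ref[:, None] - t_obs[None, :]) ** 2      # (m, n)
    # attention-like weights: nearby observations dominate
    w = torch.softmax(-d2 / bandwidth, dim=1)        # rows sum to 1
    return w @ v_obs                                 # (m, d)
```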

