Physics-aware Spatiotemporal Modules with Auxiliary Tasks for Meta-Learning

Modeling the dynamics of real-world physical systems is critical for spatiotemporal prediction tasks, but challenging when data is limited. The scarcity of real-world data and the difficulty of reproducing the data distribution hinder the direct application of meta-learning techniques. Although knowledge of the partial differential equations (PDEs) governing the data can help fast adaptation to few observations, it is mostly infeasible to find the exact equation for observations in real-world physical systems. In this work, we propose a framework, physics-aware meta-learning with auxiliary tasks, whose spatial modules incorporate PDE-independent knowledge and whose temporal modules use the generalized features from the spatial modules to adapt to the limited data. The framework is inspired by a local conservation law expressed mathematically as a continuity equation and does not require the exact form of the governing equation to model the spatiotemporal observations. The proposed method mitigates the need for a large number of real-world tasks for meta-learning by leveraging spatial information in simulated data to meta-initialize the spatial modules. We apply the proposed framework to both synthetic and real-world spatiotemporal prediction tasks and demonstrate its superior performance with limited observations.
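
For reference, a local conservation law of the kind the abstract mentions is commonly written as the continuity equation below. The symbols are generic textbook notation and are an assumption here, not taken from the paper: \rho is the density of the conserved quantity, \mathbf{J} is its flux, and \sigma is a source or sink term (zero when the quantity is strictly conserved).

```latex
\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = \sigma
```

The equation says that the quantity in a small region changes only through flux across its boundary or through local generation, and it holds regardless of the specific PDE that governs the system.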


Improving the performance of a radio-frequency localization system in adverse outdoor applications

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-021-02001-6 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Marcelo N. de Sousa ◽

Ricardo Sant’Ana ◽

Rigel P. Fernandes ◽

Julio Cesar Duarte ◽

José A. Apolinário ◽

...

Keyword(s):

Random Forest ◽

Ray Tracing ◽

Real World ◽

Practical Implication ◽

Real Life ◽

Simulated Data ◽

Real Data ◽

Gradient Boosting ◽

Real World Data ◽

Localization Accuracy

In outdoor RF localization systems, particularly where line of sight cannot be guaranteed or where multipath effects are severe, information about the terrain may improve the position estimate’s performance. Given the difficulties in obtaining real data, a ray-tracing fingerprint is a viable option. Nevertheless, although it presents good simulation results, the performance of systems trained with simulated features only suffers degradation when employed to process real-life data. This work intends to improve the localization accuracy when using ray-tracing fingerprints and a few field data obtained from an adverse environment where a large number of measurements is not an option. We employ a machine learning (ML) algorithm to explore the multipath information. We selected the random forest and gradient boosting algorithms, both considered efficient tools in the literature. In a strict simulation scenario (simulated data for training, validating, and testing), we obtained the same good results found in the literature (error around 2 m). In a real-world system (simulated data for training, real data for validating and testing), both ML algorithms resulted in a mean positioning error around 100 m. We have also obtained experimental results for noisy (artificially added Gaussian noise) and mismatched (with a null subset of) features. From the simulations carried out in this work, our study revealed that enhancing the ML model with a few real-world data improves localization’s overall performance. From the ML algorithms employed herein, we also observed that, under noisy conditions, the random forest algorithm achieved a slightly better result than the gradient boosting algorithm. However, they achieved similar results in a mismatch experiment.
This work’s practical implication is that multipath information, once rejected in old localization techniques, now represents a significant source of information whenever we have prior knowledge to train the ML algorithm.


Theoretical and Empirical Analysis of a Spatial EA Parallel Boosting Algorithm

Evolutionary Computation ◽

10.1162/evco_a_00202 ◽

2018 ◽

Vol 26 (1) ◽

pp. 43-66 ◽

Cited By ~ 1

Author(s):

Uday Kamath ◽

Carlotta Domeniconi ◽

Kenneth De Jong

Keyword(s):

Real World ◽

Learning Algorithm ◽

Learning Algorithms ◽

Real World Data ◽

Meta Level ◽

Meta Learning ◽

Robustness To Noise ◽

Boosting Algorithm ◽

Efficient Learning ◽

Empirical Analyses

Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed. A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms to achieve scalability. In either case, the key challenge is to obtain algorithmic efficiency without compromising the quality of the results. In this article we discuss a meta-learning algorithm (PSBML) that combines concepts from spatially structured evolutionary algorithms (SSEAs) with concepts from ensemble and boosting methodologies to achieve the desired scalability property. We present both theoretical and empirical analyses which show that PSBML preserves a critical property of boosting, specifically, convergence to a distribution centered around the margin. We then present additional empirical analyses showing that this meta-level algorithm provides a general and effective framework that can be used in combination with a variety of learning classifiers. We perform extensive experiments to investigate the trade-off achieved between scalability and accuracy, and robustness to noise, on both synthetic and real-world data. These empirical results corroborate our theoretical analysis, and demonstrate the potential of PSBML in achieving scalability without sacrificing accuracy.


Multi-View Multi-Label Learning with View-Specific Information Extraction

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/539 ◽

2019 ◽

Cited By ~ 4

Author(s):

Xuan Wu ◽

Qing-Guo Chen ◽

Yao Hu ◽

Dengbao Wang ◽

Xiaodong Chang ◽

...

Keyword(s):

Information Extraction ◽

Real World ◽

State Of The Art ◽

Specific Information ◽

Learning Approach ◽

Data Sets ◽

Learning Approaches ◽

Real World Data ◽

Learning Techniques ◽

Shared Information

Multi-view multi-label learning serves as an important framework for learning from objects with diverse representations and rich semantics. Existing multi-view multi-label learning techniques focus on exploiting a shared subspace for fusing multi-view representations, where helpful view-specific information for discriminative modeling is usually ignored. In this paper, a novel multi-view multi-label learning approach named SIMM is proposed which leverages shared subspace exploitation and view-specific information extraction. For shared subspace exploitation, SIMM jointly minimizes confusion adversarial loss and multi-label loss to utilize shared information from all views. For view-specific information extraction, SIMM enforces an orthogonal constraint w.r.t. the shared subspace to utilize view-specific discriminative information. Extensive experiments on real-world data sets clearly show the favorable performance of SIMM against other state-of-the-art multi-view multi-label learning approaches.


It Matters: Reference Indicator Selection in Measurement Invariance Tests

Educational and Psychological Measurement ◽

10.1177/0013164420926565 ◽

2020 ◽

pp. 001316442092656

Author(s):

Yutian T. Thompson ◽

Hairong Song ◽

Dexin Shi ◽

Zhengkui Liu

Keyword(s):

Measurement Invariance ◽

Real World ◽

Quantitative Methods ◽

Simulated Data ◽

Selection Methods ◽

Real World Data ◽

Large Sample ◽

World Data ◽

Follow Up Study

Conventional approaches for selecting a reference indicator (RI) could lead to misleading results in testing for measurement invariance (MI). Several newer quantitative methods have become available for more rigorous RI selection. However, it is still unknown how well these methods perform in terms of correctly identifying a truly invariant item to be an RI. Thus, Study 1 was designed to address this issue in various conditions using simulated data. As a follow-up, Study 2 further investigated the advantages/disadvantages of using RI-based approaches for MI testing in comparison with non-RI-based approaches. Altogether, the two studies provided a solid examination of how RI matters in MI tests. In addition, a large sample of real-world data was used to empirically compare the uses of the RI selection methods as well as the RI-based and non-RI-based approaches for MI testing. In the end, we offered a discussion on all these methods, followed by suggestions and recommendations for applied researchers.


Sensitivity of Estimated Tacrolimus Population Pharmacokinetic Profile to Assumed Dose Timing and Absorption in Real World Data and Simulated Data

British Journal of Clinical Pharmacology ◽

10.1111/bcp.15218 ◽

2022 ◽

Author(s):

Michael L. Williams ◽

Hannah L. Weeks ◽

Cole Beck ◽

Kelly A. Birdwell ◽

Sara L. Van Driest ◽

...

Keyword(s):

Real World ◽

Simulated Data ◽

Pharmacokinetic Profile ◽

Population Pharmacokinetic ◽

Real World Data ◽

World Data


Pan-Cancer Metastasis Prediction Based on Graph Deep Learning Method

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.675978 ◽

2021 ◽

Vol 9 ◽

Author(s):

Yining Xu ◽

Xinran Cui ◽

Yadong Wang

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Gene Expression Data ◽

Real World ◽

Tumor Metastasis ◽

Machine Learning Techniques ◽

Expression Data ◽

Real World Data ◽

Convolutional Network ◽

Learning Techniques

Tumor metastasis is the major cause of mortality from cancer. From this perspective, detecting cancer gene expression and transcriptome changes is important for exploring tumor metastasis molecular mechanisms and cellular events. Precisely estimating a patient’s cancer state and prognosis is the key challenge in developing a patient’s therapeutic schedule. In recent years, a variety of machine learning techniques have widely contributed to analyzing real-world gene expression data and predicting tumor outcomes. In this area, data mining and machine learning techniques have widely contributed to gene expression data analysis by supplying computational models to support decision-making on real-world data. Nevertheless, the limited availability of real-world data severely restricts model predictive performance, and the complexity of the data makes it difficult to extract vital features. Besides these, the efficacy of standard machine learning pipelines is far from satisfactory even though diverse feature selection strategies have been applied. To address these problems, we developed a directed relation-graph convolutional network to provide an advanced feature extraction strategy. We first constructed a gene regulation network and extracted gene expression features based on the relational graph convolutional network method. The high-dimensional features of each sample were regarded as an image pixel, and a convolutional neural network was implemented to predict the risk of metastasis for each patient. Ten cross-validations on 1,779 cases from The Cancer Genome Atlas show that our model’s performance (area under the curve, AUC = 0.837; area under precision recall curve, AUPRC = 0.717) exceeds that of an existing network-based method (AUC = 0.707, AUPRC = 0.555).
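
As a reminder of what the reported AUC measures, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the paper's code or data.

```python
def roc_auc(labels, scores):
    """Pairwise (rank-based) AUC: P(score of a positive > score of a negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = metastasis, 0 = no metastasis.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the 0.837 and 0.707 figures above are compared.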


Initial Results of Quantification of Model Validation Results Using Modal Analysis

10.36227/techrxiv.11786100.v1 ◽

2020 ◽

Author(s):

Urmila Agrawal ◽

Pavel Etingov ◽

Renke Huang

Keyword(s):

Power Systems ◽

Modal Analysis ◽

Model Validation ◽

Real World ◽

Dynamic Models ◽

Simulated Data ◽

Real World Data ◽

Prony Method ◽

System Mode ◽

Initial Results

High-quality generator dynamic models are critical to reliable and accurate power systems studies and planning. With the availability of PMU measurements, the measurement-based approach to model validation has gained significant prominence. Currently, the model validation results are analyzed by visually comparing real-world PMU measurements with the model-based simulated data. This paper proposes metrics to quantify the generator dynamic model validation results based on the response of generators to each system mode, both local and inter-area, using a modal analysis approach. The metrics provide information on the inaccuracy associated with the model in terms of the characteristics of each mode. Initial results obtained using real-world data validate the effectiveness of the proposed metrics. In this paper, modal analysis was carried out using the Prony method.


Unique variable analysis: A novel approach for detecting redundant variables in multivariate data

10.31234/osf.io/4kra2 ◽

2020 ◽

Author(s):

Alexander P. Christensen ◽

Luis Eduardo Garrido ◽

Hudson Golino

Keyword(s):

Real World ◽

Multivariate Data ◽

Simulated Data ◽

Network Models ◽

Dimensional Structure ◽

Accurate Estimation ◽

Real World Data ◽

Novel Approach ◽

Single Attribute ◽

Variable Analysis

One common approach for constructing tests that measure a single attribute is the semantic similarity approach, where items vary slightly in their wording and content. Despite being an effective strategy for ensuring high internal consistency, the information in tests may become redundant or, worse, confound the interpretation of the test scores. With the advent of network models, where tests represent a complex system and components (usually items) represent causally autonomous features, redundant variables may have inadvertent effects on the interpretation of their metrics. These issues motivated the development of a novel approach called Unique Variable Analysis (UVA), which detects redundant variables in multivariate data. The goal of UVA is to statistically identify potential redundancies in multivariate data so that researchers can make decisions about how best to handle them. Using a Monte Carlo simulation approach, we generated multivariate data with redundancies that were based on examples of known real-world redundancies. We then demonstrated the effects that redundancy can have on the accurate estimation of dimensions. Next, we evaluated UVA’s ability to detect redundant variables in the simulated data. Based on these results, we provide a tutorial for how to apply UVA to real-world data. Our example data demonstrate that redundant variables create inaccurate estimates of dimensional structure but that, after applying UVA, the expected structure can be recovered. In sum, our study suggests that redundancy can have substantial effects on validity if left unchecked and that redundancy assessment should be integrated into standard validation practices.


Continuous self-adaptive optimization to learn multi-task multi-agent

Complex & Intelligent Systems ◽

10.1007/s40747-021-00591-8 ◽

2021 ◽

Author(s):

Wenqian Liang ◽

Ji Wang ◽

Weidong Bao ◽

Xiaomin Zhu ◽

Qingyong Wang ◽

...

Keyword(s):

Real World ◽

Learning Algorithm ◽

Superior Performance ◽

Adaptive Optimization ◽

Single Task ◽

Multiple Tasks ◽

Meta Learning ◽

Gradient Based ◽

Multi Agent ◽

Self Adaptive

Multi-agent reinforcement learning (MARL) methods have shown superior performance in solving a variety of real-world problems, focusing on learning distinct policies for individual tasks. These approaches face problems when applied to the non-stationary real world: agents trained in specialized tasks cannot achieve satisfactory generalization performance across multiple tasks; agents have to learn and store specialized policies for individual tasks; and reliable identities of tasks are hardly observable in practice. To address the challenge of continuously adapting to multiple tasks in MARL, we formalize the problem into a two-stage curriculum. Single-task policies are first learned with MARL approaches; after that, we develop a gradient-based Self-Adaptive Meta-Learning algorithm, SAML, that can not only distill single-task policies into a unified policy but also facilitate the unified policy to continuously adapt to new incoming tasks. In addition, to validate the continuous adaptation performance on complex tasks, we extend the widely adopted StarCraft benchmark SMAC and develop a new multi-task multi-agent StarCraft environment, Meta-SMAC, for testing various aspects of continuous adaptation methods. Our experiments with a population of agents show that our method enables significantly more efficient adaptation than reactive baselines across different scenarios.


The META tool optimizes metagenomic analyses across sequencing platforms and classifiers.

10.1101/2021.07.29.454031 ◽

2021 ◽

Author(s):

Robert A Player ◽

Angeline M Aguinaldo ◽

Brian B Merritt ◽

Lisa N Maszkiewicz ◽

Oluwaferanmi E Adeyemo ◽

...

Keyword(s):

Real World ◽

Simulated Data ◽

Evaluation Tool ◽

Algorithm Selection ◽

Real World Data ◽

Sequencing Platform ◽

And Performance ◽

Downstream Analysis ◽

Sequencing Platforms ◽

Utilization Time

A major challenge in the field of metagenomics is the selection of the correct combination of sequencing platform and downstream metagenomic analysis algorithm, or classifier. Here, we present the Metagenomic Evaluation Tool Analyzer (META), which produces simulated data and facilitates platform and algorithm selection for any given metagenomic use case. META-generated in silico read data are modular, scalable, and reflect user-defined community profiles, while the downstream analysis is done using a variety of metagenomic classifiers. Reported results include information on resource utilization, time-to-answer, and performance. Real-world data can also be analyzed using selected classifiers and results benchmarked against simulations. To test the utility of the META software, simulated data were compared to real-world viral and bacterial metagenomic samples run on four different sequencers and analyzed using 12 metagenomic classifiers. Lastly, we introduce META Score: a unified, quantitative value which rates an analytic classifier's ability to both identify and count taxa in a representative sample.
