Data-Driven Representations for Testing Independence: Modeling, Analysis and Connection with Mutual Information Estimation

Author(s):  
Mauricio E Gonzalez ◽  
Jorge F Silva ◽  
Miguel Videla ◽  
Marcos Orchard
2021 ◽  
Author(s):  
Gourab Das

LitRev is a novel, robust, data-driven approach developed for quick literature review on a particular topic of interest. The method identifies common biological phrases that follow a power-law distribution and important phrases whose normalized pointwise mutual information score is greater than zero.
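The normalized pointwise mutual information score mentioned above can be illustrated with a minimal sketch. It uses the standard definition NPMI(x, y) = PMI(x, y) / (-log p(x, y)) computed from co-occurrence counts; the function name `npmi` and the count-based interface are assumptions made for illustration and are not taken from LitRev.

```python
import math


def npmi(count_xy, count_x, count_y, total):
    """Normalized pointwise mutual information from raw counts (hypothetical helper).

    count_xy: number of contexts in which both terms occur
    count_x, count_y: number of contexts containing each term
    total: total number of contexts
    The score lies in [-1, 1]; values above zero mean the terms co-occur
    more often than expected under independence.
    """
    if count_xy == 0:
        return -1.0  # conventional limit when the pair never co-occurs
    p_xy = count_xy / total
    if p_xy == 1.0:
        return 1.0  # both terms occur in every context
    p_x = count_x / total
    p_y = count_y / total
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


# Keep only phrase pairs whose score is greater than zero (toy counts).
pairs = {("gene", "expression"): (30, 40, 50, 1000), ("gene", "city"): (1, 40, 300, 1000)}
important = [pair for pair, counts in pairs.items() if npmi(*counts) > 0]
```

The threshold of zero separates pairs that co-occur more often than chance from those that do not, which matches the selection criterion stated in the abstract.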


2020 ◽  
Vol 14 (03) ◽  
pp. 246-253 ◽  
Author(s):  
Rui Huang ◽  
Miao Liu ◽  
Yongmei Ding

The outbreak of COVID-19 is currently spreading rapidly, especially in Wuhan city, and threatens 14 million people in central China. In the present study we applied the Moran index, a strong statistical tool, to the spatial panel data to show that COVID-19 infection is spatially dependent and spread mainly from Hubei Province in central China to neighbouring areas. A logistic model was fitted according to the trend of the available data, and it shows the difference between Hubei Province and the areas outside it. We also estimated the reproduction number R0 to lie in the range [2.23, 2.51] using an SEIR model. Measures to reduce or prevent the spread of the virus should be implemented, and we expect our data-driven modeling analysis to provide some insights for identifying and preparing for future virus control.
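As an illustration of the Moran index used above, the following sketch computes the global Moran's I, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The function name, the toy values, and the four-region adjacency matrix are assumptions for demonstration only and are not the study's data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of regional values and a spatial weight matrix.

    values: one observation per region
    weights: square matrix (list of lists); weights[i][j] > 0 if regions i and j are neighbours
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / w_total) * cross / variance_sum


# Hypothetical example: four regions arranged in a line (0-1, 1-2, 2-3 are neighbours).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], adjacency)    # similar values sit next to each other
alternating = morans_i([1, 4, 1, 4], adjacency)  # neighbours differ as much as possible
```

A positive value indicates that neighbouring regions tend to have similar values (spatial dependence), while a negative value indicates that neighbours tend to differ.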


2021 ◽  
Author(s):  
Søren Wichmann

The present work is aimed at (1) developing a search engine adapted to the large DReaM corpus of linguistic descriptive literature and (2) gaining insights into how a data-driven ontology of linguistic terminology might be built. Starting from close to 20,000 text documents from the literature of language descriptions, either born digital or scanned and OCR’d, we extract keywords and pass them through a pruning pipeline in which mainly keywords that can be considered linguistic terminology survive. Subsequently we quantify relations among those terms using Normalized Pointwise Mutual Information (NPMI) and use the resulting measures, in conjunction with the Google Page Rank (GPR), to build networks of linguistic terms.


2020 ◽  
Author(s):  
Yongmei Ding ◽  
Liyuan Gao

The novel coronavirus (COVID-19), which has been spreading worldwide since December 2019, has sickened millions of people, shut down major cities and some countries, and prompted unprecedented global travel restrictions. Real data-driven modeling is an effort to help evaluate and curb the spread of the novel virus. The effect of lockdowns and the effectiveness of the reduction in contacts in Italy have been measured via our modified model, with the addition of auxiliary and state variables that represent contacts, contacts with infected individuals, conversion rate, and latent propagation. Results show a decrease in infected people due to stay-at-home orders and tracing quarantine intervention. The effect of quarantine and centralized medical treatment was also measured through numerical modeling analysis.
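The abstract does not specify the modified model, so the sketch below is only a generic illustration of how a reduction in contacts can be represented numerically. It is a textbook SEIR model integrated with forward Euler steps, with a hypothetical `contact_scale` factor multiplying the transmission rate; all parameter values are assumptions and none are taken from the study.

```python
def simulate_seir(beta, sigma, gamma, contact_scale=1.0, days=200, dt=0.1,
                  population=1_000_000, initial_infected=10):
    """Forward-Euler SEIR simulation (illustrative, not the authors' model).

    beta: transmission rate per day, sigma: 1 / latent period,
    gamma: 1 / infectious period, contact_scale: fraction of normal contacts kept.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_exposed = contact_scale * beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        peak = max(peak, i)
    return {"peak_infected": peak, "final_recovered": r, "total": s + e + i + r}


# Assumed rates: beta / gamma = 2.5 without intervention, contacts halved under lockdown.
baseline = simulate_seir(beta=0.5, sigma=0.2, gamma=0.2)
lockdown = simulate_seir(beta=0.5, sigma=0.2, gamma=0.2, contact_scale=0.5)
```

Comparing `baseline["peak_infected"]` with `lockdown["peak_infected"]` shows how scaling contacts lowers the peak number of infectious people in this simplified setting.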


2018 ◽  
Vol 11 (3) ◽  
pp. 517-529 ◽  
Author(s):  
Ioannis M. Stephanakis ◽  
Theodoros Iliou ◽  
George Anastassopoulos

Author(s):  
Karthik Kappaganthu ◽  
C. Nataraj

This paper proposes a novel technique that combines data-driven and model-based techniques to significantly improve performance in bearing fault diagnostics. Features that provide the best classification performance for the given data are selected from a combined set of data-driven and model-based features. Some of the common data-driven techniques from the time, frequency and time-frequency domains are considered. For model-based feature extraction, the recently developed cross-sample entropy is used. The ranking and performance of each of these feature sets are studied, both when used independently and when used together. A mutual-information-based technique is used for ranking and selecting the optimal feature set. Using this method, the contribution to performance and the redundancy of each data-driven and model-based feature can be studied. This method can be used to design an effective diagnostic system for bearing fault detection.
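To make the idea of mutual-information-based feature ranking concrete, here is a minimal sketch that scores discretized features by their empirical mutual information with the class label. It covers only the relevance part of the ranking (not the redundancy analysis described above), and the feature names and toy data are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def rank_features(features, labels):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(column, labels) for name, column in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical binned features for four bearing recordings (0 = healthy, 1 = faulty).
labels = [0, 0, 1, 1]
features = {
    "rms_bin": [0, 0, 1, 1],       # perfectly tracks the label
    "kurtosis_bin": [0, 1, 0, 1],  # carries no information about the label
}
ranking = rank_features(features, labels)
```

Continuous features would need to be binned (or a different estimator used) before this plug-in estimate applies.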


Author(s):  
Zhenyu Chen ◽  
Bart M Doekemeijer ◽  
Zhongwei Lin ◽  
Zhen Xie ◽  
Zongming Si ◽  
...  

2016 ◽  
Vol 37 (1) ◽  
Author(s):  
Gintautas Jakimauskas ◽  
Marijus Radavičius ◽  
Jurgis Sušinskas

A simple, data-driven and computationally efficient procedure for testing independence of high-dimensional random vectors is proposed. The procedure is based on interpreting goodness-of-fit testing as a classification problem, a special sequential partition procedure, elements of sequential testing, resampling and randomization. Monte Carlo simulations are carried out to assess the performance of the procedure.
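The role of resampling and randomization in an independence test can be illustrated with a generic permutation test. This is a swapped-in textbook technique, not the partition-based procedure proposed above: it uses empirical mutual information between two discrete samples as the test statistic and shuffles one sample to approximate the null distribution. The function names, the number of permutations, and the seed are assumptions.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def permutation_independence_test(xs, ys, n_permutations=500, seed=0):
    """Return (observed statistic, p-value) for the null hypothesis of independence."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any dependence between xs and ys
        if mutual_information(xs, shuffled) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


xs = [i % 2 for i in range(40)]
dependent_stat, dependent_p = permutation_independence_test(xs, xs)
constant_stat, constant_p = permutation_independence_test(xs, [0] * 40)
```

A small p-value suggests rejecting independence; for high-dimensional vectors the statistic would need to be replaced by one suited to that setting, which is the problem the proposed procedure addresses.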

