EmbNum+: Effective, Efficient, and Robust Semantic Labeling for Numerical Values
Abstract

In recent years, there has been increasing interest in numerical semantic labeling, in which an unknown numerical column is assigned the label of the most relevant column in a predefined knowledge base. Previous methods used the p-value of a statistical hypothesis test to estimate relevance and thus depend strongly on the data distribution and domain. In other words, they are unstable in the general case, when such knowledge is unavailable. Our goal is to solve semantic labeling without using such information while guaranteeing high accuracy. We propose EmbNum+, a neural numerical embedding that learns both discriminative representations and a similarity metric from numerical columns. EmbNum+ maps the list of numerical values in a column to a feature vector in an embedding space, and the similarity metric can be computed directly on these feature vectors. Evaluations on many datasets from various domains confirmed that EmbNum+ consistently outperformed other state-of-the-art approaches in terms of accuracy. The compact embedding representations also made EmbNum+ significantly faster than the alternatives and enabled large-scale semantic labeling. Furthermore, attribute augmentation can be used to enhance the robustness and unlock the portability of EmbNum+, allowing it to be trained on one domain and applied to many different domains.
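The labeling scheme described above can be sketched as follows. This is a minimal illustrative sketch, not the actual EmbNum+ model: the `embed` function below is a hypothetical stand-in (simple summary statistics) for the learned neural encoder, and the toy reference columns are invented for the example. The core idea it demonstrates is the same: map each numerical column to a fixed-size feature vector and assign the label of the nearest reference column under a metric computed directly on those vectors.

```python
import math

def embed(values):
    """Hypothetical stand-in for the learned EmbNum+ encoder:
    maps a list of numerical values to a fixed-size feature vector.
    (The real encoder is a neural network; summary statistics are
    used here purely for illustration.)"""
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    quantile = lambda p: xs[min(n - 1, int(p * n))]
    return [mean, math.sqrt(var), quantile(0.25), quantile(0.5), quantile(0.75)]

def distance(u, v):
    """Similarity metric computed directly on the feature vectors
    (Euclidean distance in this sketch)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Labeled reference columns from a knowledge base (toy data).
reference = {
    "height_cm": [150, 160, 170, 180, 190],
    "year":      [1990, 1995, 2000, 2005, 2010],
}
query = [155, 165, 175, 185, 195]  # unknown numerical column

# Semantic labeling: assign the label of the nearest reference column.
label = min(reference,
            key=lambda k: distance(embed(reference[k]), embed(query)))
print(label)  # -> height_cm
```

Because each column is reduced to a small feature vector once, lookups against a large knowledge base reduce to cheap vector comparisons, which is what makes the embedding approach fast at scale.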