An Empirical Comparative Study of Instance-based Schema Matching

Author(s):  
Mogahed Alzeber ◽  
Ali A. Alwan ◽  
Azlin Nordin ◽  
Abedallah Zaid Abualkishik

The main concern of schema matching is how to support the merging decision by providing matches between attributes of different schemas. Many works in the literature use database instances to detect the correspondence between attributes. Most of these previous works aim at improving match accuracy. We observed that no technique manages to provide accurate matching for different types of data: some techniques treat numeric values as strings, while others process textual instances as numeric, which negatively influences the process of discovering the match and compromises the matching result. Thus, a practical comparative study between syntactic and semantic techniques is needed. The study emphasizes analyzing these techniques to determine the strengths and weaknesses of each. This paper compares two instance-based matching techniques, namely (i) regular expression and (ii) Google similarity, for identifying the match between attributes. Several analyses have been conducted on real and synthetic data sets to evaluate the performance of these techniques with respect to Precision (P), Recall (R) and F-Measure.
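As an illustration of the evaluation measures named in the abstract, the following minimal sketch computes Precision, Recall and F-Measure for a set of predicted attribute correspondences against a reference set. The function name `match_scores` and the attribute pairs are hypothetical and are not taken from the paper; the formulas are the standard definitions.

```python
def match_scores(predicted, reference):
    """Return (precision, recall, f_measure) for two collections of attribute pairs."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Guard against empty inputs so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-Measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical correspondences between attributes of two schemas.
reference = {("cust_name", "client"), ("zip", "postcode"), ("tel", "phone")}
predicted = {
    ("cust_name", "client"),
    ("zip", "postcode"),
    ("tel", "fax"),
    ("id", "phone"),
}

p, r, f = match_scores(predicted, reference)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")  # prints "P=0.50 R=0.67 F=0.57"
```

Two of the four predicted pairs are correct, so precision is 2/4, recall is 2/3, and the F-Measure is 4/7.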

2020 ◽  
Author(s):  
Cesaré Ovando-Vázquez ◽  
Daniel Cázarez-García ◽  
Robert Winkler

Abstract. Machine learning algorithms excavate important variables from biological big data. However, deciding on the biological relevance of identified variables is challenging. The addition of artificial noise, ‘decoy’ variables, to raw data, ‘target’ variables, enables calculating a false-positive rate (FPR) and a biological relevance probability (BRp) for each variable rank. These scores allow a cut-off for informative variables to be defined, depending on the required sensitivity/specificity of a scientific question. We demonstrate the function of the Target-Decoy MineR (TDM) with synthetic data and with experimental metabolomics results. The Target-Decoy MineR is suitable for different types of quantitative data in tabular format. An implementation of the algorithm in R is freely available from https://bitbucket.org/cesaremov/targetdecoy_mining/.
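The general target-decoy idea can be sketched as follows. This is only an illustration under stated assumptions: it takes a variable ranking in which each entry is flagged as decoy or target, and reports the fraction of decoys among the top-k entries at each rank. The abstract does not give the exact FPR or BRp formulas used by the Target-Decoy MineR, so the quantity below is a stand-in, and the names `running_decoy_fraction` and `cutoff_rank` are hypothetical.

```python
def running_decoy_fraction(ranked_is_decoy):
    """For each rank k, return the share of decoys among the top-k variables."""
    fractions = []
    decoys_seen = 0
    for rank, is_decoy in enumerate(ranked_is_decoy, start=1):
        decoys_seen += int(is_decoy)
        fractions.append(decoys_seen / rank)
    return fractions


def cutoff_rank(ranked_is_decoy, max_fraction):
    """Return the largest rank whose top-k decoy fraction is <= max_fraction (0 if none)."""
    best = 0
    fractions = running_decoy_fraction(ranked_is_decoy)
    for rank, fraction in enumerate(fractions, start=1):
        if fraction <= max_fraction:
            best = rank
    return best


# Hypothetical ranking from most to least important; True marks a decoy variable.
ranking = [False, False, True, False, True, True]
fractions = running_decoy_fraction(ranking)
print(fractions)
print(cutoff_rank(ranking, max_fraction=0.25))  # prints 4
```

A stricter `max_fraction` keeps fewer variables, which corresponds to trading sensitivity for specificity when choosing the cut-off.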


1977 ◽  
Vol 79 (1) ◽  
pp. 133-140 ◽  
Author(s):  
H. M. Meddick

SUMMARY. The ability of six different types of contamination control mats currently in use at the entrances to theatre suites and other clean areas to remove bacteria-carrying particles from theatre trolley wheels was compared. Marked differences in effectiveness were obtained, and all mats showed some disadvantages. Modification of one of the mats has resulted in improved efficiency under working conditions.


2014 ◽  
Vol 7 (3) ◽  
pp. 781-797 ◽  
Author(s):  
P. Paatero ◽  
S. Eberly ◽  
S. G. Brown ◽  
G. A. Norris

Abstract. The EPA PMF (Environmental Protection Agency positive matrix factorization) version 5.0 and the underlying multilinear engine-executable ME-2 contain three methods for estimating uncertainty in factor analytic models: classical bootstrap (BS), displacement of factor elements (DISP), and bootstrap enhanced by displacement of factor elements (BS-DISP). The goal of these methods is to capture the uncertainty of PMF analyses due to random errors and rotational ambiguity. It is shown that the three methods complement each other: depending on characteristics of the data set, one method may provide better results than the other two. Results are presented using synthetic data sets, including interpretation of diagnostics, and recommendations are given for parameters to report when documenting uncertainty estimates from EPA PMF or ME-2 applications.
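For readers unfamiliar with the classical bootstrap mentioned above, here is a generic sketch of a percentile bootstrap interval for a simple statistic. It does not reproduce the BS, DISP, or BS-DISP procedures of EPA PMF or ME-2, which operate on fitted factor models; the function name `bootstrap_interval`, its parameters, and the sample values are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_interval(values, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return estimates[lower_index], estimates[upper_index]


# Hypothetical measurements; not data from the paper.
sample = [4.1, 3.8, 5.2, 4.7, 4.0, 6.3, 3.9, 4.4, 5.0, 4.6]
low, high = bootstrap_interval(sample, statistics.mean)
print(f"95% interval for the mean: [{low:.2f}, {high:.2f}]")
```

Resampling of this kind captures variability due to random errors in the data; it says nothing about rotational ambiguity, which is the gap the displacement-based methods in the abstract are meant to address.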


Geophysics ◽  
1983 ◽  
Vol 48 (11) ◽  
pp. 1514-1524 ◽  
Author(s):  
Edip Baysal ◽  
Dan D. Kosloff ◽  
John W. C. Sherwood

Migration of stacked or zero‐offset sections is based on deriving the wave amplitude in space from wave field observations at the surface. Conventionally this calculation has been carried out through a depth extrapolation. We examine the alternative of carrying out the migration through a reverse time extrapolation. This approach may offer improvements over existing migration methods, especially in cases of steeply dipping structures with strong velocity contrasts. This migration method is tested using appropriate synthetic data sets.


2008 ◽  
Vol 2 (2) ◽  
pp. 77-83 ◽  
Author(s):  
U. Ben-Hanan ◽  
H. Judes ◽  
M. Regev

Geophysics ◽  
2011 ◽  
Vol 76 (4) ◽  
pp. F239-F250 ◽  
Author(s):  
Fernando A. Monteiro Santos ◽  
Hesham M. El-Kaliouby

Joint or sequential inversion of direct current resistivity (DCR) and time-domain electromagnetic (TDEM) data is commonly performed for individual soundings assuming layered earth models. DCR and TDEM have different and complementary sensitivity to resistive and conductive structures, making them suitable methods for the application of joint inversion techniques. This potential joint inversion of DCR and TDEM methods has been used by several authors to reduce the ambiguities of the models calculated from each method separately. A new approach for joint inversion of these data sets, based on a laterally constrained algorithm, was found. The method was developed for the interpretation of soundings collected along a line over a 1D or 2D geology. The inversion algorithm was tested on two synthetic data sets, as well as on field data from Saudi Arabia. The results show that the algorithm is efficient and stable in producing quasi-2D models from DCR and TDEM data acquired in relatively complex environments.

