TabReformer: Unsupervised Representation Learning for Erroneous Data Detection

2021 ◽  
Vol 2 (3) ◽  
pp. 1-29
Author(s):  
Mona Nashaat ◽  
Aindrila Ghosh ◽  
James Miller ◽  
Shaikh Quader

Error detection is a crucial preliminary phase in any data analytics pipeline. Existing error detection techniques typically target specific types of errors. Moreover, most of these detection models require either user-defined rules or ample hand-labeled training examples. Therefore, in this article, we present TabReformer, a model that learns bidirectional encoder representations for tabular data. The proposed model consists of two main phases. In the first phase, TabReformer follows an encoder architecture with multiple self-attention layers to model the dependencies between cells and capture tuple-level representations. The model also uses a Gaussian Error Linear Unit activation function with the Masked Data Model objective to achieve a deeper probabilistic understanding. In the second phase, the model parameters are fine-tuned for the task of erroneous data detection. The model applies a data augmentation module to generate more erroneous examples to represent the minority class. The experimental evaluation considers a wide range of databases with different types of errors and distributions. The empirical results show that our solution can improve recall by 32.95% on average compared with state-of-the-art techniques while reducing manual effort by up to 48.86%.
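The two ingredients named in the first phase can be illustrated with a minimal sketch. It shows the exact Gaussian Error Linear Unit and a random cell-masking step of the kind a masked objective relies on. The `[MASK]` token, the masking probability, and the example row are assumptions made for illustration; the abstract does not specify how TabReformer masks cells.

```python
import math
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def mask_cells(row, mask_prob=0.15, rng=None):
    """Hide a random subset of cells in one tuple.

    Returns the masked row and a dict mapping each masked column index to
    its original value, which a masked objective would ask the model to
    reconstruct. mask_prob is an assumed value, not the paper's setting.
    """
    rng = rng or random.Random()
    masked, targets = [], {}
    for i, cell in enumerate(row):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = cell
        else:
            masked.append(cell)
    return masked, targets


# Illustrative tuple; the values are made up.
row = ["Edmonton", "AB", "T6G 2R3", "Canada"]
masked_row, targets = mask_cells(row, mask_prob=0.5, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the masking reproducible, which helps when comparing runs.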

2019 ◽  
Vol 488 (4) ◽  
pp. 4623-4637 ◽  
Author(s):  
Mohsen Shadmehri ◽  
Sayyedeh Masoumeh Ghoreyshi

Abstract We study the evolution of protoplanetary discs (PPDs) in the presence of magnetically driven winds, with stress relations motivated by non-ideal MHD disc simulations. The contribution of the magnetic winds to angular momentum removal and mass-loss is described using these relations, which are quantified in terms of the plasma parameter. The evolution of the essential disc quantities, including the surface density, accretion rate, and wind mass-loss rate, is studied for a wide range of model parameters. Two distinct phases of disc evolution are found irrespective of the adopted input parameters. In the early phase, global disc quantities such as the total mass and magnetic flux undergo only minor reductions, whereas they decline rapidly in the second phase. The duration of each phase, however, depends on the model parameters, including the magnetic wind strength. Our model predicts that the contribution of the magnetic winds to the disc evolution is significant during the second phase. We then calculate the locus of points in the plane of accretion rate and total disc mass corresponding to an ensemble of evolving PPDs. Our theoretical isochrone tracks exhibit reasonable fits to the observed PPDs in the star-forming regions Lupus and σ-Orion.


2013 ◽  
Vol 33 (5) ◽  
pp. 1459-1462
Author(s):  
Xiaoming JU ◽  
Jiehao ZHANG ◽  
Yizhong ZHANG

2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Malte Seemann ◽  
Lennart Bargsten ◽  
Alexander Schlaefer

Abstract Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of artery lumen in computed tomography angiography (CTA) data. However, to perform adequately, neural networks have to be trained on large amounts of high-quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step, moderately realistic images are generated in a purely numerical fashion. In the second step, these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing the resulting performances. Improvements of up to 5% in terms of Dice coefficient and 20% for Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation of artery lumen in CTA images.
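For readers unfamiliar with the Dice coefficient reported above, a minimal sketch of the standard definition for binary masks follows. The flat-list mask representation and the example values are assumptions for illustration and are unrelated to the study's data.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Made-up masks: 3 predicted pixels, 3 true pixels, 2 shared.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```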


Genetics ◽  
2000 ◽  
Vol 156 (1) ◽  
pp. 457-467 ◽  
Author(s):  
Z W Luo ◽  
S H Tao ◽  
Z-B Zeng

Abstract Three approaches are proposed in this study for detecting or estimating linkage disequilibrium between a polymorphic marker locus and a locus affecting quantitative genetic variation using a sample from random mating populations. It is shown that the disequilibrium over a wide range of circumstances may be detected with a power of 80% by using phenotypic records and marker genotypes of a few hundred individuals. Comparison of the ANOVA and regression methods in this article to the transmission disequilibrium test (TDT) shows that, given the genetic variance explained by the trait locus, the power of TDT depends on the trait allele frequency, whereas the power of ANOVA and regression analyses is relatively independent of the allelic frequency. The TDT method is more powerful when the trait allele frequency is low, but much less powerful when it is high. The likelihood analysis provides reliable estimation of the model parameters when the QTL variance is at least 10% of the phenotypic variance and a sample size of a few hundred is used. Potential use of these estimates in mapping the trait locus is also discussed.
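The quantity being detected can be made concrete with a small sketch of the textbook two-locus disequilibrium coefficient, D = p(AB) − p(A)·p(B), together with the derived r². This assumes haplotype frequencies are known directly, which is not the setting of the study (it infers disequilibrium from phenotypes and marker genotypes); the allele labels and frequencies below are made up.

```python
def linkage_disequilibrium(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, r_squared) from the four two-locus haplotype frequencies."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab  # frequency of allele A at the first locus
    p_B = p_AB + p_aB  # frequency of allele B at the second locus
    d = p_AB - p_A * p_B
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    # r^2 is undefined when either locus is monomorphic; report 0.0 then.
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Hypothetical frequencies: p_A = 0.5, p_B = 0.6, so D = 0.4 - 0.3.
d, r2 = linkage_disequilibrium(0.4, 0.1, 0.2, 0.3)
```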


2021 ◽  
Vol 9 (4) ◽  
pp. 839
Author(s):  
Muhammad Rafiullah Khan ◽  
Vanee Chonhenchob ◽  
Chongxing Huang ◽  
Panitee Suwanamornlert

Microorganisms causing anthracnose diseases have a medium to high level of resistance to existing fungicides. This study aimed to investigate neem plant extract (propyl disulfide, PD) as an alternative to the current fungicides against mango’s anthracnose. Microorganisms were isolated from decayed mango and identified as Colletotrichum gloeosporioides and Colletotrichum acutatum. Next, a pathogenicity test was conducted and, after Koch’s postulates were fulfilled, the fungi were reisolated from the symptomatic fruits to obtain pure cultures. Then, different concentrations of PD were used against these fungi in vapor and agar diffusion assays. Ethanol and distilled water served as control treatments. PD inhibited the mycelial growth of these fungi significantly more (p ≤ 0.05) than both controls. The antifungal activity of PD increased with increasing concentration. The vapor diffusion assay was more effective in inhibiting the mycelial growth of these fungi than the agar diffusion assay. A good fit (R² = 0.950) of the experimental data to the Gompertz growth model and a significant difference in the model parameters, i.e., lag phase (λ), stationary phase (A) and mycelial growth rate, further showed the antifungal efficacy of PD. Therefore, PD could be the best antimicrobial compound against a wide range of microorganisms.
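The three parameters listed above match the commonly used modified Gompertz curve (the Zwietering reparameterization), y(t) = A·exp(−exp(μ·e/A·(λ − t) + 1)). The sketch below assumes that form; the abstract does not state which variant was fitted, and the parameter values are hypothetical.

```python
import math


def modified_gompertz(t, A, mu, lam):
    """Modified Gompertz growth curve (Zwietering form, assumed here).

    A   : asymptotic (stationary-phase) value
    mu  : maximum growth rate
    lam : lag time
    """
    return A * math.exp(-math.exp(mu * math.e / A * (lam - t) + 1.0))


# Hypothetical parameters: colony diameter in mm, time in days.
A, mu, lam = 80.0, 10.0, 2.0
curve = [modified_gompertz(day, A, mu, lam) for day in range(0, 15)]
```

Under this form, a treatment that lengthens λ or lowers μ or A shifts the whole curve down, which is how differences in fitted parameters indicate inhibition.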


2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Karim El-Laithy ◽  
Martin Bogdan

An integration of Hebbian-based and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested to learn the exclusive-OR computation on a temporally coded basis. Reward values are calculated from the distance between the output spike train of the network and a reference target one. Results show that the network is able to capture the required dynamics and that the proposed framework can indeed reveal an integrated version of Hebbian and RL. The proposed framework is tractable and less computationally expensive. The framework is applicable to a wide class of synaptic models and is not restricted to the neural representation used. This generality, along with the reported results, supports adopting the introduced approach to benefit from the biologically plausible synaptic models in a wide range of intuitive signal processing.


Author(s):  
Afshin Anssari-Benam ◽  
Andrea Bucchi ◽  
Giuseppe Saccomandi

Abstract The application of a newly proposed generalised neo-Hookean strain energy function to the inflation of incompressible rubber-like spherical and cylindrical shells is demonstrated in this paper. The pressure ($P$) – inflation ($\lambda$ or $v$) relationships are derived and presented for four shells: thin- and thick-walled spherical balloons, and thin- and thick-walled cylindrical tubes. Characteristics of the inflation curves predicted by the model for the four considered shells are analysed and the critical values of the model parameters for exhibiting the limit-point instability are established. The application of the model to extant experimental datasets procured from studies across the 19th to 21st centuries will be demonstrated, showing favourable agreement between the model and the experimental data. The capability of the model to capture the two characteristic instability phenomena in the inflation of rubber-like materials, namely the limit-point and inflation-jump instabilities, will be made evident from both the theoretical analysis and curve-fitting approaches presented in this study. A comparison with the predictions of the Gent model for the considered data is also demonstrated, and it is shown that our presented model provides improved fits. Given the simplicity of the model, its ability to fit a wide range of experimental data and capture both limit-point and inflation-jump instabilities, we propose the application of our model to the inflation of rubber-like materials.
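As a point of reference for the limit-point instability, the sketch below uses the classical neo-Hookean model, not the generalised model of the paper, whose functional form the abstract does not give. For a thin-walled incompressible spherical balloon, the classical model gives P = 2μ(H/R)(λ⁻¹ − λ⁻⁷), with H/R the undeformed thickness-to-radius ratio, and the pressure peaks at λ = 7^(1/6). The function names and default values are illustrative assumptions.

```python
def balloon_pressure(stretch, mu=1.0, thickness_ratio=0.01):
    """Inflation pressure of a thin-walled spherical balloon.

    Classical neo-Hookean material (a baseline, not the paper's model);
    mu is the shear modulus, thickness_ratio the undeformed H/R.
    """
    return 2.0 * mu * thickness_ratio * (stretch ** -1 - stretch ** -7)


def limit_point_stretch(lo=1.0, hi=3.0, steps=20000):
    """Locate the pressure maximum on [lo, hi] by a simple grid search."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(grid, key=balloon_pressure)


critical = limit_point_stretch()
analytic = 7.0 ** (1.0 / 6.0)  # closed-form limit point, about 1.383
```

Beyond this stretch the classical model predicts a monotonically falling pressure, so on its own it cannot reproduce the later pressure rise associated with an inflation jump.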


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Dieter M. Tourlousse ◽  
Koji Narita ◽  
Takamasa Miura ◽  
Mitsuo Sakamoto ◽  
Akiko Ohashi ◽  
...  

Abstract Background: Validation and standardization of methodologies for microbial community measurements by high-throughput sequencing are needed to support human microbiome research and its industrialization. This study set out to establish standards-based solutions to improve the accuracy and reproducibility of metagenomics-based microbiome profiling of human fecal samples. Results: In the first phase, we performed a head-to-head comparison of a wide range of protocols for DNA extraction and sequencing library construction using defined mock communities, to identify performant protocols and pinpoint sources of inaccuracy in quantification. In the second phase, we validated performant protocols with respect to their variability of measurement results within a single laboratory (that is, intermediate precision) as well as interlaboratory transferability and reproducibility through an industry-based collaborative study. We further ascertained the performance of our recommended protocols in the context of a community-wide interlaboratory study (that is, the MOSAIC Standards Challenge). Finally, we defined performance metrics to provide best practice guidance for improving measurement consistency across methods and laboratories. Conclusions: The validated protocols and methodological guidance for DNA extraction and library construction provided in this study expand current best practices for metagenomic analyses of human fecal microbiota. Uptake of our protocols and guidelines will improve the accuracy and comparability of metagenomics-based studies of the human microbiome, thereby facilitating development and commercialization of human microbiome-based products.


Vehicles ◽  
2021 ◽  
Vol 3 (2) ◽  
pp. 212-232
Author(s):  
Ludwig Herzog ◽  
Klaus Augsburg

The important change in the transition from partial to high automation is that a vehicle can drive autonomously, without active human involvement. This fact increases the current requirements regarding ride comfort and dictates new challenges for automotive shock absorbers. There exist two common types of automotive shock absorber with two friction types: The intended viscous friction dissipates the chassis vibrations, while the unwanted solid body friction is generated by the rubbing of the damper’s seals and guides during actuation. The latter so-called static friction impairs ride comfort and demands appropriate friction modeling for the control of adaptive or active suspension systems. In this article, a simulation approach is introduced to model damper friction based on the most friction-relevant parameters. Since damper friction is highly dependent on geometry, which can vary widely, three-dimensional (3D) structural FEM is used to determine the deformations of the damper parts resulting from mounting and varying operation conditions. In the respective contact zones, a dynamic friction model is applied and parameterized based on the single friction point measurements. Subsequent to the parameterization of the overall friction model with geometry data, operation conditions, material properties and friction model parameters, single friction point simulations are performed, analyzed and validated against single friction point measurements. It is shown that this simulation method allows for friction prediction with high accuracy. Consequently, its application enables a wide range of parameters relevant to damper friction to be investigated with significantly increased development efficiency.

