limiting error
Recently Published Documents


TOTAL DOCUMENTS: 17 (FIVE YEARS: 3)
H-INDEX: 3 (FIVE YEARS: 1)

Author(s):  
Naveed Ahmad Khan Jhamat ◽  
Ghulam Mustafa ◽  
Zhendong Niu

The class imbalance problem increasingly confronts researchers as the amount of complex data grows. Common classification algorithms perform poorly on imbalanced datasets: majority-class cases typically outnumber minority-class cases, and because these algorithms take overall accuracy as their goal, class imbalance drives them to raise majority-class performance while lowering performance on the minority class. Furthermore, these algorithms treat false positives and false negatives evenly, assuming an equal cost for misclassified cases. Various ensemble solutions have been proposed over the years for class imbalance learning, but by emphasizing the minority-class cases these approaches hamper the performance of the majority class. The intuition behind this degraded overall outcome is low diversity in the ensemble solutions and overfitting or underfitting in the data resampling techniques. To overcome these problems, we propose a hybrid ensemble method that combines the MultiBoost ensemble with the Synthetic Minority Over-sampling TEchnique (SMOTE). The proposed solution leverages the effectiveness of both of its elements: it improves the outcome for the minority class by reinforcing its region of the feature space and limiting prediction error. In experiments, the proposed method shows improved performance compared with numerous other algorithms and techniques.
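
A minimal sketch of the general resampling-plus-boosting pattern the abstract describes, assuming scikit-learn and imbalanced-learn are available. AdaBoost stands in for MultiBoost (which adds wagging on top of AdaBoost and has no scikit-learn implementation), so this illustrates the idea rather than the authors' exact method:

```python
# Sketch: SMOTE oversampling followed by a boosted ensemble.
# AdaBoostClassifier is a stand-in for MultiBoost, which is not
# available in scikit-learn.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# Imbalanced toy data: roughly 5% minority class.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),   # reinforce the minority-class region
    ("boost", AdaBoostClassifier(n_estimators=100, random_state=0)),
])

# Minority-class F1 is a fairer measure than overall accuracy here.
scores = cross_val_score(pipe, X, y, cv=5, scoring="f1")
print("minority-class F1: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```

The imblearn Pipeline applies SMOTE only during fitting, so each cross-validation fold is resampled independently and the test folds stay untouched.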


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Louis Ranjard ◽  
Thomas K. F. Wong ◽  
Allen G. Rodrigo

Abstract Background In short-read DNA sequencing experiments, read coverage is a key parameter for successfully assembling the reads and reconstructing the sequence of the input DNA. When coverage is very low, reconstructing the original sequence from the reads can be difficult because of uncovered gaps. Reference-guided assembly can improve these assemblies; however, when the available reference is phylogenetically distant from the sequencing reads, the mapping rate of the reads can be extremely low. Some recent improvements in read-mapping approaches aim to modify the reference dynamically according to the reads. Such approaches can significantly improve the alignment rate of reads onto distant references, but the processing of insertions and deletions remains challenging. Results Here, we introduce a new algorithm to update the reference sequence according to previously aligned reads. Substitutions, insertions, and deletions are performed in the reference sequence dynamically. We evaluate this approach by assembling a western-grey kangaroo mitochondrial amplicon. Our results show that more reads can be aligned and that the method produces assemblies of length comparable to the true sequence while limiting the error rate, in cases where classic approaches fail to recover the correct length. Finally, we discuss how the core algorithm of this method could be improved and combined with other approaches to analyse larger genomic sequences. Conclusions We introduced an algorithm to perform dynamic alignment of reads against a distant reference. We showed that such an approach can improve the reconstruction of an amplicon compared with classically used bioinformatic pipelines. Although not portable to the genomic scale in its current form, the method admits several improvements, which we suggest investigating to make it more flexible and allow dynamic alignment to be used for large genome assemblies.
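
A toy sketch of the dynamic-reference idea under simplifying assumptions: each read arrives with an aligned position and a list of differences, and the reference string is edited in place before the next read is mapped. The `apply_read` helper and its difference format are hypothetical; the published algorithm handles alignment scoring and indel bookkeeping far more carefully.

```python
# Toy sketch: update a reference sequence from one aligned read's
# differences (substitutions, insertions, deletions), so that later
# reads align against the revised reference. Positions are 0-based
# and edits are applied right-to-left so earlier coordinates stay valid.
def apply_read(reference, diffs):
    """diffs: list of (pos, op, base) with op in {'sub', 'ins', 'del'}.
    'ins' inserts base before pos; 'del' ignores base."""
    ref = list(reference)
    for pos, op, base in sorted(diffs, reverse=True):
        if op == "sub":
            ref[pos] = base
        elif op == "ins":
            ref.insert(pos, base)
        elif op == "del":
            del ref[pos]
    return "".join(ref)

ref = "ACGTACGT"
# One read supports a substitution at 1, an insertion before 4, a deletion at 6.
print(apply_read(ref, [(1, "sub", "T"), (4, "ins", "G"), (6, "del", "")]))
# -> "ATGTGACT"
```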


2019 ◽  
Vol 40 (4) ◽  
pp. 455-462
Author(s):  
Ankit Kumar ◽  
Manisha Bharti ◽  
Tanya Kumar

Abstract In this paper, a comparative performance analysis of dissimilar optical 2-D codes from the Optical Orthogonal Code (OOC) family is presented. The 2-D codes considered are (n,w,1,2) OOC, SPS/OOC, OCFHC/OOC, EPC/OCS, and VWOOC. Using hard-limiting error probability (HEP) equations and a combinatorial method, the performance of each code is evaluated in detail. On the basis of this detailed comparison, EPC/OCS emerges as the best-performing code among those considered: it possesses much better correlation properties along with lower hit-probability values, which account for its superior performance characteristics relative to the other OOCs.
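
As a simplified illustration of the combinatorial flavor of such analyses, the sketch below evaluates the classic chip-level error-probability bound for a 1-D (n, w, 1) OOC under on-off keying, where an interferer hits a mark chip with probability q = w²/(2n) and an error requires at least w hits among the K−1 interferers. This well-known 1-D bound is a stand-in, not the paper's hard-limiting HEP equations for 2-D codes.

```python
# Simplified stand-in: classic binomial bound on bit-error probability
# for a 1-D (n, w, 1) optical orthogonal code with K simultaneous users.
# q = w^2 / (2n): probability an interfering user hits a mark chip;
# an error needs at least Th = w hits among the K-1 interferers.
from math import comb

def ooc_error_bound(n, w, K):
    q = w * w / (2.0 * n)
    return 0.5 * sum(
        comb(K - 1, i) * q**i * (1 - q) ** (K - 1 - i)
        for i in range(w, K)          # i = Th .. K-1, with Th = w
    )

print(ooc_error_bound(n=1000, w=5, K=20))
```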


2018 ◽  
Author(s):  
Louis Ranjard ◽  
Thomas K. F. Wong ◽  
Allen G. Rodrigo

Abstract In short-read DNA sequencing experiments, read coverage is a key parameter for successfully assembling the reads and reconstructing the sequence of the input DNA. When coverage is very low, reconstructing the original sequence from the reads can be difficult because of uncovered gaps. Reference-guided assembly can improve these assemblies; however, when the available reference is phylogenetically distant from the sequencing reads, the mapping rate of the reads can be extremely low. Some recent improvements in read-mapping approaches aim to modify the reference dynamically according to the reads. Such approaches can significantly improve the alignment rate of reads onto distant references, but the processing of insertions and deletions remains challenging. Here, we introduce a dynamic programming algorithm to update the reference sequence according to previously aligned reads. Substitutions, insertions, and deletions are performed in the reference sequence dynamically. We evaluate this approach by assembling a western-grey kangaroo mitochondrial amplicon. Our results show that more reads can be aligned and that the method produces assemblies of length comparable to the true sequence while limiting the error rate, in cases where classic approaches fail to recover the correct length. Our method allowed us to assemble the first full mitochondrial genome for the western-grey kangaroo. Finally, we discuss how the core algorithm of this method could be improved and combined with other approaches to analyse larger genomic sequences.


2015 ◽  
Author(s):  
James Crall ◽  
Nick Gravish ◽  
Andrew M Mountcastle ◽  
Stacey A Combes

A fundamental challenge common to studies of animal movement, behavior, and ecology is the collection of high-quality datasets on the spatial positions of animals as they change through space and time. Recent innovations in tracking technology have allowed researchers to collect large and highly accurate datasets on animal spatiotemporal position while vastly decreasing the time and cost of collecting such data. One technique of particular relevance to behavioral ecology involves tracking visual tags that can be uniquely identified in separate images or movie frames. These tags can be located within visually complex images, making them particularly well suited for longitudinal studies of animal behavior and movement in naturalistic environments. Several software packages have been developed that use computer vision to identify visual tags, but they are either (a) not optimized for the identification of single tags, which is generally of the most interest to biologists, or (b) encumbered by licensing issues; their use in the study of animal behavior has therefore been limited. Here, we present BEEtag, an open-source, image-based tracking system in Matlab that allows unique identification of individual animals or anatomical markers. The primary advantages of this system are that it (a) independently identifies animals or marked points in each frame of a video, limiting error propagation, (b) performs well in images with complex backgrounds, and (c) is low-cost. To validate the use of this tracking system in animal behavior, we mark and track individual bumblebees (Bombus impatiens) and recover individual patterns of space use and activity within the hive. Finally, we discuss the advantages and limitations of this software package and its application to the study of animal movement, behavior, and ecology.
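
A minimal Python/OpenCV sketch of the design point the abstract emphasizes: decoding tags independently in every frame, so a missed or wrong detection in one frame never propagates to the next. BEEtag itself is a Matlab package; the `decode_tags` helper here is a hypothetical placeholder for a real tag decoder.

```python
# Sketch of frame-independent tag tracking: each frame is decoded on
# its own, so an error in one frame cannot propagate to later frames.
# decode_tags() is a hypothetical stand-in for a real decoder such as
# BEEtag's; here it simply returns no detections.
import cv2

def decode_tags(gray_frame):
    """Placeholder: return a list of (tag_id, x, y) detections."""
    return []

def track(video_path):
    records = []                       # (frame_index, tag_id, x, y)
    cap = cv2.VideoCapture(video_path)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for tag_id, x, y in decode_tags(gray):
            records.append((frame_idx, tag_id, x, y))
        frame_idx += 1
    cap.release()
    return records
```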


2012 ◽  
Vol 2 (3) ◽  
pp. 216-223 ◽  
Author(s):  
S. A. Younes ◽  
A. G. Elmezayen

Abstract The principal limiting error source in the Global Positioning System (GPS) is the mismodeling of the delay experienced by radio waves propagating through the atmosphere. The atmosphere causing this delay consists of two main layers: the ionosphere and the troposphere. The ionospheric delay can be mitigated using dual-frequency receivers, but the tropospheric delay is usually corrected using a standard tropospheric model. The tropospheric delay can be described as the product of the delay at the zenith and a mapping function, which models the elevation dependence of the propagation delay. A large number of mapping functions have been developed for use in the analysis of space geodetic data. An assessment of most of these mapping functions, including those developed by Niell (NMF), Herring (MTT), Davis (CfA-2.2), Ifadis, Chao, Black & Eisner (B&E), Yang & Ping, Moffett, Vienna (VMF), and Isobaric (IMF), has been performed. The behavior of these mapping functions was assessed by comparing their results with highly accurate Numerical Integration based Models (NIM) for three stations in Egypt (Aswan, Helwan, and Mersa Matrouh) at different times throughout the year. The meteorological data used in this study were taken from the Egyptian Meteorological Authority (EMA) as average values between 1990 and 2005. It can be concluded that the Black & Eisner mapping function is recommended for dry tropospheric delay prediction at low zenith angles, whereas VMF is the choice for elevation angles up to 10°.
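
For a concrete sense of what a mapping function computes, the sketch below implements the widely cited Black & Eisner form, m(E) = 1.001 / sqrt(0.002001 + sin²E), which scales the zenith delay to an arbitrary elevation angle E. This is the textbook expression, not a reproduction of the paper's numerical-integration comparison, and the zenith delay value is an illustrative assumption.

```python
# Black & Eisner mapping function: scales the zenith tropospheric
# delay to a slant delay at elevation angle E (degrees):
#   m(E) = 1.001 / sqrt(0.002001 + sin^2(E))
from math import sin, radians, sqrt

def black_eisner(elevation_deg):
    s = sin(radians(elevation_deg))
    return 1.001 / sqrt(0.002001 + s * s)

zenith_delay_m = 2.3                   # assumed total zenith delay, metres
for e in (90, 30, 10):
    print(f"E = {e:2d} deg: slant delay = {zenith_delay_m * black_eisner(e):.2f} m")
```

At the zenith (E = 90°) the factor is essentially 1, while at E = 10° the slant path is roughly 5.6 times longer, which is why the low-elevation behavior dominates the comparison between mapping functions.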


2009 ◽  
Vol 27 (4) ◽  
pp. 241-250
Author(s):  
Mark P. Widrlechner ◽  
Janette R. Thompson ◽  
Emily J. Kapler ◽  
Kristen Kordecki ◽  
Philip M. Dixon ◽  
...  

Abstract Accurate methods to predict the naturalization of non-native woody plants are key components of risk-management programs being considered by nursery and landscape professionals. The objective of this study was to evaluate four decision-tree models to predict naturalization (first tested in Iowa) on two new sets of data for non-native woody plants cultivated in the Chicago region. We identified life-history traits and native ranges for 193 species (52 known to naturalize and 141 not known to naturalize) in two study areas within the Chicago region. We used these datasets to test four models (one continental-scale and three regional-scale) as a form of external validation. Application of the continental-scale model resulted in classification rates of 72–76%, horticulturally limiting (false positive) error rates of 20–24%, and biologically significant (false negative) error rates of 5–6%. Two regional modifications to the continental model gave increased classification rates (85–93%) and generally lower horticulturally limiting error rates (16–22%), but similar biologically significant error rates (5–8%). A simpler method, the CART model developed from the Iowa data, resulted in lower classification rates (70–72%) and higher biologically significant error rates (8–10%), but, to its credit, it also had much lower horticulturally limiting error rates (5–10%). A combination of models to capture both high classification rates and low error rates will likely be the most effective until improved protocols based on multiple regional datasets can be developed and validated.
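
A small sketch of how such rates fall out of a confusion matrix, using the abstract's terminology: a horticulturally limiting error is a false positive (a safe plant flagged as a naturalizer) and a biologically significant error is a false negative (a naturalizer accepted). The counts below are made up for illustration.

```python
# Illustration with hypothetical counts: the three rates reported in
# the abstract, each expressed as a fraction of all species screened.
tp, tn, fp, fn = 44, 110, 25, 8        # hypothetical confusion-matrix counts
total = tp + tn + fp + fn

classification_rate = (tp + tn) / total      # correctly classified species
horticulturally_limiting = fp / total        # safe plants wrongly rejected
biologically_significant = fn / total        # naturalizers wrongly accepted

print(f"classification rate:      {classification_rate:.0%}")
print(f"horticulturally limiting: {horticulturally_limiting:.0%}")
print(f"biologically significant: {biologically_significant:.0%}")
```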


Geophysics ◽  
2006 ◽  
Vol 71 (6) ◽  
pp. J71-J80 ◽  
Author(s):  
Maria A. Annecchione ◽  
Pierre Keating ◽  
Michel Chouteau

Airborne gravimeters based on inertial navigation system (INS) technology are capable, in theory, of providing direct observations of the horizontal components of anomalous gravity. However, their accuracy and usefulness in geophysical or geological applications are unknown. Determining the accuracy of airborne horizontal-component data is complicated by the lack of ground-surveyed control data. We determine the accuracy of airborne vector gravity data internally, using repeatedly flown line data. Multilevel wavelet analyses of the raw vector gravity data elucidate the limiting error source for the horizontal components. We demonstrate the usefulness of the airborne horizontal-component data by performing Euler deconvolutions on real vector gravity data. The accuracy of the horizontal components is lower than that of the vertical component. Wavelet analyses of data from a test flight over Alexandria, Ontario, Canada, show that the main error source limiting the accuracy of the horizontal components is time-dependent platform alignment error. Euler deconvolutions performed on the Timmins data set show that the horizontal components help constrain the 3D locations of regional geological features. We conclude that the quality of the airborne horizontal-component data is sufficient to motivate their use in resource exploration and geological applications.
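
A generic sketch of a multilevel wavelet decomposition with PyWavelets, the kind of analysis used to separate slowly varying drift (such as platform-alignment error) from higher-frequency noise in a gravity time series. The synthetic signal and wavelet choice are illustrative assumptions, not the paper's processing chain.

```python
# Generic multilevel wavelet analysis: decompose a synthetic gravity
# time series into a coarse approximation (slow drift) plus detail
# bands, then report the energy in each band to see which scales
# dominate the signal.
import numpy as np
import pywt

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 4096)
signal = 5.0 * t**2 + 0.3 * rng.standard_normal(t.size)  # drift + noise

coeffs = pywt.wavedec(signal, "db4", level=5)  # [cA5, cD5, ..., cD1]
labels = ["cA5"] + [f"cD{level}" for level in range(5, 0, -1)]
for name, c in zip(labels, coeffs):
    print(f"{name}: energy = {np.sum(c**2):.1f}")
```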

