Robust detection and identification of sparse segments in ultrahigh dimensional data analysis

Author(s):  
T. Tony Cai ◽  
X. Jessie Jeng ◽  
Hongzhe Li
2021 ◽  
Author(s):  
David Bohnenkamp ◽  
Jan Behmann ◽  
Stefan Paulus ◽  
Ulrike Steiner ◽  
Anne-Katrin Mahlein

This work established a time-series hyperspectral library of important foliar diseases of wheat to detect the spectral changes, from infection to symptom appearance, induced by different pathogens. The data were generated under controlled conditions at the leaf scale. The transition from healthy to diseased leaf tissue was assessed, and spectral shifts were identified and used in combination with histological investigations to define developmental stages in pathogenesis for each disease. The spectral signatures of each plant disease that are indicative of a certain developmental stage during pathogenesis, defined as turning points, were combined into a spectral library. Different machine learning methods were applied and compared to test the potential of this library for the detection and quantification of foliar diseases in hyperspectral images. All evaluated classifiers achieved high accuracy, up to 99%, in detecting and identifying both the biotrophic and the necrotrophic fungi. The potential of applying spectral analysis methods in combination with a spectral library for the detection and identification of plant diseases is demonstrated. Further evaluation and development of these algorithms should contribute to a robust detection and identification system for plant diseases at different developmental stages, and to the promotion and development of site-specific management of plant diseases under field conditions.
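As a rough illustration of how a spectral library can be used for identification, the sketch below matches a pixel spectrum to the closest library entry by spectral angle. This is an assumption made for illustration only: the abstract does not name the classifiers that were evaluated, and the four-band spectra and labels here are made-up values.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two reflectance spectra of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos)

def classify(pixel, library):
    """Return the library label whose reference spectrum has the smallest angle to the pixel."""
    return min(library, key=lambda label: spectral_angle(pixel, library[label]))

# Hypothetical four-band reference spectra (illustrative values, not measured data).
library = {
    "healthy": [0.05, 0.10, 0.45, 0.50],
    "rust": [0.08, 0.15, 0.30, 0.35],
    "septoria": [0.12, 0.14, 0.20, 0.22],
}
label = classify([0.06, 0.11, 0.44, 0.49], library)
```

The spectral angle ignores overall brightness, so a pixel that is a scaled copy of a reference spectrum still matches it.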


1994 ◽  
Vol 37 (3) ◽  
Author(s):  
O. K. Kedrov ◽  
V. E. Permyakova

A new concept and methodology are proposed for regional seismic arrays (RSAs) equipped with three-component (3C) sensors (Z, NS, EW). Such a system could be a more effective tool for investigating the Earth's interior. This is achieved by introducing polarization filtering of 3C seismic records as an effective means of noise suppression and of robust detection and identification of secondary body phases. The proposed algorithm consists of: 1) linear-phase band-pass filtering of the N 3C records in M frequency bands; 2) polarization filtering of all 3C records in all L directions in which array beams are routinely oriented; 3) calculation of L beams in M bands using the polarized P, SV and SH traces of individual sensors; 4) detection of signals on the L*M P, SV and SH traces; 5) location of the event. The main new procedures are 2) and 3). With them, procedures 4) and 5) should improve on those routinely used at RSAs today. This work includes a theoretical consideration of the proposed method's efficiency and preliminary experimental results.
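The beam calculation in step 3) can be sketched as a plain delay-and-sum over sensor traces. This is a minimal sketch, not the authors' implementation: it assumes integer sample delays that are already known for one beam direction, and it leaves out the band-pass and polarization filtering of steps 1) and 2). The function name and the toy traces are hypothetical.

```python
def delay_and_sum(traces, delays):
    """Shift each trace by its integer sample delay and average them into one beam."""
    # Keep only the samples that every shifted trace can supply.
    n = min(len(t) - d for t, d in zip(traces, delays))
    return [
        sum(t[d + i] for t, d in zip(traces, delays)) / len(traces)
        for i in range(n)
    ]

# Hypothetical example: the same pulse arrives 0, 1 and 2 samples late at three sensors.
pulse = [0.0, 1.0, 0.0, 0.0]
traces = [
    pulse + [0.0, 0.0],
    [0.0] + pulse + [0.0],
    [0.0, 0.0] + pulse,
]
beam = delay_and_sum(traces, delays=[0, 1, 2])
unaligned = delay_and_sum(traces, delays=[0, 0, 0])
```

With the correct delays the pulse adds coherently and the beam reproduces it at full amplitude; with zero delays the same energy is smeared over three samples at one third of the amplitude.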


GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Katrina L Kalantar ◽  
Tiago Carvalho ◽  
Charles F A de Bourcy ◽  
Boris Dimitrov ◽  
Greg Dingle ◽  
...  

Abstract Background Metagenomic next-generation sequencing (mNGS) has enabled the rapid, unbiased detection and identification of microbes without pathogen-specific reagents, culturing, or a priori knowledge of the microbial landscape. mNGS data analysis requires a series of computationally intensive processing steps to accurately determine the microbial composition of a sample. Existing mNGS data analysis tools typically require bioinformatics expertise and access to local server-class hardware resources. For many research laboratories, this presents an obstacle, especially in resource-limited environments. Findings We present IDseq, an open source cloud-based metagenomics pipeline and service for global pathogen detection and monitoring (https://idseq.net). The IDseq Portal accepts raw mNGS data, performs host and quality filtration steps, then executes an assembly-based alignment pipeline, which results in the assignment of reads and contigs to taxonomic categories. The taxonomic relative abundances are reported and visualized in an easy-to-use web application to facilitate data interpretation and hypothesis generation. Furthermore, IDseq supports environmental background model generation and automatic internal spike-in control recognition, providing statistics that are critical for data interpretation. IDseq was designed with the specific intent of detecting novel pathogens. Here, we benchmark novel virus detection capability using both synthetically evolved viral sequences and real-world samples, including IDseq analysis of a nasopharyngeal swab sample acquired and processed locally in Cambodia from a tourist from Wuhan, China, infected with the recently emergent SARS-CoV-2. Conclusion The IDseq Portal reduces the barrier to entry for mNGS data analysis and enables bench scientists, clinicians, and bioinformaticians to gain insight from mNGS datasets for both known and novel pathogens.
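To make the idea of relative abundance and background-model statistics concrete, here is a small sketch. It assumes two commonly used quantities, reads per million and a z-score against control samples. The abstract does not state which formulas IDseq uses, so the function names, parameters and numbers below are all illustrative assumptions.

```python
from statistics import mean, stdev

def reads_per_million(taxon_reads, total_reads):
    """Normalize a taxon's read count to reads per million sequenced reads."""
    return taxon_reads * 1_000_000 / total_reads

def background_z_score(sample_rpm, background_rpms):
    """Z-score of a sample value against background (control) values, using the sample standard deviation."""
    return (sample_rpm - mean(background_rpms)) / stdev(background_rpms)

# Hypothetical counts: 250 reads assigned to one taxon out of 5 million reads.
rpm = reads_per_million(taxon_reads=250, total_reads=5_000_000)
# Hypothetical reads-per-million values for the same taxon in three control samples.
z = background_z_score(rpm, [2.0, 4.0, 6.0])
```

A large positive z-score marks a taxon as far more abundant in the sample than in the controls, which is the kind of signal a background model is meant to surface.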


Author(s):  
Katrina L. Kalantar ◽  
Tiago Carvalho ◽  
Charles F.A. de Bourcy ◽  
Boris Dimitrov ◽  
Greg Dingle ◽  
...  

ABSTRACT Background Metagenomic next generation sequencing (mNGS) has enabled the rapid, unbiased detection and identification of microbes without pathogen-specific reagents, culturing, or a priori knowledge of the microbial landscape. mNGS data analysis requires a series of computationally intensive processing steps to accurately determine the microbial composition of a sample. Existing mNGS data analysis tools typically require bioinformatics expertise and access to local server-class hardware resources. For many research laboratories, this presents an obstacle, especially in resource limited environments. Findings We present IDseq, an open source cloud-based metagenomics pipeline and service for global pathogen detection and monitoring (https://idseq.net). The IDseq Portal accepts raw mNGS data, performs host and quality filtration steps, then executes an assembly-based alignment pipeline which results in the assignment of reads and contigs to taxonomic categories. The taxonomic relative abundances are reported and visualized in an easy-to-use web application to facilitate data interpretation and hypothesis generation. Furthermore, IDseq supports environmental background model generation and automatic internal spike-in control recognition, providing statistics which are critical for data interpretation. IDseq was designed with the specific intent of detecting novel pathogens. Here, we benchmark novel virus detection capability using both synthetically evolved viral sequences, and real-world samples, including IDseq analysis of a nasopharyngeal swab sample acquired and processed locally in Cambodia from a tourist from Wuhan, China, infected with the recently emergent SARS-CoV-2. Conclusion The IDseq Portal reduces the barrier to entry for mNGS data analysis and enables bench scientists, clinicians, and bioinformaticians to gain insight from mNGS datasets for both known and novel pathogens.


2016 ◽  
Vol 21 (9) ◽  
pp. 942-955 ◽  
Author(s):  
Rajarshi Guha ◽  
Lesley A. Mathews Griner ◽  
Jonathan M. Keller ◽  
Xiaohu Zhang ◽  
David Fitzgerald ◽  
...  

Synthetic lethal screens are used to discover new combination treatments for cancer. In traditional high-throughput synthetic lethal screens, compounds are tested at a single dose, and hit selection is based on threshold activity values from the variance of the efficacy of the compounds tested. The limitation of the single-dose screening for synthetic lethal screens is that it does not allow for the robust detection of differential activities from compound collections with a broad range of potencies and efficacies. There is therefore a need to develop screening approaches that enable the identification of compounds with synthetic lethal effects based on changes in both potency and efficacy. Here we describe the implementation of a dose response–based synthetic lethal screen to find drugs that enhance or mitigate the cytotoxic effect of an immunotoxin protein (HA22). We developed a data analysis framework for the selection of compounds with enhancing or mitigating cytotoxic activities based on the use of dose-response parameters. The data analysis framework includes an ensemble ranking approach that allows the use of multiple dose-response parameters in a nonparametric fashion. Quantitative high-throughput screening (HTS) enables the identification of compounds with synthetic lethal activity not identified by single-dose HTS.
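One simple way to combine several dose-response parameters without distributional assumptions is to rank compounds on each parameter and average the ranks. The sketch below assumes that scheme for illustration; the abstract does not describe the authors' ensemble ranking in detail, and the parameter names and values are hypothetical.

```python
def ranks(values):
    """Rank values from largest (rank 1) to smallest; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def ensemble_rank(compounds, parameter_table):
    """Order compounds by their mean rank across all parameters (best first)."""
    per_param = [ranks(column) for column in parameter_table.values()]
    mean_rank = [
        sum(r[i] for r in per_param) / len(per_param)
        for i in range(len(compounds))
    ]
    return [c for _, c in sorted(zip(mean_rank, compounds))]

# Hypothetical compounds and dose-response differences (larger means a stronger effect).
compounds = ["A", "B", "C", "D"]
table = {
    "potency_shift": [0.2, 1.5, 0.9, 1.1],
    "efficacy_shift": [10.0, 35.0, 40.0, 5.0],
}
ordered = ensemble_rank(compounds, table)
```

Because only ranks are combined, parameters measured on different scales contribute equally, and a compound that scores well on both potency and efficacy rises above one that is extreme on a single parameter.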


2009 ◽  
Vol 75 (12) ◽  
pp. 4185-4193 ◽  
Author(s):  
R. van Doorn ◽  
M. Sławiak ◽  
M. Szemes ◽  
A. M. Dullemans ◽  
P. Bonants ◽  
...  

ABSTRACT Simultaneous detection and identification of multiple pathogenic microorganisms in complex environmental samples are required in numerous diagnostic fields. Here, we describe the development of a novel, background-free ligation detection (LD) system using a single compound detector probe per target. The detector probes used, referred to as padlock probes (PLPs), are long oligonucleotides containing asymmetric target complementary regions at both their 5′ and 3′ ends which confer extremely specific target detection. Probes also incorporate a desthiobiotin moiety and an internal endonuclease IV cleavage site. DNA samples are PCR amplified, and the resulting products serve as potential targets for PLP ligation. Upon perfect target hybridization, the PLPs are circularized via enzymatic ligation, captured, and cleaved, allowing only the originally ligated PLPs to be visualized on a universal microarray. Unlike previous procedures, the probes themselves are not amplified, thereby allowing a simple PLP cleavage to yield a background-free assay. We designed and tested nine PLPs targeting several oomycetes and fungi. All of the probes specifically detected their corresponding targets and provided perfect discrimination against closely related nontarget organisms, yielding an assay sensitivity of 1 pg genomic DNA and a dynamic detection range of 10^4. A practical demonstration with samples collected from horticultural water circulation systems was performed to test the robustness of the newly developed multiplex assay. This novel LD system enables highly specific detection and identification of multiple pathogens over a wide range of target concentrations and should be easily adaptable to a variety of applications in environmental microbiology.


2018 ◽  
Author(s):  
Max Robinson ◽  
Anat Zimmer ◽  
Terry Farrah ◽  
Denise E. Mauldin ◽  
Nathan D. Price ◽  
...  

Abstract Scale invariance is a common property of physical laws and a key concept in perspective drawing, which aims to provide a meaningful two-dimensional representation of a more complex, three-dimensional scene. Here we describe Scale Invariant Geometric Data Analysis (SIGDA), a new, general exploratory data analysis (EDA) method based on normalization of data to scale invariance. We discuss similarities and differences between SIGDA and two widely used EDA methods, Correspondence Analysis (CA) and Principal Components Analysis (PCA). We then illustrate SIGDA’s ability to analyze and visualize population structure relationships within the data that inspired its development: genetic marker data, in which context PCA is considered a standard method. We show that SIGDA provides significant advantages over PCA of the same data, including: (a) robust detection and separation of a larger number of population axes, leading to (b) better separation of annotated populations; (c) separation of an independent allele frequency axis interpretable as a proxy for allele age; (d) visualization of marker flow between populations (population history); and (e) robust detection and visualization of relationships between closely related individuals and among family groups. Although this illustration focuses on a specific task, SIGDA is a general-purpose EDA method and derives its advantages from its novel approach to fundamental issues in data analysis, rather than clever sampling or other task-specific methodology. One Sentence Summary We illustrate the advantages of Scale Invariant Geometric Data Analysis (SIGDA), a new exploratory data analysis method similar to PCA, by applying SIGDA to derive detailed, robust visualizations of the complex history of human population structure from a large sample of single nucleotide variants.


Author(s):  
C.D. Humphrey ◽  
T.L. Cromeans ◽  
E.H. Cook ◽  
D.W. Bradley

There is a variety of methods available for the rapid detection and identification of viruses by electron microscopy as described in several reviews. The predominant techniques are classified as direct electron microscopy (DEM), immune electron microscopy (IEM), liquid phase immune electron microscopy (LPIEM) and solid phase immune electron microscopy (SPIEM). Each technique has inherent strengths and weaknesses. However, in recent years, the most progress for identifying viruses has been realized by the utilization of SPIEM.

