A Universal Sequencing System for Unknown Oligomers

No synthetic chemical system can produce complex oligomers with fidelities comparable to biological systems. To bridge this gap, chemists must be able to characterise synthetic oligomers. Currently there are no tools for identifying synthetic oligomers with sequence resolution. Herein, we present a system that allows us to do omics-level sequencing for synthetic oligomers and use this to explore unconstrained complex mixtures. The system, Oligomer-Soup-Sequencing (OLIGOSS), can sequence individual oligomers in heterogeneous and polydisperse mixtures from tandem mass spectrometry (MS/MS) data. Unlike existing software, OLIGOSS can sequence oligomers with different backbone chemistries. Using an input file format, OLIG, that formalizes the set of abstract properties, any MS/MS fragmentation pathway can be defined. This has been demonstrated on four model systems of linear oligomers. OLIGOSS can screen large sequence spaces, enabling reliable sequencing of synthetic oligomeric mixtures, with false discovery rates (FDRs) of 0-1.1%, providing sequence resolution comparable to bioinformatic tools.

10.26434/chemrxiv.13202969.v1 ◽

2020 ◽

Author(s):

David Doran ◽

Emma Clarke ◽

Graham Keenan ◽

Emma Carrick ◽

Cole Mathis ◽

...

Keyword(s):

Fragmentation Pathway ◽

Model Systems ◽

Chemical System ◽

File Format ◽

Input File ◽

False Discovery Rates ◽

Bioinformatic Tools ◽

False Discovery ◽

Input File Format ◽

Discovery Rates

Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2013-0003 ◽

2013 ◽

Vol 12 (4) ◽

Cited By ~ 7

Author(s):

David R. Bickel

Keyword(s):

False Discovery Rates ◽

P Values ◽

False Discovery ◽

Discovery Rates

False discovery rates and copy number variation

Biometrika ◽

10.1093/biomet/asr018 ◽

2011 ◽

Vol 98 (2) ◽

pp. 251-271 ◽

Cited By ~ 11

Author(s):

Bradley Efron ◽

Nancy R. Zhang

Keyword(s):

Copy Number Variation ◽

Copy Number ◽

False Discovery Rates ◽

False Discovery ◽

Number Variation ◽

Discovery Rates

Signal identification for rare and weak features: higher criticism or false discovery rates?

Biostatistics ◽

10.1093/biostatistics/kxs030 ◽

2012 ◽

Vol 14 (1) ◽

pp. 129-143 ◽

Cited By ~ 14

Author(s):

Bernd Klaus ◽

Korbinian Strimmer

Keyword(s):

Signal Identification ◽

False Discovery Rates ◽

Higher Criticism ◽

False Discovery ◽

Discovery Rates

A Comparison of Two Classes of Methods for Estimating False Discovery Rates in Microarray Studies

Scientifica ◽

10.6064/2012/519394 ◽

2012 ◽

Vol 2012 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Emily Hansen ◽

Kathleen F. Kerr

Keyword(s):

Differentially Expressed Genes ◽

Null Distribution ◽

Differentially Expressed ◽

Test Statistics ◽

False Discovery Rates ◽

Model Method ◽

False Discovery ◽

Microarray Studies ◽

Discovery Rates

The goal of many microarray studies is to identify genes that are differentially expressed between two classes or populations. Many data analysts choose to estimate the false discovery rate (FDR) associated with the list of genes declared differentially expressed. Estimating an FDR largely reduces to estimating π1, the proportion of differentially expressed genes among all analyzed genes. Estimating π1 is usually done through P-values, but computing P-values can be viewed as a nuisance and potentially problematic step. We evaluated methods for estimating π1 directly from test statistics, circumventing the need to compute P-values. We adapted existing methodology for estimating π1 from t- and z-statistics so that π1 could be estimated from other statistics. We compared the quality of these estimates to estimates generated by two established methods for estimating π1 from P-values. Overall, methods varied widely in bias and variability. The least biased and least variable estimates of π1, the proportion of differentially expressed genes, were produced by applying the “convest” mixture model method to P-values computed from a pooled permutation null distribution. Estimates computed directly from test statistics rather than P-values did not reliably perform well.
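
To illustrate why estimating an FDR largely reduces to estimating π1, here is a minimal Python sketch. It uses a simple Storey-type estimator with a fixed tuning value, which is not the "convest" method or any of the other methods evaluated in the abstract above. It assumes P-values of non-differentially-expressed genes are uniform on [0, 1]; the function names and the default `lam=0.5` are illustrative choices, not taken from the paper.

```python
def estimate_pi1(p_values, lam=0.5):
    """Storey-type estimate of pi1, the proportion of non-null tests.

    Assumption: null P-values are uniform on [0, 1], so P-values above
    `lam` come almost entirely from nulls. `lam` is a hypothetical
    tuning choice made for this sketch.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("p_values must not be empty")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    n_above = sum(1 for p in p_values if p > lam)
    # Estimated null proportion pi0, capped at 1.
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0


def estimate_fdr(p_values, threshold, pi1):
    """Estimated FDR when all tests with P <= threshold are declared significant."""
    m = len(p_values)
    n_rejected = sum(1 for p in p_values if p <= threshold)
    if n_rejected == 0:
        return 0.0
    # Expected number of false positives is pi0 * m * threshold.
    expected_false = (1.0 - pi1) * m * threshold
    return min(1.0, expected_false / n_rejected)


if __name__ == "__main__":
    # Synthetic data: 800 evenly spread "null" P-values, 200 very small ones.
    nulls = [(i + 0.5) / 800 for i in range(800)]
    non_nulls = [1e-4 * (i + 1) / 200 for i in range(200)]
    p_values = nulls + non_nulls

    pi1 = estimate_pi1(p_values)
    print("estimated pi1:", pi1)
    print("estimated FDR at P <= 0.01:", estimate_fdr(p_values, 0.01, pi1))
```

Once π1 (equivalently π0 = 1 − π1) is fixed, the FDR estimate at any P-value cutoff follows from a simple count, which is why the quality of the π1 estimate dominates the quality of the FDR estimate.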

False discovery rates: a new deal

Biostatistics ◽

10.1093/biostatistics/kxw041 ◽

2016 ◽

pp. kxw041 ◽

Cited By ~ 66

Author(s):

Matthew Stephens

Keyword(s):

New Deal ◽

False Discovery Rates ◽

False Discovery ◽

Discovery Rates

Large and ancient linguistic areas

Language Dispersal, Diversification, and Contact ◽

10.1093/oso/9780198723813.003.0005 ◽

2020 ◽

pp. 78-100

Author(s):

Balthasar Bickel

Keyword(s):

Regression Models ◽

Large Scale ◽

Population History ◽

False Discovery Rates ◽

Ancient Population ◽

False Discovery ◽

Language Universals ◽

Discovery Rates ◽

Pacific Area

Large-scale areal patterns point to ancient population history and form a well-known confound for language universals. Despite their importance, demonstrating such patterns remains a challenge. This chapter argues that large-scale area hypotheses are better tested by modeling diachronic family biases than by controlling for genealogical relations in regression models. A case study of the Trans-Pacific area reveals that diachronic bias estimates do not depend much on the amount of phylogenetic information that is used when inferring them. After controlling for false discovery rates, about 39 variables in WALS and AUTOTYP show diachronic biases that differ significantly inside vs. outside the Trans-Pacific area. Nearly three times as many biases hold outside as inside the Trans-Pacific area, indicating that the Trans-Pacific area is not so much characterized by the spread of biases but rather by the retention of earlier diversity, in line with earlier suggestions in the literature.
