Blind normalization of public high-throughput databases

PeerJ Computer Science ◽

10.7717/peerj-cs.231 ◽

2019 ◽

Vol 5 ◽

pp. e231

Author(s):

Sebastian Ohse ◽

Melanie Boerries ◽

Hauke Busch

Keyword(s):

High Throughput ◽

Cell Biology ◽

Large Scale ◽

Missing Values ◽

Ad Hoc ◽

Confounding Factors ◽

Noise Sources ◽

Biological Signal ◽

Public Data ◽

Meta Analyses

The rise of high-throughput technologies in the domain of molecular and cell biology, as well as medicine, has generated an unprecedented amount of quantitative high-dimensional data. Public databases at present make a wealth of this data available, but appropriate normalization is critical for meaningful analyses integrating different experiments and technologies. Without such normalization, meta-analyses can be difficult to perform and the potential to address shortcomings in experimental designs, such as inadequate replicates or controls with public data, is limited. Because of a lack of quantitative standards and insufficient annotation, large scale normalization across entire databases is currently limited to approaches that demand ad hoc assumptions about noise sources and the biological signal. By leveraging detectable redundancies in public databases, such as related samples and features, we show that blind normalization without constraints on noise sources and the biological signal is possible. The inherent recovery of confounding factors is formulated in the theoretical framework of compressed sensing and employs efficient optimization on manifolds. As public databases increase in size and offer more detectable redundancies, the proposed approach is able to scale to more complex confounding factors. In addition, the approach accounts for missing values and can incorporate spike-in controls. Our work presents a systematic approach to the blind normalization of public high-throughput databases.


An Open-Source Framework for Automated High-Throughput Cell Biology Experiments

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.697584 ◽

2021 ◽

Vol 9 ◽

Author(s):

Pavel Katunin ◽

Jianbo Zhou ◽

Ola M. Shehata ◽

Andrew A. Peden ◽

Ashley Cadby ◽

...

Keyword(s):

Open Source ◽

High Throughput ◽

Cell Biology ◽

Large Scale ◽

Low Cost ◽

Fluorescent Labeling ◽

Open Source Framework ◽

Open Source Hardware ◽

High Degree ◽

Data Analysis Methods

Modern data analysis methods, such as optimization algorithms or deep learning, have been successfully applied to a number of biotechnological and medical questions. For these methods to be efficient, a large number of high-quality and reproducible experiments need to be conducted, requiring a high degree of automation. Here, we present an open-source hardware and low-cost framework that allows for automatic high-throughput generation of large amounts of cell biology data. Our design consists of an epifluorescent microscope with an automated XY stage for moving a multiwell plate containing cells and a perfusion manifold allowing programmed application of up to eight different solutions. Our system is very flexible and can be adapted easily for individual experimental needs. To demonstrate the utility of the system, we have used it to perform high-throughput Ca²⁺ imaging and large-scale fluorescent labeling experiments.


An overview of algorithms and associated applications for single cell RNA-Seq data imputation

Current Genomics ◽

10.2174/1389202921999200716104916 ◽

2020 ◽

Vol 21 ◽

Author(s):

Zarrin Basharat ◽

Sania Majeed ◽

Humaira Saleem ◽

Ishtiaq Ahmad Khan ◽

Azra Yasmin

Keyword(s):

Single Cell ◽

Large Scale ◽

Missing Values ◽

Ad Hoc ◽

Cell Types ◽

Learning Approaches ◽

Data Imputation ◽

Rna Seq ◽

Accurate Analysis ◽

Heterogeneous Datasets

Single-cell RNA-Seq technology enables assessment of RNA expression in individual cells. This makes it popular in experimental biology for characterizing novel cell types as well as inferring heterogeneity. Experimental data conventionally contain zero counts or dropout events for many single-cell transcripts. Such missing data hamper accurate analysis with standard workflows, which are designed for bulk RNA-Seq datasets. Imputation for single-cell datasets is done to infer the missing values. This was traditionally done with ad hoc code, but customized pipelines, workflows and specialized software later appeared for the purpose. This made it easier to benchmark and cluster data in an organized manner. In this review, we have assembled a catalog of available single-cell RNA-Seq imputation algorithms/workflows and associated software for the scientific community performing single-cell RNA-Seq data analysis. Continued development of imputation methods, especially using deep learning approaches, would be necessary for eliminating associated pitfalls and addressing challenges associated with future large-scale and heterogeneous datasets.


Accelerated Discovery of High-Refractive-Index Polyimides via First-Principles Molecular Modeling, Virtual High-Throughput Screening, and Data Mining

10.26434/chemrxiv.7670903.v1 ◽

2019 ◽

Author(s):

Mohammad Atif Faiz Afzal ◽

Mojtaba Haghighatlari ◽

Sai Prasad Ganesh ◽

Chong Cheng ◽

Johannes Hachmann

Keyword(s):

Data Mining ◽

Refractive Index ◽

High Throughput ◽

First Principles ◽

High Throughput Screening ◽

Large Scale ◽

Computational Study ◽

High Refractive Index ◽

Structural Features ◽

Learning Program

We present a high-throughput computational study to identify novel polyimides (PIs) with exceptional refractive index (RI) values for use as optic or optoelectronic materials. Our study utilizes an RI prediction protocol based on a combination of first-principles and data modeling developed in previous work, which we employ on a large-scale PI candidate library generated with the ChemLG code. We deploy the virtual screening software ChemHTPS to automate the assessment of this extensive pool of PI structures in order to determine the performance potential of each candidate. This rapid and efficient approach yields a number of highly promising lead compounds. Using the data mining and machine learning program package ChemML, we analyze the top candidates with respect to prevalent structural features and feature combinations that distinguish them from less promising ones. In particular, we explore the utility of various strategies that introduce highly polarizable moieties into the PI backbone to increase its RI yield. The derived insights provide a foundation for rational and targeted design that goes beyond traditional trial-and-error searches.


Identification of and Correction for Publication Bias: Comment

10.31222/osf.io/dh87m ◽

2019 ◽

Author(s):

Amanda Kvarven ◽

Eirik Strømland ◽

Magnus Johannesson

Keyword(s):

Publication Bias ◽

False Positive ◽

Large Scale ◽

Meta Analysis ◽

False Positive Rate ◽

Effect Sizes ◽

Replication Studies ◽

Moderate Reduction ◽

Positive Rate ◽

Meta Analyses

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the result of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.


High Throughput via Cross-Layer Interference Alignment for Mobile Ad Hoc Networks

10.21236/ada596282 ◽

2013 ◽

Author(s):

Robert W. Heath Jr.

Keyword(s):

Ad Hoc Networks ◽

Mobile Ad Hoc Networks ◽

High Throughput ◽

Ad Hoc ◽

Interference Alignment ◽

Cross Layer ◽

Mobile Ad Hoc ◽

Hoc Networks


Next-Generation Sequencing: An Emerging Tool for Drug Designing

Current Pharmaceutical Design ◽

10.2174/1381612825666190911155508 ◽

2019 ◽

Vol 25 (31) ◽

pp. 3350-3357 ◽

Cited By ~ 1

Author(s):

Pooja Tripathi ◽

Jyotsna Singh ◽

Jonathan A. Lal ◽

Vijay Tripathi

Keyword(s):

Next Generation Sequencing ◽

High Throughput ◽

Large Scale ◽

Massively Parallel Sequencing ◽

Genomic Research ◽

Biological Research ◽

Next Generation ◽

Human Welfare ◽

Drug Designing ◽

Generation Sequencing

Background: With the advent of high-throughput next-generation sequencing (NGS), the biological research of drug discovery has been directed towards the oncology and infectious disease therapeutic areas, with extensive use in biopharmaceutical development and vaccine production. Method: In this review, an effort was made to address the basic background of NGS technologies and the potential applications of NGS in drug designing. Our purpose is also to provide a brief introduction to various next-generation sequencing techniques. Discussions: The high-throughput methods execute Large-scale Unbiased Sequencing (LUS), which comprises Massively Parallel Sequencing (MPS) or NGS technologies. These are related terms that describe a DNA sequencing technology which has revolutionized genomic research. Using NGS, an entire human genome can be sequenced within a single day. Conclusion: Analysis of NGS data unravels important clues in the quest for the treatment of various life-threatening diseases and other scientific problems related to human welfare.


Large-scale High-throughput Screening Revealed 5'-(carbonylamino)-2,3'-bithiophene-4'-carboxylate as Novel Template for Antibacterial Agents

Current Drug Discovery Technologies ◽

10.2174/1570163816666190603095521 ◽

2020 ◽

Vol 17 (5) ◽

pp. 716-724

Author(s):

Yan A. Ivanenkov ◽

Renat S. Yamidanov ◽

Ilya A. Osterman ◽

Petr V. Sergiev ◽

Vladimir A. Aladinskiy ◽

...

Keyword(s):

Antibacterial Activity ◽

High Throughput ◽

High Throughput Screening ◽

Large Scale ◽

Antibacterial Agents ◽

Sos Response ◽

Luciferase Assay ◽

Translational Machinery ◽

E Coli ◽

New Class

Background: The key issue in the development of novel antimicrobials is the rapid expansion of new bacterial strains resistant to current antibiotics. Indeed, the World Health Organization has reported that bacteria commonly causing infections in hospitals and in the community, e.g. E. coli, K. pneumoniae and S. aureus, have high resistance to the last generations of cephalosporins, carbapenems and fluoroquinolones. During the past decades, only a few successful efforts to develop and launch new antibacterial medications have been made. This study aims to identify a new class of antibacterial agents using a novel high-throughput screening technique. Methods: We designed a library containing 125K compounds not similar in structure (Tanimoto coefficient < 0.7) to those published previously as antibiotics. The HTS platform based on the double reporter system pDualrep2 was used to distinguish between molecules able to block the translational machinery or induce the SOS response in a model E. coli system. MICs for the most active chemicals in LB and M9 medium were determined using a broth microdilution assay. Results: In an attempt to discover novel classes of antibacterials, we performed HTS of a large-scale small-molecule library using our unique screening platform. This approach permitted us to quickly and robustly evaluate a large number of compounds as well as to determine the mechanism of action in the case of compounds being either translational machinery inhibitors or DNA-damaging agents/replication blockers. HTS has resulted in several new structural classes of molecules exhibiting attractive antibacterial activity. Herein, we report 5'-(carbonylamino)-2,3'-bithiophene-4'-carboxylate derivatives as promising antibacterials. The two most active compounds from this series showed MIC values of 1.2 (5) and 1.8 μg/mL (6) and a good selectivity index. Compound 6 caused RFP induction and a low SOS response. An in vitro luciferase assay revealed that it is able to slightly inhibit protein biosynthesis. Compound 5 was tested on several archival strains and exhibited slight activity against gram-negative bacteria and outstanding activity against S. aureus. The key structural requirements for antibacterial potency were also explored. We found that the unsubstituted carboxylic group is crucial for antibacterial activity, as is the presence of bulky hydrophobic substituents at the phenyl fragment. Conclusion: The obtained results provide a solid background for further characterization of the 5'-(carbonylamino)-2,3'-bithiophene-4'-carboxylate derivatives discussed herein as a new class of antibacterials and for their optimization campaign.
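
The dissimilarity filter mentioned in the Methods (Tanimoto coefficient < 0.7 against known antibiotics) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which fingerprints were used, so the toy fingerprints below, represented as sets of on-bit indices, and the helper names `tanimoto` and `is_dissimilar` are assumptions made purely for illustration.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two fingerprints.

    Each fingerprint is a set of on-bit indices: shared bits / total distinct bits.
    """
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def is_dissimilar(candidate: set, known: list, threshold: float = 0.7) -> bool:
    """True if the candidate scores below the threshold against every reference."""
    return all(tanimoto(candidate, ref) < threshold for ref in known)


# Hypothetical toy fingerprints, not real chemical data.
known_antibiotics = [{1, 2, 3, 4, 5}, {2, 4, 6, 8}]
close_analogue = {1, 2, 3, 4, 5, 6}  # shares 5 of 6 distinct bits with the first reference
novel_scaffold = {1, 7, 9, 11}       # shares at most 1 bit with any reference

print(is_dissimilar(close_analogue, known_antibiotics))  # False: 5/6 is above 0.7
print(is_dissimilar(novel_scaffold, known_antibiotics))  # True
```

In practice a cheminformatics toolkit would generate the fingerprints from structures; the comparison step itself reduces to the ratio shown here.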


Large-Scale, High-Throughput Validation of Short Hairpin RNA Sequences for RNA Interference

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057105284342 ◽

2006 ◽

Vol 11 (3) ◽

pp. 236-246 ◽

Cited By ~ 6

Author(s):

Laurence H. Lamarcq ◽

Bradley J. Scherer ◽

Michael L. Phelan ◽

Nikolai N. Kalnine ◽

Yen H. Nguyen ◽

...

Keyword(s):

High Throughput ◽

Large Scale ◽

Strong Support ◽

Gc Content ◽

Rapid Identification ◽

Hairpin Rna ◽

Rna Sequences ◽

Short Hairpin ◽

Short Hairpin Rnas ◽

Interfering Rna

A method for high-throughput cloning and analysis of short hairpin RNAs (shRNAs) is described. Using this approach, 464 shRNAs against 116 different genes were screened for knockdown efficacy, enabling rapid identification of effective shRNAs against 74 genes. Statistical analysis of the effects of various criteria on the activity of the shRNAs confirmed that some of the rules thought to govern small interfering RNA (siRNA) activity also apply to shRNAs. These include moderate GC content, absence of internal hairpins, and asymmetric thermal stability. However, the authors did not find strong support for position-specific rules. In addition, analysis of the data suggests that not all genes are equally susceptible to RNA interference (RNAi).
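
Of the design rules listed above, GC content is the simplest to compute. The sketch below shows only the calculation; the abstract gives no numeric range for "moderate", so no cutoff is applied, and the function name `gc_content` and the example sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical 10-nt example, not a sequence from the study.
example = "GCAUGCAUAU"
print(gc_content(example))  # 4 of 10 bases are G or C
```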


Identification of a juvenile-hormone signaling inhibitor via high-throughput screening of a chemical library

Scientific Reports ◽

10.1038/s41598-020-75386-x ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Takumi Kayukawa ◽

Kenjiro Furuta ◽

Keisuke Nagamine ◽

Tetsuro Shinoda ◽

Kiyoaki Yonesu ◽

...

Keyword(s):

Juvenile Hormone ◽

Cell Line ◽

Signaling Pathway ◽

High Throughput ◽

High Throughput Screening ◽

Large Scale ◽

Ex Vivo ◽

Agricultural Field ◽

Chemical Library ◽

Field Development

Insecticide resistance has recently become a serious problem in the agricultural field. Development of insecticides with new mechanisms of action is essential to overcome this limitation. Juvenile hormone (JH) is an insect-specific hormone that plays key roles in maintaining the larval stage of insects. Hence, the JH signaling pathway is considered a suitable target in the development of novel insecticides; however, only a few JH signaling inhibitors (JHSIs) have been reported, and no practical JHSIs have been developed. Here, we established a high-throughput screening (HTS) system for exploration of novel JHSIs using a Bombyx mori cell line (BmN_JF&AR cells) and carried out a large-scale screening in this cell line using a chemical library. The four-step HTS yielded 69 compounds as candidate JHSIs. Topical application of JHSI48 to B. mori larvae caused precocious metamorphosis. In ex vivo culture of the epidermis, JHSI48 suppressed the expression of the Krüppel homolog 1 gene, which is directly activated by the JH-liganded receptor. Moreover, JHSI48 caused a parallel rightward shift in the JH response curve, suggesting that JHSI48 possesses a competitive antagonist-like activity. Thus, large-scale HTS using chemical libraries may have applications in the development of future insecticides targeting the JH signaling pathway.


Allometric Equations for Estimating Biomass and Carbon Stocks in Afforested Open Woodlands with Black Spruce and Jack Pine, in the Eastern Canadian Boreal Forest

Forests ◽

10.3390/f12010059 ◽

2021 ◽

Vol 12 (1) ◽

pp. 59

Author(s):

Olivier Fradette ◽

Charles Marty ◽

Pascal Tremblay ◽

Daniel Lord ◽

Jean-François Boucher

Keyword(s):

Large Scale ◽

Ad Hoc ◽

Black Spruce ◽

Tree Height ◽

Jack Pine ◽

Carbon Stocks ◽

Allometric Equations ◽

C Stocks ◽

The Difference ◽

Canadian Boreal Forest

Allometric equations use easily measurable biometric variables to determine the aboveground and belowground biomasses of trees. Equations produced for estimating the biomass within Canadian forests at a large scale have not yet been validated for eastern Canadian boreal open woodlands (OWs), where trees experience particular environmental conditions. In this study, we harvested 167 trees from seven boreal OWs in Quebec, Canada for biomass and allometric measurements. These data show that Canadian national equations accurately predict the whole aboveground biomass for both black spruce and jack pine trees, but underestimate branch biomass, possibly owing to a particular tree morphology in OWs relative to closed-canopy stands. We therefore developed ad hoc allometric equations based on three power models including diameter at breast height (DBH) alone or in combination with tree height (H) as allometric variables. Our results show that although the inclusion of H in the model yields better fits for most tree compartments in both species, the difference is minor and does not markedly affect biomass C stocks at the stand level. Using these newly developed equations, we found that carbon stocks in afforested OWs varied markedly among sites owing to differences in tree growth and species. Nine years after afforestation, jack pine plantations had accumulated about five times more carbon than black spruce plantations (0.14 vs. 0.80 t C·ha−1), highlighting the much larger potential of jack pine for OW afforestation projects in this environment.
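
A power model of the kind mentioned above can be sketched as follows. The abstract does not give the three model forms or any fitted coefficients, so this example assumes the common single-variable form biomass = a · DBH^b and fits it by ordinary least squares on log-transformed values. The data are synthetic, generated from made-up parameters, and the function names are hypothetical.

```python
import math


def fit_power_law(dbh, biomass):
    """Fit biomass = a * dbh**b by least squares on ln(biomass) vs ln(dbh)."""
    xs = [math.log(d) for d in dbh]
    ys = [math.log(m) for m in biomass]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log space is ln(a)
    return a, b


def predict_biomass(a, b, dbh):
    """Evaluate the fitted power model for one diameter."""
    return a * dbh ** b


# Synthetic, noise-free data from made-up parameters (not values from the study).
true_a, true_b = 0.1, 2.4
dbh_cm = [2.0, 4.0, 6.0, 8.0, 10.0]
biomass_kg = [true_a * d ** true_b for d in dbh_cm]

a, b = fit_power_law(dbh_cm, biomass_kg)
print(round(a, 6), round(b, 6))
```

With real, noisy measurements, fitting in log space introduces a back-transformation bias that this sketch does not correct. A model that also includes H would add a second predictor, which is beyond this single-variable illustration.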
