Predicting functional long non-coding RNAs validated by low throughput experiments

2019 ◽  
Author(s):  
Bailing Zhou ◽  
Yuedong Yang ◽  
Jian Zhan ◽  
Xianghua Dou ◽  
Jihua Wang ◽  
...  

Abstract: High-throughput techniques have uncovered hundreds and thousands of long non-coding RNAs (lncRNAs). Among them, only a small fraction have functions experimentally validated by low-throughput methods (EVlncRNAs). What fraction of lncRNAs from high-throughput experiments (HTlncRNAs) is truly functional is an active subject of debate. Here, we developed the first method to distinguish EVlncRNAs from HTlncRNAs and mRNAs by using Support Vector Machines and found that EVlncRNAs can be well separated from HTlncRNAs and mRNAs, with a Matthews correlation coefficient of 0.6, a sensitivity of 64%, and a precision of 81% on the independent human test set. The most discriminative features are related to sequence conservation at the RNA level (for separating from HTlncRNAs) and the protein level (for separating from mRNAs). The method is found to be robust, as the human-RNA-trained model is applicable to independent mouse RNAs with similar accuracy and, to a lesser extent, to plant RNAs. The method can recover newly discovered EVlncRNAs with high sensitivity. Its application to 2000 randomly selected human HTlncRNAs indicates that a large number of functional lncRNAs are waiting to be validated. The method is expected to speed up discovery and reduce its cost by prioritizing potentially functional lncRNAs prior to experimental validation. EVlncRNA-pred is available as a web server at http://biophy.dzu.edu.cn/lncrnapred/index.html. All datasets used in this study can be obtained from the same website.
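The three figures quoted in this abstract (Matthews correlation coefficient, sensitivity, precision) all derive from a binary confusion matrix. The sketch below shows how they are computed. The counts are hypothetical, chosen only to illustrate values of a similar magnitude; they are not the paper's test-set counts, which the abstract does not report, and the function name is likewise illustrative.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, precision, mcc) from confusion-matrix counts.

    Assumes at least one actual positive and one predicted positive,
    so the two ratios below are defined.
    """
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    precision = tp / (tp + fp)     # fraction of positive calls that are correct
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, precision, mcc

# Hypothetical counts (NOT taken from the paper).
sens, prec, mcc = classification_metrics(tp=64, fp=15, tn=185, fn=36)
print(f"sensitivity={sens:.2f} precision={prec:.2f} MCC={mcc:.2f}")
```

Unlike sensitivity and precision, the MCC uses all four cells of the matrix, which is why it is often reported for imbalanced test sets.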

2005 ◽  
Vol 13 (03) ◽  
pp. 287-298 ◽  
Author(s):  
JUN CAI ◽  
YING HUANG ◽  
LIANG JI ◽  
YANDA LI

In post-genomic biology, researchers in the field of proteomics focus their attention on the networks of protein interactions that control the lives of cells and organisms. Protein-protein interactions play a useful role in dynamic cellular machinery. In this paper, we developed a method to infer protein-protein interactions based on support vector machines (SVMs). For a given pair of proteins, a new strategy of calculating the cross-correlation function of their mRNA expression profiles was used to encode the SVM input vectors. We compared the performance with that of other methods for inferring protein-protein interactions. Five-fold cross-validation suggested that our SVM model achieved good prediction performance. This shows that expression profiles at the transcription level can be used to distinguish physical or functional interactions of proteins as well as sequence contents. Lastly, we applied our SVM classifier to evaluate the data quality of interaction data sets from four high-throughput experiments. The results show that high-throughput experiments sacrifice some accuracy in determining interactions because of the limitations of experimental technologies.
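As an illustration of the encoding idea described above, the following minimal sketch turns two expression profiles into a fixed-length vector of normalized cross-correlation values over a range of lags. The abstract does not give the exact formula, normalization, or lag range, so the function name `cross_correlation`, the `max_lag` parameter, and the z-score normalization are all assumptions made for this example.

```python
import math

def cross_correlation(x, y, max_lag):
    """Normalized cross-correlation of two equal-length profiles.

    Returns one value per lag in -max_lag..max_lag (an assumed encoding,
    not necessarily the one used in the paper).
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    n = len(x)

    def standardize(values):
        mean = sum(values) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        if std == 0:
            raise ValueError("constant profile has no defined correlation")
        return [(v - mean) / std for v in values]

    xs, ys = standardize(x), standardize(y)
    features = []
    for lag in range(-max_lag, max_lag + 1):
        # Sum only over time points where the shifted index is in range.
        total = sum(xs[t] * ys[t + lag] for t in range(n) if 0 <= t + lag < n)
        features.append(total / n)
    return features

# Hypothetical expression profiles; gene_b is gene_a delayed by one time point.
gene_a = [0.1, 0.5, 0.9, 0.4, 0.2, 0.1]
gene_b = [0.2, 0.1, 0.5, 0.9, 0.4, 0.2]
print(cross_correlation(gene_a, gene_b, max_lag=2))
```

A vector of this kind could then be passed to any SVM implementation as the feature representation of the protein pair.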


2019 ◽  
Vol 7 (47) ◽  
pp. 26785-26790
Author(s):  
Yungchieh Lai ◽  
Ryan J. R. Jones ◽  
Yu Wang ◽  
Lan Zhou ◽  
Matthias H. Richter ◽  
...  

The uniqueness of Cu for CO2 electroreduction is accompanied by high sensitivity to contaminants, but alloys can tune product selectivity.


2021 ◽  
Vol 11 ◽  
Author(s):  
Cecilia Zampedri ◽  
Williams Arony Martínez-Flores ◽  
Jorge Melendez-Zajgla

Breast cancer represents a great challenge since it is the leading cause of death by cancer in women worldwide. LncRNAs are a newly described class of non-coding RNAs that participate in cancer progression. Their use as cancer markers and possible therapeutic targets has recently gained strength. Animal xenotransplants allow for in vivo monitoring of disease development, molecular elucidation of pathogenesis and the design of new therapeutic strategies. Nevertheless, the cost and complexities of mouse husbandry make medium- to high-throughput assays difficult. Zebrafish (Danio rerio) represent a novel model for these assays, given the ease with which xenotransplantation trials can be performed and the economic and experimental advantages they offer. In this review we propose the use of xenotransplants in zebrafish to study the role of breast cancer lncRNAs using low- to medium-high-throughput assays.


2019 ◽  
Vol 13 ◽  
pp. 117793221985635 ◽  
Author(s):  
Nicholas Mills ◽  
Ethan M Bensman ◽  
William L Poehlman ◽  
Walter B Ligon ◽  
F Alex Feltus

Motivation: As the size of high-throughput DNA sequence datasets continues to grow, the cost of transferring and storing the datasets may prevent their processing in all but the largest data centers or commercial cloud providers. To lower this cost, it should be possible to process only a subset of the original data while still preserving the biological information of interest. Results: Using 4 high-throughput DNA sequence datasets of differing sequencing depth from 2 species as use cases, we demonstrate the effect of processing partial datasets on the number of detected RNA transcripts using an RNA-Seq workflow. We used transcript detection to decide on a cutoff point. We then physically transferred the minimal partial dataset and compared with the transfer of the full dataset, which showed a reduction of approximately 25% in the total transfer time. These results suggest that as sequencing datasets get larger, one way to speed up analysis is to simply transfer the minimal amount of data that still sufficiently detects biological signal. Availability: All results were generated using public datasets from NCBI and publicly available open source software.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mathias Fink ◽  
Monika Cserjan-Puschmann ◽  
Daniela Reinisch ◽  
Gerald Striedner

Abstract: Tremendous advancements in cell and protein engineering methodologies and bioinformatics have led to a vast increase in bacterial production clones and recombinant protein variants to be screened and evaluated. Consequently, an urgent need exists for efficient high-throughput (HTP) screening approaches to improve the efficiency of early process development as a basis to speed up all subsequent steps in the course of process design and engineering. In this study, we selected the BioLector micro-bioreactor (µ-bioreactor) system as an HTP cultivation platform to screen E. coli expression clones producing representative protein candidates for biopharmaceutical applications. We evaluated the extent to which generated clones and condition screening results were transferable and comparable to results from fully controlled bioreactor systems operated in fed-batch mode at moderate or high cell densities. Direct comparison of 22 different production clones showed great transferability. We observed the same growth and expression characteristics, and identical clone rankings except for one host-Fab-leader combination. This outcome demonstrates the explanatory power of HTP µ-bioreactor data and the suitability of this platform as a screening tool in upstream development of microbial systems. Fast, reliable, and transferable screening data significantly reduce experiments in fully controlled bioreactor systems and accelerate process development at lower cost.


Viruses ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 749
Author(s):  
Julia Butt ◽  
Rajagopal Murugan ◽  
Theresa Hippchen ◽  
Sylvia Olberg ◽  
Monique van Straaten ◽  
...  

The emerging SARS-CoV-2 pandemic entails an urgent need for specific and sensitive high-throughput serological assays to assess SARS-CoV-2 epidemiology. We, therefore, aimed at developing a fluorescent-bead based SARS-CoV-2 multiplex serology assay for detection of antibody responses to the SARS-CoV-2 proteome. Proteins of the SARS-CoV-2 proteome and protein N of SARS-CoV-1 and common cold Coronaviruses (ccCoVs) were recombinantly expressed in E. coli or HEK293 cells. Assay performance was assessed in a COVID-19 case cohort (n = 48 hospitalized patients from Heidelberg) as well as n = 85 age- and sex-matched pre-pandemic controls from the ESTHER study. Assay validation included comparison with home-made immunofluorescence and commercial enzyme-linked immunosorbent (ELISA) assays. A sensitivity of 100% (95% CI: 86–100%) was achieved in COVID-19 patients 14 days post symptom onset with dual sero-positivity to SARS-CoV-2 N and the receptor-binding domain of the spike protein. The specificity obtained with this algorithm was 100% (95% CI: 96–100%). Antibody responses to ccCoVs N were abundantly high and did not correlate with those to SARS-CoV-2 N. Inclusion of additional SARS-CoV-2 proteins as well as separate assessment of immunoglobulin (Ig) classes M, A, and G allowed for explorative analyses regarding disease progression and course of antibody response. This newly developed SARS-CoV-2 multiplex serology assay achieved high sensitivity and specificity to determine SARS-CoV-2 sero-positivity. Its high throughput ability allows epidemiologic SARS-CoV-2 research in large population-based studies. Inclusion of additional pathogens into the panel as well as separate assessment of Ig isotypes will furthermore allow addressing research questions beyond SARS-CoV-2 sero-prevalence.
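The reported specificity interval can be checked with a short calculation. The abstract does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, for which the lower bound has a closed form when every one of the n samples is classified correctly. The function name is illustrative.

```python
def exact_lower_bound_all_successes(n, confidence=0.95):
    """Clopper-Pearson lower bound for a proportion when all n trials succeed.

    In this special case the bound reduces to (alpha / 2) ** (1 / n);
    the upper bound is 1.
    """
    alpha = 1 - confidence
    return (alpha / 2) ** (1 / n)

# n = 85 pre-pandemic controls, all classified as negative (specificity 100%).
lower = exact_lower_bound_all_successes(85)
print(f"95% CI lower bound: {lower:.1%}")
```

Under this assumption the lower bound for n = 85 rounds to the 96% quoted for specificity. The number of patients sampled at 14 days post symptom onset is not stated in the abstract, so the sensitivity interval is not recomputed here.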


Small ◽  
2021 ◽  
Vol 17 (14) ◽  
pp. 2007302
Author(s):  
Mohan Lin ◽  
Yingke Zhou ◽  
Lingzheng Bu ◽  
Chuang Bai ◽  
Muhammad Tariq ◽  
...  

2021 ◽  
Vol 2 (3) ◽  
pp. 100606
Author(s):  
Giuseppina E. Grieco ◽  
Guido Sebastiani ◽  
Daniela Fignani ◽  
Noemi Brusco ◽  
Laura Nigi ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yan Helen Yan ◽  
Sherry X. Chen ◽  
Lauren Y. Cheng ◽  
Alyssa Y. Rodriguez ◽  
Rui Tang ◽  
...  

Abstract: Whole exome sequencing (WES) is used to identify mutations in a patient’s tumor DNA that are predictive of tumor behavior, including the likelihood of response or resistance to cancer therapy. WES has a mutation limit of detection (LoD) at variant allele frequencies (VAF) of 5%. Putative mutations called at ≤ 5% VAF are frequently due to sequencing errors; therefore, reporting these subclonal mutations incurs a risk of significant false positives. Here we performed ~1000× WES on fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue biopsy samples from a non-small cell lung cancer patient, and identified 226 putative mutations at between 0.5 and 5% VAF. Each variant was then tested using NuProbe NGSure to confirm the original WES calls. NGSure utilizes Blocker Displacement Amplification to first enrich the allelic fraction of the mutation and then uses Sanger sequencing to determine mutation identity. Results showed that 52% (117) of the 226 putative variants were disconfirmed, and a further 2% (5) were found to be misidentified in WES. Among the 66 cancer-related variants, the disconfirmed rate was 82% (54/66). These data demonstrate that Blocker Displacement Amplification allelic enrichment coupled with Sanger sequencing can be used to confirm putative mutations at ≤ 5% VAF. By implementing this method, next-generation sequencing can reliably report low-level variants at high sensitivity, without the cost of high sequencing depth.
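The percentages in this abstract follow directly from the stated counts. The short check below uses only numbers given in the text; the variable names and the `percent` helper are illustrative.

```python
# Counts as stated in the abstract.
total_putative = 226        # putative mutations at 0.5-5% VAF
disconfirmed = 117          # not confirmed by the follow-up test
misidentified = 5           # mutation identity differed from the WES call
cancer_related = 66         # cancer-related subset
cancer_disconfirmed = 54    # disconfirmed within that subset

def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)

print(percent(disconfirmed, total_putative))        # overall disconfirmed rate
print(percent(misidentified, total_putative))       # misidentified rate
print(percent(cancer_disconfirmed, cancer_related)) # cancer-related disconfirmed rate
```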


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Ji-Yong An ◽  
Fan-Rong Meng ◽  
Zi-Ji Yan

Background: Prediction of novel drug–target interactions (DTIs) plays an important role in discovering new drug candidates and finding new proteins to target. Because experimental methods are time-consuming and expensive, developing efficient computational approaches that accurately predict potential associations between drugs and targets is a challenging task. Results: In this paper, we propose a novel computational method called WELM-SURF, based on drug fingerprints and protein evolutionary information, for identifying DTIs. More specifically, to exploit protein sequence features, a Position Specific Scoring Matrix (PSSM) is used to capture protein evolutionary information, and Speeded-Up Robust Features (SURF) is employed to extract key sequence features from the PSSM. For drug fingerprints, molecular substructure fingerprints of the chemical structure are used to represent each drug as a feature vector. Because the Weighted Extreme Learning Machine (WELM) has a short training time, good generalization ability and, most importantly, the ability to classify efficiently by optimizing the loss function of the weight matrix, the WELM classifier is used to carry out classification on the extracted features for predicting DTIs. The performance of the WELM-SURF model was evaluated on enzyme, ion channel, GPCR and nuclear receptor datasets using fivefold cross-validation. WELM-SURF obtained average accuracies of 93.54, 90.58, 85.43 and 77.45% on the enzyme, ion channel, GPCR and nuclear receptor datasets, respectively. We also compared its performance with that of the Extreme Learning Machine (ELM) and the state-of-the-art Support Vector Machine (SVM) on the enzyme and ion channel datasets, and with other existing methods on all four datasets. Comparison of the experimental results shows that the performance of WELM-SURF is significantly better than that of ELM, SVM and other previous methods in the domain. Conclusion: The results demonstrate that the proposed WELM-SURF model is competent for predicting DTIs with high accuracy and robustness. WELM-SURF is anticipated to be a useful computational tool for a wide range of bioinformatics studies related to DTI prediction.

