unknown sequence
Recently Published Documents


Total documents: 33 (five years: 7)

H-index: 8 (five years: 0)

IUCrJ ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Grzegorz Chojnowski ◽  
Adam J. Simpkin ◽  
Diego A. Leonardo ◽  
Wolfram Seifert-Davila ◽  
Dan E. Vivas-Ruiz ◽  
...  

Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or appear as a contaminant. Regardless of the source of the problem, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. The method's application to characterize the crystal structure of an unknown protein purified from a snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures.


Viruses ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2181
Author(s):  
Pontus Öhlund ◽  
Juliette Hayer ◽  
Jenny C. Hesson ◽  
Anne-Lie Blomström

RNA interference (RNAi)-mediated antiviral immunity is believed to be the primary defense against viral infection in mosquitoes. The production of virus-specific small RNA has been demonstrated in mosquitoes and mosquito-derived cell lines for viruses in all of the major arbovirus families. However, many if not all mosquitoes are infected with a group of viruses known as insect-specific viruses (ISVs), and little is known about the mosquito immune response to this group of viruses. Therefore, in this study, we sequenced small RNA from an Aedes albopictus-derived cell line infected with either Lammi virus (LamV) or Hanko virus (HakV). These viruses belong to two distinct phylogenetic groups of insect-specific flaviviruses (ISFVs). The results revealed that both viruses elicited a strong virus-derived small interfering RNA (vsiRNA) response that increased over time and that targeted the whole viral genome, with a few predominant hotspots observed. Furthermore, only the LamV-infected cells produced virus-derived Piwi-like RNAs (vpiRNAs); however, they were mainly derived from the antisense genome and did not show the typical ping-pong signatures. HakV, which is more distantly related to the dual-host flaviviruses than LamV, may lack certain unknown sequence elements or structures required for vpiRNA production. Our findings increase the understanding of mosquito innate immunity and ISFVs’ effects on their host.


2021 ◽  
Author(s):  
Grzegorz Chojnowski ◽  
Adam J. Simpkin ◽  
Diego A. Leonardo ◽  
Wolfram Seifert-Davila ◽  
Dan E. Vivas-Ruiz ◽  
...  

Abstract: Although experimental protein structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or as a contaminant. Regardless of the source of the problem, the unknown protein always requires tedious characterization. Here we present an automated pipeline for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. We present the method’s application to characterize the crystal structure of an unknown protein purified from a snake venom. We also show that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures.


Bioanalysis ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 147-164
Author(s):  
John T Mehl ◽  
France Landry ◽  
Lorell Discenza ◽  
Bogdan Sleczka ◽  
Qihong Zhao ◽  
...  

Background: Surrogate monoclonal antibodies (mAbs) used in preclinical in vivo studies can be challenging to quantify due to lack of suitable immunoaffinity reagents or unavailability of the mAb protein sequence. Generic immunoaffinity reagents were evaluated to develop sensitive LC–MS/MS assays. Peptides of unknown sequence can be used for selective LC–MS quantification. Results: Anti-mouse IgG1 was found to be an effective immunoaffinity reagent, enabling quantification of mouse IgG1 mAbs in mouse serum. Selective peptides of unknown sequence were applied for multiplex LC–MS quantification of two rat mAbs co-dosed in mouse. Conclusion: Generic anti-mouse IgG subtype-specific antibodies can be used to improve assay sensitivity, and peptides of unknown sequence can be used to quantify surrogate mAbs when the mAb protein sequence is unavailable.


Author(s):  
Seyedeh-Samira Ashrafmansouri ◽  
Hossein Kamaladini ◽  
Fatemeh Haddadi ◽  
Marie Seidi

Background: Various polymerase chain reaction (PCR)-based methods have been applied to develop genome walking (GW) techniques. These methods, which rely on either restriction enzymes or primers, differ in how efficiently they identify unknown nucleotide sequences. The present study was conducted to design a new double-strand adaptor, using the MAP30 gene sequence of the Momordica charantia plant as a model, to improve genome walking with conventional PCR. Results: The adaptor was designed with multiple restriction sites for the Hind III, BamH I, EcoR I, and Bgl II enzymes, none of which occurs in the known sequence of the MAP30 gene. In addition, no modification was required to add phosphate, amine, or other groups to the adaptor, since restriction enzyme digestion of the double-strand adaptor provided the 5′ phosphate group. Here, the plant genomic DNA was digested with restriction enzymes to provide its 5′ phosphate group and was then ligated to the digested adaptor, which also carried a 5′ phosphate group. Conclusion: PCR was performed to amplify the unknown sequence using a MAP30 gene-specific primer and an adaptor primer. The results confirmed that the technique can successfully identify the sequence. Consequently, the newly designed adaptor reduced the time and cost of the method compared with conventional genome walking; in addition, the cloning and bacterial culturing steps could be eliminated.
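The design constraint in the abstract above (adaptor restriction sites that do not occur in the known gene sequence) can be checked with a simple string scan. The sketch below is illustrative only: the recognition sequences are the standard ones for the four named enzymes, but `known_fragment` and `adaptor_like` are hypothetical sequences invented for this example, not the real MAP30 gene or the authors' adaptor. Because all four sites are palindromic, scanning one strand is sufficient.

```python
# Recognition sequences (5'->3') of the four enzymes named in the abstract.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


def is_uncut(seq):
    """True if none of the four enzymes has a recognition site in seq."""
    return all(not positions for positions in find_sites(seq).values())


# Hypothetical sequences for illustration only (assumptions, not real data).
known_fragment = "ATGGTGAAATGCTTACTACTTTCTTTTTTAATTATCGCC"
adaptor_like = "CTAAGCTTGGATCCGAATTCAGATCTGA"

print(is_uncut(known_fragment))
print(find_sites(adaptor_like))
```

A known sequence that passes `is_uncut` would be left intact by the digestion step, so the gene-specific primer site survives while the adaptor and genomic DNA are cut.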


2020 ◽  
Author(s):  
Jnanendra Prasad Sarkar ◽  
Indrajit Saha ◽  
Arijit Seal ◽  
Debasree Maity

Abstract: Virus classification has been a subject of concern in virology and epidemiology for decades. Moreover, the detection of highly divergent or as yet unknown viruses remains a major challenge despite its clinical importance. In this situation, the outbreak of the novel coronavirus (SARS-CoV-2) and its susceptibility in different epidemic conditions around the world clearly suggest that the virus is mutating to create divergent variants, making the task of virus prediction more challenging. Besides the novel coronavirus, two more coronaviruses, MERS and SARS-CoV-1, are already present. Therefore, machine learning techniques are needed to predict coronaviruses by considering their divergent genetic functional characteristics. Thus, we propose a machine learning based coronavirus prediction technique, called COVID-Predictor, in which 1000s of RNA sequences of SARS-CoV-1, MERS, SARS-CoV-2 and other viruses are used to train a Naïve Bayes classifier so that it can predict any unknown sequence of these viruses. To develop COVID-Predictor, the feature vector is constructed from sequence motifs generated by k-mer and n-gram techniques. The model has been validated using 10-fold cross-validation in comparison with other classification techniques. The results show the superiority of our predictor, which achieves an average accuracy of 97% on an unseen validation set. The same pre-trained model has been used to design a web-based application where RNA sequences of unknown viruses can be uploaded to predict the class of coronavirus. The predictor, code and datasets are available here: http://www.nitttrkol.ac.in/indrajit/projects/COVID-Predictor/
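The general idea of classifying a sequence from k-mer counts with a Naïve Bayes model can be sketched in a few lines. This is not the COVID-Predictor code: the class name `KmerNaiveBayes`, the choice of k = 3, the Laplace smoothing value, and the toy "AT-rich"/"GC-rich" training strings are all assumptions made for illustration, and no real viral data is used.

```python
import math
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class KmerNaiveBayes:
    """Multinomial Naive Bayes over k-mer counts (illustrative sketch)."""

    def __init__(self, k=3, alpha=1.0):
        self.k = k          # k-mer length (assumed value)
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, sequences, labels):
        self.class_counts = Counter(labels)
        self.kmer_totals = {c: Counter() for c in self.class_counts}
        for seq, label in zip(sequences, labels):
            self.kmer_totals[label].update(kmer_counts(seq, self.k))
        self.vocab = set().union(*self.kmer_totals.values())
        return self

    def predict(self, seq):
        n_train = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        query = kmer_counts(seq, self.k)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.kmer_totals[label].values())
            score = math.log(count / n_train)  # log prior
            for kmer, m in query.items():
                if kmer not in self.vocab:
                    continue  # ignore k-mers never seen in training
                p = (self.kmer_totals[label][kmer] + self.alpha) / (
                    total + self.alpha * vocab_size)
                score += m * math.log(p)  # log likelihood of this k-mer
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (hypothetical, for illustration only).
train_seqs = ["ATATATATAT", "AATTAATTAA", "GCGCGCGCGC", "GGCCGGCCGG"]
train_labels = ["AT-rich", "AT-rich", "GC-rich", "GC-rich"]
model = KmerNaiveBayes(k=3).fit(train_seqs, train_labels)

print(model.predict("ATTATATA"))
print(model.predict("GCCGCGG"))
```

In practice a model like this would be evaluated with cross-validation, as the abstract describes, rather than on a handful of hand-picked strings.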


Author(s):  
Zhenzhen Yang ◽  
Lilan Zhang ◽  
Xuejing Yu ◽  
Shan Wu ◽  
Yong Yang ◽  
...  

Moenomycin-type antibiotics are phosphoglycolipids that are notable for their unique modes of action and have proven to be useful in animal nutrition. The gene clusters tchm from Actinoplanes teichomyceticus and moe from Streptomyces are among a limited number of known moenomycin-biosynthetic pathways. Most genes in tchm have counterparts in the moe cluster, except for tchmy and tchmz, the functions of which remain unknown. Sequence analysis indicates that TchmY belongs to the isoprenoid enzyme C2-like superfamily and may serve as a prenylcyclase. The enzyme was proposed to be involved in terminal cyclization of the moenocinyl chain in teichomycin, leading to the diumycinol chain of moenomycin isomers. Here, recombinant TchmY protein was expressed in Escherichia coli and its crystal structure was solved by SIRAS. Structural analysis and comparison with other prenylcyclases were performed. The overall fold of TchmY consists of an (α/α)6-barrel, and a potential substrate-binding pocket is found in the central chamber. These results should provide important information regarding the biosynthetic basis of moenomycin antibiotics.


2018 ◽  
Author(s):  
Nicholas Bogard ◽  
Johannes Linder ◽  
Alexander B. Rosenberg ◽  
Georg Seelig

Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over three million APA reporters, built by inserting random sequence into twelve distinct 3’UTR contexts. Predictions are highly accurate across both synthetic and genomic contexts; when tasked with inferring APA in human 3’UTRs, APARENT outperforms models trained exclusively on endogenous data. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of cleavage site selection, and integrates these features into a comprehensive, interpretable cis-regulatory code. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.


2017 ◽  
Vol 97 (1) ◽  
pp. 5-13 ◽  
Author(s):  
H. Saltaji ◽  
S. Armijo-Olivo ◽  
G.G. Cummings ◽  
M. Amin ◽  
B.R. da Costa ◽  
...  

Emerging evidence suggests that design flaws of randomized controlled trials can result in over- or underestimation of the treatment effect size (ES). The objective of this study was to examine associations between treatment ES estimates and adequacy of sequence generation, allocation concealment, and baseline comparability among a sample of oral health randomized controlled trials. For our analysis, we selected all meta-analyses that included a minimum of 5 oral health randomized controlled trials and used continuous outcomes. We extracted data, in duplicate, related to items of selection bias (sequence generation, allocation concealment, and baseline comparability) in the Cochrane Risk of Bias tool. Using a 2-level meta-meta-analytic approach with a random effects model to allow for intra- and inter-meta-analysis heterogeneity, we quantified the impact of selection bias on the magnitude of ES estimates. We identified 64 meta-analyses, including 540 randomized controlled trials analyzing 137,957 patients. Sequence generation was judged to be adequate (at low risk of bias) in 32% (n = 173) of trials, and baseline comparability was judged to be adequate in 77.8% of trials. Allocation concealment was unclear in the majority of trials (n = 458, 84.8%). We identified significantly larger treatment ES estimates in trials that had inadequate/unknown sequence generation (difference in ES = 0.13; 95% CI: 0.01 to 0.25) and inadequate/unknown allocation concealment (difference in ES = 0.15; 95% CI: 0.02 to 0.27). In contrast, baseline imbalance (difference in ES = 0.01; 95% CI: –0.09 to 0.12) was not associated with inflated or underestimated ES. In conclusion, treatment ES estimates were 0.13 and 0.15 larger in trials with inadequate/unknown sequence generation and inadequate/unknown allocation concealment, respectively. Therefore, authors of systematic reviews using oral health randomized controlled trials should perform sensitivity analyses based on the adequacy of sequence generation and allocation concealment.


2017 ◽  
Author(s):  
Jinny X. Zhang ◽  
John Z. Fang ◽  
Wei Duan ◽  
Lucia R. Wu ◽  
Angela W. Zhang ◽  
...  

Hybridization is a key molecular process in biology and biotechnology, but to date there is no predictive model for accurately determining hybridization rate constants based on sequence information. To approach this problem systematically, we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (subsequences of the CYCS and VEGF genes) at temperatures ranging from 28 °C to 55 °C. Next, we rationally designed 38 features computable based on sequence, each feature individually correlated with hybridization kinetics. These features are used in our implementation of a weighted neighbor voting (WNV) algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on its similarity to reactions with known rate constants (a.k.a. labeled instances). Automated feature selection and weighting optimization resulted in a final 6-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 2 with ≈74% accuracy and within a factor of 3 with ≈92% accuracy, based on leave-one-out cross-validation. Predictive understanding of hybridization kinetics allows more efficient design of nucleic acid probes, for example in allowing sparse hybrid-capture panels to more quickly and economically enrich desired regions from genomic DNA.
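The neighbor-voting idea in the abstract above can be illustrated with a small sketch in which each labeled reaction votes for its own log10 rate constant, with a vote that shrinks as its feature-space distance from the query grows. The abstract does not give the authors' weighting formula, so the inverse-distance vote, the function name `wnv_predict`, the two-feature vectors, and the numeric values below are all assumptions chosen for illustration.

```python
import math


def wnv_predict(query, labeled, feature_weights, eps=1e-9):
    """Predict log10(rate constant) for `query` by weighted neighbor voting.

    query:           feature vector of the unknown reaction
    labeled:         list of (feature_vector, log10_rate_constant) pairs
    feature_weights: per-feature weights used in the distance (assumed form)
    """
    weighted_sum = 0.0
    vote_total = 0.0
    for features, log_k in labeled:
        # Weighted Euclidean distance between the query and this instance.
        distance = math.sqrt(sum(
            w * (q - f) ** 2
            for w, q, f in zip(feature_weights, query, features)))
        vote = 1.0 / (distance + eps)  # closer neighbors vote more strongly
        weighted_sum += vote * log_k
        vote_total += vote
    return weighted_sum / vote_total


def within_factor(predicted_log_k, true_log_k, factor):
    """True if the predicted rate constant is within `factor` of the true one."""
    return abs(predicted_log_k - true_log_k) <= math.log10(factor)


# Hypothetical labeled instances: (features, log10 of rate constant).
labeled = [((0.0, 0.0), 5.0), ((1.0, 1.0), 7.0)]
weights = (1.0, 1.0)

midpoint_prediction = wnv_predict((0.5, 0.5), labeled, weights)
print(midpoint_prediction)
print(within_factor(midpoint_prediction, 6.2, 2))
```

The `within_factor` helper mirrors how the abstract reports accuracy: a prediction counts as correct when the predicted and measured rate constants differ by no more than a factor of 2 (or 3).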

