coding sequences
Recently Published Documents


Total documents: 1546 (five years: 367)
H-index: 84 (five years: 7)

2022 ◽  
pp. 1-4
Author(s):  
Wanling Yang ◽  
Yuanwei Fan ◽  
Yong Chen ◽  
Gumu Ding ◽  
Hu Liu ◽  
...  

Dongxiang wild rice (Oryza rufipogon Griff.) (DXWR) is the northernmost wild rice population found in the world. Like other populations of O. rufipogon, DXWR contains a large number of agronomically valuable genes, which makes it a natural gene pool for rice breeding. Molecular markers, especially simple sequence repeat (SSR) markers, play important roles in plant breeding. Although a large number of SSR markers have been developed, most of them are derived from genomic coding sequences and rarely from non-coding sequences. Meanwhile, long non-coding RNAs (lncRNAs), which are derived from the transcription of non-coding sequences, play vital roles in plant growth, development and stress responses. In our previous study, we obtained 1655 lncRNA transcripts from DXWR using strand-specific RNA sequencing. In this study, 1878 SSR loci were detected in the lncRNA sequences of DXWR, and 1258 lncRNA-derived SSR markers were developed on a genome-wide scale. To verify the validity and applicability of these markers, 72 primer pairs were randomly selected and tested on 44 rice accessions. The results showed that 42 (58.33%) primer pairs were highly polymorphic among these rice materials; the polymorphism information content values ranged from 0.04 to 0.87 with an average of 0.50; the genetic diversity index of SSR loci varied from 0.04 to 0.88 with an average of 0.56; and the number of alleles per marker ranged from 2 to 11 with an average of 4.36. We therefore conclude that these lncRNA-derived SSR markers are a useful resource for future basic and applied research.
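The polymorphism statistics in the abstract above (polymorphism information content, genetic diversity index, alleles per marker) are usually computed from per-locus allele frequencies. The abstract does not say which formulas or software were used, so the following is only a minimal sketch that assumes the commonly cited definitions: Nei's gene diversity and the Botstein et al. (1980) PIC. The function names and the allele calls are hypothetical illustrations, not data from the study.

```python
from itertools import combinations


def allele_frequencies(calls):
    """Return allele frequencies from a list of allele calls at one locus."""
    counts = {}
    for allele in calls:
        counts[allele] = counts.get(allele, 0) + 1
    total = len(calls)
    return [count / total for count in counts.values()]


def gene_diversity(freqs):
    """Nei's gene diversity (assumed definition): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC (assumed Botstein et al. 1980 definition):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele sizes (bp) scored at one SSR locus in four accessions.
calls = ["150", "150", "154", "158"]
freqs = allele_frequencies(calls)
print(len(freqs), gene_diversity(freqs), pic(freqs))
```

Under these definitions PIC is never larger than gene diversity for the same locus, which is consistent with the averages reported above (0.50 versus 0.56).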


Author(s):  
Bart Verwaaijen ◽  
Özgülen Cevahir ◽  
Fabian Hitz ◽  
Jacqueline Römmich ◽  
Donat Wulf

Here, we report the complete genome sequence of Pseudomonas sp. strain MM213 of the Pseudomonas mandelii group, which was isolated from a brookside in Bielefeld, Germany. The genome size is 6,746,355 bp, with a GC content of 59.4% and 6,145 predicted protein-coding sequences.
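Summary figures such as genome size and GC content can be recomputed directly from an assembled sequence. The sketch below is an illustration only: the function name and the toy sequence are hypothetical, and it assumes that ambiguous bases (such as N) are excluded from the denominator, a convention the announcement above does not specify.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy sequence, not taken from the MM213 genome.
toy = "ATGCGCNNTA"
print(len(toy), round(gc_content(toy), 1))
```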


2022 ◽  
Author(s):  
Anna Grandchamp ◽  
Katrin Berk ◽  
Elias Dohmen ◽  
Erich Bornberg-Bauer

De novo genes are novel genes that emerge from non-coding DNA. Until now, little has been known about the properties of de novo genes and how these correlate with their age and mechanisms of emergence. In this study, we investigate four properties (introns, upstream regulatory motifs, 5′ UTRs and protein domains) in 23135 human proto-genes. We found that proto-genes contain introns, whose number and position correlate with the genomic position of proto-gene emergence. The origin of these introns is debated, as our results suggest that 41% of proto-genes might have captured existing introns, and that 13.7% of them do not splice the ORF. We show that proto-genes which emerged via overprinting tend to be more enriched in core promoter motifs, while intergenic and intronic ones are more enriched in enhancers, even though the TATA motif is the most represented motif upstream of these genes. The 5′ UTRs of intergenic and intronic proto-genes have a lower potential to stabilise mRNA structures than those of exonic proto-genes and established human genes. Finally, we confirm that proto-genes gain new putative domains with age. Overall, we find that regulatory motifs inducing transcription and translation of previously non-coding sequences may facilitate proto-gene emergence. Our paper demonstrates that introns, 5′ UTRs and domains have specific properties in proto-genes. We also show the importance of studying proto-genes in relation to their genomic position, as it strongly impacts these properties.


Author(s):  
Yanna Reis Praça ◽  
Paula Beatriz Santiago ◽  
Sébastien Charneau ◽  
Samuel Coelho Mandacaru ◽  
Izabela Marques Dourado Bastos ◽  
...  

Triatomines have evolved salivary glands that produce versatile molecules with various biological functions, including those that mediate their interactions with the hemostatic and immunological systems of vertebrate hosts. Here, using high-throughput transcriptomics and proteomics, we report the first sialome study on the synanthropic triatomine Triatoma sordida. A total of 57,645,372 reads were assembled into 26,670 coding sequences (CDS), of which 16,683 were successfully annotated. The sialotranscriptomic profile shows lipocalin as the most abundant protein family among putative secreted transcripts. Trialysins and Kazal-type protease inhibitors have high transcript levels, followed by ubiquitous protein families and enzyme classes. Furthermore, we identified 132 proteins in the soluble extract of T. sordida salivary glands by LC-MS/MS. Lipocalins, Hemiptera-specific families, CRISP/Antigen-5 and Kazal-type protease inhibitors were identified. Our study provides a comprehensive description of the transcript and protein compositions of the salivary glands of T. sordida. It significantly extends the information in the Triatominae sialome databanks reported so far, improving the understanding of the vector’s biology, its hematophagous behaviour, and the evolution of the Triatominae subfamily.


2022 ◽  
Author(s):  
Kensuke Yamaguchi ◽  
Kazuyoshi Ishigaki ◽  
Akari Suzuki ◽  
Yumi Tsuchida ◽  
Haruka Tsuchiya ◽  
...  

Splicing QTLs (sQTLs) are one of the major causal mechanisms in GWAS loci, but their role in disease pathogenesis is poorly understood. One reason is the huge complexity of alternative splicing events, which produce many unknown isoforms. Here, we propose two novel approaches to address this complexity, namely integration and selection, both of which focus on the protein structure of isoforms. First, we integrated isoforms with the same coding sequence (CDS) and identified 369–601 integrated-isoform ratio QTLs (i2-rQTLs), which altered protein structure, in six immune subsets. Second, we selected CDS-incomplete isoforms annotated in GENCODE and identified 175–337 isoform-ratio QTLs (i-rQTLs). Through comprehensive long-read capture RNA-seq of these incomplete isoforms, we revealed 29 full-length isoforms with novel CDSs associated with GWAS traits. Furthermore, we show that disease-causal sQTL genes can be identified by evaluating their trans-eQTL effects. Our approaches highlight the understudied role of protein-altering sQTLs and are broadly applicable to other tissues and diseases.


Viruses ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 81
Author(s):  
Hua Feng ◽  
Joaquim Segalés ◽  
Fangyu Wang ◽  
Qianyue Jin ◽  
Aiping Wang ◽  
...  

Porcine circoviruses (PCVs) are distributed in swine herds worldwide and represent a threat to the health of domestic pigs and the profits of the swine industry. Currently, four PCV species (PCV-1, PCV-2, PCV-3 and PCV-4) have been identified in China. Considering the ubiquity of PCVs, the newly emerged PCV-4 and the large scale of swine breeding in China, an overall analysis of codon usage bias in Chinese PCV sequences was performed using the major protein-coding sequences (ORF1 and ORF2) to better understand the relationship of these viruses with their host. Genome nucleotide composition and relative synonymous codon usage (RSCU) analysis revealed an overrepresentation of AT pairs and the existence of a certain codon usage bias in all PCVs. However, the effective number of codons (ENC) values revealed that the bias was of low magnitude. Principal component analysis, ENC-plot, parity rule two analysis and correlation analysis suggested that natural selection and mutation pressure were both involved in shaping the codon usage patterns of PCVs. However, a neutrality plot revealed a stronger effect of natural selection than of mutation pressure on codon usage patterns. Codon adaptation index analysis also showed good host adaptation for all these viruses. Interestingly, the data suggest that PCV-4 might be better adapted to its host than the other PCVs. The present study provides insights into the codon usage patterns of PCVs based on ORF1 and ORF2, which further help in understanding the molecular evolution of these swine viruses.
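Relative synonymous codon usage (RSCU), mentioned in the abstract above, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The abstract does not describe the tooling used, so this is a minimal sketch under stated assumptions: it uses the standard genetic code, skips stop codons and single-codon amino acids (Met, Trp), and ignores codons containing ambiguous bases. The function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (an assumption; other tables exist).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids observed in an in-frame CDS."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, excluding stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for synonyms in families.values():
        total = sum(counts[c] for c in synonyms)
        if total == 0 or len(synonyms) == 1:
            continue  # amino acid absent, or no synonymous choice (Met, Trp)
        for codon in synonyms:
            # observed count / (family total / number of synonymous codons)
            values[codon] = counts[codon] * len(synonyms) / total
    return values


# Toy in-frame sequence encoding four alanines; not a PCV sequence.
for codon, value in sorted(rscu("GCTGCTGCAGCC").items()):
    print(codon, value)
```

An RSCU of 1.0 means a codon is used as often as expected under equal usage; values above or below 1.0 indicate over- or under-representation within its synonymous family.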


Biology ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 63
Author(s):  
Xiu-Xiu Guo ◽  
Xiao-Jian Qu ◽  
Xue-Jie Zhang ◽  
Shou-Jin Fan

Aristidoideae is a subfamily in the PACMAD clade of the family Poaceae and includes three genera: Aristida, Stipagrostis, and Sartidia. In this study, the plastomes of Aristida adscensionis and Stipagrostis pennata were newly sequenced, and a total of 16 Aristidoideae plastomes were compared. All plastomes were conserved in genome size, gene number, structure, and IR boundary. Repeat sequence analysis showed that forward and palindromic repeats were the most common repeat types. The number of SSRs ranged from 30 (Sartidia isaloensis) to 54 (Aristida purpurea). Codon usage analysis showed that plastome genes preferred codons ending with A/T. A total of 12 highly variable regions were identified, including four protein-coding sequences (matK, ndhF, infA, and rpl32) and eight non-coding sequences (rpl16-1-rpl16-2, ccsA-ndhD, trnY-GUA-trnD-GUC, ndhF-rpl32, petN-trnC-GCA, trnT-GGU-trnE-UUC, trnG-GCC-trnfM-CAU, and rpl32-trnL-UAG). Furthermore, the phylogenetic position of this subfamily and its intergeneric relationships need to be clarified. All Maximum Likelihood and Bayesian Inference trees strongly support the monophyly of Aristidoideae and of each of the three genera, and the clade of Aristidoideae and Panicoideae was sister to the other subfamilies in the PACMAD clade. Within Aristidoideae, Aristida is sister to the clade composed of Stipagrostis and Sartidia. The divergence between C4 Stipagrostis and C3 Sartidia was estimated at 11.04 Ma, which may be associated with the drought event in the Miocene period. Finally, the differences in carbon fixation patterns, geographical distributions, and ploidy may be related to the difference in species numbers among these three genera. This study provides insights into the phylogeny and evolution of the subfamily Aristidoideae.


2021 ◽  
Author(s):  
Yildirim Dogan ◽  
Cecilia N. Barese ◽  
Jeffrey W. Schindler ◽  
John K. Yoon ◽  
Zeenath Unnisa ◽  
...  

Pompe disease is a rare genetic neuromuscular disorder caused by acid alpha-glucosidase (GAA) deficiency, resulting in lysosomal glycogen accumulation and progressive myopathy. Enzyme replacement therapy (ERT) is the current standard of care, which prolongs the quality of life for Pompe patients. However, ERT is limited by the lack of enzyme penetration into the central nervous system (CNS) and skeletal muscles, by immunogenicity against the recombinant enzyme, and by the need for life-long biweekly infusions. In a preclinical mouse model, a clinically relevant promoter to drive lentiviral vector-mediated expression of engineered GAA in autologous hematopoietic stem and progenitor cells (HSPC) was tested with nine unique human chimeric GAA coding sequences incorporating distinct peptide tags and codon-optimization iterations. Vectors including glycosylation-independent lysosomal targeting (GILT) tags resulted in effective GAA enzyme delivery into key disease tissues, with enhanced reduction of glycogen and of myofiber and CNS vacuolation compared to non-tagged GAA in Gaa knockout mice, a model of Pompe disease. Genetically modified microglial cells were detected in brains at low levels, but provided robust correction. Furthermore, an amino acid substitution added to the tag reduced its capacity to induce insulin signaling, and there was no evidence of off-target effects. This study demonstrated the therapeutic potential of lentiviral HSPC gene therapy exploiting optimized, tagged GAA coding sequences to reverse Pompe disease pathology in a preclinical mouse model, providing a promising vector candidate for further investigation.

