Intra-individual heteroplasmy in the Gentiana tongolensis plastid genome (Gentianaceae)

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e8025 ◽  
Author(s):  
Shan-Shan Sun ◽  
Xiao-Jun Zhou ◽  
Zhi-Zhong Li ◽  
Hong-Yang Song ◽  
Zhi-Cheng Long ◽  
...  

Chloroplasts are typically inherited from the female parent and are haploid in most angiosperms, but rare intra-individual heteroplasmy in plastid genomes has been reported in plants. Here, we report an example of plastome heteroplasmy and its characteristics in Gentiana tongolensis (Gentianaceae). The plastid genome of G. tongolensis is 145,757 bp in size and is missing parts of the petD gene when compared with other Gentiana species. A total of 112 single nucleotide polymorphisms (SNPs) and 31 indels with frequencies of more than 2% were detected in the plastid genome, and most were located in protein-coding regions. Most sites with SNP frequencies of more than 10% were located in six genes in the LSC region. After verification via cloning and Sanger sequencing at three loci, heteroplasmy was identified in different individuals. The cause of heteroplasmy at the nucleotide level in the plastome of G. tongolensis is unclear from the present data, although biparental plastid inheritance and transfer of plastid DNA seem to be the most likely explanations. This study implies that botanists should reconsider the heredity and evolution of chloroplasts and be cautious when using chloroplasts as genetic markers, especially in Gentiana.
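The frequency thresholds mentioned in this abstract (more than 2%, more than 10%) can be illustrated with a minimal sketch. It assumes that "frequency" means the fraction of reads supporting the second most common allele at a site; the abstract does not state the exact definition. The function names, the `pileup` structure, and the read counts are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(counts):
    """Fraction of reads carrying the second most common allele at one site.

    `counts` maps an allele (e.g. "A") to its read count. This definition of
    frequency is an assumption, not the paper's stated method.
    """
    total = sum(counts.values())
    ranked = sorted(counts.values(), reverse=True)
    if total == 0 or len(ranked) < 2:
        return 0.0
    return ranked[1] / total


def heteroplasmic_sites(pileup, min_freq=0.02):
    """Return {position: frequency} for sites strictly above `min_freq`."""
    result = {}
    for pos, counts in pileup.items():
        maf = minor_allele_frequency(counts)
        if maf > min_freq:
            result[pos] = maf
    return result


# Hypothetical per-site read counts, for illustration only.
example_pileup = {
    100: {"A": 980, "G": 20},   # exactly 2%: not "more than 2%"
    200: {"C": 850, "T": 150},  # 15%: passes both thresholds
    300: {"G": 1000},           # no second allele
}

print(heteroplasmic_sites(example_pileup))
print(heteroplasmic_sites(example_pileup, min_freq=0.10))
```

A real analysis would also need to account for sequencing error and for nuclear or mitochondrial copies of plastid DNA, which this sketch ignores.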


2021 ◽  
Vol 12 ◽  
Author(s):  
Fabien Degalez ◽  
Frédéric Jehl ◽  
Kévin Muret ◽  
Maria Bernard ◽  
Frédéric Lecerf ◽  
...  

Most single-nucleotide polymorphisms (SNPs) are located in non-coding regions, but the fraction usually studied is harbored in protein-coding regions because potential impacts on proteins are relatively easy to predict by popular tools such as the Variant Effect Predictor. These tools annotate variants independently without considering the potential effect of grouped or haplotypic variations, often called “multi-nucleotide variants” (MNVs). Here, we used a large RNA-seq dataset to survey MNVs, comprising 382 chicken samples originating from 11 populations analyzed in the companion paper, in which 9.5M SNPs (including 3.3M SNPs with reliable genotypes) were detected. We focused our study on in-codon MNVs and evaluated their potential mis-annotation. Using GATK HaplotypeCaller read-based phasing results, we identified 2,965 MNVs observed in at least five individuals and located in 1,792 genes. We found that 41.1% of them showed a novel impact when compared to the effect of their constituent SNPs analyzed separately. The biggest impact variation flux concerns the originally annotated stop-gained consequences, for which around 95% were rescued; this flux is followed by the missense consequences, for which 37% were reannotated with a different amino acid. We then present in more depth the rescued stop-gained MNVs and give an illustration in the SLC27A4 gene. As previously shown in human datasets, our results in chicken demonstrate the value of haplotype-aware variant annotation and of considering MNVs in the coding region, particularly when searching for severe functional consequences such as stop-gained variants.
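The "rescued stop-gained" effect described in this abstract can be sketched with a toy example, assuming the standard genetic code and two phased SNPs in the same codon. The helper names and the example codon are hypothetical. They are not the authors' pipeline and not the Variant Effect Predictor's API.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def consequence(ref_codon, alt_codon):
    """Classify a codon change with a simplified set of consequence terms."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


def apply_snps(codon, snps):
    """Apply (position, alt_base) substitutions; positions are 0-based."""
    bases = list(codon)
    for pos, alt in snps:
        bases[pos] = alt
    return "".join(bases)


def annotate(ref_codon, snps):
    """Return (per-SNP consequences, consequence of the combined MNV)."""
    separate = [consequence(ref_codon, apply_snps(ref_codon, [s])) for s in snps]
    combined = consequence(ref_codon, apply_snps(ref_codon, snps))
    return separate, combined


# Hypothetical codon CAA (Gln) with two SNPs on the same haplotype:
# C>T at position 0 alone gives TAA (stop); A>C at position 1 alone gives CCA (Pro).
# Together they give TCA (Ser), so the apparent stop-gained is rescued.
print(annotate("CAA", [(0, "T"), (1, "C")]))
```

Annotating each SNP independently reports one stop-gained and one missense variant, whereas the haplotype-aware view reports a single missense change.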



2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Jiaxin Wu ◽  
Rui Jiang

The identification of genetic variants that are responsible for human inherited diseases is a fundamental problem in human and medical genetics. As a typical type of genetic variation, nonsynonymous single-nucleotide polymorphisms (nsSNPs) occurring in protein coding regions may alter the encoded amino acid, potentially affect protein structure and function, and further result in human inherited diseases. Therefore, it is of great importance to develop computational approaches to facilitate the discrimination of deleterious nsSNPs from neutral ones. In this paper, we review databases that collect nsSNPs and summarize computational methods for the identification of deleterious nsSNPs. We classify the existing methods for characterizing nsSNPs into three categories (sequence based, structure based, and annotation based), and we introduce machine learning models for the prediction of deleterious nsSNPs. We further discuss methods for identifying deleterious nsSNPs in noncoding variants and those for dealing with rare variants.



2020 ◽  
Vol 21 (11) ◽  
pp. 1068-1077
Author(s):  
Xiaochao Sun ◽  
Bin Yang ◽  
Qunye Zhang

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. Moreover, we also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases and the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings have added new insights into our understanding of the spatial distribution of genome and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.



2020 ◽  
Author(s):  
Zhong-Yin Zhou ◽  
Hang Liu ◽  
Yue-Dong Zhang ◽  
Yin-Qiao Wu ◽  
Min-Sheng Peng ◽  
...  

Abstract Understanding the mutational and evolutionary dynamics of SARS-CoV-2 is essential for treating COVID-19 and the development of a vaccine. Here, we analyzed 15,818 publicly available assembled SARS-CoV-2 genome sequences, along with 2,350 raw sequence datasets sampled worldwide. We investigated the distribution of inter-host single nucleotide polymorphisms (inter-host SNPs) and intra-host single nucleotide variations (iSNVs). Mutations have been observed at 35.6% (10,649/29,903) of the bases in the genome. The substitution rate in some protein-coding regions is higher than the average in SARS-CoV-2 viruses, and the high substitution rate in some regions might be driven to escape immune recognition by diversifying selection. Both recurrent mutations and human-to-human transmission are mechanisms that generate fitness advantageous mutations. Furthermore, the frequency of three mutations (S protein, F400L; ORF3a protein, T164I; and ORF1a protein, Q6383H) has gradually increased over time on lineages, which provides new clues for the early detection of fitness advantageous mutations. Our study provides theoretical support for vaccine development and the optimization of treatment for COVID-19. We call on researchers to submit raw sequence data to public databases.



2021 ◽  
Author(s):  
ZHIYONG Chen ◽  
Yancen He ◽  
Yasir Iqbal ◽  
Yanlan Shi ◽  
Hongmei Huang ◽  
...  

Abstract Background: Miscanthus, which is a leading dedicated-energy grass in Europe and in parts of Asia, is expected to play a key role in the development of the future bioeconomy. However, due to its complex genetic background, it is difficult to investigate phylogenetic relationships and the evolution of gene function in this genus. Here, we investigated 50 Miscanthus germplasms: 1 female parent (M. lutarioriparius), 30 candidate male parents (M. lutarioriparius, M. sinensis, and M. sacchariflorus), and 19 offspring. We used high-throughput Specific-Locus Amplified Fragment sequencing (SLAF-seq) to identify informative single nucleotide polymorphisms (SNPs) in all germplasms. Results: We identified 800,081 SLAF tags, of which 160,368 were polymorphic. Each tag was 264–364 bp long. The obtained SNPs were used to investigate genetic relationships within Miscanthus. We constructed a phylogenetic tree of the 50 germplasms using the obtained SNPs, and found that the germplasms fell into two clades: one clade of M. sinensis only and one clade that included the offspring, M. lutarioriparius, and M. sacchariflorus. Genetic cluster analysis indicated that M. lutarioriparius germplasm C3 was the most likely male parent of the offspring. Conclusions: As a high-throughput sequencing method, SLAF-seq can be used to identify informative SNPs in Miscanthus germplasms and to rapidly characterize genetic relationships within this genus. Our results will support the development of breeding programs utilizing Miscanthus cultivars with elite biomass- or fiber-production potential.



2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi1-vi1
Author(s):  
Kristen Drucker ◽  
Connor Yanchus ◽  
Thomas Kollmeyer ◽  
Asma Ali ◽  
Decker Paul ◽  
...  

Abstract BACKGROUND Determination of the causation of germline single nucleotide polymorphisms (SNPs) located in non-coding regions of the genome is challenging. The genomic region of 8q24 has been identified as important in many kinds of cancer, linked to a topologically associated domain (TAD) encompassing MYC; this TAD contains a GWAS SNP (rs55705857) associated with IDH-mutant glioma. METHODS Germline genotyping data from 622 IDH-mutant glioma and 668 controls were used to fine map the rs55705857 locus by detailed haplotype analysis. Chromatin immunoprecipitation sequencing (ChIP-seq) of histone markers H3K4me1, H3K4me3, H3K27ac and H3K36me3 was performed on normal brain samples (n=8) and human glioma samples (n=11 IDH-wt and 52 IDH-mut). RNAseq from 9 normal and 83 brain tumors (n=26 IDH-wt and 55 IDH-mut) were used to assess differential gene expression. RESULTS Fine-mapping identified rs55705857 SNP as the most likely causative allele (OR=8.69; p<0.001) within 8q24 for the development of IDH-mutant glioma. At rs55705857, both H3K27ac and H3K4me1 in IDH-mutant vs IDH-wt tumors were increased 3.05- and 1.58-fold, respectively (DiffBind; p=5.81×10⁻⁷ and p=2.31×10⁻³). ChromHMM analysis of the marks indicated that promoter and enhancer functions were significantly increased, and the activity broadened at rs55705857 in IDH-mut gliomas compared to IDH-wt tumors and normal brain samples. This enhancement correlated with significantly increased MYC expression in IDH-mut gliomas (p=3.1×10⁻¹³), as well as alterations of Myc signaling targets. Publicly available ATACseq, ChIPseq and long-range DNA interaction data demonstrated that the rs55705857 locus is open and interacts with the MYC promoter. CONCLUSIONS Fine-mapping of the 8q24 locus provided strong evidence that rs55705857 is the causative 8q24 locus associated with IDH-mut glioma. Functional experiments suggest that IDH mutation facilitates rs55705857 interaction with MYC to alter downstream MYC targets.



2020 ◽  
Vol 29 (R2) ◽  
pp. R197-R204 ◽  
Author(s):  
Adi Danieli ◽  
Argyris Papantonis

Abstract Human chromosomes are large spatially and hierarchically structured entities, the integrity of which needs to be preserved throughout the lifespan of the cell and in conjunction with cell cycle progression. Preservation of chromosomal structure is important for proper deployment of cell type-specific gene expression programs. Thus, aberrations in the integrity and structure of chromosomes will predictably lead to disease, including cancer. Here, we provide an updated standpoint with respect to chromatin misfolding and the emergence of various cancer types. We discuss recent studies implicating the disruption of topologically associating domains, switching between active and inactive compartments, rewiring of promoter–enhancer interactions in malignancy as well as the effects of single nucleotide polymorphisms in non-coding regions involved in long-range regulatory interactions. In light of these findings, we argue that chromosome conformation studies may now also be useful for patient diagnosis and drug target discovery.



GigaScience ◽  
2019 ◽  
Vol 8 (10) ◽  
Author(s):  
Bo Song ◽  
Yue Song ◽  
Yuan Fu ◽  
Elizabeth Balyejusa Kizito ◽  
Sandra Ndagire Kamenya ◽  
...  

Abstract Background The African eggplant (Solanum aethiopicum) is a nutritious traditional vegetable used in many African countries, including Uganda and Nigeria. It is thought to have been domesticated in Africa from its wild relative, Solanum anguivi. S. aethiopicum has been routinely used as a source of disease resistance genes for several Solanaceae crops, including Solanum melongena. A lack of genomic resources has meant that breeding of S. aethiopicum has lagged behind other vegetable crops. Results We assembled a 1.02-Gb draft genome of S. aethiopicum, which contained predominantly repetitive sequences (78.9%). We annotated 37,681 gene models, including 34,906 protein-coding genes. Expansion of disease resistance genes was observed via 2 rounds of amplification of long terminal repeat retrotransposons, which may have occurred ∼1.25 and 3.5 million years ago, respectively. By resequencing 65 S. aethiopicum and S. anguivi genotypes, 18,614,838 single-nucleotide polymorphisms were identified, of which 34,171 were located within disease resistance genes. Analysis of domestication and demographic history revealed active selection for genes involved in drought tolerance in both “Gilo” and “Shum” groups. A pan-genome of S. aethiopicum was assembled, containing 51,351 protein-coding genes; 7,069 of these genes were missing from the reference genome. Conclusions The genome sequence of S. aethiopicum enhances our understanding of its biotic and abiotic resistance. The single-nucleotide polymorphisms identified are immediately available for use by breeders. The information provided here will accelerate selection and breeding of the African eggplant, as well as other crops within the Solanaceae family.



2006 ◽  
Vol 188 (12) ◽  
pp. 4453-4463 ◽  
Author(s):  
Patrick S. G. Chain ◽  
Ping Hu ◽  
Stephanie A. Malfatti ◽  
Lyndsay Radnedge ◽  
Frank Larimer ◽  
...  

ABSTRACT Yersinia pestis, the causative agent of bubonic and pneumonic plagues, has undergone detailed study at the molecular level. To further investigate the genomic diversity among this group and to help characterize lineages of the plague organism that have no sequenced members, we present here the genomes of two isolates of the “classical” antiqua biovar, strains Antiqua and Nepal516. The genomes of Antiqua and Nepal516 are 4.7 Mb and 4.5 Mb and encode 4,138 and 3,956 open reading frames, respectively. Though both strains belong to one of the three classical biovars, they represent separate lineages defined by recent phylogenetic studies. We compare all five currently sequenced Y. pestis genomes and the corresponding features in Yersinia pseudotuberculosis. There are strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. We found 453 single nucleotide polymorphisms in protein-coding regions, which were used to assess the evolutionary relationships of these Y. pestis strains. Gene reduction analysis revealed that the gene deletion processes are under selective pressure, and many of the inactivations are probably related to the organism's interaction with its host environment. The results presented here clearly demonstrate the differences between the two biovar antiqua lineages and support the notion that grouping Y. pestis strains based strictly on the classical definition of biovars (predicated upon two biochemical assays) does not accurately reflect the phylogenetic relationships within this species. A comparison of four virulent Y. pestis strains with the human-avirulent strain 91001 provides further insight into the genetic basis of virulence to humans.


