Genomic characteristics and profile of microsatellite primers for Acanthogobius ommaturus by genome survey sequencing

2020 ◽  
Vol 40 (11) ◽  
Author(s):  
Bingjie Chen ◽  
Zhicheng Sun ◽  
Fangrui Lou ◽  
Tian-xiang Gao ◽  
Na Song

Abstract Acanthogobius ommaturus is a suitable species for studying the genetic mechanisms of adaptive evolution, but there are few reports on its genetics. In the present study, genome survey sequencing was used to analyze the genomic characteristics of A. ommaturus. A total of 50.50 G of high-quality sequence data were obtained. From the 19-mer frequency distribution, the estimated genome size was 928.01 Mb. The calculated sequence repeat rate was about 38.31%, the heterozygosity was approximately 0.17%, and the GC content was approximately 40.88%. Moreover, 475,724 simple sequence repeats (SSRs) were identified. Among them, dinucleotide repeats were the most abundant (53.70% of the total SSRs), followed by tri- (35.36%), hexa- (4.59%), tetra- (4.57%) and pentanucleotide (1.77%) repeats. This is the first report of genome-wide features for this species.
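
The k-mer based size estimate mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common rule of thumb that genome size is roughly the total number of k-mer occurrences divided by the depth of the main peak of the k-mer frequency histogram. The function names, the `min_depth` error cutoff and the toy histogram are hypothetical and are not taken from the paper, which does not state which tool or parameters were used.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Count each k-mer in the reads, then tally how many distinct
    k-mers were seen at each depth (depth -> number of distinct k-mers)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Rule-of-thumb estimate: total k-mer occurrences / main peak depth.

    k-mers seen fewer than `min_depth` times are dropped as likely
    sequencing errors (an assumed cutoff, chosen for illustration).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    # Main peak: the depth at which the most distinct k-mers occur.
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical numbers): an error spike at depth 1,
# a main peak at depth 21 and a small repeat shoulder at depth 42.
toy = {1: 1000, 20: 500, 21: 800, 22: 500, 42: 50}
print(estimate_genome_size(toy))
```

Real survey pipelines typically also model heterozygous and repeat peaks, so this sketch should be read only as the basic arithmetic behind such estimates.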

Animals ◽  
2019 ◽  
Vol 9 (10) ◽  
pp. 756 ◽  
Author(s):  
Li ◽  
Tian ◽  
Huang ◽  
Lin ◽  
Wang ◽  
...  

Sillago sihama has high economic value and is one of the most attractive aquaculture species in China. Despite its economic importance, its genome has barely been studied. In this study, we conducted the first genomic survey of S. sihama using next-generation sequencing (NGS). In total, 45.063 Gb of high-quality sequence data were obtained. From the 17-mer frequency distribution, the genome size was estimated to be 508.50 Mb. The sequence repeat ratio was calculated to be 21.25%, and the heterozygosity ratio was 0.92%. Reads were assembled into 1,009,363 contigs, with an N50 length of 1362 bp, and then into 814,219 scaffolds, with an N50 length of 2173 bp. The average guanine and cytosine (GC) content was 45.04%. Dinucleotide repeats (56.55%) were the dominant form of simple sequence repeats (SSRs).


Animals ◽  
2019 ◽  
Vol 9 (12) ◽  
pp. 1117 ◽  
Author(s):  
Yuanqing Huang ◽  
Dongneng Jiang ◽  
Ming Li ◽  
Umar Farouk Mustapha ◽  
Changxu Tian ◽  
...  

The spotted scat, Scatophagus argus, is a species of fish that is widely propagated within the Chinese aquaculture industry and therefore has significant economic value. Despite this, studies of its genome are severely lacking. In the present study, a genomic survey of S. argus was conducted using next-generation sequencing (NGS). In total, 55.699 GB (female) and 51.047 GB (male) of high-quality sequence data were obtained. Genome sizes were estimated to be 598.73 (female) and 597.60 (male) Mbp. The sequence repeat ratios were calculated to be 27.06% (female) and 26.99% (male). Heterozygosity ratios were 0.37% for females and 0.38% for males. Reads were assembled into 444,961 (female) and 453,459 (male) contigs with N50 lengths of 5,747 and 5,745 bp for females and males, respectively. The average guanine-cytosine (GC) content of the female genome was 41.78%, and 41.82% for the male. A total of 42,869 (female) and 43,283 (male) genes were annotated to the non-redundant (NR) and SwissProt databases. The female and male genomes contained 66.6% and 67.8% BUSCO core genes, respectively. Dinucleotide repeats were the dominant form of simple sequence repeats (SSR) observed in females (68.69%) and males (68.56%). Additionally, gene fragments of Dmrt1 were only observed in the male genome. This is the first report of a genome-wide characterization of S. argus.


Author(s):  
Lin Ma ◽  
Xiao Wang ◽  
Min Yan ◽  
Fang Liu ◽  
Shuxing Zhang ◽  
...  

Abstract Background: Common vetch (Vicia sativa L.) is an annual legume with excellent suitability in cold and dry regions. Despite its great applied potential, genomic information on common vetch remains unavailable. Methods and results: In the present study, a whole-genome survey of common vetch was performed using next-generation sequencing (NGS). A total of 79.84 Gbp of high-quality sequence data were obtained and assembled into 3,754,145 scaffolds with an N50 length of 3556 bp. According to the K-mer analyses, the genome size, heterozygosity rate and GC content of the common vetch genome were estimated to be 1568 Mbp, 0.4345% and 35%, respectively. In addition, a total of 76,810 putative simple sequence repeats (SSRs) were identified. Among them, dinucleotide was the most abundant SSR type (44.94%), followed by tri- (35.82%), tetra- (13.22%), penta- (4.47%) and hexanucleotide (1.54%). Furthermore, a total of 58,175 SSR primer pairs were designed and ten of them were validated in Chinese common vetch. Further analysis showed that Chinese common vetch harbored high genetic diversity and could be clustered into two main subgroups. Conclusion: This is the first report on the genome features of common vetch, and the information will help to design whole-genome sequencing strategies. The newly identified SSRs in this study provide basic molecular markers for germplasm characterization, genetic diversity and QTL mapping studies for common vetch.


2021 ◽  
Author(s):  
Lin Ma ◽  
Xiao Wang ◽  
Min Yan ◽  
Fang Liu ◽  
Xuemin Wang

Abstract Common vetch (Vicia sativa L.) is an annual legume with excellent suitability in cold and dry regions. Despite its great applied potential, genomic information on common vetch remains unavailable. In the present study, a whole-genome survey of common vetch was performed using next-generation sequencing (NGS). A total of 79.84 Gbp of high-quality sequence data were obtained and assembled into 3,754,145 scaffolds with an N50 length of 3,556 bp. According to the K-mer analyses, the genome size, heterozygosity rate and GC content of the common vetch genome were estimated to be 1,568 Mbp, 0.4345% and 35%, respectively. In addition, a total of 76,810 putative simple sequence repeats (SSRs) were identified. Among them, dinucleotide was the most abundant SSR type (44.94%), followed by tri- (35.82%), tetra- (13.22%), penta- (4.47%) and hexanucleotide (1.54%). Furthermore, a total of 58,175 SSR primer pairs were designed and ten of them were validated in Chinese common vetch. Further analysis showed that Chinese common vetch harbored high genetic diversity and could be clustered into two main subgroups. This is the first report on the genome features of common vetch, and the information will help to design whole-genome sequencing strategies. The newly identified SSRs in this study provide basic molecular markers for germplasm characterization, genetic diversity and QTL mapping studies for common vetch.


2021 ◽  
Author(s):  
Xin Peng ◽  
Zhende Yang ◽  
Lei Xu ◽  
Hantang Wang ◽  
Chunhui Guo ◽  
...  

Abstract The white-striped longhorn beetle Batocera horsfieldi (Coleoptera: Cerambycidae) is a polyphagous wood-boring pest that causes substantial damage to the lumber, fruit and nut industries. Here, next-generation sequencing was used to generate a whole-genome survey dataset to provide fundamental information on its genome and to develop genome-wide microsatellite markers. The genome size of B. horsfieldi was estimated to be approximately 520 Mb by K-mer analysis, and its heterozygosity ratio and repeat sequence ratio were 0.26% and 51.03%, respectively. The assembled genome was 528.56 Mb with a GC content of 35.40%. A total of 121,750 microsatellite motifs were identified. The most frequent repeat motif was mononucleotide, with a frequency of 85.84%, followed by dinucleotide (8.08%), trinucleotide (5.04%), tetranucleotide (0.73%), pentanucleotide (0.20%) and hexanucleotide (0.12%) motifs. AT/AT, TA/TA and GA/TC were the most abundant dinucleotide motifs, and AAT/ATT, TAA/TTA and ATA/TAT were the most abundant trinucleotide motifs. Ninety-six pairs of SSR primers were randomly selected for PCR amplification and agarose gel electrophoresis, of which 56 pairs amplified the target fragment effectively. In summary, various candidate microsatellite markers were identified and characterized in this study using genome survey analysis.
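
The classification of microsatellites by motif length (mono- to hexanucleotide) reported above can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: the minimum repeat thresholds in `MIN_REPEATS`, the function names and the toy sequence are all assumptions, and the abstract does not say which SSR search tool or settings were applied.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (hypothetical settings).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit
    (for example 'AA' or 'ATAT' are not primitive)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


def class_percentages(hits):
    """Percentage of SSRs in each motif-length class."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in counts.items()}


# Toy sequence with one mono-, one di- and one trinucleotide repeat.
toy = "GG" + "A" * 12 + "CC" + "AT" * 7 + "GG" + "AAT" * 5 + "C"
hits = find_ssrs(toy)
print(hits)
print(class_percentages(hits))
```

The sketch reports only perfect repeats and does not merge complementary motifs such as AT/AT or GA/TC, which dedicated tools usually handle.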


2020 ◽  
Vol 40 (6) ◽  
Author(s):  
Jingmiao Li ◽  
Siqiao Li ◽  
Lijuan Kong ◽  
Lihua Wang ◽  
Anzhi Wei ◽  
...  

Abstract Zanthoxylum bungeanum, a spice and medicinal plant, is cultivated in many parts of China and some countries in Southeast Asia; however, data on its genome are lacking. In the present study, we performed a whole-genome survey and developed novel genomic-SSR markers of Z. bungeanum. Clean data (∼197.16 Gb) were obtained and assembled into 11185221 scaffolds with an N50 of 183 bp. K-mer analysis revealed that Z. bungeanum has an estimated genome size of 3971.92 Mb, and the GC content, heterozygous rate, and repeat sequence rate are 37.21%, 1.73%, and 86.04%, respectively. These results indicate that the genome of Z. bungeanum is complex. Furthermore, 27153 simple sequence repeat (SSR) loci were identified from 57288 scaffolds with a minimum length > 1 kb. Mononucleotide repeats (19706) were the most abundant type, followed by dinucleotide repeats (5154). The most common motifs were A/T, followed by AT/AT; these SSRs accounted for 71.42% and 11.84% of all repeats, respectively. A total of 21243 non-repeating primer pairs were designed, and 100 were randomly selected and validated by PCR analysis using DNA from 10 Z. bungeanum individuals and 5 Zanthoxylum armatum individuals. Finally, 36 polymorphic SSR markers were developed with polymorphism information content (PIC) values ranging from 0.16 to 0.75. Cluster analysis revealed that Z. bungeanum and Z. armatum could be divided into two major clusters, suggesting that these newly developed SSR markers are useful for genetic diversity and germplasm resource identification in Z. bungeanum and Z. armatum.


2021 ◽  
Vol 43 (3) ◽  
pp. 1282-1292
Author(s):  
Tianyan Yang ◽  
Xinxin Huang ◽  
Zijun Ning ◽  
Tianxiang Gao

Harpadon nehereus supports one of the most important commercial fisheries along the Bay of Bengal and the southeast coast of China. In this study, the first genome-wide survey dataset for the species, produced using next-generation sequencing (NGS), was used to provide general information on the genome size, heterozygosity and repeat sequence ratio of H. nehereus. About 68.74 GB of high-quality sequence data were obtained in total, and the genome size was estimated to be 1315 Mb from the 17-mer frequency distribution. The sequence repeat ratio and heterozygosity were calculated to be 52.49% and 0.67%, respectively. A total of 1,027,651 microsatellite motifs were identified, and dinucleotide repeats were the dominant simple sequence repeat (SSR) motif, with a frequency of 54.35%. As a by-product of whole-genome sequencing, the mitochondrial genome is a powerful tool for investigating the evolutionary relationships between H. nehereus and its relatives. A maximum likelihood (ML) phylogenetic tree was constructed from the concatenated matrix of amino acids translated from the 13 protein-coding genes (PCGs). The two species of the genus Harpadon were recovered as monophyletic, and they formed a monophyletic clade with Saurida with a bootstrap value of 100%. These results should help advance genomics research and open the way to molecular diversity and conservation genetics studies on this species.


Nature ◽  
2021 ◽  
Vol 590 (7845) ◽  
pp. 290-299 ◽  
Author(s):  
Daniel Taliun ◽  
Daniel N. Harris ◽  
Michael D. Kessler ◽  
Jedidiah Carlson ◽  
...  

Abstract The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Taras K Oleksyk ◽  
Walter W Wolfsberger ◽  
Alexandra M Weber ◽  
Khrystyna Shchubelka ◽  
Olga T Oleksyk ◽  
...  

Abstract Background: The main goal of this collaborative effort is to provide genome-wide data for the previously underrepresented population in Eastern Europe, and to provide cross-validation of the data from genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes by an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that has been resequenced by Illumina NovaSeq6000 S4 at high coverage. Results: The genome data have been searched for genomic variation represented in this population, and a number of variants have been reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest survey to date of genetic variation in Ukraine, creating a public reference resource aiming to provide data for medical research in a large understudied population. Conclusions: Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information, bringing forth novel, endemic and medically related alleles.


2009 ◽  
Vol 16 (4) ◽  
pp. 555-564 ◽  
Author(s):  
Andrey Ilatovskiy ◽  
Michael Petukhov
