A First Genome Survey and Genomic SSR Marker Analysis of Trematomus loennbergii Regan, 1913

Animals ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 3186
Author(s):  
Eunkyung Choi ◽  
Sun Hee Kim ◽  
Seung Jae Lee ◽  
Euna Jo ◽  
Jinmu Kim ◽  
...  

Trematomus loennbergii Regan, 1913, is an evolutionarily important marine fish species distributed in the Antarctic Ocean. However, its genome has not been studied to date. In the present study, whole genome sequencing was performed using next-generation sequencing (NGS) technology to characterize its genome and develop genomic microsatellite markers. The 25-mer frequency distribution was judged to be the best fit, and the genome size was predicted to be 815,042,992 bp. The heterozygosity, average rate of read duplication, and sequencing error rate were 0.536%, 0.724%, and 0.292%, respectively. These data were used to analyze microsatellite markers, and a total of 2,264,647 repeat motifs were identified. The most frequent repeat motif class was di-nucleotide (87.00%), followed by tri-nucleotide (10.45%), tetra-nucleotide (1.94%), penta-nucleotide (0.34%), and hexa-nucleotide (0.27%). The AC repeat motif was the most abundant motif among di-nucleotides and among all repeat motifs. From the identified microsatellites, 181 markers were selected, and several were validated by PCR; 15 markers produced only one band. In summary, these results provide a good basis for further studies, including evolutionary biology studies and population genetics of Antarctic fish species.
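
The motif-class percentages reported in this abstract come from scanning sequence for tandem repeats and tallying them by motif length. The following is a minimal sketch of that idea, not the authors' pipeline: the function names and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re
from collections import Counter

# Assumed thresholds (not taken from the abstract): minimum number of
# tandem copies required for each motif length, di- to hexa-nucleotide.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # The group captures one motif; the backreference demands at least
        # min_rep - 1 further identical copies immediately after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. ACAC),
            # so an AC run is not also counted as a tetra-nucleotide repeat.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return hits


def motif_class_frequencies(hits):
    """Percentage of repeats in each motif-length class."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in counts.items()}


# Toy sequence (hypothetical, for illustration only).
toy = "GGG" + "AC" * 8 + "TTGCA" + "AGT" * 5 + "C"
toy_hits = find_ssrs(toy)
```

On real reads or contigs the same tally would be run per sequence and summed; dedicated SSR tools additionally handle compound and imperfect repeats, which this sketch ignores.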

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Kelley Paskov ◽  
Jae-Yoon Jung ◽  
Brianna Chrisman ◽  
Nate T. Stockham ◽  
Peter Washington ◽  
...  

Background: As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since they allow us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample.
Results: We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method’s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: 1) Sequencing error rates between samples in the same dataset can vary by over an order of magnitude. 2) Variant calling performance decreases substantially in low-complexity regions of the genome. 3) Variant calling performance in whole exome sequencing data decreases with distance from the nearest target region. 4) Variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood. 5) Whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites.
Conclusion: Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
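
The observable signal this abstract builds on is the Mendelian error: a child genotype that cannot be formed by taking one allele from each parent. The sketch below only counts such errors in a parent-child trio; it does not reproduce the paper's step of turning those counts into per-sample precision and recall. The function names and the tuple encoding of genotypes (0 = reference allele, 1 = alternate allele) are assumptions for illustration.

```python
def is_mendelian_consistent(child, mother, father):
    """True if the child's two alleles can be split one-per-parent."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def mendelian_error_rate(trio_calls):
    """Fraction of sites whose (child, mother, father) genotypes conflict."""
    errors = sum(not is_mendelian_consistent(c, m, f) for c, m, f in trio_calls)
    return errors / len(trio_calls)


# Hypothetical genotype calls at four sites, as (child, mother, father).
example_calls = [
    ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from mother, 1 from father
    ((1, 1), (0, 0), (0, 1)),  # error: mother cannot transmit allele 1
    ((0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 1), (0, 0), (0, 0)),  # error: neither parent carries allele 1
]
```

Only a fraction of sequencing errors surface this way, since many wrong calls still happen to be Mendelian-consistent, which is why the abstract describes extrapolating from the observed errors to genome-wide estimates.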


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jessica Garcia ◽  
Nick Kamps-Hughes ◽  
Florence Geiguer ◽  
Sébastien Couraud ◽  
Brice Sarver ◽  
...  

Circulating cell-free DNA (cfDNA) has the potential to be a specific biomarker for the therapeutic management of lung cancer patients. Here, a new sequencing error-reduction method based on molecular amplification pools (MAPs) was utilized to analyze cfDNA in lung cancer patients. We determined the accuracy of MAPs plasma sequencing with respect to droplet digital polymerase chain reaction (ddPCR) assays, and tested whether actionable mutation discovery is improved by next-generation sequencing (NGS) in a clinical setting. This study reports data from 356 lung cancer patients receiving plasma testing as part of routine clinical management. Sequencing of cfDNA via MAPs had a sensitivity of 98.5% and a specificity of 98.9%. The ddPCR assay was used as the reference, since it is an established, accurate assay that can be performed contemporaneously on the same plasma sample. MAPs sequencing detected somatic variants in 261 of 356 samples (73%). Non-actionable clonal hematopoiesis-associated variants were identified via sequencing in 21% of samples. The accuracy of this cfDNA sequencing approach was similar to that of ddPCR assays in a clinical setting, down to an allele frequency of 0.1%. Due to broader coverage and high sensitivity for insertions and deletions, sequencing via MAPs afforded important detection of additional actionable mutations.
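
Sensitivity and specificity, as quoted in this abstract, are ratios over a confusion table built against the reference assay (here ddPCR). The sketch below shows the arithmetic only. The abstract does not report the underlying counts, so the numbers in the example are made up and are not the study's data; the function names are likewise hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-positive results that the test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of reference-negative results that the test also calls negative."""
    return true_neg / (true_neg + false_pos)


# Made-up counts for illustration only (NOT from the study).
example_sensitivity = sensitivity(true_pos=90, false_neg=10)
example_specificity = specificity(true_neg=95, false_pos=5)
```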


2000 ◽  
Vol 12 (3) ◽  
pp. 257-257 ◽  
Author(s):  
Andrew Clarke

Theodosius Dobzhansky once remarked that nothing in biology makes sense other than in the light of evolution, thereby emphasising the central role of evolutionary studies in providing the theoretical context for all of biology. It is perhaps surprising then that evolutionary biology has played such a small role to date in Antarctic science. This is particularly so when it is recognised that the polar regions provide us with an unrivalled laboratory within which to undertake evolutionary studies. The Antarctic exhibits one of the classic examples of a resistance adaptation (antifreeze peptides and glycopeptides, first described from Antarctic fish), and provides textbook examples of adaptive radiations (for example amphipod crustaceans and notothenioid fish). The land is still largely in the grip of major glaciation, and the once rich terrestrial floras and faunas of Cenozoic Gondwana are now highly depauperate and confined to relatively small patches of habitat, often extremely isolated from other such patches. Unlike the Arctic, where organisms are returning to newly deglaciated land from refugia on the continental landmasses to the south, recolonization of Antarctica has had to take place by the dispersal of propagules over vast distances. Antarctica thus offers an insight into the evolutionary responses of terrestrial floras and faunas to extreme climatic change unrivalled in the world. The sea forms a strong contrast to the land in that here the impact of climate appears to have been less severe, at least in as much as few elements of the fauna show convincing signs of having been completely eradicated.


2021 ◽  
Author(s):  
Xin Peng ◽  
Zhende Yang ◽  
Lei Xu ◽  
Hantang Wang ◽  
Chunhui Guo ◽  
...  

The white-striped longhorn beetle Batocera horsfieldi (Coleoptera: Cerambycidae) is a polyphagous wood-boring pest that causes substantial damage to the lumber, fruit and nut industries. Here, next-generation sequencing was used to generate a whole genome survey dataset to provide fundamental information on its genome and to develop genome-wide microsatellite markers. The genome size of B. horsfieldi was estimated as approximately 520 Mb by K-mer analysis, and its heterozygosity ratio and repeat sequence ratio were 0.26% and 51.03%, respectively. The assembled genome was 528.56 Mb with a GC content of 35.40%. A total of 121,750 microsatellite motifs were identified. The most frequent repeat motif was mononucleotide with a frequency of 85.84%, followed by dinucleotide (8.08%), trinucleotide (5.04%), tetranucleotide (0.73%), pentanucleotide (0.20%) and hexanucleotide (0.12%) motifs. The AT/AT, TA/TA and GA/TC repeats were the most abundant dinucleotide motifs, and AAT/ATT, TAA/TTA and ATA/TAT were the most abundant trinucleotide motifs. Ninety-six pairs of SSR primers were randomly selected for PCR amplification and agarose gel electrophoresis, of which 56 pairs amplified the target fragment. In summary, various candidate microsatellite markers were identified and characterized in this study using genome survey analysis.
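
Genome size estimates from K-mer analysis, as in this abstract, rest on a simple relationship: the total number of k-mers counted in the reads divided by the depth at the main peak of the k-mer frequency histogram. The sketch below applies that relationship to a toy histogram. It is a simplified illustration, not the authors' procedure; the function name, the `min_depth` cutoff for discarding low-depth error k-mers, and the histogram values are all assumptions.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer histogram {depth: distinct k-mers}.

    K-mers seen fewer than min_depth times are treated as sequencing
    errors and ignored (an assumed cutoff, to be tuned per dataset).
    """
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    # The main peak approximates the average k-mer sequencing depth.
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram (hypothetical): an error spike at low depth, peak at depth 20.
toy_histogram = {1: 1000, 2: 200, 18: 50, 19: 80, 20: 100, 21: 80, 22: 50}
toy_size, toy_peak = estimate_genome_size(toy_histogram)
```

This peak-based estimate ignores heterozygous and repeat peaks; survey studies typically fit a model to the whole histogram to obtain heterozygosity and repeat ratios alongside the size.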


Genes ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 50
Author(s):  
Axel Barlow ◽  
Stefanie Hartmann ◽  
Javier Gonzalez ◽  
Michael Hofreiter ◽  
Johanna L. A. Paijmans

A standard practise in palaeogenome analysis is the conversion of mapped short read data into pseudohaploid sequences, frequently by selecting a single high-quality nucleotide at random from the stack of mapped reads. This controls for biases due to differential sequencing coverage, but it does not control for differential rates and types of sequencing error, which are frequently large and variable in datasets obtained from ancient samples. These errors have the potential to distort phylogenetic and population clustering analyses, and to mislead tests of admixture using D statistics. We introduce Consensify, a method for generating pseudohaploid sequences, which controls for biases resulting from differential sequencing coverage while greatly reducing error rates. The error correction is derived directly from the data itself, without the requirement for additional genomic resources or simplifying assumptions such as contemporaneous sampling. For phylogenetic and population clustering analysis, we find that Consensify is less affected by artefacts than methods based on single read sampling. For D statistics, Consensify is more resistant to false positives and appears to be less affected by biases resulting from different laboratory protocols than other frequently used methods. Although Consensify is developed with palaeogenomic data in mind, it is applicable for any low to medium coverage short read datasets. We predict that Consensify will be a useful tool for future studies of palaeogenomes.
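
The baseline this abstract compares against, selecting a single high-quality nucleotide at random from the stack of mapped reads, can be sketched as follows. This is the single-read sampling approach only, not Consensify itself, whose error-correction rule the abstract does not spell out. The function names, the `(base, quality)` pileup encoding, and the quality threshold of 30 are assumptions for illustration.

```python
import random


def pseudohaploid_call(pileup, min_quality=30, rng=random):
    """Pick one base at random among reads passing the quality filter.

    pileup is a list of (base, quality) pairs at one genome position;
    'N' is returned when no read passes the filter.
    """
    candidates = [base for base, quality in pileup if quality >= min_quality]
    return rng.choice(candidates) if candidates else "N"


def pseudohaploid_sequence(pileups, min_quality=30, seed=0):
    """Build a pseudohaploid sequence, one sampled base per position."""
    rng = random.Random(seed)  # fixed seed so the sampling is reproducible
    return "".join(pseudohaploid_call(p, min_quality, rng) for p in pileups)


# Hypothetical pileups at three positions.
toy_pileups = [
    [("A", 40), ("A", 35), ("C", 10)],  # the low-quality C is filtered out
    [("G", 5)],                         # no usable read at this position
    [("T", 40)],
]
toy_sequence = pseudohaploid_sequence(toy_pileups)
```

Because each position contributes exactly one read regardless of depth, coverage differences between samples do not bias the call, but any sequencing error in the sampled read passes straight into the sequence, which is the weakness the abstract highlights.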


1988 ◽  
Vol 66 (12) ◽  
pp. 2611-2617 ◽  
Author(s):  
Peter L. Davies ◽  
Choy L. Hew ◽  
Garth L. Fletcher

Many marine teleosts have adapted to ice-laden seawater by evolving antifreeze proteins and glycoproteins. These proteins are synthesized in the liver for export to the blood where they circulate at levels of up to 20 mg/mL. There are at least four distinct antifreeze protein classes differing in carbohydrate content, amino acid composition, protein sequence, and secondary structure. In addition to antifreeze structural diversity, fish species differ considerably with respect to mechanisms controlling seasonal regulation of plasma antifreeze concentrations. Some species synthesize antifreeze proteins immediately before the onset of freezing conditions, some synthesize them in response to such conditions, whereas others possess high concentrations all year. Endogenous rhythms, water temperature, photoperiod, and pituitary hormones have all been implicated as regulators of plasma antifreeze protein levels. The structural diversity of antifreeze proteins and their occurrence in a wide range of fish species suggest that they evolved separately and recently during Cenozoic glaciation. Invariably, the genes coding for these antifreeze proteins are amplified, sometimes as long tandem arrays, suggesting intense selective pressure to produce large amounts of protein. The distribution of antifreeze gene types among fish species suggests that they could serve as important tools for studying phylogenetic relationships.


2012 ◽  
Vol 4 (4) ◽  
pp. 927-929 ◽  
Author(s):  
Jesse D. Trujillo ◽  
Tyler J. Pilger ◽  
Marlis R. Douglas ◽  
Michael E. Douglas ◽  
Thomas F. Turner
