Development and Transferability of Black and Red Raspberry Microsatellite Markers from Short-Read Sequences

2015 ◽  
Vol 140 (3) ◽  
pp. 243-252 ◽  
Author(s):  
Michael Dossett ◽  
Jill M. Bushakra ◽  
Barbara Gilmore ◽  
Carol A. Koch ◽  
Chaim Kempler ◽  
...  

The advent of next-generation, or massively parallel, sequencing technologies has been a boon to the cost-effective development of molecular markers, particularly in nonmodel species. Here, we demonstrate the efficiency of microsatellite or simple sequence repeat (SSR) marker development from short-read sequences in black and red raspberry (Rubus occidentalis L. and R. idaeus L., respectively), compare transferability of markers across species, and test whether the rate of polymorphism in the recovered markers can be improved by how marker sequences are chosen. From 28,536,412 black raspberry reads and 27,430,159 red raspberry reads, we identified more than 6000 SSR sequences in each species and selected 288 of these (144 from each species) for testing in black and red raspberry. A total of 166 SSR primer pairs were identified with informative polymorphism in one or both species. SSRs selected based on different percentages (90% to 97% as compared with ≥98%) of read cluster similarity did not differ in polymorphism rates from each other or from those originating from singletons. Efficiency of polymorphic SSR recovery was nearly twice as high in black raspberry from black raspberry-derived sequences as from red raspberry-derived sequences, while efficiency of polymorphic SSR recovery in red raspberry was unaffected by the source of the primer sequences. Development of SSR markers that are transferable between red and black raspberry for marker-assisted selection, evaluation of genome collinearity, and comparative studies in Rubus L. will be more efficient using SSR markers developed from black raspberry sequences.
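As a rough illustration of what identifying SSR sequences in short reads involves, the sketch below scans a single read for perfect di-, tri-, and tetranucleotide repeats with regular expressions. It is not the pipeline used in the study: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for the example.

```python
import re

# Assumed thresholds (not taken from the study): minimum number of
# tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(read):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in a read."""
    read = read.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of motif_len bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(read):
            motif = match.group(1)
            # Skip homopolymer motifs such as "AA" or "TTT".
            if len(set(motif)) == 1:
                continue
            # Skip tetranucleotide motifs that are really dinucleotide repeats, e.g. "AGAG".
            if motif_len == 4 and motif[:2] == motif[2:]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# A made-up read containing an (AG)8 repeat and a (CTT)5 repeat.
example_read = "GGC" + "AG" * 8 + "TTCA" + "CTT" * 5 + "GA"
print(find_ssrs(example_read))
```

In practice, primer design also needs enough flanking sequence on both sides of the repeat, which this sketch does not check.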

2005 ◽  
Vol 130 (5) ◽  
pp. 722-728 ◽  
Author(s):  
Eric T. Stafne ◽  
John R. Clark ◽  
Courtney A. Weber ◽  
Julie Graham ◽  
Kim S. Lewers

Interest in molecular markers and genetic maps is growing among researchers developing new cultivars of Rubus L. (raspberry and blackberry). Several traits of interest fail to express in seedlings or reliably in some environments and are candidates for marker-assisted selection. A growing number of simple sequence repeat (SSR) molecular markers derived from Rubus and Fragaria L. (strawberry) are available for use with Rubus mapping populations. The objectives of this study were to test 142 of these SSR markers to screen raspberry and blackberry parental genotypes for potential use in existing mapping populations that segregate for traits of interest, determine the extent of inter-species and inter-genera transferability with amplification, and determine the level of polymorphism among the parents. Up to 32 of the SSR primer pairs tested may be useful for genetic mapping in both the blackberry population and at least one of the raspberry populations. The maximum number of SSR primer pairs found usable for mapping was 60 for the raspberry population and 45 for the blackberry population. Acquisition of many more nucleotide sequences from red raspberry, black raspberry, and blackberry is required to develop useful molecular markers and genetic maps for these species. Rubus, family Rosaceae, is a highly diverse genus that contains hundreds of heterozygous species. The family is one of the most agronomically important plant families in temperate regions of the world, although its members also occur in tropical and arctic regions. The most important commercial subgenus of Rubus is Idaeobatus Focke, the raspberries, which are primarily diploids. This subgenus contains the European red raspberry R. idaeus ssp. idaeus L., as well as the American black raspberry R. occidentalis L. and the American red raspberry R. idaeus ssp. strigosus Michx. Interspecific hybridization of these and other raspberry species has led to greater genetic diversity and allowed for the introgression of superior traits such as large fruit size, fruit firmness and quality, disease resistance, and winter hardiness.


2021 ◽  
Vol 70 (1) ◽  
pp. 108-116
Author(s):  
Chander Shekhar ◽  
Anita Rawat ◽  
Maneesh S. Bhandari ◽  
Santan Barthwal ◽  
Harish S. Ginwal ◽  
...  

Abstract Cross-amplification is a cost-effective method to extend the applicability of SSR markers to closely related taxa which lack their own sequence information. In the present study, 35 SSR markers developed in four oak species of Europe, North America and Asia were selected and screened in five species of the western Himalayas. Fifteen markers were successfully amplified in Quercus semecarpifolia, followed by 11 each in Q. floribunda and Q. leucotrichophora, 10 in Q. glauca, and 9 in Q. lanata. Except for two primer pairs in Q. semecarpifolia, all were found to be polymorphic. Most of the positively cross-amplified SSRs were derived from the Asian oak, Q. mongolica. The genotyping of 10 individuals of each species with positively cross-amplified SSRs displayed varied levels of polymorphism in the five target oak species, viz., QmC00419 was most polymorphic in Q. floribunda, QmC00716 in Q. glauca and Q. lanata, QmC01368 in Q. leucotrichophora, and QmC02269 in Q. semecarpifolia. Among the five oak species, the highest gene diversity was found in Q. lanata and Q. semecarpifolia (expected heterozygosity, He = 0.72), while the minimum was recorded for Q. leucotrichophora and Q. glauca (He = 0.65). The SSRs validated here provide a valuable resource to carry out further population genetic analysis in oaks of the western Himalayas.
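For readers unfamiliar with the He statistic quoted above, the sketch below computes expected heterozygosity at a single locus using the common definition He = 1 − Σ p_i², where p_i is the frequency of allele i. The abstract does not state which estimator the authors used (for example, whether a small-sample correction was applied), so this is an assumption; the function name and the allele sizes are made up for illustration.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Expected heterozygosity at one locus: He = 1 - sum(p_i ** 2).

    `genotypes` is a list of diploid genotypes, each a pair of allele labels
    (here, hypothetical SSR fragment sizes in base pairs).
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Four hypothetical individuals genotyped at one SSR locus.
sample = [(150, 150), (150, 154), (154, 158), (158, 158)]
print(expected_heterozygosity(sample))
```

A multi-locus value such as the one reported per species would typically be the mean of the per-locus values, again assuming this definition.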


2021 ◽  
Author(s):  
Peipei Wang ◽  
Fanrui Meng ◽  
Bethany M. Moore ◽  
Shin-Han Shiu

Abstract Background: Availability of plant genome sequences has led to significant advances. However, with few exceptions, the great majority of existing genome assemblies are derived from short read sequencing technologies with highly uneven read coverages indicative of sequencing and assembly issues that could significantly impact any downstream analysis of plant genomes. In tomato for example, 0.6% (5.1 Mb) and 9.7% (79.6 Mb) of short-read based assembly had significantly higher and lower coverage compared to background, respectively. Results: To understand what the causes may be for such uneven coverage, we first established machine learning models capable of predicting genomic regions with variable coverages and found that high coverage regions tend to have higher simple sequence repeat and tandem gene densities compared to background regions. To determine if the high coverage regions were misassembled, we examined a recently available tomato long-read based assembly and found that 27.8% (1.41 Mb) of high coverage regions were potentially misassembled of duplicate sequences, compared to 1.4% in background regions. In addition, using a predictive model that can distinguish correctly and incorrectly assembled high coverage regions, we found that misassembled, high coverage regions tend to be flanked by simple sequence repeats, pseudogenes, and transposon elements. Conclusions: Our study provides insights on the causes of variable coverage regions and a quantitative assessment of factors contributing to plant genome misassembly when using short reads.


2020 ◽  
Author(s):  
Peipei Wang ◽  
Fanrui Meng ◽  
Bethany M. Moore ◽  
Shin-Han Shiu

Abstract Background Availability of plant genome sequences has led to significant advances. However, with few exceptions, the great majority of existing genome assemblies are derived from short read sequencing technologies with highly uneven read coverages indicative of sequencing and assembly issues that could significantly impact any downstream analysis of plant genomes. In tomato for example, 0.6% (5.1 Mb) and 9.7% (79.6 Mb) of short-read based assembly had significantly higher and lower coverage compared to background, respectively. Results To understand what the causes may be for such uneven coverage, we first established machine learning models capable of predicting genomic regions with variable coverages and found that high coverage regions tend to have higher simple sequence repeat and tandem gene densities compared to background regions. To determine if the high coverage regions were misassembled, we examined a recently available tomato long-read based assembly and found that 27.8% (1.41 Mb) of high coverage regions were potentially misassembled of duplicate sequences, compared to 1.4% in background regions. In addition, using a predictive model that can distinguish correctly and incorrectly assembled high coverage regions, we found that misassembled, high coverage regions tend to be flanked by simple sequence repeats, pseudogenes, and transposon elements. Conclusions Our study provides insights on the causes of variable coverage regions and a quantitative assessment of factors contributing to plant genome misassembly when using short reads.


2019 ◽  
Vol 9 (22) ◽  
pp. 4957 ◽  
Author(s):  
Xianqing Liu ◽  
Puyang Zhang ◽  
Mingjie Zhao ◽  
Hongyan Ding ◽  
Conghuan Le

Large-diameter multi-bucket foundations are well suited for offshore wind turbines in water deeper than 20 m. Air floating transportation is one of the key technologies for the cost-effective development of bucket foundations. To predict the dynamic behavior in waves of a large-diameter tripod bucket foundation (LDTBF) supported by an air cushion and a water plug inside every bucket, three 1/25-scale physical model tests with different bucket spacing were conducted in waves; detailed prototype foundation models were established using the hydrodynamic software MOSES with a draft of 4.0 m, 4.5 m, and 5.0 m and with a water depth of 10.0 m, 11.25 m, and 12.5 m. The numerical and experimental results are consistent for heaving motion, while exhibiting favorable agreement for pitching motion. The results show that the resonant periods for heaving motion increased with increasing draft and water depth. The maximum amplitude for heaving motion first decreased and then increased with the increase of water depth and spacing between the buckets. The maximum amplitude for pitching motion first decreased and then increased with increasing water depth but decreased with increasing spacing between the buckets. The wider the spacing between the bucket foundations, the larger the heave response amplitude operators (RAOs). Simply improving the pitch RAOs by increasing the spacing between bucket foundations is limited and negatively affects motion performance during the transportation of LDTBF.


HortScience ◽  
2004 ◽  
Vol 39 (4) ◽  
pp. 785D-785 ◽  
Author(s):  
Kim S. Lewers* ◽  
Eric T. Stafne ◽  
John R. Clark ◽  
Courtney A. Weber ◽  
Julie Graham

Some raspberry and blackberry breeders are interested in using molecular markers to assist with selection. Simple Sequence Repeat markers (SSRs) have many advantages, and SSRs developed from one species can sometimes be used with related species. Six SSRs derived from the weed R. alceifolius, and 74 SSRs from R. idaeus red raspberry 'Glen Moy' were tested on R. idaeus red raspberry selection NY322 from Cornell Univ., R. occidentalis 'Jewel' black raspberry, Rubus spp. blackberry 'Arapaho', and blackberry selection APF-12 from the Univ. of Arkansas. The two raspberry genotypes are parents of an interspecific mapping population segregating for primocane fruiting and other traits. The two blackberry genotypes are parents of a population segregating for primocane fruiting and thornlessness. Of the six R. alceifolius SSRs, two amplified a product from all genotypes. Of the 74 red raspberry SSRs, 56 (74%) amplified a product from NY322, 39 (53%) amplified a product from 'Jewel', and 24 (32%) amplified a product from blackberry. Of the 56 SSRs that amplified a product from NY322, 17 failed to amplify a product from 'Jewel' and, therefore, detected polymorphisms between the parents of this mapping population. Twice as many detected polymorphisms of this type between blackberry and red raspberry, since 33 SSRs amplified a product from NY322, but neither of the blackberry genotypes. Differences in PCR product sizes from these genotypes reveal additional polymorphisms. Rubus is among the most diverse genera in the plant kingdom, so it is not surprising that only 19 of the 74 raspberry-derived SSRs amplified a product from all four of the genotypes tested. These SSRs will be useful in interspecific mapping and cultivar development.


2020 ◽  
Author(s):  
Robert A. Player ◽  
Ellen R. Forsyth ◽  
Kathleen J. Verratti ◽  
David W. Mohr ◽  
Alan F. Scott ◽  
...  

Abstract Reference genome fidelity is critically important for genome wide association studies (GWAS), yet many reference genomes are incomplete or too dissimilar from the study population. A typical whole genome sequencing approach implies short-read technologies resulting in fragmented assemblies with regions of ambiguity or low complexity. Further information is lost by economic necessity when genotyping populations, as lower resolution technologies such as genotyping arrays are commonly utilized. Here we present a phased reference genome for Canis lupus familiaris utilizing high molecular weight sequencing technologies. We tested wet lab and bioinformatic approaches to demonstrate a minimum workflow to generate the 2.4 gigabase genome for a Labrador Retriever. The resulting de novo assembly required eight Oxford Nanopore R9.4 flowcells (~23X depth) and running a 10X Genomics library on the equivalent of one lane of an Illumina NovaSeq S1 flowcell (~88X depth), bringing the cost of generating a nearly complete reference genome to less than $10K. Mapping of publicly available short-read data from ten Labrador Retrievers against this breed-specific reference resulted in an average of approximately 1% more aligned reads compared to mapping against the current gold standard reference (CanFam3.1, p<0.001), indicating a more complete breed-specific reference. An average 15% reduction of variant calls was observed from the same mapped data, which increases the chance of identifying low effect size variants in a GWAS. We believe that by incorporating the cost to produce a full genome assembly into any large-scale canine genotyping study, an investigator can make an informed cost/benefit analysis regarding genotyping technology.


2021 ◽  
Author(s):  
Peipei Wang ◽  
Fanrui Meng ◽  
Bethany M. Moore ◽  
Shin-Han Shiu

Abstract Background: Availability of plant genome sequences has led to significant advances. However, with few exceptions, the great majority of existing genome assemblies are derived from short read sequencing technologies with highly uneven read coverages indicative of sequencing and assembly issues that could significantly impact any downstream analysis of plant genomes. In tomato for example, 0.6% (5.1 Mb) and 9.7% (79.6 Mb) of short-read based assembly had significantly higher and lower coverage compared to background, respectively. Results: To understand what the causes may be for such uneven coverage, we first established machine learning models capable of predicting genomic regions with variable coverages and found that high coverage regions tend to have higher simple sequence repeat and tandem gene densities compared to background regions. To determine if the high coverage regions were misassembled, we examined a recently available tomato long-read based assembly and found that 27.8% (1.41 Mb) of high coverage regions were potentially misassembled of duplicate sequences, compared to 1.4% in background regions. In addition, using a predictive model that can distinguish correctly and incorrectly assembled high coverage regions, we found that misassembled, high coverage regions tend to be flanked by simple sequence repeats, pseudogenes, and transposon elements. Conclusions: Our study provides insights on the causes of variable coverage regions and a quantitative assessment of factors contributing to plant genome misassembly when using short reads, and the generality of these causes and factors should be tested further in other species.


2019 ◽  
Author(s):  
Peipei Wang ◽  
Fanrui Meng ◽  
Bethany M. Moore ◽  
Shin-Han Shiu

Abstract Availability of genome sequences has led to significant advances in biology. With few exceptions, the great majority of existing genome assemblies are derived from short read sequencing technologies with highly uneven read coverages indicative of sequencing and assembly issues. In tomato, 0.6% (5.1 Mb) and 9.7% (79.6 Mb) of short-read based assembly had significantly higher and lower coverage compared to background, respectively. We established machine learning models capable of predicting genomic regions with variable coverages and found that high coverage regions tend to have lower simple sequence repeat but higher tandem gene densities compared to background regions. To determine if the high coverage regions were misassembled, we examined a recently available long-read based assembly and found that 27.8% (1.41 Mb) of high coverage regions were potentially misassembled of duplicate sequences, compared to 1.4% in background regions. In addition, using a machine learning model that can distinguish correctly and incorrectly assembled high coverage regions, we found that misassembled, high coverage regions tend to be flanked by simple sequence repeats, pseudogenes, and transposon elements. Our study provides insights on the causes of variable coverage regions and a quantitative assessment of factors contributing to misassembly when using short reads.


2017 ◽  
Author(s):  
Bernardo J. Clavijo ◽  
Gonzalo Garcia Accinelli ◽  
Jonathan Wright ◽  
Darren Heavens ◽  
Katie Barr ◽  
...  

Abstract Producing high-quality whole-genome shotgun de novo assemblies from plant and animal species with large and complex genomes using low-cost short read sequencing technologies remains a challenge. But when the right sequencing data, with appropriate quality control, is assembled using approaches focused on robustness of the process rather than maximization of a single metric such as the usual contiguity estimators, good quality assemblies with informative value for comparative analyses can be produced. Here we present a complete method, described from data generation and QC all the way up to scaffolding of complex genomes using Illumina short reads, and its application to plant and human datasets. We show how to use the w2rap pipeline following a metric-guided approach to produce cost-effective assemblies. The assemblies are highly accurate, provide good coverage of the genome and show good short range contiguity. Our pipeline has already enabled the rapid, cost-effective generation of de novo genome assemblies from large, polyploid crop species with a focus on comparative genomics. Availability: w2rap is available under the MIT license, with some subcomponents under GPL licenses. A ready-to-run Docker image with all software prerequisites and example data is also available. http://github.com/bioinfologics/w2rap http://github.com/bioinfologics/w2rap-contigger

