sequencing project
Recently Published Documents


TOTAL DOCUMENTS

264
(FIVE YEARS 60)

H-INDEX

40
(FIVE YEARS 4)

BioTechniques ◽  
2021 ◽  
Author(s):  
Janneke Aylward ◽  
Michael J Wingfield ◽  
Francois Roets ◽  
Brenda D Wingfield

Contamination in sequenced genomes is a relatively common problem and several methods to remove non-target sequences have been devised. Typically, the target and contaminating organisms reside in different kingdoms, simplifying their separation. The authors present the case of a genome for the ascomycete fungus Teratosphaeria eucalypti, contaminated by another ascomycete fungus and a bacterium. Approaching the problem as a low-complexity metagenomics project, the authors used two available software programs, BlobToolKit and anvi'o, to filter the contaminated genome. Both the de novo and reference-assisted approaches yielded a high-quality draft genome assembly for the target fungus. Incorporating reference sequences increased assembly completeness and visualization elucidated previously unknown genome features. The authors suggest that visualization should be routine in any sequencing project, regardless of suspected contamination.
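
As a rough illustration of the filtering idea in this abstract, the sketch below keeps only contigs that agree with the target on several features at once. It is not BlobToolKit or anvi'o: the `Contig` fields, the taxon labels and the thresholds are all hypothetical, and coverage and taxonomic labels are assumed to have been computed elsewhere. Combining features matters here because target and contaminant are both ascomycetes, so a kingdom-level label alone would not separate them.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    sequence: str
    coverage: float  # mean read depth (assumed precomputed)
    taxon: str       # best-hit taxonomic label (assumed precomputed)


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a sequence."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def keep_target_contigs(contigs, target_taxon, gc_range, min_coverage):
    """Keep contigs whose label, GC content and coverage all match the target."""
    low, high = gc_range
    kept = []
    for contig in contigs:
        gc = gc_content(contig.sequence)
        if (contig.taxon == target_taxon
                and low <= gc <= high
                and contig.coverage >= min_coverage):
            kept.append(contig)
    return kept
```

In practice the GC range and coverage cutoff would be read off a visualization of the assembly rather than fixed in advance, which is the point the authors make about routine visualization.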


2021 ◽  
Vol 12 ◽  
Author(s):  
Rakesh K. Srivastava ◽  
C. Tara Satyavathi ◽  
Mahesh D. Mahendrakar ◽  
Ram B. Singh ◽  
Sushil Kumar ◽  
...  

Iron (Fe) and zinc (Zn) micronutrient deficiencies are significant health concerns, particularly among underprivileged and resource-poor people in the semi-arid tropics. Pearl millet is regarded as a climate-smart crop with low water and energy footprints. It thrives under adverse agro-ecologies such as high temperatures and limited rainfall. Pearl millet is also regarded as a nutri-cereal owing to health-promoting traits such as high grain Fe and Zn content, metabolizable energy, high antioxidant and polyphenol content, a high proportion of slowly digestible starches, dietary fibers, and a favorable essential amino acid profile compared to many cereals. High genetic variability for grain Fe and Zn content has facilitated considerable progress in mapping and mining QTLs, alleles and genes underlying micronutrient metabolism. This has been made possible by the development of efficient genetic and genomic resources in pearl millet over the last decade. These include genetic stocks such as bi-parental RIL mapping populations, association mapping panels, chromosome segment substitution lines (CSSLs) and TILLING populations. On the genomics side, considerable progress has been made in generating genomic markers, beginning with the development of an SSR marker repository. This was followed by the development of a next-generation sequencing-based genome-wide SNP repository. The circa 1,000 genomes re-sequencing project played a significant role. A high-quality reference genome was made available by re-sequencing of a world diversity panel, mapping population parents and hybrid parental lines. This mini-review attempts to provide information on current developments in mapping Fe and Zn content in pearl millet and on the future outlook.


2021 ◽  
Vol 12 ◽  
Author(s):  
Wan-Ping Lee ◽  
Albert A. Tucci ◽  
Mitchell Conery ◽  
Yuk Yee Leung ◽  
Amanda B. Kuzma ◽  
...  

Alzheimer’s Disease (AD) is a progressive neurologic disease and the most common form of dementia. While the causes of AD are not completely understood, genetics plays a key role in its etiology, and thus finding genetic factors holds the potential to uncover novel AD mechanisms. For this study, we focus on copy number variation (CNV) detection and burden analysis. Leveraging whole-genome sequence (WGS) data released by the Alzheimer’s Disease Sequencing Project (ADSP), we developed a scalable bioinformatics pipeline to identify CNVs. This pipeline was applied to 1,737 AD cases and 2,063 cognitively normal controls. As a result, we observed 237,306 and 42,767 deletions and duplications, respectively, with an average of 2,255 deletions and 1,820 duplications per subject. The burden tests show that Non-Hispanic White cases have on average 16 more duplications than controls (p-value 2e-6), and Hispanic cases have larger deletions than controls (p-value 6.8e-5).
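
To make the notion of a burden comparison concrete, here is a minimal sketch that tests whether cases carry more CNVs per subject than controls. The abstract does not say which test the authors used; a plain permutation test on the difference in means is swapped in here as an assumption, and it ignores covariates such as ancestry or sequencing batch. The function name and its parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_burden_test(case_counts, control_counts,
                            n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in mean CNV count.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = mean(case_counts) - mean(control_counts)
    pooled = list(case_counts) + list(control_counts)
    n_cases = len(case_counts)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle labels by shuffling the pooled counts, then re-split.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_cases]) - mean(pooled[n_cases:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

A regression-based test with covariates would be the more usual choice at the scale of thousands of subjects; the permutation version is shown only because it needs no distributional assumptions.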


2021 ◽  
Author(s):  
Michael E Belloy ◽  
Yann E Le Guen ◽  
Sarah J. Eger ◽  
Valerio Napolioni ◽  
Michael D. Greicius ◽  
...  

Whole-exome sequencing (WES) and whole-genome sequencing (WGS) are expected to be critical to further elucidate the missing genetic heritability of Alzheimer's disease (AD) risk by identifying rare coding and/or noncoding variants that contribute to AD pathogenesis. In the United States, the Alzheimer's Disease Sequencing Project (ADSP) has taken a leading role in sequencing AD-related samples at scale, with the resultant data being made publicly available to researchers to generate new insights into the genetic etiology of AD. In order to achieve sufficient power, the ADSP has adopted a study design in which subsets of larger AD cohorts are collected and sequenced across multiple centers, using a variety of sequencing kits. This approach may lead to variable variant quality across sequencing centers and/or kits. Here, we performed exome-wide and genome-wide association analyses on AD risk using the latest ADSP WES and WGS data releases. We observed that many variants displayed large variation in allele frequencies across sequencing centers/kits and contributed to spurious association signals with AD risk. We also observed that sequencing kit/center adjustment in association models could not fully account for these spurious signals. To address this issue, we designed and implemented novel filters that aim to capture and remove these center/kit-specific artifactual variants. We conclude by deriving a novel, fast, and robust approach to filter variants that represent sequencing center- or kit-related artifacts underlying spurious associations with AD risk in ADSP WES and WGS data. This approach will be important to support future robust genetic association studies on ADSP data, as well as other studies with similar designs.
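
The abstract does not describe how the authors' filters work, so the following is only a sketch of one generic way to detect allele-frequency differences across sequencing centers or kits: a Pearson chi-square statistic on a 2 x k table of alternate and reference allele counts. The function names, the input layout and the threshold are assumptions made for illustration.

```python
def allele_heterogeneity_chi2(counts_by_batch):
    """Pearson chi-square statistic for a 2 x k table of allele counts.

    counts_by_batch maps a batch label (center or kit) to a tuple
    (alt_allele_count, ref_allele_count).
    Returns (statistic, degrees_of_freedom).
    """
    # Ignore batches with no observed alleles.
    batches = [(alt, ref) for alt, ref in counts_by_batch.values() if alt + ref > 0]
    dof = max(len(batches) - 1, 0)
    total_alt = sum(alt for alt, _ in batches)
    total_ref = sum(ref for _, ref in batches)
    total = total_alt + total_ref
    if total_alt == 0 or total_ref == 0:
        return 0.0, dof  # monomorphic site: nothing to compare
    statistic = 0.0
    for alt, ref in batches:
        n = alt + ref
        expected_alt = n * total_alt / total
        expected_ref = n * total_ref / total
        statistic += (alt - expected_alt) ** 2 / expected_alt
        statistic += (ref - expected_ref) ** 2 / expected_ref
    return statistic, dof


def flag_batch_artifacts(variants, threshold):
    """Return IDs of variants whose statistic exceeds a chosen threshold."""
    return [variant_id for variant_id, counts in variants.items()
            if allele_heterogeneity_chi2(counts)[0] > threshold]
```

A check like this would plausibly be run within controls only, so that a genuine case-control difference is not mistaken for a batch effect; that choice is also an assumption, not something stated in the abstract.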


Genes ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1645
Author(s):  
Anna Vlasova ◽  
Toni Hermoso Pulido ◽  
Francisco Camara ◽  
Julia Ponomarenko ◽  
Roderic Guigó

Functional annotation adds biologically relevant information to predicted features in genomic sequences and is, therefore, an important step in any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the abovementioned programs and final annotation reports. The pipeline can easily be broken into smaller processes for the purpose of parallelization and easily deployed in a Linux computational environment, thanks to software containerization, thus helping to ensure full reproducibility.
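
The parallelization idea can be illustrated by splitting a protein FASTA input into fixed-size chunks that independent jobs could annotate separately. This is a standalone sketch, not code from FA-nf, and the function names and chunking strategy are assumptions; within Nextflow itself, chunking of this kind is normally handled by the workflow engine rather than by hand.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_into_chunks(records, chunk_size):
    """Split records into consecutive chunks of at most chunk_size entries."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [records[i:i + chunk_size]
            for i in range(0, len(records), chunk_size)]
```

Each chunk could then be written to its own file and passed to a separate BLAST+, DIAMOND or InterProScan job, with the per-chunk outputs concatenated afterwards.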


2021 ◽  
Vol 7 (9) ◽  
Author(s):  
Paula Gagetti ◽  
Stephanie W. Lo ◽  
Paulina A. Hawkins ◽  
Rebecca A. Gladstone ◽  
Mabel Regueira ◽  
...  

Invasive disease caused by Streptococcus pneumoniae (IPD) is one of the leading causes of morbidity and mortality in young children worldwide. In Argentina, PCV13 was introduced into the childhood immunization programme nationwide in 2012 and PCV7 was available from 2000, but only in the private market. Since 1993 the National IPD Surveillance Programme, consisting of 150 hospitals, has conducted nationwide pneumococcal surveillance in Argentina in children under 6 years of age, as part of the SIREVA II-OPS network. A total of 1713 pneumococcal isolates, characterized by serotype (Quellung) and antimicrobial resistance (agar dilution) to ten antibiotics, were available for inclusion. They belonged to three study periods: the pre-PCV7 era 1998–1999 (pre-PCV), before the introduction of PCV13 2010–2011 (PCV7) and after the introduction of PCV13 2012–2013 (PCV13). Fifty-four serotypes were identified in the entire collection and serotypes 14, 5 and 1 represented 50 % of the isolates. Resistance to penicillin was 34.9 %, cefotaxime 10.6 %, meropenem 4.9 %, cotrimoxazole 45 %, erythromycin 21.5 %, tetracycline 15.4 % and chloramphenicol 0.4 %. All the isolates were susceptible to levofloxacin, rifampin and vancomycin. Of 1713 isolates, 1061 (61.9 %) were non-susceptible to at least one antibiotic and 235 (13.7 %) were multidrug resistant. A subset of 413 isolates was randomly selected and whole-genome sequenced as part of the Global Pneumococcal Sequencing Project (GPS). The genome data were used to investigate the population structure of S. pneumoniae, defining pneumococcal lineages using Global Pneumococcal Sequence Clusters (GPSCs), sequence types (STs) and clonal complexes (CCs), as well as prevalent serotypes and their associated pneumococcal lineages and genomic inference of antimicrobial resistance. The collection showed a great diversity of strains. Among the 413 isolates, 73 known and 36 new STs were identified belonging to 38 CCs and 25 singletons, grouped into 52 GPSCs. Important changes were observed among vaccine types when pre-PCV and PCV13 periods were compared: a significant decrease in serotypes 14, 6B and 19F and a significant increase in 7F and 3. Among non-PCV13 types, serogroup 24 increased from 0 % in pre-PCV to 3.2 % in the PCV13 period. Our analysis showed that 66.1 % (273/413) of the isolates were predicted to be non-susceptible to at least one antibiotic and 11.9 % (49/413) were multidrug resistant. We found 100 % agreement between the serotype determined by Quellung and WGS-based serotyping, and 98.4 % agreement in antimicrobial resistance. Continued surveillance of the pneumococcal population is needed to reveal the dynamics of pneumococcal isolates in Argentina in the post-PCV13 period. This article contains data hosted by Microreact.


2021 ◽  
Author(s):  
Bowen Jin ◽  
John A Capra ◽  
Penelope Benchek ◽  
Nicholas R Wheeler ◽  
Adam C Naj ◽  
...  

Over 90% of variants are rare, and 50% of them are singletons in the Alzheimer's Disease Sequencing Project Whole Exome Sequencing (ADSP WES) data. However, both single-variant tests and unit-based tests have limited statistical power to detect associations between rare variants and phenotypes. To best utilize rare variants and investigate their biological effect, we examine their association with phenotypes in the context of protein structure. We developed a protein structure-based approach, POKEMON (Protein Optimized Kernel Evaluation of Missense Nucleotides), which evaluates rare missense variants based on their spatial distribution on the protein rather than allele frequency. The hypothesis behind this is that the three-dimensional spatial distribution of variants within a protein structure provides functional context and improves the power of association tests. POKEMON identified four candidate genes from the ADSP WES data, namely two known Alzheimer's disease (AD) genes (TREM2 and SORL) and two novel genes (DUSP18 and CSF1R). For known AD genes, the signal from the spatial cluster is stable even if we exclude known AD risk variants, indicating the presence of additional low-frequency risk variants within these genes. DUSP18 has a cluster of variants primarily shared by case subjects around the ligand-binding domain, and this cluster is further validated in a replication dataset with a larger sample size. POKEMON is an open-source tool available at https://github.com/bushlab-genomics/POKEMON.
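
The idea of weighting variants by spatial proximity rather than allele frequency can be sketched as follows. This is not POKEMON's actual kernel, whose exact form is not given in the abstract: a Gaussian weight on the 3D distance between variant residues is assumed, and the function names, the `sigma` bandwidth and the coordinate layout are hypothetical.

```python
import math


def spatial_weights(coords, sigma):
    """Gaussian weights from pairwise 3D distances between variant residues.

    coords is a list of (x, y, z) positions, one per variant.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            weights[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return weights


def subject_similarity(genotype_a, genotype_b, weights):
    """Sum over variant pairs of g_a[i] * w[i][j] * g_b[j]."""
    return sum(genotype_a[i] * weights[i][j] * genotype_b[j]
               for i in range(len(genotype_a))
               for j in range(len(genotype_b)))
```

Under this weighting, two subjects who carry different variants still score as similar when those variants sit close together in the folded protein, which is how a spatial cluster of rare variants can contribute to an association signal even when no single variant recurs.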


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Mick Van Vlierberghe ◽  
Arnaud Di Franco ◽  
Hervé Philippe ◽  
Denis Baurain

Objectives: Complex algae are photosynthetic organisms resulting from eukaryote-to-eukaryote endosymbiotic-like interactions. Yet the specific lineages and mechanisms are still under debate. That is why large-scale phylogenomic studies are needed. Whereas available proteomes provide a limited diversity of complex algae, MMETSP (Marine Microbial Eukaryote Transcriptome Sequencing Project) transcriptomes represent a valuable resource for phylogenomic analyses, owing to their broad and rich taxonomic sampling, especially of photosynthetic species. Unfortunately, this sampling is unbalanced and sometimes highly redundant. Moreover, we observed contaminated sequences in some samples. In such a context, tree inference and readability are impaired. Consequently, the aim of the data processing reported here is to release a unique set of clean and non-redundant transcriptomes produced through an original protocol featuring decontamination, pooling and dereplication steps. Data description: We submitted 678 MMETSP re-assembly samples to our parallel consolidation pipeline. Hence, we combined 423 samples into 110 consolidated transcriptomes, after the systematic removal of the most contaminated samples (186). This approach resulted in a total of 224 high-quality transcriptomes, easy to use and suitable to compute less contaminated, less redundant and more balanced phylogenies.
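
The pooling and dereplication steps named in this abstract can be illustrated with a deliberately simple sketch that merges records from several samples and drops exact duplicate sequences. The authors' protocol is not specified here, and real dereplication usually clusters sequences at an identity threshold rather than requiring exact matches; the function name and input layout are assumptions.

```python
def pool_and_dereplicate(samples):
    """Pool (header, sequence) records and drop exact duplicate sequences.

    samples maps a sample name to a list of (header, sequence) tuples.
    The first occurrence of each sequence is kept, compared
    case-insensitively, and its header is prefixed with the sample name.
    """
    seen = set()
    pooled = []
    for sample_name, records in samples.items():
        for header, sequence in records:
            key = sequence.upper()
            if key in seen:
                continue
            seen.add(key)
            pooled.append((f"{sample_name}|{header}", sequence))
    return pooled
```

Prefixing headers with the sample name keeps the provenance of each retained sequence, which is useful when a consolidated transcriptome is later traced back to its source samples.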


2021 ◽  
Vol 28 (3) ◽  
pp. 1-4
Author(s):  
A Rahman A Jamal

Precision medicine is transforming healthcare worldwide and aims to improve the effectiveness of management of many diseases, including cancers, other non-communicable diseases (NCDs) and rare diseases. Precision medicine takes into account the individual patient’s genetic, environmental and lifestyle data. Developed nations are already embarking on precision medicine initiatives, including the 100,000 Genomes Project in England and the Precision Medicine Initiative in the United States (US). The Academy of Sciences Malaysia, the Ministry of Health and the Ministry of Higher Education are working together to put forward a precision medicine initiative for Malaysia. The key drivers that must be put in place include a strong policy agenda, a national large-scale genome sequencing project and with it a national genome database, the implementation of the electronic medical record (EMR) system, a payment and reimbursement system to cover genetic testing and targeted treatment, and an ecosystem that will support precision medicine. Relevant guidelines and Acts will also need to be developed, especially with regard to privacy and confidentiality. The future of precision medicine is now, and this will certainly bring better outcomes and value to patients.

