Comparing the utility of in vivo transposon mutagenesis approaches in yeast species to infer gene essentiality

2019 ◽  
Author(s):  
Anton Levitan ◽  
Andrew N. Gale ◽  
Emma K. Dallon ◽  
Darby W. Kozan ◽  
Kyle W. Cunningham ◽  
...  

Abstract In vivo transposon mutagenesis, coupled with deep sequencing, enables large-scale genome-wide mutant screens for genes essential in different growth conditions. We analyzed six large-scale studies performed on haploid strains of three yeast species (Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Candida albicans), each mutagenized with two of three different heterologous transposons (AcDs, Hermes, and PiggyBac). Using a machine-learning approach, we evaluated the ability of the data to predict gene essentiality. Important data features included sufficient numbers and distribution of independent insertion events. All transposons showed some bias in insertion site preference because of jackpot events, and preferences for specific insertion sequences and short-distance vs long-distance insertions. For PiggyBac, a stringent target sequence limited the ability to predict essentiality in genes with few or no target sequences. The machine-learning approach also robustly predicted gene function in less well-studied species by leveraging cross-species orthologs. Finally, comparisons of isogenic diploid versus haploid S. cerevisiae isolates identified several genes that are haplo-insufficient, while most essential genes, as expected, were recessive. We provide recommendations for the choice of transposons and the inference of gene essentiality in genome-wide studies of eukaryotic haploid microbes such as yeasts, including species that have been less amenable to classical genetic studies.

2020 ◽  
Vol 66 (6) ◽  
pp. 1117-1134 ◽  
Author(s):  
Anton Levitan ◽  
Andrew N. Gale ◽  
Emma K. Dallon ◽  
Darby W. Kozan ◽  
Kyle W. Cunningham ◽  
...  

Abstract In vivo transposon mutagenesis, coupled with deep sequencing, enables large-scale genome-wide mutant screens for genes essential in different growth conditions. We analyzed six large-scale studies performed on haploid strains of three yeast species (Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Candida albicans), each mutagenized with two of three different heterologous transposons (AcDs, Hermes, and PiggyBac). Using a machine-learning approach, we evaluated the ability of the data to predict gene essentiality. Important data features included sufficient numbers and distribution of independent insertion events. All transposons showed some bias in insertion site preference because of jackpot events, and preferences for specific insertion sequences and short-distance vs long-distance insertions. For PiggyBac, a stringent target sequence limited the ability to predict essentiality in genes with few or no target sequences. The machine-learning approach also robustly predicted gene function in less well-studied species by leveraging cross-species orthologs. Finally, comparisons of isogenic diploid versus haploid S. cerevisiae isolates identified several genes that are haplo-insufficient, while most essential genes, as expected, were recessive. We provide recommendations for the choice of transposons and the inference of gene essentiality in genome-wide studies of eukaryotic haploid microbes such as yeasts, including species that have been less amenable to classical genetic studies.


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0252096
Author(s):  
Maria B. Rabaglino ◽  
Alan O’Doherty ◽  
Jan Bojsen-Møller Secher ◽  
Patrick Lonergan ◽  
Poul Hyttel ◽  
...  

Pregnancy rates for in vitro produced (IVP) embryos are usually lower than for embryos produced in vivo after ovarian superovulation (MOET). This is potentially due to alterations in their trophectoderm (TE), the outermost layer in physical contact with the maternal endometrium. The main objective was to apply a multi-omics data integration approach to identify genes that are both temporally differentially expressed and differentially methylated (DEG and DMG) between IVP and MOET embryos and that could impact TE function. First, four published transcriptomic datasets and five published epigenomic datasets were processed for data integration. Second, DEG from day 7 to days 13 and 16 and DMG from day 7 to day 17 were determined in the TE from IVP vs. MOET embryos. Third, genes that were both DE and DM were subjected to hierarchical clustering and functional enrichment analysis. Finally, findings were validated through a machine learning approach with two additional datasets from day 15 embryos. There were 1535 DEG and 6360 DMG, with 490 overlapping genes, whose expression profiles at days 13 and 16 resulted in three main clusters. Cluster 1 (188) and Cluster 2 (191) genes were down-regulated at day 13 or day 16, respectively, while Cluster 3 genes (111) were up-regulated at both days, in IVP embryos compared to MOET embryos. The top enriched terms were the KEGG pathway "focal adhesion" in Cluster 1 (FDR = 0.003), and the cellular component "extracellular exosome" in Cluster 2 (FDR < 0.0001), also enriched in Cluster 1 (FDR = 0.04). According to the machine learning approach, genes in Cluster 1 showed a similar expression pattern between IVP and less developed (short) MOET conceptuses, and between MOET and DKK1-treated (advanced) IVP conceptuses. In conclusion, these results suggest that early conceptuses derived from IVP embryos exhibit epigenomic and transcriptomic changes that later affect their elongation and focal adhesion, impairing post-transfer survival.
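The overlap step described above (genes that are both differentially expressed and differentially methylated) amounts to a set intersection. The following is a minimal sketch of that idea only, not the authors' pipeline: the gene identifiers and the `overlap` helper are hypothetical, while the cluster sizes are the ones reported in the abstract.

```python
def overlap(deg, dmg):
    """Return the genes present in both lists (hypothetical helper)."""
    return set(deg) & set(dmg)


# Hypothetical gene identifiers, for illustration only.
deg = {"GENE_A", "GENE_B", "GENE_C"}
dmg = {"GENE_B", "GENE_C", "GENE_D"}
both = overlap(deg, dmg)

# Cluster sizes as reported in the abstract; together they
# account for all 490 overlapping genes.
cluster_sizes = {"Cluster 1": 188, "Cluster 2": 191, "Cluster 3": 111}
total_clustered = sum(cluster_sizes.values())
```

With the reported numbers, the three clusters sum to 490, matching the stated number of overlapping genes.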


2021 ◽  
Author(s):  
Amnah Eltahir ◽  
Jason White ◽  
Terry Lohrenz ◽  
P. Read Montague

Abstract Machine learning advances in electrochemical detection have recently produced subsecond and concurrent detection of dopamine and serotonin during perception and action tasks in conscious humans. Here, we present a new machine learning approach to subsecond, concurrent separation of dopamine, norepinephrine, and serotonin. The method exploits a low-amplitude burst protocol for the controlled voltage waveform, and we demonstrate its efficacy by showing how it separates dopamine-induced signals from norepinephrine-induced signals. Previous efforts to deploy electrochemical detection of dopamine in vivo have not separated the dopamine-dependent signal from a norepinephrine-dependent signal. Consequently, this new method can provide new insights into concurrent signaling by these two important neuromodulators.


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0241239
Author(s):  
Kai On Wong ◽  
Osmar R. Zaïane ◽  
Faith G. Davis ◽  
Yutaka Yasui

Background Canada is an ethnically diverse country, yet the lack of ethnicity information in many large databases impedes effective population research and interventions. Automated ethnicity classification using machine learning has shown potential to address this data gap, but its performance in Canada is largely unknown. This study developed a large-scale machine learning framework to predict ethnicity using a novel set of name and census location features.
Methods Using the 1901 census, multiclass and binary classification machine learning pipelines were developed. The 13 ethnic categories examined were Aboriginal (First Nations, Métis, Inuit, and all-combined), Chinese, English, French, Irish, Italian, Japanese, Russian, Scottish, and others. Machine learning algorithms included regularized logistic regression, C-support vector, and naïve Bayes classifiers. Name features consisted of the entire name string, substrings, double-metaphones, and various name-entity patterns, while location features consisted of the entire location string and substrings of province, district, and subdistrict. Predictive performance metrics included sensitivity, specificity, positive predictive value, negative predictive value, F1, area under the receiver operating characteristic curve, and accuracy.
Results The census had 4,812,958 unique individuals. For multiclass classification, the highest performance achieved was 76% F1 and 91% accuracy. For binary classifications for Chinese, French, Italian, Japanese, Russian, and others, F1 ranged from 68% to 95% (median 87%). The lower performance for English, Irish, and Scottish (F1 ranged from 63% to 67%) was likely due to their shared cultural and linguistic heritage. Adding census location features to the name-based models strongly improved Aboriginal classification (F1 increased from 50% to 84%).
Conclusions The automated machine learning approach using only name and census location features can predict the ethnicity of Canadians, with performance that varies by ethnic category.
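The performance metrics listed in the Methods can all be derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The function name and the counts are hypothetical, chosen only for illustration. They are not taken from the study, and the code assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Assumes all denominators are non-zero (illustrative sketch only).
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a rare positive class (100 positives in 1,000 records).
m = binary_metrics(tp=80, fp=20, tn=880, fn=20)
```

With these made-up counts, F1 is about 0.80 while accuracy is about 0.96. When the positive class is rare, accuracy is dominated by true negatives, which is one general reason F1 and accuracy can diverge.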

