Combined use of feature engineering and machine-learning to predict essential genes in Drosophila melanogaster

2020 ◽  
Vol 2 (3) ◽  
Author(s):  
Tulio L Campos ◽  
Pasi K Korhonen ◽  
Andreas Hofmann ◽  
Robin B Gasser ◽  
Neil D Young

Characterizing genes that are critical for the survival of an organism (i.e. essential) is important to gain a deep understanding of the fundamental cellular and molecular mechanisms that sustain life. Functional genomic investigations of the vinegar fly, Drosophila melanogaster, have unravelled the functions of numerous genes of this model species, but results from phenomic experiments can sometimes be ambiguous. Moreover, the features underlying gene essentiality are poorly understood, posing challenges for computational prediction. Here, we harnessed comprehensive genomic-phenomic datasets publicly available for D. melanogaster and a machine-learning-based workflow to predict essential genes of this fly. We discovered strong predictors of such genes, paving the way for computational predictions of essentiality in less-studied arthropod pests and vectors of infectious diseases.

2021 ◽  
Vol 28 ◽  
Author(s):  
Yi-Wei Zhao ◽  
Shihua Zhang ◽  
Hui Ding

Sumoylation is an important reversible post-translational modification of proteins that mediates a variety of cellular processes. Sumo-modified proteins can change their subcellular localization, activity and stability. Sumoylation also plays an important role in cellular processes such as transcriptional regulation and signal transduction. Abnormal sumoylation is involved in many diseases, including neurodegeneration and immune-related diseases, as well as the development of cancer. Therefore, identification of sumoylation sites (SUMO sites) is fundamental to understanding their molecular mechanisms and regulatory roles. In contrast to labor-intensive and costly experimental approaches, in silico prediction of sumoylation sites has attracted much attention for its accuracy, convenience and speed. Many computational prediction models have been used to identify SUMO sites, but they have not been comprehensively summarized and reviewed. This paper therefore summarizes and discusses the research progress of the relevant models. We briefly summarize the development of bioinformatics methods for sumoylation site prediction, focusing on benchmark dataset construction, feature extraction, machine learning methods, published results and online tools. We hope the review will be of help to experimental researchers.


Biomolecules ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 496
Author(s):  
Alessandra Maresca ◽  
Valerio Carelli

Inherited neurodegeneration of the optic nerve is a paradigm in neurology, as many forms of isolated or syndromic optic atrophy are encountered in clinical practice. The retinal ganglion cells give rise to the axons that form the optic nerve. They are particularly vulnerable to mitochondrial dysfunction because of their peculiar cellular architecture: their axons are unmyelinated for a long intra-retinal segment and are therefore very energy dependent. The genetic landscape of causative mutations and genes has greatly expanded in the last decade, pointing to common pathways. These mostly involve mitochondrial dysfunction, which leads to a similar outcome in terms of neurodegeneration. We here critically review these pathways, which include (1) complex I-related oxidative phosphorylation (OXPHOS) dysfunction, (2) mitochondrial dynamics, and (3) endoplasmic reticulum-mitochondrial inter-organellar crosstalk. These major pathogenic mechanisms are in turn interconnected and represent the target for therapeutic strategies. Thus, a deep understanding of them is the basis for designing and testing new effective therapies, an urgent unmet need for these patients. New tools are now available to capture the interlinked mechanistic intricacies of the pathogenesis of optic nerve neurodegeneration, raising hope that innovative therapies can be rapidly transferred into the clinic and effectively cure inherited optic neuropathies.


2021 ◽  
Vol 22 (5) ◽  
pp. 2704
Author(s):  
Andi Nur Nilamyani ◽  
Firda Nurul Auliah ◽  
Mohammad Ali Moni ◽  
Watshara Shoombuatong ◽  
Md Mehedi Hasan ◽  
...  

Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identification of site-specific nitration modification on tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to the progress of machine learning, computational prediction can play a vital role ahead of biological experimentation. Herein, we developed a computational predictor, PredNTS, by integrating multiple sequence features including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by the recursive feature elimination approach using a random forest classifier. Finally, we linearly combined the successive random forest (RF) probability scores generated by the different, single encoding-employing RF models. The resultant PredNTS predictor achieved an area under the curve (AUC) of 0.910 using five-fold cross validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web application, with the curated datasets of PredNTS, is publicly available.
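To make two of the encoding schemes named in this abstract more concrete, the following is a minimal Python sketch of CKSAAP and binary (one-hot) encoding for a sequence window centred on a tyrosine, plus a linear combination of per-model probability scores. It is not the PredNTS implementation: the function names, the 20-letter alphabet, the default `k_max`, the normalisation by the number of valid pairs, and the weights are all assumptions chosen for illustration.

```python
from itertools import product

# Assumption: the 20 standard amino acids; other symbols (e.g. gaps) are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 * (k_max + 1) frequencies; each block of 400 is normalised
    by the number of valid pairs found at that gap (an assumed convention).
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features


def binary_encoding(window):
    """One-hot encoding: 20 values per residue, all zeros for unknown symbols."""
    features = []
    for residue in window:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


def combine_scores(scores, weights):
    """Linear combination of per-encoding model probability scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    window = "ACAYACA"  # hypothetical window with the candidate tyrosine in the centre
    print(len(cksaap(window, k_max=1)))      # 800 features: 400 per gap value
    print(len(binary_encoding(window)))      # 140 features: 20 per residue
    print(combine_scores([0.8, 0.6], [0.5, 0.5]))
```

In a full pipeline, each encoding would feed its own classifier, and the weights passed to `combine_scores` would be tuned on held-out data rather than fixed by hand.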


2021 ◽  
Vol 22 (10) ◽  
pp. 5056
Author(s):  
Tulio L. Campos ◽  
Pasi K. Korhonen ◽  
Neil D. Young

Experimental studies of Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular and cellular processes in metazoans at large. Since the publication of their genomes, functional genomic investigations have identified genes that are essential or non-essential for survival in each species. Recently, a range of features linked to gene essentiality have been inferred using a machine learning (ML)-based approach, allowing essentiality predictions within a species. Nevertheless, predictions between species are still elusive. Here, we undertake a comprehensive study using ML to discover and validate features of essential genes common to both C. elegans and D. melanogaster. We demonstrate that the cross-species prediction of gene essentiality is possible using a subset of features linked to nucleotide/protein sequences, protein orthology and subcellular localisation, single-cell RNA-seq, and histone methylation markers. Complementary analyses showed that essential genes are enriched for transcription and translation functions and are preferentially located away from heterochromatin regions of C. elegans and D. melanogaster chromosomes. The present work should enable the cross-prediction of essential genes between model and non-model metazoans.


Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 315
Author(s):  
Xu Yang ◽  
Kai Chen ◽  
Yaohui Wang ◽  
Dehong Yang ◽  
Yongping Huang

In insects, sex determination pathways involve three levels of master regulators: primary signals, which determine the sex; executors, which control sex-specific differentiation of tissues and organs; and transducers, which link the primary signals to the executors. The primary signals differ widely among insect species. In Diptera alone, several unrelated primary sex determiners have been identified. However, the doublesex (dsx) gene is highly conserved as the executor component across multiple insect orders. The transducer level shows an intermediate level of conservation. In many, but not all, examined insects, a key transducer role is performed by transformer (tra), which controls sex-specific splicing of dsx. In Lepidoptera, studies of sex determination have focused on the lepidopteran model species Bombyx mori (the silkworm). In B. mori, the sex determination cascade starts from the primary signal Fem, a female-specific PIWI-interacting RNA, and its target gene Masc, which is apparently specific to and conserved among Lepidoptera. Tra has not been found in Lepidoptera. Instead, the B. mori PSI protein binds directly to dsx pre-mRNA and regulates its alternative splicing to produce male- and female-specific transcripts. Despite this basic understanding of the molecular mechanisms underlying sex determination, the links among the primary signals, transducers and executors remain largely unknown in Lepidoptera. In this review, we focus on the latest findings regarding the functions and working mechanisms of genes involved in feminization and masculinization in Lepidoptera and discuss directions for future research on sex determination in the silkworm.


2017 ◽  
Vol 11 (01) ◽  
pp. 1850007 ◽  
Author(s):  
Yingchuan He ◽  
Weize Xu ◽  
Yao Zhi ◽  
Rohit Tyagi ◽  
Zhe Hu ◽  
...  

Traditionally, optical microscopy is used to visualize the morphological features of pathogenic bacteria, which are then used to detect and identify the bacteria. However, owing to the resolution limit of conventional optical microscopy and the lack of a standard pattern library for bacterial identification, the effectiveness of this method is limited. Here, we report a pilot study on the combined use of Structured Illumination Microscopy (SIM) and machine learning for rapid bacterial identification. After applying machine learning to SIM image datasets from three model bacteria (Escherichia coli, Mycobacterium smegmatis, and Pseudomonas aeruginosa), we obtained a classification accuracy of up to 98%. This study points to a promising possibility for rapid bacterial identification by morphological features.


2018 ◽  
Vol 5 (8) ◽  
pp. 180458 ◽  
Author(s):  
Eva Jiménez-Guri ◽  
Karl R. Wotton ◽  
Johannes Jaeger

Gap genes are involved in segment determination during early development of the vinegar fly Drosophila melanogaster and other dipteran insects (flies, midges and mosquitoes). They are expressed in overlapping domains along the antero-posterior (A–P) axis of the blastoderm embryo. While gap domains cover the entire length of the A–P axis in Drosophila, there is a region in the blastoderm of the moth midge Clogmia albipunctata, which lacks canonical gap gene expression. Is a non-canonical gap gene functioning in this area? Here, we characterize tarsal-less (tal) in C. albipunctata. The homologue of tal in the flour beetle Tribolium castaneum (called milles-pattes, mlpt) is a bona fide gap gene. We find that Ca-tal is expressed in the region previously reported as lacking gap gene expression. Using RNA interference, we study the interaction of Ca-tal with gap genes. We show that Ca-tal is regulated by gap genes, but only has a very subtle effect on tailless (Ca-tll), while not affecting other gap genes at all. Moreover, cuticle phenotypes of Ca-tal depleted embryos do not show any gap phenotype. We conclude that Ca-tal is expressed and regulated like a gap gene, but does not function as a gap gene in C. albipunctata.


2018 ◽  
Vol 19 (S14) ◽  
Author(s):  
Diogo Manuel Carvalho Leite ◽  
Xavier Brochet ◽  
Grégory Resch ◽  
Yok-Ai Que ◽  
Aitana Neves ◽  
...  
