Development of predictive models to distinguish metals from non-metal toxicants, and individual metal from one another

2020 ◽  
Vol 21 (S9) ◽  
Author(s):  
Zongtao Yu ◽  
Yuanyuan Fu ◽  
Junmei Ai ◽  
Jicai Zhang ◽  
Gang Huang ◽  
...  

Abstract Background Evaluating the toxicity of chemical mixtures and their possible mechanisms of action is still a challenge for humans and other organisms. Microarray classifier analysis has shown promise in the toxicogenomic area by identifying biomarkers to predict unknown samples. Our study focuses on identifying gene markers with better sensitivity and specificity, building predictive models to distinguish metals from non-metal toxicants and individual metals from one another, and helping to understand the underlying toxic mechanisms. Results Based on an independent dataset test, using only 15 gene markers, we were able to distinguish metals from non-metal toxicants with 100% accuracy. Of these, 6 and 9 genes were commonly down- and up-regulated, respectively, by most of the metals. Eight of the 15 genes are membrane protein-coding genes. Functionally well-annotated genes in the list include ADORA2B, ARNT, S100G, and DIO3. Also, a 10-gene marker list was identified that can discriminate individual metals from one another with 100% accuracy. We could find a specific gene marker for each metal in the 10-gene marker list. Functionally well-annotated genes in this list include GSTM2, HSD11B, AREG, and C8B. Conclusions Our findings suggest that, using microarray classifier analysis, we can not only create diagnostic classifiers that predict the exact metal contaminant from a large pool of contaminants with high accuracy, but also identify valuable biomarkers that help explain the common and underlying toxic mechanisms induced by metals.

2019 ◽  
Author(s):  
Yatish Turakhia ◽  
Heidi I. Chen ◽  
Amir Marcovitz ◽  
Gill Bejerano

Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools and protein databases focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (deletion and non-synonymous substitution) as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence protein-coding gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using the hg38 human assembly as a reference, we discovered over 500 unique human genes affected by such high-confidence erosion events in different clades across 58 mammals. While most of these events likely have benign consequences, we also found dozens of clade-specific gene losses that result in early lethality in outgroup mammals or are associated with severe congenital diseases in humans. Our discoveries yield intriguing potential for translational medical genetics and for evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life.


2020 ◽  
Vol 27 ◽  
Author(s):  
Zaheer Ullah Khan ◽  
Dechang Pi

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting cysteine sulfenylation sites. However, the performance of these models has not been satisfactory, owing to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: In this study, our aim is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature, after which the synthetic minority oversampling technique is employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed with a rigorous 10-fold jackknife cross-validation technique for model validation and authentication. Results: Following the proposed framework, a strong discrete representation of the feature space, the machine learning engine, and an unbiased presentation of the underlying training data yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the best existing method. On an independent dataset, the best existing study failed to provide sufficient details. The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data, and of 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset, in comparison with the second-best method. These empirical analyses show the superior performance of the proposed model on both the training and independent datasets in comparison with existing studies. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes on a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several factors, such as novel discriminative feature encoding schemes, the SMOTE technique, and careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide potential insight into further prediction of S-sulfenylation characteristics and functionalities. Thus, we hope that our predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
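The abstract above reports accuracy (ACC), sensitivity (Sn), specificity (Sp) and the Matthews correlation coefficient (MCC). As a rough illustration of how these four figures relate to a binary confusion matrix, the following is a minimal sketch using the standard definitions. The function name `binary_metrics` and the example counts are assumptions made for illustration only; they are not taken from the DeepSSPred study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute ACC, Sn, Sp and MCC from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    # Sensitivity: fraction of true positives recovered.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives recovered.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC is defined as 0 here when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# Hypothetical counts, for illustration only (not data from the study).
example = binary_metrics(tp=40, fp=10, tn=90, fn=10)
print(example)
```

Because MCC uses all four cells of the confusion matrix, it is often preferred to accuracy when classes are imbalanced, as is the case for SC-sites versus non-SC sites.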


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shumaila Sayyab ◽  
Anders Lundmark ◽  
Malin Larsson ◽  
Markus Ringnér ◽  
Sara Nystedt ◽  
...  

Abstract The mechanisms driving clonal heterogeneity and evolution in relapsed pediatric acute lymphoblastic leukemia (ALL) are not fully understood. We performed whole genome sequencing of samples collected at diagnosis, relapse(s) and remission from 29 Nordic patients. Somatic point mutations and large-scale structural variants were called using individually matched remission samples as controls, and allelic expression of the mutations was assessed in ALL cells using RNA-sequencing. We observed an increased burden of somatic mutations at relapse, compared to diagnosis, and at second relapse compared to first relapse. In addition to 29 known ALL driver genes, of which nine genes carried recurrent protein-coding mutations in our sample set, we identified putative non-protein coding mutations in regulatory regions of seven additional genes that have not previously been described in ALL. Cluster analysis of hundreds of somatic mutations per sample revealed three distinct evolutionary trajectories during ALL progression from diagnosis to relapse. The evolutionary trajectories provide insight into the mutational mechanisms leading to relapse in ALL and could offer biomarkers for improved risk prediction in individual patients.


Author(s):  
Ekaterina Bourova-Flin ◽  
Samira Derakhshan ◽  
Afsaneh Goudarzi ◽  
Tao Wang ◽  
Anne-Laure Vitte ◽  
...  

Abstract Background Large-scale genetic and epigenetic deregulations enable cancer cells to ectopically activate tissue-specific expression programmes. A specifically designed strategy was applied to oral squamous cell carcinomas (OSCC) in order to detect ectopic gene activations and develop a prognostic stratification test. Methods A dedicated original prognosis biomarker discovery approach was implemented using genome-wide transcriptomic data of OSCC, including training and validation cohorts. Abnormal expressions of silent genes were systematically detected, correlated with survival probabilities and evaluated as predictive biomarkers. The resulting stratification test was confirmed in an independent cohort using immunohistochemistry. Results A specific gene expression signature, including a combination of three genes, AREG, CCNA1 and DDX20, was found associated with high-risk OSCC in univariate and multivariate analyses. It was translated into an immunohistochemistry-based test, which successfully stratified patients of our own independent cohort. Discussion The exploration of the whole gene expression profile characterising aggressive OSCC tumours highlights their enhanced proliferative and poorly differentiated intrinsic nature. Experimental targeting of CCNA1 in OSCC cells is associated with a shift of transcriptomic signature towards the less aggressive form of OSCC, suggesting that CCNA1 could be a good target for therapeutic approaches.


Author(s):  
Pankaj Kumar ◽  
Abhay Kumar ◽  
Kamal Sarma ◽  
Paresh Sharma ◽  
Rashmi Rekha Kumari ◽  
...  

Background: A novel, rapid and specific multiplex polymerase chain reaction was developed to diagnose hemo-parasitic infection in bovine blood co-infected with three of the most common hemo-parasites. Methods: The diagnostic process relied on the detection of the three different bovine hemoparasites isolated from red blood cells (RBCs) of cattle (N=30) by conventional Giemsa stained blood smear (GSBS) and confirmed by multiplex PCR. The multiplex PCR system was used to diagnose GSBS positive blood samples (N=12) found infected or co-infected with hemoparasites. The multiplex primer sets were designed to amplify 205, 313 and 422 bp fragments of the apocytochrome b, sporozoite and macroschizont 2 (spm2) and 16S rRNA genes of Babesia bigemina, Theileria annulata and Anaplasma marginale, respectively. Result: This multiplex PCR was sensitive, with the ability to detect the presence of 150 ng of genomic DNA. The primers used in this multiplex PCR also showed highly specific amplification of the gene fragments of each respective parasite. Comparing the two detection methods revealed that 58.33% of specimens showed concordant diagnoses with both techniques. The specificity, positive predictive value and kappa coefficient of agreement were highest for diagnosis of B. bigemina and lowest for A. marginale. The overall kappa coefficient for diagnosis based on GSBS for multiple pathogens compared to multiplex PCR was 0.56, slightly below the agreement threshold of 0.6. Therefore, confirmation should always be based on PCR to rule out false positives due to differences in subjective observations and stain particles, and false negatives due to low parasitemia. The simplicity and rapidity of this specific multiplex PCR method make it suitable for large-scale epidemiological studies and follow-up of drug treatments.
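The abstract above compares two diagnostic methods through a kappa coefficient of agreement. As a hedged illustration, the following minimal sketch computes Cohen's kappa for two methods from a 2x2 agreement table using the standard formula. The function name `cohens_kappa` and the example counts are hypothetical and are not data from the study.

```python
def cohens_kappa(both_pos, only_first, only_second, both_neg):
    """Cohen's kappa for two binary diagnostic methods.

    both_pos:    samples positive by both methods
    only_first:  positive by the first method only
    only_second: positive by the second method only
    both_neg:    samples negative by both methods
    """
    n = both_pos + only_first + only_second + both_neg
    # Observed agreement: fraction of samples where the methods agree.
    p_observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from each method's marginal rates.
    p_expected = (
        (both_pos + only_first) * (both_pos + only_second)
        + (only_second + both_neg) * (only_first + both_neg)
    ) / (n * n)
    if p_expected == 1:
        # Kappa is undefined when chance agreement is already perfect.
        return float("nan")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts, for illustration only (not data from the study).
kappa = cohens_kappa(both_pos=20, only_first=5, only_second=5, both_neg=20)
print(round(kappa, 2))
```

Kappa corrects raw concordance for the agreement expected by chance, which is why a concordance percentage and a kappa value for the same comparison can look quite different.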


2017 ◽  
Author(s):  
Morgan N. Price ◽  
Adam P. Arkin

Abstract Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources that link protein sequences to scientific articles (Swiss-Prot, GeneRIF, and EcoCyc). PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/.


2019 ◽  
Author(s):  
Wei Fang ◽  
Yi Wen ◽  
Xiangyun Wei

Abstract Tissue-specific or cell type-specific transcription of protein-coding genes is controlled by both trans-regulatory elements (TREs) and cis-regulatory elements (CREs). However, it is challenging to identify TREs and CREs, which are unknown for most genes. Here, we describe a protocol for identifying two types of transcription-activating CREs, core promoters and enhancers, of zebrafish photoreceptor type-specific genes. This protocol is composed of three phases: bioinformatic prediction, experimental validation, and characterization of the CREs. To better illustrate the principles and logic of this protocol, we exemplify it with the discovery of the core promoter and enhancer of the mpp5b apical polarity gene (also known as ponli), whose red, green, and blue (RGB) cone-specific transcription requires its enhancer, a member of the rainbow enhancer family. While exemplified with an RGB cone-specific gene, this protocol is general and can be used to identify the core promoters and enhancers of other protein-coding genes.


2021 ◽  
Author(s):  
Wen Feng ◽  
Lei Zhou ◽  
Pengju Zhao ◽  
Heng Du ◽  
Chenguang Diao ◽  
...  

As the warthog (Phacochoerus africanus) has innate immunity against African swine fever (ASF), it is critical to understand the evolutionary novelty of the warthog to explain its specific ASF resistance. Here, we present two new complete genomes, of one warthog and one Kenyan domestic pig, as fundamental genomic references for decoding the genetic mechanism of ASF tolerance. Our results indicate that multiple genomic variations, including gene losses and independent contraction and expansion of specific gene families, likely moulded the warthog's genome to adapt to its environment. Importantly, the analysis of presence and absence of genomic sequences revealed that the warthog genome lacks a DNA sequence of the lactate dehydrogenase B (LDHB) gene on chromosome 2 compared to the reference genome. Overexpression and siRNA experiments on LDHB indicated an inhibitory effect on the replication of ASFV. Combined with large-scale sequencing data from 123 pigs from around the world, analysis of the contraction and expansion of TRIM gene families suggested that TRIM family genes in the warthog genome are potentially responsible for its tolerance to ASF. Our results will help improve the understanding of genetic resistance to ASF in pigs.


2021 ◽  
Author(s):  
Z Jafarian ◽  
S Khamse ◽  
H Afshar ◽  
Khorram Khorshid HR ◽  
A Delbari ◽  
...  

Abstract Across the human protein-coding genes, the neuron-specific gene, RASGEF1C, contains the longest (GGC)-repeat, spanning its core promoter and 5′ untranslated region (RASGEF1C-201 ENST00000361132.9). RASGEF1C expression dysregulation occurs in late-onset neurocognitive disorders (NCDs), such as Alzheimer’s disease. Here we sequenced the GGC-repeat in a sample of human subjects (N = 269), consisting of late-onset NCDs (N = 115) and controls (N = 154). We also studied the status of this STR across vertebrates. The 6-repeat allele of this repeat was the predominant allele in the controls (frequency = 0.85) and NCD patients (frequency = 0.78). The NCD genotype compartment consisted of an excess of genotypes that lacked the 6-repeat (Mid-P exact = 0.004). We also detected divergent genotypes that were present in five NCD patients and not in the controls (Mid-P exact = 0.007). This STR expanded beyond 2-repeats specifically in primates, and was at maximum length in human. We conclude that there is natural selection for the 6-repeat allele of the RASGEF1C (GGC)-repeat in human, and significant divergence from that allele in late-onset NCDs. Indication of natural selection for predominantly abundant STR alleles and divergent genotypes enhance the perspective of evolutionary biology and disease pathogenesis in human complex disorders.


2017 ◽  
Vol 18 (1) ◽  
pp. 1
Author(s):  
Wage Ratna Rohaeni ◽  
Untung Susanto ◽  
Aida F.V. Yuningsih

Resistance traits to brown planthopper in rice varieties are controlled by dominant and recessive genes called Bph/bph. Bph17 is one of the dominant genes that control rice resistance to brown planthopper. A marker of the Bph17 allele can be used as a tool for marker-assisted selection (MAS) in breeding. The association between the Bph17 allele and resistance to brown planthopper in Indonesian landraces and new-improved varieties of rice is not clearly known. The study aimed to determine the association of the Bph17 allele with resistance to brown planthopper in landraces and new-improved varieties of rice. Twenty-one rice genotypes were used in the study, consisting of 13 landraces, 5 improved varieties, 3 popular varieties and a check variety Rathu Heenati. Two simple sequence repeat markers linked to the Bph17 allele were used, i.e. RM8213 and RM5953. The results showed that the association of the Bph17 allele with resistance to brown planthopper in landraces and new-improved varieties of rice was very low (r = -0.019 and -0.023, respectively). The presence of the Bph17 allele did not consistently confer resistance to brown planthopper. The study suggests that the Bph17 allele cannot be used as a tool of MAS for evaluating resistance of landraces and new-improved varieties of rice to brown planthopper. Further research is needed to obtain a specific gene marker that can be used as a tool of MAS and is applicable to Indonesian differential rice varieties.

