Gene Expression Data
Recently Published Documents





2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

Gene regulatory networks (GRNs) are a pioneering methodology for finding new gene interactions and gaining insight into biological processes from time-series gene expression data. Studying the temporal nature of gene expression data, which reflect the complex non-linear dynamics of the network, remains a challenge. In this paper, an intelligent framework combining a recurrent neural network (RNN) with swarm intelligence (SI) based Particle Swarm Optimization (PSO) with controlled behaviour is proposed for reconstructing GRNs from time-series gene expression data. A novel PSO algorithm, enhanced by human cognition influenced by the ideology of the Bhagavad Gita, is employed to improve the learning of the RNN. The RNN guided by the proposed algorithm simulates the nonlinear and dynamic gene interactions to a greater extent. The proposed method shows superior performance over traditional SI algorithms in searching for biologically plausible candidate networks. The strength of the method is verified by analyzing a small artificial network and real Escherichia coli data, with improved accuracy.
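For orientation, the following is a minimal sketch of a standard global-best PSO loop, the family of optimizer the abstract builds on. It does not reproduce the paper's cognition-inspired variant or its RNN training objective; the function name `pso_minimize`, the parameter values, and the toy `sphere` objective are all assumptions chosen for illustration.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Standard global-best PSO (illustrative; not the paper's variant)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective standing in for an RNN training loss (assumption)."""
    return sum(v * v for v in x)

best, best_val = pso_minimize(sphere, dim=3)
```

In a GRN setting the position vector would presumably hold the RNN weights and the objective would measure the mismatch between simulated and observed expression time series; that mapping is an assumption here, not a detail given in the abstract.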

2021 ◽  
Xiangyu Liu ◽  
Zhengchang Su ◽  
Guojun Li

Abstract Background: Identifying significant biclusters of genes with specific expression patterns is an effective approach to reveal functionally correlated genes in gene expression data. However, existing algorithms are limited to finding either broad or narrow biclusters, but not both, because they fail to balance effectiveness and efficiency. Methods: We developed a new algorithm, ARBic, which can accurately identify meaningful biclusters of any shape, whether broad or narrow, in a large-scale gene expression data matrix, even when the values in the biclusters to be identified have the same distribution as the background data. ARBic is developed by integrating column-based and row-based strategies into the biclustering procedure. The column-based strategy, borrowed from ReBic, a recently published biclustering tool, favors narrow biclusters. The row-based strategy, newly designed in this article, repeatedly finds a longest path in a specific directed graph and favors broader ones. Result and Conclusion: When tested and compared to seven other salient biclustering algorithms on simulated datasets, ARBic achieved recovery, relevance and F1 scores 29% higher than the second-best algorithm. Furthermore, ARBic substantially outperforms all of them on real datasets and is robust to noise, bicluster shape and dataset type. Code:
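The row-based strategy relies on finding a longest path in a directed graph. The abstract does not say how that graph is built, so the sketch below only illustrates the generic subroutine on a hypothetical acyclic graph; the function name `longest_path`, the node labels, and the acyclicity assumption are not taken from ARBic.

```python
from functools import lru_cache

def longest_path(graph):
    """Return a longest path (as a list of nodes) in a directed acyclic graph.

    graph maps each node to a list of its successors. The graph is assumed
    to be acyclic; this is an assumption of the sketch, not a stated
    property of the ARBic graph.
    """
    @lru_cache(maxsize=None)
    def best_from(node):
        # Longest path starting at `node`, memoized per node.
        best = (node,)
        for nxt in graph.get(node, ()):
            candidate = (node,) + best_from(nxt)
            if len(candidate) > len(best):
                best = candidate
        return best

    # Include nodes that appear only as successors.
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    return list(max((best_from(n) for n in sorted(nodes)), key=len))

# Hypothetical toy graph; nodes could stand for rows or columns of the matrix.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
path = longest_path(toy_graph)
```

Memoizing on the start node keeps the search linear in the number of edges for an acyclic graph, which matters if the path search is repeated many times.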

2021 ◽  
Vol 4 (1) ◽  
Vaidehi S. Natu ◽  
Mona Rosenke ◽  
Hua Wu ◽  
Francesca R. Querdasi ◽  
Holly Kular ◽  

Abstract Development of cortical tissue during infancy is critical for the emergence of typical brain functions in cortex. However, how cortical microstructure develops during infancy remains unknown. We measured the longitudinal development of cortex from birth to six months of age using multimodal quantitative imaging of cortical microstructure. Here we show that infants’ cortex undergoes profound microstructural tissue growth during the first six months of human life. Comparison of postnatal to prenatal transcriptomic gene expression data demonstrates that myelination and synaptic processes are dominant contributors to this postnatal microstructural tissue growth. Using visual cortex as a model system, we find hierarchical microstructural growth: higher-level visual areas have less mature tissue at birth than earlier visual areas but grow at faster rates. This overturns the prominent view that visual areas that are most mature at birth develop fastest. Together, in vivo, longitudinal, and quantitative measurements, which we validated with ex vivo transcriptomic data, shed light on the rate, sequence, and biological mechanisms of developing cortical systems during early infancy. Importantly, our findings propose a hypothesis that cortical myelination is a key factor in cortical development during early infancy, which has important implications for diagnosis of neurodevelopmental disorders and delays in infants.

PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0258326
Wen Bo Liu ◽  
Sheng Nan Liang ◽  
Xi Wen Qin

Gene expression data are characterized by high dimensionality and a small sample size, and contain a large number of redundant genes unrelated to a disease. Applying machine learning directly to classify this type of data not only incurs a great time cost but also sometimes fails to improve classification performance. To counter this problem, this paper proposes a dimension-reduction algorithm based on weighted kernel principal component analysis (WKPCA), which constructs kernel function weights according to kernel matrix eigenvalues and combines multiple kernel functions to reduce the feature dimensions. To further improve the dimension-reduction efficiency of WKPCA, t-class kernel functions are constructed, and corresponding theoretical proofs are given. Moreover, the cumulative optimal performance rate is constructed to measure the overall performance of WKPCA combined with machine learning algorithms. Naive Bayes, K-nearest neighbour, random forest, iterative random forest and support vector machine classifiers are used to analyse 6 real gene expression datasets. Compared with the all-variable model, linear principal component dimension reduction and single kernel function dimension reduction, the results show that the classification performance of the 5 machine learning methods mentioned above can be improved effectively by WKPCA dimension reduction.
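A rough sketch of the idea of weighting several kernels by their eigenvalues before kernel PCA is given below. The abstract does not state the exact weighting formula or the t-class kernels, so the weighting rule used here (each kernel's share of variance captured by its leading eigenvalues), the choice of RBF and polynomial kernels, and all function names are assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared Euclidean distances, then the Gaussian kernel.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def poly_kernel(X, degree=2, coef0=1.0):
    return (X @ X.T + coef0) ** degree

def center_kernel(K):
    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    return K - one @ K - K @ one + one @ K @ one

def weighted_kernel_pca(X, kernels, n_components=2):
    """Combine kernels with eigenvalue-based weights, then run kernel PCA.

    Weighting rule (assumed, not from the paper): each kernel is scored by
    the fraction of its total eigenvalue mass held by its top components.
    """
    centered = [center_kernel(k(X)) for k in kernels]
    scores = []
    for K in centered:
        ev = np.clip(np.linalg.eigvalsh(K)[::-1], 0.0, None)  # descending
        scores.append(ev[:n_components].sum() / ev.sum())
    weights = np.array(scores) / np.sum(scores)
    K_comb = sum(w * K for w, K in zip(weights, centered))
    vals, vecs = np.linalg.eigh(K_comb)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = np.clip(vals[idx], 0.0, None), vecs[:, idx]
    # Projection of the training samples onto the leading components.
    return vecs * np.sqrt(vals), weights

# Synthetic stand-in for an expression matrix: 20 samples x 50 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
kernels = [lambda A: rbf_kernel(A, gamma=1.0 / A.shape[1]),
           lambda A: poly_kernel(A, degree=2)]
Z, weights = weighted_kernel_pca(X, kernels, n_components=2)
```

The reduced matrix `Z` would then be passed to a downstream classifier in place of the full gene set.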

Abhirup Shaw ◽  
Beáta B. Tóth ◽  
Róbert Király ◽  
Rini Arianti ◽  
István Csomós ◽  

Thermogenic brown and beige adipocytes might open up new strategies in combating obesity. Recent studies in rodents and humans have indicated that these adipocytes release cytokines, termed “batokines”. Irisin was discovered as a polypeptide regulator of beige adipocytes released by myocytes, primarily during exercise. We performed global RNA sequencing on adipocytes derived from human subcutaneous and deep-neck precursors, which were differentiated in the presence or absence of irisin. Irisin did not affect the expression of characteristic thermogenic genes, but it upregulated genes belonging to various cytokine signaling pathways. Of the several upregulated cytokines, CXCL1, the most strongly upregulated, was released throughout the entire differentiation period, predominantly by differentiated adipocytes. Deep-neck area tissue biopsies also showed a significant release of CXCL1 during 24 h irisin treatment. Gene expression data indicated upregulation of the NFκB pathway upon irisin treatment, which was validated by an increase of p50 and a decrease of IκBα protein level, respectively. Continuous blocking of the NFκB pathway, using a cell-permeable inhibitor of NFκB nuclear translocation, significantly reduced CXCL1 release. The released CXCL1 exerted a positive effect on the adhesion of endothelial cells. Together, our findings demonstrate that irisin stimulates the release of a novel adipokine, CXCL1, via upregulation of the NFκB pathway in neck area derived adipocytes, which might play an important role in improving tissue vascularization.

BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Maria Moksnes Bjaanæs ◽  
Gro Nilsen ◽  
Ann Rita Halvorsen ◽  
Hege G. Russnes ◽  
Steinar Solberg ◽  

Abstract Background Genetic alterations are common in non-small cell lung cancer (NSCLC), and DNA mutations and translocations are targets for therapy. Copy number aberrations occur frequently in NSCLC tumors and may influence gene expression and further alter signaling pathways. In this study we aimed to characterize the genomic architecture of NSCLC tumors and to identify genomic differences between tumors stratified by histology and mutation status. Furthermore, we sought to integrate DNA copy number data with mRNA expression to find genes with expression putatively regulated by copy number aberrations and the oncogenic pathways associated with these affected genes. Methods Copy number data were obtained from 190 resected early-stage NSCLC tumors and gene expression data were available from 113 of the adenocarcinomas. Clinical and histopathological data were known, and EGFR, KRAS and TP53 mutation status was determined. Allele-specific copy number profiles were calculated using ASCAT, and regional copy number aberrations were subsequently obtained and analyzed jointly with the gene expression data. Results The NSCLC tumor tissue displayed overall complex DNA copy number profiles with numerous recurrent aberrations. Despite histological differences, tissue samples from squamous cell carcinomas and adenocarcinomas had remarkably similar copy number patterns. The TP53-mutated lung adenocarcinomas displayed a highly aberrant genome, with significantly altered copy number profiles including gains, losses and focal complex events. The EGFR-mutant lung adenocarcinomas had specific arm-wise aberrations, particularly at chromosome 7p and 9q. A large number of genes displayed correlation between copy number and expression level, and the PI(3)K-mTOR pathway was highly enriched for such genes. Conclusions The genomic architecture in NSCLC tumors is complex, and TP53-mutated lung adenocarcinomas in particular displayed highly aberrant copy number profiles. We suggest always including TP53 mutation status when studying copy number aberrations in NSCLC tumors. Copy number may further impact gene expression and alter cellular signaling pathways.
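The joint analysis of copy number and expression can be pictured as a per-gene correlation across tumors. The abstract does not name the statistic or cutoff used, so the sketch below assumes a Pearson correlation with an arbitrary threshold; the function names, the threshold of 0.5, and the gene labels and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def copy_number_correlated_genes(copy_number, expression, threshold=0.5):
    """Return {gene: r} for genes whose expression tracks copy number.

    Both inputs map gene -> list of per-tumor values in the same tumor order.
    The Pearson statistic and the threshold are assumptions of this sketch.
    """
    hits = {}
    for gene, cn in copy_number.items():
        expr = expression.get(gene)
        if expr is None or len(expr) != len(cn):
            continue  # skip genes without matched measurements
        r = pearson(cn, expr)
        if not math.isnan(r) and r >= threshold:
            hits[gene] = r
    return hits

# Hypothetical toy data for five tumors.
copy_number = {"GENE_A": [1, 2, 2, 3, 4], "GENE_B": [2, 2, 3, 1, 2]}
expression = {"GENE_A": [1.0, 2.1, 1.9, 3.2, 4.1],
              "GENE_B": [5.0, 1.0, 3.0, 4.0, 2.0]}
hits = copy_number_correlated_genes(copy_number, expression)
```

The resulting gene list is the kind of input a pathway enrichment step would take, although the enrichment method itself is not described in the abstract.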

Nimrita Koul ◽  
Sunilkumar S. Manvi

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Lianxin Zhong ◽  
Qingfang Meng ◽  
Yuehui Chen

The correct classification of cancer subtypes is of great significance for the in-depth study of cancer pathogenesis and the realization of accurate treatment for cancer patients. In recent years, the classification of cancer subtypes using deep neural networks and gene expression data has become a hot topic. However, most classifiers may face the challenges of overfitting and low classification accuracy when dealing with small-sample-size, high-dimensional biological data. In this paper, the Cascade Flexible Neural Forest (CFNForest) model is proposed to accomplish cancer subtype classification. CFNForest extends the traditional flexible neural tree (FNT) structure to an FNT Group Forest that exploits a bagging ensemble strategy and can automatically generate the model’s structure and parameters. To deepen the FNT Group Forest without introducing new hyperparameters, a multilayer cascade framework is used to design the FNT Group Forest model, which transforms features between levels and improves the performance of the model. The proposed CFNForest model also improves operational efficiency and robustness through a sample selection mechanism between layers and by setting different weights for the output of each layer. To accomplish cancer subtype classification, FNT Group Forests with different feature sets are used to enrich the structural diversity of the model, which makes it more suitable for processing small-sample-size datasets. Experiments on RNA-seq gene expression data showed that CFNForest effectively improves the accuracy of cancer subtype classification. The classification results have good robustness.

mBio ◽  
2021 ◽  
Taylor Reiter ◽  
Rachel Montpetit ◽  
Ron Runnebaum ◽  
C. Titus Brown ◽  
Ben Montpetit

In this work, Saccharomyces cerevisiae gene expression was used as a biosensor to capture differences across and between fermentations of Pinot noir grapes from 15 unique sites representing eight American Viticultural Areas. This required development of a novel analysis method, DMap-DE, for investigation of asynchronous gene expression data.

2021 ◽  
Vol 22 (1) ◽  
Lianxin Zhong ◽  
Qingfang Meng ◽  
Yuehui Chen ◽  
Lei Du ◽  
Peng Wu

Abstract Background Correctly classifying the subtypes of cancer is of great significance for the in-depth study of cancer pathogenesis and the realization of personalized treatment for cancer patients. In recent years, classification of cancer subtypes using deep neural networks and gene expression data has gradually become a research hotspot. However, most classifiers may face overfitting and low classification accuracy when dealing with small-sample-size, high-dimensional biological data. Results In this paper, a laminar augmented cascading flexible neural forest (LACFNForest) model is proposed to classify cancer subtypes. This model is a cascading flexible neural forest that uses the deep flexible neural forest (DFNForest) as the base classifier. A hierarchical broadening ensemble method is proposed, which ensures the robustness of classification results and avoids wasting model structure and function as much as possible. We also introduce an output judgment mechanism in each layer of the forest to reduce the computational complexity of the model. The deep neural forest is extended to a densely connected deep neural forest to improve the prediction results. Experiments on RNA-seq gene expression data showed that LACFNForest performs better in the classification of cancer subtypes than conventional methods. Conclusion The LACFNForest model effectively improves the accuracy of cancer subtype classification with good robustness. It provides a new approach for the ensemble learning of classifiers in terms of structural design.
