GRep: Gene Set Representation via Gaussian Embedding

2019
Author(s): Sheng Wang, Emily Flynn, Russ B. Altman

ABSTRACT: Molecular interaction networks are our basis for understanding functional interdependencies among genes. Network embedding approaches analyze these complicated networks by representing genes as low-dimensional vectors based on the network topology. These low-dimensional vectors have recently become the building blocks for a large number of systems biology applications. Despite the success of embedding genes in this way, it remains unclear how to effectively represent gene sets, such as protein complexes and signaling pathways. The direct adaptation of existing gene embedding approaches to gene sets cannot model the diverse functions of genes in a set. Here, we propose GRep, a novel gene set embedding approach, which represents each gene set as a multivariate Gaussian distribution rather than a single point in the low-dimensional space. The diversity of genes in a set, or the uncertainty of their contribution to a particular function, is modeled by the covariance matrix of the multivariate Gaussian distribution. By doing so, GRep produces a highly informative and compact gene set representation. Using our representation, we analyze two major pharmacogenomics studies and observe substantial improvement in drug target identification from expression-derived gene sets. Overall, the GRep framework provides a novel representation of gene sets that can be used as input features to off-the-shelf machine learning classifiers for gene set analysis.
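The idea of describing a gene set by a Gaussian rather than a single point can be illustrated with a minimal sketch. This is not the GRep training procedure, which the abstract does not detail; it only assumes that per-gene embedding vectors already exist and summarizes a set by their sample mean and covariance. The function name `gaussian_summary` and the embedding values are hypothetical.

```python
from statistics import fmean


def gaussian_summary(vectors):
    """Return (mean, covariance) for a list of equal-length embedding vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    # Mean vector: the "single point" view of the gene set.
    mean = [fmean(v[i] for v in vectors) for i in range(dim)]
    # Sample covariance (n - 1 denominator); requires at least two genes.
    cov = [
        [
            sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    return mean, cov


# Hypothetical 2-D embeddings for a three-gene set (made-up numbers).
gene_set = [[0.0, 1.0], [2.0, 1.0], [4.0, 4.0]]
mean, cov = gaussian_summary(gene_set)
```

In this sketch the covariance entries grow as the genes in the set spread out in the embedding space, which is the kind of within-set diversity the abstract attributes to the covariance matrix.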

2011, Vol 9 (70), pp. 1063-1072
Author(s): Sali Lv, Yan Li, Qianghu Wang, Shangwei Ning, Teng Huang, ...

Numerous gene sets have been used as molecular signatures for exploring the genetic basis of complex disorders. These gene sets are distinct but related to each other in many cases; therefore, efforts have been made to compare gene sets for studies such as those evaluating the reproducibility of different experiments. Comparison in terms of biological function has been demonstrated to be helpful to biologists. We improved the measurement of semantic similarity to quantify the functional association between gene sets in the context of gene ontology and developed a web toolkit named Gene Set Functional Similarity (GSFS; http://bioinfo.hrbmu.edu.cn/GSFS ). Validation based on protein complexes for which the functional associations are known demonstrated that the GSFS scores tend to be correlated with sequence similarity scores and that complexes with high GSFS scores tend to be involved in the same functional catalogue. Compared with the pairwise method and the annotation method, the GSFS shows better discrimination and more accurately reflects the known functional catalogues shared between complexes. Case studies comparing differentially expressed genes of prostate tumour samples from different microarray platforms and identifying coronary heart disease susceptibility pathways revealed that the method could contribute to future studies exploring the molecular basis of complex disorders.


2012, Vol 591-593, pp. 1783-1788
Author(s): Zhi Yang Jia, Pu Wang, Xue Jin Gao

In process monitoring and fault diagnosis of batch processes, traditional principal component analysis (PCA) and partial least squares (PLS) assume that the process variables follow a multivariate Gaussian distribution. In practical industrial processes, however, the observed data do not necessarily follow a multivariate Gaussian distribution. Independent component analysis (ICA), as a higher-order statistical method, is more suitable for dynamic systems. The observed data are decomposed into a linear combination of components that are independent in the statistical sense: higher-order statistics are extracted and the mixed signals are separated into independent non-Gaussian components. Traditional ICA requires the number of independent components to be predefined. This paper proposes an improved MICA method that selects the independent components automatically by setting a threshold on the negentropy. The method removes the need to predefine the number of independent components and, at the same time, reduces the complexity of the monitoring model. The proposed method is applied to process monitoring and fault diagnosis of penicillin fermentation, and the results verify its feasibility and effectiveness.
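Selecting components by a negentropy threshold can be sketched as follows. This is an illustration under stated assumptions, not the paper's MICA algorithm: it uses the classical moment-based approximation J(y) ≈ E[y^3]^2 / 12 + kurt(y)^2 / 48 for a standardized signal y, and the threshold value, function names, and test signals are all hypothetical.

```python
import random
from statistics import fmean, pstdev


def negentropy_approx(samples):
    """Moment-based negentropy approximation: skew^2 / 12 + excess_kurtosis^2 / 48."""
    mu = fmean(samples)
    sigma = pstdev(samples)
    # Standardize to zero mean and unit variance before taking moments.
    z = [(x - mu) / sigma for x in samples]
    skew = fmean(v ** 3 for v in z)
    excess_kurt = fmean(v ** 4 for v in z) - 3.0
    return skew ** 2 / 12.0 + excess_kurt ** 2 / 48.0


def select_components(components, threshold):
    """Return indices of components whose approximate negentropy exceeds threshold."""
    return [i for i, c in enumerate(components) if negentropy_approx(c) > threshold]


# Two synthetic signals (assumed for illustration): one Gaussian, one uniform.
rng = random.Random(0)
gaussian_like = [rng.gauss(0.0, 1.0) for _ in range(5000)]
uniform_like = [rng.uniform(-1.0, 1.0) for _ in range(5000)]

# The threshold 0.01 is an arbitrary illustrative value, not one from the paper.
kept = select_components([gaussian_like, uniform_like], threshold=0.01)
```

Because negentropy is close to zero for Gaussian signals and larger for non-Gaussian ones, a threshold of this kind retains only the clearly non-Gaussian components, so the number of components no longer has to be fixed in advance.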


2021
Author(s): Athira Athira, Daniel Dondorp, Jerneja Rudolf, Olivia Peytral, Marios Chatzigeorgiou

Locomotion is broadly conserved in the animal kingdom, yet our understanding of how complex locomotor behaviors are generated and have evolved is limited by the lack of an accurate description of their structural organization. Here we take a neuroethological approach to break down the motor behavioral repertoire of one of our nearest invertebrate relatives, the protochordate Ciona intestinalis, into basic building blocks. Using machine vision, we track thousands of swimming larvae to obtain a feature-rich description of larval swimming and show that most of the postural variance can be captured by six basic shapes, which we term Eigencionas. Using multiple complementary approaches, we build representations of the larval behavioral dynamics and systematically reveal the global structure of behavior. By employing matrix profiling and subsequence time-series clustering, we reveal that Ciona swimming is rich in stereotyped behavioral motifs. Combining pharmacological inhibition of bioamine signaling with a Hidden Markov Model, we discover underlying behavioral states, including multiple modes of roaming and dwelling. Finally, a spatio-temporal embedding of the postural features projects the behavioral repertoire onto a low-dimensional behavioral space, providing insight into that repertoire and highlighting subtle behavioral differences evoked by light stimuli. Taken together, Ciona larvae generate their spontaneous swimming and visuomotor behavioral repertoire by altering both their motor modules and the transitions between them, which are amenable to pharmacological perturbations, facilitating future functional and mechanistic investigations.

