Model guided trait-specific co-expression network estimation as a new perspective for identifying molecular interactions and pathways

2021 ◽  
Vol 17 (5) ◽  
pp. e1008960
Author(s):  
Juho A. J. Kontio ◽  
Tanja Pyhäjärvi ◽  
Mikko J. Sillanpää

A wide variety of 1) parametric regression models and 2) co-expression networks have been developed for finding gene-by-gene interactions underlying complex traits from expression data. While both methodological schemes have their own well-known benefits, little is known about their synergistic potential. Our study introduces a methodological fusion of the two that cross-exploits the strengths of the individual approaches via a built-in information-sharing mechanism. This fusion is theoretically based on certain trait-conditioned dependency patterns between two genes, which depend on their roles in the underlying parametric model. The resulting trait-specific co-expression network estimation method 1) serves to enhance the interpretation of biological networks in a parametric sense, and 2) exploits the underlying parametric model itself in the estimation process. To also account for the substantial intrinsic noise and collinearity that expression data often entail, a tailored co-expression measure is introduced along with this framework to alleviate the related computational problems. A remarkable advance over the reference methods in simulated scenarios substantiates the method’s high efficiency. As a proof of concept, this synergistic approach is successfully applied in survival analysis with acute myeloid leukemia data, further highlighting the framework’s versatility and broad practical relevance.

2020 ◽  
Author(s):  
Juho A. J. Kontio ◽  
Tanja Pyhäjärvi ◽  
Mikko J. Sillanpää

Abstract A wide variety of parametric approaches and co-expression networks have been developed for finding gene-by-gene interactions underlying complex traits from expression data. However, little is known about the practical correspondence and synergistic potential of these different schemes. We provide a framework for parallel consideration of parametric interaction models with quantitative traits and co-expression networks based on a previously uncharacterized link between them. The resulting trait-specific co-expression network estimation method 1) serves to enhance the interpretation of biological networks in a more parametric sense and 2) exploits the underlying parametric model itself in the estimation process. It is tailored for simultaneous identification and classification of molecular interactions and pathways regulating complex traits by accounting for common characteristics of genetic architectures that often make mainstream methods inefficient. A remarkable advance over state-of-the-art methods is illustrated theoretically and through comprehensive simulated scenarios. In particular, prognostically important novel findings in acute myeloid leukemia analysis demonstrate the method’s immediate practical relevance.

Author summary Here we built a mathematically justified bridge between parametric approaches and co-expression networks, both of which have become prevalent for identifying molecular interactions underlying complex traits. We first shared our concern that methodological improvements to these schemes that adjust only their power and scalability are bounded by more fundamental scheme-specific limitations. Subsequently, our theoretical results were exploited to overcome these limitations and find gene-by-gene interactions that neither scheme can capture alone. We also aimed to illustrate theoretically and empirically how this framework enables the interpretation of co-expression networks in a more parametric sense to achieve systematic insights into complex biological processes more reliably. The main procedure was designed to suit various types of biological applications and high-dimensional data to cover the area of systems biology as broadly as possible. In particular, we chose to illustrate the method’s applicability for gene-profile based risk-stratification in cancer research using public acute myeloid leukemia datasets.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ramin Hasibi ◽  
Tom Michoel

Abstract
Background Molecular interaction networks summarize complex biological processes as graphs, whose structure is informative of biological function at multiple scales. Simultaneously, omics technologies measure the variation or activity of genes, proteins, or metabolites across individuals or experimental conditions. Integrating the complementary viewpoints of biological networks and omics data is an important task in bioinformatics, but existing methods treat networks as discrete structures, which are intrinsically difficult to integrate with continuous node features or activity measures. Graph neural networks map graph nodes into a low-dimensional vector space representation, and can be trained to preserve both the local graph structure and the similarity between node features.
Results We studied the representation of transcriptional, protein–protein and genetic interaction networks in E. coli and mouse using graph neural networks. We found that such representations explain a large proportion of variation in gene expression data, and that using gene expression data as node features improves the reconstruction of the graph from the embedding. We further proposed a new end-to-end Graph Feature Auto-Encoder framework for the prediction of node features utilizing the structure of the gene networks, which is trained on the feature prediction task, and showed that it performs better at predicting unobserved node features than regular MultiLayer Perceptrons. When applied to the problem of imputing missing data in single-cell RNAseq data, the Graph Feature Auto-Encoder utilizing our new graph convolution layer called FeatGraphConv outperformed a state-of-the-art imputation method that does not use protein interaction information, showing the benefit of integrating biological networks and omics data with our proposed approach.
Conclusion Our proposed Graph Feature Auto-Encoder framework is a powerful approach for integrating and exploiting the close relation between molecular interaction networks and functional genomics data.
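
As a rough illustration of how a graph layer can combine network structure with node features, the sketch below performs one round of plain neighborhood averaging on a toy graph. It is not the FeatGraphConv layer described in the abstract: there are no learned weights or nonlinearities, and the function name `mean_aggregate`, the toy graph, and the feature values are all assumptions made for this example.

```python
def mean_aggregate(features, neighbors):
    """One round of mean aggregation over each node and its neighbors.

    Generic, hypothetical illustration only (no learned parameters).
    """
    updated = []
    for node, own in enumerate(features):
        # Collect the node's own feature vector plus those of its neighbors.
        group = [own] + [features[n] for n in neighbors[node]]
        dim = len(own)
        # Average each feature dimension across the group.
        updated.append([sum(vec[d] for vec in group) / len(group) for d in range(dim)])
    return updated


# Toy path graph 0 - 1 - 2 with a single made-up "expression" feature per node.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
features = [[1.0], [2.0], [6.0]]
smoothed = mean_aggregate(features, neighbors)
print(smoothed)  # [[1.5], [3.0], [4.0]]
```

Each node's value moves toward those of its neighbors, which is the basic mechanism that lets a graph-based model predict a node's unobserved features from the nodes it interacts with.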


2011 ◽  
Vol 328-330 ◽  
pp. 1667-1670
Author(s):  
Wen Gen Gao ◽  
Ming Jiang ◽  
Shan Shan Qiang

The permanent magnet synchronous motor (PMSM) has been widely used in many areas because of its high efficiency, high power factor, small volume, and energy savings. This paper analyses the PMSM direct torque control (DTC) system and estimates the speed in a speed-sensorless control system, using a speed estimation method based on the angle of the motor rotor flux vector. The control system is designed and simulated with the Simulink tool in MATLAB, and the simulation results are analyzed. The results show the correctness and feasibility of the speed observation algorithm.
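
The idea of estimating speed from the angle of a flux vector can be sketched as follows: the flux angle is obtained from the two stationary-frame flux components, and the electrical angular speed is approximated by the change in that angle per sample period. This is a minimal sketch under assumptions, not the paper's observer: the function name, the ideal noise-free flux samples, and the values of `omega_true` and `dt` are all hypothetical, and a practical implementation would also need filtering.

```python
import math


def estimate_speed_from_flux(psi_alpha, psi_beta, dt):
    """Approximate electrical angular speed (rad/s) from flux-vector samples.

    Hypothetical helper: differentiates the flux-vector angle numerically.
    """
    speeds = []
    prev_angle = math.atan2(psi_beta[0], psi_alpha[0])
    for a, b in zip(psi_alpha[1:], psi_beta[1:]):
        angle = math.atan2(b, a)
        # Wrap the angle difference into [-pi, pi) so crossing +/-pi causes no jump.
        delta = (angle - prev_angle + math.pi) % (2.0 * math.pi) - math.pi
        speeds.append(delta / dt)
        prev_angle = angle
    return speeds


# Synthetic, ideal flux vector rotating at an assumed constant speed.
omega_true = 100.0  # rad/s, assumed for the example
dt = 1e-3           # sample period in seconds, assumed
times = [k * dt for k in range(200)]
psi_alpha = [math.cos(omega_true * t) for t in times]
psi_beta = [math.sin(omega_true * t) for t in times]
speeds = estimate_speed_from_flux(psi_alpha, psi_beta, dt)
```

Dividing the electrical speed by the number of pole pairs would give the mechanical speed; that parameter is not stated in the abstract and is therefore left out here.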


2014 ◽  
Vol 580-583 ◽  
pp. 2815-2819
Author(s):  
You Ping Wu ◽  
Chun Tao Wang ◽  
Jia Bang Wang

In this paper, a new solution to the semi-parametric estimation of a mixed model with additional system parameters is presented. A calculation method for parameter adjustment under the model's regularization matrix is derived, and the estimates of the parametric and non-parametric components are determined along with the accuracy evaluation formula of the model. The effectiveness of the semi-parametric estimation method was demonstrated through simulation examples, and the semi-parametric model with additional system parameters was further extended.


Author(s):  
Keywan Hassani-Pak ◽  
Ajit Singh ◽  
Marco Brandizi ◽  
Joseph Hearnshaw ◽  
Sandeep Amberkar ◽  
...  

ABSTRACT Generating new ideas and scientific hypotheses is often the result of extensive literature and database reviews, overlaid with scientists’ own novel data and a creative process of making connections that were not made before. We have developed a comprehensive approach to guide this technically challenging data integration task and to make knowledge discovery and hypothesis generation easier for plant and crop researchers. KnetMiner can digest large volumes of scientific literature and biological research to find and visualise links between the genetic and biological properties of complex traits and diseases. Here we report the main design principles behind KnetMiner and provide use cases for mining public datasets to identify unknown links between traits such as grain colour and pre-harvest sprouting in Triticum aestivum, as well as an evidence-based approach to identify candidate genes under an Arabidopsis thaliana petal size QTL. We have developed KnetMiner knowledge graphs and applications for a range of species including plants, crops and pathogens. KnetMiner is the first open-source gene discovery platform that can leverage genome-scale knowledge graphs, generate evidence-based biological networks and be deployed for any species with a sequenced genome. KnetMiner is available at http://knetminer.org.


2018 ◽  
Vol 1 (1) ◽  
pp. 022-032
Author(s):  
Science Nature

Ordinary least squares (OLS), which minimizes the sum of squared errors, is a widely used method for estimating regression model parameters. In addition to being easy to compute, OLS is a good unbiased estimator as long as the assumptions on the error component of the given model are met. In applications, however, violations of these assumptions are often encountered. One type of violation concerns the assumed error distribution and is caused by outliers in the observed data. A robust method is therefore required to cope with outliers, namely robust regression. One commonly used robust regression method is the robust MM method, which combines a high breakdown point with high efficiency. Results based on simulated data generated with SAS 9.2 software show that the Tukey bisquare objective weighting function is able to cope with extreme outliers. Furthermore, the tuning constant c for the robust MM method is determined to be 4.685, which yields 95% efficiency, and the obtained breakdown point is 50%.
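
To make the role of the tuning constant concrete, the sketch below evaluates the standard Tukey bisquare weight, w(u) = (1 - (u/c)^2)^2 for |u| < c and 0 otherwise, using c = 4.685 as quoted in the abstract. It covers only the weighting step, not a full MM-estimator (which also needs a high-breakdown initial estimate and a robust scale); the function name and the sample scaled residuals are assumptions for illustration.

```python
def tukey_bisquare_weight(u, c=4.685):
    """Tukey bisquare weight for a scaled residual u (residual / robust scale)."""
    if abs(u) >= c:
        # Residuals beyond the tuning constant are ignored entirely.
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


# Made-up scaled residuals: three ordinary points and one extreme outlier.
scaled_residuals = [0.0, 1.0, -2.5, 10.0]
weights = [tukey_bisquare_weight(u) for u in scaled_residuals]
```

Observations with small residuals keep weights close to 1, the weights decay smoothly as the residual grows, and the extreme outlier receives a weight of exactly 0, so it no longer influences a weighted least squares fit.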


2009 ◽  
Vol 6 (2) ◽  
pp. 165-190 ◽  
Author(s):  
Mou'ath Hourani ◽  
Emary El

Gene expression data often contain missing expression values. Because effective clustering analysis and many other algorithms for gene expression data analysis require a complete matrix of gene array values, choosing the most effective missing value estimation method is necessary. In this paper, the most commonly used imputation methods from the literature are critically reviewed and analyzed to explain the proper use and weaknesses of each published method and to note observations on each. From this analysis, we conclude that the Local Least Squares (LLS) and Support Vector Regression (SVR) algorithms achieve the best performance. SVR can be considered a complementary algorithm to LLS, especially when applied to noisy data. However, both algorithms suffer from deficiencies in choosing the number of selected genes (K) and the appropriate kernel function. To overcome these drawbacks, a new method is needed that chooses these parameters automatically and has an appropriate computational complexity.

