Tensor decomposition with relational constraints for predicting multiple types of microRNA-disease associations

Author(s):  
Feng Huang ◽  
Xiang Yue ◽  
Zhankun Xiong ◽  
Zhouxin Yu ◽  
Shichao Liu ◽  
...  

Abstract MicroRNAs (miRNAs) play crucial roles in multifarious biological processes associated with human diseases. Identifying potential miRNA-disease associations contributes to understanding the molecular mechanisms of miRNA-related diseases. Most existing computational methods focus on predicting whether a miRNA-disease association exists or not. However, the roles of miRNAs in diseases differ markedly. For instance, genetic variants of a miRNA (mir-15) may affect miRNA expression levels and lead to B cell chronic lymphocytic leukemia, while circulating miRNAs (including mir-1246, mir-1307-3p, etc.) have the potential to detect breast cancer at an early stage. In this paper, we aim to predict multi-type miRNA-disease associations instead of treating them as binary. To this end, we represent miRNA-disease-type triples as a tensor and introduce tensor decomposition methods to solve the prediction task. Experimental results on two widely adopted miRNA-disease datasets, HMDD v2.0 and HMDD v3.2, show that tensor decomposition methods improve on a recent baseline by a large margin (up to 38% in Top-1 F1). We then propose a novel method, Tensor Decomposition with Relational Constraints (TDRC), which incorporates biological features as relational constraints to further improve the existing tensor decomposition methods. Compared with two existing tensor decomposition methods, TDRC produces better performance while being more efficient.
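The idea of representing miRNA-disease-type triples as a tensor and scoring unseen triples by a low-rank factorization can be sketched as follows. This is a minimal illustration using plain CP decomposition fitted by alternating least squares, not the TDRC method itself (it has no relational constraints). The function names, the toy triples, and the rank are assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np

def build_tensor(triples, n_mirna, n_disease, n_type):
    """Binary 3-way tensor: X[i, j, k] = 1 if miRNA i is linked to disease j by type k."""
    X = np.zeros((n_mirna, n_disease, n_type))
    for i, j, k in triples:
        X[i, j, k] = 1.0
    return X

def cp_als(X, rank, n_iter=200, seed=0):
    """Plain CP decomposition fitted by alternating least squares (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = X.shape
    A = rng.standard_normal((n_i, rank))
    B = rng.standard_normal((n_j, rank))
    C = rng.standard_normal((n_k, rank))
    for _ in range(n_iter):
        # Each update solves a least-squares problem for one factor with the other two fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def reconstruct(A, B, C):
    """Score every (miRNA, disease, type) triple from the three factor matrices."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

if __name__ == "__main__":
    # Toy data (made up): 3 miRNAs, 3 diseases, 2 association types.
    triples = [(0, 0, 0), (1, 2, 1), (2, 1, 0), (0, 1, 0)]
    X = build_tensor(triples, 3, 3, 2)
    A, B, C = cp_als(X, rank=2)
    scores = reconstruct(A, B, C)
    # Rank the association types for miRNA 0 and disease 0 by predicted score.
    print(np.argsort(-scores[0, 0, :]))
```

Ranking the entries of `scores[i, j, :]` gives a per-pair ordering of association types, which is the kind of output a Top-1 metric would evaluate.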

Author(s):  
Gonca Erdemci-Tandogan ◽  
M. Lisa Manning

Large-scale tissue deformation during biological processes such as morphogenesis requires cellular rearrangements. The simplest rearrangement in confluent cellular monolayers involves neighbor exchanges among four cells, called a T1 transition, in analogy to foams. But unlike foams, cells must execute a sequence of molecular processes, such as endocytosis of adhesion molecules, to complete a T1 transition. Such processes could take a long time compared to other timescales in the tissue. In this work, we incorporate this idea by augmenting vertex models to require a fixed, finite time for T1 transitions, which we call the “T1 delay time”. We study how variations in T1 delay time affect tissue mechanics, by quantifying the relaxation time of tissues in the presence of T1 delays and comparing that to the cell-shape based timescale that characterizes fluidity in the absence of any T1 delays. We show that the molecular-scale T1 delay timescale dominates over the cell shape-scale collective response timescale when the T1 delay time is the larger of the two. We extend this analysis to tissues that become anisotropic under convergent extension, finding similar results. Moreover, we find that increasing the T1 delay time increases the percentage of higher-fold coordinated vertices and rosettes, and decreases the overall number of successful T1s, contributing to a more elastic-like – and less fluid-like – tissue response. Our work suggests that molecular mechanisms that act as a brake on T1 transitions could stiffen global tissue mechanics and enhance rosette formation during morphogenesis.


2019 ◽  
Author(s):  
Arshdeep Sekhon ◽  
Beilun Wang ◽  
Yanjun Qi

Abstract We focus on integrating different types of extra knowledge (other than the observed samples) for estimating the sparse structure change between two p-dimensional Gaussian Graphical Models (i.e. differential GGMs). Previous differential GGM estimators either fail to include additional knowledge or cannot scale up to a high-dimensional (large p) situation. This paper proposes a novel method, KDiffNet, that incorporates Additional Knowledge in identifying Differential Networks via an Elementary Estimator. We design a novel hybrid norm as a superposition of two structured norms guided by the extra edge information and the additional node group knowledge. KDiffNet is solved through a fast parallel proximal algorithm, enabling it to work in large-scale settings. KDiffNet can incorporate various combinations of existing knowledge without re-designing the optimization. Through rigorous statistical analysis we show that, while considering more evidence, KDiffNet achieves the same convergence rate as the state of the art. Empirically, on multiple synthetic datasets and one real-world fMRI brain dataset, KDiffNet significantly outperforms the cutting-edge baselines with regard to prediction performance, while achieving the same level of time cost or less.


2019 ◽  
Vol 14 (7) ◽  
pp. 614-620 ◽  
Author(s):  
Jiajing Chen ◽  
Jianan Zhao ◽  
Shiping Yang ◽  
Zhen Chen ◽  
Ziding Zhang

Background: As one of the most important reversible protein post-translational modification types, ubiquitination plays a significant role in the regulation of many biological processes, such as cell division, signal transduction, apoptosis and immune response. Protein ubiquitination usually occurs when a ubiquitin molecule is attached to a lysine on a target protein, which is also known as “lysine ubiquitination”. Objective: To investigate the molecular mechanisms of ubiquitination-related biological processes, the crucial first step is the identification of ubiquitination sites. However, conventional experimental methods for detecting ubiquitination sites are often time-consuming, and a large number of ubiquitination sites remain unidentified. In this study, a ubiquitination site prediction method for Arabidopsis thaliana was developed using a Support Vector Machine (SVM). Methods: We collected 3009 experimentally validated ubiquitination sites on 1607 proteins in A. thaliana to construct the training set. Three feature encoding schemes were used to characterize the sequence patterns around ubiquitination sites: AAC, Binary and CKSAAP. The maximum Relevance and Minimum Redundancy (mRMR) feature selection method was employed to reduce the dimensionality of input features. Five-fold cross-validation and independent tests were used to evaluate the performance of the established models. Results: The combination of AAC and CKSAAP encoding schemes yielded the best performance, with an accuracy of 81.35% and an AUC of 0.868 in the independent test. We also generated an online predictor termed AraUbiSite, which is freely accessible at: http://systbio.cau.edu.cn/araubisite. Conclusion: We developed a well-performing prediction tool for large-scale ubiquitination site identification in A. thaliana. It is hoped that the current work will speed up the identification of ubiquitination sites in A. thaliana and help to further elucidate the molecular mechanisms of ubiquitination in plants.
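Two of the encoding schemes named above, AAC (amino acid composition) and CKSAAP (composition of k-spaced amino acid pairs), can be sketched with standard-library Python. This is an illustrative sketch under assumptions: the 21-residue window, the example sequence, and `k_max=2` are made up for the example and are not taken from the abstract, and the function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(window):
    """Amino acid composition: frequency of each of the 20 residues in the window."""
    n = len(window)
    return [window.count(a) / n for a in AMINO_ACIDS]

def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max (400 features per k).

    Assumes the window is longer than k_max + 1 residues.
    """
    pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = len(window) - k - 1  # number of residue pairs separated by k positions
        for i in range(total):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # pairs containing non-standard residues are skipped
                counts[pair] += 1
        features.extend(counts[p] / total for p in pairs)
    return features

if __name__ == "__main__":
    # Made-up 21-residue window with the candidate lysine (K) at the center.
    window = "AKTAYIAQRQKISFVSHFSRA"
    vector = aac(window) + cksaap(window, k_max=2)
    print(len(vector))
```

Concatenating the two encodings, as in the last lines, yields one fixed-length vector per candidate site that a classifier such as an SVM could consume after feature selection.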


Blood ◽  
2010 ◽  
Vol 116 (11) ◽  
pp. 1899-1907 ◽  
Author(s):  
Thet Thet Lin ◽  
Boitelo T. Letsolo ◽  
Rhiannon E. Jones ◽  
Jan Rowson ◽  
Guy Pratt ◽  
...  

Abstract We performed single-molecule telomere length and telomere fusion analysis in patients at different stages of chronic lymphocytic leukemia (CLL). Our work identified the shortest telomeres ever recorded in primary human tissue, reinforcing the concept that there is significant cell division in CLL. Furthermore, we provide direct evidence that critical telomere shortening, dysfunction, and fusion contribute to disease progression. The frequency of short telomeres and fusion events increased with advanced disease, but importantly these were also found in a subset of early-stage patient samples, indicating that these events can precede disease progression. Sequence analysis of fusion events isolated from persons with the shortest telomeres revealed limited numbers of repeats at the breakpoint, subtelomeric deletion, and microhomology. Array-comparative genome hybridization analysis of persons displaying evidence of telomere dysfunction revealed large-scale genomic rearrangements that were concentrated in the telomeric regions; this was not observed in samples with longer telomeres. The telomere dynamics observed in CLL B cells were indistinguishable from that observed in cells undergoing crisis in culture after abrogation of the p53 pathway. Taken together, our data support the concept that telomere erosion and subsequent telomere fusion are critical in the progression of CLL and that this paradigm may extend to other malignancies.


Author(s):  
Benjamin Hall ◽  
Anna Niarakis

Discrete, logic-based models are increasingly used to describe biological mechanisms. Initially introduced to study gene regulation, these models evolved to cover various molecular mechanisms, such as signalling, transcription factor cooperativity, and even metabolic processes. The abstract nature and amenability of discrete models to robust mathematical analyses make them appropriate for addressing a wide range of complex biological problems. Recent technological breakthroughs have generated a wealth of high throughput data. Novel, literature-based representations of biological processes and emerging machine learning algorithms offer new opportunities for model construction. Here, we review recent efforts to incorporate omic data into logic-based models and discuss critical challenges in constructing and analysing integrative, large-scale, logic-based models of biological mechanisms.
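As a concrete picture of what a discrete, logic-based model looks like, here is a minimal sketch of a two-node Boolean network with synchronous updates. The network (a mutual-inhibition toggle between hypothetical nodes `A` and `B`) and all function names are assumptions chosen for illustration, not a model from the review.

```python
from itertools import product

# Hypothetical rules: each node is active only when the other is inactive.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}

def step(state):
    """Synchronous update: every node is recomputed from the same previous state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}

def fixed_points():
    """Enumerate all states and keep those that the update leaves unchanged."""
    nodes = list(RULES)
    result = []
    for values in product([False, True], repeat=len(nodes)):
        state = dict(zip(nodes, values))
        if step(state) == state:
            result.append(state)
    return result

if __name__ == "__main__":
    for state in fixed_points():
        print(state)
```

Exhaustive enumeration is only feasible for small networks, since the state space doubles with every added node; that growth is one reason large-scale logic-based models are hard to analyse.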


2018 ◽  
Author(s):  
Jin Li ◽  
Le Zheng ◽  
Akihiko Uchiyama ◽  
Lianghua Bin ◽  
Theodora M. Mauro ◽  
...  

Abstract A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.


Author(s):  
Renato Pajarola ◽  
Susanne K. Suter ◽  
Rafael Ballester-Ripoll ◽  
Haiyan Yang

Abstract Tensor decomposition methods and multilinear algebra are powerful tools to cope with challenges around multidimensional and multivariate data in computer graphics, image processing and data visualization, in particular with respect to compact representation and processing of increasingly large-scale data sets. Initially proposed as an extension of the concept of matrix rank to three or more dimensions, tensor decomposition methods have found applications in a remarkably wide range of disciplines. We briefly review the main concepts of tensor decompositions and their application to multidimensional visual data. Furthermore, we include a first outlook on porting these techniques to multivariate data such as vector and tensor fields.
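The compact-representation idea can be illustrated with a truncated higher-order SVD (HOSVD), one common way to compute a Tucker decomposition. The sketch below is an assumption-laden illustration rather than the authors' pipeline: the function names, tensor shape, and ranks are invented for the example, and NumPy is assumed to be available.

```python
import numpy as np

def unfold(T, mode):
    """Matricize the tensor: the chosen mode becomes the rows."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a small core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])  # keep the r leading left singular vectors
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    """Expand the core back to the original size using the factor matrices."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.standard_normal((4, 5, 6))  # stand-in for a small 3D data set
    core, factors = hosvd(volume, ranks=(2, 3, 2))
    approx = tucker_reconstruct(core, factors)
    print(core.shape, np.linalg.norm(volume - approx) / np.linalg.norm(volume))
```

With full ranks the reconstruction is exact up to floating-point error; lowering the ranks trades accuracy for a smaller core and factor matrices.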


2020 ◽  
Author(s):  
Hai-bo Si ◽  
Ti-min Yang ◽  
Yang Chen ◽  
Rui-wen Mao ◽  
Li-ming Wu ◽  
...  

Abstract Background MicroRNAs (miRs) have received extensive attention in osteoarthritis (OA) pathogenesis in recent years, and our previous study has confirmed that a single intra-articular injection (IAJ) of miR-140-5p alleviates early-stage OA (EOA) progression in rats. This study aims to further investigate the effects of a single IAJ of miR-140-5p on different stages of OA and of multiple IAJs of miR-140-5p on EOA, as well as the potential mechanisms. Methods Firstly, an OA model was surgically induced in rats, 9 rats were treated with an IAJ of Cy5-miR-140-5p at 1 week after surgery, and fluorescence distribution was measured. Then, 72 rats were treated with a single IAJ of miR-140-5p at different times after surgery or multiple IAJs of miR-140-5p at 1 week after surgery, and OA progression was evaluated macroscopically and histologically. Finally, bioinformatics analyses were performed and the potential targets and molecular mechanisms of miR-140-5p were predicted. Results Strong fluorescence was observed in the chondrocytes and joint where Cy5-miR-140-5p was injected. Behavioural scores, chondrocyte numbers and cartilage thickness were higher, while pathological scores were lower, in the miR-140-5p group than in the control group. Specifically, the earlier a single IAJ of miR-140-5p was given, the better the therapeutic effect, and multiple IAJs exhibited a better therapeutic effect than a single IAJ on EOA. Bioinformatics analyses predicted 84 potential target genes of rno-miR-140-5p and revealed that these genes are enriched in various biological processes and pathways. Conclusions IAJs of miR-140-5p effectively alleviate EOA progression by modulating various biological processes and pathways, and may be a promising therapeutic for EOA.


Author(s):  
Benjamin Hall ◽  
Anna Niarakis

Discrete, logic-based models are increasingly used to describe biological mechanisms. Initially introduced to study gene regulation, these models evolved to cover various molecular mechanisms, such as signalling, transcription factor cooperativity, and even metabolic processes. The abstract nature and amenability of discrete models to robust mathematical analyses make them appropriate for addressing a wide range of complex biological problems. Recent technological breakthroughs have generated a wealth of high throughput data. Novel, literature-based representations of biological processes and emerging algorithms offer new opportunities for model construction. Here, we review up-to-date efforts to address challenging biological questions by incorporating omic data into logic-based models, and discuss critical difficulties in constructing and analysing integrative, large-scale, logic-based models of biological mechanisms.

