MiRNA-disease association prediction via hypergraph learning based on high-dimensionality features

2021 ◽  
Vol 21 (S1) ◽  
Author(s):  
Yu-Tian Wang ◽  
Qing-Wen Wu ◽  
Zhen Gao ◽  
Jian-Cheng Ni ◽  
Chun-Hou Zheng

Abstract. Background: MicroRNAs (miRNAs) have been confirmed to be closely related to various complex human diseases. Identifying disease-related miRNAs provides insight into the underlying pathogenesis of diseases, but determining which miRNAs are related to which diseases remains a major challenge. Because experimental methods are generally expensive and time-consuming, efficient computational models are needed to discover potential miRNA-disease associations. Methods: This study presents a novel prediction method called HFHLMDA, based on high-dimensionality features and hypergraph learning, to reveal associations between diseases and miRNAs. First, miRNA functional similarity and disease semantic similarity are integrated to form an informative high-dimensionality feature vector. Then, a hypergraph is constructed by the K-Nearest-Neighbor (KNN) method, in which each miRNA-disease pair and its k most relevant neighbors are linked as one hyperedge to represent the complex relationships among miRNA-disease pairs. Finally, a hypergraph learning model learns the projection matrix that is used to calculate the association scores of uncertain miRNA-disease pairs. Results: Compared with four state-of-the-art computational models, HFHLMDA achieved the best results, 92.09% and 91.87%, in leave-one-out cross validation and fivefold cross validation, respectively. Moreover, in case studies on esophageal neoplasms, hepatocellular carcinoma, and breast neoplasms, 90%, 98%, and 96% of the top 50 predictions, respectively, were manually confirmed by previous experimental studies. Conclusion: MiRNAs have complex connections with many human diseases. In this study, we proposed a novel computational model to predict the underlying miRNA-disease associations. All results show that the proposed method is effective for miRNA-disease association prediction.
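The KNN hypergraph construction described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance between feature vectors and one hyperedge per sample; the abstract specifies neither. The function name `knn_hypergraph_incidence` and the random toy features are hypothetical and are not taken from HFHLMDA.

```python
import numpy as np

def knn_hypergraph_incidence(features: np.ndarray, k: int) -> np.ndarray:
    """Return an (n_samples x n_samples) incidence matrix H.

    Column j is the hyperedge centred on sample j: it contains sample j
    itself plus its k nearest neighbours (Euclidean distance is an assumption).
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances between all samples.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    H = np.zeros((n, n), dtype=int)
    for j in range(n):
        d = dist[j].copy()
        d[j] = np.inf                      # exclude the centre from its own neighbour list
        neighbours = np.argsort(d, kind="stable")[:k]
        H[neighbours, j] = 1               # the k neighbours join hyperedge j
        H[j, j] = 1                        # the centre sample always belongs to its hyperedge
    return H

# Toy feature vectors standing in for miRNA-disease pair features (hypothetical data).
rng = np.random.default_rng(0)
X = rng.random((6, 4))
H = knn_hypergraph_incidence(X, k=2)
```

Each column of `H` then has exactly k + 1 nonzero entries, and row sums give the number of hyperedges each sample takes part in.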

2021 ◽  
Vol 16 ◽  
Author(s):  
Yayan Zhang ◽  
Guihua Duan ◽  
Cheng Yan ◽  
Haolun Yi ◽  
Fang-Xiang Wu ◽  
...  

Background: Increasing evidence has indicated that miRNA-disease association prediction plays a critical role in the study of clinical drugs. Researchers have proposed many computational models for miRNA-disease prediction. However, there is no unified platform for comparing and analyzing the pros and cons of these models or for sharing their code and data. Objective: In this study, we develop an easy-to-use platform (MDAPlatform) to construct and assess miRNA-disease association prediction methods. Methods: MDAPlatform integrates the miRNA, disease, and miRNA-disease association data used in previous miRNA-disease association prediction studies. Based on a componentized model, it implements different components of previous computational methods. Results: Users can conduct cross validation experiments and compare their methods with other methods, and visualized comparison results are also provided. Conclusion: Based on the componentized model, MDAPlatform provides easy-to-operate interfaces for constructing miRNA-disease association methods, which should help in developing new miRNA-disease association prediction methods in the future.


2020 ◽  
Vol 20 (6) ◽  
pp. 452-460
Author(s):  
Lin Tang ◽  
Yu Liang ◽  
Xin Jin ◽  
Lin Liu ◽  
Wei Zhou

Background: Accumulating experimental studies have demonstrated that long non-coding RNAs (LncRNAs) play crucial roles in the occurrence and development of various complex human diseases. Nonetheless, only a small portion of LncRNA–disease associations have been experimentally verified at present. Automatically predicting LncRNA–disease associations with computational models can save the huge cost of wet-lab experiments. Methods and Result: To integrate various heterogeneous biological data for identifying potential disease-related LncRNAs, we propose HEBLDA, a hierarchical extension based on the Boolean matrix for LncRNA–disease association prediction. HEBLDA discovers the intrinsic hierarchical correlation from various relational sources based on the properties of the Boolean matrix. Then, HEBLDA integrates these hierarchical associated matrices using fusion weights. Finally, HEBLDA uses the hierarchical associated matrix to reconstruct the LncRNA–disease association matrix by hierarchical extension. HEBLDA also works for diseases or LncRNAs without known association data. In 5-fold cross-validation experiments, HEBLDA obtained an area under the receiver operating characteristic curve (AUC) of 0.8913, improving on previous classical methods. In addition, case studies show that HEBLDA can accurately predict candidate diseases for several LncRNAs. Conclusion: Based on its ability to discover richer correlated structure across various data sources, we anticipate that HEBLDA can provide more comprehensive association prediction in a broad range of fields.


Author(s):  
Xing Chen ◽  
Lian-Gang Sun ◽  
Yan Zhao

Abstract. Emerging evidence shows that microRNAs (miRNAs) play a critical role in diverse fundamental biological processes associated with human diseases. Inferring potential disease-related miRNAs and employing them as biomarkers or drug targets could contribute to the prevention, diagnosis and treatment of complex human diseases. Because traditional biological experiments cost considerable time and resources, computational models can serve as a complementary means of uncovering potential miRNA–disease associations. In this study, we proposed a new computational model named Neighborhood Constraint Matrix Completion for MiRNA–Disease Association prediction (NCMCMDA) to predict potential miRNA–disease associations. The main task of NCMCMDA was to recover the missing miRNA–disease associations based on the known miRNA–disease associations and integrated disease (miRNA) similarity. In this model, we integrated a neighborhood constraint with matrix completion, which provided a novel way of using similarity information to assist the prediction. After the recovery task was transformed into an optimization problem, we solved it with a fast iterative shrinkage-thresholding algorithm. The AUCs of NCMCMDA in global and local leave-one-out cross validation were 0.9086 and 0.8453, respectively. In 5-fold cross validation, NCMCMDA achieved an average AUC of 0.8942 with a standard deviation of 0.0015, demonstrating superior performance over many previous computational methods. Furthermore, NCMCMDA was applied to three different types of case studies to further evaluate its prediction reliability and accuracy. As a result, 84% (colon neoplasms), 98% (esophageal neoplasms) and 98% (breast neoplasms) of the top 50 predicted miRNAs were verified by recent literature.
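To make the optimization step concrete, here is a minimal sketch of a fast iterative shrinkage-thresholding algorithm applied to plain low-rank matrix completion. It assumes a nuclear-norm regulariser and omits the neighborhood constraint entirely, so it is not the NCMCMDA objective; the function names, `tau`, and the toy matrix are all hypothetical.

```python
import numpy as np

def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    shrunk = np.maximum(s - tau, 0.0)
    return (U * shrunk) @ Vt

def fista_complete(A, mask, tau, n_iter=100):
    """Minimise 0.5 * ||mask * (X - A)||_F^2 + tau * ||X||_* with FISTA.

    A    : observed association matrix (entries outside the mask are ignored)
    mask : 1 where an entry is treated as observed, 0 where it is to be recovered
    """
    X_prev = np.zeros_like(A, dtype=float)
    Y = X_prev.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = mask * (Y - A)                          # gradient of the data-fit term (step size 1)
        X = singular_value_threshold(Y - grad, tau)    # shrinkage-thresholding step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = X + ((t - 1.0) / t_next) * (X - X_prev)    # momentum (extrapolation) step
        X_prev, t = X, t_next
    return X_prev

# Toy example (hypothetical data): hide one entry and compute a score for it.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
mask = np.ones_like(A)
mask[0, 1] = 0.0                                       # pretend this association is unknown
scores = fista_complete(A, mask, tau=0.1, n_iter=200)
```

The value `scores[0, 1]` plays the role of a predicted association score for the hidden pair. A model in the spirit of the abstract would add similarity-based terms to the objective before applying the same kind of iteration.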


2020 ◽  
Vol 2 (2) ◽  
pp. 29-38
Author(s):  
Abdur Rohman Harits Martawireja ◽  
Hilman Mujahid Purnama ◽  
Atika Nur Rahmawati

Human face recognition is an important research field, and many recent applications use it, both commercially and in law enforcement. Face recognition is a biometric system that identifies a person from the characteristics of their face with high accuracy. It can be applied to security systems. Many methods can be used in face recognition applications for system security; this article discusses two of them, Two-Dimensional Principal Component Analysis and Kernel Fisher Discriminant Analysis, with K-Nearest Neighbor as the classification method. Both methods were tested using cross validation. Results from previous research show that face recognition with Two-Dimensional Principal Component Analysis achieved an accuracy of 88.73% with 5-fold cross validation and 89.25% with 2-fold validation, while Kernel Fisher Discriminant Analysis with 2-fold cross validation achieved an average accuracy of 83.10%.
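As a rough illustration of the first method, the sketch below implements the usual formulation of Two-Dimensional PCA: the image covariance matrix is built directly from 2D image matrices and images are projected onto its leading eigenvectors. The function names and the random stand-in images are hypothetical; the reviewed studies may differ in preprocessing and in how the projected features are passed to the K-Nearest Neighbor classifier.

```python
import numpy as np

def fit_2dpca(images: np.ndarray, n_components: int) -> np.ndarray:
    """images has shape (n_images, height, width).

    Returns a (width, n_components) matrix whose columns are the projection axes.
    """
    centred = images - images.mean(axis=0)
    # Image covariance matrix G = mean_i (A_i - mean)^T (A_i - mean), shape (width, width).
    G = np.einsum("nhw,nhv->wv", centred, centred) / images.shape[0]
    eigvals, eigvecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :n_components]     # keep the axes with the largest eigenvalues

def transform_2dpca(images: np.ndarray, axes: np.ndarray) -> np.ndarray:
    """Project every image: (n, height, width) @ (width, d) -> (n, height, d)."""
    return images @ axes

# Random arrays standing in for grayscale face images (hypothetical data).
rng = np.random.default_rng(1)
faces = rng.random((10, 8, 6))
axes = fit_2dpca(faces, n_components=3)
features = transform_2dpca(faces, axes)
```

The resulting feature matrices can be flattened and compared with any distance-based classifier.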


2019 ◽  
Vol 17 (1) ◽  
Author(s):  
Guobo Xie ◽  
Zhiliang Fan ◽  
Yuping Sun ◽  
Cuiming Wu ◽  
Lei Ma

Abstract. Background: Numerous recent biological experiments have indicated that microRNAs (miRNAs) play critical roles in the pathogenesis of various human diseases. Since traditional experimental methods for detecting miRNA-disease associations are costly and time-consuming, there is an urgent need for efficient and robust computational techniques to identify undiscovered interactions. Methods: In this paper, we proposed a computational framework named weighted bipartite network projection for miRNA-disease association prediction (WBNPMD). In this method, transfer weights were constructed by combining the known miRNA and disease similarities, and the initial information was properly configured. A two-step bipartite network algorithm was then implemented to infer potential miRNA-disease associations. Results: The proposed WBNPMD was applied to the known miRNA-disease association data, and leave-one-out cross-validation (LOOCV) and fivefold cross-validation were used to evaluate its performance. Our method achieved AUCs of 0.9321 and 0.9173 ± 0.0005 in LOOCV and fivefold cross-validation, respectively, and outperformed four other state-of-the-art methods. We also carried out two kinds of case studies on prostate neoplasm, colorectal neoplasm, and lung neoplasm, and most of the top 50 predicted miRNAs were confirmed to be associated with the corresponding diseases based on the dbDeMC, miR2Disease, and HMDD V3.0 databases. Conclusions: The experimental results demonstrate that WBNPMD can accurately infer potential miRNA-disease associations. We anticipate that WBNPMD could serve as a powerful tool for discovering potential miRNA-disease associations.
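The two-step idea can be sketched with the classic unweighted bipartite network projection (resource allocation). The similarity-based transfer weights that define WBNPMD are not detailed in the abstract and are therefore left out; the function name and the toy association matrix are hypothetical.

```python
import numpy as np

def two_step_projection(A: np.ndarray, f: np.ndarray) -> np.ndarray:
    """Unweighted two-step resource allocation on a bipartite network.

    A : (n_mirna, n_disease) binary association matrix; every row and column
        is assumed to contain at least one association.
    f : initial resource on each miRNA.
    """
    k_mirna = A.sum(axis=1)              # degree of each miRNA
    k_disease = A.sum(axis=0)            # degree of each disease
    # Step 1: each miRNA splits its resource equally among its diseases.
    on_diseases = A.T @ (f / k_mirna)
    # Step 2: each disease splits what it received equally among its miRNAs.
    return A @ (on_diseases / k_disease)

# Toy network (hypothetical data): 4 miRNAs x 3 diseases.
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 0, 0]], dtype=float)
f = A[:, 0]                              # start from the miRNAs known for disease 0
scores = two_step_projection(A, f)
```

The total resource is conserved across the two steps, and the miRNA with no known link to disease 0 (index 2) still receives a positive score, which is what lets this kind of projection rank unobserved candidates.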


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5362 ◽  
Author(s):  
Luca Antognoli ◽  
Sara Moccia ◽  
Lucia Migliorelli ◽  
Sara Casaccia ◽  
Lorenzo Scalise ◽  
...  

Background: Heartbeat detection is a crucial step in several clinical fields. Laser Doppler Vibrometer (LDV) is a promising non-contact measurement for heartbeat detection. The aim of this work is to assess whether machine learning can be used for detecting heartbeat from the carotid LDV signal. Methods: The performances of Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and K-Nearest Neighbor (KNN) were compared using the leave-one-subject-out cross-validation as the testing protocol in an LDV dataset collected from 28 subjects. The classification was conducted on LDV signal windows, which were labeled as beat, if containing a beat, or no-beat, otherwise. The labeling procedure was performed using electrocardiography as the gold standard. Results: For the beat class, the f1-score (f1) values were 0.93, 0.93, 0.95, 0.96 for RF, DT, KNN and SVM, respectively. No statistical differences were found between the classifiers. When testing the SVM on the full-length (10 min long) LDV signals, to simulate a real-world application, we achieved a median macro-f1 of 0.76. Conclusions: Using machine learning for heartbeat detection from carotid LDV signals showed encouraging results, representing a promising step in the field of contactless cardiovascular signal analysis.
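The leave-one-subject-out protocol mentioned above can be sketched as a small split generator: all windows from one subject form the test set, and windows from every other subject form the training set. The function name and the toy subject labels are hypothetical.

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (train_indices, test_indices), holding out one subject per fold."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Subject label of each signal window (hypothetical data).
window_subjects = [1, 1, 2, 2, 2, 3]
folds = list(leave_one_subject_out(window_subjects))
```

Splitting by subject rather than by window keeps windows from the same person out of both sets at once, so the test score reflects performance on unseen subjects.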


Author(s):  
Sophia S ◽  
Rajamohana SP

Online shoppers today are technically knowledgeable and open to product reviews. They usually read buyer reviews and ratings before purchasing a product from an e-commerce website. Customer reviews are a vital source of information for understanding products and services: they help individuals buy the right products, help organizations make business decisions, and reveal the strengths and weaknesses of products. Spam reviews are written to falsely promote or demote particular target products or services, so detecting them has become critical for customers to make good purchasing decisions. A major problem in fake review detection is the high dimensionality of the feature space, which contains redundant, noisy and irrelevant features. Feature selection is therefore an essential step for reducing the dimensionality of the feature space and improving classification accuracy. To address this, deep learning techniques for feature selection are necessary. Classifiers such as Naive Bayes and K-Nearest Neighbor are then used for classification. This paper surveys the various techniques employed to distinguish false from genuine reviews.


2013 ◽  
Vol 765-767 ◽  
pp. 3099-3103 ◽  
Author(s):  
Ze Yue Wu ◽  
Yue Hui Chen

Protein subcellular localization is an important research field in bioinformatics. In this paper, we combine the increment of diversity algorithm with weighted K-nearest neighbor to predict the localization of proteins in SNL6, which has six subcellular localizations, and SNL9, which has nine subcellular localizations. We use the increment of diversity to extract the diversity finite coefficient as new protein features, and the basic classifier is weighted K-nearest neighbor. Prediction ability was evaluated by 5-jackknife cross-validation. The prediction accuracy is 83.3% for SNL6 and 87.6% for SNL9. A comparison with other methods indicates that the new approach is feasible and effective.


2016 ◽  
Vol 16 (01) ◽  
pp. 1640010 ◽  
Author(s):  
YING-TSANG LO ◽  
HAMIDO FUJITA ◽  
TUN-WEN PAI

Background: Coronary artery disease (CAD) is one of the most representative cardiovascular diseases. Early and accurate prediction of CAD based on physiological measurements can reduce the risk of heart attack through medicine therapy, healthy diet, and regular physical activity. Methods: Four heart disease datasets from the UC Irvine Machine Learning Repository were combined and re-examined to remove incomplete entries, and a total of 822 cases were utilized in this study. Seven machine learning methods, including Naïve Bayes, artificial neural networks (ANNs), sequential minimal optimization (SMO), k-nearest neighbor (KNN), AdaBoost, J48, and random forest, were adopted to analyze the collected datasets for CAD prediction. By combining co-expressed observations and an ensemble voting mechanism, we designed and evaluated a new medical decision classifier for CAD prediction. The TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) algorithm was applied to determine the best prediction method for CAD diagnosis. Results: Features of systolic blood pressure, cholesterol, heart rate, and ST depression are considered to be the most significant differences between patients with and without CADs. We show that the prediction capability of seven machine learning classifiers can be enhanced by integrating combinations of observed co-expressed features. Finally, compared to the use of any single classifier, the proposed voting mechanism achieved optimal performance according to TOPSIS.
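For readers unfamiliar with TOPSIS, the sketch below shows the standard procedure for ranking alternatives (here, classifiers) over several criteria. The criteria, weights, and numbers are invented for illustration only and do not come from the study; the function name `topsis` is likewise hypothetical.

```python
import numpy as np

def topsis(scores, weights, benefit):
    """Rank alternatives by closeness to the ideal solution.

    scores  : (n_alternatives, n_criteria) decision matrix
    weights : importance of each criterion
    benefit : True where larger is better, False where smaller is better
    Assumes the alternatives are not all identical (otherwise the ratio is undefined).
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Vector-normalise each criterion column, then apply normalised weights.
    v = scores / np.linalg.norm(scores, axis=0) * (weights / weights.sum())
    ideal_best = np.where(benefit, v.max(axis=0), v.min(axis=0))
    ideal_worst = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_best = np.linalg.norm(v - ideal_best, axis=1)
    d_worst = np.linalg.norm(v - ideal_worst, axis=1)
    return d_worst / (d_best + d_worst)     # 1 = ideal, 0 = anti-ideal

# Hypothetical classifiers scored on accuracy, sensitivity, and training time.
decision_matrix = [[0.90, 0.88, 12.0],
                   [0.85, 0.80, 30.0],
                   [0.80, 0.75, 45.0]]
closeness = topsis(decision_matrix, weights=[1, 1, 1], benefit=[True, True, False])
best = int(np.argmax(closeness))
```

In this toy matrix the first alternative is best on every criterion, so it coincides with the ideal solution and receives the highest closeness score.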


Author(s):  
Grassella Gunsyang ◽  
Ika Purnamasari ◽  
Fidia Deny Tisna Amijaya

The Neighbor Weighted K-Nearest Neighbor (NWKNN) algorithm is an extension of the K-Nearest Neighbor (KNN) algorithm that assigns a weight to each class to be classified. This study discusses classification using the NWKNN algorithm applied to premium payment status data. The aim is to determine the optimal exponent value (E) and neighborhood value (K), as well as the classification accuracy for premium payment status data at PT. Bumiputera, Samarinda City. The stages of the study are: determining E and K using k-fold cross validation, computing Euclidean distances, computing the weight and score of each class, taking the largest score to determine the classification result, and then computing the classification accuracy. The results show that the optimal values for classifying premium payment status at PT. Bumiputera, Samarinda City using NWKNN are K=3 and E=6, with an accuracy of 75%.
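The scoring steps listed in this abstract can be sketched as follows. The class weight uses the commonly cited NWKNN form, 1 / (class size / smallest class size)^(1/E). Converting Euclidean distance to a similarity as 1 / (1 + distance) is an assumption made here for illustration, as are the function name and the toy data; the study may use a different formulation.

```python
import numpy as np

def nwknn_predict(X_train, y_train, x, k, exponent):
    """Classify x with Neighbor Weighted KNN (a sketch under the stated assumptions)."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    classes, counts = np.unique(y_train, return_counts=True)
    # Class weights: the smallest class gets weight 1, larger classes are damped.
    class_weight = 1.0 / (counts / counts.min()) ** (1.0 / exponent)
    # Euclidean distance from x to every training sample.
    dist = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dist, kind="stable")[:k]
    similarity = 1.0 / (1.0 + dist[nearest])       # assumed distance-to-similarity mapping
    # Score of each class: class weight times the summed similarity of its neighbours.
    scores = np.array([
        w * similarity[y_train[nearest] == c].sum()
        for c, w in zip(classes, class_weight)
    ])
    return classes[np.argmax(scores)]

# Imbalanced toy data (hypothetical): three samples of class 0, one of class 1.
X_train = [[0.0], [0.1], [0.2], [5.0]]
y_train = [0, 0, 0, 1]
near_minority = nwknn_predict(X_train, y_train, [4.9], k=3, exponent=2)
near_majority = nwknn_predict(X_train, y_train, [0.05], k=3, exponent=2)
```

For the query at 4.9, two of the three nearest neighbours belong to the majority class, yet the single close minority-class neighbour wins once similarity and class weights are applied, which is the behaviour the class weighting is meant to encourage on imbalanced data.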

