Predicting microRNA–disease associations from lncRNA–microRNA interactions via Multiview Multitask Learning

Author(s):  
Yu-An Huang ◽  
Keith C C Chan ◽  
Zhu-Hong You ◽  
Pengwei Hu ◽  
Lei Wang ◽  
...  

Abstract. Motivation: Identifying microRNAs that are associated with different diseases as biomarkers is a problem of great medical significance. Existing computational methods for uncovering such microRNA–disease associations (MDAs) are mostly developed under the assumption that similar microRNAs tend to associate with similar diseases. Since this assumption is not always valid, these methods may not be applicable to all kinds of MDAs. Considering that the relationships between long noncoding RNAs (lncRNAs) and different diseases, and the co-regulation relationships between the biological functions of lncRNAs and microRNAs, have been established, we propose a multiview multitask method that uses known lncRNA–microRNA interactions to predict MDAs on a large scale. The investigation is performed in the absence of complete information about microRNAs and without any similarity measurement for them and, to the best of our knowledge, the work represents the first attempt to discover MDAs based on lncRNA–microRNA interactions. Results: We propose a deep learning model called MVMTMDA that creates a multiview representation of microRNAs. The model is trained with an end-to-end multitask learning approach so that missing data in the side information can be determined automatically. Experimental results show that the proposed model yields an average area under the ROC curve of 0.8410 ± 0.018, 0.8512 ± 0.012 and 0.8521 ± 0.008 when k is set to 2, 5 and 10, respectively. In addition, we propose a statistical approach to predicting lncRNA–disease associations based on these associations and the MDAs discovered using MVMTMDA. Availability: Python code and the datasets used in our studies are available at https://github.com/yahuang1991polyu/MVMTMDA/.
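
The reported figures have the form "average AUC ± spread" for several values of k. The sketch below shows one way such a number could be computed. It assumes that k denotes the number of cross-validation folds (the abstract does not say so explicitly) and that the spread is a sample standard deviation. The function names `roc_auc` and `summarize_folds` are hypothetical and are not taken from the MVMTMDA repository.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    AUC equals the probability that a randomly chosen positive example
    is scored higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Return (mean, sample standard deviation) of per-fold AUC values."""
    return mean(fold_aucs), stdev(fold_aucs)
```

In use, `roc_auc` would be called once per held-out fold and the resulting list passed to `summarize_folds`. The quadratic pairwise loop is chosen for readability; a rank-based formulation would scale better on large test sets.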

Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 79 ◽  
Author(s):  
Xiaoyu Han ◽  
Yue Zhang ◽  
Wenkai Zhang ◽  
Tinglei Huang

Relation extraction is a vital task in natural language processing. It aims to identify the relationship between two specified entities in a sentence. Besides the information contained in the sentence, additional information about the entities has been shown to be helpful in relation extraction. However, additional information such as the entity type obtained by NER (Named Entity Recognition) and the description provided by a knowledge base both have their limitations. There is another way to provide additional information that can overcome these limitations in Chinese relation extraction. Because Chinese characters usually have explicit meanings and can carry more information than English letters, we suggest that the characters that constitute the entities can provide additional information that is helpful for the relation extraction task, especially in large-scale datasets. This assumption has never been verified before; the main obstacle is the lack of large-scale Chinese relation datasets. In this paper, first, we generate a large-scale Chinese relation extraction dataset based on a Chinese encyclopedia. Second, we propose an attention-based model using the characters that compose the entities. The results on the generated dataset show that these characters can provide useful information for the Chinese relation extraction task. By using this information, the attention mechanism can recognize the crucial part of the sentence that expresses the relation. The proposed model outperforms other baseline models on our Chinese relation extraction dataset.
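
To make the idea of "attention driven by entity characters" concrete, here is a minimal sketch. It is not the authors' architecture: it assumes that the entity characters are averaged into a single query vector and that sentence tokens are scored against it by a dot product followed by a softmax. The function `entity_char_attention` and the toy vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def entity_char_attention(token_vecs, entity_char_vecs):
    """Weight sentence tokens by similarity to the mean entity-character vector.

    Returns (attention weights, attention-weighted sentence vector).
    """
    dim = len(entity_char_vecs[0])
    count = len(entity_char_vecs)
    # Assumption: the query is the plain average of the character embeddings.
    query = [sum(v[i] for v in entity_char_vecs) / count for i in range(dim)]
    weights = softmax([dot(query, t) for t in token_vecs])
    context = [sum(w * t[i] for w, t in zip(weights, token_vecs))
               for i in range(dim)]
    return weights, context
```

Tokens whose embeddings align with the entity characters receive larger weights, which is the sense in which attention can highlight the part of the sentence that expresses the relation.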


2019 ◽  
Vol 29 (11n12) ◽  
pp. 1727-1740 ◽  
Author(s):  
Hongming Zhu ◽  
Yi Luo ◽  
Qin Liu ◽  
Hongfei Fan ◽  
Tianyou Song ◽  
...  

Multistep flow prediction is an essential task for car-sharing systems. An accurate flow prediction model can help system operators pre-allocate cars to meet user demand. However, this task is challenging due to the complex spatial and temporal relations among stations. Existing works considered only temporal relations (e.g. using LSTM) or spatial relations (e.g. using CNN) independently. In this paper, we propose an attention to multi-graph convolutional sequence-to-sequence model (AMGC-Seq2Seq), which is a novel deep learning model for multistep flow prediction. The proposed model uses the encoder–decoder architecture, where, in the encoder part, spatial and temporal relations are encoded simultaneously. The encoded information is then passed to the decoder to generate multistep outputs. In this work, multiple graphs are constructed to reflect spatial relations from different aspects, and we model them using the proposed multi-graph convolution. An attention mechanism is also used to capture the important relations from previous information. Experiments on a large-scale real-world car-sharing dataset demonstrate the effectiveness of our approach over state-of-the-art methods.


2022 ◽  
Vol 31 (1) ◽  
pp. 1-37
Author(s):  
Chao Liu ◽  
Xin Xia ◽  
David Lo ◽  
Zhiwe Liu ◽  
Ahmed E. Hassan ◽  
...  

To accelerate software development, developers frequently search for and reuse existing code snippets from a large-scale codebase, e.g., GitHub. Over the years, researchers have proposed many information retrieval (IR)-based models for code search, but they fail to bridge the semantic gap between query and code. An early successful deep learning (DL)-based model, DeepCS, addressed this issue by learning the relationship between pairs of code methods and corresponding natural language descriptions. Two major advantages of DeepCS are the capability of understanding irrelevant/noisy keywords and capturing sequential relationships between words in query and code. In this article, we propose an IR-based model, CodeMatcher, that inherits the advantages of DeepCS (i.e., the capability of understanding the sequential semantics in important query words), while leveraging the indexing technique of IR-based models to substantially reduce the search response time. CodeMatcher first collects metadata for query words to identify irrelevant/noisy ones, then iteratively performs fuzzy search with important query words on the codebase that is indexed by the Elasticsearch tool, and finally reranks a set of returned candidate code according to how the tokens in the candidate code snippet sequentially match the important words in a query. We verified its effectiveness on a large-scale codebase with ~41K repositories. Experimental results showed that CodeMatcher achieves an MRR (a widely used accuracy measure for code search) of 0.60, outperforming DeepCS, CodeHow, and UNIF by 82%, 62%, and 46%, respectively. Our proposed model is over 1.2K times faster than DeepCS. Moreover, CodeMatcher outperforms two existing online search engines (GitHub and Google search) by 46% and 33%, respectively, in terms of MRR.
We also observed that: fusing the advantages of IR-based and DL-based models is promising; improving the quality of method naming helps code search, since method name plays an important role in connecting query and code.
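
The abstract reports results in terms of MRR (mean reciprocal rank). As a reminder of how that measure is defined, the following sketch computes it from the 1-based rank of the first relevant result for each query. The function name and the convention that `None` marks a query with no relevant result are assumptions made for illustration, not details taken from the CodeMatcher evaluation scripts.

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """Mean reciprocal rank over a list of queries.

    Each entry is the 1-based rank of the first relevant result for one
    query, or None when no relevant result was returned (contributes 0).
    """
    if not first_relevant_ranks:
        raise ValueError("need at least one query")
    total = sum(1.0 / r for r in first_relevant_ranks if r is not None)
    return total / len(first_relevant_ranks)


# Four hypothetical queries: hits at ranks 1, 2 and 4, and one miss.
print(mean_reciprocal_rank([1, 2, None, 4]))  # 0.4375
```

An MRR of 0.60 therefore means that, averaged over queries, the reciprocal of the first relevant result's rank is 0.60, which places that result near the top of the list for a typical query.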


Author(s):  
S Nagaraju ◽  
B. Prabhakara Reddy

Mental stress is harmful to human health, and stress that is not detected in time for proactive care can damage mental health. With the popularity of social media, people routinely share their everyday activities and interact with friends on social media platforms, making it feasible to use online social network data for stress detection. We find that a user's stress state is closely associated with that of his/her friends on social media, and we employ a large-scale dataset from real-world social platforms to systematically study the relationship between users' stress states and social interactions. We first define a set of stress-related comments, images, and social attributes from various aspects, and then propose a model. Experimental results indicate that the proposed model can improve detection performance. Finally, we build a website where users can check their stress level and other related activities.


2012 ◽  
Vol 15 (4) ◽  
pp. 379-392
Author(s):  
Tzu-Ping Lo ◽  
Sy-Jye Guo ◽  
Chin-Te Chen

Understanding the maintenance cost distribution and predicting its future tendency are important for facility managers to allocate a limited budget efficiently. This paper collects 16,228 maintenance records of a representative hospital in Taiwan and analyzes the cost distribution. In addition, by calculating the maintenance cost per square meter of floor area per year (dollar/m²/year) and comparing it with previous studies, this paper points out the relationship between maintenance cost and building operating age. Moreover, this paper establishes a hybrid grey model, termed EGM(1,1), which adopts an exponential series to identify the residual error series resulting from the grey model, to predict the maintenance cost. The repair cost of the hospital building from 1998 to 2006 is adopted to demonstrate the applicability and practicability of EGM(1,1). Results show that the proposed model can predict the tendency precisely.
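
For readers unfamiliar with grey models, the sketch below implements the standard GM(1,1) model on which the hybrid is built. It does not reproduce the paper's EGM(1,1): the exponential correction of the residual series is omitted. The function names and the sample cost figures are hypothetical.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) by least squares."""
    if len(x0) < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 points")
    # Accumulated series x1(k) = x0(0) + ... + x0(k).
    x1 = []
    running = 0.0
    for v in x0:
        running += v
        x1.append(running)
    # Background values z(k): mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = list(x0[1:])
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    # Fit y = slope * z + b; the grey equation x0(k) + a*z(k) = b gives a = -slope.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def gm11_series(x0, a, b, steps):
    """Fitted values for the observed points followed by `steps` forecasts."""
    if a == 0:
        raise ValueError("a == 0: the exponential time response is undefined")
    c = b / a

    def x1_hat(k):
        # Time response of the accumulated series.
        return (x0[0] - c) * math.exp(-a * k) + c

    n = len(x0) + steps
    # Differencing the accumulated series recovers the original scale.
    return [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, n)]
```

With a hypothetical yearly cost series such as `[100.0, 110.0, 121.0, 133.1]`, `gm11_fit` followed by `gm11_series(costs, a, b, 1)` returns the four fitted values plus a one-step-ahead forecast. A hybrid such as the one described would then model the residuals between the observed and fitted values.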


2020 ◽  
Vol 27 (1) ◽  
pp. 11-22
Author(s):  
Yiyuan Han ◽  
Bing Han ◽  
Zejun Hu ◽  
Xinbo Gao ◽  
Lixia Zhang ◽  
...  

Abstract. The auroral oval boundary represents an important physical process with implications for the ionosphere and magnetosphere. In this paper, an automatic auroral oval boundary prediction method based on deep learning is applied to study the variation of the auroral oval boundary associated with different space physical parameters. We construct an auroral oval boundary dataset to train our proposed model, which consists of 184 416 auroral oval boundary points extracted from 3842 images captured by the Ultraviolet Imager (UVI) of the Polar satellite and the corresponding 18 space physical parameters selected from the OMNI dataset from December 1996 to March 1997. Furthermore, several statistical experiments and correlation analysis experiments are performed based on our dataset to explore the relationship between space physical parameters and the location of the auroral oval boundary. The experimental results show that the prediction model based on the deep learning method can estimate the auroral oval boundary efficiently, and that different space physical parameters have different effects on the auroral oval boundary, especially the interplanetary magnetic field (IMF), geomagnetic indexes, and solar wind parameters.


2017 ◽  
Vol 2017 ◽  
pp. 1-6 ◽  
Author(s):  
Yu Sun ◽  
Yuan Liu ◽  
Guan Wang ◽  
Haiyan Zhang

Plant image identification has become an interdisciplinary focus in both botanical taxonomy and computer vision. The first plant image dataset collected by mobile phone in natural scenes is presented, which contains 10,000 images of 100 ornamental plant species on the Beijing Forestry University campus. A 26-layer deep learning model consisting of 8 residual building blocks is designed for large-scale plant classification in natural environments. The proposed model achieves a recognition rate of 91.78% on the BJFU100 dataset, demonstrating that deep learning is a promising technology for smart forestry.


IoT ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 428-448
Author(s):  
Imtiaz Ullah ◽  
Ayaz Ullah ◽  
Mazhar Sajjad

The tremendous number of Internet of Things (IoT) applications, with their ubiquity, has provided us with unprecedented productivity and simplified our daily life. At the same time, the insecurity of these technologies ensures that our daily lives are surrounded by vulnerable computers, allowing for the launch of multiple attacks via large-scale botnets through the IoT. These attacks have been successful in achieving their heinous objectives. A strong identification strategy is essential to keep devices secured. This paper proposes and implements a model for anomaly-based intrusion detection in IoT networks that uses a convolutional neural network (CNN) and gated recurrent unit (GRU) to detect and classify binary and multiclass IoT network data. The proposed model is validated using the BoT-IoT, IoT Network Intrusion, MQTT-IoT-IDS2020, and IoT-23 intrusion detection datasets. Our proposed binary and multiclass classification model achieved an exceptionally high level of accuracy, precision, recall, and F1 score.
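
The abstract summarizes performance with accuracy, precision, recall, and F1 score. The sketch below spells out how these four measures are derived from a binary confusion matrix. The function name `binary_metrics` and the sample labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not one stated in the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For multiclass intrusion data, the same function can be applied once per class with `positive` set to that class label (a one-vs-rest view), and the per-class values averaged.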


2021 ◽  
Vol 15 ◽  
Author(s):  
Ming Yang ◽  
Menglin Cao ◽  
Yuhao Chen ◽  
Yanni Chen ◽  
Geng Fan ◽  
...  

Goal: Brain functional networks (BFNs) constructed using resting-state functional magnetic resonance imaging (rs-fMRI) have proven to be an effective way to understand aberrant functional connectivity in autism spectrum disorder (ASD) patients. It is still challenging to utilize these features as potential biomarkers for discrimination of ASD. The purpose of this work is to classify ASD and normal controls (NCs) using BFNs derived from rs-fMRI. Methods: A deep learning framework was proposed that integrated a convolutional neural network (CNN) and a channel-wise attention mechanism to model both intra- and inter-BFN associations simultaneously for ASD diagnosis. We investigated the effects of each BFN on performance and performed inter-network connectivity analysis between each pair of BFNs. We compared the performance of our CNN model with some state-of-the-art algorithms using functional connectivity features. Results: We collected 79 ASD patients and 105 NCs from the ABIDE-I dataset. The mean accuracy of our classification algorithm was 77.74% for classification of ASD versus NCs. Conclusion: The proposed model is able to integrate information from multiple BFNs to improve detection accuracy of ASD. Significance: These findings suggest that large-scale BFNs are promising as reliable biomarkers for diagnosis of ASD.


2019 ◽  
Author(s):  
Yiyuan Han ◽  
Bing Han ◽  
Zejun Hu ◽  
Xinbo Gao ◽  
Lixia Zhang ◽  
...  

Abstract. The auroral oval boundary represents an important physical process with implications for the ionosphere and magnetosphere. In this paper, an automatic auroral oval boundary prediction method based on deep learning is applied to study the variation of the auroral oval boundary associated with different space physical parameters. We construct an auroral oval boundary dataset to train our proposed model, which consists of 184416 auroral oval boundary points extracted from 3842 UVI images captured by the Ultraviolet Imager of the Polar satellite and the corresponding 18 space physical parameters selected from the OMNI dataset from December 1996 to March 1997. Furthermore, several statistical experiments and correlation analysis experiments are performed based on our dataset to explore the relationship between space physical parameters and the location of the auroral oval boundary. The experimental results show that the prediction model based on the deep learning method can estimate the auroral oval boundary efficiently, and that different space physical parameters have different effects on the auroral oval boundary, especially the interplanetary magnetic field (IMF), geomagnetic indexes, and solar wind parameters.

