Predicting virus-host association by Kernelized logistic matrix factorization and similarity network fusion

2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Dan Liu ◽  
Yingjun Ma ◽  
Xingpeng Jiang ◽  
Tingting He

Abstract. Background: Viruses are closely related to bacteria and human diseases. Predicting associations between viruses and hosts is of great significance for understanding the dynamics and complex functional networks of microbial communities. With the rapid development of metagenomic sequencing, methods based on sequence similarity and genomic homology have been used to predict virus-host associations. However, these methods ignore the known virus-host association network. Results: We propose a kernelized logistic matrix factorization method that integrates different sources of information (ILMF-VH) to predict potential virus-host associations on a heterogeneous network, which is constructed by connecting a virus network with a host network based on known virus-host associations. The virus network is constructed from an oligonucleotide frequency measure, and the host network is constructed by integrating oligonucleotide frequency similarity and Gaussian interaction profile kernel similarity through similarity network fusion. The host prediction accuracy of our method is better than that of other methods. In addition, case studies show that the host of crAssphage predicted by ILMF-VH is consistent with the presumed host in previous studies, and another potential host, Escherichia coli, is also predicted. Conclusions: The proposed model is an effective computational tool for predicting interactions between viruses and hosts, and it has great potential for discovering novel hosts of viruses.
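
The abstract names a Gaussian interaction profile (GIP) kernel but does not give its formula. The sketch below uses the commonly cited definition, K(i, j) = exp(-gamma * ||y_i - y_j||^2), with gamma scaled by the mean squared norm of the profiles. The function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix `hosts` are assumptions made for illustration and are not taken from the paper.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a 0/1 association matrix.

    profiles[i][k] == 1 means host i has a known association with virus k
    (hypothetical layout chosen for this sketch).
    """
    n = len(profiles)
    # Bandwidth normalisation: divide by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("no known associations: kernel bandwidth is undefined")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel

# Toy example: hosts 0 and 1 share the same viruses, host 2 shares none with them.
hosts = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
K = gip_kernel(hosts)
```

Hosts with identical interaction profiles receive similarity 1, and the similarity decays as the profiles diverge. How the paper then fuses this kernel with the oligonucleotide frequency similarity is not shown here.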

2020 ◽  
pp. 1-17
Author(s):  
Dongqi Yang ◽  
Wenyu Zhang ◽  
Xin Wu ◽  
Jose H. Ablanedo-Rosas ◽  
Lingxiao Yang ◽  
...  

With the rapid development of commercial credit mechanisms, credit funds have become fundamental in promoting the development of manufacturing corporations. However, large-scale, imbalanced credit application information poses a challenge to accurate bankruptcy prediction. A novel multi-stage ensemble model with fuzzy clustering and optimized classifier composition is proposed herein to improve the accuracy of corporate bankruptcy prediction. It combines a fuzzy clustering-based classifier selection method, a random subspace (RS)-based classifier composition method, and a genetic algorithm (GA)-based classifier compositional optimization method. To overcome the inherent inflexibility of traditional hard clustering methods, a new fuzzy clustering-based classifier selection method is proposed based on the mini-batch k-means algorithm to obtain the best-performing base classifiers for generating classifier compositions. The RS-based classifier composition method was applied to enhance the robustness of candidate classifier compositions by randomly selecting several subspaces of the original feature space. The GA-based classifier compositional optimization method was applied to optimize the parameters of the promising classifier composition through the iterative mechanism of the GA. Finally, six real-world datasets were tested with four evaluation indicators to assess the performance of the proposed model. The experimental results showed that the proposed model outperformed the benchmark models, with higher predictive accuracy and efficiency.
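
The random subspace idea mentioned above can be illustrated with a minimal sketch: each base classifier is trained on a randomly chosen subset of the features, and the ensemble predicts by majority vote. The abstract does not say which base classifiers are used, so the nearest-centroid classifier here is a stand-in, and all function names, parameters, and data are hypothetical.

```python
import random
from collections import Counter

def sample_subspaces(n_features, n_subspaces, subspace_size, seed=0):
    """Draw random feature-index subsets (the 'subspaces')."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]

def fit_centroids(X, y, features):
    """Stand-in base classifier: per-class mean over the selected features."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_one(centroids, features, row):
    """Return the class whose centroid is nearest in the selected subspace."""
    def sq_dist(centroid):
        return sum((row[f] - centroid[k]) ** 2 for k, f in enumerate(features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def random_subspace_predict(X, y, row, n_subspaces=5, subspace_size=2, seed=0):
    """Train one base classifier per subspace and take a majority vote."""
    votes = Counter()
    for features in sample_subspaces(len(X[0]), n_subspaces, subspace_size, seed):
        centroids = fit_centroids(X, y, features)
        votes[predict_one(centroids, features, row)] += 1
    return votes.most_common(1)[0][0]

# Toy data (made up): class 0 clusters near 0, class 1 clusters near 1.
X = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [1.0, 0.9, 1.0], [0.9, 1.0, 0.9]]
y = [0, 0, 1, 1]
label = random_subspace_predict(X, y, [0.95, 0.95, 0.95])
```

The fuzzy clustering-based selection and GA-based optimization stages described in the abstract are not covered by this sketch.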


Author(s):  
Junshu Wang ◽  
Guoming Zhang ◽  
Wei Wang ◽  
Ka Zhang ◽  
Yehua Sheng

Abstract. With the rapid development of hospital informatization and Internet medical services in recent years, most hospitals have launched online appointment registration systems to eliminate patient queues and improve the efficiency of medical services. However, most patients lack professional medical knowledge and do not know which department to choose when registering. To guide patients in seeking medical care and registering effectively, we propose CIDRS, an intelligent self-diagnosis and department recommendation framework based on Chinese medical Bidirectional Encoder Representations from Transformers (BERT) in a cloud computing environment. We also established a Chinese BERT model (CHMBERT) trained on a large-scale Chinese medical text corpus. This model was used to optimize the self-diagnosis and department recommendation tasks. To address the limited computing power of terminals, we deployed the proposed framework in a cloud computing environment based on container and micro-service technologies. Real-world medical datasets from hospitals were used in the experiments, and the results showed that the proposed model outperformed traditional deep learning models and other pre-trained language models.


2011 ◽  
Vol 356-360 ◽  
pp. 1516-1519 ◽  
Author(s):  
Jiang Wu ◽  
Jin Hong Zhang ◽  
Wei Feng Xu ◽  
Yu Ran Cai ◽  
Yuan Huang Ouyang ◽  
...  

With the rapid development of the electric power industry in China, pollutant discharge limits are becoming stricter, and wet flue gas desulfurization (WFGD) technology and equipment have been greatly developed and extensively applied. The gas-gas heater (GGH) is adopted at many power stations. Corrosion is a very important issue in the GGH, and it is studied experimentally in this paper. Using polarization curve experiments and the control variable method, a comparative analysis of the corrosion resistance of materials commonly used in the GGH is made. The results show that the corrosion resistance of the heat transfer surface in the GGH decreases as the temperature increases, that the corrosion resistance and stability of enamel steel are strikingly better than those of the other materials, and that the gain in corrosion resistance from enamel plating is remarkable.


2010 ◽  
Vol 37-38 ◽  
pp. 116-121
Author(s):  
Yu Lan Li ◽  
Bo Li ◽  
Su Jun Luo

In facility layout decisions, the prevailing design principle has been to minimize material handling costs. The objective of these earlier models considers only the cost of loaded trips and ignores the cost of empty vehicle trips, which does not meet actual demand. In this paper, the unequal-sized unidirectional loop layout problem is analyzed, and the facility layout model is improved. The objective of the new model is to minimize the total cost of loaded and empty vehicle trips. To solve this model, a heuristic algorithm based on partheno-genetic algorithms is designed. Finally, an unequal-sized unidirectional loop layout problem with 12 devices is simulated. A comparison shows that the result obtained using the proposed model is 20.4% better than that obtained using the original model.


2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Wen-Jun Li ◽  
Qiang Dong ◽  
Yan Fu

With the rapid development of the mobile Internet and smart devices, more and more online content providers have begun to collect the preferences of their customers through various apps on mobile devices. These preferences are largely reflected by the explicit rating scores given to online items. Both positive and negative ratings help recommender systems provide relevant items to a target user. Based on an empirical analysis of three real-world movie-rating data sets, we observe that users' rating criteria change over time, and that past positive and negative ratings have different influences on users' future preferences. Given this, we propose a recommendation model on a session-based temporal graph that considers the difference between long- and short-term preferences and the different temporal effects of positive and negative ratings. Extensive experimental results validate the significant accuracy improvement of our proposed model over state-of-the-art methods.


2012 ◽  
Vol 25 (3) ◽  
pp. 235-243 ◽  
Author(s):  
Rashmi Deka ◽  
Soma Chakraborty ◽  
Sekhar Roy

Spectrum is becoming scarce due to the rise in the number of users and the rapid development of the wireless environment. Cognitive radio (CR) is an intelligent radio system that uses its built-in technology to make vacant spectrum holes available to another service provider. In this paper, a genetic algorithm (GA) is used to find the best possible allocation of available spectrum to the cognitive radio. For spectrum reuse, two criteria have to be fulfilled: 1) the probability of detection has to be maximized, and 2) the probability of false alarm has to be minimized. It is found that the optimized result obtained with the genetic algorithm is better than the result obtained without it. The secondary user must vacate the spectrum in use when licensed users demand it, so the cognitive radio must detect primary users accurately. Here, the bit error rate (BER) is minimized using the GA for better spectrum sensing.


1991 ◽  
Vol 57 (1) ◽  
pp. 83-91 ◽  
Author(s):  
Norman Kaplan ◽  
Richard R. Hudson ◽  
Masaru Iizuka

Summary. A population genetic model with a single locus at which balancing selection acts and many linked loci at which neutral mutations can occur is analysed using the coalescent approach. The model incorporates geographic subdivision with migration, as well as mutation, recombination, and genetic drift of neutral variation. It is found that geographic subdivision can affect genetic variation even with high rates of migration, provided that selection is strong enough to maintain different allele frequencies at the selected locus. Published sequence data from the alcohol dehydrogenase locus of Drosophila melanogaster are found to fit the proposed model slightly better than a similar model without subdivision.


2008 ◽  
Vol 11 (1) ◽  
pp. 159-171 ◽  
Author(s):  
Itziar Etxebarria ◽  
Pedro Apodaca

The purpose of the study was to confirm a model which proposed two basic dimensions in the subjective experience of guilt, one anxious-aggressive and the other empathic, as well as another dimension associated with but not intrinsic to it, namely, the associated negative emotions dimension. Participants were 360 adolescents, young adults and adults of both sexes. They were asked to relate one of the situations that most frequently caused them to experience feelings of guilt and to rate, on a 7-point scale, its intensity and that of 9 other emotions that they may have experienced, to a greater or lesser extent, at the same time. The proposed model was shown to fit the data adequately and to be better than alternative nested models. This result supports the views of both Freud and Hoffman regarding the nature of guilt, which are contradictory only at first glance.


Author(s):  
J. Y. Sun ◽  
G. Z. Wang ◽  
G. J. He ◽  
D. C. Pu ◽  
W. Jiang ◽  
...  

Abstract. The surface water system is an important part of the global ecosystem, and changes in surface water may lead to disasters such as drought, waterlogging, and water-borne diseases. The rapid development of remote sensing technology has supplied better strategies for water body extraction and further monitoring. In this study, AdaBoost and Random Forest (RF), two typical ensemble learning algorithms, were applied to extract water bodies in the Chaozhou area (mainly located in Guangdong Province, China) based on GF-1 data, and the Decision Tree (DT) was used in comparative tests to comprehensively evaluate the performance of these classification algorithms for surface water body extraction. The results showed that: (1) Compared with visual interpretation, AdaBoost performed better than RF in the extraction of several typical water bodies, such as rivers, lakes and ponds. Moreover, the water extraction results of the strong classifiers using AdaBoost or RF were better than those of the weak base classifiers. (2) For the quantitative accuracy statistics, the overall accuracy (96.5%) and kappa coefficient (93%) of AdaBoost exceeded those of RF by 5.3% and 10.6%, respectively. The classification time of AdaBoost was 403 seconds and 918 seconds longer than that of the RF and DT methods, respectively. Nevertheless, considering visual interpretation, quantitative accuracy and classification time together, the AdaBoost algorithm was more suitable for water body extraction. (3) For the sample proportion comparison experiment of AdaBoost, four sampling proportions (0.1%, 0.2%, 1% and 2%) were chosen, and the 0.1% sampling proportion reached the optimum classification accuracy (93.9%) and kappa coefficient (87.8%).
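
Overall accuracy and the kappa coefficient, the two quantitative measures reported above, are both derived from a confusion matrix. The following sketch shows the standard definitions (observed agreement, and agreement corrected for chance). The function name and the counts in `cm` are invented for illustration and are not the study's data.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa).

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("empty confusion matrix")
    k = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical water / non-water counts (rows: reference, columns: prediction).
cm = [[45, 5],
      [5, 45]]
overall_accuracy, kappa = accuracy_and_kappa(cm)
```

For these made-up counts the overall accuracy is about 0.9 and kappa is about 0.8. Kappa is lower because half of the agreement would be expected by chance with balanced classes.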


2020 ◽  
Vol 36 (4) ◽  
pp. 305-323
Author(s):  
Quan Hoang Nguyen ◽  
Ly Vu ◽  
Quang Uy Nguyen

Sentiment classification (SC) aims to determine whether a document conveys a positive or negative opinion. Due to the rapid development of the digital world, SC has become an important research topic that affects many aspects of our lives. In SC based on machine learning, the representation of the document strongly influences accuracy. Word Embedding (WE)-based techniques, e.g., Word2vec, have proved beneficial for the SC problem. However, Word2vec is often not enough to represent the semantics of Vietnamese documents with complex sentences. In this paper, we propose a new representation learning model called a two-channel vector to learn a higher-level feature of a document in SC. Our model uses two neural networks to learn the semantic feature, i.e., Word2vec, and the syntactic feature, i.e., the Part of Speech (POS) tag. The two features are then combined and input to a Softmax function to make the final classification. We carry out intensive experiments on four recent Vietnamese sentiment datasets to evaluate the performance of the proposed architecture. The experimental results demonstrate that the proposed model can significantly enhance the accuracy of SC compared to the two single models and a state-of-the-art ensemble method.
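
The final step described above, combining two feature vectors and passing them to a Softmax function, might look roughly like the sketch below. The abstract does not say how the features are combined. Concatenation followed by a single linear layer is an assumption here, and the vectors, weights `W`, and biases `b` are made-up values rather than learned parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_two_channel(semantic_vec, syntactic_vec, weights, biases):
    """Concatenate the two channels, apply one linear layer, then softmax.

    weights has one row per class; each row has one entry per concatenated feature.
    """
    features = semantic_vec + syntactic_vec  # list concatenation (assumed combination)
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

# Hypothetical document features: 2 semantic dimensions, 1 syntactic dimension.
semantic = [0.2, -0.1]
syntactic = [0.5]
W = [[1.0, 0.0, 1.0],     # class 0 (e.g. positive)
     [-1.0, 0.0, -1.0]]   # class 1 (e.g. negative)
b = [0.0, 0.0]
probs = classify_two_channel(semantic, syntactic, W, b)
```

The output is a probability for each class that sums to 1, and the predicted label is the class with the largest probability. In the paper the two channels are learned by neural networks, which this sketch does not attempt to reproduce.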

