A Unified Bayesian Model for Generalized Community Detection in Attribute Networks

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Qiang Tian ◽  
Wenjun Wang ◽  
Yingjie Xie ◽  
Huaming Wu ◽  
Pengfei Jiao ◽  
...  

Identification of community structures and the underlying semantic characteristics of communities are essential tasks in complex network analysis. However, most methods proposed so far are typically only applicable to assortative community structures, that is, more links within communities and fewer links between different communities, which ignores the rich diversity of community regularities in real networks. In addition, the node attributes that provide rich semantic information about communities and networks can facilitate in-depth community detection of structural information. In this paper, we propose a novel unified Bayesian generative model to detect generalized communities and provide semantic descriptions simultaneously by combining network topology and node attributes. The proposed model is composed of two parts closely correlated through a transition matrix; we first apply the concept of a mixture model to describe network regularities and then adapt the classic Latent Dirichlet Allocation (LDA) topic model to identify communities semantically. Thus, the model can detect broad types of network structure regularities, including assortative structures, disassortative structures, and mixture structures, and provide multiple semantic descriptions for the communities. To optimize the objective function of the model, we use an effective Gibbs sampling algorithm. Experiments on a number of synthetic and real networks show that our model has superior performance compared with some baselines on community detection.
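To make the distinction between assortative, disassortative, and mixture structures concrete, the following is a minimal Python sketch, not the paper's Bayesian model. It assumes community labels are already known and simply compares within-community and between-community link densities; the function names `block_densities` and `structure_type` and the toy graphs are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def block_densities(edges, labels):
    """Return {(a, b): link density} for every unordered community pair a <= b."""
    sizes = Counter(labels.values())
    counts = Counter()
    for u, v in edges:
        a, b = sorted((labels[u], labels[v]))
        counts[(a, b)] += 1
    densities = {}
    for a, b in combinations_with_replacement(sorted(sizes), 2):
        if a == b:
            possible = sizes[a] * (sizes[a] - 1) // 2  # node pairs inside one community
        else:
            possible = sizes[a] * sizes[b]  # node pairs across two communities
        densities[(a, b)] = counts[(a, b)] / possible if possible else 0.0
    return densities


def structure_type(densities):
    """Classify the structure by comparing within- and between-community densities."""
    within = [d for (a, b), d in densities.items() if a == b]
    between = [d for (a, b), d in densities.items() if a != b]
    if not between:
        raise ValueError("need at least two communities")
    if min(within) > max(between):
        return "assortative"
    if max(within) < min(between):
        return "disassortative"
    return "mixture"


# Toy data (assumed for illustration): two groups of three nodes each.
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
assortative_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
disassortative_edges = [(0, 3), (0, 4), (1, 4), (1, 5), (2, 3), (2, 5)]

print(structure_type(block_densities(assortative_edges, labels)))     # assortative
print(structure_type(block_densities(disassortative_edges, labels)))  # disassortative
```

A generative model such as the one described would infer both the labels and the block densities from data; this sketch only shows what the three regimes mean once the labels are fixed.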

2019 ◽  
Vol 3 (3) ◽  
pp. 165-186 ◽  
Author(s):  
Chenliang Li ◽  
Shiqian Chen ◽  
Yan Qi

Filtering out irrelevant documents and classifying the relevant ones into topical categories is a de facto task in many applications. However, supervised learning solutions require extensive human effort for document labeling. In this paper, we propose a novel seed-guided topic model for dataless short text classification and filtering, named SSCF. Without using any labeled documents, SSCF takes a few “seed words” for each category of interest, and conducts short text filtering and classification in a weakly supervised manner. To overcome the issues of data sparsity and imbalance, the short text collection is mapped to a collection of pseudo-documents, one for each word. SSCF infers two kinds of topics on pseudo-documents: category-topics and general-topics. Each category-topic is associated with one category of interest, covering the meaning of the latter. In SSCF, we devise a novel word relevance estimation process based on the seed words, for hidden topic inference. The dominating topic of a short text is identified through post inference and then used for filtering and classification. On two real-world datasets in two languages, experimental results show that our proposed SSCF consistently achieves better classification accuracy than state-of-the-art baselines. We also observe that SSCF can even achieve better performance than the supervised classifiers supervised latent Dirichlet allocation (sLDA) and support vector machine (SVM) on some testing tasks.
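The mapping from short texts to one pseudo-document per word can be sketched as follows. The abstract does not say how the pseudo-documents are constructed, so this Python example assumes one plausible scheme: the pseudo-document of a word collects every other word that co-occurs with it in any short text. The function name `build_pseudo_documents` and the sample texts are hypothetical.

```python
from collections import defaultdict


def build_pseudo_documents(short_texts):
    """Map each word to the list of words it co-occurs with (assumed scheme)."""
    pseudo = defaultdict(list)
    for text in short_texts:
        tokens = text.lower().split()
        for i, word in enumerate(tokens):
            # Context of this occurrence: every other token in the same short text.
            pseudo[word].extend(tokens[:i] + tokens[i + 1:])
    return dict(pseudo)


texts = ["cheap flight deals", "flight delayed again", "cheap hotel deals"]
pseudo_docs = build_pseudo_documents(texts)
print(pseudo_docs["flight"])  # ['cheap', 'deals', 'delayed', 'again']
```

Under this assumption, each pseudo-document is longer than any single short text, which is how such a mapping would ease the sparsity problem the abstract mentions.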


2020 ◽  
Author(s):  
Kai Zhang ◽  
Yuan Zhou ◽  
Zheng Chen ◽  
Yufei Liu ◽  
Zhuo Tang ◽  
...  

The prevalence of short texts on the Web has made mining the latent topic structures of short texts a critical and fundamental task for many applications. However, due to the lack of word co-occurrence information induced by the content sparsity of short texts, it is challenging for traditional topic models like latent Dirichlet allocation (LDA) to extract coherent topic structures on short texts. Incorporating external semantic knowledge into the topic modeling process is an effective strategy to improve the coherence of inferred topics. In this paper, we develop a novel topic model, called the biterm correlation knowledge-based topic model (BCK-TM), to infer latent topics from short texts. Specifically, the proposed model mines biterm correlation knowledge automatically based on recent progress in word embedding, which can represent semantic information of words in a continuous vector space. To incorporate external knowledge, a knowledge incorporation mechanism is designed over the latent topic layer to regularize the topic assignment of each biterm during the topic sampling process. Experimental results on three public benchmark datasets illustrate the superior performance of the proposed approach over several state-of-the-art baseline models.
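The abstract does not define a biterm. In the biterm topic model literature it usually means an unordered pair of distinct words that co-occur in the same short text, and the Python sketch below assumes that definition. The function name `extract_biterms` is hypothetical, and skipping pairs of identical words is a choice made here, not something the abstract states.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return all unordered pairs of distinct words from one short text."""
    return [
        tuple(sorted(pair))
        for pair in combinations(tokens, 2)
        if pair[0] != pair[1]  # assumed: a word paired with itself is not a biterm
    ]


print(extract_biterms(["apple", "iphone", "release"]))
# [('apple', 'iphone'), ('apple', 'release'), ('iphone', 'release')]
```

A model of this kind then assigns a topic to each biterm rather than to each word occurrence, so co-occurrence evidence is pooled across the whole collection instead of within a single sparse text.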


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zhiwen Ye ◽  
Hui Zhang ◽  
Libo Feng ◽  
Zhangming Shan

Community discovery reveals the community structure in a network and supports personalized services and information pushing for consumers. It plays an important role in promoting the intelligence of the network society. Most community networks have a community structure in which vertices are gathered into groups, which is significant for network data mining and identification. Existing community detection methods explore the original network topology, but they do not make full use of the inherent semantic information on nodes, e.g., node attributes. To solve the problem, we explore networks by considering both the original network topology and inherent community structures. In this paper, we propose a novel nonnegative matrix factorization (NMF) model that is divided into two parts, the community structure matrix and the node attribute matrix, and we present a matrix updating method to deal with the nonnegative matrix factorization optimization problem. NMF can achieve large-scale multidimensional data reduction processing to discover the internal relationships between networks and find the degree of network association. The community structure matrix that we propose provides more information about the network structure by considering the relationships between nodes that connect directly or share similar neighboring nodes. The use of node attributes provides a semantic interpretation for the community structure. We conduct experiments on attributed graph datasets with overlapping and nonoverlapping communities. The results of the experiments show that the performances of the F1-Score and Jaccard-Similarity in the overlapping community and the performances of normalized mutual information (NMI) and accuracy (AC) in the nonoverlapping community are significantly improved. Our proposed model achieves significant improvements in terms of its accuracy and relevance compared with the state-of-the-art approaches.
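For readers unfamiliar with the NMI metric used for nonoverlapping communities, here is a minimal pure-Python sketch. It assumes the common normalization NMI = 2·I(X;Y) / (H(X) + H(Y)); the abstract does not say which normalization the authors used, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments over the same nodes."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (n_ij / n) * math.log(n_ij * n / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    if len(a) != len(b):
        raise ValueError("label lists must have the same length")
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0:
        return 1.0  # both partitions place every node in a single community
    return 2 * mutual_information(a, b) / (h_a + h_b)


truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
print(round(nmi(truth, same_up_to_renaming), 6))  # 1.0
print(round(nmi(truth, unrelated), 6))            # 0.0
```

NMI is invariant to how communities are named, which is why it suits comparing a detected partition against ground truth.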


Author(s):  
Stefan Winter

This concluding chapter summarizes key themes and presents some final thoughts. The book has shown that the multiplicity of lived ʻAlawi experiences cannot be reduced to the sole question of religion or framed within a monolithic narrative of persecution; that the very attempt to outline a single coherent history of “the ʻAlawis” may indeed be misguided. The sources on which this study has drawn are considerably more accessible, and the social and administrative realities they reflect consistently more mundane and disjointed, than the discourse of the ʻAlawis' supposed exceptionalism would lead one to believe. Therefore, the challenge for historians of ʻAlawi society in Syria and elsewhere is not to use the specific events and structures these sources detail to merely add to the already existing metanarratives of religious oppression, Ottoman misrule, and national resistance but rather to come to a newer and more intricate understanding of that community, and its place in wider Middle Eastern society, by investigating the lives of individual ʻAlawi (and other) actors within the rich diversity of local contexts these sources reveal.


2018 ◽  
Vol 16 (1) ◽  
pp. 112-119
Author(s):  
VLADIMIR GLEB NAYDONOV

The article considers the students’ tolerance as a spectrum of personal manifestations of respect, acceptance and correct understanding of the rich diversity of cultures of the world, values of others’ personality. The purpose of the study is to investigate education and the formation of tolerance among the students. We have compiled a training program to improve the level of tolerance for interethnic differences. Based on the statistical analysis of the data obtained, the most important values that are significant for different levels of tolerance were identified.


2021 ◽  
Author(s):  
Valentin Waschulin ◽  
Chiara Borsetto ◽  
Robert James ◽  
Kevin K. Newsham ◽  
Stefano Donadio ◽  
...  

The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes, indicating their potential for producing novel and useful compounds. However, recovering full-length BGC sequences from uncultivated bacteria remains a challenge due to the technological restraints of short-read sequencing, thus making assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly full-length BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. A large number of highly divergent BGCs were not only found in the phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota but also in the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966. The latter furthermore contained a potential novel family of RiPPs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped genetic reservoir of specialised metabolite gene clusters of the uncultured majority of microbes.


Author(s):  
Xi Liu ◽  
Yongfeng Yin ◽  
Haifeng Li ◽  
Jiabin Chen ◽  
Chang Liu ◽  
...  

Existing software intelligent defect classification approaches do not consider radar characters and prior statistics information. Thus, when these approaches are applied to radar software testing and validation, the precision rate and recall rate of defect classification are poor, which reduces the reuse effectiveness of software defects. To solve this problem, a new intelligent defect classification approach based on the latent Dirichlet allocation (LDA) topic model is proposed for radar software in this paper. The proposed approach includes the defect text segmentation algorithm based on the dictionary of radar domain, the modified LDA model combining radar software requirement, and the top acquisition and classification approach of radar software defect based on the modified LDA model. The proposed approach is applied to typical radar software defects to validate its effectiveness and applicability. The application results illustrate that the prediction precision rate and recall rate of the proposed approach are improved by up to 15-20% compared with the other defect classification approaches. Thus, the proposed approach can be applied in the segmentation and classification of radar software defects effectively to improve the identifying adequacy of the defects in radar software.


2021 ◽  
Vol 54 (3) ◽  
pp. 1-35
Author(s):  
Matteo Magnani ◽  
Obaida Hanteer ◽  
Roberto Interdonato ◽  
Luca Rossi ◽  
Andrea Tagarelli

A multiplex network models different modes of interaction among same-type entities. In this article, we provide a taxonomy of community detection algorithms in multiplex networks. We characterize the different algorithms based on various properties and we discuss the type of communities detected by each method. We then provide an extensive experimental evaluation of the reviewed methods to answer three main questions: to what extent the evaluated methods are able to detect ground-truth communities, to what extent different methods produce similar community structures, and to what extent the evaluated methods are scalable. One goal of this survey is to help scholars and practitioners to choose the right methods for the data and the task at hand, while also emphasizing when such choice is problematic.
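As an illustration of the data structure involved, the sketch below represents a multiplex network in Python as a mapping from layer name to a set of edges over a shared node set, and flattens it into a single weighted graph. Flattening is a common simplification before running a single-layer community detection method; the abstract does not say how the survey treats it, and the layer names, node names, and the function `flatten` are hypothetical.

```python
from collections import Counter


def flatten(layers):
    """Merge all layers into one weighted graph.

    The weight of an edge is the number of layers in which it appears.
    """
    weights = Counter()
    for edges in layers.values():
        # Normalize direction so (u, v) and (v, u) count as the same undirected edge.
        undirected = {tuple(sorted(edge)) for edge in edges}
        weights.update(undirected)
    return weights


multiplex = {
    "work": {("ann", "bob"), ("bob", "cy")},
    "friends": {("bob", "ann"), ("cy", "dee")},
}
flat = flatten(multiplex)
print(flat[("ann", "bob")])  # 2, because the edge appears in both layers
print(flat[("cy", "dee")])   # 1
```

Flattening discards which mode of interaction produced each edge, which is the kind of trade-off that methods working on the layers directly are meant to avoid.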


Science ◽  
2013 ◽  
Vol 341 (6147) ◽  
pp. 746-751 ◽  
Author(s):  
Jeffery L. Dangl ◽  
Diana M. Horvath ◽  
Brian J. Staskawicz

Diverse and rapidly evolving pathogens cause plant diseases and epidemics that threaten crop yield and food security around the world. Research over the last 25 years has led to an increasingly clear conceptual understanding of the molecular components of the plant immune system. Combined with ever-cheaper DNA-sequencing technology and the rich diversity of germ plasm manipulated for over a century by plant breeders, we now have the means to begin development of durable (long-lasting) disease resistance beyond the limits imposed by conventional breeding and in a manner that will replace costly and unsustainable chemical controls.

