Efficient Markov Network Structure Discovery Using Independence Tests

We present two algorithms for learning the structure of a Markov network from data: GSMN* and GSIMN. Both algorithms use statistical independence tests to infer the structure by successively constraining the set of structures consistent with the results of these tests. Until very recently, algorithms for structure learning were based on maximum likelihood estimation, which has been proved to be NP-hard for Markov networks due to the difficulty of estimating the parameters of the network, needed for the computation of the data likelihood. The independence-based approach does not require the computation of the likelihood, and thus both GSMN* and GSIMN can compute the structure efficiently (as shown in our experiments). GSMN* is an adaptation of the Grow-Shrink algorithm of Margaritis and Thrun for learning the structure of Bayesian networks. GSIMN extends GSMN* by additionally exploiting Pearl's well-known properties of the conditional independence relation to infer novel independences from known ones, thus avoiding the performance of statistical tests to estimate them. To accomplish this efficiently GSIMN uses the Triangle theorem, also introduced in this work, which is a simplified version of the set of Markov axioms. Experimental comparisons on artificial and real-world data sets show GSIMN can yield significant savings with respect to GSMN*, while generating a Markov network with comparable or in some cases improved quality. We also compare GSIMN to a forward-chaining implementation, called GSIMN-FCH, that produces all possible conditional independences resulting from repeatedly applying Pearl's theorems on the known conditional independence tests. The results of this comparison show that GSIMN, by the sole use of the Triangle theorem, is nearly optimal in terms of the set of independences tests that it infers.

Download Full-text

A novel approach to fully representing the diversity in conditional dependencies for learning Bayesian network classifier

Intelligent Data Analysis ◽

10.3233/ida-194959 ◽

2021 ◽

Vol 25 (1) ◽

pp. 35-55

Author(s):

Limin Wang ◽

Peng Chen ◽

Shenglei Chen ◽

Minghui Sun

Keyword(s):

Bayesian Network ◽

Conditional Independence ◽

Structure Learning ◽

Classification Performance ◽

Data Sets ◽

Independence Assumption ◽

Bayesian Network Classifiers ◽

Novel Approach ◽

Conditional Independence Assumption ◽

Dependence Criterion

Bayesian network classifiers (BNCs) have proved their effectiveness and efficiency in the supervised learning framework. Numerous variations of conditional independence assumption have been proposed to address the issue of NP-hard structure learning of BNC. However, researchers focus on identifying conditional dependence rather than conditional independence, and information-theoretic criteria cannot identify the diversity in conditional (in)dependencies for different instances. In this paper, the maximum correlation criterion and minimum dependence criterion are introduced to sort attributes and identify conditional independencies, respectively. The heuristic search strategy is applied to find possible global solution for achieving the trade-off between significant dependency relationships and independence assumption. Our extensive experimental evaluation on widely used benchmark data sets reveals that the proposed algorithm achieves competitive classification performance compared to state-of-the-art single model learners (e.g., TAN, KDB, KNN and SVM) and ensemble learners (e.g., ATAN and AODE).

Download Full-text

Bayesian Prediction Model Based on Attribute Weighting and Kernel Density Estimations

Mathematical Problems in Engineering ◽

10.1155/2015/170324 ◽

2015 ◽

Vol 2015 ◽

pp. 1-7

Author(s):

Zhong-Liang Xiang ◽

Xiang-Ru Yu ◽

Dae-Ki Kang

Keyword(s):

Conditional Independence ◽

Naive Bayes ◽

Learning Algorithm ◽

Kernel Method ◽

Likelihood Estimation ◽

Naïve Bayes ◽

Real World Data ◽

Weighting Method ◽

Attribute Weighting ◽

Smooth Kernel

Although naïve Bayes learner has been proven to show reasonable performance in machine learning, it often suffers from a few problems with handling real world data. First problem is conditional independence; the second problem is the usage of frequency estimator. Therefore, we have proposed methods to solve these two problems revolving around naïve Bayes algorithms. By using an attribute weighting method, we have been able to handle conditional independence assumption issue, whereas, for the case of the frequency estimators, we have found a way to weaken the negative effects through our proposed smooth kernel method. In this paper, we have proposed a compact Bayes model, in which a smooth kernel augments weights on likelihood estimation. We have also chosen an attribute weighting method which employs mutual information metric to cooperate with the framework. Experiments have been conducted on UCI benchmark datasets and the accuracy of our proposed learner has been compared with that of standard naïve Bayes. The experimental results have demonstrated the effectiveness and efficiency of our proposed learning algorithm.

Download Full-text

High-dimensional structure learning of sparse vector autoregressive models using fractional marginal pseudo-likelihood

Statistics and Computing ◽

10.1007/s11222-021-10049-z ◽

2021 ◽

Vol 31 (6) ◽

Author(s):

Kimmo Suotsalo ◽

Yingying Xu ◽

Jukka Corander ◽

Johan Pensar

Keyword(s):

Structure Learning ◽

Marginal Likelihood ◽

Likelihood Estimation ◽

Penalized Regression ◽

Autoregressive Models ◽

Dimensional Structure ◽

Real World Data ◽

Vector Autoregressive Models ◽

Vector Autoregressive ◽

Pseudo Likelihood

AbstractLearning vector autoregressive models from multivariate time series is conventionally approached through least squares or maximum likelihood estimation. These methods typically assume a fully connected model which provides no direct insight to the model structure and may lead to highly noisy estimates of the parameters. Because of these limitations, there has been an increasing interest towards methods that produce sparse estimates through penalized regression. However, such methods are computationally intensive and may become prohibitively time-consuming when the number of variables in the model increases. In this paper we adopt an approximate Bayesian approach to the learning problem by combining fractional marginal likelihood and pseudo-likelihood. We propose a novel method, PLVAR, that is both faster and produces more accurate estimates than the state-of-the-art methods based on penalized regression. We prove the consistency of the PLVAR estimator and demonstrate the attractive performance of the method on both simulated and real-world data.

Download Full-text

Alpha Power Transformed Extended Bur II Distribution: Properties and Applications

Asian Research Journal of Mathematics ◽

10.9734/arjom/2020/v16i830209 ◽

2020 ◽

pp. 50-63

Author(s):

A. A. Ogunde ◽

B. Ajayi ◽

D. O. Omosigho

Keyword(s):

Estimation Method ◽

Likelihood Estimation ◽

Moment Generating Function ◽

Power Transformation ◽

Data Sets ◽

Alpha Power ◽

Real World Data ◽

Mathematical Properties ◽

Maximum Likelihood Estimation Method ◽

New Distribution

This paper presents a new generalization of the extended Bur II distribution. We redefined the Bur II distribution using the Alpha Power Transformation (APT) to obtain a new distribution called the Alpha Power Transformed Extended Bur II distribution. We derived several mathematical properties for the new model which includes moments, moment generating function, order statistics, entropy etc. and used a maximum likelihood estimation method to obtain the parameters of the distribution. Two real-world data sets were used for applications in order to illustrate the usefulness of the new distribution.

Download Full-text

STRUCTURE-LEARNING OF CAUSAL BAYESIAN NETWORKS BASED ON ADJACENT NODES

International Journal of Artificial Intelligence Tools ◽

10.1142/s021821301350005x ◽

2013 ◽

Vol 22 (02) ◽

pp. 1350005 ◽

Cited By ~ 2

Author(s):

XIA LIU ◽

YOULONG YANG ◽

MINGMIN ZHU

Keyword(s):

Bayesian Networks ◽

Latent Variables ◽

Conditional Independence ◽

Structure Learning ◽

Original Graph ◽

Independence Tests ◽

Acyclic Graphs ◽

Causal Bayesian Networks ◽

Unobserved Variables ◽

Direct Acyclic Graphs

Due to the infeasibility of randomized controlled experiments, the existence of unobserved variables and the fact that equivalent direct acyclic graphs obtained generally can not be distinguished, it is difficult to learn the true causal relations of original graph. This paper presents an algorithm called BSPC based on adjacent nodes to learn the structure of Causal Bayesian Networks with unobserved variables by using observational data. It does not have to adjust the structure as the existing algorithms FCI and MBCS*, while it can guarantee to obtain the true adjacent nodes. More important is that algorithm BSPC reduces computational complexity and improves reliability of conditional independence tests. Theoretical results show that the new algorithm is correct. In addition, the advantages of BSPC in terms of the number of conditional independence tests and the number of orientation errors are illustrated with simulation experiments from which we can see that it is more suitable in order to learn the structure of Causal Bayesian Networks with latent variables. Moreover a better latent structure representation is returned.

Download Full-text

TrustSVD: A Novel Trust-Based Matrix Factorization Model with User Trust and Item Ratings

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i11.422 ◽

2017 ◽

Vol 7 (11) ◽

pp. 7 ◽

Cited By ~ 1

Author(s):

K Sobha Rani

Keyword(s):

Matrix Factorization ◽

Social Trust ◽

State Of The Art ◽

Data Sets ◽

Real World Data ◽

Recommendation Algorithm ◽

Active User ◽

Factorization Model ◽

The Social ◽

Matrix Factorization Technique

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing the social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of a state-of-the-art recommendation algorithm SVD++ which inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, the work reported is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach TrustSVD achieves better accuracy than other ten counterparts, and can better handle the concerned issues.

Download Full-text

A New Extension of Thinning-Based Integer-Valued Autoregressive Models for Count Data

Entropy ◽

10.3390/e23010062 ◽

2020 ◽

Vol 23 (1) ◽

pp. 62

Author(s):

Zhengwei Liu ◽

Fukang Zhu

Keyword(s):

Likelihood Estimation ◽

Real Data ◽

Autoregressive Models ◽

Superior Performance ◽

Data Sets ◽

Binomial Thinning ◽

Free Case ◽

Two Parameters ◽

Conditional Maximum ◽

Thinning Operator

The thinning operators play an important role in the analysis of integer-valued autoregressive models, and the most widely used is the binomial thinning. Inspired by the theory about extended Pascal triangles, a new thinning operator named extended binomial is introduced, which is a general case of the binomial thinning. Compared to the binomial thinning operator, the extended binomial thinning operator has two parameters and is more flexible in modeling. Based on the proposed operator, a new integer-valued autoregressive model is introduced, which can accurately and flexibly capture the dispersed features of counting time series. Two-step conditional least squares (CLS) estimation is investigated for the innovation-free case and the conditional maximum likelihood estimation is also discussed. We have also obtained the asymptotic property of the two-step CLS estimator. Finally, three overdispersed or underdispersed real data sets are considered to illustrate a superior performance of the proposed model.

Download Full-text

Hfinger: Malware HTTP Request Fingerprinting

Entropy ◽

10.3390/e23050507 ◽

2021 ◽

Vol 23 (5) ◽

pp. 507

Author(s):

Piotr Białczak ◽

Wojciech Mazurczyk

Keyword(s):

Real World ◽

Network Traffic ◽

Experimental Evaluation ◽

Data Sets ◽

Real World Data ◽

Malicious Software ◽

Default Mode ◽

World Data ◽

Effectiveness Analysis ◽

Http Protocol

Malicious software utilizes HTTP protocol for communication purposes, creating network traffic that is hard to identify as it blends into the traffic generated by benign applications. To this aim, fingerprinting tools have been developed to help track and identify such traffic by providing a short representation of malicious HTTP requests. However, currently existing tools do not analyze all information included in the HTTP message or analyze it insufficiently. To address these issues, we propose Hfinger, a novel malware HTTP request fingerprinting tool. It extracts information from the parts of the request such as URI, protocol information, headers, and payload, providing a concise request representation that preserves the extracted information in a form interpretable by a human analyst. For the developed solution, we have performed an extensive experimental evaluation using real-world data sets and we also compared Hfinger with the most related and popular existing tools such as FATT, Mercury, and p0f. The conducted effectiveness analysis reveals that on average only 1.85% of requests fingerprinted by Hfinger collide between malware families, what is 8–34 times lower than existing tools. Moreover, unlike these tools, in default mode, Hfinger does not introduce collisions between malware and benign applications and achieves it by increasing the number of fingerprints by at most 3 times. As a result, Hfinger can effectively track and hunt malware by providing more unique fingerprints than other standard tools.

Download Full-text

Learning emotional word embeddings for sentiment analysis

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201993 ◽

2021 ◽

pp. 1-13

Author(s):

Qingtian Zeng ◽

Xishi Zhao ◽

Xiaohui Hu ◽

Hua Duan ◽

Zhongying Zhao ◽

...

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

State Of The Art ◽

Research Problem ◽

Emotional Word ◽

Classification Model ◽

Data Sets ◽

Word Embeddings ◽

Real World Data ◽

Text Documents

Word embeddings have been successfully applied in many natural language processing tasks due to its their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve the above problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performances on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.

Download Full-text

Different algorithms, different models

Quality & Quantity ◽

10.1007/s11135-021-01193-9 ◽

2021 ◽

Author(s):

Martyna Daria Swiatczak

Keyword(s):

Comparative Analysis ◽

Real World ◽

Qualitative Comparative Analysis ◽

Comparative Methods ◽

Data Sets ◽

Simulation Studies ◽

Threshold Values ◽

Real World Data ◽

Software Packages ◽

Methodological Approaches

AbstractThis study assesses the extent to which the two main Configurational Comparative Methods (CCMs), i.e. Qualitative Comparative Analysis (QCA) and Coincidence Analysis (CNA), produce different models. It further explains how this non-identity is due to the different algorithms upon which both methods are based, namely QCA’s Quine–McCluskey algorithm and the CNA algorithm. I offer an overview of the fundamental differences between QCA and CNA and demonstrate both underlying algorithms on three data sets of ascending proximity to real-world data. Subsequent simulation studies in scenarios of varying sample sizes and degrees of noise in the data show high overall ratios of non-identity between the QCA parsimonious solution and the CNA atomic solution for varying analytical choices, i.e. different consistency and coverage threshold values and ways to derive QCA’s parsimonious solution. Clarity on the contrasts between the two methods is supposed to enable scholars to make more informed decisions on their methodological approaches, enhance their understanding of what is happening behind the results generated by the software packages, and better navigate the interpretation of results. Clarity on the non-identity between the underlying algorithms and their consequences for the results is supposed to provide a basis for a methodological discussion about which method and which variants thereof are more successful in deriving which search target.

Download Full-text