Towards Computing a Near-Maximum Weighted Independent Set on Massive Graphs

2018 ◽

Cited By ~ 2

Author(s):

Shaowei Cai ◽

Wenying Hou ◽

Jinkun Lin ◽

Yuanjie Li

Keyword(s):

Local Search ◽

Real World ◽

Heuristic Algorithms ◽

Minimum Weight ◽

Vertex Cover ◽

Independent Set ◽

Map Labeling ◽

Massive Graphs ◽

Real World Problem ◽

NP-Hardness

The minimum weight vertex cover (MWVC) problem is an important combinatorial optimization problem with various real-world applications. Due to its NP-hardness, most work on solving MWVC focuses on heuristic algorithms that can return a good-quality solution in reasonable time. In this work, we propose two dynamic strategies that adjust the behavior of the algorithm during search. They are used to improve a state-of-the-art local search algorithm for MWVC named FastWVC, resulting in two local search algorithms called DynWVC1 and DynWVC2. Previous MWVC algorithms have been evaluated on graphs with random or hand-crafted weights. In this work, we evaluate the algorithms on vertex-weighted graphs obtained from an important real-world problem, the map labeling problem. Experiments show that our algorithm obtains better results than previous algorithms for MWVC and maximum weight independent set (MWIS) on these real-world instances. We also test our algorithms on massive graphs studied in previous works, and show significant improvements there.
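The abstract reports results for both MWVC and MWIS. The two problems are complementary: the vertices outside a vertex cover form an independent set, so a lighter cover leaves a heavier independent set. The sketch below illustrates this with a simple greedy heuristic. It is not FastWVC, DynWVC1, or DynWVC2; the function names, the weight-per-uncovered-edge scoring rule, and the toy graph are assumptions made for illustration only.

```python
def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def is_independent_set(edges, chosen):
    """True if no edge has both endpoints in `chosen`."""
    return not any(u in chosen and v in chosen for u, v in edges)


def greedy_mwvc(weights, edges):
    """Illustrative greedy heuristic (assumed, not from the paper).

    Repeatedly add the vertex with the lowest weight per uncovered
    incident edge, then drop vertices that turn out to be redundant.
    """
    uncovered = set(edges)
    cover = set()
    while uncovered:
        degree = {}
        for u, v in uncovered:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        # Ties are broken by vertex name to keep the run deterministic.
        best = min(degree, key=lambda x: (weights[x] / degree[x], x))
        cover.add(best)
        uncovered = {e for e in uncovered if best not in e}
    # Pruning pass: try to remove the heaviest vertices first.
    for v in sorted(cover, key=lambda x: (-weights[x], x)):
        if is_vertex_cover(edges, cover - {v}):
            cover.remove(v)
    return cover


# Toy instance: the path a - b - c - d with heavy end vertices.
weights = {"a": 3, "b": 1, "c": 1, "d": 3}
edges = [("a", "b"), ("b", "c"), ("c", "d")]

cover = greedy_mwvc(weights, edges)
independent = set(weights) - cover  # complement of the cover

cover_weight = sum(weights[v] for v in cover)
independent_weight = sum(weights[v] for v in independent)
print(sorted(cover), cover_weight)              # ['b', 'c'] 2
print(sorted(independent), independent_weight)  # ['a', 'd'] 6
```

Because the two weights always sum to the total vertex weight, any improvement in cover weight translates directly into an equal improvement in independent-set weight.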

A genetic algorithm-based heuristic for solving the weighted maximum independent set and some equivalent problems

Journal of the Operational Research Society ◽

10.1038/sj.jors.2600405 ◽

1997 ◽

Vol 48 (6) ◽

pp. 612-622 ◽

Cited By ~ 11

Author(s):

M Hifi

Keyword(s):

Genetic Algorithm ◽

Independent Set ◽

Maximum Independent Set ◽

Equivalent Problems

Prediction of K562 Cells Functional Inhibitors Based on Machine Learning Approaches

Current Pharmaceutical Design ◽

10.2174/1381612825666191107092214 ◽

2020 ◽

Vol 25 (40) ◽

pp. 4296-4302 ◽

Cited By ~ 2

Author(s):

Yuan Zhang ◽

Zhenyan Han ◽

Qian Gao ◽

Xiaoyi Bai ◽

Chi Zhang ◽

...

Keyword(s):

Machine Learning ◽

Inclusion Bodies ◽

Cross Validation ◽

Independent Set ◽

K562 Cells ◽

Machine Learning Algorithms ◽

Learning Approaches ◽

Validation Test ◽

Excess Number ◽

Fold Cross Validation

Background: β-thalassemia is a common monogenic genetic disease that is very harmful to human health. The disease is due to the deletion of or defects in β-globin, which reduces synthesis of the β-globin chain, resulting in a relative excess of α-chains. The formation of inclusion bodies deposited on the cell membrane causes a decrease in the ability of red blood cells to deform and a group of hereditary haemolytic diseases caused by massive destruction in the spleen. Methods: In this work, machine learning algorithms were employed to build a prediction model for inhibitors against K562 based on 117 inhibitors and 190 non-inhibitors. Results: The overall accuracies (ACC) of a 10-fold cross-validation test and an independent set test using Adaboost were 83.1% and 78.0%, respectively, surpassing Bayes Net, Random Forest, Random Tree, C4.5, SVM, KNN and Bagging. Conclusion: This study indicated that Adaboost could be applied to build a learning model for the prediction of inhibitors against K562 cells.

Predicting Hub Genes of Glioblastomas Based on Support Vector Machine Combined with CFS algorithms

Current Bioinformatics ◽

10.2174/1574893615999200819162140 ◽

2020 ◽

Vol 15 ◽

Author(s):

Chun Qiu ◽

Sai Li ◽

Shenghui Yang ◽

Lin Wang ◽

Aihui Zeng ◽

...

Keyword(s):

Support Vector Machine ◽

Expression Profiles ◽

Independent Set ◽

Classification Model ◽

Support Vector ◽

Feature Subset ◽

Hub Genes ◽

Effective Prevention ◽

Key Genes ◽

Control Samples

Aim: To search for genes related to the mechanisms of the occurrence of glioma and to try to build a prediction model for glioblastomas. Background: The morbidity and mortality of glioblastomas are very high, which seriously endangers human health. At present, the goals of many investigations on gliomas are mainly to understand the cause and mechanism of these tumors at the molecular level and to explore clinical diagnosis and treatment methods. However, there is no effective early diagnosis method for this disease, and there are no effective prevention, diagnosis or treatment measures. Methods: First, the gene expression profiles derived from GEO were downloaded. Then, differentially expressed genes (DEGs) in the disease samples and the control samples were identified. After that, GO and KEGG enrichment analyses of DEGs were performed by DAVID. Furthermore, the correlation-based feature subset (CFS) method was applied to the selection of key DEGs. In addition, the classification model between the glioblastoma samples and the controls was built by a Support Vector Machine (SVM) based on selected key genes. Results and Discussion: Thirty-six DEGs, including 17 upregulated and 19 downregulated genes, were selected by the CFS method as the feature genes to build the classification model between the glioma samples and the control samples. The accuracy of the classification model using a 10-fold cross-validation test and an independent set test was 76.25% and 70.3%, respectively. In addition, PPP2R2B and CYBB can also be found in the top 5 hub genes screened by the protein–protein interaction (PPI) network. Conclusions: This study indicated that the CFS method is a useful tool to identify key genes in glioblastomas. In addition, we also predicted that genes such as PPP2R2B and CYBB might be potential biomarkers for the diagnosis of glioblastomas.

SSumM: Sparse Summarization of Massive Graphs

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽

10.1145/3394486.3403057 ◽

2020 ◽

Author(s):

Kyuhan Lee ◽

Hyeonsoo Jo ◽

Jihoon Ko ◽

Sungsu Lim ◽

Kijung Shin

Keyword(s):

Massive Graphs

ON THE HEIGHT AND RELATIONAL COMPLEXITY OF A FINITE PERMUTATION GROUP

Nagoya Mathematical Journal ◽

10.1017/nmj.2021.6 ◽

2021 ◽

pp. 1-40

Author(s):

NICK GILL ◽

BIANCA LODÀ ◽

PABLO SPIGA

Keyword(s):

Permutation Group ◽

Model Theory ◽

Independent Set ◽

Maximum Size ◽

Proper Subset ◽

Permutation Groups ◽

Finite Permutation Group ◽

Pointwise Stabilizer ◽

Relational Complexity

Let $G$ be a permutation group on a set $\Omega$ of size $t$. We say that $\Lambda \subseteq \Omega$ is an independent set if its pointwise stabilizer is not equal to the pointwise stabilizer of any proper subset of $\Lambda$. We define the height of $G$ to be the maximum size of an independent set, and we denote this quantity $\textrm{H}(G)$. In this paper, we study $\textrm{H}(G)$ for the case when $G$ is primitive. Our main result asserts that either $\textrm{H}(G) < 9\log t$ or else $G$ is in a particular well-studied family (the primitive large-base groups). An immediate corollary of this result is a characterization of primitive permutation groups with large relational complexity, the latter quantity being a statistic introduced by Cherlin in his study of the model theory of permutation groups. We also study $\textrm{I}(G)$, the maximum length of an irredundant base of $G$, in which case we prove that if $G$ is primitive, then either $\textrm{I}(G) < 7\log t$ or else, again, $G$ is in a particular family (which includes the primitive large-base groups as well as some others).
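The definition of height can be checked by brute force on very small groups. The sketch below is only an illustration of the definition, not a method from the paper: permutations are represented as tuples on the points 0..n-1, and all function names are hypothetical. It relies on the fact that pointwise stabilizers can only grow when points are removed, so it is enough to compare a set against the subsets obtained by deleting a single point.

```python
from itertools import combinations


def compose(p, q):
    """Return the permutation 'apply q, then p' on points 0..n-1."""
    return tuple(p[q[i]] for i in range(len(p)))


def generate_group(generators):
    """Close a list of generators under composition (finite group)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def pointwise_stabilizer(group, points):
    """Elements of the group that fix every point in `points`."""
    return frozenset(g for g in group if all(g[x] == x for x in points))


def is_independent(group, points):
    """Independent: dropping any single point enlarges the stabilizer."""
    full = pointwise_stabilizer(group, points)
    return all(pointwise_stabilizer(group, points - {x}) != full
               for x in points)


def height(group):
    """Maximum size of an independent set, by exhaustive search."""
    n = len(next(iter(group)))
    best = 0  # the empty set is independent (no proper subsets)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if is_independent(group, frozenset(subset)):
                best = k
                break
    return best


# Symmetric group S_3: generated by a transposition and a 3-cycle.
s3 = generate_group([(1, 0, 2), (1, 2, 0)])
# Cyclic group C_3: generated by the 3-cycle alone.
c3 = generate_group([(1, 2, 0)])

print(len(s3), height(s3))  # 6 2
print(len(c3), height(c3))  # 3 1
```

The search is exponential in the number of points, so it is only suitable for sanity checks on toy examples.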

Tiered Sampling

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3441299 ◽

2021 ◽

Vol 15 (5) ◽

pp. 1-52

Author(s):

Lorenzo De Stefani ◽

Erisa Terolli ◽

Eli Upfal

Keyword(s):

Large Scale ◽

Analysis Of Algorithms ◽

Base Layer ◽

Single Edge ◽

Real World Data ◽

High Quality ◽

Large Graphs ◽

Massive Graphs ◽

Variance Estimate ◽

Low Probability

We introduce Tiered Sampling, a novel technique for estimating the count of sparse motifs in massive graphs whose edges are observed in a stream. Our technique requires only a single pass over the data and uses a memory of fixed size M, which can be orders of magnitude smaller than the number of edges. Our methods address the challenging task of counting sparse motifs (sub-graph patterns) that have a low probability of appearing in a sample of M edges in the graph, which is the maximum amount of data available to the algorithms in each step. To obtain an unbiased and low-variance estimate of the count, we partition the available memory into tiers (layers) of reservoir samples. While the base layer is a standard reservoir sample of edges, other layers are reservoir samples of sub-structures of the desired motif. By storing more frequent sub-structures of the motif, we increase the probability of detecting an occurrence of the sparse motif we are counting, thus decreasing the variance and error of the estimate. While we focus on the design and analysis of algorithms for counting 4-cliques, we present a method that generalizes Tiered Sampling to obtain high-quality estimates for the number of occurrences of any sub-graph of interest, while reducing the analysis effort due to specific properties of the pattern of interest. We present a complete theoretical analysis and extensive experimental evaluation of our proposed method using both synthetic and real-world data. Our results demonstrate the advantage of our method in obtaining high-quality approximations for the number of 4- and 5-cliques in large graphs using a very limited amount of memory, significantly outperforming the single edge sample approach for counting sparse motifs in large-scale graphs.
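The base layer described above is a standard reservoir sample of edges. A minimal sketch of that building block follows, using the classic single-pass reservoir algorithm. It does not implement the additional tiers or the motif-count estimator from the paper; the function name, the `rng` parameter, and the synthetic edge stream are assumptions for illustration.

```python
import random


def reservoir_sample(stream, m, rng=None):
    """Keep a uniform random sample of at most m items from a stream.

    Single pass, O(m) memory: the t-th item (1-based) is kept with
    probability m / t and, if kept, overwrites a uniformly chosen slot.
    """
    rng = rng or random.Random()
    sample = []
    for t, item in enumerate(stream, start=1):
        if len(sample) < m:
            sample.append(item)      # fill the reservoir first
        else:
            j = rng.randrange(t)     # uniform integer in 0..t-1
            if j < m:
                sample[j] = item     # replace slot j
    return sample


# Hypothetical edge stream: 1000 edges of a path graph.
edge_stream = [(i, i + 1) for i in range(1000)]
edge_sample = reservoir_sample(edge_stream, 10, rng=random.Random(0))
print(len(edge_sample))  # 10
```

Passing a seeded `random.Random` makes a run repeatable, which is convenient when comparing estimators on the same stream.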

Beyond Alice and Bob: Improved Inapproximability for Maximum Independent Set in CONGEST

Proceedings of the 39th Symposium on Principles of Distributed Computing ◽

10.1145/3382734.3405702 ◽

2020 ◽

Author(s):

Yuval Efron ◽

Ofer Grossman ◽

Seri Khoury

Keyword(s):

Independent Set ◽

Maximum Independent Set

Enumerating maximum cliques in massive graphs

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2020.3036013 ◽

2020 ◽

pp. 1-1

Author(s):

Can Lu ◽

Jeffrey Xu Yu ◽

Hao Wei ◽

Yikai Zhang

Keyword(s):

Maximum Cliques ◽

Massive Graphs

Parametric Matroid of Rough Set

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488515500403 ◽

2015 ◽

Vol 23 (06) ◽

pp. 893-908 ◽

Cited By ~ 5

Author(s):

Yanfang Liu ◽

Hong Zhao ◽

William Zhu

Keyword(s):

Rough Set ◽

Equivalence Relation ◽

Rough Sets ◽

Rough Set Theory ◽

Attribute Reduction ◽

Independent Set ◽

Approximation Operator ◽

Approximation Number ◽

Lower Approximation ◽

The One

Rough set theory is mainly concerned with the approximations of objects through an equivalence relation on a universe. The matroid is a generalization of linear algebra and graph theory. Recently, a matroidal structure of rough sets was established and applied to the problem of attribute reduction, which is an important application of rough set theory. In this paper, we propose a new matroidal structure of rough sets and call it a parametric matroid. On the one hand, for an equivalence relation on a universe, a parametric set family, with any subset of the universe as its parameter, is defined through the lower approximation operator. This parametric set family is proved to satisfy the independent set axiom of matroids; therefore, a matroid is generated, and we call it a parametric matroid of the rough set. Through the lower approximation operator, three equivalent representations of the parametric set family are obtained. Moreover, the parametric matroid of the rough set is proved to be the direct sum of a partition-circuit matroid and a free matroid. On the other hand, partition-circuit matroids are well studied through the lower approximation number, and we then use it to investigate the parametric matroid of the rough set. Several characteristics of the parametric matroid of the rough set, such as independent sets, bases, circuits, the rank function and the closure operator, are expressed by the lower approximation number.
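The lower approximation operator that the construction relies on has a simple set-based form: given the equivalence classes of the relation, the lower approximation of a set X is the union of all classes that lie entirely inside X. The sketch below shows only this standard operator. It does not reproduce the parametric set family or the lower approximation number from the paper, and the function name and example data are hypothetical.

```python
def lower_approximation(partition, target):
    """Union of the equivalence classes fully contained in `target`.

    `partition` lists the equivalence classes of the relation on the
    universe; `target` is the subset X being approximated.
    """
    target = set(target)
    result = set()
    for block in partition:
        if set(block) <= target:   # the whole class lies inside X
            result |= set(block)
    return result


# Universe {1, ..., 5} with equivalence classes {1, 2}, {3}, {4, 5}.
partition = [{1, 2}, {3}, {4, 5}]
approx = lower_approximation(partition, {1, 2, 3, 4})
print(sorted(approx))  # [1, 2, 3]
```

Element 4 is excluded because its class {4, 5} is not fully contained in the target set.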
