Manifold regularization ensemble clustering with many objectives using unsupervised extreme learning machines

Spectral clustering has been an effective clustering method in recent decades because it can obtain an optimal solution without any assumptions about the structure of the data. The key component of spectral clustering is its similarity matrix. Despite many empirical successes in similarity matrix construction, almost all previous methods handle just one objective. To address multi-objective ensemble clustering, we introduce a new ensemble manifold regularization (MR) method based on a stacking framework. In our Manifold Regularization Ensemble Clustering (MREC) method, several objective functions are considered simultaneously to construct the similarity matrix robustly. Using this matrix, the unsupervised extreme learning machine (UELM) is employed to find the generalized eigenvectors that embed the data in a low-dimensional space. These eigenvectors are then used as the base points in spectral clustering to find the best partitioning of the data. The aims of this paper are to find a robust partitioning that satisfies multiple objectives, to handle noisy data, to preserve diversity-based goals, and to reduce dimensionality. Experiments on several real-world datasets and three benchmark protein datasets demonstrate the superiority of MREC over some state-of-the-art single and ensemble methods.

Competitive Caching with Machine Learned Advice

Journal of the ACM ◽ 10.1145/3447579 ◽ 2021 ◽ Vol 68 (4) ◽ pp. 1-25

Author(s): Thodoris Lykouris ◽ Sergei Vassilvitskii

Keyword(s): Online Algorithms ◽ Empirical Evaluation ◽ Optimal Solution ◽ Poor Performance ◽ Machine Learning Algorithms ◽ Average Error ◽ Generalization Error ◽ Worst Case ◽ Future Events ◽ Real World Datasets

Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution, as compared to an offline optimum. On the other hand, machine learning algorithms are in the business of extrapolating patterns found in the data to predict the future, and usually come with strong guarantees on the expected generalization error. In this work, we develop a framework for augmenting online algorithms with a machine learned predictor to achieve competitive ratios that provably improve upon unconditional worst-case lower bounds when the predictor has low error. Our approach treats the predictor as a complete black box and is not dependent on its inner workings or the exact distribution of its errors. We apply this framework to the traditional caching problem, that is, creating an eviction strategy for a cache of size k. We demonstrate that naively following the oracle’s recommendations may lead to very poor performance, even when the average error is quite low. Instead, we show how to modify the Marker algorithm to take into account the predictions and prove that this combined approach achieves a competitive ratio that both (i) decreases as the predictor’s error decreases and (ii) is always capped by O(log k), which can be achieved without any assistance from the predictor. We complement our results with an empirical evaluation of our algorithm on real-world datasets and show that it performs well empirically even when using simple off-the-shelf predictions.
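
The idea of combining the Marker algorithm with predictions can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is a simplified, deterministic variant that evicts the unmarked page whose predicted next request is furthest away, and it omits the safeguards the paper relies on to retain the O(log k) bound when the predictor is poor. The names `marker_with_predictions`, `predicted_next`, and `perfect_predictor` are hypothetical and introduced only for this illustration.

```python
def perfect_predictor(requests):
    """Build an illustrative predictor that knows the true request sequence."""
    def predict(t, page):
        # Return the index of the next request to `page` after time t.
        for s in range(t + 1, len(requests)):
            if requests[s] == page:
                return s
        return len(requests)  # never requested again
    return predict


def marker_with_predictions(requests, k, predicted_next):
    """Simulate a Marker-style cache of size k and return the number of misses.

    `predicted_next(t, page)` is treated as a black box (an assumption of this
    sketch): it returns the predicted time of the next request to `page`.
    """
    cache, marked, misses = set(), set(), 0
    for t, page in enumerate(requests):
        if page not in cache:
            misses += 1
            if len(cache) >= k:
                unmarked = cache - marked
                if not unmarked:
                    # Every cached page is marked, so a new phase begins.
                    marked.clear()
                    unmarked = set(cache)
                # Evict the unmarked page predicted to be needed furthest ahead;
                # the page itself breaks ties so the result is deterministic.
                victim = max(unmarked, key=lambda p: (predicted_next(t, p), p))
                cache.remove(victim)
            cache.add(page)
        marked.add(page)  # a requested page is marked for the current phase
    return misses
```

Restricting evictions to unmarked pages is what limits the damage a bad predictor can do within a phase; feeding the function a deliberately inverted predictor shows more misses than the perfect one, yet recently requested pages are never evicted within a phase.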

Evolutionary Algorithm for Improving Decision Tree with Global Discretization in Manufacturing

Sensors ◽ 10.3390/s21082849 ◽ 2021 ◽ Vol 21 (8) ◽ pp. 2849

Author(s): Sungbum Jun

Keyword(s): Decision Tree ◽ Evolutionary Algorithm ◽ Decision Trees ◽ Manufacturing Systems ◽ Ensemble Methods ◽ Machine Learning Techniques ◽ Learning Techniques ◽ Industrial Internet ◽ Tree Models ◽ Real World Datasets

Due to recent advances in the industrial Internet of Things (IoT) in manufacturing, the vast amount of sensor data has created a need to leverage such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention because of the need to implement reliable manufacturing systems and identify the root causes of faults. However, despite the high interpretability of decision trees, tree-based models make a trade-off between accuracy and interpretability. In order to improve the tree’s performance while maintaining its interpretability, an evolutionary algorithm for discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. The experimental results with two real-world datasets from sensors showed that the decision tree improved by DIMPLED outperformed the single-decision-tree models (C4.5 and CART) that are widely used in practice, and it proved competitive with ensemble methods, which use multiple decision trees. Even though the ensemble methods could produce slightly better performance, the proposed DIMPLED has a more interpretable structure while maintaining an appropriate performance level.

Heterogeneous Influence Maximization Through Community Detection in Social Networks

International Journal of Ambient Computing and Intelligence ◽ 10.4018/ijaci.2021100107 ◽ 2021 ◽ Vol 12 (4) ◽ pp. 118-131

Author(s): Jaya Krishna Raguru ◽ Devi Prasad Sharma

Keyword(s): Community Detection ◽ Greedy Algorithms ◽ Computational Cost ◽ Optimal Solution ◽ Influence Maximization ◽ Centrality Measures ◽ Influence Spread ◽ Real World Datasets ◽ Initial Seed ◽ High Computational Cost

The problem of identifying a seed set of K nodes that maximizes influence spread over a social network is known as influence maximization (IM). Past works showed this problem to be NP-hard, and greedy algorithms are guaranteed to achieve only about 63% (1 - 1/e) of the optimal spread. Moreover, the greedy approach is expensive and suffers from high computational cost. Furthermore, in a network with communities, IM spread is not always certain. In this paper, a heterogeneous influence maximization through community detection (HIMCD) algorithm is proposed. This approach selects initial seed nodes within communities using various centrality measures, and these seed nodes act as sources for influence spread. A parallel influence maximization is applied with the aid of the seed node set contained in each group. In this approach, the graph is partitioned and IM computations are done in a distributed manner. Extensive experiments with two real-world datasets reveal that HIMCD achieves substantial performance improvement over state-of-the-art techniques.
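
For orientation, the greedy baseline that this abstract contrasts itself with can be sketched in a few lines of Python. This is not the HIMCD algorithm. The sketch assumes a deliberately simplified deterministic spread model in which a seed influences every node reachable from it; real IM work typically uses stochastic diffusion models estimated by simulation, which is where the high computational cost comes from. The names `reachable`, `greedy_seeds`, and `GREEDY_GUARANTEE` are hypothetical.

```python
import math


def reachable(graph, start):
    """Return the set of nodes reachable from `start` (including itself)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def greedy_seeds(graph, k):
    """Pick up to k seeds, each time adding the node with the largest marginal gain.

    `graph` maps a node to the list of nodes it can influence. Spread is
    modelled as plain reachability, an assumption made for this sketch.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    seeds, covered = [], set()
    for _ in range(min(k, len(nodes))):
        candidates = sorted(nodes - set(seeds))  # sorted for deterministic ties
        best = max(candidates, key=lambda n: len(reach[n] - covered))
        seeds.append(best)
        covered |= reach[best]
    return seeds, covered


# Approximation factor of greedy selection for monotone submodular spread.
GREEDY_GUARANTEE = 1 - 1 / math.e  # roughly 0.632
```

Each greedy round re-evaluates the marginal gain of every remaining node, so the cost grows quickly with network size. Partitioning the graph into communities and selecting seeds per community, as the abstract proposes, is one way to reduce and parallelize that work.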

Network Embedding via a Bi-Mode and Deep Neural Network Model

10.20944/preprints201712.0156.v1 ◽ 2017

Author(s): Yang Fang ◽ Xiang Zhao ◽ Zhen Tan

Keyword(s): Neural Network ◽ Deep Neural Network ◽ Semantic Information ◽ Dimensional Space ◽ Relation Extraction ◽ Network Embedding ◽ Structure Information ◽ Second Mode ◽ Real World Datasets ◽ Low Dimensional

Network Embedding (NE) is an important method for learning representations of a network in a low-dimensional space. Conventional NE models focus on capturing the structure information and semantic information of vertices while neglecting such information for edges. In this work, we propose a novel NE model named BimoNet to capture both the structure and semantic information of edges. BimoNet is composed of two parts, i.e., the bi-mode embedding part and the deep neural network part. For the bi-mode embedding part, the first mode, named add-mode, is used to express the entity-shared features of edges, and the second mode, named subtract-mode, is employed to represent the entity-specific features of edges. These features reflect the semantic information. For the deep neural network part, we first regard the edges in a network as nodes, and the vertices as links, which does not change the overall structure of the network. Then we take the nodes' adjacency matrix as the input of the deep neural network, as it can obtain similar representations for nodes with similar structure. Afterwards, by jointly optimizing the objective functions of these two parts, BimoNet can preserve both the semantic and structure information of edges. In experiments, we evaluate BimoNet on three real-world datasets and the task of relation extraction, and BimoNet is demonstrated to outperform state-of-the-art baseline models consistently and significantly.

A Grid-Density Based Algorithm by Weighted Spiking Neural P Systems with Anti-Spikes and Astrocytes in Spatial Cluster Analysis

Processes ◽ 10.3390/pr8091132 ◽ 2020 ◽ Vol 8 (9) ◽ pp. 1132

Author(s): Deting Kong ◽ Yuan Wang ◽ Xinyan Wu ◽ Xiyu Liu ◽ Jianhua Qu ◽ ...

Keyword(s): Dimensional Space ◽ P Systems ◽ High Dimensional ◽ P System ◽ Inhibitory Influence ◽ Spiking Neural P Systems ◽ Clustering Approach ◽ Spatial Cluster Analysis ◽ Effectiveness And Efficiency ◽ Real World Datasets

In this paper, we propose a novel clustering approach based on P systems and a grid-density strategy. We present a grid-density based approach for clustering high-dimensional data, which first projects the data patterns onto a two-dimensional space to overcome the curse of dimensionality. Then, by meshing the plane with grid lines and deleting sparse grids, clusters are found. In particular, we present weighted spiking neural P systems with anti-spikes and astrocytes (WSNPA2 for short) to implement the grid-density based approach in parallel. Each neuron in a weighted SN P system contains a spike, which can be expressed by a computable real number. Spikes and anti-spikes are inspired by neurons communicating through excitatory and inhibitory impulses. Astrocytes have excitatory and inhibitory influence on synapses. Experimental results on multiple real-world datasets demonstrate the effectiveness and efficiency of our approach.
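
The grid-density step described in this abstract (mesh the plane, delete sparse cells, read clusters off what remains) can be sketched sequentially in Python. This is only an illustration under stated assumptions: the input is already two-dimensional, neighbouring dense cells are merged with 8-connectivity, and nothing here models the spiking neural P system that the paper uses to run the procedure in parallel. The function name and the `cell_size` and `min_points` parameters are hypothetical.

```python
from collections import defaultdict


def grid_density_clusters(points, cell_size, min_points):
    """Cluster 2-D points by merging adjacent dense grid cells.

    A cell is dense when it holds at least `min_points` points; points that
    fall in sparse cells are treated as noise and dropped.
    """
    # Step 1: mesh the plane and assign every point to a grid cell.
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    # Step 2: delete sparse cells.
    dense = {cell for cell, pts in cells.items() if len(pts) >= min_points}

    # Step 3: flood-fill over neighbouring dense cells to form clusters.
    clusters, seen = [], set()
    for cell in sorted(dense):
        if cell in seen:
            continue
        seen.add(cell)
        stack, members = [cell], []
        while stack:
            cx, cy = stack.pop()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        clusters.append(sorted(members))
    return clusters
```

Because each cell can be tested for density independently of the others, the procedure lends itself to the kind of parallel implementation the abstract describes.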

A pareto ensemble based spectral clustering framework

Complex & Intelligent Systems ◽ 10.1007/s40747-020-00215-7 ◽ 2020

Author(s): Juanjuan Luo ◽ Huadong Ma ◽ Dongqing Zhou

Keyword(s): Phase I ◽ Phase II ◽ Spectral Clustering ◽ Clustering Algorithms ◽ Divide And Conquer ◽ Nonzero Entry ◽ Similarity Matrix ◽ Diversity Preservation ◽ Two Phases ◽ Matrix Construction

The similarity matrix has a significant effect on the performance of spectral clustering, and how to determine the neighborhood in the similarity matrix effectively is one of its main difficulties. In this paper, a “divide and conquer” strategy is proposed to model the similarity matrix construction task by adopting a multiobjective evolutionary algorithm (MOEA). The whole procedure is divided into two phases: Phase I aims to determine the nonzero entries of the similarity matrix, and Phase II aims to determine the values of those nonzero entries. In Phase I, the main contribution is that we model the task as a biobjective dynamic optimization problem, which optimizes diversity and similarity at the same time. Each individual determines one nonzero entry for each sample, and the encoding length decreases to O(N) in contrast with non-ensemble multiobjective spectral clustering. In addition, a specific initialization operator and a diversity preservation strategy are proposed for this phase. In Phase II, three ensemble strategies are designed to determine the values of the nonzero entries of the similarity matrix. Furthermore, this Pareto ensemble framework is extended to semi-supervised clustering by transforming the semi-supervised information into constraints. In contrast with previous multiobjective evolutionary spectral clustering algorithms, the proposed Pareto ensemble-based framework strikes a balance between time cost and clustering accuracy, which is demonstrated in the experiments section.

Relation Structure-Aware Heterogeneous Information Network Embedding

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v33i01.33014456 ◽ 2019 ◽ Vol 33 ◽ pp. 4456-4463 ◽ Cited By ~ 8

Author(s): Yuanfu Lu ◽ Chuan Shi ◽ Linmei Hu ◽ Zhiyuan Liu

Keyword(s): Real World ◽ Dimensional Space ◽ Structural Characteristics ◽ Information Network ◽ Network Embedding ◽ Heterogeneous Information Network ◽ Heterogeneous Information ◽ Real World Datasets ◽ Low Dimensional ◽ Embedding Methods

Heterogeneous information network (HIN) embedding aims to embed multiple types of nodes into a low-dimensional space. Although most existing HIN embedding methods consider heterogeneous relations in HINs, they usually employ one single model for all relations without distinction, which inevitably restricts the capability of network embedding. In this paper, we take the structural characteristics of heterogeneous relations into consideration and propose a novel Relation structure-aware Heterogeneous Information Network Embedding model (RHINE). By exploring real-world networks with thorough mathematical analysis, we present two structure-related measures which can consistently divide heterogeneous relations into two categories: Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the distinctive characteristics of relations, in RHINE we propose different models specifically tailored to handle ARs and IRs, which can better capture the structures and semantics of the networks. Finally, we combine and optimize these models in a unified and elegant manner. Extensive experiments on three real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods in various tasks, including node clustering, link prediction, and node classification.

NEURAL ADAPTIVE CONTROL OF NONLINEAR MULTIVARIABLE SYSTEMS WITH APPLICATION TO A CLASS OF INVERTED PENDULUMS

International Journal of Neural Systems ◽ 10.1142/s0129065702001254 ◽ 2002 ◽ Vol 12 (05) ◽ pp. 411-424

Author(s): SHOULING HE

Keyword(s): Neural Networks ◽ Control System ◽ Control Algorithm ◽ Learning Algorithm ◽ Dimensional Space ◽ Three Dimensional ◽ Base Point ◽ Upper Body ◽ Model Parameters ◽ State Variables

In this paper, multilayer neural networks (MNNs) are used to control the balancing of a class of inverted pendulums. Unlike normal inverted pendulums, the pendulum discussed here has two degrees of rotational freedom and the base-point moves randomly in three-dimensional space. The goal is to apply control torques to keep the pendulum in a prescribed position in spite of the random movement at the base-point. Since the inclusion of the base-point motion leads to a non-autonomous dynamic system with time-varying parametric excitation, the design of the control system is a challenging task. A feedback control algorithm is proposed that utilizes a set of neural networks to compensate for the effect of the system's nonlinearities. The weight parameters of the neural networks are updated on-line, according to a learning algorithm that guarantees the Lyapunov stability of the control system. Furthermore, since the base-point movement is considered unmeasurable, a neural inverse model is employed to estimate it from only measured state variables. The estimate is then utilized within the main control algorithm to produce compensating control signals. The examination of the proposed control system, through simulations, demonstrates the promise of the methodology and exhibits positive aspects that cannot be achieved by previously developed techniques on the same problem. These aspects include fast yet well-damped responses with reasonable control torques and no requirement for knowledge of the model or the model parameters. The work presented here can benefit practical problems such as the study of stable locomotion of the human upper body and bipedal robots.

Choosing the Optimal Contact Force Distribution for Multi-Limbed Mobile Robots With Three Feet Contact

Volume 2: 29th Design Automation Conference, Parts A and B ◽ 10.1115/detc2003/dac-48837 ◽ 2003 ◽ Cited By ~ 1

Author(s): Dennis W. Hong ◽ Raymond J. Cipra

Keyword(s): Contact Force ◽ Dimensional Space ◽ Contact Point ◽ Optimal Solution ◽ Solution Space ◽ Contact Forces ◽ Force Distribution ◽ Normal Vector ◽ Optimization Criteria ◽ Contact Force Distribution

One of the inherent problems of multi-limbed mobile robotic systems is the problem of multi-contact force distribution; the contact forces and moments at the feet required to support it and those required by its tasks are indeterminate. A new strategy for choosing an optimal solution for the contact force distribution of multi-limbed robots with three feet in contact with the environment in three-dimensional space is presented. The optimal solution is found using a two-step approach: first finding the description of the entire solution space for the contact force distribution for a statically stable stance under friction constraints, and then choosing an optimal solution in this solution space which maximizes the objectives given by the chosen optimization criteria. An incremental strategy of opening up the friction cones is developed to produce the optimal solution which is defined as the one whose foot contact force vector is closest to the surface normal vector for robustness against slipping. The procedure is aided by using the “force space graph” which indicates where this solution is positioned in the solution space to give insight into the quality of the chosen solution and to provide robustness against disturbances. The “margin against slip with contact point priority” approach is also presented which finds an optimal solution with different priorities given to each foot contact point for the case when one foot is more critical than the other. Examples are presented to illustrate certain aspects of the method and ideas for other optimization criteria are discussed.

Joint Learning of Spectral Clustering Structure and Fuzzy Similarity Matrix of Data

IEEE Transactions on Fuzzy Systems ◽ 10.1109/tfuzz.2018.2856081 ◽ 2019 ◽ Vol 27 (1) ◽ pp. 31-44 ◽ Cited By ~ 5

Author(s): Zekang Bian ◽ Hisao Ishibuchi ◽ Shitong Wang

Keyword(s): Spectral Clustering ◽ Similarity Matrix ◽ Joint Learning ◽ Fuzzy Similarity
