Manifold regularization ensemble clustering with many objectives using unsupervised extreme learning machines

2021 ◽  
Vol 25 (4) ◽  
pp. 847-862
Author(s):  
Haleh Homayouni ◽  
Eghbal G. Mansoori

Spectral clustering has been an effective clustering method in recent decades because it can find an optimal solution without any assumptions about the structure of the data. The key component of spectral clustering is its similarity matrix. Despite many empirical successes in similarity matrix construction, almost all previous methods handle just one objective. To address multi-objective ensemble clustering, we introduce a new ensemble manifold regularization (MR) method based on a stacking framework. In our Manifold Regularization Ensemble Clustering (MREC) method, several objective functions are considered simultaneously to construct the similarity matrix robustly. Using this matrix, the unsupervised extreme learning machine (UELM) is employed to find the generalized eigenvectors that embed the data in a low-dimensional space. These eigenvectors are then used as the base points in spectral clustering to find the best partitioning of the data. The aims of this paper are to find robust partitionings that satisfy multiple objectives, handle noisy data, preserve diversity-based goals, and reduce dimensionality. Experiments on several real-world datasets and three benchmark protein datasets demonstrate the superiority of MREC over several state-of-the-art single and ensemble methods.
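
For orientation, the embedding step of standard spectral clustering can be written as a generalized eigenproblem. This is the textbook formulation and not necessarily the exact UELM objective used in MREC; the symbols W, D, L and m are notation assumed here for illustration only.

```latex
W \in \mathbb{R}^{n \times n} \ \text{(symmetric similarity matrix)}, \qquad
D_{ii} = \sum_{j=1}^{n} W_{ij}, \qquad
L = D - W
\\[4pt]
L\, v_i = \lambda_i\, D\, v_i, \qquad
0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n
```

Under this formulation, sample j is represented by the j-th row of the matrix whose columns are the m eigenvectors following the trivial constant one, and those rows are then grouped by a simple clustering step such as k-means.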

2021 ◽  
Vol 68 (4) ◽  
pp. 1-25
Author(s):  
Thodoris Lykouris ◽  
Sergei Vassilvitskii

Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution, as compared to an offline optimum. On the other hand, machine learning algorithms are in the business of extrapolating patterns found in the data to predict the future, and usually come with strong guarantees on the expected generalization error. In this work, we develop a framework for augmenting online algorithms with a machine learned predictor to achieve competitive ratios that provably improve upon unconditional worst-case lower bounds when the predictor has low error. Our approach treats the predictor as a complete black box and is not dependent on its inner workings or the exact distribution of its errors. We apply this framework to the traditional caching problem: creating an eviction strategy for a cache of size k. We demonstrate that naively following the oracle’s recommendations may lead to very poor performance, even when the average error is quite low. Instead, we show how to modify the Marker algorithm to take into account the predictions and prove that this combined approach achieves a competitive ratio that both (i) decreases as the predictor’s error decreases and (ii) is always capped by O(log k), which can be achieved without any assistance from the predictor. We complement our results with an empirical evaluation of our algorithm on real-world datasets and show that it performs well empirically even when using simple off-the-shelf predictions.
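
The idea of combining marking with predictions can be illustrated with a minimal sketch. This is a simplified marker-style policy, not the algorithm analyzed in the paper: it keeps the phase and marking structure but always evicts the unmarked page whose predicted next use is furthest away, and it omits the safeguards that yield the O(log k) cap. The function names and the `predict_next_use(page, t)` interface are assumptions made for illustration.

```python
def make_oracle(requests):
    """Hypothetical perfect predictor: index of the next request for `page` after time t."""
    def predict(page, t):
        for j in range(t + 1, len(requests)):
            if requests[j] == page:
                return j
        return float("inf")  # page is never requested again
    return predict


def marker_with_predictions(requests, k, predict_next_use):
    """Count cache misses for a marker-style policy guided by predictions.

    Simplification (assumption): among unmarked pages, evict the one with the
    furthest predicted next use. The classic Marker algorithm picks uniformly
    at random instead.
    """
    cache, marked = set(), set()
    misses = 0
    for t, page in enumerate(requests):
        if page not in cache:
            misses += 1
            if len(cache) >= k:
                if not (cache - marked):
                    marked.clear()  # every cached page is marked: start a new phase
                unmarked = cache - marked
                # Ties are broken by page name so the run is deterministic.
                victim = max(unmarked, key=lambda p: (predict_next_use(p, t), p))
                cache.remove(victim)
            cache.add(page)
        marked.add(page)  # a requested page is protected for the rest of the phase
    return misses


requests = list("abcabc")
misses = marker_with_predictions(requests, 2, make_oracle(requests))
```

Because marked pages are never evicted within a phase, a poor predictor can only influence the choice among unmarked pages, which is the intuition behind bounding the damage from bad predictions.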


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2849
Author(s):  
Sungbum Jun

Due to recent advances in the industrial Internet of Things (IoT) in manufacturing, the vast amount of data from sensors has triggered the need for leveraging such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention because of the need to implement reliable manufacturing systems and identify the root causes of faults. However, despite the high interpretability of decision trees, tree-based models make a trade-off between accuracy and interpretability. In order to improve the tree’s performance while maintaining its interpretability, an evolutionary algorithm for discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. The experimental results with two real-world datasets from sensors showed that the decision tree improved by DIMPLED outperformed the single-decision-tree models (C4.5 and CART) that are widely used in practice, and it proved competitive compared to the ensemble methods, which have multiple decision trees. Even though the ensemble methods could produce slightly better performance, the proposed DIMPLED has a more interpretable structure while maintaining an appropriate performance level.


2021 ◽  
Vol 12 (4) ◽  
pp. 118-131
Author(s):  
Jaya Krishna Raguru ◽  
Devi Prasad Sharma

The problem of identifying a seed set of K nodes that maximizes influence spread over a social network is known as influence maximization (IM). Past work showed this problem to be NP-hard, and greedy algorithms were shown to reach only about 63% of the optimal spread. Moreover, the greedy approach suffers from high computational cost. Furthermore, in a network with communities, IM spread is not always certain. In this paper, a heterogeneous influence maximization through community detection (HIMCD) algorithm is proposed. This approach selects initial seed nodes within communities using various centrality measures, and these seed nodes act as sources for influence spread. Parallel influence maximization is then applied using the seed node set contained in each group: the graph is partitioned and IM computations are done in a distributed manner. Extensive experiments with two real-world datasets reveal that HIMCD achieves substantial performance improvement over state-of-the-art techniques.
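
Per-community seed selection can be sketched as follows. This is a minimal illustration assuming the communities are already known and using within-community degree as the only centrality measure; the abstract does not specify which measures or which community detection method the authors use, and the function name and data layout are hypothetical.

```python
def select_seeds_by_community(adjacency, communities, seeds_per_community=1):
    """Pick the highest-degree nodes inside each community as seed nodes.

    adjacency: dict mapping a node to an iterable of its neighbours.
    communities: iterable of node collections (assumed to be precomputed).
    """
    seeds = []
    for community in communities:
        members = set(community)

        def internal_degree(node):
            # Count only neighbours that belong to the same community.
            return sum(1 for nb in adjacency.get(node, ()) if nb in members)

        # Highest internal degree first; node name breaks ties deterministically.
        ranked = sorted(members, key=lambda n: (-internal_degree(n), n))
        seeds.extend(ranked[:seeds_per_community])
    return seeds


adjacency = {
    "a": ["b", "c"], "b": ["a"], "c": ["a", "d"],
    "d": ["c", "e", "f"], "e": ["d"], "f": ["d"],
}
communities = [{"a", "b", "c"}, {"d", "e", "f"}]
seeds = select_seeds_by_community(adjacency, communities)
```

Since each community is processed independently, the loop body could be dispatched to separate workers, which matches the distributed computation the abstract describes.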


Author(s):  
Yang Fang ◽  
Xiang Zhao ◽  
Zhen Tan

Network Embedding (NE) is an important method for learning representations of a network in a low-dimensional space. Conventional NE models focus on capturing the structure and semantic information of vertices while neglecting such information for edges. In this work, we propose a novel NE model named BimoNet to capture both the structure and semantic information of edges. BimoNet is composed of two parts: the bi-mode embedding part and the deep neural network part. In the bi-mode embedding part, the first mode, named add-mode, is used to express the entity-shared features of edges, and the second mode, named subtract-mode, is employed to represent the entity-specific features of edges. These features reflect the semantic information. In the deep neural network part, we first regard the edges in a network as nodes and the vertices as links, which does not change the overall structure of the network. Then we take the nodes' adjacency matrix as the input of the deep neural network, as it can obtain similar representations for nodes with similar structure. Afterwards, by jointly optimizing the objective functions of these two parts, BimoNet can preserve both the semantic and structure information of edges. In experiments, we evaluate BimoNet on three real-world datasets on the task of relation extraction, and BimoNet is shown to outperform state-of-the-art baseline models consistently and significantly.
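
The step of treating edges as nodes and vertices as links resembles the classical line-graph construction. The sketch below assumes that reading and an undirected network; BimoNet's exact construction may differ, and the function name is hypothetical.

```python
from itertools import combinations


def line_graph_adjacency(edges):
    """Adjacency matrix of the graph whose nodes are the original edges.

    Two edge-nodes are linked when the original edges share an endpoint
    (undirected line graph; an assumption, not BimoNet's stated definition).
    """
    n = len(edges)
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if set(edges[i]) & set(edges[j]):  # shared vertex becomes a link
            adj[i][j] = adj[j][i] = 1
    return adj


edges = [("a", "b"), ("b", "c"), ("c", "d")]
adj = line_graph_adjacency(edges)
```

Each row of the resulting matrix describes the neighbourhood of one original edge, which is the kind of input a deep network could use to produce similar representations for structurally similar edges.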


Processes ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. 1132
Author(s):  
Deting Kong ◽  
Yuan Wang ◽  
Xinyan Wu ◽  
Xiyu Liu ◽  
Jianhua Qu ◽  
...  

In this paper, we propose a novel clustering approach based on P systems and a grid-density strategy. We present a grid-density based approach for clustering high-dimensional data, which first projects the data patterns onto a two-dimensional space to overcome the curse of dimensionality. Then, clusters are found by meshing the plane with grid lines and deleting sparse grids. In particular, we present weighted spiking neural P systems with anti-spikes and astrocytes (WSNPA2 for short) to implement the grid-density based approach in parallel. Each neuron in a weighted SN P system contains a spike, which can be expressed by a computable real number. Spikes and anti-spikes are inspired by neurons communicating through excitatory and inhibitory impulses. Astrocytes have excitatory and inhibitory influence on synapses. Experimental results on multiple real-world datasets demonstrate the effectiveness and efficiency of our approach.
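
The grid-density step (mesh the plane, delete sparse cells, keep what remains as clusters) can be sketched sequentially. This is an illustration under stated assumptions: the points are already projected to two dimensions, adjacent dense cells (8-neighbourhood) are merged into one cluster, and the parameters `cell_size` and `min_points` are hypothetical. It does not model the P system or its parallel execution.

```python
from collections import defaultdict, deque


def grid_density_clusters(points, cell_size, min_points):
    """Label 2-D points by connected groups of dense grid cells; -1 marks noise."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # Delete sparse cells: only cells with enough points survive.
    dense = {c for c, members in cells.items() if len(members) >= min_points}

    labels = [-1] * len(points)
    seen = set()
    cluster_id = 0
    for start in sorted(dense):
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        while queue:  # breadth-first search over neighbouring dense cells
            cx, cy = queue.popleft()
            for idx in cells[(cx, cy)]:
                labels[idx] = cluster_id
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        cluster_id += 1
    return labels


points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2),
          (5.1, 5.2), (5.3, 5.4), (5.2, 5.1),
          (9.5, 0.5)]
labels = grid_density_clusters(points, cell_size=1.0, min_points=2)
```

The per-cell counting and the per-cell neighbour checks are independent of one another, which is what makes this kind of procedure a natural candidate for a parallel implementation.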


Author(s):  
Juanjuan Luo ◽  
Huadong Ma ◽  
Dongqing Zhou

The similarity matrix has a significant effect on the performance of spectral clustering, and determining the neighborhood in the similarity matrix effectively is one of its main difficulties. In this paper, a “divide and conquer” strategy is proposed to model the similarity matrix construction task by adopting a multiobjective evolutionary algorithm (MOEA). The whole procedure is divided into two phases: Phase I aims to determine the nonzero entries of the similarity matrix, and Phase II aims to determine the values of those nonzero entries. In Phase I, the main contribution is that we model the task as a biobjective dynamic optimization problem, which optimizes diversity and similarity at the same time. Each individual determines one nonzero entry for each sample, and the encoding length decreases to O(N) in contrast with non-ensemble multiobjective spectral clustering. In addition, a specific initialization operator and a diversity preservation strategy are proposed for this phase. In Phase II, three ensemble strategies are designed to determine the values of the nonzero entries of the similarity matrix. Furthermore, this Pareto ensemble framework is extended to semi-supervised clustering by transforming the semi-supervised information into constraints. In contrast with previous multiobjective evolutionary-based spectral clustering algorithms, the proposed Pareto ensemble-based framework strikes a balance between time cost and clustering accuracy, as demonstrated in the experiments.
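
The two-phase split (first decide which entries are nonzero, then decide their values) can be illustrated with a common baseline. Note the substitution: the sketch uses a k-nearest-neighbour rule for the sparsity pattern and a Gaussian kernel for the values, not the evolutionary search or the ensemble strategies proposed in the paper. The function name and the parameters `k` and `sigma` are assumptions.

```python
import math


def knn_similarity(points, k, sigma=1.0):
    """Build a symmetric sparse similarity matrix as a list of lists."""
    n = len(points)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Step 1 (pattern): the k nearest neighbours of sample i get nonzero entries.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (sqdist(points[i], points[j]), j),
        )[:k]
        # Step 2 (values): a Gaussian kernel sets the weight of each kept entry.
        for j in neighbours:
            w = math.exp(-sqdist(points[i], points[j]) / (2 * sigma ** 2))
            W[i][j] = W[j][i] = w  # symmetrize
    return W


W = knn_similarity([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)], k=1)
```

Separating the pattern from the values makes it possible to change one decision without redoing the other, which is the property the two-phase design relies on.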


Author(s):  
Yuanfu Lu ◽  
Chuan Shi ◽  
Linmei Hu ◽  
Zhiyuan Liu

Heterogeneous information network (HIN) embedding aims to embed multiple types of nodes into a low-dimensional space. Although most existing HIN embedding methods consider heterogeneous relations in HINs, they usually employ one single model for all relations without distinction, which inevitably restricts the capability of network embedding. In this paper, we take the structural characteristics of heterogeneous relations into consideration and propose a novel Relation structure-aware Heterogeneous Information Network Embedding model (RHINE). By exploring real-world networks with thorough mathematical analysis, we present two structure-related measures which can consistently divide heterogeneous relations into two categories: Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the distinctive characteristics of relations, in RHINE we propose different models specifically tailored to handle ARs and IRs, which can better capture the structures and semantics of the networks. Finally, we combine and optimize these models in a unified and elegant manner. Extensive experiments on three real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods in various tasks, including node clustering, link prediction, and node classification.


2002 ◽  
Vol 12 (05) ◽  
pp. 411-424
Author(s):  
SHOULING HE

In this paper, multilayer neural networks (MNNs) are used to control the balancing of a class of inverted pendulums. Unlike normal inverted pendulums, the pendulum discussed here has two degrees of rotational freedom and the base-point moves randomly in three-dimensional space. The goal is to apply control torques to keep the pendulum in a prescribed position in spite of the random movement at the base-point. Since the inclusion of the base-point motion leads to a non-autonomous dynamic system with time-varying parametric excitation, the design of the control system is a challenging task. A feedback control algorithm is proposed that utilizes a set of neural networks to compensate for the effect of the system's nonlinearities. The weight parameters of the neural networks are updated on-line, according to a learning algorithm that guarantees the Lyapunov stability of the control system. Furthermore, since the base-point movement is considered unmeasurable, a neural inverse model is employed to estimate it from only measured state variables. The estimate is then utilized within the main control algorithm to produce compensating control signals. The examination of the proposed control system, through simulations, demonstrates the promise of the methodology and exhibits positive aspects which cannot be achieved by previously developed techniques for the same problem. These aspects include fast, yet well-damped responses with reasonable control torques and no requirement for knowledge of the model or the model parameters. The work presented here can benefit practical problems such as the study of stable locomotion of the human upper body and bipedal robots.


Author(s):  
Dennis W. Hong ◽  
Raymond J. Cipra

One of the inherent problems of multi-limbed mobile robotic systems is the problem of multi-contact force distribution; the contact forces and moments at the feet required to support the robot and those required by its tasks are indeterminate. A new strategy for choosing an optimal solution for the contact force distribution of multi-limbed robots with three feet in contact with the environment in three-dimensional space is presented. The optimal solution is found using a two-step approach: first finding the description of the entire solution space for the contact force distribution for a statically stable stance under friction constraints, and then choosing an optimal solution in this solution space which maximizes the objectives given by the chosen optimization criteria. An incremental strategy of opening up the friction cones is developed to produce the optimal solution, which is defined as the one whose foot contact force vector is closest to the surface normal vector for robustness against slipping. The procedure is aided by using the “force space graph”, which indicates where this solution is positioned in the solution space to give insight into the quality of the chosen solution and to provide robustness against disturbances. The “margin against slip with contact point priority” approach is also presented, which finds an optimal solution with different priorities given to each foot contact point for the case when one foot is more critical than the others. Examples are presented to illustrate certain aspects of the method, and ideas for other optimization criteria are discussed.
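
The notion of a contact force being close to the surface normal can be made concrete with the standard Coulomb friction cone. The sketch below computes an angular margin for a single contact, assuming a friction coefficient `mu`; it only illustrates the geometric criterion and is not the paper's incremental cone-opening procedure or its force space graph. The function name is hypothetical.

```python
import math


def friction_cone_margin(force, normal, mu):
    """Angular margin (radians) between a contact force and the cone boundary.

    Positive: the force lies inside the Coulomb friction cone (no slip).
    Negative: the force lies outside the cone (slip expected).
    """
    n_len = math.sqrt(sum(c * c for c in normal))
    n_hat = [c / n_len for c in normal]

    # Split the force into its normal and tangential parts.
    f_n = sum(f * c for f, c in zip(force, n_hat))
    tangential = [f - f_n * c for f, c in zip(force, n_hat)]
    f_t = math.sqrt(sum(c * c for c in tangential))

    angle_to_normal = math.atan2(f_t, f_n)
    return math.atan(mu) - angle_to_normal


inside = friction_cone_margin((0.0, 0.0, 10.0), (0.0, 0.0, 1.0), mu=0.5)
outside = friction_cone_margin((6.0, 0.0, 10.0), (0.0, 0.0, 1.0), mu=0.5)
```

A force aligned with the normal gives the largest margin, equal to the cone half-angle atan(mu), which is why choosing the force closest to the normal is a natural robustness criterion.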

