Adaptive sampling using fleets of underwater gliders in the presence of fixed buoys using a constrained clustering algorithm

Author(s):  
Marco Cococcioni ◽  
Beatrice Lazzerini ◽  
Pierre F.J. Lermusiaux


2022 ◽  
pp. 1-38
Author(s):  
Qi Zhang ◽  
Yizhong Wu ◽  
Li Lu ◽  
Ping Qiao

High dimensional model representation (HDMR), which decomposes a high-dimensional problem into summands of component terms of different order, has been widely studied as a way to mitigate the "curse of dimensionality" when surrogate techniques are used to approximate high-dimensional problems in engineering design. However, existing single-metamodel HDMRs usually suffer from prediction uncertainty, while current multi-metamodel HDMRs cannot provide simple explicit expressions for black-box problems and are computationally expensive both to construct from the explored points and to evaluate at unobserved locations. To address these problems, a new stand-alone HDMR metamodeling technique, termed Dendrite-HDMR, is proposed in this study based on hierarchical Cut-HDMR and the white-box machine learning algorithm Dendrite Net. The proposed Dendrite-HDMR not only provides succinct, explicit expressions in the form of a Taylor expansion, but also achieves higher accuracy and stronger stability on most mathematical functions than other classical HDMRs, aided by the proposed adaptive sampling strategy KKMC, in which the k-means clustering algorithm, the k-nearest-neighbor classification algorithm, and the maximum-curvature information of the current expression are used to select new points for refining the model. Finally, Dendrite-HDMR is applied to the design optimization of a solid launch vehicle propulsion system, with the goal of improving the impulse-to-weight ratio, a key indicator of the propulsion system's design quality.
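As a rough illustration of the KKMC idea, the sketch below (not the authors' implementation) clusters a candidate pool with k-means, fits a k-nearest-neighbor classifier to the partition, and picks new samples from the cluster containing the point of maximum estimated curvature. The `surrogate` callable, the finite-difference curvature estimate, and all parameter values are assumptions.

```python
# Minimal sketch of a KKMC-style adaptive sampling step (an interpretation of
# the abstract, not the reference implementation).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def kkmc_sample(surrogate, candidates, n_clusters=5, n_new=3, eps=1e-3):
    """Pick new sample points in the region of maximum estimated curvature.

    surrogate  : callable mapping an (n, d) array to (n,) predictions (assumed)
    candidates : (n, d) array of unobserved candidate locations
    """
    d = candidates.shape[1]
    f0 = surrogate(candidates)
    curv = np.zeros(len(candidates))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        # Second-order central difference along dimension j approximates curvature.
        curv += np.abs(surrogate(candidates + step) - 2.0 * f0
                       + surrogate(candidates - step)) / eps ** 2

    # Partition the candidate pool with k-means.
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(candidates)

    # A k-NN classifier reproduces the partition so fresh points can later be
    # routed to the same regions.
    knn = KNeighborsClassifier(n_neighbors=3).fit(candidates, labels)

    # Refine inside the cluster that contains the maximum-curvature point.
    target = labels[np.argmax(curv)]
    in_target = np.where(labels == target)[0]
    picked = in_target[np.argsort(curv[in_target])[-n_new:]]
    return candidates[picked], knn
```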


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
M. A. Balafar ◽  
R. Hazratgholizadeh ◽  
M. R. F. Derakhshi

Constrained clustering aims to improve accuracy and personalization based on constraints expressed by an Oracle. In this paper, a new constrained clustering algorithm is proposed in which the most informative data pairs are selected during an iterative process, presented to the Oracle, and labeled as "must-link (ML)" or "cannot-link (CL)." In each iteration, a support vector machine (SVM) is first trained on the labels produced by the current clustering, and a distance matrix is built from the distance of each document to the hyperplane. A similarity matrix is also built from the cosine similarity of the word2vec representations of the documents. Two probabilities (similarity and degree of similarity) are computed and smoothed over neighborhoods, where neighborhoods are formed from the samples the Oracle has already labeled as belonging to the same cluster. Finally, at the end of each iteration, the data pairs with the greatest uncertainty (in terms of probability) are selected for querying the Oracle. For evaluation, the proposed method is compared with well-known state-of-the-art methods on two criteria over a standard dataset. The results show increased accuracy and stability while requiring fewer questions.
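A minimal sketch of one query-selection iteration in the spirit of this description is given below. The uncertainty score, the blending of SVM margins with cosine similarities, and the variable names are illustrative assumptions rather than the authors' exact formulation.

```python
# Hedged sketch: pick the document pair whose must-link/cannot-link relation
# looks most uncertain under the current clustering.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics.pairwise import cosine_similarity

def select_query_pair(doc_vectors, current_labels):
    """doc_vectors: (n, d) array of document embeddings (e.g. averaged word2vec);
    current_labels: 1-D integer array of current cluster assignments."""
    # SVM trained on the current clustering; distance to the hyperplane(s)
    # gives a margin-based confidence for every document.
    svm = LinearSVC().fit(doc_vectors, current_labels)
    margins = np.abs(svm.decision_function(doc_vectors))
    if margins.ndim > 1:                      # multi-class: keep the best margin
        margins = margins.max(axis=1)

    # Cosine similarity between document vectors, clipped to [0, 1].
    sim = np.clip(cosine_similarity(doc_vectors), 0.0, 1.0)

    # Crude "probability of belonging together": same-cluster indicator blended
    # with similarity, weighted up when either document sits near a hyperplane.
    same = (current_labels[:, None] == current_labels[None, :]).astype(float)
    conf = np.minimum(margins[:, None], margins[None, :])
    p_together = 0.5 * same + 0.5 * sim
    uncertainty = (1.0 - np.abs(2.0 * p_together - 1.0)) / (conf + 1e-9)

    np.fill_diagonal(uncertainty, -np.inf)    # never pair a document with itself
    i, j = np.unravel_index(np.argmax(uncertainty), uncertainty.shape)
    return i, j                               # ask the Oracle: must-link or cannot-link?
```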


2017 ◽  
Vol 25 (2) ◽  
pp. 96-113 ◽  
Author(s):  
Matin Macktoobian ◽  
Mahdi Aliyari Sh

A spatially constrained clustering algorithm is presented in this paper. The algorithm is a distributed clustering approach that fine-tunes the distances between agents of the system to strengthen data passing among them, subject to a set of spatial constraints. This method increases interconnectivity among agents and clusters, improving the overall communicative functionality of the multi-robot system. The strategy establishes loosely coupled connections among the clusters, and these implicit interconnections enable the clusters to receive and transmit information within the multi-agent system. In other words, the algorithm assigns each agent to the cluster with the lowest cost of local communication with its peers. This research demonstrates, through a probabilistic proof of the achieved optimality, that the presented decentralized method boosts the communicative agility of the swarm. Moreover, the common assumption that the agents' initial locations are fully known, made by earlier methods, is fully relaxed, and the algorithm's reliability and efficiency are confirmed. Furthermore, the method's efficacy in passing information improves the functionality of higher-level swarm operations such as task assignment and swarm flocking. Analytical investigations and simulations of highly populated swarms support the claimed efficiency and coherence.
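The assignment rule can be pictured roughly as follows: each agent repeatedly joins the cluster with the lowest local communication cost among the clusters that have a peer within range. The cost model, the `max_range` constraint, and the centralized loop are assumptions for illustration, not the paper's decentralized protocol.

```python
# Hedged sketch of lowest-communication-cost cluster assignment under a
# spatial range constraint.
import numpy as np

def assign_agents(positions, n_clusters, max_range=10.0, n_iters=20, seed=0):
    """positions: (n, 2) array of agent coordinates; returns cluster labels."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=len(positions))

    for _ in range(n_iters):
        changed = False
        for i, p in enumerate(positions):
            costs = np.full(n_clusters, np.inf)
            for c in range(n_clusters):
                members = positions[(labels == c) & (np.arange(len(positions)) != i)]
                if len(members) == 0:
                    costs[c] = 0.0            # empty cluster: free to join
                    continue
                d = np.linalg.norm(members - p, axis=1)
                if d.min() <= max_range:      # spatial constraint: a peer in range
                    costs[c] = d.mean()       # local communication cost
            best = int(np.argmin(costs))
            if np.isfinite(costs[best]) and best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```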


Author(s):  
Erliang Zeng ◽  
Chengyong Yang ◽  
Tao Li ◽  
Giri Narasimhan

Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data; such data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task of developing exploratory analytical techniques that deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm developed here performs clustering with multiple, but complete, sources of data. To deal with incomplete data sources, the authors adopt the MPCK-means clustering algorithm to perform exploratory analysis on one complete source together with other, potentially incomplete, sources provided in the form of constraints. This paper presents the new MSC clustering algorithm for exploratory analysis of two or more diverse but complete data sources, studies the effectiveness of the constraint sets and the robustness of the constrained clustering algorithm on multiple sources of incomplete biological data, and incorporates such incomplete data into the constrained clustering algorithm in the form of constraint sets.
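One way to picture the constraint sets is sketched below: an incomplete secondary source (for example, partial functional annotations covering only some genes) is converted into must-link and cannot-link pairs that a constrained algorithm such as MPCK-means can consume alongside the complete expression data. The conversion rule and the names are illustrative assumptions, not the authors' exact procedure.

```python
# Hedged sketch: derive must-link / cannot-link constraint sets from an
# incomplete secondary annotation source.
from itertools import combinations

def constraints_from_partial_labels(partial_labels):
    """partial_labels: dict gene -> annotation, only for genes that have one."""
    must_link, cannot_link = [], []
    for (g1, a1), (g2, a2) in combinations(partial_labels.items(), 2):
        if a1 == a2:
            must_link.append((g1, g2))       # same annotation: should co-cluster
        else:
            cannot_link.append((g1, g2))     # different annotation: keep apart
    return must_link, cannot_link

# Example with annotations known for only three genes (hypothetical names):
ml, cl = constraints_from_partial_labels(
    {"geneA": "ribosome", "geneB": "ribosome", "geneC": "kinase"})
# ml == [("geneA", "geneB")]; cl pairs geneC against the other two.
```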


2014 ◽  
Vol 23 (04) ◽  
pp. 1460013 ◽  
Author(s):  
Marian-Andrei Rizoiu ◽  
Julien Velcin ◽  
Stéphane Lallich

In this paper, we propose a new time-aware dissimilarity measure that takes the temporal dimension into account: observations that are close in the description space but distant in time are considered dissimilar. We also propose a method to enforce segmentation contiguity by introducing into the objective function a penalty term inspired by the normal distribution function. We combine the two proposals into a novel time-driven constrained clustering algorithm, called TDCK-Means, which creates a partition of clusters that are coherent both in the multidimensional description space and in the temporal space. The algorithm uses soft semi-supervised constraints to encourage adjacent observations belonging to the same entity to be assigned to the same cluster. We apply the algorithm to a Political Studies dataset in order to detect typical evolution phases, adapting the Shannon entropy to measure entity contiguity, and we show that our proposal consistently improves the temporal cohesion of clusters without any significant loss in multidimensional variance.
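A minimal sketch of a time-aware dissimilarity in this spirit is given below; the multiplicative combination, the normalization constants, and the Gaussian-shaped contiguity penalty are illustrative assumptions and not the exact formulas from the paper.

```python
# Hedged sketch: two observations count as similar only when they are close
# both in the description space and in time.
import numpy as np

def time_aware_dissimilarity(x_desc, x_time, y_desc, y_time,
                             max_desc_dist, max_time_dist):
    d_desc = np.linalg.norm(x_desc - y_desc) / max_desc_dist   # roughly in [0, 1]
    d_time = abs(x_time - y_time) / max_time_dist              # roughly in [0, 1]
    # Distance in either component pushes the pair toward "dissimilar".
    return 1.0 - (1.0 - d_desc) * (1.0 - d_time)

def contiguity_penalty(t, t_boundary, sigma):
    """Gaussian-shaped penalty (inspired by the normal density) discouraging an
    observation at time t from switching cluster near a segment boundary."""
    return np.exp(-((t - t_boundary) ** 2) / (2.0 * sigma ** 2))
```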

