An adaptive Dendrite-HDMR metamodeling technique for high dimensional problems

2022 ◽  
pp. 1-38
Author(s):  
Qi Zhang ◽  
Yizhong Wu ◽  
Li Lu ◽  
Ping Qiao

High dimensional model representation (HDMR), which decomposes a high-dimensional problem into a sum of component terms of different orders, has been widely studied as a way to overcome the “curse of dimensionality” when surrogate techniques are used to approximate high-dimensional problems in engineering design. However, the available single-metamodel-based HDMRs usually suffer from prediction uncertainty, while current multi-metamodel-based HDMRs cannot provide simple explicit expressions for black-box problems and have high computational complexity, both in constructing the model from the explored points and in predicting the responses at unobserved locations. To address these problems, this study proposes a new stand-alone HDMR metamodeling technique, termed Dendrite-HDMR, based on the hierarchical Cut-HDMR and the white-box machine learning algorithm Dendrite Net. The proposed Dendrite-HDMR not only provides succinct and explicit expressions in the form of a Taylor expansion, but also achieves higher accuracy and stronger stability than other classical HDMRs for most of the mathematical functions tested. It does so with the assistance of the proposed adaptive sampling strategy, named KKMC, in which the k-means clustering algorithm, the k-Nearest Neighbor classification algorithm and the maximum curvature information of the provided expression are used to sample new points that refine the model. Finally, the Dendrite-HDMR technique is applied to the design optimization of a solid launch vehicle propulsion system, with the aim of improving the impulse-weight ratio, which represents the design level of the propulsion system.
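To make the decomposition idea concrete, the following is a minimal sketch of a generic first-order Cut-HDMR approximation, in which the response is approximated by its value at a cut point plus one univariate correction per input. It is not the Dendrite-HDMR method of the abstract: Dendrite Net and the KKMC sampling strategy are not reproduced, and the names `first_order_cut_hdmr`, `additive` and `coupled` are hypothetical, chosen only for illustration.

```python
def first_order_cut_hdmr(f, cut_point):
    """Return a first-order Cut-HDMR approximation of f around cut_point."""
    c = list(cut_point)
    f0 = f(c)  # zeroth-order term: response at the cut point

    def component(i, xi):
        # First-order term f_i(x_i) = f(c_1, ..., x_i, ..., c_n) - f0
        point = c.copy()
        point[i] = xi
        return f(point) - f0

    def approx(x):
        # f(x) is approximated by f0 plus the sum of univariate terms
        return f0 + sum(component(i, xi) for i, xi in enumerate(x))

    return approx


def additive(x):
    # Hypothetical test function with no interaction between inputs
    return x[0] ** 2 + 3.0 * x[1] - x[2]


def coupled(x):
    # Hypothetical test function with a pure two-input interaction
    return x[0] * x[1]


approx_additive = first_order_cut_hdmr(additive, [0.5, 0.5, 0.5])
approx_coupled = first_order_cut_hdmr(coupled, [0.5, 0.5])
```

For the additive function the first-order expansion is exact, whereas for the coupled function it misses the interaction term; higher-order component terms exist to capture that remainder.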

Author(s):  
Yaping Ju ◽  
Geoff Parks ◽  
Chuhua Zhang

A major challenge of metamodeling in simulation-based engineering design optimization is to handle the “curse of dimensionality,” i.e. the exponential growth of computational cost with increase of problem dimensionality. Encouragingly, it has been reported recently that a high-dimensional model representation assisted by a radial basis function is capable of deriving high-dimensional input–output relationships at dramatically reduced computational cost. In this article, support vector regression is employed as an alternative to be coupled with high-dimensional model representation for the metamodeling of high-dimensional problems. In particular, the bisection sampling method is proposed to be used in the metamodeling process to generate high-quality training samples. Testing and comparison results show that the developed bisection-sampling-based support vector regression–high-dimensional model representation metamodeling technique can achieve high modeling accuracy with a smaller number of training sample evaluations. For the problem examined in this study, the bisection-sampling-based support vector regression–high-dimensional model representation enables high modeling accuracy and linear computational complexity as the problem dimensionality increases. Analysis of this performance advantage shows that the use of bisection method enables the developed metamodeling technique to be more effective in dealing with high-dimensional problems.


2012 ◽  
Vol 263-266 ◽  
pp. 2126-2130 ◽  
Author(s):  
Zhi Gang Lou ◽  
Hong Zhao Liu

Manifold learning is a recent unsupervised learning method whose main purpose is to find the inherent structure of a data set. Applied to high-dimensional nonlinear fault samples, it identifies the low-dimensional manifold embedded in the high-dimensional data space and can thereby reveal the essential characteristics needed for fault identification. Among the many types of fault, some fault conditions resemble normal operating conditions of the equipment and are easily misjudged. In oil pipeline transportation, for example, normal operations such as regulating a pump, adjusting a valve, or switching a pump produce spectral characteristics similar to those of a pipeline leak, so leak detection is prone to errors. This paper uses a manifold learning algorithm for clustering-based recognition of fault patterns and evaluates the algorithm through experiments.


2009 ◽  
Vol 03 (04) ◽  
pp. 399-419
Author(s):  
ASLI CELIKYILMAZ

Unsupervised spectral clustering methods can yield good performance when identifying crisp clusters with low complexity, since the learning algorithm does not rely on finding the local minima of an objective function and instead uses spectral properties of the graph. Nonetheless, the performance of such approaches is usually affected by their uncertain parameters. Using the underlying structure of a general spectral clustering method, this paper introduces a new soft-link spectral clustering algorithm that identifies clusters based on a fuzzy k-nearest neighbor approach. We construct a soft weight matrix of a graph by identifying the upper and lower boundaries of the learning parameters of the similarity function, specifically the fuzzifier parameter (fuzziness) of the Fuzzy k-Nearest Neighbor algorithm. The algorithm allows perturbations of the graph Laplacian during the learning stage through changes in these learning parameters. With an empirical analysis using an artificial dataset and a real textual entailment dataset, we demonstrate that our initial hypothesis of implementing soft links for spectral clustering can improve the classification performance of the final outcome.


2020 ◽  
Vol 10 (10) ◽  
pp. 270
Author(s):  
Dang-Nhac Lu ◽  
Hong-Quang Le ◽  
Tuan-Ha Vu

The Covid-19 epidemic is affecting all areas of life, including the training activities of universities around the world. Online learning is therefore an effective method at present and is used by many universities. However, not all training institutions have sufficient conditions, resources, and experience to carry out online learning, especially in under-resourced developing countries. Building traditional (face-to-face), e-learning, or blended learning courses that still meet students’ needs under limited conditions is thus a problem faced by many universities today. To address this problem, we propose a method for evaluating the influence of these factors on the e-learning system. The method clarifies the importance of each factor and prioritizes construction investment for it based on the K-means clustering algorithm, using data from students who have been participating in the system. At the same time, we propose a model that helps students choose the learning method (traditional, e-learning or blended learning) best suited to their skills and abilities. Data classification with the multilayer perceptron (MP), random forest (RF), K-nearest neighbor (KNN), support vector machine (SVM) and naïve Bayes (NB) algorithms is applied to find the best-fitting model. The experiment was conducted on 679 data samples collected from 303 students studying at the Academy of Journalism and Communication (AJC), Vietnam. With our proposed method, the experimental results show the different effects of infrastructure, teachers, and courses, as well as of the features of these factors. At the same time, the accuracy of the predictions that help students choose an appropriate learning method is up to 81.52%.


2015 ◽  
pp. 125-138 ◽  
Author(s):  
I. V. Goncharenko

In this article we propose a new method of non-hierarchical cluster analysis using a k-nearest-neighbor graph and discuss it with respect to vegetation classification. The method of k-nearest neighbor (k-NN) classification was originally developed in 1951 (Fix, Hodges, 1951). Later, the term “k-NN graph” and a few algorithms for k-NN clustering appeared (Cover, Hart, 1967; Brito et al., 1997). In biology, k-NN is used in the analysis of protein structures and genome sequences. Most k-NN clustering algorithms first build an “excessive” graph, a so-called hypergraph, and then truncate it into subgraphs by partitioning and coarsening the hypergraph. We developed a different strategy, “upward” clustering, which forms (assembles sequentially) one cluster after another. Until now, graph-based cluster analysis has not been considered for the classification of vegetation datasets.
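As an illustration of what a k-NN graph is, the sketch below builds one from Euclidean distances and then groups points by the connected components of the symmetrized graph. This is a deliberately simple stand-in and not the author's “upward” clustering, whose details the abstract does not give; the function names and the toy coordinates are assumptions made for the example.

```python
import math


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest neighbors."""
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by Euclidean distance; break ties by index for determinism
        others.sort(key=lambda j: (math.dist(p, points[j]), j))
        graph[i] = others[:k]
    return graph


def connected_components(graph):
    """Group nodes by connected components of the symmetrized k-NN graph."""
    undirected = {i: set() for i in graph}
    for i, neighbors in graph.items():
        for j in neighbors:
            undirected[i].add(j)
            undirected[j].add(i)
    seen, components = set(), []
    for start in sorted(undirected):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for neighbor in undirected[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


# Toy data: two well-separated groups of three points each
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
graph = knn_graph(points, k=2)
clusters = connected_components(graph)
```

With `k=2` every point's neighbors lie inside its own group, so the two groups come out as separate components; a larger `k` would link them, which shows how sensitive graph-based clustering is to this parameter.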


2009 ◽  
Vol 35 (7) ◽  
pp. 859-866
Author(s):  
Ming LIU ◽  
Xiao-Long WANG ◽  
Yuan-Chao LIU

Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 779
Author(s):  
Ruriko Yoshida

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss its application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
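The idea can be sketched with the standard tropical metric, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), which is unchanged when the same constant is added to every coordinate of a vector, as required on the tropical projective torus. The classifier below is a plain majority-vote KNN using that distance; it is a minimal sketch rather than the paper's construction, and the function names, labels and sample vectors are hypothetical.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_knn_predict(train, labels, query, k):
    """Majority vote among the k training vectors closest to query."""
    # Sort indices by tropical distance; break ties by index
    order = sorted(range(len(train)),
                   key=lambda i: (tropical_distance(train[i], query), i))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: two groups of vectors with labels "a" and "b"
train = [(0, 0, 0), (0, 1, 0), (0, 0, 1),
         (0, 5, 5), (0, 6, 5), (0, 5, 6)]
labels = ["a", "a", "a", "b", "b", "b"]

# The query is (0, 0.5, 0) shifted by 10 in every coordinate
query = (10, 10.5, 10)
prediction = tropical_knn_predict(train, labels, query, k=3)
```

The k nearest vectors are exactly those inside the smallest tropical ball centered at the query that contains k training vectors, which is the picture the abstract describes.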


2021 ◽  
pp. 1-17
Author(s):  
Ahmed Al-Tarawneh ◽  
Ja’afer Al-Saraireh

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, and are thus limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection on the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects, from a hackers’ community on Twitter, a list of users with similar interests together with the followers who share their posts. The list is built from a set of suggested keywords that hackers commonly use in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is then used in a machine learning process that applies different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and to determine whether tweets pose a future risk to institutions and individuals, providing early warning of possible attacks.

