Development of Fast and Reliable Nature-Inspired Computing for Supervised Learning in High-Dimensional Data

Nature Inspired Computing for Data Science - Studies in Computational Intelligence ◽

10.1007/978-3-030-33820-6_5 ◽

2019 ◽

pp. 109-138 ◽

Author(s):

Hiram Ponce ◽

Guillermo González-Mora ◽

Elizabeth Morales-Olvera ◽

Paulo Souza

Keyword(s):

Supervised Learning ◽

High Dimensional Data ◽

High Dimensional ◽

Nature Inspired Computing

On Cluster-Aware Supervised Learning: Frameworks, Convergent Algorithms, and Applications

INFORMS Journal on Computing ◽

10.1287/ijoc.2020.1053 ◽

2021 ◽

Author(s):

Shutong Chen ◽

Weijun Xie

Keyword(s):

Supervised Learning ◽

Random Forests ◽

Stationary Point ◽

Clustering Analysis ◽

Prediction Accuracy ◽

High Dimensional Data ◽

Computational Time ◽

High Dimensional ◽

Support Vector ◽

Numerical Studies

This paper proposes a cluster-aware supervised learning (CluSL) framework, which integrates clustering analysis with supervised learning. The objective of CluSL is to simultaneously find the best clusters of the data points and minimize the sum of loss functions within each cluster. This framework has many potential applications in healthcare, operations management, manufacturing, and so on. Because CluSL is, in general, nonconvex, we develop a regularized alternating minimization (RAM) algorithm to solve it, where at each iteration we penalize the distance between the current clustering solution and the one from the previous iteration. With a suitable penalty function, we show that each iteration of the RAM algorithm can be computed efficiently. We further prove that the proposed RAM algorithm always converges to a stationary point within a finite number of iterations. This is the first known convergence result in the cluster-aware learning literature. We also extend CluSL to high-dimensional data sets, yielding the F-CluSL framework. In F-CluSL, we cluster features and minimize the loss function at the same time. To solve F-CluSL, a variant of the RAM algorithm (i.e., F-RAM) is developed and proven to converge to an [Formula: see text]-stationary point. Our numerical studies demonstrate that the proposed CluSL and F-CluSL can outperform existing methods such as random forests and support vector classification, both in the interpretability of learning results and in prediction accuracy.

Summary of Contribution: Aligned with the mission and scope of the INFORMS Journal on Computing, this paper proposes a cluster-aware supervised learning (CluSL) framework, which integrates clustering analysis with supervised learning. Because CluSL is, in general, nonconvex, a regularized alternating projection algorithm is developed to solve it and is proven to always find a stationary solution. We further generalize the framework to high-dimensional data sets (F-CluSL). Our numerical studies demonstrate that the proposed CluSL and F-CluSL can deliver more interpretable learning results and outperform existing methods such as random forests and support vector classification in computational time and prediction accuracy.
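The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' RAM algorithm: the squared loss, the fixed per-point switching penalty, and every name below (`ram_sketch`, `penalty`, `max_iter`) are assumptions chosen only to show the general idea of alternating between per-cluster model fitting and penalized reassignment.

```python
import numpy as np

def ram_sketch(X, y, k, penalty=0.1, max_iter=50, seed=0):
    """Toy alternating minimization for cluster-wise least squares.

    Hypothetical illustration only: the squared loss, the switching penalty,
    and all names here are assumptions, not the paper's formulation.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])      # append an intercept column
    labels = rng.integers(0, k, size=n)       # random initial clustering
    coefs = np.zeros((k, d + 1))
    for _ in range(max_iter):
        # Step 1: with clusters fixed, fit one least-squares model per cluster.
        for c in range(k):
            mask = labels == c
            if mask.any():
                coefs[c] = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0]
        # Step 2: with models fixed, reassign each point to the cluster with
        # the lowest loss, charging `penalty` for leaving its previous cluster.
        losses = (Xb @ coefs.T - y[:, None]) ** 2          # shape (n, k)
        switch_cost = penalty * (np.arange(k)[None, :] != labels[:, None])
        new_labels = np.argmin(losses + switch_cost, axis=1)
        if np.array_equal(new_labels, labels):
            break                                          # assignments stable
        labels = new_labels
    return labels, coefs
```

Calling `ram_sketch(X, y, k=2)` on data drawn from two different linear regimes returns one label per point and one coefficient row per cluster. The switching penalty plays the role the abstract assigns to the distance term between consecutive clustering solutions: it discourages points from changing cluster unless the loss reduction exceeds the penalty.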

Computed Data-Geometry Based Supervised and Semi-supervised Learning in High Dimensional Data

2013 12th International Conference on Machine Learning and Applications ◽

10.1109/icmla.2013.56 ◽

2013 ◽

Author(s):

Elizabeth P. Chou ◽

Fushing Hsieh ◽

John Capitanio

Keyword(s):

Supervised Learning ◽

High Dimensional Data ◽

High Dimensional

ZeitZeiger: supervised learning for high-dimensional data from an oscillatory system

Nucleic Acids Research ◽

10.1093/nar/gkw030 ◽

2016 ◽

Vol 44 (8) ◽

pp. e80-e80 ◽

Author(s):

Jacob J. Hughey ◽

Trevor Hastie ◽

Atul J. Butte

Keyword(s):

Supervised Learning ◽

High Dimensional Data ◽

Oscillatory System ◽

High Dimensional

Random Multi-Graphs: A semi-supervised learning framework for classification of high dimensional data

Image and Vision Computing ◽

10.1016/j.imavis.2016.08.006 ◽

2017 ◽

Vol 60 ◽

pp. 30-37 ◽

Author(s):

Qin Zhang ◽

Jianyuan Sun ◽

Guoqiang Zhong ◽

Junyu Dong

Keyword(s):

Supervised Learning ◽

High Dimensional Data ◽

High Dimensional ◽

Learning Framework

Large Sample Covariance Matrices and High-Dimensional Data Analysis

10.1017/cbo9781107588080 ◽

2015 ◽

Author(s):

Jianfeng Yao ◽

Shurong Zheng ◽

Zhidong Bai

Keyword(s):

Data Analysis ◽

High Dimensional Data ◽

Covariance Matrices ◽

High Dimensional ◽

Large Sample ◽

Sample Covariance Matrices ◽

Sample Covariance ◽

High Dimensional Data Analysis

Fractal-Based Methods as a Technique for Estimating the Intrinsic Dimensionality of High-Dimensional Data: A Survey

Informatica ◽

10.15388/informatica.2016.84 ◽

2016 ◽

Vol 27 (2) ◽

pp. 257-281 ◽

Author(s):

Rasa Karbauskaitė ◽

Gintautas Dzemyda

Keyword(s):

High Dimensional Data ◽

High Dimensional ◽

Intrinsic Dimensionality

A Fast Clustering Algorithm for Large-scale and High Dimensional Data

ACTA AUTOMATICA SINICA ◽

10.3724/sp.j.1004.2009.00859 ◽

2009 ◽

Vol 35 (7) ◽

pp. 859-866 ◽

Author(s):

Ming LIU ◽

Xiao-Long WANG ◽

Yuan-Chao LIU

Keyword(s):

Large Scale ◽

Clustering Algorithm ◽

High Dimensional Data ◽

High Dimensional

Improved negative selection algorithm for network anomaly detection on high-dimensional data

Journal of Computer Applications ◽

10.3724/sp.j.1087.2009.00805 ◽

2009 ◽

Vol 29 (3) ◽

pp. 805-807 ◽

Author(s):

Wen-zhong GUO ◽

Guo-long CHEN ◽

Qing-liang CHEN

Keyword(s):

Anomaly Detection ◽

Negative Selection ◽

High Dimensional Data ◽

High Dimensional ◽

Selection Algorithm ◽

Negative Selection Algorithm ◽

Network Anomaly Detection

An Advanced Mining Services in Predicting and Ranking User Vitality across Dynamic and High Dimensional Data Sets

SSRN Electronic Journal ◽

10.2139/ssrn.3395242 ◽

2019 ◽

Author(s):

Ch. Durga Bhavani ◽

Dr. A. Daveedu Raju ◽

Dr. V. Surya Narayana

Keyword(s):

High Dimensional Data ◽

High Dimensional

Outlier Detection in High Dimensional Data Based on the Anti-Hub and Regression Technique

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2017.8219 ◽

2017 ◽

Vol V (VIII) ◽

pp. 1543-1551 ◽

Author(s):

Golla Hemalatha

Keyword(s):

Outlier Detection ◽

High Dimensional Data ◽

Regression Technique ◽

High Dimensional
