A new classification method of ancient Chinese ceramics based on machine learning and component analysis

Unsupervised learning divides data into several groups such that observations in the same group have similar characteristics and observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw a graph using two components as the axes. Through this procedure, we interpret the meaning of the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of each group.
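
The clustering step described in this abstract can be illustrated with a minimal sketch of partitioning around medoids. It is not the authors' implementation: the greedy swap loop, the naive initialization, and the toy `points` data are all assumptions made for illustration, and the projection onto two principal components is left out. What it shows is the property that separates the method from k-means: every cluster centre is an actual observation.

```python
import itertools
import math


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Greedy swap-based partitioning around medoids (illustrative, not optimized)."""
    medoids = list(range(k))  # naive initialization: the first k observations
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid observation.
        for i, cand in itertools.product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:i] + [cand] + medoids[i + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:  # accept only strict improvements
                medoids, best, improved = trial, cost, True
    # Assign every observation to the cluster of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Hypothetical toy data: two tight groups plus one mildly outlying observation.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.1),
    (12.0, 12.0),
]
medoids, labels = pam(points, 2)
print(medoids, labels)
```

Because the medoids are indices into `points`, each cluster can be summarized by a real observation rather than by an averaged point that may not correspond to any item in the data.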

Analysis of the Bath Motion in the MM-SQC Dynamics Using Unsupervised Machine Learning Dimensionality Reduction Approaches: Principal Component Analysis

10.26434/chemrxiv.13332530 ◽ 2020

Author(s): Jiawei Peng ◽ Yu Xie ◽ Deping Hu ◽ Zhenggang Lan

Keyword(s): Machine Learning ◽ Principal Component Analysis ◽ Collective Motion ◽ Principal Component ◽ Component Analysis ◽ Nonadiabatic Dynamics ◽ Trajectory Data ◽ Unsupervised Machine Learning ◽ Physical Knowledge ◽ Vibronic Couplings

The system-plus-bath model is an important tool for understanding nonadiabatic dynamics in large molecular systems. Understanding the collective motion of a huge number of bath modes is essential to reveal their key roles in the overall dynamics. We apply principal component analysis (PCA) to investigate the bath motion, based on the massive data generated from MM-SQC (the symmetrical quasi-classical dynamics method based on the Meyer-Miller mapping Hamiltonian) nonadiabatic dynamics simulations of excited-state energy transfer in a Frenkel-exciton model. The PCA shows clearly that two types of bath modes, those that display strong vibronic couplings and those whose frequencies are close to the electronic transition, are very important to the nonadiabatic dynamics. These observations are fully consistent with physical insight. This conclusion is obtained purely from the PCA of the trajectory data, without substantial involvement of pre-defined physical knowledge. The results show that PCA, one of the simplest unsupervised machine learning methods, is a powerful tool for analyzing complicated nonadiabatic dynamics in the condensed phase involving many degrees of freedom.
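
The idea of letting PCA pick out the dominant modes from trajectory data can be sketched on synthetic data. Everything below is an assumption made for illustration: the three-mode `rows` array is invented (it is not MM-SQC output), and power iteration stands in for a full eigendecomposition so that the sketch needs only the standard library. The loadings of the leading component indicate which mode carries most of the motion.

```python
import math
import random


def covariance(rows):
    """Sample covariance matrix of rows (one row per time step, one column per mode)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def leading_component(cov, iters=500):
    """Power iteration: returns the largest eigenvalue and its unit eigenvector."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
    )
    return eigenvalue, v


# Hypothetical trajectory: mode 1 oscillates strongly, modes 0 and 2 are small noise.
random.seed(0)
rows = [
    [
        0.1 * random.gauss(0, 1),
        2.0 * math.sin(0.3 * t) + 0.1 * random.gauss(0, 1),
        0.1 * random.gauss(0, 1),
    ]
    for t in range(200)
]

cov = covariance(rows)
eigenvalue, loadings = leading_component(cov)
explained = eigenvalue / sum(cov[j][j] for j in range(len(cov)))
dominant_mode = max(range(len(loadings)), key=lambda j: abs(loadings[j]))
print(dominant_mode, round(explained, 3))
```

In this toy setting the large-amplitude mode dominates the first component without any prior knowledge of which mode was driven, which is the kind of data-driven identification the abstract describes.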

An Effective Bridge Cracks Classification Method Based on Machine Learning

Proceedings of the 2020 4th International Conference on Electronic Information Technology and Computer Engineering ◽ 10.1145/3443467.3443855 ◽ 2020

Author(s): Xiaoyan Zhang ◽ Xiaodong Wang

Keyword(s): Machine Learning ◽ Classification Method

A machine learning-based underwater noise classification method

Applied Acoustics ◽ 10.1016/j.apacoust.2021.108333 ◽ 2021 ◽ Vol 184 ◽ pp. 108333

Author(s): Guoli Song ◽ Xinyi Guo ◽ Wenbo Wang ◽ Qunyan Ren ◽ Jun Li ◽ ...

Keyword(s): Machine Learning ◽ Classification Method ◽ Underwater Noise ◽ Noise Classification

Comparative Analysis of Machine Learning Techniques with Principal Component Analysis on Kidney and Heart Disease

10.1109/icesc51422.2021.9533011 ◽ 2021

Author(s): Reena Chandra ◽ Manoj Kapil ◽ Avinash Sharma

Keyword(s): Machine Learning ◽ Principal Component Analysis ◽ Heart Disease ◽ Comparative Analysis ◽ Principal Component ◽ Component Analysis ◽ Machine Learning Techniques ◽ Learning Techniques

Enhanced Backpropagation Approach for Identifying Genetic Disease

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.622.75 ◽ 2014 ◽ Vol 622 ◽ pp. 75-80

Author(s): Baskar Nisha ◽ B. Madasamy ◽ J. Jebamalar Tamilselvi

Keyword(s): Machine Learning ◽ Genetic Disease ◽ Classification Accuracy ◽ Gene Selection ◽ New Classification ◽ Backpropagation Algorithm ◽ High Classification Accuracy ◽ Disease Associations ◽ Disease Analysis

Classification of genetic disease data is a useful application of microarray analysis. Genetic disease data analysis has the potential to discover diseased genes that may be the signature of certain diseases. Machine learning methodologies and data mining techniques are used to predict genetic disease associations from bioinformatics data. Among the numerous existing methods for gene selection, the Backpropagation algorithm has become one of the leading methods, but it gives lower classification accuracy. This work aims to develop a new classification algorithm (the Enhanced Backpropagation Algorithm) for genetic disease analysis. Knowledge derived by the Enhanced Backpropagation Algorithm has high classification accuracy, with the ability to identify the most significant genes.

A New Classification Method for Human Gene Splice Site Prediction

Health Information Science - Lecture Notes in Computer Science ◽ 10.1007/978-3-642-29361-0_16 ◽ 2012 ◽ pp. 121-130 ◽ Cited By ~ 7

Author(s): Dan Wei ◽ Weiwei Zhuang ◽ Qingshan Jiang ◽ Yanjie Wei

Keyword(s): Splice Site ◽ Human Gene ◽ Classification Method ◽ New Classification ◽ Splice Site Prediction ◽ Site Prediction

Criteria for choosing the number of dimensions in a principal component analysis: An empirical assessment

10.5753/sbbd.2020.13632 ◽ 2020

Author(s): Renata Silva ◽ Daniel Oliveira ◽ Davi Pereira Santos ◽ Lucio F.D. Santos ◽ Rodrigo Erthal Wilson ◽ ...

Keyword(s): Machine Learning ◽ Principal Component Analysis ◽ Hypothesis Test ◽ Feature Learning ◽ Principal Component ◽ Component Analysis ◽ Scree Plot ◽ Open Issue ◽ Chained Tasks ◽ High Dimensional Datasets

Principal component analysis (PCA) is an efficient model for the optimization problem of finding $d'$ axes of a subspace $\mathbb{R}^{d'} \subseteq \mathbb{R}^{d}$ so that the mean squared distances from a given set $R$ of points to the axes are minimal. Steadily employed since 1901 in different scenarios, e.g., mechanics, PCA has become an important link in machine learning chained tasks, such as feature learning and AutoML designs. A frequent yet open issue that arises in supervised problems is how many PCA axes are required for the performance of machine learning constructs to be tuned. Accordingly, we investigate the behavior of six independent and uncoupled criteria for estimating the number of PCA axes, namely Scree-Plot %, Scree-Plot Gap, Kaiser-Guttman, Broken-Stick, p-Score, and 2D. In total, we evaluate the performance of those approaches on 20 high-dimensional datasets by using (i) four different classifiers, and (ii) a hypothesis test upon the reported F-Measures. Results indicate that the Broken-Stick and Scree-Plot % criteria consistently outperformed the competitors on supervised tasks, whereas the Kaiser-Guttman and Scree-Plot Gap estimators delivered poor performance in the same scenarios.
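
Three of the criteria named in this abstract can be sketched from a list of PCA eigenvalues sorted in descending order. The definitions below are common textbook forms and are assumptions here, since the abstract does not spell out the exact variants evaluated in the paper; the 90% cutoff for Scree-Plot % and the example `eigenvalues` are likewise hypothetical.

```python
def scree_percent(eigenvalues, threshold=0.9):
    """Smallest number of leading components whose cumulative variance share reaches threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(eigenvalues, start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)


def kaiser_guttman(eigenvalues):
    """Number of components whose eigenvalue exceeds the mean eigenvalue."""
    mean = sum(eigenvalues) / len(eigenvalues)
    return sum(1 for ev in eigenvalues if ev > mean)


def broken_stick(eigenvalues):
    """Keep leading components while their variance share exceeds the broken-stick expectation."""
    p = len(eigenvalues)
    total = sum(eigenvalues)
    kept = 0
    for k, ev in enumerate(eigenvalues, start=1):
        # Expected share of the k-th largest piece of a stick broken at random into p pieces.
        expected = sum(1.0 / i for i in range(k, p + 1)) / p
        if ev / total > expected:
            kept += 1
        else:
            break
    return kept


# Hypothetical eigenvalues, already sorted from largest to smallest.
eigenvalues = [4.0, 2.5, 0.8, 0.4, 0.3]
print(scree_percent(eigenvalues), kaiser_guttman(eigenvalues), broken_stick(eigenvalues))
```

For a correlation matrix the mean eigenvalue is 1, so the Kaiser-Guttman rule above reduces to the familiar "eigenvalue greater than one" form. Even on this small example the criteria disagree on how many axes to keep, which is the situation the empirical comparison addresses.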
