Classification of Observations through Combination of the Dimension Reduction and the Cluster Analysis

Author(s):  
Hyeuk Kim

Unsupervised learning in machine learning divides data into several groups. Observations in the same group have similar characteristics, and observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw a graph using two components as the axes. Through this procedure, we interpret the meaning of the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to figure out the characteristics of the groups.
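The procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the player statistics are invented, the function names (`pam`, `top_components`, and so on) are hypothetical, the medoid search uses a naive initialisation followed by the greedy swap phase of partitioning around medoids, and the principal components are found by power iteration so that only the Python standard library is needed.

```python
from itertools import product
from math import dist

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def pam(points, k):
    """Simplified PAM: naive start, then greedy swaps while the cost drops."""
    medoids = list(range(k))  # assumption: first k points as initial medoids
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i, cand in product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:i] + [cand] + medoids[i + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:
                medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]]))
              for p in points]
    return medoids, labels

def center(points):
    """Subtract the column means from every row."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    return [[p[j] - means[j] for j in range(d)] for p in points]

def covariance(X):
    """Sample covariance matrix of centered rows X."""
    n, d = len(X), len(X[0])
    return [[sum(r[a] * r[b] for r in X) / (n - 1) for b in range(d)]
            for a in range(d)]

def top_components(C, n_components=2, iters=500):
    """Leading eigenvectors of a symmetric matrix (power iteration + deflation)."""
    d = len(C)
    C = [row[:] for row in C]
    comps = []
    for _ in range(n_components):
        v = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start vector
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(d)) for a in range(d))
        comps.append(v)
        # Remove the component just found so the next pass finds the following one.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return comps

# Hypothetical players: (batting average, home runs, runs batted in).
players = [
    (0.31, 25, 80), (0.30, 28, 85), (0.32, 30, 90),
    (0.25, 3, 30), (0.24, 5, 28), (0.26, 4, 35),
]

medoids, labels = pam(players, k=2)
X = center(players)
components = top_components(covariance(X))
scores = [[sum(r[j] * c[j] for j in range(len(r))) for c in components] for r in X]

for label, (pc1, pc2) in zip(labels, scores):
    print(f"cluster {label}: PC1={pc1:.2f}, PC2={pc2:.2f}")
```

Plotting the two score columns against each other and colouring the points by cluster label gives the kind of graph the abstract describes. With real data, the variables would usually be standardized first so that no single statistic dominates the distances.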

2021 ◽  
Author(s):  
Adriana Medeiros Pinheiro ◽  
George Tassiano Melo Pereira ◽  
Caio Carvalho Moreira ◽  
Claudomiro de Souza Sales Junior

Ransomware is a subset of malware that is growing as a serious cyber threat. This malicious software prevents or limits users from accessing their system until the ransom is paid. Machine Learning (ML) algorithms have been widely used in the automatic classification of these attacks. In this paper, we apply the Principal Component Analysis (PCA) technique as feature extraction to reduce the dimensionality of the dataset, then we explore 11 ML algorithms in order to find the best classifier for ransomware detection. Five comparison methods used in the literature were discussed. The Naive Bayes method achieved an accuracy of 100% in one of the methods.
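The classification step can be illustrated with a small Gaussian Naive Bayes sketch. It is not the authors' pipeline: the two-dimensional feature vectors stand in for samples that have already been reduced by PCA, the values and class labels are synthetic, and the function names are hypothetical.

```python
from math import log, pi
from statistics import mean, pvariance

def fit_gnb(X, y, eps=1e-9):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        model[c] = (
            len(rows) / len(X),                      # class prior
            [mean(col) for col in cols],             # feature means
            [pvariance(col) + eps for col in cols],  # variances (eps avoids zero)
        )
    return model

def predict_gnb(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(c):
        prior, mus, variances = model[c]
        log_likelihood = sum(
            -0.5 * log(2 * pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, mus, variances)
        )
        return log(prior) + log_likelihood
    return max(model, key=log_posterior)

def accuracy(model, X, y):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(predict_gnb(model, x) == t for x, t in zip(X, y)) / len(y)

# Synthetic stand-ins for PCA scores; not taken from any real dataset.
train_X = [(-2.1, 0.3), (-1.8, -0.2), (-2.4, 0.1),
           (1.9, 0.4), (2.2, -0.3), (2.0, 0.0)]
train_y = ["benign"] * 3 + ["ransomware"] * 3
test_X = [(-2.0, 0.0), (2.1, 0.1)]
test_y = ["benign", "ransomware"]

model = fit_gnb(train_X, train_y)
print("held-out accuracy:", accuracy(model, test_X, test_y))
```

The toy classes are trivially separable, so the score printed here says nothing about performance on real ransomware data; the sketch only shows how accuracy on a held-out split is computed.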


Polymers ◽  
2021 ◽  
Vol 13 (24) ◽  
pp. 4328
Author(s):  
Muhammad Saleem ◽  
Ali Rizwan

This article attempts to introduce a simple and robust way of classifying soft magnetic materials by using multivariate statistics. Six magnetic properties, namely coercive magnetic field, relative magnetic permeability, electrical resistivity, magnetic inductions (i.e., remanence and saturation), and Curie temperature, are used for the classification of 16 soft magnetic materials. Descriptive statistics have been used for defining the prioritization order of the mentioned magnetic characteristics, with coercive magnetic field and Curie temperature as the most and least important characteristics, respectively, for the classification of soft magnetic materials. Moreover, descriptive statistics have also justified the usage of cluster analysis and principal component analysis for classifying the enlisted materials. After descriptive statistics, cluster analysis is used to classify the materials into four groups, i.e., excellent, good, fair and poor, while defining the prioritization order of materials on a relative scale. Principal component analysis reveals that the relative permeability is responsible for defining 99.69% of the total variance and is also negatively correlated with the coercive magnetic field. Therefore, these two characteristics are considered the factors responsible for placing the enlisted materials into four clusters. Furthermore, principal component analysis also shows that the combined influence of relative permeability, coercive magnetic field, electrical resistivity and critical temperature is responsible for defining the prioritization ordering of materials within the clusters. The material suitability index is identified by making use of adjacency and decision matrices obtained from the material assessment graph and the relative importance of magnetic properties, respectively. This material suitability index is then used to rank the enlisted materials based on selected attributes. According to the suitability index, the best choice among the enlisted soft magnetic materials is Supermalloy, Magnifer 7904, which is present in group 1, labeled as excellent by multivariate analysis. Therefore, the results of graph theory are in accordance with cluster analysis and principal component analysis, thus confirming the potential of this intelligent approach for the selection of application-specific magnetic materials.
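For reference, and assuming the 99.69% figure is the usual explained-variance ratio of a principal component (the abstract does not define it), that ratio is computed from the eigenvalues $\lambda_1 \ge \dots \ge \lambda_p$ of the covariance or correlation matrix of the $p$ measured properties:

```latex
\[
  \mathrm{EVR}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j},
  \qquad k = 1, \dots, p .
\]
```

A ratio this close to 1 for a single component often arises when one variable has a much larger numerical range than the others and the data are not standardized; the abstract does not say whether that is the case here.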


2020 ◽  
Author(s):  
Jiawei Peng ◽  
Yu Xie ◽  
Deping Hu ◽  
Zhenggang Lan

The system-plus-bath model is an important tool for understanding the nonadiabatic dynamics of large molecular systems. Understanding the collective motion of a huge number of bath modes is essential to reveal their key roles in the overall dynamics. We apply principal component analysis (PCA) to investigate the bath motion, based on the massive data generated from MM-SQC (the symmetrical quasi-classical dynamics method based on the Meyer-Miller mapping Hamiltonian) nonadiabatic dynamics simulations of excited-state energy transfer in the Frenkel-exciton model. The PCA method clearly shows that two types of bath modes, those that display strong vibronic couplings and those with frequencies close to the electronic transition, are very important to the nonadiabatic dynamics. These observations are fully consistent with physical insight. This conclusion is obtained purely from the PCA analysis of the trajectory data, without much reliance on predefined physical knowledge. The results show that the PCA approach, one of the simplest unsupervised machine learning methods, is a powerful tool for analyzing complicated nonadiabatic dynamics in the condensed phase involving many degrees of freedom.

