Information Mandala: Statistical Distance Matrix with Clustering

10.36227/techrxiv.14271545.v1 ◽

2021 ◽

Author(s):

Xin Lu

Keyword(s):

Machine Learning ◽

Object Recognition ◽

Hierarchical Clustering ◽

Distance Function ◽

Metric Space ◽

Probability Distributions ◽

Distance Matrix ◽

Scalar Output ◽

Statistical Distance ◽

Image Pixels

In machine learning, observation features are measured in a metric space to obtain their distance function for optimization. Given similar features that are statistically sufficient as a population, a statistical distance between two probability distributions can be calculated for more precise learning. Provided the observed features are multi-valued, the statistical distance function is still efficient. However, due to its scalar output, it cannot be applied to represent detailed distances between feature elements. To resolve this problem, this paper extends the traditional statistical distance to a matrix form, called a statistical distance matrix. The proposed approach performs well in object recognition tasks and clearly and intuitively represents the dissimilarities between cat and dog images in the CIFAR dataset, even when directly calculated using the image pixels. By using the hierarchical clustering of the statistical distance matrix, the image pixels can be separated into several clusters that are geometrically arranged around a center like a Mandala pattern. The statistical distance matrix with clustering is called the Information Mandala. (This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible)

On the Generalized Distance Energy of Graphs

Mathematics ◽

10.3390/math8010017 ◽

2019 ◽

Vol 8 (1) ◽

pp. 17 ◽

Cited By ~ 4

Author(s):

Abdollah Alhevaz ◽

Maryam Baghipur ◽

Hilal A. Ganie ◽

Yilun Shang

Keyword(s):

Lower Bounds ◽

Complete Graph ◽

Connected Graph ◽

Distance Matrix ◽

Wiener Index ◽

Diagonal Matrix ◽

Upper And Lower Bounds ◽

Star Graph ◽

Extremal Graphs ◽

Generalized Distance

The generalized distance matrix $D_\alpha(G)$ of a connected graph $G$ is defined as $D_\alpha(G) = \alpha\,Tr(G) + (1-\alpha)\,D(G)$, where $0 \le \alpha \le 1$, $D(G)$ is the distance matrix and $Tr(G)$ is the diagonal matrix of the node transmissions. In this paper, we extend the concept of energy to the generalized distance matrix and define the generalized distance energy $E^{D_\alpha}(G)$. Some new upper and lower bounds for the generalized distance energy $E^{D_\alpha}(G)$ of $G$ are established based on parameters including the Wiener index $W(G)$ and the transmission degrees. Extremal graphs attaining these bounds are identified. It is found that the complete graph has the minimum generalized distance energy among all connected graphs, while the minimum is attained by the star graph among trees of order $n$.
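The definition above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the adjacency-list input format, and the path graph used as an example are all assumptions made for illustration. The sketch builds the distance matrix by breadth-first search, takes each vertex's transmission as its row sum, and combines the two with the weight alpha.

```python
from collections import deque


def distance_matrix(adj):
    """All-pairs shortest-path distances of a connected, unweighted graph.

    `adj` is an adjacency list (hypothetical input format): adj[u] lists
    the neighbours of vertex u, with vertices numbered 0..n-1.
    """
    n = len(adj)
    dist = [[0] * n for _ in range(n)]
    for source in range(n):
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        if len(seen) != n:
            raise ValueError("graph must be connected")
        for v, d in seen.items():
            dist[source][v] = d
    return dist


def wiener_index(adj):
    """Sum of distances over all unordered vertex pairs."""
    return sum(sum(row) for row in distance_matrix(adj)) // 2


def generalized_distance_matrix(adj, alpha):
    """D_alpha(G) = alpha * Tr(G) + (1 - alpha) * D(G) for 0 <= alpha <= 1."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie in [0, 1]")
    D = distance_matrix(adj)
    transmissions = [sum(row) for row in D]  # diagonal of Tr(G)
    n = len(adj)
    return [
        [alpha * transmissions[i] if i == j else (1 - alpha) * D[i][j]
         for j in range(n)]
        for i in range(n)
    ]


# Example: the path on three vertices, 0 - 1 - 2 (illustrative input only).
path3 = [[1], [0, 2], [1]]
D_half = generalized_distance_matrix(path3, 0.5)
```

Since $D(G)$ has a zero diagonal, the trace of $D_\alpha(G)$ equals $\alpha$ times the sum of the transmissions, that is $2\alpha W(G)$, which gives a quick sanity check on the output.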

On the Second-Largest Reciprocal Distance Signless Laplacian Eigenvalue

Mathematics ◽

10.3390/math9050512 ◽

2021 ◽

Vol 9 (5) ◽

pp. 512

Author(s):

Maryam Baghipur ◽

Modjtaba Ghorbani ◽

Hilal A. Ganie ◽

Yilun Shang

Keyword(s):

Lower Bounds ◽

Complete Graph ◽

Distance Matrix ◽

Upper And Lower Bounds ◽

Laplacian Eigenvalue ◽

Signless Laplacian ◽

Graph Parameters ◽

Simple Connected Graph ◽

Second Largest Eigenvalue ◽

Reciprocal Distance Matrix

The signless Laplacian reciprocal distance matrix for a simple connected graph $G$ is defined as $RQ(G) = \mathrm{diag}(RH(G)) + RD(G)$. Here, $RD(G)$ is the Harary matrix (also called the reciprocal distance matrix), while $\mathrm{diag}(RH(G))$ is the diagonal matrix of the total reciprocal distances of the vertices. In the present work, some upper and lower bounds for the second-largest eigenvalue of the signless Laplacian reciprocal distance matrix of graphs are investigated in terms of various graph parameters. In addition, all graphs attaining these new bounds are characterized. It is also shown that, among all connected graphs with $n$ vertices, the complete graph $K_n$ and the graph $K_n - e$ obtained from $K_n$ by deleting an edge $e$ have the maximum second-largest signless Laplacian reciprocal distance eigenvalue.
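As a worked illustration of the definition (not reproduced from the paper), consider the complete graph $K_n$. Here $I$ denotes the identity matrix and $J$ the all-ones matrix, both of order $n$; this notation is introduced only for the example. Every pair of distinct vertices is at distance 1, so each reciprocal distance is also 1:

```latex
\begin{align*}
RD(K_n) &= J - I, \qquad RH(v) = n - 1 \ \text{for every vertex } v,\\
RQ(K_n) &= (n-1)I + (J - I) = (n-2)I + J,\\
\operatorname{spec}\bigl(RQ(K_n)\bigr) &= \{\, 2n-2,\ \underbrace{n-2,\ \dots,\ n-2}_{n-1 \text{ times}} \,\}.
\end{align*}
```

The spectrum follows because $J$ has eigenvalue $n$ once and eigenvalue 0 with multiplicity $n-1$. The second-largest eigenvalue of $RQ(K_n)$ is therefore $n-2$ for $n \ge 2$.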

The normalized distance Laplacian

Special Matrices ◽

10.1515/spma-2020-0114 ◽

2021 ◽

Vol 9 (1) ◽

pp. 1-18

Author(s):

Carolyn Reinhart

Keyword(s):

Spectral Radius ◽

Characteristic Polynomial ◽

Adjacency Matrix ◽

Connected Graph ◽

Distance Matrix ◽

Laplacian Matrix ◽

Diagonal Matrix ◽

The Matrix

The distance matrix $\mathcal{D}(G)$ of a connected graph $G$ is the matrix containing the pairwise distances between vertices. The transmission of a vertex $v_i$ in $G$ is the sum of the distances from $v_i$ to all other vertices and $T(G)$ is the diagonal matrix of transmissions of the vertices of the graph. The normalized distance Laplacian, $\mathcal{D}^{\mathcal{L}}(G) = I - T(G)^{-1/2}\,\mathcal{D}(G)\,T(G)^{-1/2}$, is introduced. This is analogous to the normalized Laplacian matrix, $\mathcal{L}(G) = I - D(G)^{-1/2} A(G) D(G)^{-1/2}$, where $D(G)$ is the diagonal matrix of degrees of the vertices of the graph and $A(G)$ is the adjacency matrix. Bounds on the spectral radius of $\mathcal{D}^{\mathcal{L}}$ and connections with the normalized Laplacian matrix are presented. Twin vertices are used to determine eigenvalues of the normalized distance Laplacian. The distance generalized characteristic polynomial is defined and its properties established. Finally, $\mathcal{D}^{\mathcal{L}}$-cospectrality and lack thereof are determined for all graphs on 10 and fewer vertices, providing evidence that the normalized distance Laplacian has fewer cospectral pairs than other matrices.
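The construction of the normalized distance Laplacian can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's code: the function names and the edge-list input are hypothetical, the graph is assumed connected with at least two vertices (so every transmission is positive), and Floyd-Warshall is used for the distances only because it is short to write.

```python
import math


def distance_matrix(n, edges):
    """Pairwise distances of an unweighted graph via Floyd-Warshall.

    `edges` is a list of (u, v) pairs on vertices 0..n-1 (hypothetical format).
    """
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    if any(math.isinf(x) for row in d for x in row):
        raise ValueError("graph must be connected")
    return d


def normalized_distance_laplacian(n, edges):
    """Entries of I - T^{-1/2} D T^{-1/2}, where T holds the transmissions."""
    d = distance_matrix(n, edges)
    t = [sum(row) for row in d]  # transmission of each vertex
    return [
        [(1.0 if i == j else 0.0) - d[i][j] / math.sqrt(t[i] * t[j])
         for j in range(n)]
        for i in range(n)
    ]


# Example: the path 0 - 1 - 2, with transmissions 3, 2, 3 (illustrative only).
L_path = normalized_distance_laplacian(3, [(0, 1), (1, 2)])
```

Two properties follow directly from the formula and can serve as checks: every diagonal entry is 1, and the vector of square roots of the transmissions is mapped to zero, so 0 is always an eigenvalue.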

Statistical Distance Metric Learning for Image Set Retrieval

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9413393 ◽

2021 ◽

Author(s):

Ting-Yao Hu ◽

Alexander G. Hauptmann

Keyword(s):

Metric Learning ◽

Distance Metric Learning ◽

Distance Metric ◽

Image Set ◽

Statistical Distance

The Diversity, Composition, and Metabolic Pathways of Archaea in Pigs

Animals ◽

10.3390/ani11072139 ◽

2021 ◽

Vol 11 (7) ◽

pp. 2139

Author(s):

Feilong Deng ◽

Yushan Li ◽

Yunjuan Peng ◽

Xiaoyuan Wei ◽

Xiaofan Wang ◽

...

Keyword(s):

Relative Abundance ◽

Antibiotic Resistance Genes ◽

Alpha Diversity ◽

Distance Matrix ◽

Archaeal Community ◽

Gene Families ◽

Shannon Index ◽

Metabolic Potential ◽

Substantial Progress ◽

Archaeal Communities

Archaea are an essential class of gut microorganisms in humans and animals. Despite the substantial progress in gut microbiome research in the last decade, most studies have focused on bacteria, and little is known about archaea in mammals. In this study, we investigated the composition, diversity, and functional potential of gut archaeal communities in pigs by re-analyzing a published metagenomic dataset including a total of 276 fecal samples from three countries: China (n = 76), Denmark (n = 100), and France (n = 100). For alpha diversity (Shannon Index) of the archaeal communities, Chinese pigs were less diverse than Danish and French pigs (p < 0.001). Consistently, Chinese pigs also possessed different archaeal community structures from the other two groups based on the Bray–Curtis distance matrix. Methanobrevibacter was the most dominant archaeal genus in Chinese pigs (44.94%) and French pigs (15.41%), while Candidatus Methanomethylophilus was the most predominant in Danish pigs (15.71%). At the species level, the relative abundances of Candidatus Methanomethylophilus alvus, Natrialbaceae archaeon XQ INN 246, and Methanobrevibacter gottschalkii were greatest in Danish, French, and Chinese pigs, at 14.32, 11.67, and 16.28%, respectively. In terms of metabolic potential, the top three pathways in the archaeal communities included the MetaCyc pathway related to the biosynthesis of L-valine, L-isoleucine, and isobutanol. Interestingly, the pathway related to hydrogen consumption (METHANOGENESIS-PWY) was only observed in archaeal reads, while the pathways participating in hydrogen production (FERMENTATION-PWY and PWY4LZ-257) were only detected in bacterial reads. Archaeal communities also possessed CAZyme gene families, with the top five being AA3, GH43, GT2, AA6, and CE9. In terms of antibiotic resistance genes (ARGs), the class of multidrug resistance was the most abundant ARG, accounting for 87.41% of archaeal ARG hits. Our study reveals the diverse composition and metabolic functions of archaea in pigs, suggesting that archaea might play important roles in swine nutrition and metabolism.
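The two diversity measures named in the abstract, the Shannon index and the Bray-Curtis distance matrix, have standard textbook definitions that can be sketched as follows. This is not the study's pipeline: the function names are hypothetical, the abundance counts are made-up illustrative numbers rather than data from the paper, and the natural logarithm is an assumption because the abstract does not state a base.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero abundance."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one count")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same taxa")
    denom = sum(a) + sum(b)
    if denom == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


def bray_curtis_matrix(samples):
    """Pairwise Bray-Curtis dissimilarities between all samples."""
    return [[bray_curtis(a, b) for b in samples] for a in samples]


# Made-up abundance counts for three samples over four taxa (not study data).
samples = [
    [40, 10, 5, 0],
    [12, 15, 14, 9],
    [10, 16, 15, 8],
]
alpha_diversity = [shannon_index(s) for s in samples]
beta_diversity = bray_curtis_matrix(samples)
```

In this sketch a sample dominated by one taxon has a lower Shannon index than an evenly spread one, and the resulting matrix is symmetric with a zero diagonal, which is the form typically passed on to ordination or clustering.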

Precise clustering analysis of Internet financial credit reporting dependent on multidimensional attribute sparse large data

International Journal of Electrical Engineering Education ◽

10.1177/00207209211002086 ◽

2021 ◽

pp. 002072092110020

Author(s):

Lingling Chen ◽

Yuanyuan Zhang ◽

Min Zeng

Keyword(s):

Clustering Analysis ◽

Large Data ◽

Distance Matrix ◽

The Internet ◽

Relationship Matrix ◽

Clustering Methods ◽

Correlation Relationship ◽

Credit Reporting ◽

Financial Credit ◽

Approximate Distance

Because traditional methods cannot cluster Internet financial credit reporting data directly and effectively, a precise clustering analysis method for Internet financial credit reporting, based on sparse large data with multidimensional attributes, is proposed. The overall distance between Internet financial credit reports is measured through the sparse large data with multidimensional attributes, and clustering analysis is performed on the overall distance matrix and on the component approximate distance matrix between the data, respectively. The correlation relationships between the Internet financial credit reports under these two perspectives are then considered together. Multidimensional attribute sparse large data pairs are used to reflect the comprehensive relationship matrix of the original Internet financial credit reporting to achieve clustering with relatively high quality. Numerical experiments show that, compared with traditional clustering methods, the proposed method not only reflects the overall data features effectively, but also improves the clustering of the original Internet financial credit reporting data through analysis of the correlation relationships between the important component attribute sequences.

Chatbot to improve learning punctuation in Spanish and to enhance open and flexible learning environments

International Journal of Educational Technology in Higher Education ◽

10.1186/s41239-021-00269-8 ◽

2021 ◽

Vol 18 (1) ◽

Author(s):

Esteban Vázquez-Cano ◽

Santiago Mengual-Andrés ◽

Eloy López-Meneses

Keyword(s):

Learning Process ◽

Latent Dirichlet Allocation ◽

Distance Matrix ◽

Ease Of Use ◽

Three Dimensions ◽

Pairwise Distance ◽

Control Group ◽

Quantitative Methodology ◽

Quasi Experimental ◽

Experimental Group

The objective of this article is to analyze the didactic functionality of a chatbot to improve the results of the students of the National University of Distance Education (UNED / Spain) in accessing the university in the subject of Spanish Language. For this, a quasi-experimental study was designed, and a quantitative methodology was used through pretest and posttest in a control group and an experimental group, in which the effectiveness of two teaching models was compared: a more traditional one based on exercises written on paper and another based on interaction with a chatbot. Subsequently, the perception of the experimental group in an academic forum about the educational use of the chatbot was analyzed through text mining with tests of Latent Dirichlet Allocation (LDA), pairwise distance matrix and bigrams. The quantitative results showed that the students in the experimental group substantially improved their results compared to the students taught with the more traditional methodology (experimental group mean: 32.1346; control group mean: 28.4706). Punctuation correctness improved mainly in the usage of commas, colons and periods in different syntactic patterns. Furthermore, the perception of the students in the experimental group showed that they positively value chatbots in their teaching–learning process in three dimensions: greater “support” and companionship in the learning process, as they perceive greater interactivity due to the chatbot's conversational nature; greater “feedback” and interaction compared to the more traditional methodology; and, lastly, the ease of use and the possibility of interacting and learning anywhere and anytime, which they especially value.

Structural Health Monitoring by Time Series Analysis and Statistical Distance Measures

10.1007/978-3-030-66259-2 ◽

2021 ◽

Author(s):

Alireza Entezami

Keyword(s):

Time Series ◽

Structural Health Monitoring ◽

Time Series Analysis ◽

Health Monitoring ◽

Distance Measures ◽

Structural Health ◽

Series Analysis ◽

Statistical Distance
