Information Mandala: Statistical Distance Matrix with Clustering

Author(s):  
Xin Lu

In machine learning, observation features are measured in a metric space to obtain their distance function for optimization. Given similar features that are statistically sufficient as a population, a statistical distance between two probability distributions can be calculated for more precise learning. Provided the observed features are multi-valued, the statistical distance function is still efficient. However, due to its scalar output, it cannot be applied to represent detailed distances between feature elements. To resolve this problem, this paper extends the traditional statistical distance to a matrix form, called a statistical distance matrix. The proposed approach performs well in object recognition tasks and clearly and intuitively represents the dissimilarities between cat and dog images in the CIFAR dataset, even when directly calculated using the image pixels. By using the hierarchical clustering of the statistical distance matrix, the image pixels can be separated into several clusters that are geometrically arranged around a center like a Mandala pattern. The statistical distance matrix with clustering is called the Information Mandala.

(This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.)
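The abstract does not give the construction of the statistical distance matrix, so the following is only a minimal sketch of the general idea under stated assumptions: each feature element (for example, a pixel) is modelled as a univariate Gaussian, the Bhattacharyya distance stands in for the statistical distance, and a naive average-linkage procedure stands in for the hierarchical clustering. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random
from statistics import fmean, pvariance


def element_stats(population):
    """Mean and variance of each feature element across a population."""
    # A tiny constant keeps the variance strictly positive.
    return [(fmean(col), pvariance(col) + 1e-9) for col in zip(*population)]


def bhattacharyya(p, q):
    """Bhattacharyya distance between two univariate Gaussians (mean, var)."""
    (m1, v1), (m2, v2) = p, q
    spread = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    shift = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    return spread + shift


def distance_matrix(pop_a, pop_b):
    """Entry [i][j] compares element i of pop_a with element j of pop_b."""
    stats_a, stats_b = element_stats(pop_a), element_stats(pop_b)
    return [[bhattacharyya(a, b) for b in stats_b] for a in stats_a]


def cluster_elements(matrix, n_clusters):
    """Naive average-linkage agglomerative clustering of element indices."""
    n = len(matrix)
    # Symmetrize, because the cross-population matrix need not be symmetric.
    sym = [[0.5 * (matrix[i][j] + matrix[j][i]) for j in range(n)]
           for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        return fmean(sym[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [(x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        x, y = min(pairs, key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]))
        clusters[x] += clusters.pop(y)  # y > x, so index x stays valid
    return sorted(sorted(c) for c in clusters)


def make_population(rng, means, size=200):
    """Hypothetical toy data: each sample is one flattened 'image'."""
    return [[rng.gauss(m, 1.0) for m in means] for _ in range(size)]


if __name__ == "__main__":
    rng = random.Random(0)
    cats = make_population(rng, [0.0, 0.0, 4.0, 4.0])
    dogs = make_population(rng, [0.5, 0.5, 4.5, 4.5])
    matrix = distance_matrix(cats, dogs)
    print(cluster_elements(matrix, 2))
```

Unlike a single scalar distance between the two populations, the matrix keeps one value per pair of elements, which is what makes a subsequent clustering of the elements possible.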

2021
Author(s):  
Xin Lu



2013
Vol 1
pp. 200-231
Author(s):  
Andrea C.G. Mennucci

In this paper we discuss asymmetric length structures and asymmetric metric spaces. A length structure induces a (semi)distance function; by using the total variation formula, a (semi)distance function induces a length. In the first part we identify a topology in the set of paths that best describes when the above operations are idempotent. As a typical application, we consider the length of paths defined by a Finslerian functional in Calculus of Variations. In the second part we generalize the setting of General metric spaces of Busemann, and discuss the newly found aspects of the theory: we identify three interesting classes of paths, and compare them; we note that a geodesic segment (as defined by Busemann) is not necessarily continuous in our setting; hence we present three different notions of intrinsic metric space.
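As a small illustration of how a distance function induces a length through the total variation formula, the sketch below sums the distances between consecutive samples of a path, always in the direction of travel. The asymmetric distance on the real line is a hypothetical example chosen for this sketch, not one from the paper. A sum over a finite partition is in general only a lower bound for the supremum in the formula; here it is exact because the sampled paths are monotone.

```python
def asym_distance(x, y):
    """Hypothetical asymmetric distance on the real line: moving down costs double."""
    return (y - x) if y >= x else 2.0 * (x - y)


def variation_length(points, distance=asym_distance):
    """Total-variation sum of a sampled path, taken in the direction of travel."""
    return sum(distance(a, b) for a, b in zip(points, points[1:]))


if __name__ == "__main__":
    up = [0.0, 0.5, 1.0]
    # The same set of points has a different length when traversed backwards.
    print(variation_length(up), variation_length(up[::-1]))
```

Because the distance is not symmetric, reversing a path changes its length, which is the basic feature that distinguishes the asymmetric setting from the usual one.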


APL Photonics
2020
Vol 5 (12)
pp. 126103
Author(s):  
B. Limbacher ◽  
S. Schoenhuber ◽  
M. Wenclawiak ◽  
M. A. Kainz ◽  
A. M. Andrews ◽  
...  

2021
Vol 11 (1)
Author(s):  
Igor Shuryak ◽  
Helen C. Turner ◽  
Monica Pujol-Canadell ◽  
Jay R. Perrier ◽  
Guy Garty ◽  
...  

We implemented machine learning in the radiation biodosimetry field to quantitatively reconstruct neutron doses in mixed neutron + photon exposures, which are expected in improvised nuclear device detonations. Such individualized reconstructions are crucial for triage and treatment because neutrons are more biologically damaging than photons. We used a high-throughput micronucleus assay with automated scanning/imaging on lymphocytes from human blood irradiated ex vivo with 44 different combinations of 0–4 Gy neutrons and 0–15 Gy photons (542 blood samples), which include reanalysis of past experiments. We developed several metrics that describe micronuclei/cell probability distributions in binucleated cells, and used them as predictors in random forest (RF) and XGBoost machine learning analyses to reconstruct the neutron dose in each sample. The probability of “overfitting” was minimized by training both algorithms with repeated cross-validation on a randomly selected subset of the data, and measuring performance on the rest. RF achieved the best performance. Mean R² for actual vs. reconstructed neutron doses over 300 random training/testing splits was 0.869 (range 0.761 to 0.919) and root mean squared error was 0.239 (0.195 to 0.351) Gy. These results demonstrate the promising potential of machine learning to reconstruct the neutron dose component in clinically relevant complex radiation exposure scenarios.
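The evaluation protocol described above (many random training/testing splits, with R² and root mean squared error summarized across splits) can be sketched as follows. This is not the authors' pipeline: a one-predictor least-squares line is swapped in for the random forest and XGBoost models so that the example needs only the standard library, and the data are synthetic values invented for illustration.

```python
import math
import random
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def r2_and_rmse(actual, predicted):
    """Coefficient of determination and root mean squared error."""
    mean_actual = fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(actual))


def repeated_splits(metric, dose, n_splits=300, train_fraction=0.7, seed=0):
    """Train on a random subset, score on the rest, repeat n_splits times."""
    rng = random.Random(seed)
    indices = list(range(len(dose)))
    cut = int(train_fraction * len(indices))
    r2s, rmses = [], []
    for _ in range(n_splits):
        rng.shuffle(indices)
        train, test = indices[:cut], indices[cut:]
        slope, intercept = fit_line([metric[i] for i in train],
                                    [dose[i] for i in train])
        predicted = [slope * metric[i] + intercept for i in test]
        r2, rmse = r2_and_rmse([dose[i] for i in test], predicted)
        r2s.append(r2)
        rmses.append(rmse)
    return {"r2_mean": fmean(r2s), "r2_range": (min(r2s), max(r2s)),
            "rmse_mean": fmean(rmses)}


def make_toy_data(n=120, seed=1):
    """Synthetic (predictor, dose) pairs; not real biodosimetry data."""
    rng = random.Random(seed)
    dose = [rng.uniform(0.0, 4.0) for _ in range(n)]
    metric = [0.5 * d + rng.gauss(0.0, 0.1) for d in dose]
    return metric, dose


if __name__ == "__main__":
    metric, dose = make_toy_data()
    print(repeated_splits(metric, dose))
```

Reporting the range across splits alongside the mean, as the abstract does, shows how much the score depends on which samples happen to land in the test set.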


Sensors
2018
Vol 18 (7)
pp. 2285
Author(s):  
Tomasz Rymarczyk ◽  
Grzegorz Kłosowski ◽  
Edward Kozłowski

This article presents the results of research on a new method for the spatial analysis of moisture in walls and buildings. Because destructive methods are unsuitable for historical buildings of great architectural significance, a non-destructive method based on electrical tomography was adopted. A hybrid tomograph with special sensors was developed for the measurements. The device acquires data that are then reconstructed by purpose-built methods, enabling spatial analysis of damp buildings. Special electrodes that ensure good contact with the surface of porous building materials such as bricks and cement were introduced. During the research, a group of supervised machine learning algorithms was analyzed. They were used to convert input electrical values into conductance, depicted by the output image pixels. The conductance values of the individual pixels of the output vector made it possible to obtain images of the interior of building walls both as flat cross-sections (2D) and as spatial (3D) images. The presented group of algorithms has high application value. The main advantages of the new methods are high imaging accuracy, low cost, high processing speed, and ease of application to walls of various thicknesses and with irregular surfaces. By comparing the results of tomographic reconstructions, the most efficient algorithms were identified.


2021
Author(s):  
Vidya Samadi ◽  
Rakshit Pally

Floods are among the most destructive natural hazards, affecting millions of people across the world and leading to severe loss of life and damage to property, critical infrastructure, and agriculture. The Internet of Things (IoT), machine learning (ML), and Big Data are exceptionally valuable tools for supporting disaster readiness and collecting large amounts of actionable data. The aim of this presentation is to introduce the Flood Analytics Information System (FAIS) as a data gathering and analytics system. The FAIS application is designed to integrate crowd intelligence, ML, and natural language processing of tweets to provide warnings, with the aim of improving flood situational awareness and risk assessment. FAIS has been beta tested during major hurricane events in the US, where successive storms caused extensive damage and disruption. The prototype successfully identifies a dynamic set of at-risk locations/communities using USGS river gauge height readings and geotagged tweets intersected with watershed boundaries. The list of prioritized locations can be updated as the river monitoring system and conditions change over time (typically every 15 minutes). The prototype also performs flood frequency analysis (FFA) using various probability distributions, with associated uncertainty estimation, to assist engineers in designing safe structures. This presentation will discuss the FAIS functionalities and the real-time implementation of the prototype across the south and southeast USA. This research is funded by the US National Science Foundation (NSF).
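The abstract does not say which probability distributions FAIS fits, so the following is only a minimal sketch of one common form of flood frequency analysis, assuming a Gumbel distribution fitted by the method of moments. The function names and the annual peak values are hypothetical, and the uncertainty estimation mentioned above is not covered.

```python
import math
from statistics import fmean, stdev

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(annual_peaks):
    """Method-of-moments estimates of the Gumbel location and scale."""
    scale = stdev(annual_peaks) * math.sqrt(6.0) / math.pi
    location = fmean(annual_peaks) - EULER_GAMMA * scale
    return location, scale


def return_level(location, scale, return_period_years):
    """Level exceeded on average once every `return_period_years` years."""
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))


if __name__ == "__main__":
    # Hypothetical annual peak gauge heights in metres, invented for illustration.
    peaks = [3.1, 4.2, 2.8, 5.0, 3.6, 4.4, 3.9, 6.1, 3.3, 4.8]
    loc, scale = fit_gumbel(peaks)
    for period in (10, 50, 100):
        print(period, round(return_level(loc, scale, period), 2))
```

A design engineer would read the result as the gauge height associated with, for example, a 100-year event; in practice several candidate distributions are usually compared before one is chosen.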


2020
Author(s):  
Xiao Lai ◽  
Pu Tian

Supervised machine learning, especially deep learning based on a wide variety of neural network architectures, has contributed tremendously to fields such as marketing, computer vision, and natural language processing. However, the development of unsupervised machine learning algorithms has been a bottleneck of artificial intelligence. Clustering is a fundamental unsupervised task in many different subjects. Unfortunately, no present algorithm is satisfactory for clustering high-dimensional data with strong nonlinear correlations. In this work, we propose a simple and highly efficient hierarchical clustering algorithm based on encoding by composition rank vectors and a tree structure, and demonstrate its utility by clustering protein structural domains. No record comparison, an expensive and essential step common to all present clustering algorithms, is involved. Consequently, the algorithm achieves hierarchical clustering with linear time and space complexity, and is thus applicable to arbitrarily large datasets. The key factor in this algorithm is the definition of composition, which depends on the physical nature of the target data and therefore needs to be constructed case by case. Nonetheless, the algorithm is general and applicable to any high-dimensional data with strong nonlinear correlations. We hope this algorithm will inspire a rich research field of encoding-based clustering well beyond composition rank vector trees.
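One possible reading of encoding-based clustering is sketched below; it is an interpretation of the abstract, not the authors' algorithm. It assumes that the composition of a record is simply its symbol counts (the abstract stresses that the real definition depends on the data), encodes each record as the ranking of its symbols by abundance, and inserts that ranking into a nested-dictionary tree. All names are hypothetical.

```python
from collections import Counter


def rank_vector(record, depth):
    """Symbols of `record` ordered from most to least abundant (ties by symbol)."""
    counts = Counter(record)
    ordered = sorted(counts, key=lambda symbol: (-counts[symbol], symbol))
    return tuple(ordered[:depth])


def build_tree(records, depth=3):
    """Insert each record once into a nested-dict tree keyed by its rank vector."""
    tree = {}
    for name, record in records.items():
        node = tree
        for symbol in rank_vector(record, depth):
            node = node.setdefault(symbol, {})
        # Records that share the whole rank vector end up in the same leaf.
        node.setdefault("_members", []).append(name)
    return tree


def members(node):
    """All record names stored at or below `node`."""
    found = list(node.get("_members", []))
    for key, child in node.items():
        if key != "_members":
            found.extend(members(child))
    return sorted(found)


if __name__ == "__main__":
    toy = {"a": "AAABBC", "b": "AAAABBC", "c": "CCCBBA", "d": "AAACCB"}
    tree = build_tree(toy)
    print(members(tree["A"]), members(tree["A"]["B"]))
```

Each record is encoded and inserted exactly once and is never compared with another record, so for a fixed depth the work grows linearly with the number of records; each level of the tree gives one level of the cluster hierarchy.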

