measure of similarity Latest Research Papers

A non-uniform (skewed) mixture of probability density functions occurs in various disciplines. One needs a measure of similarity to the respective constituents and its bounds. We introduce a skewed Jensen–Fisher divergence based on relative Fisher information, and provide some bounds in terms of the skewed Jensen–Shannon divergence and of the variational distance. The defined measure coincides with the definition from the skewed Jensen–Shannon divergence via the de Bruijn identity. Our results follow from applying the logarithmic Sobolev inequality and Poincaré inequality.
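The skewed Jensen–Shannon divergence and the variational distance mentioned in the abstract can be illustrated for discrete distributions. This is a minimal sketch and not the paper's construction. It assumes the common weighted form with mixture m = αp + (1 − α)q, and takes the variational distance as half the L1 distance; the paper's conventions may differ. The function names and the example distributions are hypothetical.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def skewed_js(p, q, alpha=0.5):
    """Weighted (skewed) Jensen-Shannon divergence with weight alpha on p.

    Assumed form: alpha * KL(p || m) + (1 - alpha) * KL(q || m),
    where m = alpha * p + (1 - alpha) * q and 0 < alpha < 1.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    m = [alpha * pi + (1 - alpha) * qi for pi, qi in zip(p, q)]
    return alpha * kl(p, m) + (1 - alpha) * kl(q, m)

def mixture_entropy_bound(alpha):
    """Entropy of the weights (alpha, 1 - alpha): an upper bound on skewed_js."""
    return -alpha * math.log(alpha) - (1 - alpha) * math.log(1 - alpha)

def total_variation(p, q):
    """Variational distance, taken here as half the L1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical example distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

js_sym = skewed_js(p, q, 0.5)   # symmetric case
js_skew = skewed_js(p, q, 0.9)  # mixture dominated by p
tv = total_variation(p, q)
```

With α = 0.5 the measure reduces to the ordinary Jensen–Shannon divergence, which is symmetric in p and q; for other values of α the order of the arguments matters.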

First-mover Advantage Explains Gender Disparities in Physics Citations

10.21203/rs.3.rs-957696/v1 ◽

2021 ◽

Author(s):

Hyunsik Kong ◽

Samuel Martin-Gutierrez ◽

Fariba Karimi

Keyword(s):

Citation Network ◽

Gender Disparity ◽

Gender Disparities ◽

Men And Women ◽

Science Technology ◽

Gender Biases ◽

First Mover Advantage ◽

Measure Of Similarity ◽

And Mathematics ◽

First Mover

Mounting evidence suggests that publications and citations of scholars in the STEM fields (Science, Technology, Engineering and Mathematics) suffer from gender biases. In this paper, we study the physics community, a core STEM field in which women are still largely under-represented and where these gender disparities persist. To reveal such inequalities, we compare the citations received by papers led by men and women that cover the same topics in a comparable way. To do that, we devise a robust statistical measure of similarity between publications that enables us to detect pairs of similar papers. Our findings indicate that although papers written by women tend to have lower visibility in the citation network, pairs of similar papers written by men and women receive comparable attention when corrected for the time of publication. These analyses suggest that gender disparity is closely related to the first-mover and cumulative advantage that men have in physics and is not an intentional act of discrimination towards women.

A Network View of Portfolio Optimization Using Fundamental Information

Frontiers in Physics ◽

10.3389/fphy.2021.721007 ◽

2021 ◽

Vol 9 ◽

Author(s):

Xiangzhen Yan ◽

Hanchao Yang ◽

Zhongyuan Yu ◽

Shuguang Zhang

Keyword(s):

Portfolio Optimization ◽

Asset Allocation ◽

Defensive Strategy ◽

Novel Approach ◽

Fundamental Information ◽

Out Of Sample ◽

Efficient Portfolios ◽

Robust Network ◽

Measure Of Similarity ◽

Mean Variance

This article proposes a novel approach to portfolio optimization, referred to as “Fundamental Networks” (FN). FN is an effective and robust network-based method that incorporates fundamental information, and it can serve as an alternative to classical mean-variance framework models. As a proxy for a portfolio, a fundamental network is defined as a set of “interconnected” stocks, among which linkages are a measure of similarity of fundamental information and are used directly for asset allocation. Two empirical models are provided in this paper as applications of Fundamental Networks. We find that Fundamental Networks efficient portfolios are in general more mean-variance efficient in out-of-sample performance than Markowitz’s efficient portfolios. Specifically, portfolios set for profitability goals create excess return in a general/upward trending market; portfolios targeted for operating fitness perform better in a downward trending market, and can be considered as a defensive strategy in the event of a crisis.

Meteorological sub-divisions of India : Assessment of coherence, homogeneity and recommended redelineation

MAUSAM ◽

10.54302/mausam.v71i4.41 ◽

2021 ◽

Vol 71 (4) ◽

pp. 585-604

Author(s):

KULKARNI ASHWINI ◽

GUHATHAKURTA PULAK ◽

PATWARDHAN SAVITA ◽

GADGIL SULOCHANA

Keyword(s):

Western Ghats ◽

Tamil Nadu ◽

Rainfall Anomaly ◽

Seasonal Rainfall ◽

Western Coast ◽

Clustering Method ◽

Station Network ◽

Average Rainfall ◽

Measure Of Similarity ◽

Different Time Scales

The data on mean rainfall and mean rainfall anomaly of the meteorological sub-divisions of India, on different time-scales, are extensively used for monitoring the progress of the monsoon as well as applications and research. As such, it is important to ensure that the sub-divisional means are meaningful representations of the rainfall and the rainfall anomaly at districts/stations within the sub-division. Hence, the criteria to be satisfied for an appropriate delineation of a meteorological sub-division are high levels of coherence and homogeneity. In this paper we present an assessment of the coherence and homogeneity of the current meteorological sub-divisions, for rainfall on the seasonal scale, by analysis of monthly district average rainfall for the period 1901-2015 during the summer monsoon for all the states, except Tamil Nadu for which June-December data are considered. Since earlier studies have shown that some of the sub-divisions of Karnataka and Maharashtra are neither coherent nor homogeneous, the problem of redelineation of the sub-divisions of these states is first addressed. We have assumed that the number of coherent zones in a state is the same as the number of current sub-divisions. Identification of coherent zones is achieved by successive application of the K-means (KM) clustering method to the seasonal rainfall of the districts, considering correlation of seasonal rainfall between districts as a measure of similarity. For these two states we find that some of the districts are not coherent and homogeneous. So we have repeated the exercise with analysis of a dense station network. The coherent zones identified from analysis of district data as well as station data are found to be homogeneous as well, and we have recommended that they become the new sub-divisions of the states.
The new sub-divisions suggested for Karnataka, which are coherent and homogeneous, are: (i) Karnataka Western coast and Ghats (which includes districts/stations in the current sub-division of Coastal Karnataka as well as some from the sub-divisions of interior Karnataka) (ii) Karnataka northern plateau and (iii) Karnataka southern plateau. Of the current sub-divisions of Maharashtra, Marathwada and Vidarbha satisfy the criteria of coherence and homogeneity and can be retained as such. The current Madhya Maharashtra sub-division does not satisfy the criteria of coherence and homogeneity. We have derived a modified version of Madhya Maharashtra by allocation of some districts/stations of Western Ghats from the existing sub-division of Madhya Maharashtra to the existing sub-division of Konkan and Goa to generate a modified version of Konkan and Goa. These modified versions are coherent and homogeneous. Thus the suggested sub-divisions of Maharashtra are (i) modified version of Konkan and Goa (which could have been renamed as Konkan, Ghats and Goa but we have retained the old name) and (ii) modified version of Madhya Maharashtra, along with the current sub-divisions of (iii) Marathwada and (iv) Vidarbha. We have shown that the sub-divisions of all the other states of mainland India, are homogeneous and reasonably coherent and recommend that they should be retained as such.
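The abstract uses the correlation of seasonal rainfall between districts as the measure of similarity for K-means. The sketch below shows one way such a similarity connects to the Euclidean distance that K-means usually minimizes: for series standardized to zero mean and unit (population) variance, the squared Euclidean distance equals 2n(1 − r), where r is the Pearson correlation and n the number of seasons. This is an illustration only; the authors' exact procedure is not given in the abstract, and the district names and rainfall values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def zscore(x):
    """Standardize a series to zero mean and unit population variance."""
    n = len(x)
    mx = sum(x) / n
    sd = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    return [(a - mx) / sd for a in x]

# Hypothetical seasonal rainfall totals (mm) for five seasons.
rain = {
    "district_a": [820.0, 640.0, 910.0, 700.0, 760.0],
    "district_b": [790.0, 600.0, 950.0, 690.0, 800.0],
    "district_c": [410.0, 520.0, 380.0, 495.0, 430.0],
}

r_ab = pearson(rain["district_a"], rain["district_b"])
r_ac = pearson(rain["district_a"], rain["district_c"])

# Squared Euclidean distance between the standardized series of a and b.
za, zb = zscore(rain["district_a"]), zscore(rain["district_b"])
sq_dist_ab = sum((u - v) ** 2 for u, v in zip(za, zb))
n_seasons = len(za)
```

Under this assumption, districts whose rainfall varies together (high r) are close in the standardized space, so a standard K-means run on the standardized series groups them into the same zone.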

Aggregation of Indistinguishability Fuzzy Relations Revisited

Mathematics ◽

10.3390/math9121441 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1441

Author(s):

Juan-De-Dios González-Hedström ◽

Juan-José Miñana ◽

Oscar Valero

Keyword(s):

Equivalence Relation ◽

Fuzzy Relations ◽

Measure Of Similarity ◽

Dual Notion ◽

The Relationship

Indistinguishability fuzzy relations were introduced with the aim of providing a fuzzy notion of equivalence relation. Many works have explored their relation to metrics, since they can be interpreted as a kind of measure of similarity and this is, in fact, a dual notion to dissimilarity. Moreover, the problem of how to construct new indistinguishability fuzzy relations by means of aggregation has been explored in the literature. In this paper, we provide new characterizations of those functions that allow us to merge a collection of indistinguishability fuzzy relations into a new one in terms of triangular triplets and, in addition, we explore the relationship between such functions and those that aggregate extended pseudo-metrics, which are the natural distances associated with indistinguishability fuzzy relations. Our new results extend some already known characterizations which involve only bounded pseudo-metrics. In addition, we provide a completely new description of those indistinguishability fuzzy relations that separate points, and we show that both differ considerably.
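The duality between similarity and dissimilarity described above can be checked numerically in a simple case. The sketch assumes the Łukasiewicz t-norm T(a, b) = max(a + b − 1, 0) and pseudo-metrics bounded by 1, for which E = 1 − d is reflexive, symmetric and T-transitive. The arithmetic mean is used as one simple aggregation function; the characterizations in the paper are more general and are not reproduced here. All names and matrices are hypothetical.

```python
from itertools import product

def lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(a + b - 1.0, 0.0)

def is_indistinguishability(E, tnorm, tol=1e-12):
    """Check reflexivity, symmetry and T-transitivity of a square matrix E."""
    idx = range(len(E))
    if any(abs(E[x][x] - 1.0) > tol for x in idx):
        return False
    if any(abs(E[x][y] - E[y][x]) > tol for x, y in product(idx, repeat=2)):
        return False
    # T-transitivity: T(E(x, y), E(y, z)) <= E(x, z) for all x, y, z.
    return all(tnorm(E[x][y], E[y][z]) <= E[x][z] + tol
               for x, y, z in product(idx, repeat=3))

def from_metric(d):
    """Similarity induced by a pseudo-metric bounded by 1: E = 1 - d."""
    return [[1.0 - v for v in row] for row in d]

def mean_aggregate(E1, E2):
    """Entrywise arithmetic mean of two fuzzy relations."""
    return [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(E1, E2)]

# Two hypothetical pseudo-metrics on three points, both bounded by 1.
d1 = [[0.0, 0.2, 0.5], [0.2, 0.0, 0.4], [0.5, 0.4, 0.0]]
d2 = [[0.0, 0.6, 0.6], [0.6, 0.0, 0.3], [0.6, 0.3, 0.0]]

E1, E2 = from_metric(d1), from_metric(d2)
E_mean = mean_aggregate(E1, E2)

# A relation that is reflexive and symmetric but not T-transitive.
E_bad = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.9], [0.1, 0.9, 1.0]]
```

In this example the aggregated relation is again an indistinguishability relation, because the mean of the two underlying pseudo-metrics still satisfies the triangle inequality.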

Measure of Similarity between GMMs by Embedding of the Parameter Space That Preserves KL Divergence

Mathematics ◽

10.3390/math9090957 ◽

2021 ◽

Vol 9 (9) ◽

pp. 957

Author(s):

Branislav Popović ◽

Lenka Cepova ◽

Robert Cep ◽

Marko Janev ◽

Lidija Krstanović

Keyword(s):

Computational Complexity ◽

Parameter Space ◽

Recognition Accuracy ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Dimensional Manifold ◽

Dimensional Parameter ◽

Trade Off ◽

Measure Of Similarity ◽

Lower Dimensional

In this work, we deliver a novel measure of similarity between Gaussian mixture models (GMMs) by neighborhood preserving embedding (NPE) of the parameter space, which projects the components of GMMs that, by our assumption, lie close to a lower-dimensional manifold. By doing so, we obtain a transformation from the original high-dimensional parameter space into a much lower-dimensional resulting parameter space. Therefore, resolving the distance between two GMMs is reduced to calculating the distance between sets of lower-dimensional Euclidean vectors, taking the corresponding weights into account. A much better trade-off between the recognition accuracy and the computational complexity is achieved in comparison to measures utilizing distances between Gaussian components evaluated in the original parameter space. The proposed measure is much more efficient in machine learning tasks that operate on large data sets, as in such tasks, the required number of overall Gaussian components is always large. Artificial as well as real-world experiments are conducted, showing a much better trade-off between recognition accuracy and computational complexity of the proposed measure, in comparison to all baseline measures of similarity between GMMs tested in this paper.
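For context, a measure that works on Gaussian components in the original parameter space can be sketched as follows. The code uses the closed-form KL divergence between univariate Gaussians and the weighted-average approximation for mixtures, which is an upper bound on the true KL divergence by convexity. It is not the NPE-based measure proposed in the paper, and whether it is among the paper's baselines is not stated in the abstract. The mixtures and names are hypothetical.

```python
import math

def kl_gauss(mu1, var1, mu2, var2):
    """Closed-form KL(N(mu1, var1) || N(mu2, var2)) for univariate Gaussians."""
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def kl_weighted_average(gmm_f, gmm_g):
    """Weighted-average approximation of KL between two GMMs.

    Each GMM is a list of (weight, mean, variance) tuples. The result is
    sum_i sum_j w_i * v_j * KL(f_i || g_j), an upper bound on KL(f || g).
    The cost grows with the product of the two component counts.
    """
    return sum(wf * wg * kl_gauss(mf, vf, mg, vg)
               for wf, mf, vf in gmm_f
               for wg, mg, vg in gmm_g)

# Hypothetical two-component mixtures: (weight, mean, variance).
f = [(0.6, 0.0, 1.0), (0.4, 3.0, 0.5)]
g = [(0.5, 0.5, 1.0), (0.5, 2.5, 1.0)]

approx_fg = kl_weighted_average(f, g)
```

Because every pair of components is compared, the cost scales with the number of components in both mixtures, which is the kind of expense a lower-dimensional embedding aims to reduce. The bound is also loose: it is generally positive even when the two mixtures are identical.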

A comparison of 71 binary similarity coefficients: The effect of base rates

PLoS ONE ◽

10.1371/journal.pone.0247751 ◽

2021 ◽

Vol 16 (4) ◽

pp. e0247751

Author(s):

Michael Brusco ◽

J. Dennis Cradit ◽

Douglas Steinley

Keyword(s):

Simulation Experiment ◽

Base Rate ◽

Similarity Matrix ◽

Similarity Coefficients ◽

Binary Matrix ◽

Vast Number ◽

Base Rates ◽

Pairwise Correlations ◽

Measure Of Similarity ◽

Selection Of

There are many psychological applications that require collapsing the information in a two-mode (e.g., respondents-by-attributes) binary matrix into a one-mode (e.g., attributes-by-attributes) similarity matrix. This process requires the selection of a measure of similarity between binary attributes. A vast number of binary similarity coefficients have been proposed in fields such as biology, geology, and ecology. Although previous studies have reported cluster analyses of binary similarity coefficients, there has been little exploration of how cluster memberships are affected by the base rates (percentage of ones) for the binary attributes. We conducted a simulation experiment that compared two-cluster K-median partitions of 71 binary similarity coefficients based on their pairwise correlations obtained under 15 different base-rate configurations. The results reveal that some subsets of coefficients consistently group together regardless of the base rates. However, there are other subsets of coefficients that group together for some base rates, but not for others.
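The step of collapsing a two-mode binary matrix into a one-mode similarity matrix can be sketched with two well-known coefficients, Jaccard and simple matching. The example also shows why base rates matter: for two rare attributes that never co-occur, simple matching is high (it counts joint absences) while Jaccard is zero. The data are hypothetical, and returning 0 when an attribute pair has no ones at all is a convention chosen here.

```python
def contingency(x, y):
    """Counts (a, b, c, d): joint presences, x-only, y-only, joint absences."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return a, b, c, d

def jaccard(x, y):
    """Jaccard coefficient a / (a + b + c); ignores joint absences."""
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0

def simple_matching(x, y):
    """Simple matching coefficient (a + d) / n; counts joint absences."""
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)

def collapse(matrix, coeff):
    """Turn a respondents-by-attributes matrix into attributes-by-attributes."""
    cols = list(zip(*matrix))
    return [[coeff(ci, cj) for cj in cols] for ci in cols]

# Two rare attributes (base rate 10%) observed on ten respondents.
x = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
jac_xy = jaccard(x, y)
smc_xy = simple_matching(x, y)

# Four respondents by three attributes.
responses = [[1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
S = collapse(responses, jaccard)
```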

Visual object categorization in infancy

10.1101/2021.02.25.432436 ◽

2021 ◽

Author(s):

Céline Spriet ◽

Etienne Abassi ◽

Jean-Rémy Hochmann ◽

Liuba Papeo

Keyword(s):

Visual Cortex ◽

Age Groups ◽

Visual Exploration ◽

Object Categorization ◽

Visual Object ◽

Object Categories ◽

Representational Space ◽

The World ◽

Measure Of Similarity ◽

Incremental Process

Humans make sense of the world by organizing things into categories. When and how does this process begin? We investigated whether real-world object categories that spontaneously emerge in the first months of life match categorical representations of objects in the human visual cortex. Taking infants’ looking times as a measure of similarity, we defined a representational space where each object was defined in relation to others of the same or different categories. This space was compared with hypothesis-based and fMRI-based models of visual-object categorization in the adults’ visual cortex. Analyses across different age groups revealed an incremental process with two milestones. Between 4 and 10 months, visual exploration guided by saliency gives way to an organization according to the animate-inanimate distinction. Between 10 and 19 months, a category spurt leads towards a mature organization. We propose that these changes underlie the coupling between seeing and thinking in the developing mind.

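Normalizacja zmiennych a porządkowanie krajów Unii Europejskiej pod względem stopnia wykorzystania technologii ICT w przedsiębiorstwach (Normalization of variables and the ordering of European Union countries by the degree of ICT use in enterprises)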

Nierówności społeczne a wzrost gospodarczy ◽

10.15584/nsawg.2021.1.4 ◽

2021 ◽

Vol 65 (1) ◽

pp. 74-89

Author(s):

Patrycja Wieczorek ◽

Eliza Frejtag-Mika

Keyword(s):

European Union ◽

Statistical Data ◽

Linear Ordering ◽

Data Normalization ◽

Maximum Value ◽

European Union Countries ◽

Range Correlation ◽

Measure Of Similarity ◽

The Impact ◽

Diagnostic Variables

The main issue of multivariate comparative analysis is the normalization of variables. The literature offers various procedures for data normalization, and therefore the researcher has to choose between them. The article presents and discusses the most commonly used normalizing formulas. The article assesses the impact of data normalization procedures on the results of the linear ordering of European Union countries in terms of the level of ICT usage in enterprises. A hypothesis was formulated that the method of data normalization influenced the position of the objects in the ranking. The study is based on statistical data from Eurostat for the year 2018. Based on the selected diagnostic variables, values for a synthetic measure have been determined for individual countries. The synthetic measure was calculated according to the model-less method of linear ordering using four types of normalization. The method used in the research allowed the creation of rankings for the countries. The agreement of the orderings thus obtained was compared using Spearman’s coefficient of rank correlation and the measure of similarity of rankings. As the study shows, the choice of normalization formula influences the result of linear ordering, which is not due to any change in the data structure. It was proven that the quotient transformation with the normalization base equal to the maximum value produced the ranking of the examined objects most similar to the other rankings. The results of the study show that Denmark, Sweden and Finland had the highest positions in each ranking while Bulgaria, Romania and Latvia had the lowest positions.
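The effect described above, where the normalization formula changes the ranking, can be reproduced on a toy data set. The sketch assumes that the model-less synthetic measure is the mean of the normalized variables, that all variables are stimulants (higher is better), and that there are no tied scores. It compares two normalizations only (quotient by the maximum, and min-max), not the four used in the study. The data and function names are hypothetical.

```python
def quotient_by_max(col):
    """Quotient transformation with the maximum as normalization base."""
    m = max(col)
    return [v / m for v in col]

def min_max(col):
    """Min-max normalization to the interval [0, 1]."""
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) for v in col]

def synthetic_measure(rows, normalize):
    """Mean of normalized variables per object (assumed model-less measure)."""
    cols = [normalize(list(c)) for c in zip(*rows)]
    return [sum(vals) / len(vals) for vals in zip(*cols)]

def ranks(scores):
    """Rank 1 goes to the highest score; assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result

def spearman(r1, r2):
    """Spearman rank correlation for two rankings without ties."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Four hypothetical objects, two diagnostic variables each.
rows = [[93, 10], [95, 4], [100, 2], [92, 8]]

rank_quotient = ranks(synthetic_measure(rows, quotient_by_max))
rank_minmax = ranks(synthetic_measure(rows, min_max))
rho = spearman(rank_quotient, rank_minmax)
```

The first variable has a narrow range relative to its maximum, so min-max stretches its differences while the quotient transformation compresses them. That changes the order of the objects even though the underlying data are identical.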

Hemeroby and homotoneity of plant communities: can we detect evident codependencies?

BIO Web of Conferences ◽

10.1051/bioconf/20213100035 ◽

2021 ◽

Vol 31 ◽

pp. 00035

Author(s):

Andrei Zverev ◽

Natalia Shchegoleva ◽

Christina Levitskaya

Keyword(s):

Plant Communities ◽

Indicator Value ◽

Measure Of Similarity

The results of a codependency analysis are presented for 9 qualitative and 4 quantitative measures of the homotoneity of plant communities and the statuses of their naturalness, based on the indicator value scale of hemeroby tolerance of South Siberian plants. The highest correlation with the level of naturalness was shown by the qualitative multiplace Jaccard measure of similarity.
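A multiplace (more than two samples) Jaccard measure can be sketched on species lists. The form used below, the number of species shared by all samples divided by the number of species found in any sample, is one common generalization and is an assumption here; the abstract does not give the exact formula. The species codes are hypothetical.

```python
def jaccard_multiplace(*species_lists):
    """Shared species across all samples divided by species in any sample."""
    if not species_lists:
        raise ValueError("at least one sample is required")
    sets = [set(s) for s in species_lists]
    union = set().union(*sets)
    if not union:
        return 0.0  # convention chosen here for all-empty samples
    shared = set.intersection(*sets)
    return len(shared) / len(union)

# Three hypothetical sample plots from one community (species codes).
plot_1 = ["a", "b", "c"]
plot_2 = ["b", "c", "d"]
plot_3 = ["b", "c", "e"]

homotoneity = jaccard_multiplace(plot_1, plot_2, plot_3)
```

For two samples this reduces to the ordinary Jaccard coefficient. Values near 1 indicate floristically uniform plots.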

measure of similarity
Recently Published Documents

Skewed Jensen–Fisher Divergence and Its Bounds

First-mover Advantage Explains Gender Disparities in Physics Citations

A Network View of Portfolio Optimization Using Fundamental Information

Meteorological sub-divisions of India : Assessment of coherence, homogeneity and recommended redelineation

Aggregation of Indistinguishability Fuzzy Relations Revisited

Measure of Similarity between GMMs by Embedding of the Parameter Space That Preserves KL Divergence

A comparison of 71 binary similarity coefficients: The effect of base rates

Visual object categorization in infancy

Normalizacja zmiennych a porządkowanie krajów Unii Europejskiej pod względem stopnia wykorzystania technologii ICT w przedsiębiorstwach

Hemeroby and homotoneity of plant communities: can we detect evident codependencies?
