An empirical assessment of quality metrics for diversified similarity searching

2021 ◽  
Vol 12 (3) ◽  
Author(s):  
Camila R. Lopes ◽  
Lúcio F. D. Santos ◽  
Daniel L. Jasbick ◽  
Daniel De Oliveira ◽  
Marcos Bedo

A diversified similarity search retrieves elements that are simultaneously similar to a query object and akin to the different collections within the explored data. While several methods in information retrieval, data clustering, and similarity searching have tackled the problem of adding diversity into result sets, the experimental comparison of their performances is still an open issue mainly because the quality metrics are “borrowed” from those different research areas, bringing their biases alongside. In this manuscript, we investigate a series of such metrics and experimentally discuss their trends and limitations. We conclude diversity is better addressed by a set of measures rather than a single quality index and introduce the concept of Diversity Features Model (DFM), which combines the viewpoints of biased metrics into a multidimensional representation. Experimental evaluations indicate (i) DFM enables comparing different result diversification algorithms by considering multiple criteria, and (ii) the most suitable searching methods for a particular dataset are spotted by combining DFM with ranking aggregation and parallel coordinates maps.

2020 ◽  
Author(s):  
Camila L. Lopes ◽  
Daniel L. Jasbick ◽  
Marcos Bedo ◽  
Lúcio F.D. Santos

Diversity-oriented searches retrieve objects that are not only similar to a reference element but also related to the different types of collections within the queried dataset. While such a characterization is flexible enough to include methods originally from information retrieval, data clustering, and similarity searching under the same umbrella, diversity metrics are expected to be much less paradigm-biased in order to discriminate which approaches are more suitable and when they should be applied. Accordingly, we extend and implement a broad set of quality metrics from those distinct realms and experimentally discuss their trends and limitations. In particular, we evaluate the suitability of data clustering indexes and similarity-driven measures regarding their adherence to diversified similarity searching. Experiments on real-world datasets indicate such measures are capable of distinguishing diversity methods from different paradigms, but they heavily favor the approaches of the same group, especially cluster indexes. As an alternative, we argue diversity is better addressed by a set of measures rather than a single quality value. Therefore, we propose the Diversity Features Model (DFM), which combines the perspectives of the competing approaches into a multidimensional point whose features are calculated based on the distance distribution within both retrieved and queried datasets. Empirical evaluations showed DFM compares different diversity searching approaches by considering multiple criteria, whereas overall winners can be found by ranking aggregation or visualized through parallel coordinates maps.
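
The ranking-aggregation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Borda count as the aggregation rule, which the abstract does not name, and the method names and feature values are hypothetical placeholders rather than results from the paper. Each method is represented by a multidimensional point of quality scores, and every feature is assumed to be "higher is better".

```python
from typing import Dict, List


def borda_aggregate(scores: Dict[str, List[float]]) -> Dict[str, int]:
    """Aggregate per-feature rankings into one Borda score per method.

    `scores` maps a method name to its feature vector. All vectors must
    have the same length, and larger values are assumed to be better.
    Ties are broken by insertion order (a simplification).
    """
    methods = list(scores)
    n_features = len(scores[methods[0]])
    points = {m: 0 for m in methods}
    for j in range(n_features):
        # Rank methods on feature j, best first.
        ordered = sorted(methods, key=lambda m: scores[m][j], reverse=True)
        for rank, m in enumerate(ordered):
            # Best method gets len(methods) - 1 points, worst gets 0.
            points[m] += len(methods) - 1 - rank
    return points


# Hypothetical feature vectors for three diversification methods.
example = {
    "method_a": [0.9, 0.2, 0.5],
    "method_b": [0.6, 0.8, 0.7],
    "method_c": [0.3, 0.5, 0.9],
}
totals = borda_aggregate(example)
winner = max(totals, key=totals.get)
print(totals, winner)
```

No method wins on every feature here, yet the aggregated ranking still singles out one overall candidate, which is the kind of multi-criteria comparison the abstract describes.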


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 591 ◽  
Author(s):  
Swarit Jasial ◽  
Ye Hu ◽  
Martin Vogt ◽  
Jürgen Bajorath

A largely unsolved problem in chemoinformatics is the issue of how calculated compound similarity relates to activity similarity, which is central to many applications. In general, activity relationships are predicted from calculated similarity values. However, there is no solid scientific foundation to bridge between calculated molecular and observed activity similarity. Accordingly, the success rate of identifying new active compounds by similarity searching is limited. Although various attempts have been made to establish relationships between calculated fingerprint similarity values and biological activities, none of these has yielded generally applicable rules for similarity searching. In this study, we have addressed the question of molecular versus activity similarity in a more fundamental way. First, we have evaluated if activity-relevant similarity value ranges could in principle be identified for standard fingerprints and distinguished from similarity resulting from random compound comparisons. Then, we have analyzed if activity-relevant similarity values could be used to guide typical similarity search calculations aiming to identify active compounds in databases. It was found that activity-relevant similarity values can be identified as a characteristic feature of fingerprints. However, it was also shown that such values cannot be reliably used as thresholds for practical similarity search calculations. In addition, the analysis presented herein helped to rationalize differences in fingerprint search performance.
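
As an illustration of what a calculated fingerprint similarity value looks like, the following sketch assumes the Tanimoto coefficient, a common choice for binary fingerprints that the abstract itself does not name. Fingerprints are modelled as sets of on-bit indices; the bit sets and the threshold value are hypothetical.

```python
from typing import Dict, List, Set


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0  # Convention chosen here for two empty fingerprints.
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def search_by_threshold(
    query: Set[int], database: Dict[str, Set[int]], threshold: float
) -> List[str]:
    """Return names of database compounds at or above the threshold."""
    return [
        name for name, fp in database.items()
        if tanimoto(query, fp) >= threshold
    ]


# Hypothetical fingerprints; the threshold of 0.5 is arbitrary.
query_fp = {1, 2, 3, 4}
database = {
    "cpd_1": {1, 2, 3, 4},
    "cpd_2": {3, 4, 5, 6},
    "cpd_3": {7, 8},
}
print(search_by_threshold(query_fp, database, threshold=0.5))
```

A fixed cut-off like this is exactly the kind of threshold the study found unreliable for practical similarity searching, so the sketch shows the mechanics only, not a recommended protocol.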


Author(s):  
Blandine Bril ◽  
Jeroen Smaers ◽  
James Steele ◽  
Robert Rein ◽  
Tetsushi Nonaka ◽  
...  

Various authors have suggested behavioural similarities between tool use in early hominins and chimpanzee nut cracking, where nut cracking might be interpreted as a precursor of more complex stone flaking. In this paper, we bring together and review two separate strands of research on chimpanzee and human tool use and cognitive abilities. Firstly, and in the greatest detail, we review our recent experimental work on behavioural organization and skill acquisition in nut-cracking and stone-knapping tasks, highlighting similarities and differences between the two tasks that may be informative for the interpretation of stone tools in the early archaeological record. Secondly, and more briefly, we outline a model of the comparative neuropsychology of primate tool use and discuss recent descriptive anatomical and statistical analyses of anthropoid primate brain evolution, focusing on cortico-cerebellar systems. By juxtaposing these two strands of research, we are able to identify unsolved problems that can usefully be addressed by future research in each of these two research areas.


2015 ◽  
Vol 57 (1) ◽  
Author(s):  
Dirk J. Lehmann ◽  
Sebastian Hundt ◽  
Holger Theisel

Abstract. The number of visualizations required for a complete view of data grows non-linearly with the number of data dimensions. Thus, relevant visualizations need to be filtered to guide the user during the visual search. A popular filtering approach is the use of quality metrics, which map a visual pattern to a real number. This way, visualizations that contain interesting patterns are detected automatically. Quality metrics are a useful tool in visual analysis if they resemble human perception. In this work, we present a broad study examining the relation between filtering relevant visualizations based on human perception and filtering based on quality metrics. For this, seven widely used quality metrics were tested on five high-dimensional datasets, covering scatterplots, parallel coordinates, and radial visualizations. In total, 102 participants were available. The results of our studies show that quality metrics often work similarly to human perception. Interestingly, a subset of so-called Scagnostic measures does the best job.
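
The idea of a quality metric that maps a visualization to a real number can be sketched as follows. The example assumes absolute Pearson correlation as the metric for scatterplots; this is an illustrative stand-in and not necessarily one of the seven metrics evaluated in the study. The column names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def abs_pearson(xs: List[float], ys: List[float]) -> float:
    """Absolute Pearson correlation, used here as a scatterplot score."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant axis carries no linear pattern.
    return abs(cov) / sqrt(var_x * var_y)


def rank_scatterplots(
    columns: Dict[str, List[float]]
) -> List[Tuple[Tuple[str, str], float]]:
    """Score every pair of dimensions and sort best first."""
    scored = [
        ((a, b), abs_pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical three-dimensional dataset.
data = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [1.0, -1.0, 1.0, -1.0],
}
for pair, score in rank_scatterplots(data):
    print(pair, round(score, 3))
```

Sorting by the score filters the quadratically growing set of axis pairs down to the candidates most likely to show a pattern, which is the filtering role the abstract assigns to quality metrics.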


Author(s):  
Андрей Сергеевич Рубель ◽  
Владимир Васильевич Лукин

Images are subject to noise during acquisition, transmission and processing. Image denoising is highly desirable, not only to provide better visual quality, but also to improve the performance of subsequent operations such as compression, segmentation, classification, object detection and recognition. In the past decades, a large number of image denoising algorithms have been developed, ranging from simple linear methods to complex methods based on similar-block search and deep convolutional neural networks. However, most existing denoising techniques tend to oversmooth image edges, fine details and textures. Thus, there are cases when noise reduction leads to a loss of image features and filtering does not produce better visual quality. Accordingly, it is very important to evaluate the denoising result and hence to decide whether denoising is expedient. Although image denoising has been one of the most active research areas, little work has been dedicated to visual quality evaluation for denoised images. There are many approaches and metrics to characterize image quality, but the adequacy of these metrics is questionable. Existing image quality metrics, especially no-reference ones, have not been thoroughly studied for image denoising. When visual quality metrics are used, it is usually supposed that the higher the improvement for a given metric, the better the visual quality of the denoised image. However, there are situations when denoising does not result in visual quality enhancement, especially for texture images. Thus, it would be desirable to predict human subjective evaluation of a denoised image. This information would then clarify when denoising can be expedient. The purpose of this paper is to analyze denoising expedience using no-reference (NR) image quality metrics.
In addition, this work considers possible ways to predict human subjective evaluation of denoised images based on several input parameters. In more detail, two denoising techniques, namely the standard sliding-window DCT filter and the BM3D filter, have been considered. Using SubjectiveIQA, a specialized database of test images, a performance evaluation of existing state-of-the-art objective no-reference quality metrics for denoised images is carried out.


2008 ◽  
Vol 16 (3) ◽  
pp. 131-134
Author(s):  
Urte Scholz ◽  
Rainer Hornung

Abstract. The main research areas of the Social and Health Psychology group at the Department of Psychology at the University of Zurich, Switzerland, are introduced. Three ongoing projects are described as examples. The project “Dyadic exchange processes in couples facing dementia” examines social exchanges in couples with the husband suffering from dementia and is based on Equity Theory. This project applies a multi-method approach by combining self-report with observational data. The “Swiss Tobacco Monitoring System” (TMS) is a representative survey on smoking behaviour in Switzerland. Besides its survey character, the Swiss TMS also allows for testing psychological research questions on smoking with a representative sample. The project “Theory-based planning interventions for changing nutrition behaviour in overweight individuals” elaborates on the concept of planning. More specifically, it tests whether a critical number of repetitions of a planning intervention (e.g., three or nine times) is needed to ensure long-term effects.

