Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 2: speed, consistency, diversity selection

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Ramón Alain Miranda-Quintana ◽  
Anita Rácz ◽  
Dávid Bajusz ◽  
Károly Héberger

Abstract Despite being a central concept in cheminformatics, molecular similarity has so far been limited to the simultaneous comparison of only two molecules at a time, using one index, generally the Tanimoto coefficient. In a recent contribution we not only introduced a complete mathematical framework for extended similarity calculations (i.e., comparisons of more than two molecules at a time) but also defined a series of novel indices. Part 1 is a detailed analysis of the effects of various parameters on the similarity values calculated by the extended formulas. Their features were revealed by sum of ranking differences and ANOVA. Here, in addition to characterizing several important aspects of the newly introduced similarity metrics, we highlight their applicability and utility in real-life scenarios using datasets with popular molecular fingerprints. Remarkably, for large datasets, the use of extended similarity measures provides an unprecedented speed-up over “traditional” pairwise similarity matrix calculations. We also provide illustrative examples of a more direct algorithm based on the extended Tanimoto similarity to select diverse compound sets, resulting in much higher levels of diversity than traditional approaches. We discuss the inner and outer consistency of our indices, which are key in practical applications, showing whether the n-ary and binary indices rank the data in the same way. We demonstrate the use of the new n-ary similarity metrics on t-distributed stochastic neighbor embedding (t-SNE) plots of datasets of varying diversity, or corresponding to ligands of different pharmaceutical targets, which show that our indices provide a better measure of set compactness than standard binary measures. We also present a conceptual example of the applicability of our indices in agglomerative hierarchical algorithms. The Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons
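
The idea of comparing a whole set at once, rather than filling an N × N pairwise matrix, can be illustrated with a short Python sketch. It is a simplified, unweighted illustration and not the code from the repository linked above: the function name `extended_tanimoto_sketch`, the `threshold` parameter, and the plain counting of columns are assumptions made here for clarity, and the weighting functions of the published indices are not reproduced. Each fingerprint column is classified from its column sum alone, so the cost grows linearly with the number of molecules.

```python
import numpy as np

def extended_tanimoto_sketch(fps, threshold=0):
    """Unweighted n-ary Tanimoto-style index from column sums (illustrative only)."""
    fps = np.asarray(fps, dtype=int)       # rows: molecules, columns: fingerprint bits
    n = fps.shape[0]
    col_sums = fps.sum(axis=0)             # one pass over the data, no pairwise matrix
    delta = 2 * col_sums - n               # > 0: mostly ones, < 0: mostly zeros
    one_sim = np.count_nonzero(delta > threshold)            # bits "on" in most molecules
    dissim = np.count_nonzero(np.abs(delta) <= threshold)    # bits with no clear majority
    denom = one_sim + dissim               # columns that are mostly zeros are ignored
    return one_sim / denom if denom else 0.0

# Two molecules sharing one "on" bit and differing in two others
example = extended_tanimoto_sketch([[1, 1, 0, 0], [1, 0, 1, 0]])
```

With two fingerprints and `threshold=0`, this count reduces to the familiar binary Tanimoto ratio a / (a + b + c).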

Author(s):  
Jimmy Ming-Tai Wu ◽  
Qian Teng ◽  
Shahab Tayeb ◽  
Jerry Chun-Wei Lin

Abstract High average-utility itemset mining (HAUIM) was established to provide a fair measure in place of generic high-utility itemset mining (HUIM) for revealing satisfying and interesting patterns. In practical applications, the database changes dynamically as insertion and deletion operations are performed. Several works were designed to handle insertion, but fewer studies have focused on handling deletion for knowledge maintenance. In this paper, we develop the PRE-HAUI-DEL algorithm, which applies the pre-large concept to HAUIM to handle transaction deletion in dynamic databases. The pre-large concept serves as a buffer in HAUIM that reduces the number of database scans when the database is updated, particularly under transaction deletion. Two upper-bound values are also established to prune unpromising candidates early, which reduces the computational cost. The experimental results show that the designed PRE-HAUI-DEL algorithm performs well compared with the Apriori-like model in terms of runtime, memory, and scalability in dynamic databases.


2018 ◽  
Vol 8 (9) ◽  
pp. 1646 ◽  
Author(s):  
Qi Yao ◽  
Hongbing Wang ◽  
Jim Uttley ◽  
Xiaobo Zhuang

Big lighting data are required to evaluate lighting performance and its impacts on human beings, the environment, and ecology for smart urban lighting. However, traditional approaches to measuring road lighting cannot achieve this aim. We propose a rule-of-thumb model approach based on a set of feature points to reconstruct road lighting in urban areas. We validated the reconstructed illuminance with both software-simulated and real road lighting scenes, and the average error is between 6 and 19%. This precision is acceptable in practical applications. Using this approach, we reconstructed the illuminance of three real road lighting environments in a block and further estimated the mesopic luminance and melanopic illuminance performance. In the future, by virtue of Geographic Information System technology, the approach may provide big lighting data for evaluation and analysis, and help build smarter urban lighting.


Author(s):  
Jae Young Choi

Recently, considerable research effort has been devoted to the effective use of facial color information for improved recognition performance. Of all color-based face recognition (FR) methods, the most widely used is a color FR method based on input-level fusion. In this method, augmented input vectors of the color images are first generated by concatenating different color components (including both luminance and chrominance information) in column order at the input level, and a feature subspace is then trained with a set of augmented input vectors. However, in practical applications, a testing image may be captured as a grayscale image rather than as a color image, mainly because of heterogeneous image acquisition environments. A grayscale testing image causes a so-called dimensionality mismatch between the trained feature subspace and the testing input vector. This disparity in dimensionality degrades FR reliability and even imposes a significant restriction on carrying out FR operations in practical color FR systems. To resolve the dimensionality mismatch, we propose a novel approach to estimate a new feature subspace suitable for recognizing a grayscale testing image. In particular, the new feature subspace is estimated from a given feature subspace created using color training images. The effectiveness of the proposed solution has been successfully tested on four public face databases (DBs): CMU, FERET, XM2VTSDB, and ORL. Extensive comparative experiments showed that the proposed solution works well for resolving the dimensionality mismatch that matters in real-life color FR systems.
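
The dimensionality mismatch described in this abstract can be made concrete with a small Python sketch. It only illustrates input-level fusion and the resulting vector lengths, and it is not the author's subspace-estimation method: the function name `augmented_input_vector`, the 4 × 3 image size, and the three-channel layout are assumptions chosen for the example.

```python
import numpy as np

def augmented_input_vector(image):
    """Flatten each channel column by column and concatenate the results."""
    image = np.asarray(image)
    if image.ndim == 2:                      # grayscale: height x width, one channel
        channels = [image]
    else:                                    # color: height x width x channels
        channels = [image[:, :, c] for c in range(image.shape[2])]
    # order="F" reads each channel in column-major (column-by-column) order
    return np.concatenate([ch.flatten(order="F") for ch in channels])

color = np.zeros((4, 3, 3))                  # hypothetical 4 x 3 image, 3 color components
gray = np.zeros((4, 3))                      # the same scene captured in grayscale
color_vec = augmented_input_vector(color)    # length 4 * 3 * 3 = 36
gray_vec = augmented_input_vector(gray)      # length 4 * 3 = 12
```

A feature subspace trained on 36-dimensional color vectors cannot directly project the 12-dimensional grayscale vector, which is the mismatch the abstract sets out to resolve.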


Author(s):  
Yumin Ma ◽  
Ting Li ◽  
Yongzhong Song ◽  
Xingju Cai

In this paper, we consider nonseparable convex minimization models with quadratic coupling terms, which arise in many practical applications. We use a majorized indefinite proximal alternating direction method of multipliers (iPADMM) to solve this model. Because the proximal matrices are indefinite, the function actually minimized in each subproblem is no longer necessarily a majorization of the original function. Convergence can nevertheless be guaranteed, and a larger step size is permitted, which can speed up convergence. For this model, we analyze the global convergence of majorized iPADMM with two different techniques, as well as its sublinear convergence rate in the nonergodic sense. Numerical experiments illustrate the advantages of indefinite proximal matrices over positive definite or semi-definite proximal matrices.


Author(s):  
Sreenu G. ◽  
M.A. Saleem Durai

Recent advances in hardware technology have made it possible to record transactions and other pieces of information from everyday life at a rapid pace. Beyond gains in speed and storage capacity, real-life perceptions tend to change over time. However, a great deal of potential and highly useful value remains hidden in the vast volume of data. Conventional data mining is not suitable for this kind of application, so existing algorithms should be tuned or modified, or new ones designed. Big data computing is entering the category of the most promising technologies, pointing toward new ways of thinking and decision making. The era of big data helps users take advantage of all available data to gain more precise, systematic results or to uncover latent information, and then make the best possible decisions. Drawing on a broad set of workloads, the author establishes a set of classification measures based on the storage architecture, processing types, processing techniques, and the tools and technologies used.


Author(s):  
Norman Gwangwava ◽  
Catherine Hlahla

Using 3D printing technology in learning institutions brings an industrial experience to learners, as well as exposure to the same cutting-edge technologies encountered in real-life careers. The chapter explores 3D printing technology at kindergarten (preschool), in the lecture room (BEng programme), and in ready-to-use 3D printed products. In educational toy applications, poor product designs that do not meet children's dimensional and safety requirements can lead to injuries, musculoskeletal disorders, and health problems, some of which may only be experienced when the children grow up. To address the problem of poor design, anthropometric dimensions of male and female children aged 6 to 7 years were measured, and concepts for educational toys were then generated. Other practical applications of 3D printing technology explored in the chapter are lecture room demonstrations, prototyping of design projects, and web-based mass customization of office mini-storage products.


Author(s):  
Yves Vanrompay ◽  
Manuele Kirsch-Pinheiro ◽  
Yolande Berbers

The current evolution of Service-Oriented Computing in ubiquitous systems is leading to the development of context-aware services. Context-aware services are services of which the description is enriched with context information related to non-functional requirements, describing the service execution environment or its adaptation capabilities. This information is often used for discovery and adaptation purposes. However, in real-life systems, context information is naturally dynamic, uncertain, and incomplete, which represents an important issue when comparing the service description with user requirements. Uncertainty of context information may lead to an inexact match between provided and required service capabilities, and consequently to the non-selection of services. In this chapter, we focus on how to handle uncertain and incomplete context information for service selection. We consider this issue by presenting a service ranking and selection algorithm, inspired by graph-based matching algorithms. This graph-based service selection algorithm compares contextual service descriptions using similarity measures that allow inexact matching. The service description and non-functional requirements are compared using two kinds of similarity measures: local measures, which compare individually required and provided properties, and global measures, which take into account the context description as a whole.


Author(s):  
José D. Martín-Guerrero ◽  
Emilio Soria-Olivas ◽  
Paulo J.G. Lisboa ◽  
Antonio J. Serrano-López

This work is intended to provide a review of real-life practical applications of Artificial Intelligence (AI) methods. We focus on the use of Machine Learning (ML) methods applied to real problems rather than to synthetic problems with standard and controlled environments. In particular, we will describe the following problems in the next sections:
• Optimization of Erythropoietin (EPO) dosages in anaemic patients undergoing Chronic Renal Failure (CRF).
• Optimization of a recommender system for citizen web portal users.
• Optimization of a marketing campaign.
These problems were chosen for their relevance and their heterogeneity. This heterogeneity shows the capabilities and versatility of ML methods to solve real-life problems in very different fields of knowledge. The following methods will be mentioned during this work:
• Artificial Neural Networks (ANNs): Multilayer Perceptron (MLP), Finite Impulse Response (FIR) Neural Network, Elman Network, Self-Organizing Maps (SOMs) and Adaptive Resonance Theory (ART).
• Other clustering algorithms: K-Means, Expectation-Maximization (EM) algorithm, Fuzzy C-Means (FCM), Hierarchical Clustering Algorithms (HCA).
• Generalized Auto-Regressive Conditional Heteroskedasticity (GARCH).
• Support Vector Regression (SVR).
• Collaborative filtering techniques.
• Reinforcement Learning (RL) methods.


2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Blaž Škrlj ◽  
Jan Kralj ◽  
Nada Lavrač

Abstract Complex networks are used as a means of representing multimodal, real-life systems. As increasing amounts of data lead to large multilayer networks consisting of different node and edge types, which can also be subject to temporal change, there is a growing need for versatile visualization and analysis software. This work presents a lightweight Python library, Py3plex, which focuses on the visualization and analysis of multilayer networks. The library implements a set of simple graphical primitives supporting intra- as well as inter-layer visualization. It also supports many common operations on multilayer networks, such as aggregation, slicing, indexing, traversal, and more. The paper also focuses on how node embeddings can be used to speed up contemporary (multilayer) layout computation. The library’s functionality is showcased on both real and synthetic networks.


2019 ◽  
Vol 214 ◽  
pp. 07012 ◽  
Author(s):  
Nikita Balashov ◽  
Maxim Bashashin ◽  
Pavel Goncharov ◽  
Ruslan Kuchumov ◽  
Nikolay Kutovskiy ◽  
...  

Cloud computing has become a routine tool for scientists in many fields. The JINR cloud infrastructure provides JINR users with computational resources to perform various scientific calculations. To speed up the achievement of scientific results, the JINR cloud service for parallel applications has been developed. It consists of several components and implements a flexible and modular architecture that allows both more applications and various types of resources to be used as computational backends. An example of using the Cloud&HybriLIT resources in scientific computing is the study of superconducting processes in stacked long Josephson junctions (LJJ). LJJ systems have undergone intensive research because of the prospect of practical applications in nano-electronics and quantum computing. In this contribution we summarize the experience of applying the Cloud&HybriLIT resources to high-performance computing of physical characteristics in the LJJ system.

