similarity scores: Recently Published Documents

TOTAL DOCUMENTS: 82 (five years: 37)
H-INDEX: 15 (five years: 3)

Metabolites · 2022 · Vol 12 (1) · pp. 68
Author(s): Jesi Lee, Tobias Kind, Dean Joseph Tantillo, Lee-Ping Wang, Oliver Fiehn

Mass spectrometry is the most commonly used method for compound annotation in metabolomics. However, most mass spectra in untargeted assays cannot be annotated with specific compound structures because reference mass spectral libraries are far smaller than the complement of known molecules. Theoretically predicted mass spectra might be used as a substitute for experimental spectra, especially for compounds that are not commercially available. For example, the Quantum Chemistry Electron Ionization Mass Spectra (QCEIMS) method can predict 70 eV electron ionization mass spectra from any given input molecular structure. In this work, we investigated the accuracy of QCEIMS predictions of electron ionization (EI) mass spectra for 80 purine and pyrimidine derivatives in comparison to experimental data in the NIST 17 database. Similarity scores between every pair of predicted and experimental spectra revealed that 45% of the compounds were found as the correct top hit when QCEIMS-predicted spectra were matched against the NIST 17 library of >267,000 EI spectra, and 74% of the compounds were found within the top 10 hits. We then investigated how matching, missing, and additional fragment ions, as well as ion abundances, affect MS similarity scores. We further include detailed studies of fragmentation pathways, such as retro-Diels–Alder reactions, to predict neutral losses of (iso)cyanic acid, hydrogen cyanide, or cyanamide in the mass spectra of purines and pyrimidines. We describe how trends in prediction accuracy correlate with the chemistry of the input compounds to better understand how the mechanisms of QCEIMS predictions could be improved in future developments. We conclude that QCEIMS is useful for generating large-scale predicted mass spectral libraries for identifying compounds that are absent from experimental libraries and not commercially available.
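The library matching step described above can be sketched with a plain cosine similarity between centroided spectra; real NIST searches use more elaborate weighted dot products, and the compound names and peak lists below are hypothetical:

```python
import math

def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {mz: intensity} dicts.
    Peaks are aligned on integer m/z; a peak missing on one side counts as zero."""
    mzs = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(mz, 0.0) * spec_b.get(mz, 0.0) for mz in mzs)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_library(predicted, library):
    """Rank library entries by similarity to a predicted spectrum (best first)."""
    scores = [(name, cosine_similarity(predicted, spec))
              for name, spec in library.items()]
    return sorted(scores, key=lambda x: x[1], reverse=True)

# Toy predicted spectrum matched against a tiny two-entry library.
predicted = {39: 10.0, 67: 45.0, 135: 100.0}
library = {
    "adenine":  {39: 12.0, 67: 40.0, 135: 100.0},
    "cytosine": {28: 30.0, 69: 100.0, 111: 80.0},
}
ranked = rank_library(predicted, library)
```

A high score for the correct compound and a near-zero score for an unrelated one mirrors the top-hit behavior reported in the abstract.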


2021
Author(s): Guru Nagaraj, Prashanth Pillai, Mandar Kulkarni

Abstract: Over the years, well test analysis or pressure transient analysis (PTA) methods have progressed from straight lines via type-curve analysis to pressure derivatives and deconvolution methods. Today, analysis of the log-log response (pressure and its derivative) is the most widely used method for PTA. Although these methods are widely available through commercial software, they are not fully automated, and human interaction is needed for their application. Furthermore, PTA is an inverse problem whose solution is in general non-unique: several models (well, reservoir, and boundary) can fit a similar pressure-derivative response. With the conventional approach, this often causes confusion in choosing the correct model and results in multiple time-consuming iterations requiring constant human interaction. Our approach automates the process of PTA using a Siamese neural network (SNN) architecture comprising convolutional neural network (CNN) and long short-term memory (LSTM) layers. The SNN model is trained on simulated experimental data created using a design of experiments (DOE) approach covering the 14 most common interpretation scenarios across well, reservoir, and boundary model types. Across each model type, parameters such as permeability, horizontal well length, skin factor, and distance to the boundary were sampled to compute 560 different pressure-derivative responses. The SNN is trained with a self-supervised strategy in which positive and negative pairs are generated from the training data: transformations such as compression and expansion produce positive and negative pairs of well test model responses. For a given well test model response, similarity scores are computed against the candidates in each model class, the best match from each class is identified, and these matches are then ranked by similarity score to identify optimal candidates.
Experimental analysis indicated that the true model class frequently appeared among the top-ranked classes. The model achieves an accuracy of 93% for the top-one model recommendation when tested on 70 samples from the 14 interpretation scenarios. Prior information on the top-ranked probable well test models significantly reduces the manual effort involved in the analysis. This machine learning (ML) approach can be integrated with any PTA software or function as a standalone application in the interpreter's system. The current SNN with LSTM layers can speed up detection of the pressure-derivative response explained by a given combination of well, reservoir, and boundary models and produce models with less user interaction. This methodology will help the interpretation engineer recognize models faster for detailed integration with additional information from sources such as geophysics, geology, petrophysics, drilling, and production logging.
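The final ranking step, keeping the best candidate score per model class and then ordering the classes, can be sketched as follows; the class names and scores are hypothetical, and the SNN that produces the scores is omitted:

```python
def rank_model_classes(scores_by_class):
    """scores_by_class maps a model class to the similarity scores of its
    candidate responses. Keep the best score per class, then rank classes."""
    best = {cls: max(scores) for cls, scores in scores_by_class.items()}
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical similarity scores of a measured response against candidates
# in three interpretation-scenario classes.
scores = {
    "infinite_acting": [0.41, 0.55],
    "single_fault":    [0.78, 0.92, 0.60],
    "closed_boundary": [0.30],
}
ranking = rank_model_classes(scores)
```

The interpreter would then inspect only the top-ranked classes instead of iterating over all candidate models manually.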


Author(s): Mourad Fariss, Naoufal El Allali, Hakima Asaidi, Mohamed Bellouki

Web service (WS) discovery is an essential task for implementing complex applications in a service-oriented architecture (SOA), including selecting, composing, and providing services. This task is semantically limited in how it matches the customer's request against the available web services. Furthermore, applying suitable similarity methods to the growing number of WSs is essential for efficient web service discovery. To overcome these limitations, we propose a new approach for web service discovery that integrates multiple similarity measures and k-means clustering. The approach retrieves services that more accurately match the customer's request by calculating different similarity scores between the request and the web services. The global semantic similarity is determined by applying k-means clustering to the obtained similarity scores. The experimental results demonstrated that the proposed semantic web service discovery approach outperforms state-of-the-art approaches in terms of precision (98%), recall (95%), and F-measure (96%). The proposed approach is designed to efficiently support and facilitate the web service selection and composition phases in complex applications.
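The clustering step can be illustrated with a minimal hand-rolled k-means over per-service similarity vectors; the similarity values are invented, and a production system would use a library implementation and the paper's actual similarity measures:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal k-means. Each point is a vector of similarity scores
    between one web service and the customer's request."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each service to its nearest centroid (squared distance).
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        new = [([sum(d) / len(cl) for d in zip(*cl)] if cl else c)
               for c, cl in zip(centroids, clusters)]
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

# Each row: [name similarity, description similarity] to the request.
services = [[0.91, 0.88], [0.89, 0.93], [0.12, 0.20], [0.15, 0.10]]
centroids, clusters = kmeans(services, k=2)
```

The cluster whose centroid has the highest scores would hold the services most relevant to the request.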


i-Perception · 2021 · Vol 12 (6) · pp. 204166952110587
Author(s): Zhaoyi Li, Xiaofang Lei, Xinze Yan, Zhiguo Hu, Hongyan Liu

The present study aims to explore the influence of masculine/feminine changes on the attractiveness evaluation of one's own face, and to examine the relationship between this attractiveness evaluation and the similarity of masculine/feminine faces to the original faces. A picture was taken of each participant and treated as his or her original self-face, and a male or female face with an average attractiveness score was adopted as the original other-face. Masculinized and feminized transformations of the original faces (self-face, male other-face, and female other-face) toward 100% masculine and feminine faces were produced with morphing software in 2% steps. Thirty female and 30 male participants completed three tasks: judging whether they liked a given face compared to the original face, choosing the most attractive face from a morphed facial clip, and subjectively rating the attractiveness and similarity of the morphed faces. The results revealed that the acceptable range of masculine/feminine transformation for self-faces was narrower than that for other faces. Furthermore, the attractiveness ratings for masculinized or feminized self-faces were correlated with the similarity scores between those faces and the original self-faces. These findings suggest that enhancing the attractiveness of the self-face through masculinity/femininity must stay within a reasonable range and take into account the similarity between the modified face and the original self-face.
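The reported correlation between attractiveness ratings and self-face similarity can be illustrated with a plain Pearson correlation; the per-face scores below are invented for illustration and do not come from the study:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: similarity of each morphed face to the original
# self-face (%) and the attractiveness rating it received.
similarity     = [100, 90, 80, 70, 60, 50]
attractiveness = [6.1, 6.0, 5.4, 4.8, 4.1, 3.5]
r = pearson_r(similarity, attractiveness)
```

A strongly positive r, as in this toy data, would match the study's finding that faces closer to the original self-face were rated as more attractive.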


2021 · Vol 12
Author(s): Yuhua Yao, Binbin Ji, Yaping Lv, Ling Li, Ju Xiang, ...

Studies have found that long non-coding RNAs (lncRNAs) play important roles in many human biological processes, and it is critical to explore potential lncRNA–disease associations, especially cancer-associated lncRNAs. However, traditional biological experiments are costly and time-consuming, so it is of great significance to develop effective computational models. We developed a random walk with restart algorithm on multiplex and heterogeneous networks of lncRNAs and diseases to predict lncRNA–disease associations (MHRWRLDA). First, multiple disease similarity networks were constructed by using different approaches to calculate similarity scores between diseases, and multiple lncRNA similarity networks were likewise constructed by using different approaches to calculate similarity scores between lncRNAs. Then, a multiplex and heterogeneous network was built by integrating the disease similarity networks and the lncRNA similarity networks with the known lncRNA–disease associations, and a random walk with restart was performed on this network to predict lncRNA–disease associations. Leave-one-out cross-validation (LOOCV) yielded an area under the curve (AUC) of 0.68736, an improvement over classical algorithms from recent years. Finally, we confirmed a few novel predicted lncRNAs associated with specific diseases, such as colon cancer, by literature mining. In summary, MHRWRLDA contributes to the prediction of lncRNA–disease associations.
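The core random walk with restart can be sketched on a tiny adjacency matrix; the four-node network below is hypothetical, and the real method operates on the full multiplex lncRNA-disease network:

```python
def random_walk_with_restart(adj, seed_idx, restart=0.5, iters=100, tol=1e-8):
    """Iterate p <- (1 - r) * W p + r * e on a column-normalized
    adjacency matrix until convergence; e is the restart vector."""
    n = len(adj)
    # Column-normalize so each node distributes its weight to its neighbours.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[(adj[i][j] / col_sums[j]) if col_sums[j] else 0.0
          for j in range(n)] for i in range(n)]
    e = [1.0 if i == seed_idx else 0.0 for i in range(n)]
    p = e[:]
    for _ in range(iters):
        q = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n))
             + restart * e[i] for i in range(n)]
        done = max(abs(a - b) for a, b in zip(q, p)) < tol
        p = q
        if done:
            break
    return p

# Toy heterogeneous network: nodes 0-1 are lncRNAs, 2-3 are diseases;
# edges mix similarity links and known lncRNA-disease associations.
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seed_idx=2)  # seeded on disease node 2
```

Nodes with high steady-state probability relative to the seeded disease are the candidate associations.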


2021
Author(s): Rahul Sharan Renu, Gregory Mocko

Abstract: Many manufacturing enterprises have large collections of solid models and text-based assembly processes to support assembly operations, often distributed across their extended enterprise. As these enterprises expand globally, product and process variability tends to increase, which can lead to challenges with training, quality control, and change management, among others. Thus, there is a desire to increase the consistency of assembly work instructions within and across assembly locations. The objective of this research is to retrieve existing 3D models of components and assemblies together with their associated assembly work instructions. This is accomplished using 3D solid model similarity and text mining of assembly work instructions. Initially, a design study was conducted in which participants authored assembly work instructions for several different solid model assemblies. Next, a geometric similarity algorithm was used to compute similarity scores between solid models, and latent semantic analysis was used to compute the similarity between text-based assembly work instructions. Finally, a correlation study was performed on the solid model-assembly instruction tuples. A moderately strong positive correlation was found between solid model similarity scores and their associated assembly instruction similarity scores. This indicates that designs with a similar shape have a similar assembly process and can thus serve as the basis for authoring new assembly processes. It also aids in resolving differences in existing processes by linking three-dimensional solid models with their associated assembly work instructions.
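The text-similarity side of the study can be sketched with a cosine similarity of term-frequency vectors, a lightweight stand-in for the latent semantic analysis actually used; the instruction strings are invented:

```python
import math
from collections import Counter

def text_similarity(doc_a, doc_b):
    """Cosine similarity of term-frequency vectors of two instruction texts."""
    ta, tb = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    vocab = set(ta) | set(tb)
    dot = sum(ta[w] * tb[w] for w in vocab)
    na = math.sqrt(sum(v * v for v in ta.values()))
    nb = math.sqrt(sum(v * v for v in tb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical assembly work instructions.
inst1 = "insert bolt into bracket and tighten with torque wrench"
inst2 = "insert bolt into flange and tighten with socket wrench"
inst3 = "apply adhesive to panel edge and clamp for ten minutes"
```

Instructions for similar operations score much higher than unrelated ones, which is the signal correlated against the solid-model similarity scores.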


Author(s): Katie A. Wilson, Burkely T. Gallo, Patrick Skinner, Adam Clark, Pamela Heinselman, ...

Abstract: Convection-allowing model ensemble guidance, such as that provided by the Warn-on-Forecast System (WoFS), is designed to provide predictions of individual thunderstorm hazards within the next 0–6 h. The WoFS web viewer provides a large suite of storm and environmental attribute products, but the applicability of these products to the National Weather Service forecast process has not been objectively documented. Therefore, this study describes an experimental forecasting task designed to investigate which WoFS products forecasters accessed and how they accessed them, for a total of 26 cases (13 weather events, each worked by two forecasters). Analysis of web access log data revealed that in all 26 cases, product accesses were dominated by the reflectivity, rotation, hail, and surface wind categories. However, the number of different product types viewed and the number of transitions between products varied from case to case. Therefore, the Levenshtein (edit distance) method was used to compute similarity scores across all 26 cases, which helped characterize relatively similar versus dissimilar navigation of WoFS products. Spearman's rank correlation analysis found that forecasters working the same weather event had higher similarity scores for events that produced more tornado reports and for events in which forecasters had higher performance scores. The findings from this study will inform subsequent efforts to further improve WoFS products and to develop an efficient and effective user interface for operational applications.
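Edit distance over product-navigation sequences can be sketched as follows; the category sequences are hypothetical stand-ins for the web access logs:

```python
def levenshtein(a, b):
    """Edit distance between two sequences, computed with a single rolling row."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            # Minimum of deletion, insertion, and (mis)match.
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1,
                        prev + (a[i - 1] != b[j - 1]))
            prev = cur
    return dp[n]

def navigation_similarity(a, b):
    """Normalize edit distance into a 0-1 similarity score."""
    return 1.0 - levenshtein(a, b) / max(len(a), len(b), 1)

# Hypothetical sequences of product-category views during three cases.
case_a = ["reflectivity", "rotation", "hail", "wind", "reflectivity"]
case_b = ["reflectivity", "rotation", "hail", "reflectivity"]
case_c = ["wind", "wind", "hail"]
```

Two forecasters who navigated products in nearly the same order get a high similarity score, matching the study's notion of similar navigation.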


2021 · Vol 7 (7) · pp. 116
Author(s): Pasquale Ferrara, Rudolf Haraksim, Laurent Beslay

Performance evaluation of source camera attribution methods typically stops at the analysis of hard-to-interpret similarity scores. Standard analytic tools include Detection Error Trade-off or Receiver Operating Characteristic curves, or other scalar performance metrics such as the Equal Error Rate or error rates at a specific decision threshold. However, the main drawback of similarity scores is their lack of probabilistic interpretation, and thereby their limited usability in forensic investigation when assisting the trier of fact to make sounder and more informed decisions. The main objective of this work is to demonstrate a transition from similarity scores to likelihood ratios in the scope of digital evidence evaluation; likelihood ratios not only have probabilistic meaning but can be immediately incorporated into forensic casework and combined with the rest of the case-related forensic evidence. Likelihood ratios are calculated from Photo Response Non-Uniformity (PRNU) source attribution similarity scores. The experiments conducted aim to compare different strategies applied to both digital images and videos, considering their respective peculiarities. The results are presented in a format compatible with the guideline for validation of forensic likelihood ratio methods.
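One simple way to turn similarity scores into likelihood ratios is to model the same-source and different-source score distributions (Gaussians here, for illustration) and take the ratio of their densities; the calibration scores below are invented, and real PRNU calibration uses more careful score modeling:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_gaussian(scores):
    """Sample mean and standard deviation of a list of scores."""
    mu = sum(scores) / len(scores)
    var = sum((s - mu) ** 2 for s in scores) / (len(scores) - 1)
    return mu, math.sqrt(var)

def likelihood_ratio(score, same_mu, same_sd, diff_mu, diff_sd):
    """LR = P(score | same camera) / P(score | different camera)."""
    return gaussian_pdf(score, same_mu, same_sd) / gaussian_pdf(score, diff_mu, diff_sd)

# Hypothetical calibration scores (e.g., PRNU correlation values).
same_source = [0.82, 0.75, 0.90, 0.78, 0.85]
diff_source = [0.05, -0.02, 0.10, 0.01, 0.06]
sm, ss = fit_gaussian(same_source)
dm, ds = fit_gaussian(diff_source)
lr_high = likelihood_ratio(0.80, sm, ss, dm, ds)  # score typical of same source
lr_low = likelihood_ratio(0.03, sm, ss, dm, ds)   # score typical of different source
```

An LR above 1 supports the same-camera hypothesis and below 1 supports the different-camera hypothesis, which is the probabilistic interpretation the paper argues for.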


Author(s): Ayad I. Abdulsada, Dhafer G. Honi, Salah Al-Darraji

Many organizations and individuals are attracted to outsourcing their data to remote cloud service providers. To ensure privacy, sensitive data should be encrypted before being hosted. However, encryption disables the direct application of essential data management operations such as searching and indexing. Searchable encryption is a cryptographic tool that gives users the ability to search encrypted data without decrypting it. However, existing schemes either support only single exact-keyword search, losing the ability to handle misspelled keywords, or support multi-keyword search but generate very long trapdoors. In this paper, we address the problem of designing a practical multi-keyword similarity search scheme that produces short trapdoors and returns the correct results ranked by their similarity scores. To do so, each document is translated into a compressed trapdoor. Trapdoors are generated using keyed hash functions to ensure their privacy, so only authorized users can issue valid trapdoors. The similarity score of two textual documents is evaluated by computing the Hamming distance between their corresponding trapdoors. A robust security definition is provided together with its proof. Our experimental results illustrate that the proposed scheme improves search efficiency compared to existing schemes and shows a high level of performance.
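The Hamming-distance comparison at the heart of the scheme can be sketched as follows; the 16-bit trapdoors are hypothetical, and the keyed hashing that derives them from documents is omitted:

```python
def hamming_distance(a, b):
    """Number of differing bit positions between two equal-length bit strings
    (here, the compressed trapdoors of two documents)."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def similarity_score(a, b):
    """Higher score means more similar documents (fewer differing bits)."""
    return 1.0 - hamming_distance(a, b) / len(a)

# Hypothetical trapdoors derived from keyed hashes of three documents.
doc1 = "1011001110001011"
doc2 = "1011001110001111"   # near-duplicate of doc1
doc3 = "0100110001110100"   # unrelated document
```

Because only compact trapdoors are compared, the server can rank results by similarity without ever seeing the plaintext documents.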

