string distance Latest Research Papers

Prediction of In-Hospital Mortality from Administrative Data: A Sequential Pattern Mining Approach

The study of trajectories of care is attractive for predicting medical outcomes. Models based on machine learning (ML) techniques have proven efficient for sequence prediction compared to other models, and introducing pattern mining techniques has helped reduce model complexity. In this respect, we explored methods for predicting medical events based on the extraction of sets of relevant event sequences from a national hospital discharge database. The approach is illustrated by predicting the risk of in-hospital mortality in acute coronary syndrome (ACS). We mined sequential patterns from the French Hospital Discharge Database. We compared several predictive models using a text string distance to measure the similarity between patients’ patterns of care. We computed combinations of commonly used similarity measurements and ML models. A Support Vector Machine model coupled with an edit-based distance appeared to be the most effective model: discrimination ranged from 0.71 to 0.99, together with good overall accuracy. Thus, sequential pattern mining appears promising for event prediction in medical settings, as described here for ACS.
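To make the idea of an edit-based distance between patterns of care concrete, here is a minimal sketch. It assumes each trajectory is encoded as a list of event codes; the codes and the normalisation into a similarity score are illustrative assumptions, not details taken from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Map the distance to [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical event codes, for illustration only.
patient_a = ["ER", "CARDIO", "ICU", "CARDIO"]
patient_b = ["ER", "CARDIO", "CARDIO"]
print(edit_distance(patient_a, patient_b))  # one deleted event
print(similarity(patient_a, patient_b))
```

A pairwise similarity matrix built this way could then feed a kernel-based classifier; how the authors actually coupled the distance with their Support Vector Machine is not stated in the abstract.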

Comparison of Apache SOLR Search Spellcheck String Distance Measure – Levenshtein, Jaro Winkler, and N-Gram

International Journal of Computer Trends and Technology ◽

10.14445/22312803/ijctt-v69i3p101 ◽

2021 ◽

Vol 69 (3) ◽

pp. 1-4

Author(s):

Parameswara Rao Kandregula

Keyword(s):

Distance Measure ◽

String Distance ◽

N Gram

Repacked android application detection using image similarity

Nexo Revista Científica ◽

10.5377/nexo.v33i01.10058 ◽

2020 ◽

Vol 33 (01) ◽

pp. 190-199

Author(s):

M.A. Rahim Khan ◽

R.C. Tripathi ◽

Ajit Kumar

Keyword(s):

Hamming Distance ◽

Main Idea ◽

Scanning Speed ◽

Image Similarity ◽

Detection Accuracy ◽

Android Application ◽

Distance Calculation ◽

String Distance ◽

Perceptual Hashing ◽

Binary Features

The popularity of Android brings many functionalities to its users, but it also brings many threats. Repacked Android applications are one such threat and are the root of many others, such as malware, phishing, adware, and economic loss. Many techniques have been proposed for detecting repacked applications, but they have limitations and bottlenecks. In this work, we propose an image-similarity-based technique for detecting repacked applications. It builds on the main idea behind repacking: "the attacker wants to create a fake application that looks visually similar to the original". We convert each APK file into a grayscale image and then use perceptual hashing to create a hash of each image. String distance algorithms such as Hamming distance are used to calculate the distance between hashes and to search for repacked applications. The proposed work also applies distance calculation to binary features extracted from the app. The approach performs well in terms of detection accuracy and scanning speed, achieving 96% accuracy.
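The hash comparison step can be sketched as follows. The sketch assumes the perceptual hashes are available as equal-length hexadecimal strings; the hash values and the threshold of 10 bits are hypothetical, since the abstract does not state which hashing scheme or cut-off was used.

```python
def hamming_distance_hex(hash_a: str, hash_b: str) -> int:
    """Number of differing bits between two equal-length hex hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_probable_repack(hash_a: str, hash_b: str, threshold: int = 10) -> bool:
    """Flag two apps as visually similar when few bits differ.

    The threshold is an assumed value, not one reported in the paper.
    """
    return hamming_distance_hex(hash_a, hash_b) <= threshold


# Hypothetical 64-bit perceptual hashes of two APK images.
original = "ffd8a1c3e0b47f00"
candidate = "ffd8a1c3e0b47f03"
print(hamming_distance_hex(original, candidate))
print(is_probable_repack(original, candidate))
```

Because the comparison is a single XOR and a bit count, it stays cheap even when one hash is checked against a large collection, which fits the scanning-speed claim in the abstract.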

Entanglement dynamics for static two-level atoms in cosmic string spacetime

The European Physical Journal C ◽

10.1140/epjc/s10052-020-7663-x ◽

2020 ◽

Vol 80 (2) ◽

Author(s):

Pingyang He ◽

Hongwei Yu ◽

Jiawei Hu

Keyword(s):

Cosmic String ◽

Minkowski Spacetime ◽

Scalar Fields ◽

Entangled State ◽

Entanglement Dynamics ◽

Entanglement Generation ◽

Maximally Entangled State ◽

Interatomic Separation ◽

Antisymmetric State ◽

String Distance

Abstract We study the entanglement dynamics of two static atoms coupled with a bath of fluctuating scalar fields in vacuum in the cosmic string spacetime. Three different alignments of atoms, i.e. parallel, vertical, and symmetric alignments with respect to the cosmic string are considered. We focus on how entanglement degradation and generation are influenced by the cosmic string, and find that they are crucially dependent on the atom-string distance r, the interatomic separation L, and the parameter ν that characterizes the nontrivial topology of the cosmic string. For two atoms initially in a maximally entangled state, the destroyed entanglement can be revived when the atoms are aligned vertically to the string, which cannot happen in the Minkowski spacetime. When the symmetrically aligned two-atom system is initially in the antisymmetric state, the lifetime of entanglement can be significantly enhanced as ν increases. For two atoms which are initially in the excited state, when the interatomic separation is large compared to the transition wavelength, entanglement generation cannot happen in the Minkowski spacetime, while it can be achieved in the cosmic string spacetime when the position of the two atoms is appropriate with respect to the cosmic string and ν is large enough.

Combining string and phonetic similarity matching to identify misspelt names of drugs in medical records written in Portuguese

Journal of Biomedical Semantics ◽

10.1186/s13326-019-0216-2 ◽

2019 ◽

Vol 10 (S1) ◽

Cited By ~ 2

Author(s):

Hegler Tissot ◽

Richard Dobson

Keyword(s):

Medical Records ◽

Similarity Search ◽

Hybrid Approach ◽

Free Text ◽

Distance Metrics ◽

Exact Match ◽

Text Data ◽

String Similarity ◽

Phonetic Similarity ◽

String Distance

Abstract Background: There is an increasing amount of unstructured medical data that can be analysed for different purposes. However, information extraction from free text data may be particularly inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text, coupled with a supporting dictionary. However, they are not rich enough to encode both typing and phonetic misspellings. Results: Experimental results showed that a joint string and language-dependent phonetic similarity is more accurate than traditional string distance metrics when identifying misspelt names of drugs in a set of medical records written in Portuguese. Conclusion: We present a hybrid approach to efficiently perform similarity matching that overcomes the loss of information inherent in using either exact match search or string-based similarity search methods.
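The general shape of a joint string and phonetic score can be sketched as below. Everything specific here is an assumption made for illustration: the substitution table is a deliberately tiny toy, not the language-dependent phonetic encoding from the paper, and the equal weighting and the use of Python's `difflib.SequenceMatcher` as the string similarity are stand-ins.

```python
from difflib import SequenceMatcher

# Hypothetical toy rules; a real Portuguese phonetic encoding is far richer.
TOY_PHONETIC_RULES = [("ph", "f"), ("y", "i"), ("ss", "s"), ("ç", "s"), ("z", "s")]


def toy_phonetic_key(word: str) -> str:
    """Collapse a few spellings that sound alike into one form."""
    key = word.lower()
    for old, new in TOY_PHONETIC_RULES:
        key = key.replace(old, new)
    return key


def string_similarity(a: str, b: str) -> float:
    """Plain character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hybrid_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted mix of spelling similarity and phonetic-key similarity."""
    phonetic = SequenceMatcher(None, toy_phonetic_key(a), toy_phonetic_key(b)).ratio()
    return weight * string_similarity(a, b) + (1 - weight) * phonetic


def best_match(word: str, dictionary: list[str]) -> str:
    """Return the dictionary entry with the highest hybrid score."""
    return max(dictionary, key=lambda entry: hybrid_similarity(word, entry))


# Illustrative dictionary of drug names and one misspelt query.
drugs = ["dipirona", "paracetamol", "amoxicilina"]
print(best_match("dypirona", drugs))
```

The point of the mix is that a misspelling which sounds right but looks different is pulled closer to its target than a purely character-based score would place it.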

Teknik Budidaya Rumput Laut (Kappaphycus alvarezii) dengan Metode Rakit Apung di Desa Tanjung, Kecamatan Saronggi, Kabupaten Sumenep, Jawa Timur [Technique Culture of Sea Weed (Kappaphycus alvarezii) with Flouting Raft Method in Tanjung Village, Saronggi Sub District, Sumenep Regency, East Java]

Jurnal Ilmiah Perikanan dan Kelautan ◽

10.20473/jipk.v3i1.11619 ◽

2019 ◽

Vol 3 (1) ◽

pp. 21

Author(s):

Annur Ahadi, Abdullah

Keyword(s):

Kappaphycus Alvarezii ◽

White Spot ◽

White Spot Disease ◽

Spot Disease ◽

String Distance ◽

Great Culture

Abstract Seaweed is a major culture commodity with high economic value. The purpose of this Field Job Practice was to gain knowledge and skill in seaweed culture and to identify its constraints and development opportunities. The Field Job Practice was carried out in Tanjung Village from July 31st to September 10th, 2006, using a descriptive method. The culture technique uses a floating raft measuring 7 x 10 m, with a string distance of 14 cm and a 30-day culture period. The constraints are pest attacks by Baronang (Siganus sp.), ice-ice, white spot disease, and weather fluctuation. This culture has development potential because of its fast payback, about 2,176 periods.

SIMILARITY DISTANCE MEASURE AND PRIORITIZATION ALGORITHM FOR TEST CASE PRIORITIZATION IN SOFTWARE PRODUCT LINE TESTING

Journal of Information and Communication Technology ◽

10.32890/jict2019.18.1.8281 ◽

2018 ◽

Author(s):

Shahliza Abd Halim ◽

Dayang Norhayati Abang Jawawi ◽

Muhammad Sahak

Keyword(s):

Fault Detection ◽

Distance Measure ◽

Software Product Line ◽

Product Line ◽

Test Case ◽

Test Cases ◽

Test Case Prioritization ◽

Software Product ◽

String Distance ◽

Similarity Distance

To create products for a specific market segment, a Software Product Line (SPL) is implemented to meet customers' specific needs by managing a set of common features and exploiting the variabilities between products. Testing product by product is not feasible in SPL because of the combinatorial explosion in the number of products; thus, Test Case Prioritization (TCP) is needed to select a few test cases that could reveal a high number of faults. Among the most promising TCP techniques is similarity-based TCP, which consists of a similarity distance measure and a prioritization algorithm. The goal of this paper is to propose an enhanced string distance and prioritization algorithm that reorders test cases to achieve a higher rate of fault detection. A comparative study of different string distance measures and prioritization algorithms was conducted to select the best techniques for similarity-based test case prioritization. The identified enhancements were implemented in both techniques to better suit the prioritization of SPL test cases. An experiment was conducted to evaluate the effectiveness of the enhancements for the combination of both techniques. The results show that the combination achieved the highest average fault detection rate, attained the fastest execution time for the highest number of test cases, and accomplished a 41.25% average rate of fault detection. The results indicate that the combination of both techniques improves SPL testing effectiveness compared to other existing techniques.
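The two ingredients of similarity-based TCP, a distance measure and a prioritization algorithm, can be sketched together. This is a generic baseline, not the enhanced technique proposed in the paper: it assumes test cases are represented as strings, uses plain Levenshtein distance, and orders them with a greedy farthest-first rule so that dissimilar test cases run early.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def prioritize(test_cases: list[str]) -> list[str]:
    """Greedy farthest-first ordering.

    Starts from the first test case, then repeatedly picks the remaining
    test case whose distance to its nearest already-selected test case
    is largest.
    """
    if not test_cases:
        return []
    remaining = list(test_cases)
    ordered = [remaining.pop(0)]
    while remaining:
        best = max(remaining,
                   key=lambda t: min(edit_distance(t, s) for s in ordered))
        remaining.remove(best)
        ordered.append(best)
    return ordered


# Hypothetical test cases encoded as feature strings.
suite = ["aaaa", "aaab", "zzzz"]
print(prioritize(suite))
```

Swapping in a different string distance only requires replacing `edit_distance`, which is how a comparison between several distance measures could be set up.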

Particle Swarm Optimization for Test Case Prioritization Using String Distance

Advanced Science Letters ◽

10.1166/asl.2018.12918 ◽

2018 ◽

Vol 24 (10) ◽

pp. 7221-7226

Author(s):

Muhammad Khatibsyarbini ◽

Mohd Adham Isa ◽

Dayang Norhayati Abang Jawawi

Keyword(s):

Particle Swarm Optimization ◽

Particle Swarm ◽

Test Case ◽

Test Case Prioritization ◽

Swarm Optimization ◽

String Distance

Hidden Structures: Clustering, String Distance, Text Vectors and Topic Modeling

Text Mining in Practice with R ◽

10.1002/9781119282105.ch5 ◽

2017 ◽

pp. 129-179

Keyword(s):

Topic Modeling ◽

String Distance

string distance
Recently Published Documents

How Compression and Approximation Affect Efficiency in String Distance Measures

Prediction of In-Hospital Mortality from Administrative Data: A Sequential Pattern Mining Approach

Comparison of Apache SOLR Search Spellcheck String Distance Measure – Levenshtein, Jaro Winkler, and N-Gram

Repacked android application detection using image similarity

Entanglement dynamics for static two-level atoms in cosmic string spacetime

Combining string and phonetic similarity matching to identify misspelt names of drugs in medical records written in Portuguese

Teknik Budidaya Rumput Laut (Kappaphycus alvarezii) dengan Metode Rakit Apung di Desa Tanjung, Kecamatan Saronggi, Kabupaten Sumenep, Jawa Timur [Technique Culture of Sea Weed (Kappaphycus alvarezii) with Flouting Raft Method in Tanjung Village, Saronggi Sub District, Sumenep Regency, East Java]

SIMILARITY DISTANCE MEASURE AND PRIORITIZATION ALGORITHM FOR TEST CASE PRIORITIZATION IN SOFTWARE PRODUCT LINE TESTING

Particle Swarm Optimization for Test Case Prioritization Using String Distance

Hidden Structures: Clustering, String Distance, Text Vectors and Topic Modeling
