String Algorithms

2019 ◽  
pp. 493-517
2021 ◽  
Vol 54 (1) ◽  
pp. 1-22
Author(s):  
Rayan Chikhi ◽  
Jan Holub ◽  
Paul Medvedev

The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While these approaches are rather diverse, storing and querying a k-mer set has emerged as a shared underlying component. A set of k-mers has unique features and applications that, over the past 10 years, have resulted in many specialized approaches for its representation. In this survey, we give a unified presentation and comparison of the data structures that have been proposed to store and query a k-mer set. We hope this survey will serve as a resource for researchers in the field as well as make the area more accessible to researchers outside the field.
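As a point of reference for the abstract above, the simplest possible k-mer set is a plain hash set of substrings. The sketch below is only that baseline, not one of the specialized data structures the survey compares; the names `kmers`, `build_kmer_set`, and the sample reads are hypothetical and chosen for illustration.

```python
def kmers(seq, k):
    """Yield every length-k substring (k-mer) of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_set(sequences, k):
    """Collect the distinct k-mers of all sequences into a Python set."""
    kmer_set = set()
    for seq in sequences:
        kmer_set.update(kmers(seq, k))
    return kmer_set


# Hypothetical toy reads; real datasets contain millions of reads.
reads = ["ACGTAC", "GTACGG"]
index = build_kmer_set(reads, 3)

# Membership query: is a given k-mer present in the dataset?
has_cgg = "CGG" in index
has_aaa = "AAA" in index
```

A hash set answers membership queries in expected constant time but stores every k-mer explicitly, which is the kind of memory cost that motivates more compact representations.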


2020 ◽  
Vol 13 (1) ◽  
pp. 50-56
Author(s):  
Zekâi Şen

Background: There are different methodologies for DNA comparison based on two string algorithms, which depend on crisp logical principles and leave no room for verbal (linguistic) uncertainty. These procedures are applied successfully in DNA bioinformatics research, even when probabilistic random variability components based on probability distribution functions of various types are taken into consideration. Objective: The main purpose of this paper is first to review briefly the available DNA string matching methodologies that are based on crisp logic and then to suggest a new method based on fuzzy logic rules and their application. Methods: There are different methodologies for DNA comparison based on two string algorithms, which depend on crisp logical principles and leave no room for verbal (linguistic) uncertainty. These procedures are applied successfully in DNA bioinformatics research, even when probabilistic random variability components based on probability distribution functions of various types are taken into consideration. Results: Fuzzy number representation of each gene implies some sort of uncertainty or unhealthiness in some or all of the genes. Better identification can be achieved on the basis of fuzzy numbers with different membership degrees, which imply the unhealthiness or healthiness of the genes and their collective behavior. Conclusion: After the fuzzy number representation of the text string is developed and coupled with a crisp pattern string, their relationships are searched at different shift operations, and hence the possibility of defaulters is identified in the text string with a certain degree of membership.
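The abstract above does not spell out its matching rule, so the following is only one possible reading, stated under explicit assumptions: each text position holds membership degrees in [0, 1] per nucleotide, the pattern is an ordinary (crisp) string, and the degree of a match at a given shift is the minimum membership over the aligned positions (the standard fuzzy AND). The function name `match_degrees` and the sample data are hypothetical and are not taken from the paper.

```python
def match_degrees(fuzzy_text, pattern):
    """Return the match degree of a crisp pattern at every shift.

    fuzzy_text: list of dicts mapping a nucleotide to its membership degree.
    pattern: non-empty crisp string.
    Assumption: the degree at a shift is the minimum membership of the
    pattern's symbols at the aligned text positions (0.0 if a symbol is absent).
    """
    m = len(pattern)
    degrees = []
    for shift in range(len(fuzzy_text) - m + 1):
        degree = min(
            fuzzy_text[shift + j].get(pattern[j], 0.0) for j in range(m)
        )
        degrees.append(degree)
    return degrees


# Hypothetical fuzzy text of length 4; position 1 is mostly C, partly G.
fuzzy_text = [
    {"A": 1.0},
    {"C": 0.7, "G": 0.3},
    {"G": 1.0},
    {"A": 0.4, "T": 0.6},
]
degrees = match_degrees(fuzzy_text, "CG")
```

With a crisp text (every membership 0 or 1) this reduces to ordinary exact matching, where a degree of 1.0 marks an occurrence.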


2021 ◽  
Author(s):  
Pedro Mirabal ◽  
Ignacio Lincolao-Venegas ◽  
Mario Castillo-Sanhueza ◽  
Jose Abreu

Algorithms ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 224
Author(s):  
Paniz Abedin ◽  
M. Oğuzhan Külekci ◽  
Shama V. Thankachan

The shortest unique substring (SUS) problem is an active line of research in the field of string algorithms and has several applications in bioinformatics and information retrieval. The initial version of the problem was proposed by Pei et al. [ICDE’13]. Over the years, many variants and extensions have been pursued, which include positional-SUS, interval-SUS, approximate-SUS, palindromic-SUS, range-SUS, etc. In this article, we highlight some of the key results and summarize the recent developments in this area.
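To make the basic problem concrete, here is a brute-force sketch of the positional form: for a position p, find a shortest substring that covers p and occurs exactly once in the text. This assumes that reading of the problem, uses 0-based positions, and is far slower than the algorithms the article surveys; the function names are hypothetical.

```python
def count_occurrences(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    return sum(
        1 for i in range(len(text) - len(sub) + 1) if text.startswith(sub, i)
    )


def shortest_unique_substring(text, p):
    """Return (start, substring) for a shortest unique substring covering p.

    Brute force: try lengths in increasing order and, for each length,
    every start i with i <= p <= i + length - 1. Ties go to the leftmost start.
    Assumes 0 <= p < len(text).
    """
    n = len(text)
    for length in range(1, n + 1):
        first = max(0, p - length + 1)
        last = min(p, n - length)
        for i in range(first, last + 1):
            candidate = text[i:i + length]
            if count_occurrences(text, candidate) == 1:
                return i, candidate
    return None  # Unreachable for valid p: the whole text occurs exactly once.


text = "abcab"
sus_at_0 = shortest_unique_substring(text, 0)
sus_at_2 = shortest_unique_substring(text, 2)
sus_at_4 = shortest_unique_substring(text, 4)
```

In this example "a", "b", and "ab" all repeat, so positions 0 and 4 need a length-3 substring, while position 2 is covered by the single unique character "c".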

