input strings
Recently Published Documents


TOTAL DOCUMENTS: 34 (five years: 9)
H-INDEX: 7 (five years: 1)

2021, Vol 10 (4), pp. 2144-2151
Author(s): Rogel L. Quilala, Theda Flare G. Quilala

Abstract: Recently, a Modified SHA-1 (MSHA-1) was proposed and claimed to provide better security than SHA-1; however, the study showed that its hashing time was slower. In this research, an improved version of MSHA-1 was analyzed using the avalanche effect and hashing time as performance measures, applying a 160-bit output and a mixing method to improve the diffusion rate. The diffusion results showed that the improved MSHA-1 achieved an avalanche effect of 51.88%, above the 50% threshold generally considered secure, while MSHA-1 attained 50.53% and SHA-1 only 47.03%; the improved algorithm therefore offers better security, improving on the original SHA-1 by 9.00% and on MSHA-1 by 3.00%. The improvement was also tested using 500 random strings over ten trials. The improved MSHA-1 also has better hashing time performance, showing a 31.03% improvement. A hash test program was used to check the effectiveness of the algorithm by producing 1000 hashes from random input strings, which yielded zero (0) duplicate hashes.
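As a rough illustration of the avalanche-effect measure used above, the sketch below flips a single input bit and counts how many output bits change; it uses Python's standard hashlib SHA-1 as a stand-in, since MSHA-1 is not publicly packaged.

```python
# Minimal sketch: measuring the avalanche effect of a hash function.
# hashlib's SHA-1 stands in for MSHA-1, which is not publicly available.
import hashlib
import random

def avalanche_effect(data: bytes, hash_fn=lambda b: hashlib.sha1(b).digest()) -> float:
    """Flip one random input bit and return the % of output bits that change."""
    bit = random.randrange(len(data) * 8)
    flipped = bytearray(data)
    flipped[bit // 8] ^= 1 << (bit % 8)
    h1 = int.from_bytes(hash_fn(data), "big")
    h2 = int.from_bytes(hash_fn(bytes(flipped)), "big")
    diff_bits = bin(h1 ^ h2).count("1")
    return 100.0 * diff_bits / (len(hash_fn(data)) * 8)

# Average over many random 160-bit inputs.
samples = [avalanche_effect(random.randbytes(20)) for _ in range(1000)]
print(f"mean avalanche effect: {sum(samples) / len(samples):.2f}%")
```

A mean near 50% indicates good diffusion: flipping one input bit changes roughly half of the output bits on average.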


Mathematics, 2021, Vol 9 (13), pp. 1515
Author(s): Bojan Nikolic, Aleksandar Kartelj, Marko Djukanovic, Milana Grbic, Christian Blum, ...

The longest common subsequence (LCS) problem is a prominent NP-hard optimization problem where, given an arbitrary set of input strings, the aim is to find a longest subsequence common to all input strings. This problem has a variety of applications in bioinformatics, molecular biology and file plagiarism checking, among others. All previous approaches from the literature are dedicated to solving LCS instances sampled from uniform or near-to-uniform probability distributions of letters in the input strings. In this paper, we introduce an approach that is able to effectively deal with more general cases, where the occurrence of letters in the input strings follows a non-uniform distribution such as a multinomial distribution. The proposed approach makes use of a time-restricted beam search, guided by a novel heuristic named Gmpsum. This heuristic combines two complementary scoring functions in the form of a convex combination. Furthermore, apart from the close-to-uniform benchmark sets from the related literature, we introduce three new benchmark sets that differ in terms of their statistical properties. One of these sets concerns a case study in the context of text analysis. We provide a comprehensive empirical evaluation in two distinct settings: (1) short-time execution with a fixed beam size, in order to evaluate the guidance abilities of the compared search heuristics; and (2) long-time executions with fixed target duration times, in order to obtain high-quality solutions. In both settings, the newly proposed approach performs comparably to state-of-the-art techniques on close-to-uniform instances and outperforms state-of-the-art approaches on non-uniform instances.
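The following minimal sketch illustrates the general shape of a beam search for the multi-string LCS problem: states are index vectors, one position per input string, extended one letter at a time. The scoring heuristic here is simply the shortest remaining suffix, not the Gmpsum heuristic proposed in the paper.

```python
# Minimal beam-search sketch for the LCS of several input strings.
# Heuristic and beam width are placeholders, not the paper's method.
def lcs_beam_search(strings, beam_width=50):
    alphabet = set("".join(strings))
    beam = [(tuple(0 for _ in strings), "")]          # (positions, partial LCS)
    best = ""
    while beam:
        children = []
        for pos, seq in beam:
            for ch in alphabet:
                # advance every string past its next occurrence of ch
                nxt = []
                for s, p in zip(strings, pos):
                    i = s.find(ch, p)
                    if i < 0:
                        break
                    nxt.append(i + 1)
                else:
                    children.append((tuple(nxt), seq + ch))
        if not children:
            break
        for pos, seq in children:                      # track longest so far
            if len(seq) > len(best):
                best = seq
        # score: current length + shortest remaining suffix (an upper bound)
        children.sort(key=lambda c: len(c[1]) + min(len(s) - p for s, p in zip(strings, c[0])),
                      reverse=True)
        beam = children[:beam_width]
    return best

print(lcs_beam_search(["abcbdab", "bdcaba", "badacb"]))
```

Because the beam is pruned, the result is a heuristic solution; the quality depends on the beam width and on how well the scoring function ranks partial solutions.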


Author(s): Sarah J. Berkemer, Christian Höner zu Siederdissen, Peter F. Stadler

Abstract: Alignments, i.e., position-wise comparisons of two or more strings or ordered lists, are of utmost practical importance in computational biology and a host of other fields, including historical linguistics and emerging areas of research in the Digital Humanities. The problem is well known to be computationally hard as soon as the number of input strings is not bounded. Due to its practical importance, a huge number of heuristics have been devised, which have proved very successful in a wide range of applications. Alignments nevertheless have received hardly any attention as formal, mathematical structures. Here, we focus on the compositional aspects of alignments, which underlie most algorithmic approaches to computing them. We also show that the concepts naturally generalize to finite partially ordered sets and partial maps between them that in some sense preserve the partial orders. As a consequence of this discussion we observe that alignments of even more general structures, in particular graphs, are essentially characterized by the fact that the restriction of an alignment to a row must coincide with the corresponding input graph. Pairwise alignments of graphs are therefore determined completely by common induced subgraphs. In this setting alignments of alignments are well defined, and alignments can be decomposed recursively into subalignments. This provides a general framework within which different classes of alignment algorithms can be explored for objects very different from sequences and other totally ordered data structures.
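To make the row-restriction property concrete for ordinary strings, here is a small sketch (with an invented three-row example) that represents an alignment as a list of columns and checks that projecting onto each row and dropping gaps recovers the corresponding input string.

```python
# Minimal sketch: an alignment as a column-wise structure whose restriction
# to each row recovers the corresponding input string. Example data is invented.
GAP = "-"

alignment = [            # one tuple per alignment column, one entry per row
    ("A", "A", "A"),
    ("C", GAP, "C"),
    ("G", "G", GAP),
    ("T", "T", "T"),
]

def restrict_to_row(columns, row):
    """Project the alignment onto a single row, dropping gap characters."""
    return "".join(col[row] for col in columns if col[row] != GAP)

inputs = ["ACGT", "AGT", "ACT"]
assert all(restrict_to_row(alignment, r) == s for r, s in enumerate(inputs))
print("each row of the alignment restricts to its input string")
```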


Author(s): B. Padmini Devi, S. K. Aruna, K. Sindhanaiselvan

In real life, most engineering fields require fast data access and ample data storage. Real-world problems can be studied effectively by combining scientific computational techniques with mathematical models, and automata theory is a popular such model. For many software- and hardware-related applications, computational methods are analyzed and designed using central automata-theoretic concepts, namely pushdown automata (PDA), Turing machines (TMs) and finite automata (FA). Because these concepts are abstract in nature, the conventional lecture-driven style appeals mainly to reflective learners and does little to motivate computer engineering learners. To support learning automata theory and computational models, we introduce the PDA and TM on a virtual platform. This work also motivates longitudinal experimental validation of learning with modern technology. Our simulators are written in Java using the Java Formal Languages and Automata Package (JFLAP) tool, and results are obtained from each machine by simulating it on input strings.
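As a minimal example of the kind of machine simulation that tools such as JFLAP provide visually, the sketch below runs an invented deterministic finite automaton (accepting strings with an even number of 'a's) on a few input strings; it is written in Python for brevity rather than in JFLAP's Java.

```python
# Minimal sketch: simulating a DFA on input strings.
# The example DFA (even number of 'a's over {a, b}) is invented for illustration.
dfa = {
    "start": "q0",
    "accept": {"q0"},
    "delta": {("q0", "a"): "q1", ("q1", "a"): "q0",
              ("q0", "b"): "q0", ("q1", "b"): "q1"},
}

def accepts(machine, word: str) -> bool:
    state = machine["start"]
    for symbol in word:
        key = (state, symbol)
        if key not in machine["delta"]:
            return False            # undefined transition: reject
        state = machine["delta"][key]
    return state in machine["accept"]

for w in ["", "ab", "aab", "baba"]:
    print(f"{w!r}: {'accept' if accepts(dfa, w) else 'reject'}")
```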


Author(s): Maren Brand, Nguyen Khoa Tran, Philipp Spohr, Sven Schrinner, Gunnar W. Klau

Abstract: We consider the homo-edit distance problem, which asks for the minimum number of homo-deletions or homo-insertions needed to convert one string into another. A homo-insertion is the insertion of a string of equal characters into another string, while a homo-deletion is the inverse operation. We show how to compute the homo-edit distance of two strings in polynomial time: we first demonstrate that the problem is equivalent to computing a common subsequence of the two input strings with a minimum number of homo-deletions, and then present a dynamic programming solution for the reformulated problem.

2012 ACM Subject Classification: Applied computing → Bioinformatics; Applied computing → Molecular sequence analysis; Theory of computation → Dynamic programming
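Following the reformulation stated in the abstract (a common subsequence of both inputs reachable with the fewest homo-deletions in total), here is a brute-force sketch that is only practical for very short strings; the polynomial dynamic program from the paper is not reproduced here.

```python
# Brute-force sketch of the homo-edit distance via the abstract's reformulation:
# enumerate everything reachable from each string by homo-deletions and take the
# cheapest common result. Exponential; for illustration on tiny strings only.
from collections import deque

def homo_deletion_distances(s: str) -> dict:
    """BFS over homo-deletions: map each reachable string to the minimum
    number of homo-deletions needed to reach it from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        cur = queue.popleft()
        for i in range(len(cur)):
            for j in range(i + 1, len(cur) + 1):
                block = cur[i:j]
                if block == block[0] * len(block):     # contiguous run of equal characters
                    nxt = cur[:i] + cur[j:]
                    if nxt not in dist:
                        dist[nxt] = dist[cur] + 1
                        queue.append(nxt)
    return dist

def homo_edit_distance(a: str, b: str) -> int:
    da, db = homo_deletion_distances(a), homo_deletion_distances(b)
    return min(da[t] + db[t] for t in da.keys() & db.keys())

# e.g. "aab" -> delete "aa" -> "b", then homo-insert "a" -> "ba": distance 2
print(homo_edit_distance("aab", "ba"))
```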


2019, Vol 46 (6), pp. 1169-1201
Author(s): Andrew CAINES, Emma ALTMANN-RICHER, Paula BUTTERY

Abstract: We select three word segmentation models with psycholinguistic foundations – transitional probabilities, the diphone-based segmenter, and PUDDLE – which track phoneme co-occurrence and positional frequencies in input strings and, in the case of PUDDLE, build lexical and diphone inventories. The models are evaluated on caregiver utterances in 132 CHILDES corpora representing 28 languages and 11.9m words. PUDDLE shows the best performance overall, albeit with wide cross-linguistic variation. We explore the reasons for this variation, fitting regression models to performance scores with linguistic properties which capture lexico-phonological characteristics of the input: word length, utterance length, diversity in the lexicon, the frequency of one-word utterances, the regularity of phoneme patterns at word boundaries, and the distribution of diphones in each language. These properties together explain four-tenths of the observed variation in segmentation performance, a strong outcome and a solid foundation for studying further variables which make the segmentation task difficult.
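As an illustration of the first of the three models, the sketch below trains transitional probabilities on a tiny invented corpus and posits a word boundary wherever the probability between adjacent symbols dips below the utterance average; characters stand in for phonemes, and the thresholding rule is a simplification rather than the exact evaluation setup of the paper.

```python
# Minimal sketch of segmentation by transitional probabilities (TP):
# estimate P(next symbol | current symbol) and insert a boundary where TP is low.
from collections import Counter

def train_tp(utterances):
    bigrams, unigrams = Counter(), Counter()
    for utt in utterances:
        chars = utt.replace(" ", "")          # training input is unsegmented
        unigrams.update(chars[:-1])
        bigrams.update(zip(chars, chars[1:]))
    return lambda a, b: bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0

def segment(utterance, tp):
    chars = utterance.replace(" ", "")
    scores = [tp(a, b) for a, b in zip(chars, chars[1:])]
    if not scores:
        return chars
    threshold = sum(scores) / len(scores)
    out = [chars[0]]
    for ch, score in zip(chars[1:], scores):
        if score < threshold:                 # a dip in TP suggests a word boundary
            out.append(" ")
        out.append(ch)
    return "".join(out)

corpus = ["the doggy", "the kitty", "look at the doggy"]
tp = train_tp(corpus)
print(segment("thedoggy", tp))
```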


Web applications hold information such as usernames, passwords, and other personally identifiable information; they act as platforms for knowledge and resource sharing, digital transactions, digital ledgers, and more, and they have long been a target for attackers. Recent reports describe a spike in attacks on web applications; in particular, SQL injection and cross-site scripting attacks have grown drastically as new vulnerabilities are discovered. These attacks persist because of the nature of the attack payloads: the payload strings are highly heterogeneous and often look very similar to regular text, so even web applications with many security features in place may fail to detect them. One way to address this problem is to use machine learning models that classify the input strings given to a web application as malicious or benign. This paper presents a study of six binary classification methods: logistic regression, naïve Bayes, SGD, AdaBoost, random forest, and decision trees, using our own dataset and feature set.
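A minimal sketch of this idea, using one of the six models studied (logistic regression) over character n-gram features via scikit-learn; the tiny inline dataset is invented and is not the dataset or feature set used in the paper.

```python
# Minimal sketch: classifying web-application input strings as malicious/benign.
# Toy data only; real systems need large labelled corpora and careful evaluation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

payloads = ["' OR 1=1 --", "<script>alert(1)</script>", "admin'--",
            "alice", "search=shoes", "page=2&sort=price"]
labels = [1, 1, 1, 0, 0, 0]   # 1 = malicious, 0 = benign

model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),  # character n-grams
    LogisticRegression(max_iter=1000),
)
model.fit(payloads, labels)

for s in ["' OR 'a'='a", "bob"]:
    print(s, "->", "malicious" if model.predict([s])[0] else "benign")
```

Character n-grams are a common choice here because payload strings rarely tokenize cleanly into words.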


2019, Vol 35 (1), pp. 21-37
Author(s): Trường Huy Nguyễn

In this paper, we introduce two algorithms that are efficient in practice for computing the length of a longest common subsequence of two strings using an automata technique, in sequential and parallel versions. For two input strings of lengths m and n with m ≤ n, the parallel algorithm uses k processors (k ≤ m) and runs in O(n) time in the worst case, where k is an upper estimate of the length of a longest common subsequence of the two strings. These results are based on the Knapsack Shaking approach proposed by P. T. Huy et al. in 2002. Experimental results show that for an alphabet of size 256, our sequential and parallel algorithms are about 65.85 and 3.41m times faster, respectively, than the standard dynamic programming algorithm proposed by Wagner and Fischer in 1974.
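For reference, the standard dynamic-programming baseline mentioned above can be sketched in a few lines; this version computes only the LCS length, in O(mn) time with a single rolling row, and is not the automata-based method of the paper.

```python
# Classic dynamic-programming baseline: length of the LCS of two strings.
def lcs_length(a: str, b: str) -> int:
    if len(a) < len(b):
        a, b = b, a                      # keep the DP row as short as possible
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

print(lcs_length("ABCBDAB", "BDCABA"))   # classic textbook example; prints 4
```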


Author(s): Javier Segovia-Aguas, Sergio Jiménez, Anders Jonsson

This paper presents a novel approach for generating Context-Free Grammars (CFGs) from small sets of input strings (a single input string in some cases). Our approach is to compile this task into a classical planning problem whose solutions are sequences of actions that build and validate a CFG compliant with the input strings. In addition, we show that our compilation is suitable for implementing the two canonical tasks for CFGs, string production and string recognition.
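Of the two canonical CFG tasks mentioned, string recognition is the easier to illustrate. Below is a small CYK recognizer over an invented grammar in Chomsky normal form; this is standard machinery, not the planning-based compilation proposed in the paper.

```python
# Minimal sketch: CFG string recognition with the CYK algorithm.
# Toy grammar in Chomsky normal form: S -> a | a S b (via helper nonterminals).
grammar = {
    "S": [("A", "X"), ("a",)],       # S -> A X | a
    "X": [("S", "B")],               # X -> S B
    "A": [("a",)],                   # A -> a
    "B": [("b",)],                   # B -> b
}

def cyk_recognize(word: str, start: str = "S") -> bool:
    n = len(word)
    if n == 0:
        return False
    # table[i][l] = set of nonterminals deriving word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        for head, bodies in grammar.items():
            if (ch,) in bodies:
                table[i][1].add(head)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for head, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in table[i][split] \
                                and body[1] in table[i + split][length - split]:
                            table[i][length].add(head)
    return start in table[0][n]

for w in ["a", "ab", "aab", "abb"]:
    print(w, cyk_recognize(w))
```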

