string comparison
Recently Published Documents


Total documents: 56 (five years: 18)
H-index: 7 (five years: 1)

2021 ◽  
Vol 11 (5) ◽  
pp. 7605-7609
Author(s):  
Waseemullah ◽  
M. F. Hyder ◽  
M. A. Siddiqui ◽  
M. Mukarram

Automatic TV ad detection is a challenging computer vision task, and manual detection is tedious; automating it saves time and human effort. This paper proposes a method for automatically detecting repeated video segments, since ads generally recur frequently in TV transmissions. First, the user selects the advertisements to be detected and the video in which to detect them. Second, the videos are converted into text files using Base64 encoding. Third, the advertisements are detected using string comparison methods. Finally, a report lists each advertisement by name, together with the total time and the number of times it appeared in the stream. The implementation was carried out in Python.
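The pipeline described above (Base64 text, then string comparison, then a report) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the video is assumed to be already split into frames given as raw bytes, each frame is encoded separately so that Base64 alignment does not hide matches, and the names `to_text`, `count_occurrences`, `report`, and `seconds_per_frame` are hypothetical.

```python
import base64

def to_text(frames):
    """Encode each frame (raw bytes) as a Base64 string."""
    return [base64.b64encode(frame).decode("ascii") for frame in frames]

def count_occurrences(stream_frames, ad_frames):
    """Count non-overlapping occurrences of the ad's frame sequence in the stream."""
    stream, ad = to_text(stream_frames), to_text(ad_frames)
    if not ad:
        return 0
    count, i = 0, 0
    while i + len(ad) <= len(stream):
        if stream[i:i + len(ad)] == ad:  # plain string comparison, frame by frame
            count += 1
            i += len(ad)                 # skip past the matched ad
        else:
            i += 1
    return count

def report(stream_frames, ads, seconds_per_frame=1.0):
    """Map each ad name to its number of appearances and total airtime.

    seconds_per_frame is an assumed, illustrative parameter.
    """
    result = {}
    for name, ad_frames in ads.items():
        n = count_occurrences(stream_frames, ad_frames)
        result[name] = {
            "appearances": n,
            "total_seconds": n * len(ad_frames) * seconds_per_frame,
        }
    return result

stream = [b"x", b"a1", b"a2", b"y", b"a1", b"a2"]
print(report(stream, {"soap": [b"a1", b"a2"]}))
# {'soap': {'appearances': 2, 'total_seconds': 4.0}}
```

Exact string comparison only finds byte-identical repeats; re-encoded or rescaled broadcasts of the same ad would need a more tolerant comparison.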


2021 ◽  
Author(s):  
Nikita Mishin ◽  
Daniil Berezun ◽  
Alexander Tiskin

2021 ◽  
Vol 31 (1) ◽  
pp. 79-110
Author(s):  
Luís Russo ◽  
Alexandre Francisco ◽  

We consider the problem of identifying tandem scattered subsequences within a string. Our algorithm identifies a longest subsequence that occurs twice without overlap in a string. Because the algorithm is based on the Hunt-Szymanski algorithm, its performance improves when the string is not self-similar, which occurs naturally for strings over large alphabets. Our algorithm relies on new results for data structures that support dynamic longest increasing subsequences. In the process we also obtain improved algorithms for the decremental string comparison problem.
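For readers unfamiliar with the Hunt-Szymanski idea referenced above, the following is a minimal sketch of the classical technique for the longest common subsequence (LCS) of two strings: enumerate matching position pairs and reduce the problem to a longest increasing subsequence. It is not the paper's tandem scattered subsequence algorithm, and the function name is hypothetical. The work done is proportional to the number of matching pairs, which is why strings with few repeated symbols (large alphabets, little self-similarity) are processed faster.

```python
from bisect import bisect_left
from collections import defaultdict

def lcs_length_hunt_szymanski(a, b):
    """Length of a longest common subsequence of a and b."""
    # For each symbol, record the positions where it occurs in b.
    positions = defaultdict(list)
    for j, ch in enumerate(b):
        positions[ch].append(j)

    # tails[k] = smallest end position in b of a common subsequence of length k + 1.
    tails = []
    for ch in a:
        # Visit matches in decreasing order of j so that one symbol of a
        # is never matched to two positions of b in the same subsequence.
        for j in reversed(positions.get(ch, ())):
            k = bisect_left(tails, j)
            if k == len(tails):
                tails.append(j)
            else:
                tails[k] = j
    return len(tails)

print(lcs_length_hunt_szymanski("ABCBDAB", "BDCABA"))  # 4
```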


Author(s):  
Joel Kalvesmaki

Classical models of string comparison have been difficult to implement in XSLT, in part because those models are designed for imperative, stateful programming. In this article I introduce tan:diff(), an XSLT function built upon a different approach to string comparison, one more conducive to a declarative, stateless language. tan:diff() is efficient and fast, even on pairs of very long strings (100K to 1M characters), in part because of its staggered-sample approach, in part because of its strategies for optimizing enormous strings (> 1M characters). Its results are of optimal quality: the function normally returns a minimal diff (shortest edit script). As an open-source function, tan:diff() enables developers to incorporate robust text comparison directly into XML applications.


Queue ◽  
2021 ◽  
Vol 19 (3) ◽  
pp. 107-116
Author(s):  
Torsten Ullrich

In many languages a string comparison is a pitfall for beginners. With any Unicode string as input, a comparison often causes problems even for advanced users. The semantic equivalence of different characters in Unicode requires a normalization of the strings before comparing them. This article shows how to handle Unicode sequences correctly. The comparison of two strings for equality often raises questions concerning the difference between comparison by value, comparison of object references, strict equality, and loose equality. The most important aspect is semantic equivalence.
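The normalization step described above can be illustrated with Python's standard `unicodedata` module. This is a small sketch, not code from the article; the helper name `equivalent` is hypothetical, and the choice between NFC (canonical equivalence) and NFKC (compatibility equivalence) depends on the application.

```python
import unicodedata

def equivalent(a, b, form="NFC"):
    """Compare two strings after normalizing both to the same Unicode form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

composed = "\u00e9"      # 'é' as one code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

print(composed == decomposed)            # False: different code point sequences
print(equivalent(composed, decomposed))  # True: canonically equivalent

# Compatibility equivalence is looser: the ligature U+FB01 equals "fi" only under NFKC/NFKD.
print(equivalent("\ufb01", "fi"))          # False
print(equivalent("\ufb01", "fi", "NFKC"))  # True
```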


Author(s):  
Mai Alzamel ◽  
Lorraine A.K. Ayad ◽  
Giulia Bernardini ◽  
Roberto Grossi ◽  
Costas S. Iliopoulos ◽  
...  

Uncertain sequences are compact representations of sets of similar strings. They highlight common segments by collapsing them, and explicitly represent varying segments by listing all possible options. A generalized degenerate string (GD string) is a type of uncertain sequence. Formally, a GD string Ŝ is a sequence of n sets of strings of total size N, where the i-th set contains strings of the same length k_i but this length can vary between different sets. We denote by W the sum of these lengths k_0, k_1, …, k_{n-1}. Our main result is an O(N + M)-time algorithm for deciding whether two GD strings of total sizes N and M, respectively, over an integer alphabet, have a non-empty intersection. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in linear space. We then apply our string comparison tool to devise a simple algorithm for computing all palindromes in Ŝ in O(min{W, n^2}N)-time. We complement this upper bound by showing a similar conditional lower bound for computing maximal palindromes in Ŝ. We also show that a result, which is essentially the same as our string comparison linear-time algorithm, can be obtained by employing an automata-based approach.
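To make the definition concrete, the sketch below represents a GD string as a list of sets of equal-length strings and tests intersection by brute force, expanding every string each GD string can spell. This is only an illustration of the problem statement: it takes exponential time in general and is not the O(N + M)-time algorithm of the paper. The function names are hypothetical.

```python
from itertools import product

def language(gd):
    """All strings spelled by choosing one string from each set, in order."""
    return {"".join(choice) for choice in product(*gd)}

def gd_intersection(gd1, gd2):
    """Strings spelled by both GD strings (brute force, exponential in general)."""
    return language(gd1) & language(gd2)

# Each set holds strings of one length, but lengths differ between sets
# and the two GD strings need not be split at the same positions.
gd1 = [{"ab", "cd"}, {"e", "f"}]    # spells abe, abf, cde, cdf
gd2 = [{"a", "c"}, {"be", "dx"}]    # spells abe, adx, cbe, cdx

print(gd_intersection(gd1, gd2))    # {'abe'}
```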


2020 ◽  
Vol 175 (1-4) ◽  
pp. 41-58
Author(s):  
Mai Alzamel ◽  
Lorraine A.K. Ayad ◽  
Giulia Bernardini ◽  
Roberto Grossi ◽  
Costas S. Iliopoulos ◽  
...  

Uncertain sequences are compact representations of sets of similar strings. They highlight common segments by collapsing them, and explicitly represent varying segments by listing all possible options. A generalized degenerate string (GD string) is a type of uncertain sequence. Formally, a GD string Ŝ is a sequence of n sets of strings of total size N, where the i-th set contains strings of the same length k_i but this length can vary between different sets. We denote by W the sum of these lengths k_0, k_1, …, k_{n-1}. Our main result is an 𝒪(N + M)-time algorithm for deciding whether two GD strings of total sizes N and M, respectively, over an integer alphabet, have a non-empty intersection. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in linear space. We then apply our string comparison tool to devise a simple algorithm for computing all palindromes in Ŝ in 𝒪(min{W, n^2}N)-time. We complement this upper bound by showing a similar conditional lower bound for computing maximal palindromes in Ŝ. We also show that a result, which is essentially the same as our string comparison linear-time algorithm, can be obtained by employing an automata-based approach.


Author(s):  
Martin Marinov

This paper describes a string encoding algorithm that produces sparse distributed representations (SDR) of text data. In essence, it is a modified version of a prior algorithm, and the modifications have the following benefits: the ability to decode data without loss of information; greatly increased capacity of the encoding space; and the possibility of performing more detailed comparisons of encoded strings. The main disadvantage compared to the prior algorithm is the increased complexity of the procedure for comparing encoded strings. This is due to the use of a four-dimensional encoding space instead of a two-dimensional one.


Author(s):  
Philip Gauglitz ◽  
Jan Ulffers ◽  
Gyde Thomsen ◽  
Felix Frischmuth ◽  
David Geiger ◽  
...  

The electrification of the transport sector together with an increasing share of renewable energies has the potential to reduce CO₂ emissions significantly. This transformation requires the roll-out of charging infrastructure, which, as a new and rapidly growing electrical consumer, has an impact on the power grid. For grid planning and dimensioning purposes, it is crucial to assess this impact as accurately as possible. Consequently, the possibility to simulate potential spatial distributions of charging points and their ramp-up is of central importance. We present an approach using socio-economic data such as population size, income levels and age to estimate where electric mobility will be concentrated, especially during the transition phase.

Suitable socio-economic data for Germany is only available for the current population and, in terms of spatial resolution, at the level of streets. Thus, both spatial disaggregation and temporal extrapolation within a demographic model are necessary for more detailed scenario predictions. In our proposed approach, a fuzzy-string comparison method and geographical mapping are used to allocate the socio-economic data to buildings (LOD1). A prediction on demographic changes taking into account recent municipal developments in Germany has been implemented. Age-specific changes at the community level are disaggregated on the household level and merged with socio-economic data. Combined with framework scenarios, we use these criteria based on socio-economic factors to develop spatially disaggregated scenarios. The framework scenarios take into account an increased penetration of renewable energies and a developed TCO approach for the ramp-up of electric mobility.

Predicting future distributions of domestic charging points with such a level of detail in terms of the ramp-up model and spatial resolution is highly beneficial for grid analysis and planning purposes. Typically, distribution grid studies that assess necessary grid investments rely on various simplified assumptions. A more detailed analysis of when and where the power flow at certain building connection points is likely to increase allows for more precise analyses of possible grid congestion. This also makes more efficient grid reinforcement and expansion planning possible, especially in urban areas, where infrastructure changes are expensive and time-consuming.

Another important aspect for demand-driven grid planning is the temporal modeling of charging processes. We use individual driving profiles based on surveys to create charging profiles for different consumer types. We combine them with a holistic model of the energy system including power plant scheduling as well as other (future) local producers and consumers such as photovoltaics and heat pumps. It allows us to consider correlations and simultaneities in their behavior and additionally enables us to explore various flexibility options and their influence on the electricity market and the grid.
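The abstract does not say which fuzzy-string comparison method is used to link street-level data to buildings. As one possible illustration only, the sketch below matches differently spelled street names with `difflib.get_close_matches` from the Python standard library; the helper name `match_street`, the sample street names, and the cutoff of 0.6 are assumptions.

```python
import difflib

def match_street(name, known_streets, cutoff=0.6):
    """Return the known street name closest to `name`, or None if none is close enough."""
    # casefold() lowercases and also maps 'ß' to 'ss', which helps with German spellings.
    lookup = {street.casefold(): street for street in known_streets}
    hits = difflib.get_close_matches(name.casefold(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

streets = ["Hauptstraße", "Bahnhofstraße", "Gartenweg"]

print(match_street("Hauptstrasse", streets))  # Hauptstraße
print(match_street("Hauptstr.", streets))     # Hauptstraße
print(match_street("Zzz", streets))           # None
```

A stricter cutoff reduces false matches between similar street names at the cost of leaving more records unmatched.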

