Light Diacritic Restoration to Disambiguate Homographs in Modern Arabic Texts

Author(s):  
Aqil M. Azmi ◽  
Rehab M. Alnefaie ◽  
Hatim A. Aboalsamh

Diacritic restoration (also known as diacritization or vowelization) is the process of inserting the correct diacritical markings into a text. Modern Arabic, as found in newspapers for example, is typically written without diacritics. This lack of diacritical markings often causes ambiguity, and though native speakers are adept at resolving it, there are times they may fail. Diacritic restoration is a classical problem in computer science. Most existing work tackles the full (heavy) diacritization of text; we, however, are interested in diacritizing text using fewer diacritics. Studies have shown that a fully diacritized text is visually displeasing and slows down reading. This article proposes a system to diacritize homographs using the least number of diacritics, hence the name “light.” A large class of words falls under the homograph category, and we deal with the class of words that share the spelling but not the meaning. With fewer diacritics, we do not expect any effect on reading speed, while eye strain is reduced. The system consists of a morphological analyzer and context similarities. The morphological analyzer is used to generate all candidate diacritizations of a word. Then, through a statistical approach and context similarities, we resolve the homographs. Experimentally, the system shows very promising results, and our best accuracy is 85.6%.
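
To illustrate why undiacritized text is ambiguous, the following minimal Python sketch strips the diacritics from two differently vocalized words and shows that they collapse to the same spelling. The function name `strip_diacritics` and the two example words are illustrative assumptions; this is not the authors' system, which relies on a morphological analyzer and context similarities.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Drop combining marks (Unicode category Mn), which include the Arabic vowel signs."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


# Two vocalizations of the same consonant skeleton (ain, lam, meem).
# Both words are hypothetical examples chosen for this sketch.
ALAM = "\u0639\u064E\u0644\u064E\u0645"  # fatha on ain and on lam ("flag")
ILM = "\u0639\u0650\u0644\u0652\u0645"   # kasra on ain, sukun on lam ("knowledge")

# The diacritized forms differ, but the bare forms are identical homographs.
print(ALAM == ILM)                                       # False
print(strip_diacritics(ALAM) == strip_diacritics(ILM))   # True
```

A light diacritization scheme, as described above, would aim to keep only as many marks as are needed to tell such candidates apart.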

2020 ◽  
Vol 31 (05) ◽  
pp. 583-593
Author(s):  
Saeid Alirezazadeh ◽  
Khadijeh Alibabaei

Forest algebras are defined for investigating languages of forests [ordered sequences] of unranked trees, where a node may have more than two [ordered] successors. They consist of two monoids, the horizontal and the vertical, with an action of the vertical monoid on the horizontal monoid, and a complementary axiom of faithfulness. A pseudovariety is a class of finite algebras of a given signature, closed under taking homomorphic images, subalgebras, and finitary direct products. By taking the syntactic congruence for monoids and extending it naturally to forest algebras, one can define a version of the syntactic congruence for a subset of the free forest algebra, not just for a forest language. Let [Formula: see text] be a finite alphabet and [Formula: see text] be a pseudovariety of finite forest algebras. A language [Formula: see text] is [Formula: see text]-recognizable if its syntactic forest algebra belongs to [Formula: see text]. Separation is a classical problem in mathematics and computer science. It asks whether, given two sets belonging to some class, it is possible to separate them by another set of a smaller class. Suppose that a forest language [Formula: see text] and a forest [Formula: see text] are given. We want to determine whether one can prove that [Formula: see text] does not belong to [Formula: see text] using only [Formula: see text]-recognizable languages, i.e., given such [Formula: see text] and [Formula: see text], whether there exists a [Formula: see text]-recognizable language [Formula: see text] that contains [Formula: see text] and does not contain [Formula: see text]. In this paper, we show how profinite forest algebras can be used to separate a forest language from a forest term and also to separate two forest languages.


Author(s):  
Dariusz Jacek Jakóbczak

Object recognition is one of the topics of artificial intelligence, computer vision, image processing, and machine vision. The classical problem in these areas of computer science is that of determining an object via its characteristic features. An important feature of the object is its contour. Accurate reconstruction of contour points makes it possible to compare the unknown object with models of specified objects. The key information about the object is the set of contour points, which are treated as interpolation nodes. Classical interpolations (Lagrange or Newton polynomials) are useless for precise reconstruction of the contour. The chapter deals with a proposed method of contour reconstruction via curve interpolation. The first stage consists of computing the contour points of the object to be recognized. Then one can compare models of known objects, given by their sets of contour points, with the coordinates of the interpolated points of the unknown object. Contour point reconstruction and curve interpolation are possible using a new method of Hurwitz-Radon matrices.


Author(s):  
Quang Vu

The classical “Coin change” problem in computer science has become a key problem underlying a number of subsequent problems in different areas: finance, algorithm study, sports, etc. Mathematicians have paid attention to only two outcomes of the problem: the most time- or resource-efficient solution and the total number of solutions. However, other solutions among the “normal solutions” can be beneficial in certain situations, if carefully considered alongside past mathematical and economic phenomena. Our work describes some such potentially beneficial solutions that are worth paying attention to, and their applications in finance and fiscal policy. This is of particular importance now because of the COVID-19 pandemic.
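
The two outcomes mentioned above (the most efficient solution and the total number of solutions) can be sketched with standard textbook dynamic programming. The function names `count_ways` and `min_coins` and the coin set are assumptions chosen for illustration; this is not the author's method for selecting other beneficial solutions.

```python
def count_ways(coins, amount):
    """Number of distinct coin combinations (order ignored) that sum to `amount`."""
    ways = [1] + [0] * amount          # one way to make 0: use no coins
    for coin in coins:                 # outer loop over coins avoids counting permutations
        for value in range(coin, amount + 1):
            ways[value] += ways[value - coin]
    return ways[amount]


def min_coins(coins, amount):
    """Fewest coins that sum to `amount`, or None if it cannot be made."""
    unreachable = float("inf")
    best = [0] + [unreachable] * amount
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return None if best[amount] == unreachable else best[amount]


print(count_ways([1, 2, 5], 5))   # 4  (5, 2+2+1, 2+1+1+1, 1+1+1+1+1)
print(min_coins([1, 2, 5], 11))   # 3  (5 + 5 + 1)
```

Every combination counted by `count_ways` other than the minimal one is a candidate "normal solution" in the sense used above.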


1979 ◽  
Vol 8 (102) ◽  
Author(s):  
Jørgen Sand ◽  
Ole Østerby

At the Department of Computer Science, a system has been developed for plotting regions of absolute stability for a large class of formulae and methods for solving systems of ordinary differential equations. This report is a pictorial guide through the stability regions of a number of well-known formulae, thereby showing the capabilities of our programs and, hopefully, also giving some new information about the methods. In an appendix we give coefficients for Adams, Nyström, generalized Milne-Simpson, and backward differentiation formulae up to order 12 (resp. 11), and coefficients for Padé approximations to the exponential up to degree 6.
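
The abstract does not say how the plotting programs work. One standard technique for linear multistep formulae is the boundary locus method: with characteristic polynomials ρ and σ, the boundary of the region of absolute stability lies on the curve z(θ) = ρ(e^{iθ}) / σ(e^{iθ}). The Python sketch below assumes that technique and uses the two-step Adams-Bashforth formula as an example; the names `boundary_locus`, `ab2_rho`, and `ab2_sigma` are hypothetical.

```python
import cmath
import math


def boundary_locus(rho, sigma, samples=360):
    """Sample z(theta) = rho(e^{i theta}) / sigma(e^{i theta}) around the unit circle."""
    points = []
    for k in range(samples):
        zeta = cmath.exp(2j * math.pi * k / samples)
        points.append(rho(zeta) / sigma(zeta))
    return points


# Two-step Adams-Bashforth: y[n+2] = y[n+1] + h * (3/2 f[n+1] - 1/2 f[n])
def ab2_rho(zeta):
    return zeta * zeta - zeta


def ab2_sigma(zeta):
    return (3 * zeta - 1) / 2


curve = boundary_locus(ab2_rho, ab2_sigma)
# theta = pi (index 180 of 360) gives the left end of the real stability interval.
print(round(curve[180].real, 6))  # -1.0
```

Plotting the real and imaginary parts of `curve` with any plotting tool would trace the boundary of the stability region for this formula.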


Algorithmica ◽  
2020 ◽  
Vol 82 (12) ◽  
pp. 3707-3743
Author(s):  
Amihood Amir ◽  
Panagiotis Charalampopoulos ◽  
Solon P. Pissis ◽  
Jakub Radoszewski

Given two strings S and T, each of length at most n, the longest common substring (LCS) problem is to find a longest substring common to S and T. This is a classical problem in computer science with an $\mathcal{O}(n)$-time solution. In the fully dynamic setting, edit operations are allowed in either of the two strings, and the problem is to find an LCS after each edit. We present the first solution to the fully dynamic LCS problem requiring sublinear time in n per edit operation. In particular, we show how to find an LCS after each edit operation in $\tilde{\mathcal{O}}(n^{2/3})$ time, after $\tilde{\mathcal{O}}(n)$-time and space preprocessing. This line of research has been recently initiated in a somewhat restricted dynamic variant by Amir et al. [SPIRE 2017]. More specifically, the authors presented an $\tilde{\mathcal{O}}(n)$-sized data structure that returns an LCS of the two strings after a single edit operation (that is reverted afterwards) in $\tilde{\mathcal{O}}(1)$ time. At CPM 2018, three papers (Abedin et al., Funakoshi et al., and Urabe et al.) studied analogously restricted dynamic variants of problems on strings; specifically, computing the longest palindrome and the Lyndon factorization of a string after a single edit operation. We develop dynamic sublinear-time algorithms for both of these problems as well. We also consider internal LCS queries, that is, queries in which we are to return an LCS of a pair of substrings of S and T. We show that answering such queries is hard in general and propose efficient data structures for several restricted cases.
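
For readers unfamiliar with the static problem, the sketch below computes a longest common substring with the textbook quadratic-time dynamic program. It is deliberately not the linear-time solution mentioned above, nor the dynamic data structure of the paper; the function name `longest_common_substring` is an assumption.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return one longest substring common to s and t, in O(len(s) * len(t)) time."""
    best_len, best_end = 0, 0
    prev = [0] * (len(t) + 1)  # prev[j]: length of common suffix of s[:i-1] and t[:j]
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1      # extend the common suffix by one character
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


print(longest_common_substring("abcdxyz", "xyzabcd"))  # abcd
```

Recomputing this table after every edit costs quadratic time per edit, which is the cost the fully dynamic setting described above seeks to avoid.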


2013 ◽  
pp. 998-1018
Author(s):  
Dariusz Jakóbczak

Object recognition is one of the topics of artificial intelligence, computer vision, image processing, and machine vision. The classical problem in these areas of computer science is that of determining an object via its characteristic features. An important feature of the object is its contour. Accurate reconstruction of contour points makes it possible to compare the unknown object with models of specified objects. The key information about the object is the set of contour points, which are treated as interpolation nodes. Classical interpolations (Lagrange or Newton polynomials) are useless for precise reconstruction of the contour. The chapter deals with a proposed method of contour reconstruction via curve interpolation. The first stage consists of computing the contour points of the object to be recognized. Then one can compare models of known objects, given by their sets of contour points, with the coordinates of the interpolated points of the unknown object. Contour point reconstruction and curve interpolation are possible using a new method of Hurwitz-Radon matrices.



