Text Information Extraction: in Context of Indian Languages

2012 ◽  
Vol 433-440 ◽  
pp. 5012-5019
Author(s):  
Ganesh K. Sethi ◽  
Rajesh K. Bawa

Text in images carries useful information, and extracting it involves detection, localization, extraction, enhancement and recognition. However, the problem is challenging because text varies in style, size, orientation and alignment, and is affected by lighting conditions. While a large number of techniques have been proposed for extracting text from images and video frames for foreign languages, not much research has been carried out for Indian languages. The purpose of this paper is to review various algorithms for this problem, for foreign as well as Indian languages.

2021 ◽  
Vol 83 (1) ◽  
pp. 72-79
Author(s):  
O.A. Kan ◽  
N.A. Mazhenov ◽  
K.B. Kopbalina ◽  
G.B. Turebaeva ◽  
...  

The main problem: The article deals with hiding text information in a graphic file. A formula for hiding text information in image pixels is proposed. A steganography scheme for embedding secret text in random image pixels has been developed. Random bytes are pre-embedded in each row of pixels in the source image, which yields a key image. The text codes are embedded in random bytes of pixels of a given RGB channel. The secret message is formed from characters of the ASCII code table. Demo encryption and decryption programs have been developed in the Python 3.5.2 programming language. A graphic file is used as the decryption key. Purpose: To develop an algorithm for embedding text information in random pixels of an image. Methods: Among the methods of hiding information in graphic images, the LSB method is widely used; it replaces the lower bits of the image bytes responsible for color encoding with the bits of the secret message. Analysis of methods of hiding information in graphic files and modeling of algorithms showed an increase in the level of protection of hidden information from detection. Results and their significance: Using the proposed steganography scheme and the algorithm for embedding bytes of a secret message in a graphic file, protection against detection of hidden information is significantly increased. The advantage of this steganography scheme is that decryption uses a key image in which random bytes are pre-embedded. In addition, all pixel bits of the container image are used to display the color shades. The developed steganography scheme also makes it possible not only to transmit secret information but also to add digital fingerprints or hidden tags to the image.
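The LSB method mentioned in the abstract can be illustrated with a minimal sketch. This is the generic least-significant-bit technique, not the authors' key-image scheme, which the abstract does not specify in enough detail to reproduce. The function names `embed_lsb` and `extract_lsb`, the 2-byte length header, and the use of a plain byte sequence in place of real image pixel data are all assumptions made for illustration.

```python
def embed_lsb(cover: bytes, message: str) -> bytearray:
    """Hide an ASCII message in the lowest bit of successive cover bytes.

    `cover` stands in for raw color-channel bytes of an image (assumption).
    A 2-byte big-endian length header is stored first (assumption).
    """
    data = message.encode("ascii")
    payload = len(data).to_bytes(2, "big") + data
    # Flatten the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the cover byte, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(stego: bytes) -> str:
    """Recover the message written by embed_lsb."""

    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (stego[start_bit + 8 * b + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length).decode("ascii")


cover = bytes(range(256))          # stand-in for pixel channel bytes
stego = embed_lsb(cover, "hi")
recovered = extract_lsb(stego)
```

Each cover byte changes by at most 1, which is why plain LSB embedding is visually inconspicuous. The sketch writes into consecutive bytes; the abstract's scheme differs in that it targets random bytes and uses a key image for extraction.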


2018 ◽  
Author(s):  
Ismail Suardi Wekke ◽  
Muhammad Yusuf ◽  
Agung Muttaqien

Arabic learning started from the assumption that method was more important than material. On this assumption, success would be determined by the methods the teacher selected. This paper discusses the paradigm that encouraged Arabic teachers to master several methods considered effective and efficient for achieving the goals of learning Arabic. From this, the eclectic method emerged as a central axis, providing alternative methods that are combined to support each other in achieving those goals. Past evidence shows that one cause of failure in learning foreign languages was teachers’ limited ability to select proper and engaging methods; inappropriate method selection led to students’ desperation. Finally, some recommendations are presented to enhance Arabic language learning.


2021 ◽  
Vol 2083 (4) ◽  
pp. 042044
Author(s):  
Zuhua Dai ◽  
Yuanyuan Liu ◽  
Shilong Di ◽  
Qi Fan

Aspect level sentiment analysis belongs to fine-grained sentiment analysis, which has caused extensive research in academic circles in recent years. For this task, the recurrent neural network (RNN) model is usually used for feature extraction, but the model cannot effectively obtain the structural information of the text. Recent studies have begun to use the graph convolutional network (GCN) to model the syntactic dependency tree of the text to solve this problem. For short text data, the text information is not enough to accurately determine the emotional polarity of the aspect words, and the knowledge graph is not effectively used as external knowledge that can enrich the semantic information. In order to solve the above problems, this paper proposes a graph convolutional neural network (GCN) model that can process syntactic information, knowledge graphs and text semantic information. The model works on the “syntax-knowledge” graph to extract syntactic information and common sense information at the same time. Compared with the latest model, the model in this paper can effectively improve the accuracy of aspect-level sentiment classification on two datasets.
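To make the idea of running a GCN over a syntactic dependency tree concrete, here is a minimal sketch of one generic graph-convolution step. It is not the model proposed in the paper: the "syntax-knowledge" graph, the knowledge-graph inputs and the trained weights are not reproduced. The function name `gcn_layer`, the mean-over-neighbors normalization, the toy sentence and the identity features and weights are all assumptions chosen for illustration.

```python
def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1 (A + I) H W).

    adj    : n x n adjacency matrix of the dependency tree (0/1 entries)
    feats  : n x d node feature matrix H (one row per word)
    weight : d x m weight matrix W
    """
    n = len(adj)
    d = len(feats[0])
    m = len(weight[0])
    # Add self-loops so each word keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    out = []
    for i in range(n):
        degree = sum(a_hat[i])
        # Average the features of the word and its syntactic neighbors.
        agg = [sum(a_hat[i][j] * feats[j][k] for j in range(n)) / degree for k in range(d)]
        # Linear transform followed by ReLU.
        out.append([max(0.0, sum(agg[k] * weight[k][c] for k in range(d))) for c in range(m)])
    return out


# Toy dependency tree for "food was great": was-food and was-great edges (assumption).
adj = [
    [0, 1, 0],  # food
    [1, 0, 1],  # was
    [0, 1, 0],  # great
]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adj, identity, identity)
```

After one step, the row for "food" mixes in the features of "was", its only syntactic neighbor; stacking a second layer would let information from "great" reach "food" through the tree, which is the structural signal an RNN over the word sequence does not model directly.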


2004 ◽  
Vol 37 (5) ◽  
pp. 977-997 ◽  
Author(s):  
Keechul Jung ◽  
Kwang In Kim ◽  
Anil K. Jain

2019 ◽  
Vol 25 (06) ◽  
pp. 677-692
Author(s):  
Ralph Grishman

Information extraction is the process of converting unstructured text into a structured data base containing selected information from the text. It is an essential step in making the information content of the text usable for further processing. In this paper, we describe how information extraction has changed over the past 25 years, moving from hand-coded rules to neural networks, with a few stops on the way. We connect these changes to research advances in NLP and to the evaluations organized by the US Government.


English Today ◽  
2014 ◽  
Vol 30 (1) ◽  
pp. 13-20 ◽  
Author(s):  
Tvrtko Prćić

The concept of English as the nativized foreign language – or ENFL, for short – was first proposed in 2003, at the 13th International Conference on British and American Studies, in Timişoara, Romania, in a presentation entitled ‘Rethinking the status of English today: is it still a purely foreign language?’, and subsequently published as Prćić, 2003 and 2004. Identified and described in these papers are new, additional properties of English, which have developed over the past few decades, concurrently with the establishment of English as the first language of world communication and as today's global lingua franca (for accounts of this phenomenon, see Jenkins, 2007; Mauranen & Ranta, 2010; Seidlhofer, 2011). Viewed from the perspective of the Expanding Circle (Kachru, 1985), English can no longer be considered a purely, or prototypically, foreign language, usually characterized by three defining properties: not the first language of a country, not the official language of a country and taught as a subject in schools (cf. Richards & Schmidt, 2002). Three newly emerged defining properties of English, over and above the three customary ones, set it uniquely apart from all other purely foreign languages and they will be briefly summarized below (for more extensive discussions, see Prćić, 2003, 2004, 2011a: Chapter 2, 2011b, 2014).

