Automatically Recognizing Emotions in Text Using Prediction by Partial Matching (PPM) Text Compression Method

Author(s):  
Amer Almahdawi ◽  
William John Teahan
2015 ◽  
Author(s):  
Francisco N. Neto ◽  
Cláudio Baptista ◽  
Cláudio Campelo

Information about the destination and route that a person will take is important for various purposes, such as helping a user avoid a congested route. However, an information system in which users must explicitly enter their intended destination is unlikely to be useful for daily routines. Ideally, the system should be able to predict the destination and the route to be taken by a vehicle as soon as it starts to move. This paper presents a new technique for predicting route and destination, based on the Prediction by Partial Matching (PPM) compression method. By considering two important pieces of contextual information (day of the week and time of departure), our approach obtained encouraging results, reaching an accuracy of around 92%.
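The prediction idea in this abstract can be illustrated with a small context model. The sketch below is not the authors' implementation: it keeps only the back-off part of PPM (try the longest known context, then shorter ones) and omits escape probabilities. The class name `BackoffPredictor`, the segment labels and the departure tokens such as `"Mon-08h"` are hypothetical, chosen only to show how day and time of departure could be folded into the context.

```python
from collections import Counter, defaultdict


class BackoffPredictor:
    """Simplified PPM-style predictor: longest matching context wins."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # Maps a context (tuple of preceding symbols) to counts of the next symbol.
        self.counts = defaultdict(Counter)

    def train(self, sequence):
        # Record every symbol under all contexts of length 0..max_order.
        for i, symbol in enumerate(sequence):
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                context = tuple(sequence[i - k:i])
                self.counts[context][symbol] += 1

    def predict(self, history):
        # Back off from the longest available context to the empty one.
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = tuple(history[len(history) - k:])
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None


# Hypothetical trips: a departure token followed by road segments.
trips = [
    ["Mon-08h", "A", "B", "C", "Office"],
    ["Mon-08h", "A", "B", "C", "Office"],
    ["Sat-10h", "A", "B", "D", "Market"],
]

model = BackoffPredictor(max_order=3)
for trip in trips:
    model.train(trip)
```

With this toy data, a history that starts with a known departure token is matched at order 3, while an unseen departure token falls back to the shorter context `("A", "B")` and returns the segment seen most often after it.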


Author(s):  
И.В. Селиванова ◽  
I.V. Selivanova ◽  
Д.В. Косяков ◽  
D.V. Kosyakov ◽  
А.Е. Гуськов ◽  
...  

This study examines whether the semantic similarity of scientific texts can be established by automatically classifying them with a method based on compressing their abstracts. The idea of the method is that compression algorithms of the PPM (prediction by partial matching) type compress terminologically close texts substantially better than distant ones. If a core of publications (an analogue of a training set) is formed for each topic to be classified, the best compression ratio will indicate that the text being classified belongs to the corresponding topic. Thirty topical categories were defined; for each of them, abstracts of about 500 publications were obtained from the Scopus database, from which 100 abstracts for the core and 20 abstracts for testing were selected in various ways. Building the core from highly cited publications was found to produce an error rate of up to 12%, compared with 32% for random sampling. The quality of classification is also affected by the initial number of categories: the fewer categories take part in the classification and the greater the terminological differences between them, the higher its quality.
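The classification rule described above (assign a text to the topic whose core compresses it best) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: Python's standard library has no PPM compressor, so `zlib` (DEFLATE) stands in for it, and the extra bytes needed to compress a text appended to a core are used as a proxy for the compression ratio. The function names and the two toy cores are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    # zlib is a stand-in here; the abstract describes PPM-type compressors.
    return len(zlib.compress(data, 9))


def extra_bytes(core: str, text: str) -> int:
    """Bytes added to the compressed core when the text is appended to it."""
    base = core.encode("utf-8")
    combined = base + b" " + text.encode("utf-8")
    return compressed_size(combined) - compressed_size(base)


def classify(text: str, cores: dict) -> str:
    """Return the topic whose core makes the text cheapest to compress."""
    return min(cores, key=lambda topic: extra_bytes(cores[topic], text))


# Toy cores standing in for the per-topic sets of abstracts.
cores = {
    "physics": (
        "The photon entanglement experiment measured quantum coherence "
        "in the laser cavity. Quantum coherence decays when the laser "
        "cavity couples to thermal noise."
    ),
    "biology": (
        "The protein folding assay measured gene expression in the cell "
        "membrane. Gene expression changes when the cell membrane binds "
        "to signalling proteins."
    ),
}
```

A text that shares terminology with one core, such as `"quantum coherence in the laser cavity"`, costs fewer additional bytes when appended to that core than to the other, which is the signal the method relies on. Real cores of 100 abstracts would give the compressor far more context than these toy strings.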


Author(s):  
Irina Bubnova

The article presents the results of a study of the award decrees and statutes of the highest orders of the USSR and modern Russia: the Order of Lenin and the Order of St. Andrew the First-Called. The purpose of the study was to identify the virtue imperatives that serve the main goals of a given stage in the state's development and that are transmitted through civil decorations in order to construct a sociocultural layer of a value picture of the world and, accordingly, to shape a type of personality with a set of socially approved models of behavior that is in demand in a certain period. The texts of the decrees and statutes were analyzed using content analysis, which made it possible to distinguish the leading super-topical themes, viewed as a reflection of the required public interest. The results were then verified and refined with a text compression method, leading to conclusions about the virtue imperatives of both historical periods under study. In the USSR, industrialization, ideological construction associated with forming the moral coordinates of the individual, and the development of agriculture and science were (in descending order) of leading value, which in general corresponds to the main functions of the governmental institution. The leading virtue imperative of modern Russia lies in the sphere of ideology; meanwhile, the reassessment of the value system is based primarily on models from the Soviet past, which causes cognitive dissonance and can lead to unpredictable results.


2003 ◽  
Vol 13 (01) ◽  
pp. 39-45
Author(s):  
AMER AL-NASSIRI

In this paper we consider a theoretical evaluation of a data and text compression algorithm based on the Burrows–Wheeler Transform (BWT) and General Bidirectional Associative Memory (GBAM). A new lossless data and text compression method, based on the combination of the BWT and GBAM approaches, is presented. The algorithm was tested on many texts in different formats (ASCII and RTF). The compression ratio achieved is fairly good, averaging 28–36%. Decompression is fast.
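For readers unfamiliar with the BWT stage mentioned in this abstract, the following is a minimal, naive sketch of the transform and its inverse. It is not the paper's algorithm: the GBAM stage is omitted entirely, the rotation-sorting approach is quadratic and only suitable for short strings, and the function names and the `"\0"` sentinel are assumptions made for the illustration.

```python
def bwt(text, sentinel="\0"):
    """Naive Burrows-Wheeler Transform: last column of the sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="\0"):
    """Rebuild the input by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original string plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


transformed = bwt("banana")
restored = inverse_bwt(transformed)
```

The transform itself does not shrink the data; it groups identical characters together so that a later coding stage can exploit the runs, and the inverse recovers the input exactly, which is what makes the scheme lossless.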

