text compression
Recently Published Documents


TOTAL DOCUMENTS

287
(FIVE YEARS 39)

H-INDEX

18
(FIVE YEARS 2)

2022 ◽  
Vol 3 (1) ◽  
pp. 1-28
Author(s):  
Giorgio Grani ◽  
Andrea Lenzi ◽  
Paola Velardi

Social media analytics can considerably contribute to understanding health conditions beyond clinical practice, by capturing patients’ discussions and feelings about their quality of life in relation to disease treatments. In this article, we propose a methodology to support a detailed analysis of the therapeutic experience in patients affected by a specific disease, as it emerges from health forums. As a use case to test the proposed methodology, we analyze the experience of patients affected by hypothyroidism and their reactions to standard therapies. Our approach is based on a data extraction and filtering pipeline, a novel topic detection model named Generative Text Compression with Agglomerative Clustering Summarization (GTCACS), and an in-depth data analytic process. We advance the state of the art on automated detection of adverse drug reactions (ADRs) since, rather than simply detecting and classifying positive or negative reactions to a therapy, we are capable of providing a fine characterization of patients along different dimensions, such as co-morbidities, symptoms, and emotional states.


Author(s):  
Mohammad Andri Budiman ◽  
Dian Rachmawati ◽  
Sari Wardhatul Jannah

Poetics Today ◽  
2021 ◽  
Vol 42 (2) ◽  
pp. 193-206
Author(s):  
Lutz Koepnick

Abstract: Compression is often considered a royal road to process data in ever-shorter time and to cater to our desire to outspeed the accelerating transmission of information in the digital age. This article explores how different techniques of accelerated text dissemination and reading, such as consonant writing, speed-reading apps, and the PDF file format, borrow from the language of compression yet, precisely in so doing, obscure the constitutive multilayered temporality of reading and the embodied role of the reader. While discussing different methods aspiring to compress textual objects and processes of reading, the author illuminates hidden assumptions that accompany the rhetoric of text compression and compressed reading.


2021 ◽  
Vol 15 (01) ◽  
pp. 11-15
Author(s):  
Tariq Abu Hilal ◽  
Hasan Abu Hilal ◽  
Ala Abu Hilal

This paper proposes lossless Turkish text compression that converts characters from UTF-8 to an ANSI encoding to save space, together with a decoding method that transforms the encoded ANSI string back to its original format. Unlike one-byte ANSI characters, some letters of the Turkish alphabet are stored in 2 bytes in UTF-8, and that extra space comes at a price. The developed sequential encoding technique reduces the size of a text file by up to 9%, and the encoded Turkish text regains its original form after decoding, so the method is lossless. Lossless text compression is a common concern today, and many parties have become interested in Unicode compression. In essence, the algorithm maps Unicode Turkish characters to ANSI by using the available 8-bit legacy encoding. For Arabic text compression, a sequential encoding technique was suggested that efficiently converts strings of Arabic characters from UTF-8 to ANSI character coding. The encoding algorithm presented in this paper significantly reduces the file size, and the decoding method transforms the encoded ANSI string back to its original format. Unlike one-byte ANSI characters, Arabic letters are currently stored in 2 bytes, which leads to inefficient use of space. The newly developed sequential encoding technique reduces the required storage space by up to fifty percent. In addition, the proposed technique restores the encoded Arabic text to its original form after decoding, which makes the compression lossless and addresses a common concern with currently available Arabic character compression techniques. In this research, a multistage compression process was implemented for Turkish and Arabic by combining the new encoding technique with the 7-Zip application, which showed a significant file size reduction.
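The abstract above gives no implementation details, so the following is only a minimal sketch of the general idea (re-encoding UTF-8 text in a one-byte legacy code page and decoding it back), not the authors' sequential encoding technique. It assumes that "ANSI" means Windows-1254 for Turkish and Windows-1256 for Arabic, which the abstract does not state; the function names `ansi_savings` and `restore` are hypothetical.

```python
def ansi_savings(text: str, codepage: str) -> tuple[bytes, float]:
    """Re-encode text in a one-byte code page and report the space saved.

    Assumption: the code page (e.g. "cp1254" or "cp1256") stands in for the
    "ANSI" encoding mentioned in the abstract. Characters that the code page
    cannot represent raise UnicodeEncodeError, so the mapping stays lossless
    only for text that fits the code page.
    """
    utf8 = text.encode("utf-8")
    ansi = text.encode(codepage)
    saved = 1 - len(ansi) / len(utf8) if utf8 else 0.0
    return ansi, saved


def restore(ansi: bytes, codepage: str) -> str:
    """Decode the one-byte representation back to the original string."""
    return ansi.decode(codepage)


if __name__ == "__main__":
    for sample, codepage in [("Türkçe", "cp1254"), ("مرحبا", "cp1256")]:
        encoded, saved = ansi_savings(sample, codepage)
        assert restore(encoded, codepage) == sample  # round trip is lossless
        print(f"{codepage}: {len(encoded)} bytes, {saved:.0%} smaller than UTF-8")
```

In this sketch, a string made up entirely of Arabic letters halves in size, because each letter takes 2 bytes in UTF-8 and 1 byte in the code page. Mostly-ASCII Turkish text shrinks far less, since only the non-ASCII letters occupy 2 bytes in UTF-8.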


2021 ◽  
Vol 102 ◽  
pp. 04013
Author(s):  
Md. Atiqur Rahman ◽  
Mohamed Hamada

Modern daily life activities produce large amounts of information, owing to the advancement of telecommunication. Storing this information on a digital device or transmitting it over the Internet is challenging, which creates the need for data compression, and research on data compression has therefore become a topic of great interest. Because compressed data is generally smaller than the original, data compression saves storage and increases transmission speed. In this article, we propose a text compression technique using the GPT-2 language model and Huffman coding. In the proposed method, the Burrows-Wheeler transform and a list of keys are used to reduce the original text file’s length. We then apply the GPT-2 language model followed by Huffman coding for encoding. The proposed method is compared with state-of-the-art text compression techniques and shows a gain in compression ratio over them.
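Two classical building blocks named in the abstract above, the Burrows-Wheeler transform and Huffman coding, can be sketched as follows. This is an illustrative sketch only: it omits the GPT-2 step and the list of keys entirely, the naive rotation-sorting transform is chosen for clarity rather than speed, and nothing here is taken from the authors' implementation. All function names are hypothetical.

```python
import heapq
from collections import Counter


def bwt(text: str, end: str = "\0") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if end in text:
        raise ValueError("end marker must not occur in the text")
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, end: str = "\0") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    row = next(r for r in table if r.endswith(end))
    return row[:-1]


def huffman_codes(data: str) -> dict[str, str]:
    """Build a prefix-free bit code in which frequent symbols get shorter codes."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table for that subtree).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(data: str, codes: dict[str, str]) -> str:
    return "".join(codes[ch] for ch in data)


def huffman_decode(bits: str, codes: dict[str, str]) -> str:
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)
```

The transform does not shrink the text by itself; it groups similar characters so that a later modeling or coding stage has more regularity to exploit. How the paper combines these stages with GPT-2 is not specified in the abstract.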


Author(s):  
Zuchao Li ◽  
Zhuosheng Zhang ◽  
Hai Zhao ◽  
Rui Wang ◽  
Kehai Chen ◽  
...  
