Original Text: Latest Research Papers

Differentially Private Medical Texts Generation Using Generative Neural Networks

ACM Transactions on Computing for Healthcare ◽

10.1145/3469035 ◽

2022 ◽

Vol 3 (1) ◽

pp. 1-27

Author(s):

Md Momin Al Aziz ◽

Tanbir Ahmed ◽

Tasnia Faequa ◽

Xiaoqian Jiang ◽

Yiyu Yao ◽

...

Keyword(s):

Data Science ◽

Medical Information ◽

Healthcare Providers ◽

Well Being ◽

Original Text ◽

Sensitive Information ◽

Classification Problems ◽

Medical Texts ◽

Health Records ◽

Content Type

Technological advancements in data science have given us affordable storage and efficient algorithms for querying large volumes of data. Health records are a significant part of this data; they are pivotal for healthcare providers and can be used to improve our well-being. Clinical notes in electronic health records are one such category: they capture a patient’s complete medical information at different timesteps of care in the form of free text. These unstructured notes thus record events from a patient’s admission to discharge, which can prove significant for future medical decisions. However, because the texts also contain sensitive information about the patient and the attending medical professionals, they cannot be shared publicly. This privacy issue has thwarted timely discoveries from this wealth of untapped information. Therefore, in this work, we aim to generate synthetic medical texts from a private or sanitized (de-identified) clinical text corpus and rigorously analyze their utility across different metrics and levels. Experimental results support the applicability of the generated data, which achieves more than 80% accuracy on different pragmatic classification problems and matches (or outperforms) the original text data.

CATS: Customizable Abstractive Topic-based Summarization

ACM Transactions on Information Systems ◽

10.1145/3464299 ◽

2022 ◽

Vol 40 (1) ◽

pp. 1-24

Author(s):

Seyed Ali Bahrainian ◽

George Zerveas ◽

Fabio Crestani ◽

Carsten Eickhoff

Keyword(s):

Computer Science ◽

State Of The Art ◽

Original Text ◽

Learning Method ◽

Source Text ◽

Resource Setting ◽

Low Resource Setting ◽

Topic Distribution ◽

Latent Topic ◽

Abstractive Summarization

Neural sequence-to-sequence models are the state-of-the-art approach to abstractive summarization of textual documents. They produce condensed versions of source text narratives without being restricted to words from the original text. Despite the advances in abstractive summarization, custom generation of summaries (e.g., tailored to a user’s preference) remains unexplored. In this article, we present CATS, an abstractive neural summarization model that summarizes content in a sequence-to-sequence fashion while also introducing a new mechanism to control the underlying latent topic distribution of the produced summaries. We empirically illustrate the efficacy of our model in producing customized summaries and present findings that facilitate the design of such systems. We use the well-known CNN/DailyMail dataset to evaluate our model. Furthermore, we present a transfer-learning method and demonstrate the effectiveness of our approach in a low-resource setting, namely abstractive summarization of meeting minutes, where combining the main available meeting transcript datasets, AMI and International Computer Science Institute (ICSI), results in merely a few hundred training documents.

Detecting Arabic Spam Reviews in Social Networks Based on Classification Algorithms

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3476115 ◽

2022 ◽

Vol 21 (1) ◽

pp. 1-13

Author(s):

Hassan Najadat ◽

Mohammad A. Alzubaidi ◽

Islam Qarqaz

Keyword(s):

Social Media ◽

Information Gain ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

Detection Accuracy ◽

Original Text ◽

Decision Tree Classifier ◽

Filter Methods ◽

Tree Classifier ◽

Business Entities

Reviews and comments that users leave on social media are of great importance to companies and business entities. New product ideas can be evaluated based on customer reactions. However, this use of social media is complicated by those who post spam in the form of reviews and comments. Designing methods to automatically detect and block social media spam is difficult because spammers continuously develop new ways to leave their spam comments. Researchers have proposed several methods to detect English spam reviews, but few studies have addressed Arabic spam reviews. This article proposes a keyword-based method for detecting Arabic spam reviews. Keywords, or features, are subsets of words from the original text that are labelled as important. Term weights, the Term Frequency–Inverse Document Frequency (TF-IDF) matrix, and filter methods (such as information gain, chi-squared, deviation, correlation, and uncertainty) are used to extract keywords from Arabic text. The proposed method detects Arabic spam in Facebook comments. The dataset consists of 3,000 Arabic comments extracted from Facebook pages. Four machine learning algorithms are used in the detection process: C4.5, kNN, SVM, and Naïve Bayes classifiers. The results show that the Decision Tree classifier outperforms the other classification algorithms, with a detection accuracy of 92.63%.
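As an illustration of the TF-IDF weighting mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common definitions tf = term count / document length and idf = log(N / document frequency); the abstract does not state which variant the article uses. The sample comments are hypothetical English stand-ins, and `tf_idf` and `top_keywords` are illustrative names.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(doc_weights, k=2):
    """Pick the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy comments, already tokenized on whitespace.
docs = [
    "win free prize click link".split(),
    "great product fast delivery".split(),
    "click link for free offer".split(),
]
weights = tf_idf(docs)
```

Terms that occur in only one comment (such as "prize") receive a higher weight than terms shared across comments (such as "click"), which is what makes them candidates for keywords. A filter method such as information gain would then rank these candidates against the spam labels.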

KALMYK FOLK TALE IN A FOREIGN LANGUAGE TRANSLATION (ON THE EXAMPLE OF THE ANALYSIS OF ORIGINAL AND TRANSLATED TEXTS OF FAIRY TALE "AYU CHIKTE AND AVKHA TSETSEN")

SCIENTIFIC REVIEW OF SAYANO-ALTAI ◽

10.52782/kril.2021.2.30.004 ◽

2022 ◽

pp. 17-22

Author(s):

Б. В. Эльбикова

Keyword(s):

Comparative Analysis ◽

Foreign Language ◽

Fairy Tale ◽

Russian Translation ◽

Language Translation ◽

Original Text ◽

Folk Tale ◽

Original Meaning ◽

Maximum Accuracy

The study presents a comparative analysis of the original and translated texts of the Kalmyk folk tale "Ayu Chikt Avkha Tsetsn khoir" ("Ayu Chikte and Avkha Tsetsen") from the repertoire of the storyteller M. Burinov. Comparing the original Kalmyk-language text of the tale (1960) with the Russian translation by M. G. Vatagin (1964), the study notes the nature of the discrepancies and inaccuracies found in the translated narrative in conveying the meaning of individual plot episodes, formulaic expressions, and word combinations that play an important role in fairy-tale narration. The study of a folklore text in its multilingual embodiments is relevant in light of the problems that arise in the interaction of texts from distant cultures. Conveying the national specifics of the fairy-tale tradition requires maximum accuracy in translation, which is important for understanding the original meaning of the source text.

TRANSLATION OF KOREAN-INDONESIAN SHORT STORIES: AN ANALYSIS OF CLASS AND SEMANTIC SHIFTS OF ADVERBS OF MODALITY

LiNGUA Jurnal Ilmu Bahasa dan Sastra ◽

10.18860/ling.v16i2.13139 ◽

2022 ◽

Vol 16 (2) ◽

pp. 271-282

Author(s):

Nur Rosyidah Syahbaniyah ◽

Totok Suhardijanto

Keyword(s):

Short Stories ◽

Short Story ◽

Qualitative Method ◽

Analytical Data ◽

The Other ◽

Original Text ◽

Source Text ◽

Word Classes ◽

Grammar Systems ◽

Bahasa Indonesia

This study discusses class and semantic shifts of adverbs of modality in Korean short stories and their Bahasa Indonesia translations in the short story anthology ‘Langit dan Kupu-Kupu’. It aims to identify how adverbs of modality in the original text change into a different word class in the target text. The data sources were six Korean short stories entitled ‘Dua Generasi yang Teraniaya’, ‘Seoul Musim Dingin 1964’, ‘Jalan ke Sampho’, ‘Bung Kim di Kampung Kami’, ‘Dinihari ke Garis Depan’, and ‘Betulkah? Saya Jerapah’, together with their Indonesian translations. The study used a descriptive qualitative method, and a linguistic corpus design was used to collect analytical data. The analysis found that, of 46 adverbs of modality, four translated adverbs remained classified as adverbs, ten changed their class into pronouns, nouns, particles, adjectives, or verbs, and the other 32 were rendered as a combination of adverbs and other word classes. Furthermore, of the 290 adverbs in the source text, 143 were accurately translated, 100 were deleted, and 47 changed their meaning in the target text. In the translation of Korean-Indonesian short stories, the shifting technique is used to adjust for differences between the Korean and Indonesian grammar systems. Translators also shift the meaning of words in the short stories, as long as they do not deviate from the context and message of the source text, to produce a natural translation that target-language readers can easily understand.

Edition of the original text and its annotations

10.2307/j.ctv24tr7hc.17 ◽

2022 ◽

pp. 181-230

Keyword(s):

Original Text

Rhythm-forming, functional and pragmatic potential and immanence of repetitions in the modern Spanish language

10.24833/2410-2423-2021-5-29-24-35 ◽

2022 ◽

Vol 7 (5) ◽

pp. 24-35

Author(s):

E. S. Goncharenko

Keyword(s):

Comparative Analysis ◽

Spanish Language ◽

The Other ◽

Literary Texts ◽

Original Text ◽

Frequent Use ◽

The People ◽

Pragmatic Function ◽

Analysis And Synthesis

This article presents the results of an investigation of repetitions in the modern Spanish language. To understand the role of repetitions in a given text, it is first necessary to determine whether they are immanent in the language or culture, and therefore unmarked, or, on the contrary, carry some charge: stylistic, rhythmic, or pragmatic. This differentiation is carried out through analysis and synthesis of the theoretical material (A. Alonso, E. A. Llorach, J. Nogeira, V. Iovenko, V. Vinogradov, etc.) and through contrastive and comparative analysis. The results show the redundancy of the Spanish language in comparison with Russian, which accounts for the numerous unmarked repetitions in Spanish. On the other hand, the frequent use of repetitions as a stylistic, semantic, or rhythmic device becomes evident too. For the analysis, we chose some official documents, characterized by the absence of stylistic devices, and some appellative and literary texts (poetry by A. Carvajal, a novel by S. Puertolas, etc.), which are a priori aimed at form and pragmatic effect. This approach helps achieve the most objective conclusions concerning the nature of the repetitions in a text. We considered lexical and grammar repetitions, grammar, semantic and concept repetitions. Phonetic and lexical repetitions, as the basic stylistic devices, have not been subjected to analysis, as their markedness is evident. The results of the research may be useful both for people studying the Spanish language, in order to speak it correctly and to understand the pragmatic function of repetition, and for translators, who must decide whether to follow the structure and rhythm of the text when repetitions in the original text are marked, or to omit them when they are immanent in the language and the culture.

Automatic Text Summarization by Providing Coverage, Non-Redundancy, and Novelty Using Sentence Graph

Journal of Information Technology Research ◽

10.4018/jitr.2022010108 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-18

Author(s):

Krishnaveni P. ◽

Balasundaram S. R.

Keyword(s):

Graph Algorithms ◽

Maximal Clique ◽

Text Summarization ◽

Original Text ◽

Online Information ◽

Automatic Text Summarization ◽

Global Properties ◽

Input Text ◽

Local Properties ◽

Automatic Text

The day-to-day growth of online information necessitates intensive research in automatic text summarization (ATS). ATS software produces a summary by extracting important information from the original text. With the help of summaries, users can easily read and understand the documents of interest. Most ATS approaches use only local properties of the text. Moreover, the large number of properties makes sentence selection difficult and complicated. This article therefore uses graph-based summarization to exploit the structural and global properties of the text. It introduces the maximal clique based sentence selection (MCBSS) algorithm, which selects important and non-redundant sentences that cover all concepts of the input text for the summary. The MCBSS algorithm finds novel information using maximal cliques (MCs). Experimental results using Recall-Oriented Understudy for Gisting Evaluation (ROUGE) on the Timeline dataset show that the proposed work outperforms the existing graph algorithms Bushy Path (BP), Aggregate Similarity (AS), and TextRank (TR).
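To make the idea of maximal cliques in a sentence graph concrete, here is a minimal sketch. It is not the MCBSS algorithm itself, whose details the abstract does not give. It assumes sentences are linked when their Jaccard word overlap reaches a hypothetical threshold, and it enumerates maximal cliques with the basic Bron–Kerbosch procedure. The function names `jaccard`, `build_graph`, and `maximal_cliques` are illustrative.

```python
def jaccard(a, b):
    """Word-overlap similarity between two token lists."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(sentences, threshold=0.2):
    """Link sentence indices whose similarity reaches the (assumed) threshold."""
    tokens = [s.lower().split() for s in sentences]
    adj = {i: set() for i in range(len(tokens))}
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens)):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy input: two overlapping sentences and one unrelated one.
sentences = ["the cat sat", "the cat ran", "dogs bark loudly"]
graph = build_graph(sentences)
cliques = maximal_cliques(graph)
```

Each maximal clique groups sentences that are all mutually similar, so one plausible use is to take a single representative sentence per clique: this covers each group of related content while avoiding redundancy. How the article scores and selects sentences within cliques is not stated in the abstract.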

Appendix I. Extracts from the original text, Le Roman de Waldef

Waldef ◽

10.1515/9781641894074-006 ◽

2021 ◽

pp. 231-238

Keyword(s):

Original Text
