Rekonstrukcja i rewitalizacja zagrożonych wymarciem języków z wykorzystaniem narzędzi lingwistyki komputerowej

2021 ◽  
Author(s):  
Mirosław Gajer ◽  
Zbigniew Handzel

Reconstruction and Revitalisation of Endangered Languages Using Computational Linguistics Tools. The monograph consists of an introduction, six chapters and a conclusion. The first chapter briefly discusses the current linguistic situation, paying particular attention to languages with few speakers that are threatened with extinction, and analyses the reasons such languages disappear. Chapters two, three and four review the current linguistic situation in western Europe. Chapters two and three briefly discuss selected endangered languages of this area that belong to the Germanic and Romance groups, respectively. The fourth chapter discusses endangered European languages that belong to other groups of the Indo-European family, with particular emphasis on the Celtic group. The fifth chapter describes a syntactic structure generator for Norwegian developed by the authors. Software of this kind may serve as a model for building analogous systems for other endangered languages. Norwegian exists in two official variants, and one of them, New Norwegian (Nynorsk), is currently perceived as potentially threatened with extinction. The sixth chapter presents prospects for further development of the authors' system, in particular its possible evolution into Machine-Aided Human Translation software that would support the translation of texts into endangered languages.
Thanks to machine learning techniques, especially deep artificial neural networks, the general problem of machine translation has already been solved in a largely satisfactory manner, an outcome that not so long ago seemed like a scenario from a science fiction novel.

2017 ◽  
Vol Special Issue on... (Project presentations) ◽  
Author(s):  
Pramit Chaudhuri ◽  
Joseph P. Dexter

This paper describes the Quantitative Criticism Lab, a collaborative initiative between classicists, quantitative biologists, and computer scientists to apply ideas and methods drawn from the sciences to the study of literature. A core goal of the project is the use of computational biology, natural language processing, and machine learning techniques to investigate authorial style, intertextuality, and related phenomena of literary significance. As a case study in our approach, here we review the use of sequence alignment, a common technique in genomics and computational linguistics, to detect intertextuality in Latin literature. Sequence alignment is distinguished by its ability to find inexact verbal similarities, which makes it ideal for identifying phonetic echoes in large corpora of Latin texts. Although especially suited to Latin, sequence alignment in principle can be extended to many other languages.
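
The inexact matching that the abstract attributes to sequence alignment can be illustrated with a minimal sketch. The abstract does not say which alignment algorithm the project uses; the code below assumes Smith-Waterman local alignment over characters, and the function name `local_alignment`, the scoring values (`match`, `mismatch`, `gap`) and the sample strings are illustrative choices, not taken from the paper.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment of two strings.

    Returns (best_score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not parameters from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming table; scores never drop below zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Two spellings of the same phrase, differing in one character.
    print(local_alignment("arma virumque", "arma uirumque"))
```

Because a mismatch costs less than abandoning the alignment, the two phrases still align end to end despite the differing character, which is the kind of inexact verbal similarity the abstract describes.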


2012 ◽  
Vol 8 ◽  
Author(s):  
Fadi Abu Sheikha ◽  
Diana Inkpen

This paper discusses an important issue in computational linguistics: classifying texts as formal or informal in style. Our work describes a genre-independent methodology for building classifiers for formal and informal texts. We used machine learning techniques to perform the automatic classification, and ran the classification experiments at both the document level and the sentence level. First, we studied the main characteristics of each style in order to train a system that can distinguish between them. We then built two datasets: the first represents general-domain documents of formal and informal style, and the second represents medical texts. We tested on the second dataset at the document level to determine whether our model is sufficiently general to work on any type of text. The datasets were built by collecting documents for both styles from different sources. After collecting the data, we extracted features from each text. The features that we designed represent the main characteristics of both styles. Finally, we tested several classification algorithms, namely Decision Trees, Naïve Bayes, and Support Vector Machines, in order to choose the classifier that generates the best classification results.
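
One of the classifiers named in the abstract, Naïve Bayes, can be sketched in a few lines. This is a minimal illustration under stated assumptions: it uses plain bag-of-words counts with add-one smoothing instead of the hand-designed style features the authors describe, and the class name `NaiveBayesStyleClassifier`, the `tokenize` helper and the four training sentences are hypothetical, invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased word tokens; a stand-in for the paper's designed features.
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesStyleClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    logp += math.log((counts[word] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical toy training data, invented for this example.
TRAIN_TEXTS = [
    "We would like to inform you that the meeting has been postponed.",
    "Please find the attached report for your consideration.",
    "hey gonna be late lol see ya",
    "omg that's so cool wanna hang out",
]
TRAIN_LABELS = ["formal", "formal", "informal", "informal"]

if __name__ == "__main__":
    clf = NaiveBayesStyleClassifier().fit(TRAIN_TEXTS, TRAIN_LABELS)
    print(clf.predict("Please inform the committee that the report has been attached"))
    print(clf.predict("lol gonna see ya"))
```

A realistic replication would replace `tokenize` with the style features the paper designs and would train on the collected datasets; the sketch only shows the shape of the classification step.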


Author(s):  
Shuchita Mudgil ◽  
Prof Ashok Verma

Sentiment analysis is used to determine a consumer's attitude toward a topic. Sentiment analysis, a subdiscipline of data mining and computational linguistics, refers to methods for mining and understanding the opinions that consumers express in venues such as forums and blogs. The goal of sentiment analysis is to identify emotional states in online text. Humans learn from past experience, whereas machines follow instructions given by humans. Machine learning lets humans train machines on past data so that they produce output much faster; it is not only about learning but also about understanding. This paper examines sentiment analysis using machine learning techniques.


Author(s):  
Cheryl A. Bolstad ◽  
Peter Foltz ◽  
Marita Franzke ◽  
Haydee M. Cuevas ◽  
Mark Rosenstein ◽  
...  

Given the importance of Situation Awareness (SA) in military operations, there is a critical need for a real-time, unobtrusive tool that objectively and reliably measures warfighters' SA in both training and operations. Just as the requirement for improved access to SA measures has become vital, it is now commonplace for military team communications to be mediated by technology, hence easily captured and available for analysis. We believe that team communications can be used to derive SA measures. To address this issue, we are developing the Automated Communications Analysis of Situation Awareness (ACASA) system. ACASA combines the explanatory capacity of the SA construct with the predictive and computational power of TeamPrints to assess team and shared SA as well as other cognitive processes. TeamPrints is a system that combines computational linguistics and machine learning techniques with Latent Semantic Analysis (LSA) to analyze team communication. In this paper, we present the findings from an exploratory evaluation of how well TeamPrints predicts SA from the team communications arising during a military training exercise.


2006 ◽  
Author(s):  
Christopher Schreiner ◽  
Kari Torkkola ◽  
Mike Gardner ◽  
Keshu Zhang

2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). Treating close price, open price, daily low, daily high, adjusted close price, and trading volume as predictors is shown to improve prediction accuracy.


Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 389-P
Author(s):  
SATORU KODAMA ◽  
MAYUKO H. YAMADA ◽  
YUTA YAGUCHI ◽  
MASARU KITAZAWA ◽  
MASANORI KANEKO ◽  
...  

Author(s):  
Anantvir Singh Romana

Accurate diagnostic detection of a disease in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may yield different accuracies, so it is imperative to use the most suitable method, the one that provides the best results. This research provides a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.

