TACIT: An open-source text analysis, crawling, and interpretation tool

2016 ◽  
Vol 49 (2) ◽  
pp. 538-547 ◽  
Author(s):  
Morteza Dehghani ◽  
Kate M. Johnson ◽  
Justin Garten ◽  
Reihane Boghrati ◽  
Joe Hoover ◽  
...  
2015 ◽  
Author(s):  
Morteza Dehghani ◽  
Kate M. Johnson ◽  
Justin Garten ◽  
Vijayan Balasubramanian ◽  
Anurag Singh ◽  
...  

2016 ◽  
Author(s):  
John Andersson ◽  
Sebastian Berlin ◽  
André Costa ◽  
Harald Berthelsen ◽  
Hanna Lindgren ◽  
...  

2013 ◽  
Vol 9 (2) ◽  
pp. e1002854 ◽  
Author(s):  
Hamish Cunningham ◽  
Valentin Tablan ◽  
Angus Roberts ◽  
Kalina Bontcheva

Author(s):  
Igor Tyumentsev ◽  
Alexander Kleitman

Introduction. The memoirs of I.A. Makhanov, who in the 1930s was the chief designer of artillery weapons at the Kirov plant, contain unique data on the development of military-technical thought and the defense sector of Soviet industry in the pre-war period. The published fragment of the memoirs, introduced into scientific circulation for the first time, supplements and corrects the ideas formed in historiography about the military-technical cooperation of the USSR and Czechoslovakia on the eve of World War II. Methods and materials. The source text was prepared for publication in accordance with the modern requirements of archaeography. The published fragment is provided with archaeographic notes that make it possible to reconstruct the history of the author's creation and modification of the text. The scientific commentary provides information about the persons, place names and specific terms mentioned in the text. Analysis. The author pointed out that, despite the supply of the latest weapons from Czechoslovakia to Yugoslavia, Italy, Turkey and Latin America, the share of purchases by the USSR was 50% and had broad prospects for growth. The German occupation of 1938 suspended and then interrupted military-technical cooperation between the countries. Nevertheless, the Czech side fulfilled all its obligations to the USSR. Results. As the published fragment of I.A. Makhanov's memoirs shows, in the 1930s Czech specialists willingly acquainted the Soviet delegation with the latest developments in artillery systems. At the same time, after the occupation of Czechoslovakia by Germany, none of these weapons was brought to the prototype stage. The “Skoda” and “Zbroevka” plants were engaged only in the production and modernization of old weapons. Thus, the data of I.A. Makhanov confirm the hypothesis that Czech designers led by V. Gromadko sabotaged work for Nazi Germany.


In data mining, short-text analysis is widely used across many applications. Because of the syntax of natural language, short texts are very difficult to analyze with traditional natural language processing tools, which often cannot be applied to them correctly. Short texts offer sparse and insufficient data, and their noise and ambiguity make it difficult to identify semantic knowledge. In this paper, the authors propose replacing the cosine similarity coefficient with the Jaro-Winkler similarity measure to score the similarity between pairs of texts (source text and target text). Jaro-Winkler does a better job of determining the similarity of strings because it takes order into account, using positional indices to estimate relevance. It is presumed that CACT driven by Jaro-Winkler offers optimized performance on one-to-many data links compared with CACT driven by cosine similarity. The ensemble algorithm CACTS and SAE is adopted with the Jaro-Winkler similarity approach. The new algorithm is employed for short-text analysis and yields better results. An evaluation of the proposed concept serves as validation.
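The contrast between the two measures can be illustrated with a minimal sketch. It assumes a textbook Jaro-Winkler (prefix scale 0.1, prefix capped at 4 characters) and a plain bag-of-words cosine; it is not the authors' CACT or CACTS implementation, and the function names are chosen here for illustration only.

```python
import math
from collections import Counter


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: matches within a sliding window, penalised by transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix.

    Assumption: the boost is always applied (no 0.7 threshold).
    """
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def cosine_bow(s1: str, s2: str) -> float:
    """Cosine similarity over whitespace-token counts; ignores word order."""
    a, b = Counter(s1.split()), Counter(s2.split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    x, y = "dog bites man", "man bites dog"
    print("cosine      :", round(cosine_bow(x, y), 4))
    print("jaro-winkler:", round(jaro_winkler(x, y), 4))
```

Because the bag-of-words cosine discards position, the two reordered sentences are indistinguishable to it, while the Jaro-Winkler score drops below 1 for the same pair.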


Author(s):  
A. A. Goncharov ◽  
O. Yu. Inkova

One of the main characteristics of logical-semantic relations (LSRs) between two fragments of a text is that these relations can be either explicit (expressed by some marker, e.g. a connective) or implicit (derived from the interrelation of these fragments’ semantics). Since implicit LSRs do not have any marker, it is difficult to find them in a text (whether automatically or not). In this paper, approaches to analysing implicit LSRs are compared, an original definition of them is offered, and the differences between implicit LSRs and LSRs expressed by non-prototypical means are described. A method is proposed for identifying implicit LSRs using a parallel corpus and a supracorpora database of connectives. Based on the well-known statement that LSRs can be made explicit by adding connectives in the translation, it is argued here that, by selecting pairs in which fragments where a connective is used to express an LSR in the translation correspond to those containing any of the translation stimuli standard for this connective in the source language, it is possible to obtain an array of contexts in which this LSR is implicit in the source text (or expressed by means other than connectives). This method is then applied to study the French causal connectives car, parce que and puisque using a Russian-French parallel corpus. The corpus data are analysed to obtain information about LSRs, particularly about cases where the causal LSR in Russian is implicit, as well as about the use of causal connectives in French. These results are used to show that the proposed method makes it possible to quickly create a representative array of contexts with implicit LSRs, which can be useful both in text analysis and in machine learning.
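One possible reading of the filtering step can be sketched as follows. The sketch assumes the parallel corpus is available as aligned (Russian source, French translation) string pairs, and that the pairs whose source contains a standard stimulus are set aside so the remainder are candidates for implicit LSRs. The stimulus list, the function names and the example sentences are illustrative assumptions, not data from the paper or its supracorpora database.

```python
import re
from typing import Iterable, List, Tuple

# French causal connectives named in the abstract.
FRENCH_CONNECTIVES = ["car", "parce que", "puisque"]
# Hypothetical list of standard Russian translation stimuli (assumption).
RUSSIAN_STIMULI = ["потому что", "так как", "поскольку", "ибо"]


def contains_any(text: str, phrases: Iterable[str]) -> bool:
    """True if any phrase occurs in text as a whole word or word sequence."""
    return any(
        re.search(r"\b" + re.escape(p) + r"\b", text, flags=re.IGNORECASE)
        for p in phrases
    )


def implicit_lsr_candidates(
    pairs: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Keep pairs whose translation has a connective but whose source has no stimulus."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if contains_any(tgt, FRENCH_CONNECTIVES)
        and not contains_any(src, RUSSIAN_STIMULI)
    ]


if __name__ == "__main__":
    # Invented example pairs, for illustration only.
    corpus = [
        ("Он остался дома, потому что шёл дождь.", "Il est resté chez lui, car il pleuvait."),
        ("Он остался дома: шёл дождь.", "Il est resté chez lui, car il pleuvait."),
        ("Шёл дождь.", "Il pleuvait."),
    ]
    for src, tgt in implicit_lsr_candidates(corpus):
        print(src, "->", tgt)
```

Whole-word matching is a simplification: elided forms such as "parce qu'il" or "puisqu'elle" would need extra patterns, and the candidates returned may still express the relation by means other than connectives, so they would require manual review.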

