chinese text
Recently Published Documents


TOTAL DOCUMENTS

874
(FIVE YEARS 224)

H-INDEX

22
(FIVE YEARS 6)

2021 ◽  
Vol 4 (5) ◽  
pp. 1183-1198
Author(s):  
Sergey S. Sidorovich

The Institute of Oriental Manuscripts of the Russian Academy of Sciences holds a xylographed fragment in classical Mongolian script with a handwritten text on the reverse side (call mark G 110 recto), obtained in 1909 during P. K. Kozlov's expedition to Khara-Khoto. The printed text, in classical Mongolian script with several interlinear glosses in Chinese and a page footer (a transcription of the Chinese name of the chapter and the page number), was read by the Soviet Orientalist N. Ts. Munkuyev more than 50 years ago. Munkuyev dated it to the 14th century on the basis of its paleographic peculiarities. Relying on the official history Yuan shi, he also supposed that the text might be a Mongolian translation of the legislative code Da Yuan tong-zhi and suggested two possible versions of the original Chinese name of the chapter, of which the incorrect one was unfortunately chosen. Since Da Yuan tong-zhi has not been preserved in full, and the major part of the monument, including the chapters of interest, is lost, it was impossible to find the text in question, and the mistake in the reconstruction of the chapter name could not be detected either. In 2002, however, a part of the Zhi-zheng tiao-ge code was found in South Korea; this code was promulgated in 1346 and was intended to replace the outdated Da Yuan tong-zhi. In one of his previous articles, the author showed that both codes were built according to a general pattern elaborated as far back as the Tang epoch (618–907). This made it possible to reconstruct the name of the chapter mentioned in the fragment. Fortunately, the surviving part of the Zhi-zheng tiao-ge code contains the required chapters, and the Chinese glosses in the fragment allowed us to find the original Chinese text, which turned out to be a document dated 1303 and, judging by the date, was evidently included in both codes. The article also contains the Chinese text of the document and its annotated translation.


2021 ◽  
Vol 26 (3) ◽  
pp. 529-536
Author(s):  
Alexander G. Kovalenko ◽  
Polina V. Porol

The article considers N. Gumilev's poem The Moon on the Sea from the cycle Porcelain Pavilion. The work offers a new interpretation of the poem and identifies and explains the Chinese realia in Gumilev's poetic text. It reveals the original texts that served as the basis for the poem and considers a version of the poem preserved in the poet's manuscript. The authors' reasoning and conclusions rest on critical research that compares two cultures. The poem is analyzed in its semantic aspect through a search for textual parallels. The interpretation of The Moon on the Sea brings the reader closer to the worldview of Silver Age culture and explains the genesis of the image of China in Gumilev's poetry.


2021 ◽  
Vol 72 (2) ◽  
pp. 590-602
Author(s):  
Kirill I. Semenov ◽  
Armine K. Titizian ◽  
Aleksandra O. Piskunova ◽  
Yulia O. Korotkova ◽  
Alena D. Tsvetkova ◽  
...  

The article tackles the problems of linguistic annotation in the Chinese texts of Ruzhcorp, the Russian-Chinese Parallel Corpus of the RNC, and the ways to solve them. Particular attention is paid to the processing of Russian loanwords. On the one hand, we present a theoretical comparison of the widespread standards of Chinese text processing. On the other hand, we describe our experiments in three fields (word segmentation, grapheme-to-phoneme conversion, and PoS-tagging) on corpus data that contains many transliterations and loanwords. As a result, we propose a preprocessing pipeline for Chinese texts that will be implemented in Ruzhcorp.
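To illustrate why transliterated loanwords complicate word segmentation, here is a minimal sketch. The abstract does not name the segmenters that were tested; the example uses forward maximum matching, a classic dictionary baseline, as a stand-in. The function name `fmm_segment`, the toy lexicon, and the sample sentence are all hypothetical.

```python
def fmm_segment(text, lexicon, max_len=4):
    """Forward maximum matching: at each position, take the longest lexicon word.

    Falls back to a single character when no lexicon word matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; 莫斯科 ("Moscow") is a transliterated loanword.
base_lexicon = {"我", "喜欢"}
sentence = "我喜欢莫斯科"

# Without the loanword in the lexicon, the name is split into single characters.
without_loanword = fmm_segment(sentence, base_lexicon)

# Adding the transliteration to the lexicon keeps it as one token.
with_loanword = fmm_segment(sentence, base_lexicon | {"莫斯科"})
```

In this toy setting the loanword survives as a unit only when the lexicon lists it, which is one reason corpus data rich in transliterations needs dedicated handling.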


2021 ◽  
pp. 1-13
Author(s):  
Jiawen Shi ◽  
Hong Li ◽  
Chiyu Wang ◽  
Zhicheng Pang ◽  
Jiale Zhou

Short text matching is one of the fundamental technologies in natural language processing. In previous studies, most text matching networks were initially designed for English text. The common approach to applying them to Chinese is to segment each sentence into words and then take these words as input. However, this method often results in word segmentation errors. Chinese short text matching also faces the challenges of constructing effective features and understanding the semantic relationship between two sentences. In this work, we propose a novel lexicon-based pseudo-siamese model (CL2N), which can fully mine the information expressed in Chinese text. Instead of using a character sequence or a single word sequence, CL2N augments the text representation with multi-granularity information from characters and lexicons. Additionally, it integrates sentence-level features through single-sentence features as well as interactive features. Experimental studies on two Chinese text matching datasets show that our model performs better than the state-of-the-art short text matching models and that the proposed method can solve the error propagation problem of Chinese word segmentation. In particular, the incorporation of single-sentence features and interactive features allows the network to capture contextual semantics and co-attentive lexical information, which contributes to our best result.
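The idea of combining characters with lexicon words, rather than committing to one segmentation, can be sketched as follows. This is not the CL2N network; it only shows, under stated assumptions, how every lexicon word found in a sentence can be kept alongside its characters. The function names, the toy lexicon, and the Jaccard overlap used as a stand-in similarity are all hypothetical.

```python
def lexicon_matches(sentence, lexicon, max_len=4):
    """Return (start, end, word) for every multi-character lexicon word in the sentence."""
    matches = []
    for start in range(len(sentence)):
        # Candidate words span 2..max_len characters from this start position.
        for end in range(start + 2, min(start + max_len, len(sentence)) + 1):
            word = sentence[start:end]
            if word in lexicon:
                matches.append((start, end, word))
    return matches


def multi_granularity_units(sentence, lexicon):
    """Characters plus all matched lexicon words; no single segmentation is chosen."""
    chars = set(sentence)
    words = {word for _, _, word in lexicon_matches(sentence, lexicon)}
    return chars | words


def jaccard(a, b):
    """Set overlap, used here only as a simple stand-in for a learned matcher."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical lexicon; the first sentence is a well-known segmentation ambiguity.
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
s1 = "南京市长江大桥"
s2 = "长江大桥在南京"

matched_words = [word for _, _, word in lexicon_matches(s1, lexicon)]
score = jaccard(multi_granularity_units(s1, lexicon),
                multi_granularity_units(s2, lexicon))
```

Because overlapping candidates such as 南京市 and 市长 are both retained, a wrong choice by an upstream segmenter cannot propagate into the representation.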


2021 ◽  
Vol 11 (11) ◽  
pp. 1428-1433
Author(s):  
Shifang Li

This paper analyzes the Thematic Progression (TP) pattern and its role in text translation, offering a new perspective for current translation teaching. Before translation, the TP pattern can serve as a reference that helps the translator avoid mistakes in conveying information; after translation, it can serve as a means of testing the cohesion and coherence of the result. To preserve the style of the original text, the translation needs to maintain the same TP pattern as the source text, which is the starting point of this research. The article therefore proposes that, in English-Chinese translation, the TP pattern of the source text should be maintained as far as possible in order to retain the original style. All of this, however, rests on the premise that the meaning of the original text is not misunderstood.


2021 ◽  
Vol 2078 (1) ◽  
pp. 012021
Author(s):  
Hongyang Zhao ◽  
Qiang Xie

When extracting keywords from massive educational resources, the traditional graph-model method considers only statistical features or general semantic features and does not mine or use multi-factor semantic features. To address this, the paper proposes an improved TextRank-based algorithm for keyword extraction from educational resources. Taking into account the characteristics of Chinese text and the shortcomings of the traditional TextRank algorithm, the improved algorithm fuses multiple features: the importance of words in the corpus, their location in the text, and the attributes of the words. Experimental results show that, for keyword extraction from educational resources, this method achieves higher accuracy, recall, and F-measure than traditional algorithms, which improves the quality of keyword extraction and supports better use and management of educational resources.
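A rough sketch of how several word-level features could be fused into TextRank is given below. The abstract does not give the fusion formula, so this example makes an assumption: the three features are multiplied into a per-word prior, which then replaces the uniform teleport term of standard TextRank. All names (`fused_prior`, `weighted_textrank`), feature values, and the sample tokens are hypothetical.

```python
from collections import defaultdict


def fused_prior(corpus_weight, position_weight, attribute_weight):
    """Assumed fusion rule: multiply the three feature weights for each word."""
    return {
        word: corpus_weight[word]
        * position_weight.get(word, 1.0)
        * attribute_weight.get(word, 1.0)
        for word in corpus_weight
    }


def weighted_textrank(tokens, prior, window=3, damping=0.85, iterations=50):
    """TextRank over a co-occurrence graph with a non-uniform teleport prior."""
    edge = defaultdict(float)
    neighbors = defaultdict(set)
    # Link each token to the tokens that follow it inside the sliding window.
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            v = tokens[j]
            if v != w:
                edge[(w, v)] += 1.0
                edge[(v, w)] += 1.0
                neighbors[w].add(v)
                neighbors[v].add(w)

    vocab = sorted(set(tokens))
    total = sum(prior.get(w, 1.0) for w in vocab)
    teleport = {w: prior.get(w, 1.0) / total for w in vocab}
    score = dict(teleport)

    for _ in range(iterations):
        updated = {}
        for w in vocab:
            rank = 0.0
            for v in neighbors[w]:
                out_weight = sum(edge[(v, u)] for u in neighbors[v])
                rank += edge[(v, w)] / out_weight * score[v]
            updated[w] = (1 - damping) * teleport[w] + damping * rank
        score = updated
    return score


# Hypothetical pre-segmented tokens from an educational resource description.
tokens = ["教育", "资源", "关键词", "提取", "算法", "教育", "资源", "管理"]
uniform = {w: 1.0 for w in set(tokens)}

# Illustrative feature values only: boost 算法 as if it appeared in a title.
boosted_prior = fused_prior(uniform, {"算法": 5.0}, {})

plain_scores = weighted_textrank(tokens, uniform)
boosted_scores = weighted_textrank(tokens, boosted_prior)
keywords = sorted(boosted_scores, key=boosted_scores.get, reverse=True)
```

With a uniform prior this reduces to plain TextRank on a weighted co-occurrence graph; raising one word's prior shifts score toward it while the scores still sum to one on a connected graph.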

