Research on Sustainable Mining Engineering

2013 ◽  
Vol 340 ◽  
pp. 126-130 ◽  
Author(s):  
Xiao Guang Yue ◽  
Guang Zhang ◽  
Qing Guo Ren ◽  
Wen Cheng Liao ◽  
Jing Xi Chen ◽  
...  

The concepts of Chinese information processing and natural language processing (NLP) and their development trends are summarized. Chinese information processing and natural language processing are understood differently in China and in other countries, but the work appears to converge on the key problems of language processing. Mining engineering is very important for our country. Although the ultimate goal of language processing is difficult to achieve, Chinese information processing has contributed substantially to our scientific research and social economy, and it will play an important part in mining engineering in the future.

2013 ◽  
Vol 427-429 ◽  
pp. 2568-2571
Author(s):  
Shu Xian Liu ◽  
Xiao Hua Li

This article first gives a brief introduction to Natural Language Processing and the basics of Chinese Word Segmentation. Chinese Word Segmentation is the process of turning a sequence of Chinese characters into a sequence of Chinese words according to some rules. As a fundamental component of Chinese information processing, it is widely used in related areas. Accordingly, research on Chinese Word Segmentation has important theoretical and practical significance. In this paper, we mainly introduce the challenges in Chinese Word Segmentation and present the categories of Chinese Word Segmentation methods.
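As an illustration of rule-based segmentation, here is a minimal sketch of forward maximum matching, one common dictionary-based approach. The abstract does not say which methods the paper covers in detail, so the choice of algorithm, the function name `forward_max_match`, and the toy lexicons are assumptions made for illustration only.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicons, not taken from the paper.
LEXICON = {"自然", "语言", "自然语言", "处理", "中文", "分词"}
AMBIGUOUS_LEXICON = {"研究", "研究生", "生命", "起源"}

segmented = forward_max_match("中文自然语言处理", LEXICON)
ambiguous = forward_max_match("研究生命起源", AMBIGUOUS_LEXICON)
```

With these toy inputs, `segmented` is `['中文', '自然语言', '处理']`, while `ambiguous` is `['研究生', '命', '起源']` rather than the intended 研究 / 生命 / 起源. The second case shows the kind of segmentation ambiguity that makes the task challenging for purely greedy rules.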


2011 ◽  
Vol 48 (4) ◽  
Author(s):  
Louise-Amélie Cougnon ◽  
Thomas François

This paper details an international project called sms4science that aims to collect text message corpora (hereafter referred to as "SMS corpora") from across the globe for scientific research. The project already has ten participating regions, including Belgium, Réunion, Switzerland and Quebec. This article first presents the initial corpora collected from these four areas (resulting in a combined total of 116'000 text messages) and the accompanying methodology. It then sets out the research possibilities these corpora open up: the corpus-based studies pertain as much to linguistics and sociolinguistics as they do to natural language processing and statistics. A specific statistical study is presented here, and its possible conclusions outline the differences in SMS practices between regions, notably in abbreviation rate and message length. Finally, the paper delineates the project obstacles and correspondingly proposes fresh perspectives for the ongoing year (2011).


2020 ◽  
Author(s):  
Masashi Sugiyama

Recently, word embeddings have been used successfully in many natural language processing problems, and how to train a robust and accurate word embedding system efficiently is a popular research area. Since many, if not all, words have more than one sense, it is necessary to learn a separate vector for each sense of a word. Therefore, in this project, we explore two multi-sense word embedding models: the Multi-Sense Skip-gram (MSSG) model and the Non-parametric Multi-Sense Skip-gram (NP-MSSG) model. Furthermore, we propose an extension of the Multi-Sense Skip-gram model, called the Incremental Multi-Sense Skip-gram (IMSSG) model, which can learn the vectors of all senses of each word incrementally. We evaluate all the systems on a word similarity task and show that IMSSG performs better than the other models.
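The abstract does not spell out how a sense is chosen for a word occurrence. As a rough sketch, assuming the usual description of multi-sense skip-gram models in which each sense keeps a context cluster center and the occurrence is assigned to the most similar center, the selection step might look like the following. The names `cosine` and `pick_sense`, the `new_sense_threshold` parameter, and the toy vectors are hypothetical; nothing here reproduces the IMSSG extension itself.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pick_sense(context_vec, centers, new_sense_threshold=None):
    """Return the index of the sense whose cluster center best matches the context.

    With new_sense_threshold=None the number of senses is fixed (assumed
    MSSG-style behaviour). With a threshold, an index equal to len(centers)
    signals that a new sense should be created (assumed NP-MSSG-style behaviour).
    """
    sims = [cosine(context_vec, c) for c in centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    if new_sense_threshold is not None and sims[best] < new_sense_threshold:
        return len(centers)
    return best


# Toy cluster centers for two senses of one word (illustrative values only).
centers = [[1.0, 0.0], [0.0, 1.0]]
sense_a = pick_sense([0.9, 0.1], centers)
sense_b = pick_sense([0.1, 0.9], centers)
sense_new = pick_sense([-1.0, -1.0], centers, new_sense_threshold=0.0)
```

In a full model the chosen sense vector would then be updated by the skip-gram objective and the cluster center moved toward the context vector; those steps are omitted here.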


Author(s):  
Davide Picca ◽  
Dominique Jaccard ◽  
Gérald Eberlé

In recent decades, Natural Language Processing (NLP) has achieved a high level of success. Interactions between NLP and Serious Games have begun, and some Serious Games already include NLP techniques. The objectives of this paper are twofold: on the one hand, to provide a simple framework for analysing potential uses of NLP in Serious Games and, on the other hand, to apply this framework to existing Serious Games and give an overview of the use of NLP in pedagogical Serious Games. In this paper we present 11 Serious Games exploiting NLP techniques. We present them systematically, according to the following structure: first, we highlight possible uses of NLP techniques in Serious Games; second, we describe the type of NLP implemented in each specific Serious Game; and third, we provide a link to possible purposes of use for the different actors interacting in the Serious Game.


2013 ◽  
Vol 274 ◽  
pp. 359-362
Author(s):  
Shuang Zhang ◽  
Shi Xiong Zhang

Shallow parsing is a language processing strategy that has emerged in natural language processing in recent years. It does not aim to obtain a full parse tree; it requires only the recognition of some structurally simple constituents. It separates parsing into two subtasks: the recognition and analysis of chunks, and the analysis of the relationships among chunks. In this essay, some applied techniques of shallow parsing are introduced and a new method is tested experimentally.
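To make the first subtask concrete, the following is a minimal sketch of chunk recognition over part-of-speech-tagged input. It is not the method proposed in the paper: the pattern (an optional determiner, any adjectives, then one or more nouns), the Penn Treebank-style tags, and the function name `chunk_noun_phrases` are assumptions chosen for illustration. The second subtask, analysing relationships among chunks, is not covered.

```python
def chunk_noun_phrases(tagged):
    """Group (word, tag) pairs into flat chunks.

    A noun-phrase chunk ("NP") is an optional DT, any number of JJ,
    then one or more tags starting with NN. Every other token becomes
    its own "O" (outside) chunk.
    """
    chunks = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        if tagged[j][1] == "DT":
            j += 1
        while j < n and tagged[j][1] == "JJ":
            j += 1
        k = j
        while k < n and tagged[k][1].startswith("NN"):
            k += 1
        if k > j:
            # At least one noun was found, so tokens i..k-1 form an NP chunk.
            chunks.append(("NP", [word for word, _ in tagged[i:k]]))
            i = k
        else:
            chunks.append(("O", [tagged[i][0]]))
            i += 1
    return chunks


# Hand-tagged toy sentence (illustrative only).
sentence = [
    ("the", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumped", "VBD"),
    ("over", "IN"), ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
chunks = chunk_noun_phrases(sentence)
```

For the toy sentence this yields two NP chunks, "the quick fox" and "the lazy dog", with "jumped" and "over" left outside any chunk, and no full parse tree is ever built.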


2018 ◽  
Vol 7 (3.12) ◽  
pp. 674
Author(s):  
P Santhi Priya ◽  
T Venkateswara Rao

Sentiment analysis is also known as opinion mining. It is one of the primary tasks in Natural Language Processing (NLP), and opinion mining has attracted a lot of attention lately. In our research we take up a central and influential problem of opinion mining, Sentiment Polarity Categorization (SPC). We propose a methodology for SPC and explain it in detail. Beyond the theory, computations are carried out for both review-level and sentence-level categorization, with promising results. The data used here come from product reviews on the shopping site Amazon.


2019 ◽  
Author(s):  
William Jin

Recently, word embeddings have been used successfully in many natural language processing problems, and how to train a robust and accurate word embedding system efficiently is a popular research area. Since many, if not all, words have more than one sense, it is necessary to learn a separate vector for each sense of a word. Therefore, in this project, we explore two multi-sense word embedding models: the Multi-Sense Skip-gram (MSSG) model and the Non-parametric Multi-Sense Skip-gram (NP-MSSG) model. Furthermore, we propose an extension of the Multi-Sense Skip-gram model, called the Incremental Multi-Sense Skip-gram (IMSSG) model, which can learn the vectors of all senses of each word incrementally. We evaluate all the systems on a word similarity task and show that IMSSG performs better than the other models.


Author(s):  
Fazel Keshtkar ◽  
Ledong Shi ◽  
Syed Ahmad Chan Bukhari

Finding our favorite dishes has become a hard task since restaurants are providing more choices and varieties. On the other hand, comments and reviews of restaurants are a good place to look for the answer. The purpose of this study is to use computational linguistics and natural language processing to categorise dishes and find semantic relations among them based on reviewers' comments and menu descriptions. Our goal is to implement state-of-the-art computational linguistics methods such as word embedding models (word2vec), topic modeling, PCA, and classification algorithms. For visualization, t-Distributed Stochastic Neighbor Embedding (t-SNE) was used to explore the relations between dishes and their reviews. We also aim to extract the common patterns between different dishes across restaurants and review comments and, in reverse, to explore the dishes with semantic relations. A dataset of articles related to restaurants, with the dishes located within the articles, was used to find comment patterns. Then we applied t-SNE visualizations to identify the root of each feature of the dishes. As a result, our model is able to help users find a dish from a few words of description and their interests. Our dataset contains 1,000 articles from a food review agency on a variety of dishes from different cultures: American, e.g. 'steak', 'hamburger'; Chinese, e.g. 'stir fry', 'dumplings'; Japanese, e.g. 'sushi'.


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 181 ◽  
Author(s):  
Pablo Gamallo ◽  
José Ramom Pichel ◽  
Iñaki Alegria

Phylogenetics is a sub-field of historical linguistics whose aim is to classify a group of languages by considering their distances within a rooted tree that stands for their historical evolution. A few European languages do not belong to the Indo-European family or are otherwise isolated in the European rooted tree. Although it is not possible to establish phylogenetic links using basic strategies, it is possible to calculate the distances between these isolated languages and the rest using simple corpus-based techniques and natural language processing methods. The objective of this article is to select some isolated languages and measure the distance between them and from the other European languages, so as to shed light on the linguistic distances and proximities of these controversial languages without considering phylogenetic issues. The experiments were carried out with 40 European languages including six languages that are isolated in their corresponding families: Albanian, Armenian, Basque, Georgian, Greek, and Hungarian.
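The abstract speaks only of "simple corpus-based techniques" and does not name the distance measure used. As one hedged illustration of the general idea, the sketch below compares character trigram frequency profiles of two texts with cosine distance. This measure, the function names `trigram_profile` and `cosine_distance`, and the sample strings are assumptions, not the authors' method.

```python
from collections import Counter
from math import sqrt


def trigram_profile(text):
    """Count character trigrams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def cosine_distance(p, q):
    """1 minus the cosine similarity of two trigram profiles (1.0 if either is empty)."""
    dot = sum(p[g] * q[g] for g in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / norm if norm else 1.0


# Tiny placeholder strings; a real comparison would use a large corpus per language.
profile_a = trigram_profile("the house is small and the garden is large")
profile_b = trigram_profile("the house is old and the garden is small")
profile_c = trigram_profile("xyzxyz qwqwqw")

near = cosine_distance(profile_a, profile_b)
far = cosine_distance(profile_a, profile_c)
```

Computing such a distance for every pair of languages gives a distance matrix that can be inspected directly, without building a rooted tree, which matches the abstract's aim of measuring proximity without addressing phylogenetic questions.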


1996 ◽  
Vol 1 (1) ◽  
pp. 99-119 ◽  
Author(s):  
John McH. Sinclair

This paper contrasts two views on the analysis of language. In one view, language is primarily seen as a carrier of messages in sentences whose propositional content can be retrieved, and symbolised in a knowledge base. In the other, language is seen as a means of communication that deals in much more complex matters than just carrying messages. In relation to vocabulary and the design of lexicons, the model of terminology suits the first position, while in the other the lexicon is considered empty at the start and is gradually filled with the evidence of usage. Similar contrasts are made in other areas relevant to natural language processing. In one approach, the expectation is of tidiness and conformity to rules; the other stresses the inherently provisional nature of the organisation of language and, therefore, the meanings. As these two approaches encounter the vast amount of evidence stored in today's corpora, their methods and responses contrast in interesting ways.

