Cross Language Query Expansion Approach for CIMS Based on Weighted D-S Evidence Theory

2014 ◽  
Vol 620 ◽  
pp. 534-543
Author(s):  
Xiao Bo Wang ◽  
Fan Zhao ◽  
Xiao Li ◽  
Rong Hui Zhang

With the rapid development of Computer Integrated Manufacturing Systems (CIMS) and information technology, fast multilingual retrieval has become one of the hot spots in machine translation. Cross-language information retrieval (CLIR) provides a convenient way for users to submit queries in a language they are familiar with and retrieve documents written in another language. Query expansion is one of the effective methods for improving recall in information retrieval. Researchers have proposed many expansion methods, but most of them simply add the expansion terms to the query. If the original query words and the expansion words are not distinguished, the expanded query may deviate from the original semantics, which is very inconvenient for mechanical engineers and programmers. Based on the Dempster-Shafer theory of evidence, we propose a query expansion computing model that treats the original query terms as the main evidence and the expansion terms as secondary evidence. The method uses a Han (Chinese) semantic dictionary, a Uygur-Chinese bilingual dictionary and a synonym forest to obtain the synonyms, near-synonyms and hypernyms of the query words. Latent Semantic Analysis is used to mine a large-scale text corpus for words that are semantically related to the query words. These two types of evidence are combined using a weighted Dempster-Shafer combination rule. Experimental results show that this method can effectively improve retrieval efficiency in mechanical engineering and information technology. The research results can provide a reference for fast multilingual retrieval in CIMS.
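The abstract above does not spell out its weighted combination rule, so the following is only a minimal sketch of one common way to weight evidence before applying Dempster's rule: Shafer discounting, swapped in here as an assumption. The frame `{"rel", "non"}`, the mass values, and the reliability weights are all hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
from itertools import product


def discount(mass, alpha, frame):
    """Shafer discounting: scale every focal set by alpha and move the
    remaining belief (1 - alpha) onto the whole frame (ignorance)."""
    out = {s: alpha * v for s, v in mass.items() if s != frame}
    out[frame] = 1.0 - alpha + alpha * mass.get(frame, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose keys
    are frozensets over the same frame of discernment."""
    joint = {}
    conflict = 0.0
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + va * vb
        else:
            conflict += va * vb  # mass falling on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in joint.items()}


# Hypothetical example: is a document relevant ("rel") or not ("non")?
FRAME = frozenset({"rel", "non"})
REL = frozenset({"rel"})
NON = frozenset({"non"})

# Main evidence: match against the original query terms (fully trusted).
m_original = {REL: 0.6, FRAME: 0.4}
# Secondary evidence: match against expansion terms (trusted at 0.5).
m_expansion = {REL: 0.5, NON: 0.3, FRAME: 0.2}

combined = combine(
    discount(m_original, 1.0, FRAME),
    discount(m_expansion, 0.5, FRAME),
)
for focal_set, value in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(sorted(focal_set), round(value, 4))
```

Discounting the expansion evidence more heavily than the original query keeps the expanded terms from overriding the original query semantics, which is the concern the abstract raises about methods that simply append expansion terms.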

2016 ◽  
Vol 68 (4) ◽  
pp. 448-477 ◽  
Author(s):  
Dong Zhou ◽  
Séamus Lawless ◽  
Xuan Wu ◽  
Wenyu Zhao ◽  
Jianxun Liu

Purpose – With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion. Design/methodology/approach – The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods. Findings – Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level. Originality/value – Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills the gap by a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology to ensure repeatable and controlled experiments can be conducted.
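As a rough illustration of a frequency-based user profile of the kind mentioned above, the sketch below weights terms from a user's historical documents with a BM25-style score and appends the top-weighted profile terms to a query. It is not the paper's method: computing the idf over a separate background collection, the parameter defaults `k1=1.2` and `b=0.75`, the function names, and the toy data are all assumptions, and the example is monolingual for brevity.

```python
import math
from collections import Counter


def bm25_profile(history, background, k1=1.2, b=0.75):
    """Build a weighted-term user profile.

    history    : list of token lists the user has viewed (assumed input)
    background : list of token lists used only to estimate idf (assumed)
    """
    n = len(background)
    doc_freq = Counter(t for doc in background for t in set(doc))
    avg_len = sum(len(doc) for doc in history) / len(history)

    profile = Counter()
    for doc in history:
        length_norm = k1 * (1.0 - b + b * len(doc) / avg_len)
        for term, freq in Counter(doc).items():
            df = doc_freq[term]  # 0 for terms unseen in the background
            idf = math.log(1.0 + (n - df + 0.5) / (df + 0.5))
            profile[term] += idf * freq * (k1 + 1.0) / (freq + length_norm)
    return profile


def expand_query(query, profile, k=2):
    """Append the k highest-weighted profile terms not already in the query."""
    extra = [t for t, _ in profile.most_common() if t not in query][:k]
    return list(query) + extra


# Hypothetical toy data.
history = [
    ["the", "engine", "turbine", "engine"],
    ["the", "turbine", "blade"],
]
background = [
    ["the", "engine"],
    ["the", "blade"],
    ["the", "turbine"],
    ["the", "news"],
]

profile = bm25_profile(history, background)
print(expand_query(["engine"], profile, k=2))
```

Terms that are common in the background collection (such as "the" here) receive a low idf and therefore a low profile weight, so they are unlikely to be chosen as expansion terms.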


2012 ◽  
Vol 488-489 ◽  
pp. 1722-1726
Author(s):  
Ming Hong She ◽  
Hong Bing Yang ◽  
Ming Ying She

With the rapid development of information technology, people’s daily lives have become more and more dependent on it. Yet traditional control algorithms are too coarse to meet the demand for convenient living. To overcome this shortcoming, Internet of Things (IoT) technology is introduced into the control process of the smart home. The structure and working process are described in detail. An intelligent control algorithm based on D-S evidence theory is introduced, and a simulation experiment for lighting control is described; the results validate the algorithm.


The rapid development of information technology has increased data volume at a surprising speed. Recently, cloud computing and the Internet of Things (IoT) have become the hottest topics in the information technology industry. Cloud computing offers many advantages, such as scalability, low price, and large scale, and primary IoT techniques such as Radio-Frequency Identification (RFID) have been applied on a large scale. The number of cloud storage users has grown considerably in recent times because cloud storage systems reduce maintenance issues and have a low storage cost compared with other methods. Such a system provides a high degree of reliability and availability when redundancy is introduced. In replicated systems, objects are copied many times, and every copy resides at a different location in the distributed computing environment. Data replication therefore poses a problem for both the users and the providers of cloud storage, since providing efficient data storage is a major challenge. This work analyses different data replication strategies and points out several issues that affect them. For the purpose of this work, data replication is carried out using Cuckoo Search (CS) and Greedy Search. The research aims to reduce the number of replicas without any adverse effect on the reliability and availability of data.


Author(s):  
Eugene Santos Jr. ◽  
Eunice E. Santos ◽  
Hien Nguyen ◽  
Long Pan ◽  
John Korah

With the proliferation of the Internet and rapid development of information and communication infrastructure, E-governance has become a viable option for effective deployment of government services and programs. Areas of E-governance such as homeland security and disaster relief have to deal with vast amounts of dynamic heterogeneous data. Providing rapid real-time search capabilities for such databases/sources is a challenge. Intelligent Foraging, Gathering, and Matching (I-FGM) is an established framework developed to help analysts find information quickly and effectively by incrementally collecting, processing and matching information nuggets. This framework has previously been used to develop a distributed, free text information retrieval application. In this chapter, we provide a comprehensive solution for the E-GOV analyst by extending the I-FGM framework to image collections and creating a “live” version of I-FGM deployable for real-world use. We present a Content Based Image Retrieval (CBIR) technique that incrementally processes the images, extracts low-level features and maps them to higher level concepts. Our empirical evaluation of the algorithm shows that our approach performs competitively compared to some existing approaches in terms of retrieving relevant images while offering the speed advantages of a distributed and incremental process, and a unified framework for both text and images. We describe our production level prototype that has a sophisticated user interface which can also deal with multiple queries from multiple users. The interface provides real-time updating of the search results and provides “under the hood” details of I-FGM processes as the queries are being processed.


Author(s):  
Anne Kao ◽  
Steve Poteet ◽  
Jason Wu ◽  
William Ferng ◽  
Rod Tjoelker ◽  
...  

Latent Semantic Analysis (LSA) or Latent Semantic Indexing (LSI), when applied to information retrieval, has been a major analysis approach in text mining. It is an extension of the vector space method in information retrieval, representing documents as numerical vectors but using a more sophisticated mathematical approach to characterize the essential features of the documents and reduce the number of features in the search space. This chapter summarizes several major approaches to this dimensionality reduction, each of which has strengths and weaknesses, and it describes recent breakthroughs and advances. It shows how the constructs and products of LSA applications can be made user-interpretable and reviews applications of LSA beyond information retrieval, in particular, to text information visualization. While the major application of LSA is for text mining, it is also highly applicable to cross-language information retrieval, Web mining, and analysis of text transcribed from speech and textual information in video.
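To make the dimensionality-reduction idea concrete, here is a small sketch of LSA on a toy term-document matrix. The leading singular directions are found with plain power iteration and deflation on the document Gram matrix so that the example needs only the standard library; in practice a truncated SVD routine from a numerical library would be used instead. The matrix, the choice of k = 2, and all function names are assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram(term_doc):
    """Document-by-document Gram matrix A^T A of a term-document matrix A."""
    cols = list(zip(*term_doc))
    return [[dot(ci, cj) for cj in cols] for ci in cols]


def top_eigenpairs(sym, k, iters=300):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration,
    re-orthogonalising against already-found vectors (deflation)."""
    n = len(sym)
    pairs = []
    for j in range(k):
        v = [1.0 + 0.1 * i + j for i in range(n)]  # deterministic start
        for _ in range(iters):
            for _, u in pairs:
                proj = dot(u, v)
                v = [x - proj * y for x, y in zip(v, u)]
            w = [dot(row, v) for row in sym]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in sym])
        pairs.append((eigenvalue, v))
    return pairs


def lsa_doc_vectors(term_doc, k):
    """k-dimensional document vectors: sigma_i * v_i[j] for each document j."""
    pairs = top_eigenpairs(gram(term_doc), k)
    n_docs = len(term_doc[0])
    return [[math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs]
            for j in range(n_docs)]


def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


# Rows: car, auto, engine, flower, petal. Columns: four toy documents.
A = [
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # auto
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
]
docs = lsa_doc_vectors(A, k=2)
raw = list(zip(*A))
print("raw cosine(d0, d1):   ", round(cosine(raw[0], raw[1]), 3))
print("latent cosine(d0, d1):", round(cosine(docs[0], docs[1]), 3))
```

Documents 0 and 1 share only the term "engine" in the raw vector space, yet they collapse onto the same latent direction after the reduction, which is the effect that lets LSA relate documents using different but co-occurring vocabulary.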


2019 ◽  
Vol 12 (1) ◽  
pp. 105-116
Author(s):  
Qiuyu Zhu ◽  
Dongmei Li ◽  
Cong Dai ◽  
Qichen Han ◽  
Yi Lin

With the rapid development of the Internet, information retrieval models based on keyword matching no longer meet users' requirements, because people with different query histories have different retrieval intentions. A user's query history often implies their interests. It is therefore important to improve recall and precision by using query history to judge retrieval intentions. To this end, this article studies user query history and proposes a method for constructing a user interest model from it. Accordingly, the authors design a model called PLSA-based Personalized Information Retrieval with Network Regularization. Finally, the model is applied to academic information retrieval, and the authors compare it with Baidu Scholar and with a personalized information retrieval model based on the probabilistic latent semantic analysis topic model. The experimental results show that this model can effectively extract topics and returns results that better satisfy users' requirements. The model also clearly improves the quality of the retrieval results. In addition, the retrieval model can be used not only in academic information retrieval but also in personalized retrieval for microblog search, associated recommendation, etc.

