Design and Implementation of Full-Text Retrieval System for People’s Daily Annotated Corpus

2011 ◽  
Vol 135-136 ◽  
pp. 369-374
Author(s):  
Yang Sen Zhang ◽  
Gai Juan Huang

In this paper, we design and implement an efficient full-text retrieval system for the basic annotated People’s Daily Corpus, based on inverted index technology. Guided by the characteristics of the corpus data, we analyze in detail the methods and strategies for implementing the system. After comparing several schemes, we propose a three-level index structure of Chinese character, word, and address set, and give the design of the dictionary structure for each index level. After converting the unstructured People’s Daily Corpus into structured index data, we implement the full-text search algorithm that corresponds to the proposed index structure. Experimental results show that the proposed search algorithm meets the target of "ten million Chinese characters, response within one second" and improves the speed of full-text search over the People’s Daily Corpus.
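
The three-level index of character, word, and address set described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_index` and `search`, the use of nested dictionaries, and the representation of an address as a `(doc_id, position)` pair are all assumptions made for illustration, and the input is assumed to be already segmented into words.

```python
from collections import defaultdict


def build_index(docs):
    """Build a three-level inverted index (hypothetical layout).

    Level 1: first character of a word
    Level 2: the word itself
    Level 3: address set, here a list of (doc_id, position) pairs

    `docs` is a list of documents, each a list of segmented words.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, tokens in enumerate(docs):
        for pos, word in enumerate(tokens):
            if word:  # skip empty tokens
                index[word[0]][word].append((doc_id, pos))
    return index


def search(index, word):
    """Return the address list for `word`, or an empty list if absent."""
    if not word:
        return []
    # .get() avoids creating new entries in the defaultdict on a miss.
    words = index.get(word[0])
    if words is None:
        return []
    return list(words.get(word, []))


# Toy corpus: two short documents of segmented words.
docs = [["中国", "人民", "中国"], ["人民", "日报"]]
index = build_index(docs)
print(search(index, "中国"))
```

Looking up the first character before the word keeps each second-level dictionary small, which is one plausible reason for a character level; the paper's actual dictionary structures may differ.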

2013 ◽  
Vol 347-350 ◽  
pp. 2185-2190
Author(s):  
Guo Qiang Ding ◽  
Min Lin

Building on an in-depth understanding of Lucene full-text retrieval technology, this paper applies it to Mongolian text search. First, it identifies several key issues that must be addressed to implement Mongolian text search and gives corresponding solutions for achieving Mongolian full-text retrieval in Lucene. Second, it provides a fast, accurate, and comprehensive full-text search service for Mongolian information, which plays a key role in promoting the development of Mongolian search engines.


2014 ◽  
Vol 05 (01) ◽  
pp. 191-205 ◽  
Author(s):  
P. Biron ◽  
C. Pezet ◽  
C. Sebban ◽  
E. Barthuet ◽  
T. Durand ◽  
...  

Summary
Background: A full-text search tool was introduced into the daily practice of the Léon Bérard Center (France), a health care facility devoted to the treatment of cancer. This tool was integrated into the hospital information system by the IT department, which had been granted full autonomy to improve the system.
Objectives: To describe the development and various uses of a tool for full-text search of computerized patient records.
Methods: The technology is based on Solr, an open-source search engine. It is a web-based application that processes HTTP requests and returns HTTP responses. A data processing pipeline that retrieves data from different repositories, then normalizes, cleans, and publishes it to Solr, was integrated into the information system of the Léon Bérard Center. The IT department also developed user interfaces that allow users to access the search engine from within the patient's computerized medical record.
Results: From January to May 2013, 500 queries were launched per month by an average of 140 different users. Several uses of the tool were described: medical management of patients, medical research, and improving the traceability of medical care in medical records. The sensitivity of the tool for detecting the medical records of patients diagnosed with both breast cancer and diabetes was 83.0%, and its positive predictive value was 48.7% (gold standard: manual screening by a clinical research assistant).
Conclusion: The project demonstrates that the introduction of full-text search tools allowed practitioners to use unstructured medical information for various purposes.
Citation: Biron P; Metzger MH; Pezet C; Sebban C; Barthuet E; Durand T. An information retrieval system for computerized patient records in the context of a daily hospital practice: the example of the Léon Bérard Cancer Center (France). Appl Clin Inf 2014; 5: 191–205. http://dx.doi.org/10.4338/ACI-2013-08-CR-0065
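
Since the abstract above describes Solr as a web application that answers HTTP requests, a query can be sketched as a URL against Solr's standard `select` handler. This is only an illustration, not the hospital's actual setup: the host `http://localhost:8983`, the core name `records`, and the helper name `build_solr_query_url` are hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, rows=10):
    """Build a URL for Solr's standard /select handler.

    `base_url` and `core` are placeholders for a real deployment.
    `q` is the query string, `rows` limits the number of hits,
    and `wt=json` asks Solr for a JSON response.
    """
    params = {"q": text, "rows": rows, "wt": "json"}
    return f"{base_url.rstrip('/')}/solr/{core}/select?{urlencode(params)}"


# Hypothetical host and core; the query text is only an example.
url = build_solr_query_url(
    "http://localhost:8983", "records", "breast cancer AND diabetes", rows=5
)
print(url)
```

The resulting URL could then be fetched with any HTTP client, and the JSON response parsed to list matching records.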


1995 ◽  
Vol 25 (8) ◽  
pp. 891-903 ◽  
Author(s):  
Justin Zobel ◽  
Alistair Moffat

2016 ◽  
Vol 08 (01) ◽  
pp. 1-8 ◽  
Author(s):  
Kehinde Daniel Aruleba ◽  
Dipo Theophilus Akomolafe ◽  
Babajide Afeni

Author(s):  
Tetsuo Sakaguchi ◽  
Shigetaka Nakao ◽  
Akira Maeda ◽  
Shieo Sugimoto ◽  
Koichi Tabata
