Text searching on Splash 2

Author(s):  
D.V. Pryor ◽  
M.R. Thistle ◽  
N. Shirazi
Keyword(s):  
1985 ◽  
Vol 10 (2) ◽  
pp. 79-86 ◽  
Author(s):  
Anne Costigan ◽  
Frances E. Wood ◽  
David Bawden

A comparative evaluation of three implementations of a large databank, the NIOSH Registry of Toxic Effects of Chemical Substances, has been carried out. The three implementations are: a printed index, a text searching computer system, and a computerised chemical databank system with substructure searching facilities. Seven test queries were used, with the aim of drawing conclusions of general relevance to chemical databank searching. The computer systems were shown to have advantages over printed indexes for several of the queries, including those involving an element of browsing. Substructure search facilities were especially advantageous. Aspects of indexing of data present, and the criteria for inclusion of types of data, were also highlighted.


Author(s):  
A.L. Abbott ◽  
P.M. Athanas ◽  
L. Chen ◽  
R.L. Elliott

2021 ◽  
Author(s):  
Yunxin Huang ◽  
Aiguo Song ◽  
Yafei Yang

Author(s):  
Jeff Blackadar

Bibliothèque et Archives Nationales du Québec digitally scanned and converted to text a large collection of newspapers, creating a resource of tremendous potential value to historians. Unfortunately, the text files are difficult to search reliably because of the many errors introduced by the optical character recognition (OCR) conversion process. This digital history project applied natural language processing in an R program to create a new and useful index of this corpus of digitized content despite OCR-related errors. The project used editions of The Equity, published in Shawville, Quebec since 1883. The program extracted the names of all the person, location and organization entities that appeared in each edition. Each entity was cataloged in a database and related to the edition of the newspaper it appeared in. The database was published to a public website to allow other researchers to use it. The resulting index, or finding aid, gives researchers a way to access The Equity other than full text searching. People, locations and organizations appearing in The Equity are listed on the website, and each entity links to a page that lists all of the issues that entity appeared in as well as the other entities that may be related to it. Rendering the text files of each scanned newspaper into entities and indexing them in a database allows readers to interact with the content of the newspaper by entity name and type rather than as a set of large text files. Website: http://www.jeffblackadar.ca/graham_fellowship/corpus_entities_equity/
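
The entity index described in this abstract can be pictured with a small sketch. The project itself used R; the example below is in Python purely for illustration and is not the author's code. It assumes the entity extraction step has already run, and the edition identifiers, entity names, and the function names build_entity_index and related_entities are all hypothetical.

```python
from collections import defaultdict


def build_entity_index(editions):
    """Map each (entity_type, name) pair to the sorted list of editions it appears in."""
    index = defaultdict(set)
    for edition_id, entities in editions.items():
        for entity in entities:
            index[entity].add(edition_id)
    return {entity: sorted(ids) for entity, ids in index.items()}


def related_entities(editions, target):
    """Count how many editions each other entity shares with the target entity."""
    counts = defaultdict(int)
    for entities in editions.values():
        present = set(entities)
        if target in present:
            for other in present - {target}:
                counts[other] += 1
    return dict(counts)


# Hypothetical, made-up input: edition id -> entities already extracted from the OCR text.
editions = {
    "edition-001": [("person", "Person A"), ("location", "Place X")],
    "edition-002": [("person", "Person A"), ("organization", "Org Y")],
    "edition-003": [("location", "Place X")],
}

index = build_entity_index(editions)
print(index[("person", "Person A")])
print(related_entities(editions, ("person", "Person A")))
```

Looking up an entity in the index returns the editions it appears in, and related_entities lists the entities that co-occur with it, which mirrors the per-entity pages the abstract describes.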


2015 ◽  
Vol 22 (6) ◽  
pp. 1220-1230 ◽  
Author(s):  
Huan Mo ◽  
William K Thompson ◽  
Luke V Rasmussen ◽  
Jennifer A Pacheco ◽  
Guoqian Jiang ◽  
...  

Background: Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM). Methods: A team of clinicians and informaticians reviewed common features for multisite phenotype algorithms published in PheKB.org and existing phenotype representation platforms. We also evaluated well-known diagnostic criteria and clinical decision-making guidelines to encompass a broader category of algorithms. Results: We propose 10 desired characteristics for a flexible, computable PheRM: (1) structure clinical data into queryable forms; (2) recommend use of a common data model, but also support customization for the variability and availability of EHR data among sites; (3) support both human-readable and computable representations of phenotype algorithms; (4) implement set operations and relational algebra for modeling phenotype algorithms; (5) represent phenotype criteria with structured rules; (6) support defining temporal relations between events; (7) use standardized terminologies and ontologies, and facilitate reuse of value sets; (8) define representations for text searching and natural language processing; (9) provide interfaces for external software algorithms; and (10) maintain backward compatibility. Conclusion: A computable PheRM is needed for true phenotype portability and reliability across different EHR products and healthcare systems. These desiderata are a guide to inform the establishment and evolution of EHR phenotype algorithm authoring platforms and languages.
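
As a rough illustration of desideratum (4), set operations for modeling phenotype algorithms, the sketch below combines inclusion and exclusion criteria over sets of patient identifiers. It is not taken from the paper or from any PheKB algorithm: the record layout, the placeholder codes (DX_A, RX_B, DX_EXCLUDE), and the helper patients_matching are assumptions made for this example.

```python
def patients_matching(records, field, value):
    """Return the set of patient ids having at least one record where field equals value."""
    return {r["patient_id"] for r in records if r.get(field) == value}


# Hypothetical, made-up records; the codes are placeholders, not real terminology codes.
records = [
    {"patient_id": 1, "diagnosis": "DX_A"},
    {"patient_id": 1, "medication": "RX_B"},
    {"patient_id": 2, "diagnosis": "DX_A"},
    {"patient_id": 3, "diagnosis": "DX_A"},
    {"patient_id": 3, "medication": "RX_B"},
    {"patient_id": 3, "diagnosis": "DX_EXCLUDE"},
]

has_dx = patients_matching(records, "diagnosis", "DX_A")
has_rx = patients_matching(records, "medication", "RX_B")
excluded = patients_matching(records, "diagnosis", "DX_EXCLUDE")

# Cases: diagnosis AND medication (intersection), minus the exclusion criterion (difference).
cases = (has_dx & has_rx) - excluded
print(sorted(cases))
```

Expressing each criterion as a set keeps the combination logic declarative, so the same rule could in principle be rendered both as human-readable text and as an executable query.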


Author(s):  
Shengkai Zhu ◽  
Zhiwei Xiao ◽  
Haibo Chen ◽  
Rong Chen ◽  
Weihua Zhang ◽  
...  
