A Prospective Comparison of Evidence Synthesis Search Strategies Developed With and Without Text-Mining Tools

Author(s):  
Robin A. Paynter ◽  
Robin Featherstone ◽  
Elizabeth Stoeger ◽  
Celia Fiordalisi ◽  
Christiane Voisin ◽  
...  
Author(s):  
Robin A. Paynter ◽  
Celia Fiordalisi ◽  
Elizabeth Stoeger ◽  
Eileen Erinoff ◽  
Robin Featherstone ◽  
...  

Background: In an era of explosive growth in biomedical evidence, improving systematic review (SR) search processes is increasingly critical. Text-mining tools (TMTs) are a potentially powerful resource to improve and streamline search strategy development. Two types of TMTs are especially of interest to searchers: word frequency (useful for identifying the most frequently used keyword terms, e.g., PubReMiner) and clustering (visualizing common themes, e.g., Carrot2).
Objectives: The objectives of this study were to compare the benefits and trade-offs of searches with and without the use of TMTs for evidence synthesis products in real-world settings. Specific questions included: (1) Do TMTs decrease the time spent developing search strategies? (2) How do TMTs affect the sensitivity and yield of searches? (3) Do TMTs identify groups of records that can be safely excluded in the search evaluation step? (4) Does the complexity of a systematic review topic affect TMT performance? In addition to quantitative data, we collected librarians' comments on their experiences using TMTs to explore when and how these new tools may be useful in systematic review search creation.
Methods: In this prospective comparative study, we included seven SR projects and classified them into simple or complex topics. The project librarian used conventional “usual practice” (UP) methods to create the MEDLINE search strategy, while a paired TMT librarian simultaneously and independently created a search strategy using a variety of TMTs. TMT librarians could choose one or more freely available TMTs per category from a pre-selected list in each of three categories: (1) keyword/phrase tools: AntConc, PubReMiner; (2) subject term tools: MeSH on Demand, PubReMiner, Yale MeSH Analyzer; and (3) strategy evaluation tools: Carrot2, VOSviewer. We collected results from both MEDLINE searches (with and without TMTs), coded every citation’s origin (UP or TMT, respectively), deduplicated them, and then sent the citation library to the review team for screening. When the draft report was submitted, we used the final list of included citations to calculate the sensitivity, precision, and number-needed-to-read for each search (with and without TMTs). Separately, we tracked the time spent on various aspects of search creation by each librarian. Simple and complex topics were analyzed separately to provide insight into whether TMTs could be more useful for one type of topic or another.
Results: Across all reviews, UP searches seemed to perform better than TMT searches, but because of the small sample size, none of these differences was statistically significant. UP searches were slightly more sensitive (92% [95% confidence interval (CI) 85–99%]) than TMT searches (84.9% [95% CI 74.4–95.4%]). The mean number-needed-to-read was 83 (SD 34) for UP and 90 (SD 68) for TMT. Keyword and subject term development generally took less time with TMTs than with UP alone. The average total time to create a complete search strategy was 12 hours (SD 8) for UP librarians and 5 hours (SD 2) for TMT librarians. TMTs neither affected search evaluation time nor improved identification of exclusion concepts (irrelevant records) that can be safely removed from the search set.
Conclusion: Across all reviews but one, TMT searches were less sensitive than UP searches. For simple SR topics (i.e., single indication–single drug), TMT searches were slightly less sensitive, but reduced time spent in search design. For complex SR topics (e.g., multicomponent interventions), TMT searches were less sensitive than UP searches; nevertheless, in complex reviews, they identified unique eligible citations not found by the UP searches. TMT searches also reduced time spent in search strategy development. For all evidence synthesis types, TMT searches may be more efficient in reviews where comprehensiveness is not paramount, or as an adjunct to UP for evidence syntheses, because they can identify unique includable citations. If TMTs were easier to learn and use, their utility would be increased.
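The abstract reports sensitivity, precision, and number-needed-to-read for each search but does not spell out the formulas. The sketch below assumes the definitions commonly used in search evaluation: sensitivity is the share of finally included citations that the search retrieved, precision is the share of retrieved citations that were included, and number-needed-to-read is the reciprocal of precision. The function name `search_metrics` and the citation IDs are hypothetical and are not taken from the study.

```python
def search_metrics(retrieved: set, included: set) -> dict:
    """Evaluate one search against the review's final list of included citations.

    Assumed (conventional) definitions, not quoted from the study:
      sensitivity           = included citations retrieved / all included citations
      precision             = included citations retrieved / all citations retrieved
      number_needed_to_read = 1 / precision
    """
    if not retrieved or not included:
        raise ValueError("both citation sets must be non-empty")
    hits = retrieved & included  # included citations that this search found
    sensitivity = len(hits) / len(included)
    precision = len(hits) / len(retrieved)
    # With zero hits, no amount of reading yields an included citation.
    nnr = 1 / precision if precision > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "number_needed_to_read": nnr,
    }


if __name__ == "__main__":
    # Hypothetical data: a search returning 200 records, scored against
    # 4 included citations, one of which the search missed.
    retrieved_ids = set(range(1, 201))
    included_ids = {5, 50, 150, 300}
    for name, value in search_metrics(retrieved_ids, included_ids).items():
        print(f"{name}: {value:.3f}")
```

Running the same function once on the UP result set and once on the TMT result set, against the same included list, would give the kind of paired comparison the abstract describes.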


Author(s):  
Antonina Durfee

Massive quantities of information continue to accumulate at about 1.5 billion gigabytes per year in numerous repositories held at news agencies, at libraries, on corporate intranets, on personal computers, and on the Web. A large portion of all available information exists in the form of text. Researchers, analysts, editors, venture capitalists, lawyers, help desk specialists, and even students are faced with text analysis challenges. Text mining tools aim at discovering knowledge from textual databases by isolating key bits of information from large amounts of text and identifying relationships among documents. Text mining technology is used for plagiarism detection and authorship attribution, text summarization and retrieval, and deception detection.


Author(s):  
Taşkın Dirsehan

The marketing concept has progressed through several phases of evolution. At present, customer relationship management is considered the latest era of marketing development. The main purpose of this approach is to build long-term, profitable relationships with customers, so companies need to know their customers better. This knowledge can be created through a deeper analysis of companies' data with data mining tools. Companies that are able to use data mining tools will gain strong competitive advantages in their strategic decisions. The hotel industry was selected for this study because it provides a warehouse of customer comments from which valuable knowledge can be obtained if text mining is used appropriately as a data mining tool. This study therefore attempts to explain the stages of text mining using RapidMiner. As a result, different approaches based on customer satisfaction and dissatisfaction are discussed as ways to build competitive advantages.


2013 ◽  
pp. 2160-2162
Author(s):  
Jörg Hakenberg

2019 ◽  
Vol 81 ◽  
pp. 63-75 ◽  
Author(s):  
Estelle Chaix ◽  
Louise Deléger ◽  
Robert Bossy ◽  
Claire Nédellec

2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Nicola Bernabò ◽  
Alessandra Ordinelli ◽  
Marina Ramal Sanchez ◽  
Mauro Mattioli ◽  
Barbara Barboni

Here we built a network-based model representing the process of actin remodelling that occurs during the acquisition of fertilizing ability by human spermatozoa (HumanMade_ActinSpermNetwork, HM_ASN). We then compared it with the networks provided by two different text mining tools: Agilent Literature Search (ALS) and PESCADOR. As a reference, we used data from the online repository Kyoto Encyclopaedia of Genes and Genomes (KEGG), which refer to actin dynamics in a more general biological context. We found that HM_ASN and the networks from KEGG data shared the same scale-free topology following the Barabási-Albert model, suggesting that information spreads within the network quickly and efficiently. In contrast, the networks obtained by ALS and PESCADOR have a scale-free hierarchical architecture, which implies a different pattern of information transmission. The hubs identified within the networks also differ: the HM_ASN and KEGG networks contain as hubs several molecules known to be involved in actin signalling; ALS was unable to find hubs other than “actin,” whereas PESCADOR gave some nonspecific results. This suggests that human-made information retrieval, in the case of a specific event such as actin dynamics in human spermatozoa, could be a reliable strategy.

