A Model Workflow for GeoDeepDive: Locating Pliocene and Pleistocene Ice-Rafted Debris

2021 ◽  
Author(s):  
Simon Goring ◽  
Jeremiah Marsicek ◽  
Shan Ye ◽  
John Williams ◽  
Stephen Meyers ◽  
...  

Machine learning technology promises a more efficient and scalable approach to locating and aggregating data and information from the burgeoning scientific literature. Realizing this promise requires provision of applications, data resources, and documentation of analytic workflows. GeoDeepDive provides a digital library comprising over 13 million peer-reviewed documents and the computing infrastructure upon which to build and deploy search and text-extraction capabilities using regular expressions and natural language processing. Here we present a model GeoDeepDive workflow and an accompanying R package to show how GeoDeepDive can be employed to extract spatiotemporal information about site-level records in the geoscientific literature. We apply these capabilities to a proof-of-concept subset of papers in a case study to generate a preliminary distribution of ice-rafted debris (IRD) records in both space and time. We use regular expressions and natural language processing utilities to extract and plot reliable latitude-longitude pairs from publications containing IRD, and also extract age estimates from those publications. This workflow and R package provide researchers from the geosciences and allied disciplines with a general set of tools for querying spatiotemporal information from GeoDeepDive for their own science questions.
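As a concrete illustration of the coordinate-extraction step, a minimal Python sketch might look like the following. The pattern and function names are hypothetical; the published workflow is implemented as an R package with its own patterns and NLP utilities.

```python
import re

# Hypothetical sketch of the coordinate-extraction step; matches
# decimal-degree pairs such as "69.2 N, 12.7 W" in running text.
COORD_PATTERN = re.compile(
    r"(\d{1,2}(?:\.\d+)?)\s*°?\s*([NS])[,;\s]+"
    r"(\d{1,3}(?:\.\d+)?)\s*°?\s*([EW])"
)

def extract_coordinates(text):
    """Return signed (latitude, longitude) pairs found in a document string."""
    pairs = []
    for lat, ns, lon, ew in COORD_PATTERN.findall(text):
        latitude = float(lat) * (1 if ns == "N" else -1)
        longitude = float(lon) * (1 if ew == "E" else -1)
        pairs.append((latitude, longitude))
    return pairs

print(extract_coordinates("IRD was recovered at Site 907 (69.2° N, 12.7° W)."))
# [(69.2, -12.7)]
```

In practice such a pattern would be run over the GeoDeepDive sentence tables and paired with the paper's extracted age estimates before plotting.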

AERA Open ◽  
2021 ◽  
Vol 7 ◽  
pp. 233285842110286
Author(s):  
Kylie L. Anglin ◽  
Vivian C. Wong ◽  
Arielle Boguslav

Though there is widespread recognition of the importance of implementation research, evaluators often face intense logistical, budgetary, and methodological challenges in their efforts to assess intervention implementation in the field. This article proposes a set of natural language processing techniques called semantic similarity as an innovative and scalable method of measuring implementation constructs. Semantic similarity methods are an automated approach to quantifying the similarity between texts. By applying semantic similarity to transcripts of intervention sessions, researchers can use the method to determine whether an intervention was delivered with adherence to a structured protocol, and the extent to which an intervention was replicated with consistency across sessions, sites, and studies. This article provides an overview of semantic similarity methods, describes their application within the context of educational evaluations, and provides a proof of concept using an experimental study of the impact of a standardized teacher coaching intervention.
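As a minimal sketch of the underlying idea, assuming a simple TF-IDF representation (the article covers a broader family of semantic similarity methods, including embedding-based ones), adherence can be proxied by the similarity of each session transcript to the scripted protocol:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative texts only; real inputs would be session transcripts.
protocol = "Greet the student, review last week's goal, model the strategy."
sessions = [
    "The coach greeted the student and reviewed the goal from last week.",
    "The coach chatted about the weekend and assigned homework.",
]

vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform([protocol] + sessions)

# Adherence proxy: similarity of each session transcript to the protocol.
scores = cosine_similarity(vectors[0], vectors[1:])
print(scores)  # higher score -> closer adherence to the scripted protocol
```

The same pairwise scores, computed across sessions rather than against the protocol, speak to the consistency of replication across sites and studies.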


Author(s):  
Jacqueline Peng ◽  
Mengge Zhao ◽  
James Havrilla ◽  
Cong Liu ◽  
Chunhua Weng ◽  
...  

Background: Natural language processing (NLP) tools can facilitate the extraction of biomedical concepts from unstructured free texts, such as research articles or clinical notes. The NLP software tools CLAMP, cTAKES, and MetaMap are among the most widely used tools to extract biomedical concept entities. However, their performance in extracting disease-specific terminology from literature has not been compared extensively, especially for complex neuropsychiatric disorders with a diverse set of phenotypic and clinical manifestations.
Methods: We comparatively evaluated these NLP tools using autism spectrum disorder (ASD) as a case study. We collected 827 ASD-related terms based on previous literature as the benchmark list for performance evaluation. Then, we applied CLAMP, cTAKES, and MetaMap on 544 full-text articles and 20,408 abstracts from PubMed to extract ASD-related terms. We evaluated the predictive performance using precision, recall, and F1 score.
Results: We found that CLAMP has the best performance in terms of F1 score, followed by cTAKES and then MetaMap. Our results show that CLAMP has much higher precision than cTAKES and MetaMap, while cTAKES and MetaMap have higher recall than CLAMP.
Conclusion: The analysis protocols used in this study can be applied to other neuropsychiatric or neurodevelopmental disorders that lack well-defined terminology sets to describe their phenotypic presentations.
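The evaluation logic reduces to set comparisons against the benchmark list. A minimal sketch with invented terms (the actual inputs are the 827-term benchmark and each tool's extracted term set):

```python
# Invented example terms; real sets come from the benchmark list and
# from running CLAMP, cTAKES, or MetaMap over the corpus.
benchmark = {"autism spectrum disorder", "echolalia", "stereotypy", "hand flapping"}
extracted = {"autism spectrum disorder", "echolalia", "seizure"}

true_positives = len(extracted & benchmark)
precision = true_positives / len(extracted)   # extracted terms that are correct
recall = true_positives / len(benchmark)      # benchmark terms that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
# precision=0.67 recall=0.50 F1=0.57
```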


Author(s):  
Sourajit Roy ◽  
Pankaj Pathak ◽  
S. Nithya

The early 21st century has seen rapid technical breakthroughs and developments. Natural Language Processing (NLP) is one of the most promising of these disciplines and has advanced rapidly through groundbreaking findings. Because of the digital revolution, the amount of data generated by machine-to-machine (M2M) communication across devices and platforms such as Amazon Alexa, Apple Siri, and Microsoft Cortana has increased significantly. Much of this data is unstructured and does not fit standard computational models. In addition, the growing problems of language complexity, data variability, and voice ambiguity make implementing models increasingly difficult. The current study provides an overview of the potential and breadth of the NLP market and its adoption across industries, in particular after Covid-19. It also gives a macroscopic picture of progress in natural language processing research, development, and implementation.


2013 ◽  
Vol 52 (01) ◽  
pp. 33-42 ◽  
Author(s):  
M.-H. Kuo ◽  
P. Gooch ◽  
J. St-Maurice

Objective: The objective of this study was to undertake a proof of concept demonstrating the use of primary care data, natural language processing, and term extraction to assess emergency room use. The study extracted biopsychosocial concepts from primary care free text and related them to inappropriate emergency room use through odds ratios.
Methods: De-identified free-text notes were extracted from a primary care clinic in Guelph, Ontario and analyzed with a software toolkit that incorporated General Architecture for Text Engineering (GATE) and MetaMap components for natural language processing and term extraction.
Results: Over 10 million concepts were extracted from 13,836 patient records. Codes found in at least 1% of the sample were regressed against inappropriate emergency room use. 77 codes were biopsychosocial, highly statistically significant (p < 0.001), and had an OR > 2.0. Thematically, these codes involved mental health and pain-related concepts.
Conclusions: Analyzed thematically, mental health issues and pain are important themes; we conclude that pain and mental health problems are primary drivers of inappropriate emergency room use. Age and sex were not significant. This proof of concept demonstrates the feasibility of combining natural language processing and primary care data to analyze a system-use question. As a first work, it supports further research and could be applied to investigate other, more complex problems.
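The odds-ratio criterion can be illustrated with a minimal sketch using invented counts; the study itself derived ORs by regressing extracted codes against inappropriate ER use:

```python
# Hypothetical 2x2 table for one extracted concept code:
#                      inappropriate ER   appropriate ER
# code present                  40               60
# code absent                  200             1200
a, b = 40, 60      # code present: inappropriate, appropriate visits
c, d = 200, 1200   # code absent:  inappropriate, appropriate visits

odds_ratio = (a / b) / (c / d)
print(f"OR = {odds_ratio:.2f}")  # OR = 4.00 -> exceeds the OR > 2.0 cutoff
```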


2021 ◽  
Author(s):  
Oscar Nils Erik Kjell ◽  
H. Andrew Schwartz ◽  
Salvatore Giorgi

The language that individuals use to express themselves contains rich psychological information. Recent significant advances in Natural Language Processing (NLP) and Deep Learning (DL), namely transformers, have produced large performance gains in tasks related to understanding natural language, such as machine translation. However, these state-of-the-art methods have not yet been made easily accessible to psychology researchers, nor designed to be optimal for human-level analyses. This tutorial introduces text (www.r-text.org), a new R package for analyzing and visualizing human language using transformers, the latest techniques from NLP and DL. Text is both a modular solution for accessing state-of-the-art language models and an end-to-end solution catered to human-level analyses. Hence, text provides user-friendly functions tailored to testing hypotheses in the social sciences, for both relatively small and large datasets. This tutorial describes useful methods for analyzing text, providing functions with reliable defaults that can be used off the shelf, as well as a framework that advanced users can build on for novel techniques and analysis pipelines. The reader learns about six methods: 1) textEmbed, to transform text into traditional or modern transformer-based word embeddings (i.e., numeric representations of words); 2) textTrain, to examine the relationships between text and numeric/categorical variables; 3) textSimilarity and 4) textSimilarityTest, to compute semantic similarity scores between texts and to test the significance of differences in meaning between two sets of texts; and 5) textProjection and 6) textProjectionPlot, to examine and visualize text within the embedding space according to latent or specified construct dimensions (e.g., low to high rating scale scores).
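Since the text package itself is in R, a rough Python analogue of its embed-then-compare workflow (textEmbed followed by textSimilarity) can be sketched with the sentence-transformers library; the model name and example texts below are illustrative only and do not reflect the package's API:

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative model choice; any transformer sentence encoder would do.
model = SentenceTransformer("all-MiniLM-L6-v2")

responses = [
    "I feel calm and satisfied with my life.",
    "I am constantly worried and under pressure.",
]
embeddings = model.encode(responses)  # one numeric vector per text

# Semantic similarity between the two responses (cosine similarity).
score = util.cos_sim(embeddings[0], embeddings[1])
print(float(score))
```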


Author(s):  
Shruthi J. ◽  
Suma Swamy

In the present digital world, computers do not understand ordinary human language. This is a great barrier between humans and digital systems. Hence, researchers have sought technologies through which digital machines can provide information to users. Natural language processing (NLP), a branch of AI, has significant implications for the ways that computers and humans interact, and it has become an essential technology for bridging the communication gap between humans and digital data. This study presents the necessity of NLP in the current computing world, along with different approaches and their applications. It also highlights the key challenges in the development of new NLP models.


Author(s):  
J. A. Rodger ◽  
P. C. Pendharkar

The case study describes the process of planning, analysis, design and implementation of an integrated voice interactive device (VID) for the Navy. The goal of this research is to enhance Force Health Protection and to improve medical readiness by applying voice interactive technology to environmental and clinical surveillance activities aboard U.S. Navy ships.

