Leaving No Stone Unturned: Using Machine Learning Based Approaches for Information Extraction from Full Texts of a Research Data Warehouse

Author(s):  
Johanna Fiebeck ◽  
Hans Laser ◽  
Hinrich B. Winther ◽  
Svetlana Gerbel
Author(s):  
Neil Ireson ◽  
Fabio Ciravegna ◽  
Mary Elaine Califf ◽  
Dayne Freitag ◽  
Nicholas Kushmerick ◽  
...  

2020 ◽  
pp. 1-21
Author(s):  
Clément Dalloux ◽  
Vincent Claveau ◽  
Natalia Grabar ◽  
Lucas Emanuel Silva Oliveira ◽  
Claudia Maria Cabral Moro ◽  
...  

Abstract: Automatic detection of negated content is often a prerequisite for information extraction systems in various domains. In the biomedical domain especially, this task is important because negation plays an important role. This work makes two main contributions. First, we work with languages that have been poorly addressed up to now: Brazilian Portuguese and French. We developed new corpora for these two languages, manually annotated with negation cues and their scopes. Second, we propose supervised machine learning methods for the automatic detection of negation cues and their scopes. The methods prove robust in both languages (Brazilian Portuguese and French) and in cross-domain (general and biomedical language) contexts. The approach is also validated on English data from the state of the art: it yields very good results and outperforms other existing approaches. In addition, the application is accessible and usable online. We assume that these contributions (new annotated corpora, an application accessible online, and cross-domain robustness) will improve the reproducibility of results and the robustness of NLP applications.


2020 ◽  
Vol 2 (4) ◽  
pp. 554-568
Author(s):  
Chris Graf ◽  
Dave Flanagan ◽  
Lisa Wylie ◽  
Deirdre Silver

Data availability statements can provide useful information about how researchers actually share research data. We used unsupervised machine learning to analyze 124,000 data availability statements submitted by research authors to 176 Wiley journals between 2013 and 2019. We categorized the data availability statements and looked at trends over time. We found expected increases in the number of data availability statements submitted over time, and marked increases that correlate with policy changes made by journals. Our open data challenge now is to use what we have learned to present researchers with relevant and easy options that help them to share and make an impact with new research data.


2020 ◽  
Vol 81 (6) ◽  
pp. 265
Author(s):  
David Free

Welcome to the June 2020 issue of C&RL News. Every two years, ACRL’s Research Planning and Review Committee produces their “Top trends in academic libraries.” The 2020 edition discusses change management; evolving integrated library systems; learning analytics; machine learning and AI; the state of open access and research data services; social justice, critical librarianship, and critical digital pedagogy; streaming media; and student wellbeing. Many thanks to the committee for pulling together this important survey of the current landscape of academic and research librarianship.


Author(s):  
SANDA M. HARABAGIU

This paper presents a novel methodology for disambiguating prepositional phrase attachments. We create patterns of attachments by classifying a collection of prepositional relations derived from Treebank parses. As a by-product, the arguments of every prepositional relation are semantically disambiguated. Attachment decisions are generated as the result of a learning process that builds upon some of the most popular current statistical and machine learning techniques. We have tested this methodology on (1) Wall Street Journal articles, (2) textual definitions of concepts from a dictionary, and (3) an ad hoc corpus of Web documents used for conceptual indexing and information extraction.

