Report on the 4th Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries at SIGIR 2019

Muthu Kumar Chandrasekaran; Philipp Mayr

doi:10.1145/3458553.3458554

ACM SIGIR Forum ◽

10.1145/3458553.3458554 ◽

2019 ◽

Vol 53 (2) ◽

pp. 3-10

Author(s):

Muthu Kumar Chandrasekaran ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing ◽

Digital Libraries ◽

State Of The Art ◽

Shared Task ◽

Processing Information ◽

Joint Workshop

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state-of-the-art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5th edition of the CL-SciSumm Shared Task.

Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018)

The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval - SIGIR '18 ◽

10.1145/3209978.3210194 ◽

2018 ◽

Cited By ~ 2

Author(s):

Muthu Kumar Chandrasekaran ◽

Kokil Jaidka ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Report on the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016)

ACM SIGIR Forum ◽

10.1145/3053408.3053417 ◽

2017 ◽

Vol 50 (2) ◽

pp. 36-43 ◽

Cited By ~ 6

Author(s):

Guillaume Cabanac ◽

Muthu Kumar Chandrasekaran ◽

Ingo Frommholz ◽

Kokil Jaidka ◽

Min-Yen Kan ◽

...

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)

Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '17 ◽

10.1145/3077136.3084370 ◽

2017 ◽

Author(s):

Muthu Kumar Chandrasekaran ◽

Kokil Jaidka ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2019)

Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR'19 ◽

10.1145/3331184.3331650 ◽

2019 ◽

Cited By ~ 1

Author(s):

Muthu Kumar Chandrasekaran ◽

Philipp Mayr ◽

Michihiro Yasunaga ◽

Dayne Freitag ◽

Dragomir Radev ◽

...

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Report on the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018)

ACM SIGIR Forum ◽

10.1145/3308774.3308792 ◽

2019 ◽

Vol 52 (2) ◽

pp. 105-110 ◽

Cited By ~ 2

Author(s):

Philipp Mayr ◽

Muthu Kumar Chandrasekaran ◽

Kokil Jaidka

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Report on the 2nd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)

ACM SIGIR Forum ◽

10.1145/3190580.3190597 ◽

2018 ◽

Vol 51 (3) ◽

pp. 107-113 ◽

Cited By ~ 5

Author(s):

Philipp Mayr ◽

Muthu Kumar Chandrasekaran ◽

Kokil Jaidka

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016)

Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries - JCDL '16 ◽

10.1145/2910896.2926734 ◽

2016 ◽

Cited By ~ 4

Author(s):

Guillaume Cabanac ◽

Muthu Kumar Chandrasekaran ◽

Ingo Frommholz ◽

Kokil Jaidka ◽

Min-Yen Kan ◽

...

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Digital Libraries ◽

Joint Workshop

Applying Natural Language Processing, Information Retrieval and Machine Learning to Decision Support in Medical Coordination in an Emergency Medicine Context

2015 IEEE 28th International Symposium on Computer-Based Medical Systems ◽

10.1109/cbms.2015.82 ◽

2015 ◽

Cited By ~ 2

Author(s):

Juliana Tarossi Pollettini ◽

Hugo Cesar Pessotti ◽

Antonio Pazin Filho ◽

Evandro Eduardo Seron Ruiz ◽

Mario Sergio Adolfi Junior

Keyword(s):

Machine Learning ◽

Emergency Medicine ◽

Information Retrieval ◽

Natural Language Processing ◽

Decision Support ◽

Natural Language ◽

Language Processing ◽

Processing Information

LIS4: Lesk Inspired Sense Specific Semantic Similarity using WordNet

Journal of Information & Knowledge Management ◽

10.1142/s0219649221500064 ◽

2021 ◽

pp. 2150006

Author(s):

Saravanakumar Kandasamy ◽

Aswani Kumar Cherukuri

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Semantic Similarity ◽

Language Processing ◽

Gold Standard ◽

Question Answering ◽

Knowledge Based ◽

Benchmark Datasets ◽

Processing Information

Quantifying semantic similarity between concepts is an essential part of domains such as Natural Language Processing, Information Retrieval, and Question Answering, where it helps to better understand texts and their relationships. Over the last few decades, many measures have been proposed that incorporate various corpus-based and knowledge-based resources. WordNet and Wikipedia are two such knowledge-based resources. WordNet's contribution to these domains is enormous because of its richness in defining a word and all of its relationships with other words. In this paper, we propose an approach to quantifying the similarity between concepts that exploits the synsets and gloss definitions of different concepts in WordNet. Our method considers the gloss definitions, the contextual words that help define a word, the synsets of those contextual words, and the confidence of occurrence of a word in another word’s definition when calculating similarity. Evaluation on different gold-standard benchmark datasets shows the efficiency of our system in comparison with other existing taxonomical and definitional measures.
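The gloss-based idea in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the LIS4 measure: it only computes a Jaccard overlap between the content words of two definitions, in the spirit of Lesk-style gloss comparison. The glosses, the stopword list, and the function names are all hypothetical and chosen for illustration; a real system would read glosses and synsets from WordNet.

```python
import re

# Hypothetical toy glosses; a real system would read these from WordNet.
TOY_GLOSSES = {
    "car": "a motor vehicle with four wheels used for transporting people",
    "bus": "a large motor vehicle used for transporting many people",
    "banana": "a long curved yellow fruit that grows on a tropical plant",
}

# Hypothetical minimal stopword list, just enough for the toy glosses.
STOPWORDS = {"a", "an", "the", "with", "for", "that", "on", "of", "used"}


def gloss_tokens(word):
    """Return the set of content words in the toy gloss of `word`."""
    tokens = re.findall(r"[a-z]+", TOY_GLOSSES[word].lower())
    return {t for t in tokens if t not in STOPWORDS}


def gloss_overlap(word_a, word_b):
    """Jaccard overlap of the two glosses' content words, in [0, 1]."""
    a, b = gloss_tokens(word_a), gloss_tokens(word_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(gloss_overlap("car", "bus"))
    print(gloss_overlap("car", "banana"))
```

With these toy definitions, "car" and "bus" share several gloss words while "car" and "banana" share none, so the first pair scores higher. The abstract's method additionally weighs contextual words, their synsets, and occurrence confidence, none of which this sketch attempts.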

The 2019 n2c2/OHNLP Track on Clinical Semantic Textual Similarity: Overview (Preprint)

10.2196/preprints.23375 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yanshan Wang ◽

Sunyang Fu ◽

Feichen Shen ◽

Sam Henry ◽

Ozlem Uzuner ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Shared Task ◽

Data Set ◽

Clinical Text ◽

Clinical Notes ◽

Clinical Domain ◽

Semantic Textual Similarity

BACKGROUND: Semantic textual similarity is a common task in the general English domain to assess the degree to which the underlying semantics of 2 text segments are equivalent to each other. Clinical Semantic Textual Similarity (ClinicalSTS) is the semantic textual similarity task in the clinical domain that attempts to measure the degree of semantic equivalence between 2 snippets of clinical text. Due to the frequent use of templates in the Electronic Health Record system, a large amount of redundant text exists in clinical notes, making ClinicalSTS crucial for the secondary use of clinical text in downstream clinical natural language processing applications, such as clinical text summarization, clinical semantics extraction, and clinical information retrieval.

OBJECTIVE: Our objective was to release ClinicalSTS data sets and to motivate natural language processing and biomedical informatics communities to tackle semantic text similarity tasks in the clinical domain.

METHODS: We organized the first BioCreative/OHNLP ClinicalSTS shared task in 2018 by making available a real-world ClinicalSTS data set. We continued the shared task in 2019 in collaboration with National NLP Clinical Challenges (n2c2) and the Open Health Natural Language Processing (OHNLP) consortium and organized the 2019 n2c2/OHNLP ClinicalSTS track. We released a larger ClinicalSTS data set comprising 1642 clinical sentence pairs, including 1068 pairs from the 2018 shared task and 1006 new pairs from 2 electronic health record systems, GE and Epic. We released 80% (1642/2054) of the data to participating teams to develop and fine-tune the semantic textual similarity systems and used the remaining 20% (412/2054) as blind testing to evaluate their systems. The workshop was held in conjunction with the American Medical Informatics Association 2019 Annual Symposium.

RESULTS: Of the 78 international teams that signed on to the n2c2/OHNLP ClinicalSTS shared task, 33 produced a total of 87 valid system submissions. The top 3 systems were generated by IBM Research, the National Center for Biotechnology Information, and the University of Florida, with Pearson correlations of r=.9010, r=.8967, and r=.8864, respectively. Most top-performing systems used state-of-the-art neural language models, such as BERT and XLNet, and state-of-the-art training schemas in deep learning, such as pretraining and fine-tuning schema, and multitask learning. Overall, the participating systems performed better on the Epic sentence pairs than on the GE sentence pairs, despite a much larger portion of the training data being GE sentence pairs.

CONCLUSIONS: The 2019 n2c2/OHNLP ClinicalSTS shared task focused on computing semantic similarity for clinical text sentences generated from clinical notes in the real world. It attracted a large number of international teams. The ClinicalSTS shared task could continue to serve as a venue for researchers in natural language processing and medical informatics communities to develop and improve semantic textual similarity techniques for clinical text.
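The systems in this track are ranked by Pearson correlation between predicted and reference similarity scores. As a rough sketch of how such a score could be computed, the function below implements the standard Pearson formula in plain Python. The function name and the example scores are hypothetical; this is not the organizers' evaluation script.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centered products and squares (unnormalized covariance/variance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical reference and predicted similarity scores for five pairs.
    gold = [0.0, 1.5, 3.0, 4.0, 5.0]
    predicted = [0.4, 1.0, 2.8, 4.2, 4.6]
    print(round(pearson_r(gold, predicted), 4))
```

Because Pearson correlation is invariant to shifting and positive rescaling of the predictions, it rewards systems that order and space sentence pairs like the annotators did, even if their absolute scores are offset.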