A term-based and citation network-based search system for COVID-19

JAMIA Open ◽  
2021 ◽  
Vol 4 (4) ◽  
Author(s):  
Chrysoula Zerva ◽  
Samuel Taylor ◽  
Axel J Soto ◽  
Nhung T H Nguyen ◽  
Sophia Ananiadou

The COVID-19 pandemic resulted in an unprecedented production of scientific literature spanning several fields. To facilitate navigation of the scientific literature related to various aspects of the pandemic, we developed an exploratory search system. The system is based on automatically identified technical terms, document citations, and their visualization, accelerating identification of relevant documents. It offers a multi-view interactive search and navigation interface, bringing together unsupervised approaches of term extraction and citation analysis. We conducted a user evaluation with domain experts, including epidemiologists, biochemists, medicinal chemists, and medical students. In general, most users were satisfied with the relevance and speed of the search results. More interestingly, participants mostly agreed on the capacity of the system to enable exploration and discovery of the search space using the graph visualization and filters. The system is updated on a weekly basis and is publicly available at http://www.nactem.ac.uk/cord/.

2021 ◽  
pp. 1-13
Author(s):  
Jenish Dhanani ◽  
Rupa Mehta ◽  
Dipti Rana

Legal practitioners analyze relevant previous judgments to prepare favorable and advantageous arguments for an ongoing case. In the legal domain, recommender systems (RS) effectively identify and recommend referentially and/or semantically relevant judgments. Because the number of available judgments is enormous, an RS needs to compute pairwise similarity scores for all unique judgment pairs in advance to minimize the recommendation response time. This practice introduces a scalability issue, as the number of pairs to be computed increases quadratically with the number of judgments, i.e., O(n²). However, only a limited number of pairs exhibit strong relevance between the judgments, so computing similarities for pairs with trivial relevance is unnecessary. To address the scalability issue, this research proposes a novel graph clustering based Legal Document Recommendation System (LDRS) that forms clusters of referentially similar judgments and finds semantically relevant judgments within those clusters. Pairwise similarity scores are therefore computed within each cluster only, restricting the search space to the cluster instead of the entire corpus. Thus, the proposed LDRS substantially reduces the number of similarity computations, which enables large numbers of judgments to be handled. It exploits the highly scalable Louvain approach to cluster the judgment citation network, and Doc2Vec to capture the semantic relevance among judgments within a cluster. The efficacy and efficiency of the proposed LDRS are evaluated and analyzed using a large real-life collection of judgments of the Supreme Court of India. The experimental results demonstrate the encouraging performance of the proposed LDRS in terms of accuracy, F1-scores, MCC scores, and computational complexity, which validates its applicability for scalable recommender systems.
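
The saving from restricting similarity computation to clusters can be illustrated with a small count. The sketch below is not the authors' implementation: the function names and the six-judgment cluster assignment are hypothetical, and the clusters are given by hand rather than produced by Louvain.

```python
from collections import Counter


def all_pairs(n: int) -> int:
    """Number of unordered pairs among n judgments: n(n-1)/2, which grows as O(n^2)."""
    return n * (n - 1) // 2


def within_cluster_pairs(cluster_of: dict[str, int]) -> int:
    """Pairs that remain when similarity is computed only inside each cluster."""
    sizes = Counter(cluster_of.values())
    return sum(all_pairs(size) for size in sizes.values())


# Hypothetical assignment of six judgments to two citation clusters.
cluster_of = {"J1": 0, "J2": 0, "J3": 0, "J4": 1, "J5": 1, "J6": 1}

total = all_pairs(len(cluster_of))          # every unique pair in the corpus
restricted = within_cluster_pairs(cluster_of)  # only pairs inside a cluster
print(total, restricted)
```

With two equal clusters the pair count drops from 15 to 6; the reduction becomes larger as the corpus is split into more, smaller clusters.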


Terminology ◽  
2000 ◽  
Vol 6 (2) ◽  
pp. 195-210 ◽  
Author(s):  
Hiroshi Nakagawa

The NTCIR1 TMREC group called for participation in the term recognition task, which was part of NTCIR1, held in 1999. As an activity of TMREC, the group provided the test collection for the term recognition task. The goal of this task is to automatically recognize and extract terms from a text corpus consisting of 1,870 abstracts gathered from the NACSIS Academic Conference Database. This article describes the term extraction method we have proposed to extract terms consisting of simple and compound nouns, and the experimental evaluation of the proposed method with this NTCIR TMREC test collection. The basic idea for scoring a simple noun N in our term extraction method is to count how many nouns are conjoined with N to make compound nouns. We then extend this score to compound nouns, because most technical terms are compound nouns. Our method has a parameter to tune the degree of preference for either longer or shorter compound nouns. As for term candidates, in addition to noun sequences, we may add variations such as the patterns "A no B", which roughly means "B of A" or "A's B", and/or "A na B", where "A na" is an adjective. Experimental results of our method are promising, namely recall of 0.83, precision of 0.46 and F-value of 0.59 for exactly matched extracted terms when we take into account the 16,000 top-scoring extracted terms.
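
The scoring idea in this abstract can be sketched as follows. The abstract only states that a noun is scored by counting the nouns conjoined with it; the specific combination used here, (left neighbours + 1) × (right neighbours + 1), the length-normalised product for compounds, and the `beta` parameter are assumptions made for illustration, as are the example compounds.

```python
from math import prod


def noun_scores(compounds):
    """Score each simple noun by how many distinct nouns adjoin it in compounds.

    Assumption: score = (distinct left neighbours + 1) * (distinct right neighbours + 1).
    """
    left, right = {}, {}
    for compound in compounds:
        for noun in compound:
            left.setdefault(noun, set())
            right.setdefault(noun, set())
        for a, b in zip(compound, compound[1:]):
            right[a].add(b)  # b directly follows a
            left[b].add(a)   # a directly precedes b
    return {n: (len(left[n]) + 1) * (len(right[n]) + 1) for n in left}


def compound_score(compound, scores, beta=1.0):
    """Hypothetical extension to compounds: length-normalised product of noun scores.

    beta = 1 gives the geometric mean; beta < 1 favours longer compounds,
    beta > 1 favours shorter ones.
    """
    return prod(scores[n] for n in compound) ** (1.0 / (len(compound) * beta))


# Hypothetical compound nouns, each given as a tuple of simple nouns.
compounds = [
    ("term", "extraction"),
    ("term", "recognition"),
    ("information", "extraction"),
    ("term", "extraction", "method"),
]
scores = noun_scores(compounds)
print(scores["term"], scores["extraction"])
print(compound_score(("term", "extraction"), scores))
```

Nouns that combine with many different neighbours, such as "term" and "extraction" here, receive higher scores than nouns seen in a single compound.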


Terminology ◽  
2001 ◽  
Vol 7 (2) ◽  
pp. 259-279 ◽  
Author(s):  
Heather Fulford

The proliferation of specialist texts over recent decades has exacerbated the need for term extraction software to assist terminologists in compiling terminology collections. To this end, an automated approach to English term extraction is presented, which, in keeping with the multidisciplinary working environments of many contemporary terminologists, is designed to be domain independent. Based on observations made of the linguistic features of terms and their linguistic environment in text, this approach identifies single- and multi-word terms spanning a range of word classes. An implementation of the approach (denoted ‘Textprobe’) is described and evaluated by measuring its term extraction efficiency against the manual scanning output of both domain experts and terminologists. Results obtained in the evaluation suggest that a high proportion of single- and multi-word terms can successfully be extracted from special language texts. It is anticipated that the approach will be portable to other European languages.


2021 ◽  
Vol 11 (22) ◽  
pp. 10970
Author(s):  
Naif Radi Aljohani ◽  
Ayman Fayoumi ◽  
Saeed-Ul Hassan

We investigated the dissemination of scientific research by analyzing publication and citation data, on the premise that not all citations are equally important. Unlike existing state-of-the-art models that employ feature-based techniques to measure scholarly research dissemination between multiple entities, our model implements a convolutional neural network (CNN) with fastText-based pre-trained embedding vectors and utilizes only the citation context as its input to distinguish between important and non-important citations. Moreover, we explore focal-loss and class weight methods to address the inherent class imbalance problems in citation classification datasets. Using a dataset of 10 K annotated citation contexts, we achieved an accuracy of 90.7% along with a 90.6% F1-score in the case of binary classification. Finally, we present a case study to measure the comprehensiveness of our deployed model on a dataset of 3100 K citations taken from the ACL Anthology Reference Corpus. We employed the state-of-the-art open-source graph visualization tool Gephi to analyze various aspects of the citation network graphs for each respective citation behavior.
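
For readers unfamiliar with focal loss, the following minimal sketch shows the standard binary form with a class weight, computed for a single example. The `gamma` and `alpha` values are illustrative defaults, not the settings used in the paper, and the function name is hypothetical.

```python
import math


def focal_loss(p_important: float, label: int, gamma: float = 2.0, alpha: float = 0.75) -> float:
    """Binary focal loss for one example.

    p_important: predicted probability of the positive ("important") class.
    alpha weights the positive class and (1 - alpha) the negative class.
    gamma down-weights examples the model already classifies confidently.
    """
    p_t = p_important if label == 1 else 1.0 - p_important
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident and correct: small loss
hard = focal_loss(0.1, 1)  # confident and wrong: large loss
print(easy, hard)
```

With `gamma=0` the expression reduces to class-weighted cross-entropy, which is why the two techniques are often tuned together on imbalanced data.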


Author(s):  
Thomas Schultz ◽  
Niccolò Ridi

This introductory chapter provides an overview of the arbitration literature. Arbitration literature has a long history. So far, however, no attempt has been made to examine it and its evolution systematically and with a quantitative approach. The lack of investigation of this research question is, in and by itself, surprising. Clearly, the literature plays a strong role in shaping the thinking and making of international arbitration law. Moreover, literature—and scientific literature in particular—is a privileged conduit for the various actors in the social field of international arbitration. The chapter then looks at scientometrics. This field was first defined as ‘the quantitative methods of the research on the development of science as an informational process’. On the scientometrics market, the citation is the main currency. The rationale is that citation counts are positively associated with subsequent impact. Thus, arbitration literature can be measured in two ways. First, one determines which works are the most cited, in absolute terms and over time, for two different time windows. These are the works that likely have had the most impact on the knowledge in and about arbitration, where this knowledge is taken as a single, common whole. Second, one looks at what the co-citation network can reveal about the make-up of the world of arbitration literature.
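
The two measurements described here, citation counts and co-citation links, can be sketched on a toy example. The reference lists below are invented, and the function names are hypothetical; the chapter does not specify its tooling.

```python
from collections import Counter
from itertools import combinations


def citation_counts(reference_lists):
    """How many citing works reference each cited work."""
    return Counter(ref for refs in reference_lists.values() for ref in set(refs))


def cocitation_counts(reference_lists):
    """How many citing works reference both members of each pair of cited works."""
    pairs = Counter()
    for refs in reference_lists.values():
        for a, b in combinations(sorted(set(refs)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical citing works (keys) and the works they cite (values).
reference_lists = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}
counts = citation_counts(reference_lists)
cocited = cocitation_counts(reference_lists)
print(counts.most_common())
print(cocited.most_common())
```

The pair counts serve as edge weights of a co-citation network, in which works that are frequently cited together sit close to one another.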


2013 ◽  
Vol 13 (02) ◽  
pp. 1340008
Author(s):  
AKRITI NIGAM ◽  
AJAY INDORIA ◽  
R. C. TRIPATHI

In this paper, an efficient preprocessing module is described that focuses on building a trademark database for use in developing a trademark retrieval system. The preprocessing module removes noise from the trademark images using an adaptive filtering technique based on Wiener filters, followed by the Karhunen–Loève transform, which makes the trademark search process rotation invariant by rotating the object along the positive y direction. Since registered trademarks are huge in number and will invariably increase in the future, it will be strenuous for the search system to search for similarity in such a huge database. To reduce the search space, fuzzy clustering has been applied. All these preprocessing steps make a retrieval system more efficient and reduce computation cost.
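
One common way to obtain rotation invariance with the Karhunen–Loève transform is to rotate an object so that its principal axis lies along the y axis. The sketch below does this for a set of 2D foreground pixel coordinates; it is an assumption about how such a step might look, not the paper's procedure, and it does not resolve the 180° ambiguity between the positive and negative y direction.

```python
import math


def principal_angle(points):
    """Angle (radians, from the x axis) of the principal axis of a 2D point set."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Orientation of the leading eigenvector of the 2x2 covariance matrix.
    return 0.5 * math.atan2(2.0 * cxy, cxx - cyy)


def align_to_y_axis(points):
    """Centre the points and rotate them so the principal axis lies along y."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    phi = math.pi / 2 - principal_angle(points)
    c, s = math.cos(phi), math.sin(phi)
    return [((x - mx) * c - (y - my) * s, (x - mx) * s + (y - my) * c) for x, y in points]


# Hypothetical foreground pixels lying on a diagonal stroke.
diagonal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
aligned = align_to_y_axis(diagonal)
print(aligned)
```

After alignment, two copies of the same shape at different rotations map to the same orientation (up to the 180° flip), so later similarity comparisons no longer depend on the original rotation.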


2021 ◽  
Vol 8 ◽  
Author(s):  
Mohammed Odeh ◽  
Faten F. Kharbat ◽  
Rana Yousef ◽  
Yousra Odeh ◽  
Dina Tbaishat ◽  
...  

Background: Few ontological attempts have been reported for conceptualizing the bioethics domain. In addition to limited scope representativeness and lack of robust methodological approaches in driving research design and evaluation of bioethics ontologies, no bioethics ontologies exist for pandemics and COVID-19. This research attempted to investigate whether studying the bioethics research literature, from the inception of bioethics research publications, facilitates developing a highly agile and representative computational bioethics ontology as a foundation for the automatic governance of bioethics processes in general and the COVID-19 pandemic in particular. Research Design: The iOntoBioethics agile research framework adopted the Design Science Research Methodology. Using systematic literature mapping, the search space resulted in 26,170 Scopus indexed bioethics articles, published since 1971. iOntoBioethics underwent two distinctive stages: (1) Manually Constructing Bioethics (MCB) ontology from selected bioethics sources, and (2) Automatically generating bioethics ontological topic models with all 26,170 sources and using a special-purpose developed Text Mining and Machine-Learning (TM&ML) engine. Bioethics domain experts validated these ontologies, which were further extended to construct and validate the Bioethics COVID-19 Pandemic Ontology. Results: Cross-validation of the MCB and TM&ML bioethics ontologies confirmed that the latter provided higher-level abstraction for bioethics entities with a well-structured bioethics ontology class hierarchy compared to the MCB ontology. However, both bioethics ontologies were found to complement each other, forming a highly comprehensive Bioethics Ontology with around 700 concepts and associations, COVID-19 inclusive. Conclusion: The iOntoBioethics framework yielded the first agile, semi-automatically generated, literature-based, and domain experts validated General Bioethics and Bioethics Pandemic Ontologies operable in the COVID-19 context with readiness for automatic governance of bioethics processes. These ontologies will be regularly and semi-automatically enriched as iOntoBioethics is proposed as an open platform for scientific and healthcare communities, in their infancy COVID-19 learning stage. iOntoBioethics not only contributes to better understanding of bioethics processes, but also serves as a bridge linking these processes to healthcare systems. Such a big data analytics platform has the potential to automatically inform bioethics governance adherence given the plethora of developing bioethics and COVID-19 pandemic knowledge. Finally, iOntoBioethics contributes toward setting the first building block for forming the field of “Bioethics Informatics”.


Author(s):  
Yi-Jie Lu ◽  
Phuong Anh Nguyen ◽  
Hao Zhang ◽  
Chong-Wah Ngo

Author(s):  
Pengyu Zhao ◽  
Kecheng Xiao ◽  
Yuanxing Zhang ◽  
Kaigui Bian ◽  
Wei Yan

Recently, deep learning models have been widely explored in recommender systems. Though having achieved remarkable success, the design of task-aware recommendation models usually requires manual feature engineering and architecture engineering from domain experts. To relieve those efforts, we explore the potential of neural architecture search (NAS) and introduce AMEIR for Automatic behavior Modeling, interaction Exploration and multi-layer perceptron (MLP) Investigation in the Recommender system. Specifically, AMEIR divides the complete recommendation model into three stages of behavior modeling, interaction exploration, and MLP aggregation, and introduces a novel search space containing three tailored subspaces that cover most of the existing methods and thus allow for searching better models. To find the ideal architecture efficiently and effectively, AMEIR realizes the one-shot random search in recommendation progressively on the three stages and assembles the search results as the final outcome. Experiments over various scenarios reveal that AMEIR outperforms competitive baselines of elaborate manual design and leading algorithmically complex NAS methods with lower model complexity and comparable time cost, indicating the efficacy, efficiency, and robustness of the proposed method.
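
The stage-by-stage search described in this abstract can be illustrated with a toy random search. This is only a sketch of the progressive structure: the search space, the option names, and `toy_score` are all hypothetical stand-ins, and the one-shot weight sharing that AMEIR uses to evaluate candidates is not modelled here.

```python
import random

# Hypothetical three-stage search space; option names are illustrative only.
SEARCH_SPACE = {
    "behavior_modeling": ["gru", "attention", "conv"],
    "interaction_exploration": ["none", "pairwise", "triple"],
    "mlp_aggregation": [(64,), (128, 64), (256, 128, 64)],
}

# Stand-in quality values replacing a real validation metric.
TOY_QUALITY = {
    "gru": 0.2, "attention": 0.5, "conv": 0.3,
    "none": 0.0, "pairwise": 0.4, "triple": 0.3,
    (64,): 0.1, (128, 64): 0.3, (256, 128, 64): 0.2,
}


def toy_score(arch):
    """Placeholder evaluation: sums the stand-in quality of the choices made so far."""
    return sum(TOY_QUALITY[choice] for choice in arch.values())


def progressive_random_search(space, evaluate, samples_per_stage=5, seed=0):
    """Search one stage at a time, keeping earlier choices fixed."""
    rng = random.Random(seed)
    chosen = {}
    for stage, options in space.items():
        candidates = [rng.choice(options) for _ in range(samples_per_stage)]
        chosen[stage] = max(candidates, key=lambda option: evaluate({**chosen, stage: option}))
    return chosen


best = progressive_random_search(SEARCH_SPACE, toy_score)
print(best)
```

Searching the stages in sequence keeps the number of evaluated candidates linear in the number of stages, rather than multiplying the sizes of the three subspaces.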

