Authorship Weightage Algorithm for Academic Publications: A New Calculation and ACES Webserver for Determining Expertise

2021 ◽  
Vol 4 (2) ◽  
pp. 41
Author(s):  
Wei-Ling Wu ◽  
Owen Tan ◽  
Kwok-Fong Chan ◽  
Nicole Bernadette Ong ◽  
David Gunasegaran ◽  
...  

Despite the public availability of publication lists, finding experts in a given field through academic publications can be challenging, especially given the use of jargon. Even after overcoming these issues, the discernment of expertise by authorship position is often absent from the many publication-based search platforms. Given that it is common in many academic fields for the research group lead or lab head to take the position of last author, existing authorship scoring systems that assign a decreasing weightage from the first author would not credit the last author correctly. To address these problems, we incorporated natural language processing (fastText trained on Common Crawl) to retrieve related keywords when jargon is used, as well as a modified authorship positional scoring that allows a greater weightage to be assigned to the last author. The resulting output is a ranked scoring of researchers for every search, which we implemented as a webserver for internal use called the APD lab Capability & Expertise Search (ACES), accessible at webserver.apdskeg.com/aces.
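The modified positional scoring described above can be sketched in a few lines of Python. This is a hypothetical parameterisation (the abstract does not give the exact formula): weights decrease from the first author, and the last author is re-assigned the same weight as the first before normalising.

```python
def authorship_weights(n_authors):
    """Positional weights that decrease from the first author, with the
    last author re-weighted like the first (illustrative sketch only;
    not the published ACES formula)."""
    if n_authors == 1:
        return [1.0]
    raw = [1.0 / (i + 1) for i in range(n_authors)]  # 1, 1/2, 1/3, ...
    raw[-1] = raw[0]           # boost the last author (lab head) to first-author level
    total = sum(raw)
    return [w / total for w in raw]
```

With five authors this yields equal credit for the first and last positions and decreasing credit for the middle authors, matching the "lab head as last author" convention the paper targets.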



Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has prompted fact-checking organizations, groups that seek out claims and comment on their veracity, to spring up worldwide to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations currently in operation, disinformation continues to run rampant across the Web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science that use natural language processing to automate fact checking. It follows the entire automated fact-checking pipeline, from detecting claims to checking facts to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement before widespread use.
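The claim-detection stage of the pipeline the survey describes can be caricatured with a naive heuristic: flag sentences containing numeric or comparative cues as check-worthy. This is a toy stand-in; real systems use trained classifiers (e.g. ClaimBuster-style models), and the cue list here is entirely hypothetical.

```python
import re

# hypothetical check-worthiness cues: digits or comparative/absolute terms
CUES = re.compile(r"\d|\b(more|less|most|least|highest|lowest|never|always)\b", re.I)

def checkworthy(sentences):
    """Return the sentences flagged as potentially check-worthy claims
    (toy heuristic, not a trained claim-detection classifier)."""
    return [s for s in sentences if CUES.search(s)]
```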


2011 ◽  
Vol 17 (2) ◽  
pp. 141-144
Author(s):  
ANSSI YLI-JYRÄ ◽  
ANDRÁS KORNAI ◽  
JACQUES SAKAROVITCH

For the past two decades, specialised events on finite-state methods have been successful in presenting interesting studies on natural language processing to the public through journals and collections. The FSMNLP workshops have become well-known among researchers and are now the main forum of the Association for Computational Linguistics' (ACL) Special Interest Group on Finite-State Methods (SIGFSM). The current issue on finite-state methods and models in natural language processing was planned in 2008 in this context as a response to a call for special issue proposals. In 2010, the issue received a total of sixteen submissions, some of which were extended and updated versions of workshop papers, and others which were completely new. The final selection, consisting of only seven papers that could fit into one issue, is not fully representative, but complements the prior special issues in a nice way. The selected papers showcase a few areas where finite-state methods have less than obvious and sometimes even groundbreaking relevance to natural language processing (NLP) applications.


Author(s):  
Siddhartha Ghosh

E-governance is the public sector’s use of information and communication technologies (ICT) with the aim of improving information and service delivery, encouraging citizen participation in the decision-making process and making government more accountable, transparent and effective. Effective and efficient e-governments deploy ICT systems to deliver services through multiple channels that are accessible, fast, secure, reliable, seamless and coherent. To implement better government-to-government (G2G), government-to-business (G2B), government-to-enterprise (G2E) and government-to-citizen (G2C) services, good governance should not only utilize ICT but also be serious about implementing natural language processing (NLP) techniques to reach the masses and make e-governance successful. This chapter shows the need for applying NLP technologies in the field of e-governance and focuses on issues that can be resolved easily with the help of these modern technologies. It also shows the advantages of applying NLP in e-governance.


2021 ◽  
pp. 1-13
Author(s):  
Deguang Chen ◽  
Ziping Ma ◽  
Lin Wei ◽  
Yanbin Zhu ◽  
Jinlin Ma ◽  
...  

Text-based reading comprehension models have great research significance and market value, and represent one of the main directions of natural language processing. Reading comprehension models for single-span answers have recently attracted more attention and achieved significant results. In contrast, multi-span answer models for reading comprehension have been less investigated, and their performance needs improvement. To address this issue, in this paper we propose a text-based multi-span network for reading comprehension, ALBERT_SBoundary, and build a multi-span answer corpus, MultiSpan_NMU. We conduct extensive experiments on the public multi-span corpus, MultiSpan_DROP, and our multi-span answer corpus, MultiSpan_NMU, and compare the proposed method with the state of the art. The experimental results show that our proposed method achieves F1 scores of 84.10 and 92.88 on the MultiSpan_DROP and MultiSpan_NMU datasets, respectively, while also having fewer parameters and a shorter training time.
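What distinguishes multi-span from single-span answers can be illustrated with a simple BIO-tag decode over tokens: a tagger labels each token Begin/Inside/Outside, and a decode pass recovers any number of answer spans. This is an illustrative sketch of the task setup, not the paper's ALBERT_SBoundary architecture.

```python
def decode_spans(tokens, labels):
    """Recover multiple answer spans from per-token BIO labels
    (illustrative decode for the multi-span answer setting)."""
    spans, current = [], []
    for tok, lab in zip(tokens, labels):
        if lab == "B":                      # a new span starts here
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif lab == "I" and current:        # continue the open span
            current.append(tok)
        else:                               # "O" (or stray "I") closes any open span
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans
```

A single-span model can only emit one span per question; the decode above returns every tagged span, which is the behaviour multi-span corpora such as MultiSpan_DROP evaluate.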


Author(s):  
Sneha Naik ◽  
Mona Mulchandani

Opinion mining draws on many different fields, such as natural language processing, text mining, decision making and linguistics. Opinion mining is a type of natural language processing for tracking the mood of the public about a particular product. Opinion mining, also called sentiment analysis, involves building a system to collect and categorize opinions about a product. Automated opinion mining often uses machine learning, a type of artificial intelligence (AI), to mine text for sentiment. This project addresses the problem of sentiment analysis on Twitter, that is, classifying tweets according to the sentiment expressed in them: positive, negative or neutral. Twitter is an online micro-blogging and social-networking platform that allows users to write short status updates with a maximum length of 140 characters. It is a rapidly expanding service with over 200 million registered users, of which 100 million are active and half log on to Twitter daily, generating nearly 250 million tweets per day. Given this large volume of usage, we hope to obtain a reflection of public sentiment by analysing the sentiments expressed in tweets. Analysing public sentiment is important for many applications, such as firms gauging the market response to their products, predicting political elections and predicting socioeconomic phenomena like stock market movements.
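The positive/negative/neutral labelling described above can be sketched with a minimal lexicon-based classifier. The word lists here are hypothetical toy examples; the project itself presumably trains a machine-learning model rather than using fixed lexicons.

```python
# toy sentiment lexicons (hypothetical, for illustration only)
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def classify_tweet(text):
    """Label a tweet positive/negative/neutral by counting lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```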


2018 ◽  
Vol 14 (11) ◽  
pp. 155014771881109 ◽  
Author(s):  
Yuxuan Jia ◽  
Bing Han ◽  
Qiang Li ◽  
Hong Li ◽  
Limin Sun

Although the Internet of Things (IoT) has recently been receiving attention from the research community, there undoubtedly still exist several privacy concerns about these devices. In particular, IoT devices in cyberspace are reachable and visible through IP addresses. This article uniquely qualifies the distribution of owner information of IoT devices, based on the observation that consumers may write relevant details, such as company names or usernames, into the application-layer services on IoT devices. We propose to automatically extract owner annotations by utilizing a set of techniques (network scanning, machine learning and natural language processing). We use probing and a classifier to determine whether the response data come from an IoT device. The natural language processing technique is used to extract owner information from IoT devices. We have conducted real-world experiments to evaluate our integrated approach empirically. The results show that the precision is 97% and the coverage is 96%. Furthermore, our approach runs on a much larger unlabeled dataset consisting of 93 million response packets from the whole IPv4 space. Our analysis reveals nearly 4.3 million IoT devices exposed to the public, and the owner information distribution shows a typical long-tail effect.
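The owner-extraction step can be caricatured with a regex pass over application-layer banners. This is an illustrative sketch only: the banner patterns below are hypothetical, and the paper's actual pipeline combines large-scale scanning, a trained device classifier, and NLP-based extraction rather than hand-written rules.

```python
import re

# hypothetical patterns for owner-like strings in HTTP banners / page titles
OWNER_PATTERNS = [
    re.compile(r"(?:Server owner|Contact|Company):\s*([\w .&-]+)", re.I),
    re.compile(r"<title>\s*([\w .&-]+?)\s+(?:camera|router|NVR)\s*</title>", re.I),
]

def extract_owner(banner):
    """Return the first owner-like string found in a device banner, or None."""
    for pat in OWNER_PATTERNS:
        m = pat.search(banner)
        if m:
            return m.group(1).strip()
    return None
```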


Author(s):  
Luisa Andreu ◽  
Enrique Bigne ◽  
Suzanne Amaro ◽  
Jesús Palomo

Purpose: The purpose of this study is to examine Airbnb research using bibliometric methods. Using research performance analysis, this study highlights and provides an updated overview of Airbnb research by revealing patterns in journals, papers and the most influential authors and countries. Furthermore, it graphically illustrates how research themes have evolved by mapping a co-word analysis and points out potential trends for future research.

Design/methodology/approach: The methodological design for this study involves three phases: document source selection, definition of the variables to be analyzed and the bibliometric analysis. A statistical multivariate analysis of all the documents’ characteristics was performed with R software. Furthermore, natural language processing techniques were used to analyze all the abstracts and keywords specified in the 129 selected documents.

Findings: Results show the genesis and evolution of publications on Airbnb research, the scatter of journals and their characteristics, author and productivity characteristics, the geographical distribution of the research and a content analysis using keywords.

Research limitations/implications: Although Airbnb has a history of 10 years, research publications only started in 2015; the bibliometric study therefore covers papers from 2015 to 2019. One of the main limitations is that papers were selected in October 2019, before the year was over. However, the latest academic publications (in press and earlycite) were included in the analysis.

Originality/value: This study applied the bibliometric laws of Price, Lotka and Bradford to better understand the patterns of the most relevant scientific production regarding Airbnb in tourism and hospitality journals. Using natural language processing techniques, this study analyzes all the abstracts and keywords specified in the selected documents. Results show the evolution of research topics in four periods: 2015-2016, 2017, 2018 and 2019.
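Lotka's law, one of the bibliometric laws the study applies, predicts that the number of authors with n publications falls off roughly as 1/n², scaled by the count of single-paper authors. A Python sketch of the check (the study itself used R; the data below are invented for illustration):

```python
from collections import Counter

def lotka_table(author_paper_counts):
    """Map n papers -> (observed author count, Lotka's 1/n^2 prediction),
    where the prediction is scaled by the single-paper author count."""
    freq = Counter(author_paper_counts)   # n papers -> number of authors
    c = freq[1]                           # authors with exactly one paper
    return {n: (freq[n], c / n ** 2) for n in sorted(freq)}
```

Comparing the observed and predicted columns (e.g. with a chi-square or Kolmogorov-Smirnov test) is the usual way such studies decide whether a field's author productivity follows Lotka's law.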


2021 ◽  
pp. 233-252
Author(s):  
Upendar Rao Rayala ◽  
Karthick Seshadri

Sentiment analysis is perceived to be a multi-disciplinary research domain composed of machine learning, artificial intelligence, deep learning, image processing, and social networks. Sentiment analysis can be used to determine opinions of the public about products and to find the customers' interest and their feedback through social networks. To perform any natural language processing task, the input text/comments should be represented in a numerical form. Word embeddings represent the given text/sentences/words as a vector that can be employed in performing subsequent natural language processing tasks. In this chapter, the authors discuss different techniques that can improve the performance of sentiment analysis using concepts and techniques like traditional word embeddings, sentiment embeddings, emoticons, lexicons, and neural networks. This chapter also traces the evolution of word embedding techniques with a chronological discussion of the recent research advancements in word embedding techniques.
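The embedding-to-feature step the chapter describes, turning text into a numerical form for downstream tasks, can be sketched by averaging per-word vectors into a sentence vector. The three-dimensional vectors below are toy values for illustration; real systems load pretrained embeddings such as word2vec, GloVe or the sentiment embeddings the chapter surveys.

```python
# toy 3-dimensional word embeddings (hypothetical values for illustration)
EMB = {
    "good":  [0.9, 0.1, 0.0],
    "movie": [0.1, 0.8, 0.1],
    "bad":   [-0.9, 0.1, 0.0],
}

def sentence_vector(words, dim=3):
    """Average the embeddings of known words into one fixed-size vector,
    the simplest way to feed variable-length text to a classifier."""
    known = [EMB[w] for w in words if w in EMB]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]
```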

