natural language processing Latest Research Papers

Towards Developing Uniform Lexicon Based Sorting Algorithm for Three Prominent Indo-Aryan Languages

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3488371 ◽

2022 ◽

Vol 21 (3) ◽

pp. 1-20

Author(s):

Mir Ragib Ishraq ◽

Nitesh Khadka ◽

Asif Mohammed Samir ◽

M. Shahidur Rahman

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Second Step ◽

Sorting Algorithm ◽

Good Efficiency ◽

Local Language ◽

Total Accuracy

Three different Indic/Indo-Aryan languages - Bengali, Hindi and Nepali have been explored here in character level to find out similarities and dissimilarities. Having shared the same root, the Sanskrit, Indic languages bear common characteristics. That is why computer and language scientists can take the opportunity to develop common Natural Language Processing (NLP) techniques or algorithms. Bearing the concept in mind, we compare and analyze these three languages character by character. As an application of the hypothesis, we also developed a uniform sorting algorithm in two steps, first for the Bengali and Nepali languages only and then extended it for Hindi in the second step. Our thorough investigation with more than 30,000 words from each language suggests that, the algorithm maintains total accuracy as set by the local language authorities of the respective languages and good efficiency.

Efficient Channel Attention Based Encoder–Decoder Approach for Image Captioning in Hindi

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3483597 ◽

2022 ◽

Vol 21 (3) ◽

pp. 1-17

Author(s):

Santosh Kumar Mishra ◽

Gaurav Rai ◽

Sriparna Saha ◽

Pushpak Bhattacharyya

Keyword(s):

Computer Vision ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

English Language ◽

Image Understanding ◽

Attention Mechanism ◽

Image Captioning ◽

Textual Description ◽

Hindi Language

Image captioning refers to the process of generating a textual description that describes objects and activities present in a given image. It connects two fields of artificial intelligence, computer vision, and natural language processing. Computer vision and natural language processing deal with image understanding and language modeling, respectively. In the existing literature, most of the works have been carried out for image captioning in the English language. This article presents a novel method for image captioning in the Hindi language using encoder–decoder based deep learning architecture with efficient channel attention. The key contribution of this work is the deployment of an efficient channel attention mechanism with bahdanau attention and a gated recurrent unit for developing an image captioning model in the Hindi language. Color images usually consist of three channels, namely red, green, and blue. The channel attention mechanism focuses on an image’s important channel while performing the convolution, which is basically to assign higher importance to specific channels over others. The channel attention mechanism has been shown to have great potential for improving the efficiency of deep convolution neural networks (CNNs). The proposed encoder–decoder architecture utilizes the recently introduced ECA-NET CNN to integrate the channel attention mechanism. Hindi is the fourth most spoken language globally, widely spoken in India and South Asia; it is India’s official language. By translating the well-known MSCOCO dataset from English to Hindi, a dataset for image captioning in Hindi is manually created. The efficiency of the proposed method is compared with other baselines in terms of Bilingual Evaluation Understudy (BLEU) scores, and the results obtained illustrate that the method proposed outperforms other baselines. The proposed method has attained improvements of 0.59%, 2.51%, 4.38%, and 3.30% in terms of BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores, respectively, with respect to the state-of-the-art. Qualities of the generated captions are further assessed manually in terms of adequacy and fluency to illustrate the proposed method’s efficacy.

Model Transformation Development Using Automated Requirements Analysis, Metamodel Matching, and Transformation by Example

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3471907 ◽

2022 ◽

Vol 31 (2) ◽

pp. 1-71

Author(s):

K. Lano ◽

S. Kolahdouz-Rahimi ◽

S. Fang

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Model Transformation ◽

Requirements Analysis ◽

Model Transformations ◽

Synthesis Process ◽

Development Effort ◽

List Type

In this article, we address how the production of model transformations (MT) can be accelerated by automation of transformation synthesis from requirements, examples, and metamodels. We introduce a synthesis process based on metamodel matching, correspondence patterns between metamodels, and completeness and consistency analysis of matches. We describe how the limitations of metamodel matching can be addressed by combining matching with automated requirements analysis and model transformation by example (MTBE) techniques. We show that in practical examples a large percentage of required transformation functionality can usually be constructed automatically, thus potentially reducing development effort. We also evaluate the efficiency of synthesised transformations. Our novel contributions are: The concept of correspondence patterns between metamodels of a transformation. Requirements analysis of transformations using natural language processing (NLP) and machine learning (ML). Symbolic MTBE using “predictive specification” to infer transformations from examples. Transformation generation in multiple MT languages and in Java, from an abstract intermediate language.

A Computational Look at Oral History Archives

Journal on Computing and Cultural Heritage ◽

10.1145/3477605 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-16

Author(s):

Francisca Pessanha ◽

Almila Akdag Salah

Keyword(s):

Signal Processing ◽

Natural Language Processing ◽

Speech Recognition ◽

Natural Language ◽

Oral History ◽

Language Processing ◽

Automatic Speech Recognition ◽

Visual Cues ◽

Social Signal Processing ◽

Social Signal

Computational technologies have revolutionized the archival sciences field, prompting new approaches to process the extensive data in these collections. Automatic speech recognition and natural language processing create unique possibilities for analysis of oral history (OH) interviews, where otherwise the transcription and analysis of the full recording would be too time consuming. However, many oral historians note the loss of aural information when converting the speech into text, pointing out the relevance of subjective cues for a full understanding of the interviewee narrative. In this article, we explore various computational technologies for social signal processing and their potential application space in OH archives, as well as neighboring domains where qualitative studies is a frequently used method. We also highlight the latest developments in key technologies for multimedia archiving practices such as natural language processing and automatic speech recognition. We discuss the analysis of both visual (body language and facial expressions), and non-visual cues (paralinguistics, breathing, and heart rate), stating the specific challenges introduced by the characteristics of OH collections. We argue that applying social signal processing to OH archives will have a wider influence than solely OH practices, bringing benefits for various fields from humanities to computer sciences, as well as to archival sciences. Looking at human emotions and somatic reactions on extensive interview collections would give scholars from multiple fields the opportunity to focus on feelings, mood, culture, and subjective experiences expressed in these interviews on a larger scale.

Which environmental features contribute to positive and negative perceptions of urban parks? A cross-cultural comparison using online reviews and Natural Language Processing methods

Landscape and Urban Planning ◽

10.1016/j.landurbplan.2021.104307 ◽

2022 ◽

Vol 218 ◽

pp. 104307

Author(s):

Songyao Huai ◽

Tim Van de Voorde

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Urban Parks ◽

Online Reviews ◽

Cross Cultural ◽

Cultural Comparison ◽

Environmental Features ◽

Cross Cultural Comparison ◽

Negative Perceptions

Natural language processing for smart construction: Current status and future directions

Automation in Construction ◽

10.1016/j.autcon.2021.104059 ◽

2022 ◽

Vol 134 ◽

pp. 104059

Author(s):

Chengke Wu ◽

Xiao Li ◽

Yuanjun Guo ◽

Jun Wang ◽

Zengle Ren ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Current Status ◽

Future Directions

Attention-based Unsupervised Keyphrase Extraction and Phrase Graph for COVID-19 Medical Literature Retrieval

ACM Transactions on Computing for Healthcare ◽

10.1145/3473939 ◽

2022 ◽

Vol 3 (1) ◽

pp. 1-16

Author(s):

Haoran Ding ◽

Xiao Luo

Keyword(s):

Neural Networks ◽

Natural Language Processing ◽

Language Processing ◽

Medical Literature ◽

Graph Model ◽

The Self ◽

Keyphrase Extraction ◽

Text Data ◽

Text Collections ◽

Extraction Model

Searching, reading, and finding information from the massive medical text collections are challenging. A typical biomedical search engine is not feasible to navigate each article to find critical information or keyphrases. Moreover, few tools provide a visualization of the relevant phrases to the query. However, there is a need to extract the keyphrases from each document for indexing and efficient search. The transformer-based neural networks—BERT has been used for various natural language processing tasks. The built-in self-attention mechanism can capture the associations between words and phrases in a sentence. This research investigates whether the self-attentions can be utilized to extract keyphrases from a document in an unsupervised manner and identify relevancy between phrases to construct a query relevancy phrase graph to visualize the search corpus phrases on their relevancy and importance. The comparison with six baseline methods shows that the self-attention-based unsupervised keyphrase extraction works well on a medical literature dataset. This unsupervised keyphrase extraction model can also be applied to other text data. The query relevancy graph model is applied to the COVID-19 literature dataset and to demonstrate that the attention-based phrase graph can successfully identify the medical phrases relevant to the query terms.

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

ACM Transactions on Computing for Healthcare ◽

10.1145/3458754 ◽

2022 ◽

Vol 3 (1) ◽

pp. 1-23

Author(s):

Yu Gu ◽

Robert Tinn ◽

Hao Cheng ◽

Michael Lucas ◽

Naoto Usuyama ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

State Of The Art ◽

Fine Tuning ◽

Entity Recognition ◽

Language Models ◽

General Domain ◽

Domain Specific ◽

And Task

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile a comprehensive biomedical NLP benchmark from publicly available datasets. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks, leading to new state-of-the-art results across the board. Further, in conducting a thorough evaluation of modeling choices, both for pretraining and task-specific fine-tuning, we discover that some common practices are unnecessary with BERT models, such as using complex tagging schemes in named entity recognition. To help accelerate research in biomedical NLP, we have released our state-of-the-art pretrained and task-specific models for the community, and created a leaderboard featuring our BLURB benchmark (short for Biomedical Language Understanding & Reasoning Benchmark) at https://aka.ms/BLURB .

An ensemble approach for healthcare application and diagnosis using natural language processing

Cognitive Neurodynamics ◽

10.1007/s11571-021-09758-y ◽

2022 ◽

Author(s):

Badi Alekhya ◽

R. Sasikumar

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Ensemble Approach ◽

Healthcare Application

Machine Learning and Natural Language Processing Enable a Data-Oriented Experimental Design Approach for Producing Biochar and Hydrochar from Biomass

Chemistry of Materials ◽

10.1021/acs.chemmater.1c02961 ◽

2022 ◽

Author(s):

Amauri J. Paula ◽

Odair Pastor Ferreira ◽

Antonio G. Souza Filho ◽

Francisco Nepomuceno Filho ◽

Carlos E. Andrade ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Experimental Design ◽

Natural Language ◽

Language Processing ◽

Design Approach

natural language processing
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Towards Developing Uniform Lexicon Based Sorting Algorithm for Three Prominent Indo-Aryan Languages

Efficient Channel Attention Based Encoder–Decoder Approach for Image Captioning in Hindi

Model Transformation Development Using Automated Requirements Analysis, Metamodel Matching, and Transformation by Example

A Computational Look at Oral History Archives

Which environmental features contribute to positive and negative perceptions of urban parks? A cross-cultural comparison using online reviews and Natural Language Processing methods

Natural language processing for smart construction: Current status and future directions

Attention-based Unsupervised Keyphrase Extraction and Phrase Graph for COVID-19 Medical Literature Retrieval

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

An ensemble approach for healthcare application and diagnosis using natural language processing

Machine Learning and Natural Language Processing Enable a Data-Oriented Experimental Design Approach for Producing Biochar and Hydrochar from Biomass

Export Citation Format

natural language processingRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Towards Developing Uniform Lexicon Based Sorting Algorithm for Three Prominent Indo-Aryan Languages

Efficient Channel Attention Based Encoder–Decoder Approach for Image Captioning in Hindi

Model Transformation Development Using Automated Requirements Analysis, Metamodel Matching, and Transformation by Example

A Computational Look at Oral History Archives

Which environmental features contribute to positive and negative perceptions of urban parks? A cross-cultural comparison using online reviews and Natural Language Processing methods

Natural language processing for smart construction: Current status and future directions

Attention-based Unsupervised Keyphrase Extraction and Phrase Graph for COVID-19 Medical Literature Retrieval

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

An ensemble approach for healthcare application and diagnosis using natural language processing

Machine Learning and Natural Language Processing Enable a Data-Oriented Experimental Design Approach for Producing Biochar and Hydrochar from Biomass

natural language processing
Recently Published Documents