general domain
Recently Published Documents

TOTAL DOCUMENTS: 129 (FIVE YEARS: 32)
H-INDEX: 14 (FIVE YEARS: 2)

2022 · Vol 3 (1) · pp. 1-23
Author(s): Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, ...

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general-domain corpora, such as newswire and the Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile a comprehensive biomedical NLP benchmark from publicly available datasets. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks, leading to new state-of-the-art results across the board. Further, in conducting a thorough evaluation of modeling choices, both for pretraining and task-specific fine-tuning, we discover that some common practices are unnecessary with BERT models, such as using complex tagging schemes in named entity recognition. To help accelerate research in biomedical NLP, we have released our state-of-the-art pretrained and task-specific models for the community, and created a leaderboard featuring our BLURB benchmark (short for Biomedical Language Understanding & Reasoning Benchmark) at https://aka.ms/BLURB.
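For readers who want a concrete picture of the downstream setting the abstract describes, here is a minimal Python sketch that loads a domain-specific pretrained encoder with a plain token-classification head (no complex tagging scheme), using the Hugging Face transformers API; the checkpoint identifier is an assumption for illustration and is not stated in the abstract.

```python
# Minimal sketch: domain-specific pretrained encoder + simple token-classification head.
# The checkpoint name below is assumed for illustration, not taken from the abstract.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext"  # assumed id

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Two labels only ("entity" vs. "outside"), illustrating that an elaborate
# tagging scheme may be unnecessary with BERT-style models.
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=2)

text = "Imatinib inhibits the BCR-ABL tyrosine kinase."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits       # shape: (1, seq_len, num_labels)
print(logits.argmax(dim=-1))              # per-token label ids (untrained head)
```

The classification head here is randomly initialized; in practice it would be fine-tuned on a labeled biomedical dataset such as those collected in the BLURB benchmark.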


2022
Author(s): Hu Lu, Lazaros Varvarezos, Piergiorgio Nicolosi, Alberto Andrighetto, Daniele Scarpa, ...

We report on measurements of resonant three-step, two-colour ionization of atomic molybdenum, using a hollow cathode lamp (HCL) with optogalvanic detection. Wavelength scans were made for two specific transitions involved in the ionization pathways under investigation, namely 4d⁵(⁶S)5s ⁷S₃ – 4d⁵(⁶S)5p ⁷P₄ and 4d⁵(⁶S)5p ⁷P₄ – 4d⁵(⁶S)6d ⁷D₅. So-called ‘slow’ and ‘fast’ optogalvanic signals were observed for each pathway. The results confirm the HCL as a cost-effective spectroscopic investigation tool. In particular, its use in the optogalvanic mode of operation allows one to tune the wavelength of one or more lasers precisely, easily, and reliably to resonances of interest for experiments in the general domain of atomic vapour laser isotope separation (AVLIS). The measurements are closely related to the Selective Production of Exotic Species (SPES) project at the ISOL facility and were performed in the recently established laser laboratory at the INFN Legnaro National Laboratories.


2021 · Vol 12 (1) · pp. 154
Author(s): Ziheng Zhang, Feng Han, Hongjian Zhang, Tomohiro Aoki, Katsuhiko Ogasawara

Biomedical terms extracted using Word2vec, the most popular word embedding model in recent years, serve as the foundation for various natural language processing (NLP) applications, such as biomedical information retrieval, relation extraction, and recommendation systems. The objective of this study is to examine how changes in the ratio of biomedical-domain to general-domain data in the corpus affect the extraction of similar biomedical terms using Word2vec. We downloaded abstracts of 214,892 articles from PubMed Central (PMC) and the 3.9 GB Billion Word (BW) benchmark corpus from the computer science community. The datasets were preprocessed and grouped into 11 corpora based on the ratio of BW to PMC, ranging from 0:10 to 10:0, and Word2vec models were trained on these corpora. The cosine similarities between the biomedical terms obtained from the Word2vec models were then compared across models. The results indicated that the models trained with both BW and PMC data outperformed the model trained only with medical data. The similarity between the extracted biomedical terms increased when the ratio of biomedical-domain to general-domain data was between 3:7 and 5:5. This study gives NLP researchers more information on which to base their use of Word2vec, helping them increase the similarity of extracted biomedical terms and thereby improve their effectiveness in NLP applications such as biomedical information extraction.
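The following sketch illustrates the kind of ratio experiment described above, assuming gensim ≥ 4.0. The placeholder sentences, the subsampling helper, and the example term pair are illustrative stand-ins, not details from the study.

```python
import random
from gensim.models import Word2Vec

# Tiny placeholder corpora standing in for the tokenized Billion Word (general
# domain) and PubMed Central (biomedical) sentences used in the study.
bw_sentences = [["the", "market", "closed", "slightly", "higher", "today"]] * 1000
pmc_sentences = [["acute", "myocardial", "infarction", "is", "a", "cardiac", "event"]] * 1000

def mix_corpora(bw, pmc, bw_parts, pmc_parts, total=1000):
    """Subsample both corpora so their sizes follow a bw_parts:pmc_parts ratio."""
    mixed = (random.sample(bw, total * bw_parts // 10)
             + random.sample(pmc, total * pmc_parts // 10))
    random.shuffle(mixed)
    return mixed

for bw_parts in range(0, 11):                      # BW:PMC ratios 0:10 ... 10:0
    pmc_parts = 10 - bw_parts
    corpus = mix_corpora(bw_sentences, pmc_sentences, bw_parts, pmc_parts)
    model = Word2Vec(sentences=corpus, vector_size=100, window=5,
                     min_count=1, workers=4)
    # Cosine similarity of an example biomedical term pair under this ratio.
    if "myocardial" in model.wv and "infarction" in model.wv:
        sim = model.wv.similarity("myocardial", "infarction")
        print(f"BW:PMC = {bw_parts}:{pmc_parts}  similarity = {sim:.3f}")
    else:
        print(f"BW:PMC = {bw_parts}:{pmc_parts}  term pair not in vocabulary")
```

In the study itself the corpora are, of course, the full preprocessed PMC and BW text collections rather than repeated placeholder sentences.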


2021 · pp. 089020702110569
Author(s): Jan-Philipp Freudenstein, Patrick Mussel, Stefan Krumm

In response to recent calls to incorporate Implicit Trait Policies (ITPs) into personality research, the current study examined the construct-related validity of ITP measures. ITPs are defined as implicit beliefs about the effectiveness of behaviors that reflect a certain trait. They are assessed using the methodology of Situational Judgment Tests. We empirically examined (N = 339) several key assumptions underlying ITP theory, including trait-specificity, the relation to personality traits, context-independence, and the relation to general domain knowledge. Overall, our results showed little support for these assumptions. Although we found some confirmation of the expected correlations between ITPs and personality traits, most of the observed variance in ITP measures was either method-specific or due to measurement error. We conclude that the ITP measures examined here lack construct-related validity, and we discuss implications for ITP theory and assessment.


2021
Author(s): Alan da Silva Romualdo, Livy Real, Helena de Medeiros Caseli

Textual similarity deals with determining how similar two pieces of text are, considering their lexical (surface form) or semantic (meaning) closeness. In this paper we applied word embeddings to measure e-commerce product title similarity in Brazilian Portuguese. We generated domain-specific word embeddings (using Word2Vec, FastText, and GloVe) and compared them with general-domain models (word embeddings and BERT models). We concluded that the cosine similarity calculated using the domain-specific word embeddings was a good approach for distinguishing between similar and non-similar products, but the multilingual pre-trained BERT model proved to be the best one.
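Below is a minimal sketch of the similarity computation described above, assuming gensim and NumPy; the title texts, the model path, and the decision threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np
from gensim.models import Word2Vec

def title_vector(model, title):
    """Average the word vectors of a title's in-vocabulary tokens."""
    tokens = [t for t in title.lower().split() if t in model.wv]
    if not tokens:
        return np.zeros(model.wv.vector_size)
    return np.mean([model.wv[t] for t in tokens], axis=0)

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

# Hypothetical usage with a domain-specific model trained on product titles:
# model = Word2Vec.load("w2v_product_titles.model")                 # assumed path
# sim = cosine(title_vector(model, "smartphone samsung galaxy 64gb preto"),
#              title_vector(model, "celular samsung galaxy 64gb preto"))
# print("similar" if sim >= 0.8 else "not similar")                 # illustrative threshold
```

A sentence-level BERT comparison would instead encode each full title with a multilingual pre-trained model and compare the resulting sentence vectors in the same way.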


Author(s): David Beltrán, Bo Liu, Manuel de Vega

Negation is known to have inhibitory consequences for the information under its scope. However, how it produces such effects remains poorly understood. Recently, it has been proposed that negation processing might be implemented at the neural level through the recruitment of inhibitory and cognitive control mechanisms. Along these lines, this manuscript offers the hypothesis that negation reuses general-domain mechanisms that subserve inhibition in other, non-linguistic cognitive functions. The first two sections describe the inhibitory effects of negation on conceptual representations and its embodied effects, as well as the theoretical foundations for the reuse hypothesis. The next section describes the neurophysiological evidence that linguistic negation interacts with response inhibition, along with the suggestion that both functions share inhibitory mechanisms. Finally, the manuscript concludes that the functional relation between negation and inhibition observed at the mechanistic level could be readily integrated with predominant cognitive models of negation processing.

