Extrinsic Evaluation of Cross-Lingual Embeddings on the Patent Classification Task

Abstract

Patent classification is an expensive and time-consuming task that has conventionally been performed by domain experts. However, the growing number of filed patents and the complexity of the documents make the classification task challenging. The text of patent documents is not always written in a way that conveys knowledge efficiently. Moreover, patent classification is a multi-label classification task with a large number of labels, which complicates the problem further. Automating this expensive and laborious task is therefore essential for assisting domain experts in managing patent documents and for facilitating reliable search, retrieval, and further patent analysis tasks. Transfer learning and pre-trained language models have recently achieved state-of-the-art results in many Natural Language Processing tasks. In this work, we investigate the effect of fine-tuning pre-trained language models, namely BERT, XLNet, RoBERTa, and ELECTRA, for the task of multi-label patent classification. We compare these models with the baseline deep-learning approaches used for patent classification, and we use various word embeddings to enhance the performance of the baseline models. Experiments are conducted on the publicly available USPTO-2M patent classification benchmark and the M-patent dataset. We conclude that fine-tuning the pre-trained language models on patent text improves multi-label patent classification performance. Our findings indicate that XLNet performs best and achieves a new state-of-the-art classification performance with respect to precision, recall, and F1 measure, as well as coverage error and LRAP.
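The abstract names coverage error and LRAP (label ranking average precision) without defining them. The following is a minimal sketch of both metrics under their usual multi-label ranking definitions, not the authors' evaluation code. The function names, the toy labels and scores, and the handling of samples without any relevant label are assumptions made for illustration.

```python
def label_ranks(scores):
    """Rank of each label: how many labels score at least as high (1 = top)."""
    return [sum(1 for other in scores if other >= s) for s in scores]


def lrap(y_true, y_score):
    """Label ranking average precision, averaged over samples (higher is better)."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        ranks = label_ranks(scores)
        relevant = [j for j, t in enumerate(truth) if t]
        if not relevant:
            # Assumption: a sample with no relevant labels counts as a perfect score.
            total += 1.0
            continue
        precision_sum = 0.0
        for j in relevant:
            # Relevant labels ranked at or above label j, divided by label j's rank.
            hits = sum(1 for k in relevant if scores[k] >= scores[j])
            precision_sum += hits / ranks[j]
        total += precision_sum / len(relevant)
    return total / len(y_true)


def coverage_error(y_true, y_score):
    """Average depth in the ranking needed to cover every relevant label (lower is better)."""
    total = 0
    for truth, scores in zip(y_true, y_score):
        ranks = label_ranks(scores)
        relevant_ranks = [ranks[j] for j, t in enumerate(truth) if t]
        total += max(relevant_ranks, default=0)
    return total / len(y_true)


# Toy data (hypothetical): two documents, three candidate labels each.
y_true = [[1, 0, 0], [0, 0, 1]]
y_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
print(lrap(y_true, y_score), coverage_error(y_true, y_score))
```

In the toy data, the single relevant label is ranked second for the first document and third for the second, so LRAP is the mean of 1/2 and 1/3 and the coverage error is the mean of 2 and 3.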

Propensity Score Matching in Cross-Lingual Test Equating

PsycEXTRA Dataset ◽ 10.1037/e662962012-001 ◽ 2012

Author(s): Xin Liu ◽ Xiaobin Zhou ◽ Jianjun Zhu ◽ Jing-Jen Wang

Keyword(s): Propensity Score ◽ Propensity Score Matching ◽ Test Equating ◽ Cross Lingual

Efficient Solutions of the Density Classification Task in One-Dimensional Cellular Automata: Where Can They Be Found?

Complex Systems ◽ 10.25088/complexsystems.29.3.669 ◽ 2020 ◽ Vol 29 (3) ◽ pp. 669-688

Author(s): Zakaria Laboudi

Keyword(s): Cellular Automata ◽ Efficient Solutions ◽ Classification Task ◽ One Dimensional

Cross-lingual Search in the Psychology Search Engine PubPsych

10.26226/morressier.5cf632c4af72dec2b0554c00 ◽ 2019

Author(s): Erich Weichselgartner

Keyword(s): Search Engine ◽ Cross Lingual

Learning to Adapt Credible Knowledge in Cross-lingual Sentiment Analysis

10.3115/v1/p15-1041 ◽ 2015 ◽ Cited By ~ 10

Author(s): Qiang Chen ◽ Wenjie Li ◽ Yu Lei ◽ Xule Liu ◽ Yanxiang He

Keyword(s): Sentiment Analysis ◽ Cross Lingual

Patent Classification System in Japan Relating to Fourth Industrial Revolution Technology and Implication Thereof

Sogang Law Journal ◽ 10.35505/slj.2020.10.9.3.3 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 3-30

Author(s): Chihyun Kwon

Keyword(s): Classification System ◽ Industrial Revolution ◽ Patent Classification ◽ Fourth Industrial Revolution

Improving DNN Bluetooth Narrowband Acoustic Models by Cross-Bandwidth and Cross-Lingual Initialization

10.21437/interspeech.2017-1129 ◽ 2017 ◽ Cited By ~ 1

Author(s): Xiaodan Zhuang ◽ Arnab Ghoshal ◽ Antti-Veikko Rosti ◽ Matthias Paulik ◽ Daben Liu

Keyword(s): Acoustic Models ◽ Cross Lingual

Cross-Lingual Multi-Task Neural Architecture for Spoken Language Understanding

10.21437/interspeech.2018-1039 ◽ 2018

Author(s): Yujiang Li ◽ Xuemin Zhao ◽ Weiqun Xu ◽ Yonghong Yan

Keyword(s): Spoken Language ◽ Language Understanding ◽ Spoken Language Understanding ◽ Neural Architecture ◽ Cross Lingual

Patent-information investigation in the field of photogrammetry

Geodesy and Cartography ◽ 10.22389/0016-7126-2019-951-9-25-39 ◽ 2019 ◽ Vol 951 (9) ◽ pp. 25-39

Author(s): V.V. Zabavnikov ◽ A.N. Kobiakov ◽ S.V. Kovalev

Keyword(s): Technology Development ◽ Temporal Dynamics ◽ Statistical Processing ◽ Current Application ◽ Patent Classification ◽ Patent Information ◽ Inventive Activity ◽ Patent Documents ◽ Object Of Study ◽ Technical Solutions

Informational and analytical study of patent documentation reveals the patenting situation either in a specific technological area as a whole or in the patent activity of individual innovation entities, taking temporal dynamics and territorial distribution into account. A patent-information investigation was carried out to assess the level of development of photogrammetry technology and to determine its current application areas. Statistical and intellectual analysis of patent document texts was the basis for creating the relevant data array, grouped into 8680 patent families. The prepared report contains a graphical display of the selected array of patent documents related to the research topic, together with analytical and statistical processing. The level of inventive activity was assessed, and the worldwide patenting dynamics and geographic distribution in this technical field were considered. The main groups of the International Patent Classification, as well as the main technological directions in which technical solutions related to the object of study are patented, are identified. Information on the leading applicants and patent holders in this technical field is provided, and the list of the most cited patent documents is considered.

Complexity Approximation of Classification Task for Large Dataset Ensemble Artificial Neural Networks

Proceedings of the International Conference on Data Engineering 2015 (DaEng-2015) - Lecture Notes in Electrical Engineering ◽ 10.1007/978-981-13-1799-6_21 ◽ 2019 ◽ pp. 195-202

Author(s): Mumtazimah Mohamad ◽ Md Yazid Mohd Saman ◽ Nazirah Abd Hamid

Keyword(s): Neural Networks ◽ Artificial Neural Networks ◽ Classification Task ◽ Large Dataset ◽ Artificial Neural
