An Efficient Cross-Lingual BERT Model for Text Classification and Named Entity Extraction in Multilingual Dataset

Author(s):  
Asoke Nath ◽  
Debapriya Kandar ◽  
Rahul Gupta

With the rise of the internet, people are inundated with information from sources such as websites, blogs, articles, social media posts and comments, and e-news portals. Most of this data is unstructured. In this paper, the authors explore the efficiency of the cross-lingual BERT model, M-BERT, for text classification and named entity extraction on multilingual data. They evaluate the model's performance on datasets in three languages: French, German and Portuguese.

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 32862-32881 ◽  
Author(s):  
Tareq Al-Moslmi ◽  
Marc Gallofre Ocana ◽  
Andreas L. Opdahl ◽  
Csaba Veres

2005 ◽  
Vol 165 (1) ◽  
pp. 91-134 ◽  
Author(s):  
Oren Etzioni ◽  
Michael Cafarella ◽  
Doug Downey ◽  
Ana-Maria Popescu ◽  
Tal Shaked ◽  
...  

2011 ◽  
Vol 145 ◽  
pp. 451-454
Author(s):  
Han Gi Kim ◽  
Kuk Jin Bae ◽  
Eun Sun Kim ◽  
Hyuk Hahn

This paper presents additional linguistic factors that should be considered in order to extract terms from machinery industry documents more effectively, augmenting the general extraction patterns. We expand the general term extraction patterns with patterns tailored to machinery industry documents to improve precision and recall. We establish a theoretical basis for developing a system to support information research in the machinery industry. Using this system, we expect to increase the efficiency of the new business planning process in the machinery industry.

