LegalNLP - Natural Language Processing methods for the Brazilian Legal Language

We present and make available pre-trained language models (Phraser, Word2Vec, Doc2Vec, FastText, and BERT) for the Brazilian legal language, a Python package with functions to facilitate their use, and a set of demonstrations/tutorials containing some applications involving them. Given that our material is built upon legal texts coming from several Brazilian courts, this initiative is extremely helpful for the Brazilian legal field, which lacks other open and specific tools and language models. Our main objective is to catalyze the use of natural language processing tools for legal texts analysis by the Brazilian industry, government, and academia, providing the necessary tools and accessible material.

Download Full-text

Fast Neural Network Engine for Natural Science Language Processing: A Drug-Search Case.

10.26434/chemrxiv.12800348 ◽

2020 ◽

Author(s):

Vadim V. Korolev ◽

Artem Mitrofanov ◽

Kirill Karpov ◽

Valery Tkachenko

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Natural Science ◽

Therapeutic Agent ◽

Semantic Relations ◽

Chemical Data ◽

Processing Methods ◽

Modern Natural

The main advantage of modern natural language processing methods is a possibility to turn an amorphous human-readable task into a strict mathematic form. That allows to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use-cases and applied to a case of searching a therapeutic agent for a COVID-19 disease by analyzing PubMed archive.

Download Full-text

Prevalence of Financial Considerations Documented in Primary Care Encounters as Identified by Natural Language Processing Methods

JAMA Network Open ◽

10.1001/jamanetworkopen.2019.10399 ◽

2019 ◽

Vol 2 (8) ◽

pp. e1910399

Author(s):

Meliha Skaljic ◽

Ihsaan H. Patel ◽

Amelia M. Pellegrini ◽

Victor M. Castro ◽

Roy H. Perlis ◽

...

Keyword(s):

Primary Care ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Methods

Download Full-text

A Comparison of Natural Language Processing Methods for Automated Coding of Motivational Interviewing

Journal of Substance Abuse Treatment ◽

10.1016/j.jsat.2016.01.006 ◽

2016 ◽

Vol 65 ◽

pp. 43-50 ◽

Cited By ~ 22

Author(s):

Michael Tanana ◽

Kevin A. Hallgren ◽

Zac E. Imel ◽

David C. Atkins ◽

Vivek Srikumar

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Motivational Interviewing ◽

Language Processing ◽

Processing Methods ◽

Automated Coding

Download Full-text

Text: An R-package for Analyzing and Visualizing Human Language Using Natural Language Processing and Deep Learning

10.31234/osf.io/293kt ◽

2021 ◽

Author(s):

Oscar Nils Erik Kjell ◽

H. Andrew Schwartz ◽

Salvatore Giorgi

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Rating Scale ◽

State Of The Art ◽

R Package ◽

Language Models ◽

Categorical Variables ◽

Human Language

The language that individuals use for expressing themselves contains rich psychological information. Recent significant advances in Natural Language Processing (NLP) and Deep Learning (DL), namely transformers, have resulted in large performance gains in tasks related to understanding natural language such as machine translation. However, these state-of-the-art methods have not yet been made easily accessible for psychology researchers, nor designed to be optimal for human-level analyses. This tutorial introduces text (www.r-text.org), a new R-package for analyzing and visualizing human language using transformers, the latest techniques from NLP and DL. Text is both a modular solution for accessing state-of-the-art language models and an end-to-end solution catered for human-level analyses. Hence, text provides user-friendly functions tailored to test hypotheses in social sciences for both relatively small and large datasets. This tutorial describes useful methods for analyzing text, providing functions with reliable defaults that can be used off-the-shelf as well as providing a framework for the advanced users to build on for novel techniques and analysis pipelines. The reader learns about six methods: 1) textEmbed: to transform text to traditional or modern transformer-based word embeddings (i.e., numeric representations of words); 2) textTrain: to examine the relationships between text and numeric/categorical variables; 3) textSimilarity and 4) textSimilarityTest: to computing semantic similarity scores between texts and significance test the difference in meaning between two sets of texts; and 5) textProjection and 6) textProjectionPlot: to examine and visualize text within the embedding space according to latent or specified construct dimensions (e.g., low to high rating scale scores).

Download Full-text

Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications

10.26615/978-954-452-072-4_xxx ◽

2021 ◽

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Methods ◽

Recent Advances

Download Full-text

Natural Language Processing methods for Document Matching

International Journal for Modern Trends in Science and Technology - RTT2020 ◽

10.46501/ijmtst061271 ◽

2020 ◽

Vol 6 (12) ◽

pp. 379-383

Author(s):

Maitri Patel and Dr Hemant D Vasava

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Original Work ◽

Educational Systems ◽

Processing Methods ◽

Similar Test ◽

Subject Examination ◽

Different Types ◽

Higher Educational

Data,Information or knoweldge,in this rapidly moving and growing world.we can find any kind of information on Internet.And this can be too useful,however for acedemic world too it is useful but along with it plagarism is highly in practice.Which makes orginality of work degrade and fraudly using someones original work and later not acknowleging them is becoming common.And some times teachers or professors could not identify the plagarised information provided.So higher educational systems nowadays use different types of tools to compare.Here we have an idea to match no of different documents like assignments of students to compare with each other to find out, did they copied each other’s work?Also an idea to compare ideal answeer sheet of particular subject examination to similar test sheets of students.Idea is to compare and on similarity basis we can rank them.Both approach is one kind and that is to compare documents.To identify plagarism there are many methods used already.So we could compare and develop them if needed.

Download Full-text

Comparing Natural Language Processing Methods to Cluster Construction Schedules

Journal of Construction Engineering and Management ◽

10.1061/(asce)co.1943-7862.0002165 ◽

2021 ◽

Vol 147 (10) ◽

pp. 04021136

Author(s):

Ying Hong ◽

Haiyan Xie ◽

Gary Bhumbra ◽

Ioannis Brilakis

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Processing Methods ◽

Construction Schedules

Download Full-text

Topic Modeling for Keyword Extraction: using Natural Language Processing methods for keyword extraction in Portal Min@s

Revista de Estudos da Linguagem ◽

10.17851/2237-2083.23.3.695-726 ◽

2015 ◽

Vol 23 (3) ◽

pp. 695 ◽

Cited By ~ 1

Author(s):

Arnaldo Candido Junior ◽

Célia Magalhães ◽

Helena Caseli ◽

Régis Zangirolami

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Keyword Extraction ◽

Processing Methods ◽

Dirichlet Allocation

Este artigo tem o objetivo da avaliar a aplicação de dois métodos automáticos eficientes na extração de palavras-chave, usados pelas comunidades da Linguística de Corpus e do Processamento da Língua Natural para gerar palavras-chave de textos literários: o WordSmith Tools e o Latent Dirichlet Allocation (LDA). As duas ferramentas escolhidas para este trabalho têm suas especificidades e técnicas diferentes de extração, o que nos levou a uma análise orientada para a sua performance. Objetivamos entender, então, como cada método funciona e avaliar sua aplicação em textos literários. Para esse fim, usamos análise humana, com conhecimento do campo dos textos usados. O método LDA foi usado para extrair palavras-chave por meio de sua integração com o Portal Min@s: Corpora de Fala e Escrita, um sistema geral de processamento de corpora, concebido para diferentes pesquisas de Linguística de Corpus. Os resultados do experimento confirmam a eficácia do WordSmith Tools e do LDA na extração de palavras-chave de um corpus literário, além de apontar que é necessária a análise humana das listas em um estágio anterior aos experimentos para complementar a lista gerada automaticamente, cruzando os resultados do WordSmith Tools e do LDA. Também indicam que a intuição linguística do analista humano sobre as listas geradas separadamente pelos dois métodos usados neste estudo foi mais favorável ao uso da lista de palavras-chave do WordSmith Tools.

Download Full-text