Natural Language Processing and Biological Methods

Author(s):  
Gemma Bel Enguix ◽  
M. Dolores Jiménez López

During the 20th century, biology—especially molecular biology—has become a pilot science, so that many disciplines have formulated their theories under models taken from biology. Computer science has become almost a bio-inspired field thanks to the great development of natural computing and DNA computing. From linguistics, interactions with biology have not been frequent during the 20th century. Nevertheless, because of the “linguistic” consideration of the genetic code, molecular biology has taken several models from formal language theory in order to explain the structure and working of DNA. Such attempts have been focused in the design of grammar-based approaches to define a combinatorics in protein and DNA sequences (Searls, 1993). Also linguistics of natural language has made some contributions in this field by means of Collado (1989), who applied generativist approaches to the analysis of the genetic code. On the other hand, and only from theoretical interest a strictly, several attempts of establishing structural parallelisms between DNA sequences and verbal language have been performed (Jakobson, 1973, Marcus, 1998, Ji, 2002). However, there is a lack of theory on the attempt of explaining the structure of human language from the results of the semiosis of the genetic code. And this is probably the only arrow that remains incomplete in order to close the path between computer science, molecular biology, biosemiotics and linguistics. Natural Language Processing (NLP) –a subfield of Artificial Intelligence that concerns the automated generation and understanding of natural languages— can take great advantage of the structural and “semantic” similarities between those codes. Specifically, taking the systemic code units and methods of combination of the genetic code, the methods of such entity can be translated to the study of natural language. Therefore, NLP could become another “bio-inspired” science, by means of theoretical computer science, that provides the theoretical tools and formalizations which are necessary for approaching such exchange of methodology. In this way, we obtain a theoretical framework where biology, NLP and computer science exchange methods and interact, thanks to the semiotic parallelism between the genetic code and natural language.

Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has created the need for fact-checking organizations, groups that seek out claims and comment on their veracity, to spawn worldwide to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations that are currently in operation, disinformation continues to run rampant throughout the Web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science to use natural language processing to automate fact checking. It follows the entire process of automated fact checking using natural language processing, from detecting claims to fact checking to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement prior to widespread use.


2020 ◽  
Vol 58 (7) ◽  
pp. 1227-1255
Author(s):  
Glenn Gordon Smith ◽  
Robert Haworth ◽  
Slavko Žitnik

We investigated how Natural Language Processing (NLP) algorithms could automatically grade answers to open-ended inference questions in web-based eBooks. This is a component of research on making reading more motivating to children and to increasing their comprehension. We obtained and graded a set of answers to open-ended questions embedded in a fiction novel written in English. Computer science students used a subset of the graded answers to develop algorithms designed to grade new answers to the questions. The algorithms utilized the story text, existing graded answers for a given question and publicly accessible databases in grading new responses. A computer science professor used another subset of the graded answers to evaluate the students’ NLP algorithms and to select the best algorithm. The results showed that the best algorithm correctly graded approximately 85% of the real-world answers as correct, partly correct, or wrong. The best NLP algorithm was trained with questions and graded answers from a series of new text narratives in another language, Slovenian. The resulting NLP algorithm model was successfully used in fourth-grade language arts classes for providing feedback to student answers on open-ended questions in eBooks.


Author(s):  
Gemma Bel-Enguix ◽  
M. Dolores Jiménez-López

The paper provides an overview of what could be a new biological-inspired linguistics. The authors discuss some reasons for attempting a more natural description of natural language, lying on new theories of molecular biology and their formalization within the area of theoretical computer science. The authors especially explore three bio-inspired models of computation –DNA computing, membrane computing and networks of evolutionary processors (NEPs) – and their possibilities for achieving a simpler, more natural, and mathematically consistent theoretical linguistics.


2019 ◽  
Vol 18 ◽  
pp. 160940691988702 ◽  
Author(s):  
William Leeson ◽  
Adam Resnick ◽  
Daniel Alexander ◽  
John Rovers

Qualitative data-analysis methods provide thick, rich descriptions of subjects’ thoughts, feelings, and lived experiences but may be time-consuming, labor-intensive, or prone to bias. Natural language processing (NLP) is a machine learning technique from computer science that uses algorithms to analyze textual data. NLP allows processing of large amounts of data almost instantaneously. As researchers become conversant with NLP, it is becoming more frequently employed outside of computer science and shows promise as a tool to analyze qualitative data in public health. This is a proof of concept paper to evaluate the potential of NLP to analyze qualitative data. Specifically, we ask if NLP can support conventional qualitative analysis, and if so, what its role is. We compared a qualitative method of open coding with two forms of NLP, Topic Modeling, and Word2Vec to analyze transcripts from interviews conducted in rural Belize querying men about their health needs. All three methods returned a series of terms that captured ideas and concepts in subjects’ responses to interview questions. Open coding returned 5–10 words or short phrases for each question. Topic Modeling returned a series of word-probability pairs that quantified how well a word captured the topic of a response. Word2Vec returned a list of words for each interview question ordered by which words were predicted to best capture the meaning of the passage. For most interview questions, all three methods returned conceptually similar results. NLP may be a useful adjunct to qualitative analysis. NLP may be performed after data have undergone open coding as a check on the accuracy of the codes. Alternatively, researchers can perform NLP prior to open coding and use the results to guide their creation of their codebook.


Author(s):  
Sagarmoy Ganguly ◽  
Asoke Nath

Quantum cryptography is a comparatively new and special type of cryptography which uses Quantum mechanics to provide unreal protection of data/information and unconditionally secure communications. This is achieved with Quantum Key Distribution (QKD) protocols which is a representation of an essential practical application of Quantum Computation. In this paper the authors will venture the concept of QKD by reviewinghow QKD works, the authors shall take a look at few protocols of QKD, followed by a practical example of Quantum Cryptography using QKD and certain limitations from the perspective of Computer Science in specific and Quantum Physics in general.


Sign in / Sign up

Export Citation Format

Share Document