Subtractive Mountain Clustering Algorithm Applied to a Chatbot to Assist Elderly People in Medication Intake

2021 ◽  
Author(s):  
Neuza Claro ◽  
Paulo A. Salgado ◽  
T-P Azevedo Perdicoulis

Errors in medication intake among elderly people are very common. One of the main causes is their diminished ability to retain information. The large number of medicines that advanced age typically requires is another limiting factor. Hence, there is demand for an interactive aid system, preferably one using natural language, to help the older population with medication. A chatbot based on a subtractive clustering algorithm, which belongs to the family of unsupervised learning methods, is the chosen solution, since natural language processing is a necessary step in building a chatbot able to answer the questions that older people may have about a particular drug. In this work, the subtractive mountain clustering algorithm has been adapted to the problem of natural language processing. This version of the algorithm associates a set of words into clusters. After the centre of every cluster (the most relevant word) is found, all the other words are aggregated according to a defined metric adapted to the language-processing realm. Both the relevant stored information and the questions are processed by the algorithm. Correct processing of the text enables the chatbot to produce answers that relate to the posed queries. To validate the method, we use the package insert of a drug as the available information and formulate associated questions.
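As a rough illustration of the adapted algorithm, the sketch below implements plain subtractive (mountain) clustering over words, using a co-occurrence-based distance as a stand-in for the paper's metric; the tokeniser, distance function, radii, and sentences are illustrative assumptions, not the authors' actual choices.

```python
import math
from collections import Counter
from itertools import combinations

def word_distances(sentences):
    """Distance shrinks as two words co-occur more often: d = 1/(1 + count)."""
    cooc = Counter()
    vocab = set()
    for s in sentences:
        words = set(s.lower().split())
        vocab |= words
        for a, b in combinations(sorted(words), 2):
            cooc[(a, b)] += 1
    def dist(a, b):
        if a == b:
            return 0.0
        key = (a, b) if a < b else (b, a)
        return 1.0 / (1.0 + cooc[key])
    return sorted(vocab), dist

def subtractive_clusters(vocab, dist, ra=1.0, rb=1.5, eps=0.35):
    alpha, beta = 4 / ra**2, 4 / rb**2
    # Mountain (potential) of each word: Gaussian contributions from all words.
    pot = {w: sum(math.exp(-alpha * dist(w, v)**2) for v in vocab) for w in vocab}
    first = max(pot.values())
    centres = []
    while max(pot.values()) >= eps * first:
        c = max(pot, key=pot.get)
        centres.append(c)
        # Flatten the chosen centre's mountain so nearby words lose potential.
        pc = pot[c]
        for w in vocab:
            pot[w] -= pc * math.exp(-beta * dist(w, c)**2)
    return centres

sentences = ["take one tablet after breakfast",
             "do not take this drug with alcohol",
             "one tablet daily after meals"]
vocab, dist = word_distances(sentences)
print(subtractive_clusters(vocab, dist))  # cluster centres (most relevant words)
```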

2021 ◽  
Vol 10 (5) ◽  
pp. 17-36
Author(s):  
Paulo A. Salgado ◽  
T-P Azevedo Perdicoulis

In this work, the subtractive mountain clustering algorithm has been adapted to the problem of natural language processing with a view to constructing a chatbot that answers questions posed by the user. The implemented version of the algorithm allows for the association of a set of words into clusters. After the centre of every cluster (the most relevant word) is found, all the other words are aggregated according to a defined metric adapted to the language-processing realm. All the relevant stored information (necessary to answer the questions) is processed by the algorithm, as are the questions. Correct processing of the text enables the chatbot to produce answers that relate to the posed queries. Since the intended chatbot is meant to help elderly people with medication, we validate the method using the package insert of a drug as the available information and formulate associated questions. Errors in medication intake among elderly people are very common. One of the main causes is their diminished ability to retain information. The large number of medicines that advanced age typically requires is another limiting factor. Hence, there is demand for an interactive aid system, preferably one using natural language, to help the older population with medication. A chatbot based on a subtractive clustering algorithm is the chosen solution.
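The answer-selection step can then be sketched as follows; the overlap score is a hedged stand-in for the paper's metric, and the sentences and centres are invented examples.

```python
# Score each stored package-insert sentence by how many cluster centres it
# shares with the user's question, and return the best match as the answer.
def best_answer(question, sentences, centres):
    q_words = set(question.lower().replace("?", "").split())
    hits = {c for c in centres if c in q_words}
    return max(sentences, key=lambda s: len(hits & set(s.lower().split())))

insert_sentences = ["take one tablet after breakfast",
                    "do not take this drug with alcohol",
                    "keep out of reach of children"]
centres = ["tablet", "alcohol", "children"]  # e.g. the centres found above
print(best_answer("Can I drink alcohol with this?", insert_sentences, centres))
```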


2020 ◽  
pp. 1-11
Author(s):  
Yu Wang

The semantic similarity calculation task for English text has an important influence on other fields of natural language processing and has high research value and application prospects. At present, research on the similarity calculation of short texts has achieved good results, but results on long text sets remain poor. This paper proposes a similarity calculation method that combines planar features with structured features and uses support vector regression models. Moreover, this paper uses PST and PDT to represent the syntax, semantics, and other information of the text. In addition, through two structural features suitable for text similarity calculation, this paper proposes a similarity calculation method combining structural features with a Tree-LSTM model. Experiments show that this method provides a new approach for interest network extraction.
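A minimal sketch of the regression step, assuming the planar and structural similarity features for each text pair have already been computed (the feature values and gold ratings below are invented placeholders):

```python
import numpy as np
from sklearn.svm import SVR

# Each row holds the two combined features for one text pair:
# [planar_similarity, structural_similarity].
X_train = np.array([[0.9, 0.8], [0.2, 0.1], [0.7, 0.5], [0.4, 0.6]])
y_train = np.array([4.5, 0.5, 3.2, 2.8])   # gold similarity ratings (0-5 scale)

model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X_train, y_train)
print(model.predict(np.array([[0.8, 0.7]])))  # predicted similarity, new pair
```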


Traditional encryption systems and techniques have always been vulnerable to brute-force cyber-attacks. This is because characters are encoded as bytes (UTF-8, a superset of ASCII). An opponent who intercepts a ciphertext and attempts to decrypt it by brute force with a faulty key can therefore detect failed attempts, because the decrypted output is a mixture of symbols that are not uniformly distributed and carry no meaningful significance. The honey encoding technique was suggested to curb this classical authentication weakness by producing ciphertexts that, after decryption with a false key, yield plausible and evenly distributed but untrue plaintexts. However, this technique is only suitable for passkeys and PINs. Adapting it to encode natural-language texts, such as electronic mail and human-generated records, has remained an open drawback. Prevailing schemes proposed to extend the encryption to natural-language messages expose fragments of the plaintext embedded with the coded data, and are thus more prone to ciphertext attacks. In this paper, an amended honey encoding system is proposed to support natural-language message encryption. The main aim is to create a framework that encrypts a message fully in binary form. As a result, most binary strings decode to semantically correct texts, tricking an opponent who tries to decrypt the ciphertext with an incorrect key. The security of the suggested system is assessed.
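A toy sketch of the underlying honey-encryption idea (not the paper's scheme) over a small invented message space; the point is that decryption with a wrong key still yields a plausible plaintext:

```python
import hashlib
import secrets

MESSAGES = ["transfer approved", "transfer denied",
            "meeting at noon", "meeting cancelled"]  # toy message space
SEED_BITS = 16
BAND = 2**SEED_BITS // len(MESSAGES)

def keystream(key: str) -> int:
    # A 16-bit key stream derived from the key; a real scheme uses a cipher.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:2], "big")

def encrypt(msg: str, key: str) -> int:
    # DTE step: map the message to a random seed inside its own band,
    # so that every possible seed decodes to some valid message.
    seed = MESSAGES.index(msg) * BAND + secrets.randbelow(BAND)
    return seed ^ keystream(key)

def decrypt(ct: int, key: str) -> str:
    seed = ct ^ keystream(key)
    return MESSAGES[min(seed // BAND, len(MESSAGES) - 1)]

ct = encrypt("transfer approved", "right-key")
print(decrypt(ct, "right-key"))  # the original message
print(decrypt(ct, "wrong-key"))  # some plausible message, not gibberish
```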


10.2196/20443 ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. e20443
Author(s):  
Xiaoying Li ◽  
Xin Lin ◽  
Huiling Ren ◽  
Jinjing Guo

Background: Licensed drugs may cause unexpected adverse reactions in patients, resulting in morbidity, risk of mortality, therapy disruptions, and prolonged hospital stays. Officially approved drug package inserts list the adverse reactions identified in randomized controlled clinical trials with high evidence levels and in worldwide postmarketing surveillance. Formal representation of the adverse drug reaction (ADR) information enclosed in semistructured package inserts would enable deep recognition of side effects and rational drug use, substantially reduce morbidity, and decrease societal costs.

Objective: This paper aims to present an ontological organization of traceable ADR information extracted from licensed package inserts. In addition, it provides machine-understandable knowledge for bioinformatics analysis, semantic retrieval, and intelligent clinical applications.

Methods: Based on the essential content of package inserts, a generic ADR ontology model is proposed along two dimensions (and nine subdimensions), covering ADR information and medication instructions. This is followed by a customized natural language processing method programmed in Python to retrieve the relevant information enclosed in package inserts. After biocuration and identification of the data retrieved from the package inserts, an ADR ontology is automatically built for further bioinformatic analysis.

Results: We collected 165 package inserts of quinolone drugs from the National Medical Products Administration and other drug databases in China, and built a specialized ADR ontology containing 2879 classes and 15,711 semantic relations. For each quinolone drug, the reported ADR information and medication instructions have been logically represented and formally organized in the ADR ontology. To demonstrate its usage, the source data were further bioinformatically analyzed: for example, the number of drug-ADR triples and the major ADRs associated with each active ingredient were recorded. The 10 ADRs most frequently observed among quinolones were identified and categorized based on the 18 categories defined in the proposal. The occurrence frequency, severity, and ADR mitigation methods explicitly stated in package inserts were also analyzed, as were the top 5 specific populations with contraindications for quinolone drugs.

Conclusions: Ontological representation and organization using officially approved information from drug package inserts enables the identification and bioinformatic analysis of adverse reactions caused by a specific drug with regard to predefined ADR ontology classes and semantic relations. The resulting ontology-based ADR knowledge source classifies drug-specific adverse reactions and supports a better understanding of ADRs and safer prescription of medications.
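As a rough illustration of the retrieval step, the pattern-based sketch below extracts ADR terms and frequency qualifiers from an invented (translated) package-insert snippet; the drug name, patterns, and relation label are assumptions, not the authors' pipeline:

```python
import re

insert_text = """Adverse reactions: nausea (common), headache (rare),
tendon rupture (serious). Contraindications: children under 18."""

triples = []
section = re.search(r"Adverse reactions:(.*?)(?:Contraindications:|$)",
                    insert_text, flags=re.S)
if section:
    for adr, freq in re.findall(r"([a-z ]+?)\s*\((\w+)\)", section.group(1)):
        # One drug-ADR triple plus its stated frequency/severity qualifier.
        triples.append(("levofloxacin", "hasADR", adr.strip(), freq))

for t in triples:
    print(t)  # e.g. ('levofloxacin', 'hasADR', 'nausea', 'common')
```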


2021 ◽  
Vol 75 (3) ◽  
pp. 94-99
Author(s):  
A.M. Yelenov ◽ 
A.B. Jaxylykova

This research focuses on a comparative study of the named entity recognition task for scientific article texts. Natural language processing can be considered one of the cornerstones of machine learning, devoted to problems connected with the understanding of different natural languages and linguistic analysis. It has already been shown that current deep learning techniques achieve good performance and accuracy in areas such as image recognition, pattern recognition, and computer vision, which suggests that this technology could also succeed in the natural language processing area and lead to a dramatic increase in research interest in the topic. For a long time, fairly simple algorithms were used in this area, such as support vector machines or various types of regression, together with basic encodings of the text data, which did not provide high results. The Dataset Scientific Entity Relation Core dataset was used in the experiments. The algorithms applied were long short-term memory, a random forest classifier with conditional random fields, and named entity recognition with Bidirectional Encoder Representations from Transformers. In the findings, the metric scores of all models were compared with each other. This research is devoted to the processing of scientific articles in the machine learning area, because the subject has not yet been investigated thoroughly enough. Work on this task can help machines understand natural languages better, so that they can solve other natural language processing tasks with improved scores.
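For reference, entity-level precision, recall, and F1 over predicted spans can be computed as in the sketch below; the spans and per-model predictions are placeholders, not the study's actual results:

```python
def prf1(gold, pred):
    """Entity-level precision/recall/F1 over (start, end, label) spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [(0, 2, "Method"), (5, 6, "Task")]
predictions = {"LSTM":   [(0, 2, "Method")],
               "RF+CRF": [(0, 2, "Method"), (5, 6, "Material")],
               "BERT":   [(0, 2, "Method"), (5, 6, "Task")]}

for name, pred in predictions.items():
    print(name, "P/R/F1 = %.2f %.2f %.2f" % prf1(gold, pred))
```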


2010 ◽  
Vol 1 (3) ◽  
pp. 1-19 ◽  
Author(s):  
Weisen Guo ◽  
Steven B. Kraines

To promote global knowledge sharing, one must address the problem that knowledge representation in diverse natural languages effectively restricts knowledge sharing. Traditional knowledge sharing models are based on natural language processing (NLP) technologies. The ambiguity of natural language is a problem for NLP; however, semantic web technologies can circumvent the problem by enabling human authors to specify meaning in a computer-interpretable form. In this paper, the authors propose a cross-language semantic model (SEMCL) for knowledge sharing, which uses semantic web technologies to provide a potential solution to the problem of ambiguity. The model can also match knowledge descriptions in diverse languages. First, the methods used to support searches at the semantic predicate level are given, and the authors present a cross-language approach. Finally, an implementation of the model for the general engineering domain is discussed, and a scenario describes how the model implementation handles semantic cross-language knowledge sharing.
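A minimal sketch of predicate-level matching, assuming statements from different languages have already been mapped to shared ontology identifiers (the URIs and statements below are invented):

```python
# Statements extracted from texts in different languages reduce to the same
# language-independent triples, so matching compares identifiers, not words.
kb = [("ex:Pump_1", "ex:transfers", "ex:Fluid_A"),   # from an English text
      ("ex:Pump_1", "ex:transfers", "ex:Fluid_A"),   # from a Japanese text
      ("ex:Valve_2", "ex:controls", "ex:Fluid_A")]

def semantic_match(query, statements):
    """Return statements matching the (subject, predicate, object) query;
    None acts as a wildcard."""
    return [s for s in statements
            if all(q is None or q == v for q, v in zip(query, s))]

# Predicate-level search: everything that 'transfers' something.
print(semantic_match((None, "ex:transfers", None), kb))
```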


Author(s):  
Yuejun He ◽  
Bradley Camburn ◽  
Jianxi Luo ◽  
Maria C. Yang ◽  
Kristin L. Wood

Textual idea data from online crowdsourcing contains rich information about the concepts that underlie the original ideas, and this information can be recombined to generate new ideas. However, representing such information in a way that can stimulate new ideas is not a trivial task, because crowdsourced data are often vast and written in unstructured natural language. This paper introduces a method that uses natural language processing to summarize a massive number of idea descriptions and represents the underlying concept space as word clouds with a core-periphery structure to inspire recombinations of these concepts into new ideas. We report the use of this method in a real public-sector-sponsored project to explore ideas for future transportation system design. Word clouds that represent the concept space underlying the original crowdsourced ideas were used as ideation aids and stimulated many new ideas of varied novelty, usefulness, and feasibility. The new ideas suggest that the proposed method helps expand the idea space. Our analysis of these ideas, together with a survey of the designers who generated them, sheds light on how people perceive and use word clouds as ideation aids and suggests future research directions.
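A minimal sketch of the core-periphery split, assuming plain word frequency as the centrality measure; the paper's actual concept-space analysis is richer than this, and the idea texts below are invented:

```python
import re
from collections import Counter

ideas = ["autonomous shuttle for the last mile",
         "shared autonomous bikes at stations",
         "drone delivery for the last mile"]

stop = {"the", "for", "at", "a", "an", "of"}
counts = Counter(w for idea in ideas
                 for w in re.findall(r"[a-z]+", idea.lower())
                 if w not in stop)

cutoff = 2  # words appearing at least this often form the core
core = {w for w, c in counts.items() if c >= cutoff}
periphery = set(counts) - core
print("core:", core)             # recurring concepts, e.g. 'autonomous', 'mile'
print("periphery:", periphery)   # rarer concepts that diversify recombination
```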


Author(s):  
Pankaj Kailas Bhole ◽  
A. J. Agrawal

Text summarization is an old challenge in text mining, but it is in dire need of researchers' attention in the areas of computational intelligence, machine learning, and natural language processing. We extract a set of features from each sentence that helps identify its importance in the document. Reading the full text every time is time-consuming, and a clustering approach is useful for deciding which types of content are present in a document. In this paper, we introduce the concept of k-means clustering for natural language processing of text and for word matching; in order to extract meaningful information from a large set of offline documents, document clustering algorithms from data mining are adopted.
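A short sketch of the clustering step described above, using TF-IDF vectors and k-means from scikit-learn; the documents and number of clusters are illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = ["stock markets fell sharply on monday",
        "investors worry about interest rates",
        "the team won the championship final",
        "the striker scored twice in the match"]

# Vectorise the documents and group them into k clusters.
X = TfidfVectorizer(stop_words="english").fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for label, doc in zip(km.labels_, docs):
    print(label, doc)  # finance and sport documents should separate
```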


2020 ◽  
pp. 1-31
Author(s):  
Abdul Rafae Khan ◽  
Asim Karim ◽  
Hassan Sajjad ◽  
Faisal Kamiran ◽  
Jia Xu

Roman Urdu is an informal form of the Urdu language written in Roman script, which is widely used in South Asia for online textual content. It lacks standard spelling and hence poses several normalization challenges during automatic language processing. In this article, we present a feature-based clustering framework for the lexical normalization of Roman Urdu corpora, which includes a phonetic algorithm UrduPhone, a string matching component, a feature-based similarity function, and a clustering algorithm Lex-Var. UrduPhone encodes Roman Urdu strings to their pronunciation-based representations. The string matching component handles character-level variations that occur when writing Urdu using Roman script. The similarity function incorporates various phonetic-based, string-based, and contextual features of words. The Lex-Var algorithm is a variant of the k-medoids clustering algorithm that groups lexical variations of words. It contains a similarity threshold to balance the number of clusters and their maximum similarity. The framework allows feature learning and optimization in addition to the use of predefined features and weights. We evaluate our framework extensively on four real-world datasets and show an F-measure gain of up to 15% from baseline methods. We also demonstrate the superiority of UrduPhone and Lex-Var in comparison to respective alternate algorithms in our clustering framework for the lexical normalization of Roman Urdu.
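A toy sketch of the two ideas combined: a crude vowel-dropping code as a stand-in for UrduPhone, and grouping by shared code as a stand-in for Lex-Var's similarity-threshold clustering (the real UrduPhone and Lex-Var are considerably more sophisticated):

```python
def phonetic_code(word):
    """Keep the first letter, then drop vowels/y and collapse repeats -
    a crude approximation of pronunciation-based encoding."""
    out = [word[0]]
    for ch in word[1:]:
        if ch in "aeiouy" or ch == out[-1]:
            continue
        out.append(ch)
    return "".join(out)

def cluster_variants(words):
    # Group words whose phonetic codes agree.
    clusters = {}
    for w in words:
        clusters.setdefault(phonetic_code(w), []).append(w)
    return list(clusters.values())

# Roman Urdu spelling variants of two words:
print(cluster_variants(["mujhe", "mujhay", "mjhe", "kya", "kia", "kyaa"]))
# -> [['mujhe', 'mujhay', 'mjhe'], ['kya', 'kia', 'kyaa']]
```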

