A Rule-based Dependency Parser for Telugu: An Experiment with Simple Sentences

2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Sangeetha P ◽  
Parameswari K ◽  
Amba Kulkarni

This paper attempts to build a rule-based dependency parser for Telugu that can parse simple sentences. The study adopts Pāṇini’s Grammatical (PG) tradition, i.e., the dependency model, to parse sentences. A detailed description of mapping semantic relations to vibhaktis (case suffixes and postpositions) in Telugu using PG is presented. The paper describes the algorithm and the linguistic knowledge employed while developing the parser. The results suggest that enriching the current parser with linguistic inputs can increase accuracy and tackle ambiguity better than existing data-driven methods.

Author(s):  
Asim Abbas ◽  
Muhammad Afzal ◽  
Jamil Hussain ◽  
Taqdir Ali ◽  
Hafiz Syed Muhammad Bilal ◽  
...  

Extracting clinical concepts, such as problems, diagnoses, and treatments, from unstructured clinical narrative documents enables data-driven approaches such as machine and deep learning to support advanced applications such as clinical decision-support systems, the assessment of disease progression, and the intelligent analysis of treatment efficacy. Various tools such as cTAKES, Sophia, MetaMap, and other rule-based approaches and algorithms have been used for automatic concept extraction. Recently, machine- and deep-learning approaches have been used to extract, classify, and accurately annotate terms and phrases. However, the requirement of an annotated dataset, which is labor-intensive to produce, impedes the success of data-driven approaches. A rule-based mechanism could support the process of annotation, but existing rule-based approaches fail to adequately capture contextual, syntactic, and semantic patterns. This study intends to introduce a comprehensive rule-based system that automatically extracts clinical concepts from unstructured narratives with higher accuracy and transparency. The proposed system is a pipelined approach, capable of recognizing clinical concepts of three types (problem, treatment, and test) in the dataset collected from a published repository as a part of the 2010 i2b2 challenge. The system’s performance is compared with that of three existing systems: Quick UMLS, BIO-CRF, and the Rules (i2b2) model. Compared to the baseline systems, the system’s average F1-score of 72.94% was 13% better than Quick UMLS, 3% better than BIO-CRF, and 30.1% better than the Rules (i2b2) model. Individually, the system performance was noticeably higher for problem-related concepts, with an F1-score of 80.45%, followed by treatment-related concepts and test-related concepts, with F1-scores of 76.06% and 55.3%, respectively.
The proposed methodology significantly improves the performance of concept extraction from unstructured clinical narratives by exploiting the linguistic and lexical semantic features. The approach can ease the automatic annotation process of clinical data, which ultimately improves the performance of supervised data-driven applications trained with these data.


2021 ◽  
Vol 7 (15) ◽  
pp. eabe4166
Author(s):  
Philippe Schwaller ◽  
Benjamin Hoover ◽  
Jean-Louis Reymond ◽  
Hendrik Strobelt ◽  
Teodoro Laino

Humans use different domain languages to represent, explore, and communicate scientific concepts. During the last few hundred years, chemists compiled the language of chemical synthesis inferring a series of “reaction rules” from knowing how atoms rearrange during a chemical transformation, a process called atom-mapping. Atom-mapping is a laborious experimental task and, when tackled with computational methods, requires continuous annotation of chemical reactions and the extension of logically consistent directives. Here, we demonstrate that Transformer Neural Networks learn atom-mapping information between products and reactants without supervision or human labeling. Using the Transformer attention weights, we build a chemically agnostic, attention-guided reaction mapper and extract coherent chemical grammar from unannotated sets of reactions. Our method shows remarkable performance in terms of accuracy and speed, even for strongly imbalanced and chemically complex reactions with nontrivial atom-mapping. It provides the missing link between data-driven and rule-based approaches for numerous chemical reaction tasks.


1993 ◽  
Vol 02 (01) ◽  
pp. 47-70
Author(s):  
SHARON M. TUTTLE ◽  
CHRISTOPH F. EICK

Forward-chaining rule-based programs, being data-driven, can function in changing environments in which backward-chaining rule-based programs would have problems. However, debugging forward-chaining programs can be tedious; to debug a forward-chaining rule-based program, certain ‘historical’ information about the program run is needed. Programmers should be able to directly request such information, instead of having to rerun the program one step at a time or search a trace of run details. As a first step in designing an explanation system for answering such questions, this paper discusses how a forward-chaining program run’s ‘historical’ details can be stored in its Rete inference network, which is used to match rule conditions to working memory. This can be done without seriously affecting the network’s run-time performance. We call this generalization of the Rete network a historical Rete network. Various algorithms for maintaining this network are discussed, along with how it can be used during debugging. A debugging tool, MIRO, that incorporates these techniques is also discussed.
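
The idea of keeping 'historical' details of a forward-chaining run so that they can be queried later can be illustrated with a minimal sketch. It uses a naive match loop rather than a Rete network, and it is not the MIRO tool or the historical Rete network from the paper; the names `Rule`, `Engine`, `history`, and `why` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # facts that must all be present for the rule to fire
    conclusion: str        # fact asserted when the rule fires


@dataclass
class Engine:
    rules: list
    facts: set = field(default_factory=set)
    # Each entry: (firing number, rule name, matched facts, new fact).
    history: list = field(default_factory=list)

    def run(self):
        """Fire rules until no rule can add a new fact (naive matching, not Rete)."""
        firing = 0
        changed = True
        while changed:
            changed = False
            for rule in self.rules:
                if rule.conditions <= self.facts and rule.conclusion not in self.facts:
                    firing += 1
                    self.facts.add(rule.conclusion)
                    self.history.append(
                        (firing, rule.name, sorted(rule.conditions), rule.conclusion)
                    )
                    changed = True
        return self.facts

    def why(self, fact):
        """Return the recorded firing that produced `fact`, or None if it was an input."""
        for entry in self.history:
            if entry[3] == fact:
                return entry
        return None


# Hypothetical two-rule program: a and b give c, and c gives d.
rules = [
    Rule("r1", frozenset({"a", "b"}), "c"),
    Rule("r2", frozenset({"c"}), "d"),
]
engine = Engine(rules, facts={"a", "b"})
engine.run()
print(engine.why("d"))  # (2, 'r2', ['c'], 'd')
```

Because every firing is recorded as it happens, a question such as "which rule asserted this fact, and from what?" can be answered after the run without stepping through it again.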


2017 ◽  
Vol 53 (3) ◽  
pp. 1789-1798 ◽  
Author(s):  
Xiaodong Liang ◽  
Scott A. Wallace ◽  
Duc Nguyen

2018 ◽  
Vol 9 (4) ◽  
pp. 547-560 ◽  
Author(s):  
Kartikay Gupta ◽  
Aayushi Khajuria ◽  
Niladri Chatterjee ◽  
Pradeep Joshi ◽  
Deepak Joshi

2018 ◽  
Vol 5 (3) ◽  
pp. 172265 ◽  
Author(s):  
Alexis R. Hernández ◽  
Carlos Gracia-Lázaro ◽  
Edgardo Brigatti ◽  
Yamir Moreno

We introduce a general framework for exploring the problem of selecting a committee of representatives with the aim of studying a networked voting rule based on a decentralized large-scale platform, which can ensure strong accountability of the elected. The results of our simulations suggest that this algorithm-based approach is able to obtain a high representativeness for relatively small committees, performing even better than a classical voting rule based on a closed list of candidates. We show that a general relation between committee size and representatives exists in the form of an inverse square root law and that the normalized committee size approximately scales with the inverse of the community size, allowing the scalability to very large populations. These findings are not strongly influenced by the different networks used to describe the individuals’ interactions, except for the presence of a few individuals with very high connectivity, which can have a marginal negative effect on the committee selection process.


2019 ◽  
Vol 8 (4) ◽  
pp. 1809-1814

Sentiment analysis is a technique for analyzing people's opinions, attitudes, sentiments, and emotions towards a particular object. Predicting the opinion expressed in review sentences involves four steps: preprocessing, feature selection, classification, and sentiment prediction. Preprocessing is the most important step and comprises several techniques, including stop-word removal, punctuation removal, and conversion of numbers to number names. Stemming is another important preprocessing technique; it transforms the words in a text into their grammatical root form and is mainly used to improve the retrieval of information from the internet. Many morphologically rich languages show an immense amount of morphological variation in their words, which poses major challenges. Many stemming algorithms exist, using different techniques, and each has several drawbacks. The aim of this paper is to propose a rule-based stemmer that is a truncating stemmer. The new stemming mechanism in this paper addresses many morphological changes. The new rule-based stemming algorithm for removing morphological variation is reported to be better than existing algorithms such as the New Porter, Paice/Lovins, and Lancaster stemming algorithms.
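
To make the notion of a rule-based truncating stemmer concrete, here is a minimal sketch in Python. The suffix rules and the minimum stem length are hypothetical English examples chosen for illustration; they are not the rule set proposed in the paper, and the abstract does not say which language or rules the authors use.

```python
# Hypothetical suffix rules (suffix, replacement), ordered longest first so that
# the most specific rule is tried before shorter, more general ones.
SUFFIX_RULES = [
    ("ational", "ate"),
    ("ization", "ize"),
    ("ness", ""),
    ("ing", ""),
    ("ies", "y"),
    ("ed", ""),
    ("es", ""),
    ("s", ""),
]

# Assumed guard: reject a truncation that would leave a very short stem.
MIN_STEM_LENGTH = 3


def stem(word):
    """Apply the first matching suffix rule that leaves a long enough stem."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if len(candidate) >= MIN_STEM_LENGTH:
                return candidate
    return word  # no rule applied; the word is returned unchanged


for w in ["relational", "walking", "studies", "sing", "is"]:
    print(w, "->", stem(w))
```

The minimum-length guard is one simple way to limit over-stemming: "sing" and "is" are left alone because truncating them would leave one-letter stems.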


2021 ◽  
Author(s):  
Bulat Zagidullin ◽  
Ziyan Wang ◽  
Yuanfang Guan ◽  
Esa Pitkänen ◽  
Jing Tang

Application of machine and deep learning (ML/DL) methods in drug discovery and cancer research has gained a considerable amount of attention in recent years. As the field grows, it becomes crucial to systematically evaluate the performance of novel DL solutions in relation to established techniques. To this end we compare rule-based and data-driven molecular representations in predicting drug combination sensitivity and drug synergy scores using standardized results of 14 high-throughput screening studies, comprising 64,200 unique combinations of 4,153 molecules tested in 112 cancer cell lines. We evaluate the clustering performance of molecular fingerprints and quantify their similarity by adapting the Centred Kernel Alignment metric. Our work demonstrates that in order to identify an optimal representation type it is necessary to supplement quantitative benchmark results with qualitative considerations, such as model interpretability and robustness, which may vary between and throughout preclinical drug development projects.
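
As a rough illustration of how two molecular representations of the same set of molecules can be compared, the sketch below computes standard linear Centred Kernel Alignment (CKA) in plain Python. It shows the textbook linear form, not the adaptation used by the authors, which the abstract does not specify; the function names and the toy matrices are assumptions made for the example.

```python
import math


def center_columns(m):
    """Subtract each column's mean so that every feature has zero mean."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_product(a, b):
    """Return a^T b for row-major matrices with the same number of rows."""
    return [
        [sum(x * y for x, y in zip(col_a, col_b)) for col_b in zip(*b)]
        for col_a in zip(*a)
    ]


def frobenius_sq(m):
    """Squared Frobenius norm: the sum of squared entries."""
    return sum(v * v for row in m for v in row)


def linear_cka(x, y):
    """Linear CKA: ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centred X, Y."""
    xc, yc = center_columns(x), center_columns(y)
    numerator = frobenius_sq(cross_product(xc, yc))
    denominator = math.sqrt(
        frobenius_sq(cross_product(xc, xc)) * frobenius_sq(cross_product(yc, yc))
    )
    return numerator / denominator


# Toy data: rows are the same four molecules under two hypothetical representations.
fingerprint_a = [[1, 0], [0, 1], [1, 1], [0, 0]]
fingerprint_b = [[1], [0], [1], [0]]
print(linear_cka(fingerprint_a, fingerprint_a))  # close to 1.0: identical representations
print(linear_cka(fingerprint_a, fingerprint_b))  # between 0 and 1: partial agreement
```

Both inputs must describe the same molecules in the same row order, but they may have different numbers of columns, which is what makes the measure usable across fingerprint types of different lengths.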


Author(s):  
Yunpeng Li ◽  
Utpal Roy ◽  
Y. Tina Lee ◽  
Sudarsan Rachuri

Rule-based expert systems such as CLIPS (C Language Integrated Production System) are 1) based on inductive (if-then) rules to elicit domain knowledge and 2) designed to derive new knowledge from existing knowledge and given inputs. Recently, data mining techniques have been advocated for discovering knowledge from massive historical or real-time sensor data. Combining top-down expert-driven rule models with bottom-up data-driven prediction models facilitates enrichment and improvement of the predefined knowledge in an expert system with data-driven insights. However, combining them is possible only if there is a common and formal representation of these models so that they are capable of being exchanged, reused, and orchestrated among different authoring tools. This paper investigates the open standard PMML (Predictive Model Markup Language) for integrating rule-based expert systems with data analytics tools, so that a decision maker would have access to powerful tools for dealing with both reasoning-intensive tasks and data-intensive tasks. We present a process planning use case in the manufacturing domain, which was originally implemented as a CLIPS-based expert system. Different paradigms for interpreting expert system facts and rules as PMML models (and vice versa), as well as challenges in representing and composing these models, have been explored and are discussed in detail.


Author(s):  
Jose M. Alonso ◽  
Ciro Castiello ◽  
Marco Lucarelli ◽  
Corrado Mencar

Decision support systems in medicine must be easily comprehensible, both for physicians and patients. In this chapter, the authors describe how the fuzzy modeling methodology called HILK (Highly Interpretable Linguistic Knowledge) can be applied for building highly interpretable fuzzy rule-based classifiers (FRBCs) able to provide medical decision support. As a proof of concept, they describe the case study of a real-world scenario concerning the development of an interpretable FRBC that can be used to predict the evolution of end-stage renal disease (ESRD) in subjects affected by Immunoglobulin A Nephropathy (IgAN). The designed classifier provides users with a number of rules which are easy to read and understand. The rules classify the prognosis of ESRD evolution in IgAN-affected subjects by distinguishing three classes (short, medium, long). Experimental results show that the fuzzy classifier achieves satisfactory accuracy, in comparison with Multi-Layer Perceptron (MLP) neural networks, and high interpretability of the knowledge base.

