An approach in health relation extraction

Author(s):  
Nghia Huu Huynh ◽  
Quoc Bao Ho ◽  
Te An Nguyen

Extracting relations among medical concepts is very important in the medical field. The relations denote the events or the possible relations between the concepts. Information about these relations provides users with a full view of medical problems. This helps physicians and health-care practitioners make effective decisions and minimize errors in the treatment process. This paper surveys methods for relation extraction in health texts and presents an approach for one specific type of relation (i.e. template filling). The approach combines rule-based and machine learning-based methods. The rule-based method uses the semantic dependency relations among the concepts to extract the rule set. The machine learning-based method uses the SVM (Support Vector Machine) algorithm and a proposed feature set. The approach achieved an estimated accuracy of 0.849.

2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Minoo Aminian ◽  
David Couvin ◽  
Amina Shabbeer ◽  
Kane Hadley ◽  
Scott Vandenberg ◽  
...  

We develop a novel approach for incorporating expert rules into Bayesian networks for classification of Mycobacterium tuberculosis complex (MTBC) clades. The proposed knowledge-based Bayesian network (KBBN) treats sets of expert rules as prior distributions on the classes. Unlike prior knowledge-based support vector machine approaches which require rules expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. KBBN uses data to refine rule-based classifiers when the rule set is incomplete or ambiguous. We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. We validate the approach using two testbeds that model knowledge of the MTBC obtained from two different experts and large DNA fingerprint databases to predict MTBC genetic clades and sublineages. These models represent strains of MTBC using high-throughput biomarkers called spacer oligonucleotide types (spoligotypes), since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.
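The idea of treating expert rules as a prior over classes and letting data refine it can be illustrated with plain Bayes' rule. The sketch below is not the KBBN model itself; the function name, the clade labels, and all probabilities are hypothetical values chosen only to show how an ambiguous rule (two clades equally plausible) is resolved by a data likelihood.

```python
def posterior_from_rule_prior(rule_prior, likelihood):
    """Combine a rule-derived class prior with a data likelihood (Bayes' rule).

    rule_prior: dict mapping class -> prior probability implied by expert rules
    likelihood: dict mapping class -> P(observed data | class)
    """
    unnormalized = {c: rule_prior[c] * likelihood[c] for c in rule_prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return {c: value / total for c, value in unnormalized.items()}


# Hypothetical example: an ambiguous expert rule matches two clades equally.
rule_prior = {"clade_A": 0.45, "clade_B": 0.45, "clade_C": 0.10}
# Hypothetical likelihood of the observed spoligotype under each clade.
likelihood = {"clade_A": 0.02, "clade_B": 0.08, "clade_C": 0.01}

posterior = posterior_from_rule_prior(rule_prior, likelihood)
best = max(posterior, key=posterior.get)  # class favoured once data is used
```

With these made-up numbers the rule alone cannot separate the first two clades, while the posterior can, which is the behaviour the abstract describes for incomplete or ambiguous rule sets.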


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Wahiba Ben Abdessalem Karaa ◽  
Eman H. Alkhammash ◽  
Aida Bchir

Extracting the relations between medical concepts is very valuable in the medical domain. Scientists need to extract relevant information and semantic relations between medical concepts, including protein and protein, gene and protein, drug and drug, and drug and disease. These relations can be extracted from biomedical literature available on various databases. This study examines the extraction of semantic relations that can occur between diseases and drugs. Findings will help specialists make good decisions when administering a medication to a patient and will allow them to continuously be up to date in their field. The objective of this work is to identify different features related to drugs and diseases from medical texts by applying Natural Language Processing (NLP) techniques and UMLS ontology. The Support Vector Machine classifier uses these features to extract valuable semantic relationships among text entities. The contribution of this research is the combination of a suggested NLP technique, which takes advantage of UMLS ontology and enables the extraction of correct and adequate features (frequency features, lexical features, morphological features, syntactic features, and semantic features), with Support Vector Machines using a polynomial kernel function. These features are used to pinpoint the relations between drug and disease. The proposed approach was evaluated using a standard corpus extracted from MEDLINE. The approach considerably improves performance and outperforms similar works, especially in the F-score for the most important relation, “cure,” which is equal to 98.19%. The accuracy is better than that of all existing works for all the relations.


Author(s):  
Yumin Ma ◽  
Fei Qiao ◽  
Fu Zhao ◽  
John W. Sutherland

Various factors and constraints should be considered when developing a manufacturing production schedule, and such a schedule is often based on rules. This paper develops a composite dispatching rule based on heuristic rules that comprehensively consider various factors in a semiconductor production line. The composite rule is obtained by exploring various states of a semiconductor production line (machine status, queue size, etc.), where such indicators as makespan and equipment efficiency are used to judge performance. A model of the response surface, as a function of key variables, is then developed to find the optimized parameters of a composite rule for various production states. Further, dynamic scheduling of semiconductor manufacturing is studied based on support vector regression (SVR). This approach dynamically obtains a composite dispatching rule (i.e. parameters of the composite dispatching rule) that can be used to optimize production performance according to real-time production line state. Following optimization, the proposed dynamic scheduling approach is tested in a real semiconductor production line to validate the effectiveness of the proposed composite rule set.


Author(s):  
Fei Yang ◽  
Yanchen Wang ◽  
Peter J. Jin ◽  
Dingbang Li ◽  
Zhenxing Yao

Cellular phone data has been proven to be valuable in the analysis of residents’ travel patterns. Existing studies mostly identify the trip ends through rule-based or clustering algorithms. These methods largely depend on subjective experience and users’ communication behaviors. Moreover, limited by privacy policy, the accuracy of these methods is difficult to assess. In this paper, points of interest data is applied to supplement cellular phone data’s missing information generated by users’ behaviors. Specifically, a random forest model for trip end identification is proposed using multi-dimensional attributes. A field data acquisition test is designed and conducted with communication operators to implement synchronized cellular phone data and real trip information collection. The proposed identification approach is empirically evaluated with real trip information. Results show that the overall trip end detection precision and recall reach 95.2% and 88.7% with an average distance error of 269 m, and the time errors of the trip ends are less than 10 min. Compared with the rule-based approach, clustering algorithm, naive Bayes method, and support vector machine, the proposed method has better performance in accuracy and consistency.
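The precision and recall figures quoted above follow the standard definitions, which can be sketched as below. The counts in the example are hypothetical and are not taken from the study.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of detected trip ends that are real
    recall    = TP / (TP + FN): share of real trip ends that were detected
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 80 correct detections, 20 spurious, 40 missed.
p, r = precision_recall(80, 20, 40)
```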


2020 ◽  
Vol 19 (01) ◽  
pp. 2040015
Author(s):  
Ahmad Alaiad ◽  
Hassan Najadat ◽  
Belal Mohsen ◽  
Khaled Balhaf

Background and objective: Chronic kidney disease (CKD) is one of the deadly diseases that can affect a lot of vital organs in the human body such as heart, liver, and lungs. Many individuals might be at an early stage of kidney disease and not have any signs, which might lead to a sudden death. Previous research showed that early prediction of CKD is very important in the medical field for physicians’ decision-making and patients’ health and life. To this end, constructing an efficient prediction system for CKD, which is the goal of this paper, often reduces medical errors and overall healthcare cost. Methods: Classification and association rule mining techniques were integrated and utilised to construct an efficient system for predicting and diagnosing CKD and its causes using Weka and SPSS as platform environments. In particular, five classification algorithms, namely, naive Bayes, decision tree, support vector machine, K-nearest neighbour, and JRip were used to achieve the research goal. In addition, the Apriori algorithm was used to discover strong relationship rules between attributes. The experiments were conducted on a real medical dataset collected from hospitals and patient monitoring systems. Results: The experiments achieved a high accuracy of 98.50% for the K-nearest neighbour (KNN) classifier and 96.00% when using the classifier based on association rules (JRip). Conclusions: We conclude by showing that applying an integrative approach that combines classification algorithms and association rule mining can significantly improve the classification accuracy and be more useful for CKD prediction. This research also has several theoretical and practical implications for the medical field and healthcare industry.
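Association rule mining with Apriori ranks candidate rules by support and confidence. The sketch below shows only those two measures, not the Apriori candidate-generation procedure or the Weka implementation; the attribute names and the toy records are invented for illustration and do not come from the study's dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    items = set(itemset)
    hits = sum(1 for record in transactions if items <= set(record))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Invented toy records; each set lists the attributes present for one patient.
records = [
    {"hypertension", "diabetes", "ckd"},
    {"hypertension", "ckd"},
    {"hypertension"},
    {"diabetes", "ckd"},
    {"anemia"},
]

rule_support = support(records, {"hypertension", "ckd"})
rule_confidence = confidence(records, {"hypertension"}, {"ckd"})
```

A rule is usually called strong when both measures exceed user-chosen thresholds.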


2021 ◽  
Vol 2021 ◽  
pp. 1-24
Author(s):  
Youness Mourtaji ◽  
Mohammed Bouhorma ◽  
Daniyal Alghazzawi ◽  
Ghadah Aldabbagh ◽  
Abdullah Alghamdi

Phishing has become a common threat, since many individuals and webpages have been observed to be attacked by phishers. The common purpose of phishing activities is to obtain users’ personal information for illegitimate usage. Considering the growing intensity of the issue, this study is aimed at developing a new hybrid rule-based solution by incorporating six different algorithm models that may efficiently detect and control the phishing issue. The study incorporates 37 features extracted from six different methods including the blacklist method, lexical and host method, content method, identity method, identity similarity method, visual similarity method, and behavioral method. Furthermore, a comparative analysis was undertaken between different machine learning models, which include CART (decision trees), SVM (support vector machines), and KNN (K-nearest neighbors), and deep learning models such as MLP (multilayer perceptron) and CNN (convolutional neural networks). Findings of the study indicated that the method was effective in analysing the URL stress through different viewpoints, leading towards the validity of the model. However, the highest accuracy level was obtained for deep learning, with values of 97.945 for the CNN model and 93.216 for the MLP model, respectively. The study therefore concludes that the new hybrid solution must be implemented at a practical level to reduce phishing activities, due to its high efficiency and accuracy.


2013 ◽  
Vol 12 (2) ◽  
pp. 3277-3285
Author(s):  
Dev Mukherji ◽  
Nikita Padalia

Cardiovascular disease is one of the dominant concerns of society, affecting millions of people each year. Early and accurate diagnosis of the risk of heart disease is one of the major areas of medical research, aimed at aiding its prevention and treatment. Most of the approaches used to predict the occurrence of heart disease use single data mining techniques. However, the performance of predictive methods has recently improved through research into hybrid and alternative methods. This paper analyses the performance of logistic regression, support vector machine, and decision trees along with rule-based hybrids of the three in an attempt to create a more accurate predictive model.


2010 ◽  
Vol 9 (4) ◽  
pp. 21-28
Author(s):  
John Ferraris ◽  
Christos Gatzidis ◽  
Feng Tian

This publication proposes a novel approach to automatically colour and texture a given terrain mesh in real time. Through the use of weighting rules, a simple syntax allows for the generation of texture and colour values based on the elevation and angle of a given vertex. It is through this combination of elevation and angle that complex features such as ridges, hills and mountains can be described, with the mesh coloured and textured accordingly. The implementation of the approach is done entirely on the GPU using 2D lookup textures, delivering a great performance increase over typical approaches that pass colour and weighting information in the fragment shader. In fact, the rule set is abstracted enough to be used in conjunction with any colouring/texturing approach that uses weighting values to dictate which surfaces are depicted on the mesh.
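A weighting rule driven by elevation and angle can be pictured as a function from those two values to a blend weight, precomputed into a 2D table. The sketch below is a CPU-side illustration in Python under assumed conventions: the rule fields, the linear fade, and the "snow" example are all hypothetical, and the abstract does not specify the authors' actual rule syntax or shader code.

```python
def band_weight(value, low, high, fade):
    """1.0 inside [low, high], fading linearly to 0.0 over `fade` units outside.

    `fade` must be positive.
    """
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / fade)


def surface_weight(elevation, angle_deg, rule):
    """Blend weight of one surface type for a vertex: elevation term * angle term."""
    return (
        band_weight(elevation, rule["elev_min"], rule["elev_max"], rule["elev_fade"])
        * band_weight(angle_deg, rule["angle_min"], rule["angle_max"], rule["angle_fade"])
    )


def build_lookup(rule, elevations, angles):
    """Precompute weights on a grid, analogous to filling a 2D lookup texture."""
    return [[surface_weight(e, a, rule) for a in angles] for e in elevations]


# Hypothetical rule: "snow" on high, fairly flat ground.
snow = {
    "elev_min": 800.0, "elev_max": 1200.0, "elev_fade": 100.0,
    "angle_min": 0.0, "angle_max": 30.0, "angle_fade": 10.0,
}

table = build_lookup(snow, [700.0, 750.0, 1000.0], [10.0, 35.0, 60.0])
```

Each row of `table` corresponds to one elevation sample and each column to one slope angle, so a renderer could fetch a weight by indexing instead of evaluating the rule per fragment.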


2019 ◽  
Vol 8 (4) ◽  
pp. 2514-2519

Microarray is a rapidly growing technology that plays a dynamic role in the medical field. It is more advanced than MRI (Magnetic Resonance Imaging) and CT (Computerised Tomography) scanning. The purpose of this work is to refine gene expression analysis. In this study, two clustering methods, fuzzy c-means and k-means, are used, and the classification yields better results. The microarray database is classified with a support vector machine. Segmentation is the most important step in microarray image processing. The support vector machine classifier is compared with two other classifiers, namely the k-nearest neighbour and Bayes classifiers.


Author(s):  
Rudolph Joshua Candare ◽  
Michelle Japitana ◽  
James Earl Cubillas ◽  
Cherry Bryan Ramirez

This research describes the methods involved in the mapping of different high value crops in Agusan del Norte, Philippines, using LiDAR. This project is part of the Phil-LiDAR 2 Program, which aims to conduct a nationwide resource assessment using LiDAR. Because of the high resolution data involved, the methodology described here utilizes object-based image analysis and the use of optimal features from LiDAR data and Orthophoto. Object-based classification was primarily done by developing rule-sets in eCognition. Several features from the LiDAR data and Orthophotos were used in the development of rule-sets for classification. Generally, classes of objects cannot be separated by simple thresholds on different features, making it difficult to develop a rule-set. To resolve this problem, the image-objects were subjected to Support Vector Machine learning. SVMs have gained popularity because of their ability to generalize well given a limited number of training samples. However, SVMs also suffer from parameter assignment issues that can significantly affect the classification results. More specifically, the regularization parameter C in linear SVM has to be optimized through cross validation to increase the overall accuracy. After performing the segmentation in eCognition, the optimization procedure as well as the extraction of the equations of the hyper-planes was done in Matlab. The learned hyper-planes separating one class from another in the multi-dimensional feature-space can be thought of as super-features, which were then used in developing the classifier rule set in eCognition. In this study, we report an overall classification accuracy of greater than 90% in different areas.
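The notion of a learned hyper-plane acting as a super-feature can be sketched as follows: the linear SVM's decision value w · x + b is computed per image object, and a single threshold on that value replaces several per-feature thresholds. The weights, bias, feature meanings, and class labels below are hypothetical placeholders, not values from the study, and the sketch is in Python rather than the Matlab and eCognition tools the authors used.

```python
def hyperplane_feature(features, weights, bias):
    """Signed decision value w . x + b of a linear SVM for one image object."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(features, weights, bias):
    """Threshold rule on the super-feature: positive side vs. negative side."""
    return "crop" if hyperplane_feature(features, weights, bias) > 0 else "other"


# Hypothetical hyper-plane over two normalised object features
# (e.g. a height statistic and an intensity statistic).
weights = [0.8, -1.5]
bias = -0.2

value = hyperplane_feature([1.0, 0.2], weights, bias)
label = classify([1.0, 0.2], weights, bias)
other_label = classify([0.1, 0.5], weights, bias)
```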

