Combining Supervised Learning Techniques to Key-Phrase Extraction for Biomedical Full-Text

Author(s):  
Yanliang Qi ◽  
Min Song ◽  
Suk-Chung Yoon ◽  
Lori deVersterre

Key-phrase extraction plays a useful role in Information Systems (IS) research areas such as digital libraries. Short metadata such as key phrases help searchers understand the concepts found in documents. This paper evaluates the effectiveness of different supervised learning techniques on biomedical full text: Sequential Minimal Optimization (SMO) and K-Nearest Neighbor, both of which could be embedded in an information system for document search. The authors use these techniques to extract key phrases from PubMed and evaluate the performance of these systems using the holdout validation method. The paper compares different classifier techniques and the performance differences between a full text and its abstract. Compared with the authors' previous work, which investigated the performance of Naïve Bayes, Linear Regression and SVM(reg1/2), this paper finds that SVMreg-1 performs best in key-phrase extraction for full text, whereas Naïve Bayes performs best for abstracts. These techniques should be considered for use in information system search functionality. Additional research issues are also identified.
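The holdout validation mentioned in the abstract can be illustrated with a minimal sketch. The function name `holdout_split`, the 30% test fraction, and the toy document list are assumptions made for illustration; the abstract does not state the split the authors used.

```python
import random


def holdout_split(items, test_fraction=0.3, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical document identifiers, not real PubMed records.
docs = [f"doc{i}" for i in range(10)]
train, test = holdout_split(docs, test_fraction=0.3, seed=42)
print(len(train), len(test))
```

A classifier would be fitted on `train` only and scored on `test`, so the reported performance reflects documents the model has not seen.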


Author(s):  
V Umarani ◽  
A Julian ◽  
J Deepa

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The process can be automated using machine learning techniques, which analyse text patterns faster. Supervised machine learning is the most widely used mechanism for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques such as multinomial naive Bayes, Bernoulli naive Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, as well as deep learning techniques such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods on a standard data set. The experimental results demonstrate the performance of the classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time, and k-fold cross-validation. They help in appreciating the novelty of the deep learning techniques and give the user an overview for choosing the right technique for their application.


2017 ◽  
Vol 3 (1) ◽  
pp. 9
Author(s):  
Dian Kartika Utami ◽  
Wisnu Ananta Kusuma ◽  
Agus Buono

Metagenome studies are an important step in taxonomic grouping. Grouping in metagenomes can be carried out with a binning method. Binning is needed to group the contigs that belong to each phylogenetic species group. In this study, binning was performed using a composition-based approach built on supervised learning (learning from examples). The supervised learning method used was the Naïve Bayes Classifier, and features were extracted by computing k-mer frequencies. Metagenome classification was performed at the genus taxon level. The accuracy obtained with short fragments (400 bp) was 49.34% for 3-mer feature extraction and 53.95% for 4-mer feature extraction. For long fragments (10 kbp), accuracy increased to 82.23% for 3-mer feature extraction and 85.89% for 4-mer feature extraction. From these results it can be concluded that accuracy increases as fragment length increases. The study also concludes that 4-mer feature extraction gave the best results.

Keywords: metagenome, k-mer, Naïve Bayes Classifier, binning, classification
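The k-mer frequency features described above can be sketched as follows. This is an illustrative sketch only: the function name `kmer_frequencies`, the fixed ACGT ordering, and the choice to skip windows containing other symbols are assumptions, not details from the paper.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all 4**k DNA k-mers in a fixed order."""
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate k-mers in lexicographic order over the alphabet A, C, G, T.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows with symbols outside ACGT (for example N) are left out of the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# Toy fragment, far shorter than the 400 bp and 10 kbp fragments in the study.
vec = kmer_frequencies("ACGTACGTAC", 3)
print(len(vec))
```

Each fragment becomes a vector of length 4^k (64 for 3-mers, 256 for 4-mers), which can then be passed to a classifier such as Naïve Bayes.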


2021 ◽  
Vol 23 (04) ◽  
pp. 356-372
Author(s):  
Manpreet Kaur ◽  
Dr. Dinesh Kumar

Classification techniques based on machine learning are used for big data analysis. They support classification and, ultimately, prediction, which helps decision makers reach better decisions. Various supervised and unsupervised learning techniques are capable of driving such analysis, which is useful for identifying the relationships between the attributes involved. The supervised learning techniques considered are SVM, logistic regression, KNN, Naïve Bayes, tree, and neural network. These techniques are compared on several measures: AUC, CA (classification accuracy), F1, recall, and precision. Naïve Bayes scores highest on AUC and CA, which indicates that it has higher true positive and true negative rates. The proposed technique reaches an accuracy of 81%, well above all the remaining techniques. The confusion matrix for Naïve Bayes shows 729 true positives and 103 true negatives, counts that are far higher than those of the other techniques.


2019 ◽  
Vol 2 (2) ◽  
pp. 83-88
Author(s):  
Arif Saputra

Manually sorting apple varieties results in high costs, subjectivity, boredom, and the inconsistencies associated with human work. A means of distinguishing between types of apples is needed, so reliable techniques are necessary to identify varieties quickly and without damage. The purpose of this research is to investigate the application and performance of the Naive Bayes algorithm for classifying apple varieties. The methodology involves image acquisition, preprocessing, segmentation, and classification analysis of apple varieties. A prototype of the apple classification system was built on the MATLAB R2017 development platform. The results indicate that the estimated average accuracy, sensitivity, precision, and specificity are 81%, 73%, 100%, and 70%, respectively. MLP-Neural shows that the performance of the Naive Bayes technique is consistent with Principal, Fuzzy Logic, and Neural analysis, with 89%, 91%, 87%, and 82%, respectively, in terms of accuracy. This study shows that Naive Bayes has excellent potential for identifying apple varieties nondestructively and accurately.
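The four measures reported above all derive from the counts in a binary confusion matrix. The sketch below uses their standard definitions; the function name `binary_metrics` and the counts passed to it are hypothetical and are not the study's data.

```python
def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion-matrix counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # also called recall
        "precision": _ratio(tp, tp + fp),
        "specificity": _ratio(tn, tn + fp),
    }


# Hypothetical counts chosen only to exercise the formulas.
m = binary_metrics(tp=40, fp=10, tn=30, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Note that a precision of 100%, as reported in the abstract, corresponds to zero false positives under these definitions.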


Nowadays people share their views and opinions on Twitter and other social media platforms. Twitter sentiment analysis is the task of recognizing sentiment and opinion in tweets. Determining the polarity or sentiment of tweets and then sorting them into positive, negative, and neutral tweets is the main classification step in this process. A key issue in sentiment analysis is choosing an appropriate sentiment classifier algorithm to label the tweets. Baseline classifiers such as logistic regression, the Naive Bayes classifier, random forest, and SVMs are commonly used. In this paper, the Naïve Bayes classifier and logistic regression are used to perform sentiment analysis, and the two techniques are compared on classification accuracy. The results show that the Naive Bayes classifier works better for this approach. Data pre-processing and feature extraction are carried out as part of the task.


Author(s):  
Jothikumar R. ◽  
Vijay Anand R. ◽  
Visu P. ◽  
Kumar R. ◽  
Susi S. ◽  
...  

Sentiment analysis refers to extracting sentiments from natural language and identifying the attitude toward a particular topic. The novel coronavirus, a harmful disease, is spreading unexpectedly across the region and causes respiratory tract infections that range from mild to severe. Because of its rapid spread and the absence of a known cure, it has brought about a sense of stress and anxiety. In this chapter, a machine learning based approach is used to discover the sentiments of tweets related to COVID and the resulting lockdown. The chapter examines tweets associated with the hashtags for coronavirus and lockdown. The tweets were labeled positive, negative, or neutral, and a list of classifiers was used to analyze accuracy and performance. The classifiers used fall under four models: decision tree, regression, support vector machine, and naïve Bayes.


Chronic Kidney Disease (CKD) refers to a state of kidney damage that may worsen over time and with contributing factors. If it continues to worsen, dialysis is required, and in the worst case it may lead to kidney failure (End-Stage Renal Disease). Detecting CKD at an early stage helps in avoiding complications and damage. In previous work, the classifiers applied were SVM and Naïve Bayes. The execution time taken by Naïve Bayes was negligible compared with SVM, whereas misclassified instances were much fewer with SVM, which means lower classification performance for Naïve Bayes, owing to a slight difference in accuracy. This can be corrected by using fewer attributes. Naïve Bayes is a probabilistic classifier, a simple calculation using Bayes' theorem with a conditional independence assumption. The work mainly aims at increasing diagnostic accuracy and reducing classification time, which is the main objective. An attempt is made to build a model evaluating CKD data collected from a particular group of people. From the model data, identification can be performed. This work focuses on developing a system based on classification methods: SVM and Naïve Bayes. The glomerular filtration rate (GFR) is the best indicator of how well the kidneys are working. CKD has no cure, but it can be treated based on symptoms to reduce complications and ...
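For reference, the Naïve Bayes rule described in the abstract can be written in its standard textbook form. The notation is an assumption introduced for illustration and does not come from the source: $c$ ranges over the classes and $x_1, \dots, x_n$ are the observed attribute values.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The product over individual attributes is where the conditional independence assumption enters: each attribute is treated as independent of the others once the class is known.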

