Hybridization of DBN with SVM and its Impact on Performance in Multi-Document Summarization

2021 ◽  
Vol 8 (3) ◽  
pp. 37-51

Data available from web-based sources has grown tremendously with the growth of the internet. Users interested in information from such sources often use a search engine to obtain the data, which they then edit for presentation to their audience. This process can be tedious, especially when it involves generating a summary. One way to ease it is to automate summary generation. Research on automatic summarization has yielded several approaches, among them machine learning. Recommendations have been made to combine algorithms with different strengths, a practice also called hybridization, in order to enhance their performance. This research therefore sought to establish the impact of hybridizing a Deep Belief Network (DBN) with a Support Vector Machine (SVM) on precision, recall, accuracy and F-measure in query-oriented multi-document summarization. The experiments were carried out using data from the National Institute of Standards and Technology (NIST) Document Understanding Conference (DUC) 2006. The data was split into training and test sets and used with DBN, SVM, an SVM-DBN hybrid and a DBN-SVM hybrid. Results indicated that the hybridized algorithm has better precision, accuracy and F-measure than DBN alone. Pre-classification hybridization of DBN with SVM (SVM-DBN) gives the best results. This research implies that DBN and SVM hybrid algorithms would enhance query-oriented multi-document summarization.

Plants ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 95
Author(s):  
Heba Kurdi ◽  
Amal Al-Aldawsari ◽  
Isra Al-Turaiki ◽  
Abdulrahman S. Aldawood

In the past 30 years, the red palm weevil (RPW), Rhynchophorus ferrugineus (Olivier), a pest that is highly destructive to all types of palms, has rapidly spread worldwide. However, detecting infestation with the RPW is highly challenging because symptoms are not visible until the death of the palm tree is inevitable. In addition, the use of automated RPW identification tools to predict infestation is complicated by a lack of RPW datasets. In this study, we assessed the capability of 10 state-of-the-art data mining classification algorithms, namely Naive Bayes (NB), KSTAR, AdaBoost, bagging, PART, J48 decision tree, multilayer perceptron (MLP), support vector machine (SVM), random forest, and logistic regression, to use plant-size and temperature measurements collected from individual trees to predict RPW infestation in its early stages, before significant damage is caused to the tree. The performance of the classification algorithms was evaluated in terms of accuracy, precision, recall, and F-measure using a real RPW dataset. The experimental results showed that RPW infestations can be predicted using data mining with an accuracy of up to 93%, precision above 87%, recall equal to 100%, and F-measure greater than 93%. Additionally, we found that temperature and circumference are the most important features for predicting RPW infestation. However, we strongly call for collecting and aggregating more RPW datasets to run more experiments to validate these results and provide more conclusive findings.
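The reported figures can be cross-checked against the usual definition of the F-measure as the harmonic mean of precision and recall. The sketch below is illustrative only: it assumes the balanced F1 form (the abstract does not say which F-measure variant was used), and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall (F1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta**2 * precision + recall
    if denom == 0:
        # Both inputs are zero: define the score as 0 rather than dividing by zero.
        return 0.0
    return (1 + beta**2) * precision * recall / denom


# Lower bound quoted in the abstract: precision 0.87 with recall 1.00.
# Assumption: "F-measure" here means F1.
print(round(f_measure(0.87, 1.00), 4))
```

Under that assumption, a precision of 0.87 with perfect recall gives an F1 of about 0.93, which is in line with the reported "greater than 93%".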


2017 ◽  
Vol 25 (3) ◽  
pp. 98-120 ◽  
Author(s):  
Vicki R. Lane ◽  
Jiban Khuntia ◽  
Madhavan Parthasarathy ◽  
Bidyut B. Hazarika

In this study, the authors examine how the internet is changing two critical personal value dimensions of India's youth. Based on values theory, and using data that spans the decade from 2004 to 2014, they contend that time spent on the internet is an influential factor in changing self-enhancement and self-transcendence values. Given the tremendous increase in exposure to western products, ideals, and people-to-people interaction via internet connectivity (India has over 275 million internet users who communicate in the English language), the authors posit that young Indian consumers would adopt values associated with self-enhancement and individualism, forsaking ideals related to self-transcendence. Data pertaining to the Rokeach value scales were collected in New Delhi, and the results support the notion that these values have indeed changed substantially in a short amount of time, largely due to IT as opposed to other media vehicles such as TV and print media. Implications of this noteworthy change in values due to the internet in a relatively short period are discussed.


2015 ◽  
Vol 15 (3) ◽  
pp. 1031-1065 ◽  
Author(s):  
Will Dobbie ◽  
Roland G. Fryer, Jr.

This paper provides causal estimates of the impact of service programs on those who serve, using data from a web-based survey of former Teach For America (TFA) applicants. We estimate the effect of voluntary youth service using a discontinuity in the TFA application process. Participating in TFA increases racial tolerance, makes individuals more optimistic about the life prospects of poor children, and makes them more likely to work in education.


Author(s):  
Nur Azizul Haqimi ◽  
Nur Rokhman ◽  
Sigit Priyanta

Instagram (IG) is a web-based and mobile social media application where users can share photos or videos using the available features. Photos or videos uploaded with captions describing them can attract spam comments, that is, comments that are not relevant to the caption and photo. A problem that arises when identifying spam is that non-spam comments far outnumber spam comments, which leads to an imbalanced dataset. The class balance of a dataset can influence the performance of a classification method. This research focuses on applying the CNB method to imbalanced datasets for the detection of Instagram spam comments. The study used TF-IDF weighting, with Support Vector Machine (SVM) as the comparison classifier. In tests with 2500 training samples and 100 test samples on the imbalanced dataset (25% spam and 75% non-spam), CNB achieved 92% accuracy, 86% precision and 93% F-measure, whereas SVM achieved 87% accuracy, 79% precision and 88% F-measure. In conclusion, the CNB method is more suitable for detecting spam comments in cases of imbalanced datasets.


Author(s):  
Jasman Pardede ◽  
Raka Gemi Ibrahim

Hoaxes, or fake news, spread very quickly on social media. Such news can influence readers and poison their thinking. Problems like this must be addressed strategically by identifying the news that is read and disseminated on social media. Several methods proposed for predicting hoaxes use the Support Vector Classifier, Logistic Regression, and MultinomialNaiveBayes. In this study, the researchers apply Long Short-Term Memory to identify hoaxes. System performance is measured by precision, recall, accuracy, and F-measure. In experiments on hoax data, the average precision, recall, accuracy, and F-measure were 0.94, 0.96, 0.94, and 0.95, respectively. The experiments found that the proposed Long Short-Term Memory performs better than the previous methods.


The goal of Sentiment Exploration (SE) is to mine accurate sentiments, which are very beneficial for businesses, governments, and individuals, as opinions, recommendations, ratings, and feedback are becoming increasingly important. The proposed methodology introduces a supervised, swarm-intelligence-based approach to sentiment classification. To obtain a relevant feature set from a large number of data samples, the method uses particle swarm optimization to find an optimal feature set. The feature set is evaluated using the Minimum Redundancy and Maximum Relevancy measure as the fitness function. The extracted feature set is then classified with the Support Vector Machine technique. The proposed method was evaluated using four performance measures (precision, recall, accuracy, and F-measure), and the results showed that the swarm-intelligence-based classification method has better performance on the IMDB, Movie Lens and Trip Advisor data samples.


Author(s):  
Huan Zhang ◽  
Hongyang Wang ◽  
Huiyu Yan ◽  
Xiaoyu Wang

The number of elderly Internet users has increased significantly in the past few years. However, the impact of Internet use on mental health remains unclear. In this study, we performed a difference-in-differences analysis using data from the 2016 and 2018 waves of the China Family Panel Studies (CFPS) to evaluate the impact of Internet usage on mental health among elderly individuals. A total of 5031 validated respondents were included to explore the relationship between Internet use and reduced levels of depression as well as improved life satisfaction among elderly individuals. The results showed that Internet use significantly reduced depression levels. Unexpectedly, Internet use was not found to improve life satisfaction. Moreover, discontinuing Internet use was not significantly associated with improvements in depression or life satisfaction. More research is needed to fully elucidate the relationship between Internet use and depression levels, as well as life satisfaction among elderly individuals.
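For readers unfamiliar with the method, the core of a difference-in-differences comparison can be sketched in a few lines. This is the textbook two-group, two-period estimator only; the abstract does not describe the study's actual specification (covariates, regression form, matching), so nothing here should be read as the authors' model. The function and argument names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def did_estimate(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Change in the treated group's mean outcome minus the change in the control group's."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change
```

A negative estimate on a depression score would indicate that the outcome fell more in the group that started using the Internet than in the comparison group, provided the two groups would otherwise have followed parallel trends.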


The massive accumulation of data from the internet has attracted the attention of researchers. The data is collected in both structured and unstructured forms. Structured data consists of messages, transactions, conversations, etc., while unstructured data comprises video and audio clips. This work addresses the raw data problem, in which unreferenced clustering is used. A hybrid approach is proposed using cosine similarity and soft cosine. A novel clustering technique is designed and cross-validated using the Support Vector Machine (SVM). The validated approach is further verified using K-means clustering. The clustering results are then evaluated using precision, recall, and F-measure. The evaluated results show an improvement in precision and recall due to the hybridization of the cosine similarity and soft cosine techniques.
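The two similarity measures named in this abstract can be contrasted with a minimal sketch. It shows the standard definitions only, not the authors' hybrid, which the abstract does not specify. The function names and the term-similarity matrix `sim` are assumptions; `sim` is taken to be symmetric and positive semi-definite, with ones on the diagonal.

```python
import math
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def cosine(a: Vector, b: Vector) -> float:
    """Plain cosine similarity: terms are treated as mutually unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def soft_cosine(a: Vector, b: Vector, sim: Matrix) -> float:
    """Soft cosine: sim[i][j] gives the relatedness of term i and term j."""

    def form(u: Vector, v: Vector) -> float:
        # Bilinear form: sum over i, j of sim[i][j] * u[i] * v[j].
        return sum(
            sim[i][j] * u[i] * v[j] for i in range(len(u)) for j in range(len(v))
        )

    norm = math.sqrt(form(a, a)) * math.sqrt(form(b, b))
    return form(a, b) / norm if norm else 0.0
```

With an identity matrix for `sim`, soft cosine reduces to plain cosine. With off-diagonal entries above zero, two documents that share no terms but use related ones receive a non-zero score, which is the property that makes it attractive for clustering short texts.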


2015 ◽  
Vol 105 (5) ◽  
pp. 105-109 ◽  
Author(s):  
Natalie Cox ◽  
Benjamin Handel ◽  
Jonathan Kolstad ◽  
Neale Mahoney

The ability of web-based retailers to learn about and provide targeted consumer experiences is touted as an important distinction from traditional retailers. In principle, web-based insurance exchanges could benefit from these advantages. Using data from a large-scale experiment by a private sector health insurance exchange, we estimate the returns to experimentation and targeted messaging. We find significant improvements in conversions in one treatment tested. Underlying the average impact were both intertemporal and demographic heterogeneity. We estimate that learning and targeted messaging could increase insurance applications by approximately 13 percent of the baseline conversion rate.


Author(s):  
Rana Alhajri ◽  
Ahmed A. Alhunaiyyan ◽  
Eba' AlMousa

Recent studies have focused on understanding learner performance and behaviour using Web-Based Instruction (WBI) systems which accommodate individual differences. Studies have investigated the effect on performance of differences such as gender, cognitive style and prior knowledge, each considered individually. In this article, the authors describe a case study using a large student user base. They analysed the performance of combinations of individual differences to investigate how each investigated item influenced learning performance. The data was filtered to validate the data mining findings in order to investigate the sensitivity of the results. A moving data threshold was used to evaluate their findings and to understand what could affect the performance. Using data mining techniques, the authors found that certain combinations of individual differences altered a learner's performance level significantly. They conclude that designers of WBI applications need to consider the combination of individual differences rather than considering them individually in measuring learners' performance.

