web content mining
Recently Published Documents


TOTAL DOCUMENTS

92
(FIVE YEARS 19)

H-INDEX

8
(FIVE YEARS 1)

2021 ◽  
pp. 159-172
Author(s):  
Priyanka Shah ◽  
Hardik B. Pandit

2021 ◽  
Author(s):  
Nur Fitriyani Che Razali ◽  
Masurah Mohamad ◽  
Khairulliza Ahmad Salleh ◽  
Muhammad Hafizuddin Abd Rahman Sani ◽  
Lathifah Alfat

Author(s):  
Monther Khalafat ◽  
Ja'far S. Alqatawna ◽  
Rizik M. H. Al-Sayyed ◽  
Mohammad Eshtay ◽  
Thaeer Kobbaey

The influence of social media on many aspects of our lives is increasing, and scholars from various disciplines regard social media networks as an ongoing revolution. In social media networks, many bonds and connections can be established, whether as direct or indirect ties. Social networks are used not only by people but also by companies. People usually create their own profiles and join communities to discuss common issues that interest them. Companies, for their part, can create a virtual presence on social media networks to understand their customers and gather richer information about them. Despite these benefits, social media networks should not always be seen as a safe place for communicating, sharing information and ideas, and establishing virtual communities. The information and ideas shared could carry hate speech that must be detected to avoid inciting violence. Web content mining can be used to handle this issue, and it is gaining attention because of its importance for many businesses and institutions. Sentiment Analysis (SA) is an important sub-area of web content mining. The purpose of SA is to determine the overall sentiment of a writer towards a specific entity and to classify these opinions automatically. There are two main approaches to building sentiment analysis systems: the machine learning approach and the lexicon-based approach. This research presents the design and implementation of violence detection over social media using a machine learning approach. Our system works on the Jordanian Arabic dialect instead of Modern Standard Arabic (MSA). The data was collected from two popular social media websites (Facebook, Twitter) and annotated by native speakers. Moreover, different preprocessing techniques were applied to show their effect on model accuracy. An Arabic lexicon was used to generate feature vectors and separate them into feature sets. Three well-known machine learning algorithms were used: Support Vector Machine (SVM), Naive Bayes (NB) and k-Nearest Neighbors (KNN). The Information Science Research Institute's (ISRI) stemmer and a stop-word file were used in preprocessing to extract the features. Several features were extracted; with the SVM classifier, unigrams and lexicon-derived features yielded the highest accuracy in detecting violence.
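As a rough illustration of the unigram-feature, machine-learning approach the abstract describes, the sketch below trains a small multinomial Naive Bayes classifier (one of the three algorithms named) using only the Python standard library. It is not the authors' system: the English toy sentences, the label names, and the helper functions `tokenize`, `train_nb`, and `predict_nb` are all hypothetical, and the ISRI stemming, stop-word removal, and lexicon features from the paper are omitted.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (stand-in for real preprocessing)."""
    return text.lower().split()


def train_nb(samples):
    """Count documents per class and unigram occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; the paper used annotated Jordanian Arabic posts.
TRAIN = [
    ("they will attack and hurt people", "violent"),
    ("we should burn and destroy them", "violent"),
    ("lovely weather for a walk today", "non-violent"),
    ("enjoyed a nice dinner with friends", "non-violent"),
]
model = train_nb(TRAIN)
print(predict_nb(model, "they want to hurt and destroy"))
```

Swapping in an SVM or KNN classifier would leave the unigram counting step unchanged; only the decision rule in `predict_nb` would differ.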


2021 ◽  
Vol 61 ◽  
pp. 102588
Author(s):  
Jinfeng Zhou ◽  
Jinliang Wei ◽  
Bugao Xu

Author(s):  
Canhui Li

Background: Filtration is used to improve information efficiency in web text mining. Methods: A web content mining technology based on web text mining, augmented information support (AIS), is proposed to improve web text mining efficiency. The AIS technology is applied to the Xiangshan Science Conference website, and the AIS4XSSC text mining system is developed. The developed system is tested for its efficiency, and its main functions are discussed. Results: 192 documents are represented as 8352-dimensional vectors, giving a 192 × 8352 matrix; the similarity between the 192 vectors is calculated using the cosine of the included angle, giving a 192 × 192 symmetric matrix; and 35 categories are formed by hierarchical clustering on the similarity between texts. Conclusion: The results show that the AIS technology can effectively extract information from a large number of web texts. The proposed system improves information retrieval efficiency and can push valuable information to users.
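The pipeline in the Results (document vectors, a symmetric cosine-similarity matrix, then hierarchical clustering) can be sketched on a toy scale as follows. This is a minimal illustration, not the AIS4XSSC implementation: the four 4-dimensional vectors, the average-linkage rule, and the 0.5 stopping threshold are assumptions chosen for the example, and the abstract does not state which linkage or cut-off was used.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(vectors):
    """n x n symmetric matrix of pairwise cosine similarities."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def agglomerate(sim, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters and stops once the
    best available average similarity falls below `threshold`.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = -math.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                links = [sim[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(links) / len(links)
                if avg > best:
                    best, pair = avg, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical term-count vectors standing in for the 192 x 8352 matrix.
DOCS = [
    [2, 1, 0, 0],
    [3, 1, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 3],
]
SIM = similarity_matrix(DOCS)
CLUSTERS = agglomerate(SIM, threshold=0.5)
print(CLUSTERS)
```

With 192 documents the same code would yield the 192 × 192 matrix mentioned in the abstract; the number of resulting categories depends on the linkage rule and threshold chosen.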


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Chunmin Lang ◽  
Sibei Xia ◽  
Chuanlan Liu

Purpose: This study intends to examine consumers' fashion customization experiences through a web content mining (WCM) approach. By applying the theory of customer value, this study explores the benefits and costs of two levels of mass customization (MC) to identify the values derived from style (i.e. shoe customization) and fit customization experiences (i.e. apparel customization) and further to compare the dominating dimensions of value derived across style and fit customization. Design/methodology/approach: A WCM approach was applied. Also, two case studies were conducted with one focusing on style customization and the other focusing on fit customization. The brand Vans was selected to examine style customization in study 1. The brand Sumissura was selected to examine fit customization in study 2. Consumers' comments on customization experiences from these two brands were collected through social networks, respectively. After data cleaning, 394 reviews for Vans and 510 reviews for Sumissura were included in the final data analysis. Co-occurrence plots, feature extraction and grouping were used for the data analysis. Findings: The emotional value was found to be the major benefit for style customization, while the functional value was indicated as the major benefit for fit customization, followed by ease of use and emotional value. In addition, three major themes of costs, including unsatisfied service, disappointing product performance and financial risk, were revealed by excavating and evaluating consumers' feedback of their actual clothing customization experiences with Sumissura. Originality/value: This study initiates the effort to use web mining, specifically, the WCM approach to thoroughly investigate the benefits and costs of MC through real consumers' feedback of two different types of fashion products. The analysis of this study also reflects the levels of customization: style and fit. It provides an in-depth text analysis of online MC consumers' feedback through the use of feature extraction analysis and word co-occurrence networks.
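The word co-occurrence analysis mentioned above can be illustrated with a short sketch that counts how often two words appear in the same review; such pair counts are the edge weights of a co-occurrence network. The review texts, the stop-word list, and the function name `cooccurrence` are hypothetical and are not drawn from the study's Vans or Sumissura data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list for the example.
STOP_WORDS = {"the", "a", "and", "is", "was", "but", "were"}


def cooccurrence(reviews):
    """Count, for each unordered word pair, the reviews containing both words."""
    pairs = Counter()
    for review in reviews:
        # Use a set so each word counts once per review.
        words = {w.strip(".,!?").lower() for w in review.split()}
        words -= STOP_WORDS
        words.discard("")
        # Sorting makes each pair key order-independent.
        for pair in combinations(sorted(words), 2):
            pairs[pair] += 1
    return pairs


# Hypothetical reviews.
REVIEWS = [
    "The fit was perfect and the fabric is great",
    "Perfect fit but the delivery was slow",
    "Delivery was slow and the fabric is thin",
]
PAIRS = cooccurrence(REVIEWS)
for pair, count in PAIRS.most_common(2):
    print(pair, count)
```

Pairs with high counts (here, for example, "fit" with "perfect") would appear as strongly connected nodes in a co-occurrence plot, which is how themes such as benefits and costs can be grouped.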


2020 ◽  
Vol 25 (2) ◽  
pp. 1-16
Author(s):  
Rasha Hany Salman ◽  
Mahmood Zaki ◽  
Nadia A. Shiltag

The web today has become an archive of information in every form, such as text, sound, video, graphics, and multimedia. Over time, the World Wide Web has become crowded with diverse data, making the extraction of such data a burdensome process. Web mining applies various data mining techniques to extract useful information from page content and web hyperlinks. The fundamental uses of web content mining are to gather, organize, and classify information, providing the best data available on the web to the user who needs it. WCM tools are needed to examine HTML documents, text, and images; the result is then used by the web engine. This paper presents an overview of web mining categorization and web content techniques, and a critical review of web content mining tools from 2011 to 2019, with a table comparing these tools on several important criteria.

