Web Text Mining: Latest Research Papers

Logical Intelligent Detection Algorithm of Chinese Language Articles Based on Text Mining

With the advent of the big data era and the rapid development of the Internet industry, text mining has come to play an indispensable role in natural language processing. Many everyday applications depend on natural language processing, such as machine translation, intelligent response, and semantic search. At the same time, with the development of artificial intelligence, text mining has gradually become a research hotspot. There are many ways to implement text mining. This paper describes implementations of web text mining and of an HTML-based text structure algorithm, and compares the clustering time of several web text mining methods. The comparison also shows which web mining method is the most efficient. Repeated experimental comparisons on the WebKB datasets also indicate that web text mining provides a basis for the logical intelligent detection algorithm for Chinese-language articles.

A web text mining approach for the evaluation of regional characteristics at the town level

Transactions in GIS ◽ 10.1111/tgis.12763 ◽ 2021

Author(s): Shu Wang ◽ Lang Qian ◽ Yunqiang Zhu ◽ Jia Song ◽ Feng Lu ◽ ...

Keyword(s): Text Mining ◽ Regional Characteristics ◽ Web Text Mining ◽ The Town

Hadoop-Based Painting Resource Storage and Retrieval Platform Construction and Testing

Complexity ◽ 10.1155/2021/9933330 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Chenhua Zu

Keyword(s): Remote Sensing ◽ Data Analysis ◽ Data Storage ◽ Image Data ◽ Remote Sensing Image ◽ Development Environment ◽ Log Data ◽ Storage And Retrieval ◽ Web Text Mining ◽ Log Data Analysis

This paper uses Hadoop to build and test a storage and retrieval platform for painting resources. Hadoop serves as the platform and MapReduce as the computing framework, and the Hadoop Distributed File System (HDFS) stores the massive log data, which solves the storage problem for data at that scale. Following the business requirements, the system is designed around the web text mining process and divided into four modules: log data preprocessing, log data storage, log data analysis, and log data visualization. The log data analysis module is the core of the system. Statistical analysis covers search keyword ranking, the relationship between Uniform Resource Locators (URLs) and user clicks, URL ranking, and other dimensions. Search keywords are first grouped by Canopy coarse clustering, K-means clustering is then applied to the Canopy results, and cosine similarity is used to group users and build user portraits. The Hadoop development environment is installed and deployed, and functional and performance tests are run on the implemented system. The constructed private cloud platform for remote sensing image data supports online retrieval of remote sensing image metadata and fast download of remote sensing image data, and to a certain extent solves the problems of storing, sharing, and managing remote sensing image data.
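The Canopy-then-K-means step in the abstract above can be illustrated with a small sketch. This is not the paper's implementation: it runs in plain Python rather than on MapReduce, it uses a simplified Canopy pass with a single tight threshold (the hypothetical parameter `t2`) instead of the usual loose and tight pair, and the user keyword logs are invented for illustration. The Canopy centers seed K-means, and cosine similarity is the only closeness measure.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (dicts)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def canopy_centers(vectors, t2=0.5):
    """Simplified Canopy pass (assumption: one tight threshold only).

    Take the first remaining point as a center, drop every point whose
    similarity to it is at least t2, and repeat until nothing remains.
    """
    remaining = list(range(len(vectors)))
    centers = []
    while remaining:
        c = remaining.pop(0)
        centers.append(dict(vectors[c]))
        remaining = [i for i in remaining if cosine(vectors[c], vectors[i]) < t2]
    return centers


def mean_vector(group):
    """Component-wise mean of a list of sparse vectors."""
    total = Counter()
    for vec in group:
        total.update(vec)  # Counter.update adds values key by key
    return {k: v / len(group) for k, v in total.items()}


def kmeans(vectors, seeds, iterations=10):
    """K-means with cosine similarity, seeded by the Canopy centers."""
    centers = list(seeds)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its most similar center.
        labels = [max(range(len(centers)), key=lambda j: cosine(v, centers[j]))
                  for v in vectors]
        # Recompute each center as the mean of its members.
        for j in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = mean_vector(members)
    return labels, centers


# Hypothetical search-keyword logs, one string per user.
logs = [
    "oil painting landscape oil",
    "landscape oil canvas",
    "ink wash bamboo",
    "bamboo ink scroll",
]
vectors = [Counter(line.split()) for line in logs]
seeds = canopy_centers(vectors, t2=0.5)
labels, centers = kmeans(vectors, seeds)
print(labels)
```

Seeding K-means from the Canopy pass means the number of clusters does not have to be chosen by hand, which is the usual motivation for combining the two.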

Research on an Enhanced web Information Processing Technology based on AIS text Mining

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽ 10.2174/2352096513999201026224357 ◽ 2020 ◽ Vol 13

Author(s): Canhui Li

Keyword(s): Text Mining ◽ Symmetric Matrix ◽ Information Support ◽ Web Content ◽ Mining System ◽ Information Efficiency ◽ Web Content Mining ◽ Content Mining ◽ Web Text Mining ◽ Text Mining System

Background: Filtration is used to improve information efficiency in web text mining. Methods: Augmented information support (AIS), a web content mining technology based on web text mining, is proposed to improve web text mining efficiency. The AIS technology is applied to the Xiangshan science conference website, and the AIS4XSSC text mining system is developed. The developed system is tested for efficiency, and its main functions are discussed. Results: 192 documents are represented by 8352 vectors, and 192 × 8352 vectors are obtained; the similarity between the 192 vectors is calculated using the cosine of the included angle, yielding a 192 × 192 symmetric matrix, and 35 categories are formed by hierarchical clustering on the similarity between texts. Conclusion: The results show that the AIS technology can effectively extract information from a large amount of web text. The proposed system improves the efficiency of information retrieval and can push valuable information to users.
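The Results step above (document vectors, a symmetric cosine similarity matrix, then hierarchical clustering) can be sketched on toy data. This is an illustration under stated assumptions, not the AIS4XSSC system: the four document vectors are invented, average linkage is assumed because the abstract does not name a linkage rule, and the stopping value `threshold` is a hypothetical parameter.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two dense term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(rows):
    """n x n symmetric matrix of pairwise cosine similarities."""
    n = len(rows)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sim[i][j] = sim[j][i] = cosine(rows[i], rows[j])
    return sim


def agglomerate(sim, threshold):
    """Average-link agglomerative clustering (linkage is an assumption).

    Repeatedly merge the two clusters with the highest average pairwise
    similarity; stop when that value falls below the threshold.
    """
    clusters = [[i] for i in range(len(sim))]

    def avg(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(avg(clusters[p], clusters[q]), p, q)
                 for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        best, p, q = max(pairs)
        if best < threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Toy document-term matrix: 4 documents x 5 terms (invented counts).
docs = [
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 2, 1, 1],
]
sim = similarity_matrix(docs)
clusters = agglomerate(sim, threshold=0.5)
print(clusters)
```

With 192 documents the same construction gives the 192 × 192 symmetric matrix the abstract mentions; only the upper triangle needs to be computed because cosine similarity is symmetric.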

A new model for iris data set classification based on linear support vector machine parameter's optimization

International Journal of Electrical and Computer Engineering (IJECE) ◽ 10.11591/ijece.v10i1.pp1079-1084 ◽ 2020 ◽ Vol 10 (1) ◽ pp. 1079

Author(s): Zahraa Faiz Hussain ◽ Hind Raad Ibraheem ◽ Mohammad Alsajri ◽ Ahmed Hussein Ali ◽ Mohd Arfian Ismail ◽ ...

Keyword(s): Data Mining ◽ Support Vector Machine ◽ Machine Learning Techniques ◽ Support Vector ◽ Svm Classifier ◽ Data Set ◽ Principle Components Analysis ◽ Web Text Mining ◽ Linear Svm ◽ Iris Data

Data mining is the process of detecting patterns in large amounts of data, a form of knowledge discovery. Classification is a data analysis task that extracts a model describing important data classes. One of the outstanding classification methods in data mining is the support vector machine (SVM). It is capable of predicting results and is often more effective than other classification methods. The SVM is a well-known supervised machine learning technique that has been applied successfully to a variety of problems, including regression, classification, and clustering, in diverse domains such as gene expression and web text mining. In this study, we propose a new model for classifying the iris data set using an SVM classifier, with a genetic algorithm to optimize the C and gamma parameters of the linear SVM; in addition, principal component analysis (PCA) is used for feature reduction.

Contemporary Chinese parents' needs and questions of parenting for young children: a Web text-mining approach.

Information Research: an international electronic journal ◽ 10.47989/irpaper877 ◽ 2020 ◽ Vol 25 (4)

Author(s): Huihua He ◽ Si He ◽ Yan Li ◽ ...

Keyword(s): Text Mining ◽ Young Children ◽ Age Groups ◽ Parental Knowledge ◽ Policy Makers ◽ Mainland Chinese ◽ Care Givers ◽ Chinese Parents ◽ Web Text Mining ◽ Contemporary Chinese

Introduction. The current study investigated characteristics of parenting needs and questions of Mainland Chinese parents of young children. Specifically, Web text-mining technology was used to identify themes of parenting needs and questions, and parents' emotional status hidden in their question texts. Method. A total of 921,483 questions that parents posted on the top five parenting Websites in China during a 36-month study period were collected. Results. Daily care is one of the most important topics that concerned parents. Contemporary Mainland Chinese parents tend to raise questions about parental knowledge and skills. Different themes of questions could also be identified from different care-givers and different age groups of young children. Conclusions. From a parenting-oriented perspective, contemporary Chinese parents frequently asked personalised questions through the Internet. Considerable needs relating to grandparenting emerged. Programme designers and social policy makers should empower and support young children's parents with their parental knowledge, skills and emotional competence.

Analysis of Users' Health Knowledge Requirement and Health Perception in Senior Online Community Based on Web Text Mining

Proceedings of the 2nd International Conference on Social Science, Public Health and Education (SSPHE 2018) ◽ 10.2991/ssphe-18.2019.49 ◽ 2019

Author(s): Yuxing Qian ◽ Huayang Zhou ◽ Hao Li ◽ Meiling Ren ◽ Wenxuan Gui ◽ ...

Keyword(s): Text Mining ◽ Online Community ◽ Health Knowledge ◽ Health Perception ◽ Community Based ◽ Web Text Mining

EKSTRAKSI DAN VISUALISASI WEB TEXT MINING MENGGUNAKAN JSOUP [Extraction and Visualization of Web Text Mining Using Jsoup]

Journal of Computer and Information System ( J-CIS ) ◽ 10.31605/jcis.v1i1.230 ◽ 2018 ◽ Vol 1 (1) ◽ pp. 40-49

Author(s): Sugiarto Cokrowibowo ◽ Ismail Majid

Keyword(s): World Wide Web ◽ Text Mining ◽ World Wide ◽ Word Cloud ◽ Web Text Mining

There are billions of web documents on the World Wide Web, which keeps growing in volume, velocity, and complexity, and most of its content is by nature unstructured. A technique or tool is needed to extract text data from a web page that can adapt to both unstructured and semi-structured page content. In this study, the authors propose the Java library Jsoup for extracting web documents and then visualizing the results as a word cloud.
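The extract-then-count pipeline behind a word cloud can be sketched as follows. The paper uses the Java library Jsoup; this sketch swaps in Python's standard-library `html.parser` as a stand-in, so it shows the general idea rather than the authors' code. The sample page, the `TextExtractor` class, and the `word_frequencies` helper are all hypothetical, and the tokenizer only matches ASCII letters.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def word_frequencies(html):
    """Return word counts for the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    # Assumption: ASCII-only tokens; extend the pattern for other scripts.
    return Counter(re.findall(r"[a-z]+", text))


# Hypothetical sample page.
sample_html = """
<html><head><title>Demo</title><style>p { color: red; }</style></head>
<body><h1>Web mining</h1><p>Text mining of web pages.</p>
<script>var mining = 1;</script></body></html>
"""
freq = word_frequencies(sample_html)
print(freq.most_common(3))
```

The resulting counts are the usual input to a word-cloud renderer, where a word's frequency sets its font size.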

Comparative Analysis of Stemming Algorithms for Web Text Mining

International Journal of Modern Education and Computer Science ◽ 10.5815/ijmecs.2018.09.03 ◽ 2018 ◽ Vol 10 (9) ◽ pp. 20-25 ◽ Cited By ~ 1

Author(s): Muhammad Haroon

Keyword(s): Comparative Analysis ◽ Text Mining ◽ Web Text Mining

Research on the Promotion of Word of Mouth in Tourist Scenic Spots Based on Web Text Mining——the Case Study of Wanlu Valley in Guangdong Province

MATEC Web of Conferences ◽ 10.1051/matecconf/201817303060 ◽ 2018 ◽ Vol 173 ◽ pp. 03060

Author(s): ZHANG Ying

Keyword(s): High Frequency ◽ Word Of Mouth ◽ Semantic Network ◽ Guangdong Province ◽ Sharing Economy ◽ Tourism Industry ◽ Industrial Transformation ◽ Content Mining ◽ Web Text Mining ◽ Improve Service Quality

Against the background of the Internet economy and the sharing economy, tourist scenic spots should pay more attention to tourists' online public opinion and cultivate online word of mouth. Taking the Wanlu Valley ecotourism area in Guangdong Province as an example, the paper collects Baidu index data and uses ROST Content Mining software to mine post-consumption review text from five tourism websites: Tongcheng, Ctrip, Grasshopper's Honeycomb, Meituan, and Qunar. High-frequency characteristic words are mined from the review text, a social semantic network matrix map is constructed, and the tourist network attention index is analyzed together with the perception information in tourists' reviews. The results demonstrate that the characteristics of the scenic spots, service attitude, and tourist facilities are the focus of tourist evaluation: the number of high-frequency words is large and the degree of praise is high. Therefore, scenic spots should attend to the integrated development of the "tourism +" industry, improve service quality, enrich tourism experience projects, and promote the industrial transformation, upgrading, and innovative development of eco-tourism destinations.
