web crawler
Recently Published Documents

Total documents: 487 (five years: 175)
H-index: 15 (five years: 4)

2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
WenNing Wu ◽  
ZhengHong Deng

Wi-Fi-enabled information terminals have become far faster and more powerful as the technology has advanced rapidly. These advances have helped drive the growth of artificial intelligence (AI), which has been used in a wide range of societal contexts and has had a significant impact on education. Using big data to support multistage views of every subject of opinion helps to recognize the unique characteristics of each aspect and improves the suitability of social network governance. As online discussion in colleges and universities becomes an increasingly important vehicle for expressing public opinion, this paper explores public opinion analysis based on a web crawler and a convolutional neural network (CNN) model. The web crawler gathers the data posted by college and university students and organizes them along different dimensions. Because the CNN has robust data analysis capability, the proposed model uses it to analyse public opinion. The data are preprocessed with an oversampling method to improve classification performance. Combining information such as user influence, comment stance, topic, and comment time to suggest guidance for various schemes helps to make the social governance of networks more effective and better targeted. The experiments were carried out in Python; the proposed method predicted the positive and negative opinions of students in the data gathered by the web crawler with a lower error rate than existing methods.
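The abstract above does not say which oversampling method is used. As a minimal sketch, assuming plain random oversampling (duplicating minority-class samples until every class matches the largest one), the preprocessing step might look like this in Python. The function name `random_oversample` and the toy comments are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are the same size.

    This is an assumed technique; the paper only says "oversampling".
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        # All original samples that carry this label.
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: three positive comments and one negative comment.
comments = ["good course", "great teacher", "fine campus", "bad canteen"]
stances = ["pos", "pos", "pos", "neg"]
balanced_comments, balanced_stances = random_oversample(comments, stances)
print(Counter(balanced_stances))
```

Oversampling should normally be applied to the training split only, so that duplicated samples do not leak into the evaluation data.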


2022 ◽  
Vol 2022 ◽  
pp. 1-11
Author(s):  
Lin Li ◽  
Sang-Bing Tsai

This paper conducts an in-depth analysis of the precise employment of college graduates in the context of big data using a data-driven approach. The textual information of the study is obtained through in-depth interviews, and an evaluation index system for college students’ employment quality is constructed by combining step-by-step coding with grounded theory. The current state of research on employment recommendation platforms and the application of big data in such platforms are explored using a bibliometric approach. Web crawler technology is also used to survey the recommendation functions and status quo of similar recommendation platforms, which provides a reference for the research on this platform. Based on a preliminary analysis of platform requirements and overall design, the overall design and functional implementation of the big data employment recommendation platform are carried out using big data crawler technology, big data architecture technology, text mining technology, database technology, etc. The recommendation module combines recommendation based on user history information, recommendation based on real-time user online behavior data, and hybrid recommendation, and the platform is built from a stakeholder perspective. Based on the platform construction, an initial platform operation and maintenance management mechanism is established from the stakeholders’ perspective. The Pearson correlation coefficient is used to objectively evaluate the current situation of talent supply in universities and talent demand in enterprises from the perspective of image and data.
In the research on the development status of the big data education industry, the Lorenz curve and Gini coefficient are used to match the status of new big data majors with their college construction volume in each province and to provide data support for the reasonable adjustment of the setting of majors in each province according to the education level.
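To make the Lorenz curve and Gini coefficient mentioned above concrete, here is a minimal Python sketch. It builds the Lorenz curve from a list of non-negative values and takes the Gini coefficient as one minus twice the area under that curve (trapezoid rule). The per-province counts are invented for illustration and are not data from the paper.

```python
def lorenz_curve(values):
    """Return Lorenz curve points (cumulative population share, cumulative value share)."""
    xs = sorted(values)
    total = sum(xs)
    if not xs or total <= 0:
        raise ValueError("need at least one value and a positive total")
    n = len(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


def gini(values):
    """Gini coefficient = 1 - 2 * (area under the Lorenz curve)."""
    pts = lorenz_curve(values)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid between neighbouring points
    return 1 - 2 * area


# Hypothetical numbers of new big data majors in four provinces.
print(gini([5, 5, 5, 5]))   # perfectly even distribution, close to 0
print(gini([0, 0, 0, 20]))  # everything in one province: 0.75 for n = 4
```

A value near 0 means the majors are spread evenly across provinces, while a value near 1 means they are concentrated in a few.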


2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

The WWW contains a huge amount of information from different areas. This information may be present in the form of web pages, media, articles (research journals/magazines), blogs, etc. A major portion of the information resides in web databases and can be retrieved by submitting queries at the interface offered by the specific database; this portion is called the Hidden Web. An important issue is to efficiently retrieve and provide access to this enormous amount of information through crawling. In this paper, we present the architecture of a parallel crawler for the Hidden Web that avoids download overlaps by following a domain-specific approach. The experimental results further show that the proposed parallel Hidden Web crawler (PSHWC) extracts and downloads the contents of Hidden Web databases both effectively and efficiently.
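The abstract does not describe how PSHWC assigns work, so the following is only a sketch of one way a domain-specific split can avoid download overlaps. It assumes each query interface is tagged with a topic domain and each domain is owned by exactly one worker. The function name, the domain labels, and the `example.org` URLs are all hypothetical.

```python
from collections import defaultdict


def partition_by_domain(sources, num_workers):
    """Assign each (url, domain) pair to one worker so that no URL is fetched twice.

    Every topic domain is owned by exactly one worker (round-robin over the
    sorted domain names), and duplicate URLs are dropped before assignment.
    """
    domains = sorted({domain for _, domain in sources})
    owner = {domain: i % num_workers for i, domain in enumerate(domains)}

    queues = defaultdict(list)
    seen = set()
    for url, domain in sources:
        if url in seen:  # the same interface listed twice
            continue
        seen.add(url)
        queues[owner[domain]].append(url)
    return dict(queues)


sources = [
    ("https://example.org/books/search", "books"),
    ("https://example.org/flights/search", "flights"),
    ("https://example.org/jobs/search", "jobs"),
    ("https://example.org/books/search", "books"),  # duplicate entry
]
print(partition_by_domain(sources, num_workers=2))
```

Because ownership is decided before crawling starts, the workers do not need to coordinate at download time, although the load can be uneven when domains differ greatly in size.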


SAGE Open ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 215824402110672
Author(s):  
Xueli Li ◽  
Songtao Geng ◽  
Suyu Liu

Tourists’ perceived image is the core of destination marketing. Because tropical forest parks are an important niche tourist destination, analyzing tourists’ perceived image of them has great value. This study takes the Yalong Bay Tropical Paradise Forest Park as a case site and collects a total of 144,022 words from online travel reviews on Ctrip.com using Python web crawler technology. Firstly, through high-frequency word analysis, we identified 77 core elements and five image themes: attraction image, emotional image, service facility image, crowd image, and activity image. Secondly, building on NetDraw analysis, a network structure diagram of tourists’ perceived image of the Yalong Bay Tropical Paradise Forest Park is constructed. Finally, the overall network and the individual networks of tourists’ perceived image are analyzed. The results indicate that the overall network has low density and a core-periphery structure. The Guojianglong Cable Bridge, the battery car, and the glass path have both high degree centrality and high betweenness centrality. They also show significant structural-hole advantages and therefore sit at the top of the network. The academic and practical value of tourism image projection and development for tropical forest parks is discussed.
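The two analysis steps named above, high-frequency word counting and degree centrality in a word network, can be sketched in a few lines of Python. This is a simplified illustration rather than the study's pipeline. It assumes whitespace-separated tokens (real Ctrip reviews would need Chinese word segmentation first) and links two words whenever they appear in the same review. The reviews are invented.

```python
from collections import Counter
from itertools import combinations


def high_frequency_words(reviews, top_n):
    """Return the top_n most frequent tokens across all reviews."""
    counts = Counter(word for review in reviews for word in review.lower().split())
    return counts.most_common(top_n)


def degree_centrality(reviews, vocabulary):
    """Normalised degree of each vocabulary word in the co-occurrence network."""
    vocab = set(vocabulary)
    neighbours = {word: set() for word in vocab}
    for review in reviews:
        present = sorted(set(review.lower().split()) & vocab)
        for a, b in combinations(present, 2):  # words sharing a review are linked
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(vocab)
    return {w: (len(nb) / (n - 1) if n > 1 else 0.0) for w, nb in neighbours.items()}


reviews = ["bridge view great", "bridge queue long", "view great"]
print(high_frequency_words(reviews, 3))
print(degree_centrality(reviews, ["bridge", "view", "great", "queue"]))
```

In this toy network "bridge" is linked to every other vocabulary word, so its normalised degree centrality is 1.0.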


2021 ◽  
Vol 14 ◽  
pp. 163-167
Author(s):  
Li Li

With the explosive growth of network information and the advent of the era of big data, it is of great significance to analyze and process employment data by using web crawler technology. This article takes Lagou.com as an example, uses crawler technology built on Python and MySQL to collect data, analyzes the collected employment data from various aspects, and uses the analysis results to provide an objective reference for college students in their employment and career planning.


2021 ◽  
Vol 9 ◽  
Author(s):  
Bi Fan ◽  
Tingting Wu ◽  
Yufen Zhuang ◽  
Jiaxuan Peng ◽  
Kaishan Huang

With the challenges posed by the intermittent nature of renewable energy, energy storage technology is the key to effectively utilizing renewable energy. China’s energy storage industry has experienced rapid growth in recent years. In order to reveal how China develops the energy storage industry, this study explores the promotion of energy storage from the perspective of policy support and public acceptance. Accordingly, by comprehensively tracing the evolution of energy storage policies during 2010–2020, a better understanding of the policy intention and implementation can be obtained. Meanwhile, this paper collects information on Weibo users and posts related to energy storage using web crawler technology. The status of public attention and sentiment orientation toward energy storage are investigated with a text mining method. The main results are as follows. 1) The evolution of energy storage is characterized by three stages: the foundation stage, the nurturing stage, and the commercialization stage. 2) Most people have a positive attitude towards energy storage and recognize the potential of the energy storage industry; however, public attitudes towards energy storage exhibit cognitive bias. 3) More policies concerning market mechanisms, R&D, and subsidies should be introduced to enhance the effect of energy storage policies and increase public recognition. These findings help to understand the energy storage policy and provide better strategies for policymaking.


Author(s):  
Dana R. Stojiljković ◽  
Marko Mihić ◽  
Dragan Bjelica

Research question: The aim of this paper is to examine the financial success of projects in the game development industry in comparison with projects in other industries hosted on the crowdfunding platform Kickstarter.com. Motivation: We live in a world of technology where companies arise and disappear on a daily basis. The traditional way of financing has been expanded with alternative online platforms. Our goal was to conduct an empirical analysis of one crowdfunding platform (Kickstarter.com) in order to understand whether technology projects are doing better than other projects. If they do, what are the key factors for their success? Idea: Our goal was to better understand how the crowdfunding model supports small indie development projects and projects in other industries. Data: We used the data of 148,510 companies that applied for crowdfunding financing between 2015 and 2020, published on the web crawler platform Web Robots. Tools: All statistical analyses were performed using the statistical software IBM® SPSS® Statistics v.21. The data were presented using standard methods of non-parametric descriptive statistics (absolute and relative frequencies; medians and interquartile ranges for numeric outcomes). To test the statistical significance of the difference between two groups, we used Pearson’s chi-squared test and the Mann-Whitney test, where appropriate. The effect size for the 2x2 analyses was estimated using odds ratios. Findings: The paper analyses the financial success of gaming and non-gaming projects and tries to identify key factors for successful funding. We found a statistically significant higher prevalence of successful financing in game development projects, with 2.3 times higher odds of successful funding compared to the non-gaming industry.
Our analysis of quantitative indicators such as the number of backers, goal amount, pledged amount, pledged amount per backer, and pledged-to-goal ratio also showed that projects in game development statistically outperformed projects in other categories. Promising game projects were supported by three times more backers on average and attained almost double the funds of other projects, while still sporting more modest pledged amounts per backer. These findings support the notion of crowdfunding being a viable modality of financing independent game development in emerging economies. Contribution: This paper expands the existing research related to crowdfunding platforms and indie development companies and formulates key factors for successful financing for technology startup firms.
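For readers unfamiliar with the effect size mentioned above, the odds ratio of a 2x2 table is the odds of success in one group divided by the odds of success in the other. The Python sketch below shows the arithmetic. The counts are made up for illustration and are not the Kickstarter data analysed in the paper.

```python
def odds_ratio(success_a, failure_a, success_b, failure_b):
    """Odds ratio of group A versus group B from a 2x2 table of counts."""
    if failure_a == 0 or success_b == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    odds_a = success_a / failure_a  # odds of success in group A
    odds_b = success_b / failure_b if failure_b else float("inf")
    return odds_a / odds_b


# Hypothetical counts: 60 of 100 game projects funded, 40 of 100 other projects funded.
print(odds_ratio(60, 40, 40, 60))  # 2.25
```

An odds ratio above 1 means group A has higher odds of successful funding than group B. It is a ratio of odds, not of probabilities, so 2.25 here does not mean game projects are 2.25 times as likely to be funded.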


Author(s):  
Suraj Rakesh Gupta

Abstract: Phishing is a crime that involves the theft of personal information from users. Individuals, corporations, cloud storage, and government websites are all targets for phishing websites. Anti-phishing technologies based on hardware are commonly utilised, while software-based options are preferred due to cost and operational considerations. Current phishing detection systems have no solution for problems like zero-day phishing attacks. To address these issues, a three-phase attack detection system called the Phishing Attack Detector based on Web Crawler is proposed, which uses a recurrent neural network to precisely detect phishing incidents. To classify pages as phishing or non-phishing, it uses web traffic, web content, and the Uniform Resource Locator (URL) as input features. Keywords: Attack detection, Recurrent Neural Network, Deep Learning.


Author(s):  
Palika Jajoo

Web crawling is the method by which topics and information are browsed on the World Wide Web and then stored in a large storage system, from which users can access them as needed. This paper explains the use of web crawling in the digital world and the difference it makes for search engines. Several types of web crawling are available, and they are explained in brief in this paper. Web crawlers have many advantages over traditional methods of searching for information online. Many tools are available that support web crawling and make the process easier.
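The browse-then-store loop described above can be illustrated with a minimal breadth-first crawler in Python. To keep the sketch self-contained and runnable offline, the `fetch_links` callback is a stand-in for downloading a page and extracting its links, and the tiny link graph is invented. A real crawler would also need HTTP fetching, HTML parsing, politeness delays, and robots.txt handling.

```python
from collections import deque


def crawl(seed, fetch_links, max_pages=100):
    """Breadth-first crawl from `seed`, returning pages in the order visited.

    `fetch_links(url)` must return the URLs linked from `url`.
    """
    visited = []
    seen = {seed}             # URLs already queued, so nothing is fetched twice
    frontier = deque([seed])  # FIFO queue gives breadth-first order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)   # a real crawler would store the page content here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Invented link graph standing in for the web.
graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
print(crawl("a", lambda url: graph.get(url, [])))  # ['a', 'b', 'c', 'd']
```

The `seen` set is what stops the crawler from looping on the cycle between pages "a" and "c".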

