Automated Discovery of Network Cameras in Heterogeneous Web Pages

2022 ◽  
Vol 22 (1) ◽  
pp. 1-25
Author(s):  
Ryan Dailey ◽  
Aniesh Chawla ◽  
Andrew Liu ◽  
Sripath Mishra ◽  
Ling Zhang ◽  
...  

Reduction in the cost of Network Cameras, along with a rise in connectivity, enables entities all around the world to deploy vast arrays of camera networks. Network Cameras offer real-time visual data that can be used for studying traffic patterns, emergency response, security, and other applications. Although many sources of Network Camera data are available, collecting the data remains difficult due to variations in programming interfaces and website structures. Previous solutions rely on manually parsing the target website, taking many hours to complete. We create a general and automated solution for aggregating Network Camera data spread across thousands of uniquely structured web pages. We analyze heterogeneous web page structures and identify common characteristics among 73 sample Network Camera websites (each website has multiple web pages). These characteristics are then used to build an automated camera discovery module that crawls and aggregates Network Camera data. Our system successfully extracts 57,364 Network Cameras from 237,257 unique web pages.
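
As a rough illustration of the kind of heuristic such a discovery module can rely on (the paper's actual module is more elaborate), the sketch below fetches a candidate image URL twice and treats a changing payload as evidence of a live camera feed. The URL, delay, and function names are placeholders, not from the paper.

```python
# Hypothetical camera-discovery heuristic: a URL is likely a live network
# camera if the image it serves changes between two fetches.
import time
import urllib.request

def fetch_bytes(url: str) -> bytes:
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()

def looks_like_live_camera(image_url: str, delay_s: float = 5.0) -> bool:
    """Fetch the same image URL twice; a changing payload suggests a live feed."""
    first = fetch_bytes(image_url)
    time.sleep(delay_s)
    second = fetch_bytes(image_url)
    return first != second  # a static image returns identical bytes

if __name__ == "__main__":
    url = "http://example.com/camera/snapshot.jpg"  # placeholder URL
    print("live camera?", looks_like_live_camera(url))
```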

2002 ◽  
Vol 7 (1) ◽  
pp. 9-25 ◽  
Author(s):  
Moses Boudourides ◽  
Gerasimos Antypas

In this paper we present a simple simulation of the World-Wide Web, in which one observes the appearance of web pages belonging to different web sites, covering a number of different thematic topics and possessing links to other web pages. The goal of our simulation is to reproduce the form of the observed World-Wide Web and of its growth, using a small number of simple assumptions. In our simulation, existing web pages may generate new ones as follows: First, each web page is equipped with a topic concerning its contents. Second, links between web pages are established according to common topics. Next, new web pages may be randomly generated and subsequently equipped with a topic and assigned to web sites. By repeated iterations of these rules, our simulation appears to exhibit the observed structure of the World-Wide Web and, in particular, a power law type of growth. In order to visualise the network of web pages, we have followed N. Gilbert's (1997) methodology of scientometric simulation, assuming that web pages can be represented by points in the plane. Furthermore, the simulated graph is found to possess the small-world property, as is the case with a large number of other complex networks.
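
A minimal sketch of these generative rules, with illustrative parameter values that are assumptions rather than the paper's settings, might look like this:

```python
# Minimal sketch of the simulation rules described above: pages carry a
# topic, links form between pages sharing a topic, and new pages arrive
# at random. Parameter values are illustrative assumptions.
import random
from collections import defaultdict

N_TOPICS = 10
N_STEPS = 1000
LINK_PROB = 0.1  # chance of linking to each existing same-topic page

pages = []                    # pages[i] = topic of page i
links = defaultdict(set)
by_topic = defaultdict(list)

for step in range(N_STEPS):
    topic = random.randrange(N_TOPICS)
    new_id = len(pages)
    pages.append(topic)
    # link the new page to existing pages with the same topic
    for other in by_topic[topic]:
        if random.random() < LINK_PROB:
            links[new_id].add(other)
            links[other].add(new_id)
    by_topic[topic].append(new_id)

degrees = sorted((len(v) for v in links.values()), reverse=True)
print("top degrees:", degrees[:10])  # a heavy tail hints at power-law growth
```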


Author(s):  
Satinder Kaur ◽  
Sunil Gupta

Information plays a very important role in life, and nowadays the world largely depends on the World Wide Web to obtain it. The Web comprises websites of every discipline, and websites consist of web pages interlinked with each other by hyperlinks. The success of a website largely depends on the design aspects of its web pages. Researchers have done a lot of work to appraise web pages quantitatively. Keeping in mind the importance of the design aspects of a web page, this paper presents an automated evaluation tool which evaluates these aspects for any web page. The tool takes the HTML code of the web page as input, then extracts and checks the HTML tags for uniformity. The tool comprises normalized modules which quantify the measures of the design aspects. For demonstration, the tool has been applied to four web pages from distinct sites and their design aspects have been reported for comparison. The tool offers various advantages to web developers, who can predict the design quality of web pages and enhance it before and after implementation of a website, without user interaction.
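
A hedged sketch of the tag-extraction-and-uniformity step, interpreting "uniformity" as balanced open/close tags (the paper's actual metrics are not specified in the abstract):

```python
# Sketch: pull HTML tags from a page and check that every opened tag is
# closed. "Uniformity" is interpreted here as tag balance; the paper's
# actual design-aspect measures may differ.
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}  # never closed

class TagChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.errors = []  # mismatched closing tags

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

checker = TagChecker()
checker.feed("<html><body><p>Hello<br></p></body></html>")
print("unclosed:", checker.stack, "errors:", checker.errors)
```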


2009 ◽  
pp. 2616-2631
Author(s):  
Davide Mula ◽  
Mirko Luca Lobina

Nowadays the Web page is one of the most common media used by people, institutions, and companies to promote themselves, to share knowledge, and to reach everybody in every part of the world. In spite of that, a Web page is not entitled to specific legal protection, and because of this, every investment of time and money that stays off-stage is not protected against unlawful use. Seeing that no country in the world has specific legislation on this issue, in this chapter we develop a theory that aims to give legal protection to Web pages using laws and treaties already in force. In particular, we have developed a theory that considers Web pages as databases, and so extends a database's legal protection to Web pages. We start by analyzing each component of a database and finding it in a Web page, so that we can compare those juridical goods. After that, we analyze present legislation concerning databases, in particular the World Intellectual Property Organization Copyright Treaty and European Directive 96/9/EC, which we consider the best legislation in this field. In the end, we outline future trends that seem to appreciate and apply our theory.


Author(s):  
Mu-Chun Su ◽  
Shao-Jui Wang ◽  
Chen-Ko Huang ◽  
Pa-Chun Wang ◽  
...  

Most of the dramatically increased amount of information available on the World Wide Web is provided via HTML and formatted for human browsing rather than for software programs. This situation calls for a tool that automatically extracts information from semistructured Web information sources, increasing the usefulness of value-added Web services. We present a signal-representation-based parser (SIRAP) that breaks Web pages up into logically coherent groups - groups of information related to an entity, for example. Templates for records with different tag structures are generated incrementally by a Histogram-Based Correlation Coefficient (HBCC) algorithm; records on a Web page are then detected efficiently by matching against the generated templates. Hundreds of Web pages from 17 state-of-the-art search engines were used to demonstrate the feasibility of our approach.
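
The abstract does not spell out HBCC's details, but its histogram-correlation core can be illustrated as follows: tag names are mapped to frequency histograms and two record candidates are compared with a Pearson correlation coefficient. The record snippets and threshold-free comparison are illustrative assumptions.

```python
# Illustrative core of a histogram-correlation comparison: build tag
# histograms for two record candidates and correlate them. The real HBCC
# algorithm is more involved; this shows only the comparison step.
import re
from math import sqrt

def tag_histogram(html: str) -> dict:
    tags = re.findall(r"</?([a-zA-Z][a-zA-Z0-9]*)", html)
    hist = {}
    for t in tags:
        hist[t.lower()] = hist.get(t.lower(), 0) + 1
    return hist

def correlation(h1: dict, h2: dict) -> float:
    keys = sorted(set(h1) | set(h2))
    x = [h1.get(k, 0) for k in keys]
    y = [h2.get(k, 0) for k in keys]
    n = len(keys)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

rec1 = "<tr><td><a href='#'>Result 1</a></td><td>snippet</td></tr>"
rec2 = "<tr><td><a href='#'>Result 2</a></td><td>snippet</td></tr>"
print(round(correlation(tag_histogram(rec1), tag_histogram(rec2)), 3))
```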


Author(s):  
Bouchra Frikh ◽  
Brahim Ouhbi

The World Wide Web has emerged to become the biggest and most popular means of communication and information dissemination. Every day the Web expands, and people generally rely on search engines to explore it. Because of its rapid and chaotic growth, the resulting network of information lacks organization and structure. It is a challenge for service providers to deliver proper, relevant, and high-quality information to Internet users by using web page contents and the hyperlinks between web pages. This paper analyzes and compares web page ranking algorithms based on various parameters, to find their advantages and limitations and to indicate the further scope of research in web page ranking algorithms. Six important algorithms are presented and their performances discussed: PageRank, Query-Dependent PageRank, HITS, SALSA, Simultaneous Terms Query-Dependent PageRank (SQD-PageRank), and Onto-SQD-PageRank.
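
PageRank, the first of the six, can be sketched in a few lines of power iteration; the toy graph and the conventional damping factor of 0.85 are illustrative, not taken from the paper.

```python
# Minimal power-iteration PageRank over a dict-of-lists link graph.
def pagerank(graph, damping=0.85, iterations=50):
    """graph: dict mapping page -> list of pages it links to."""
    n = len(graph)
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(iterations):
        new_ranks = {page: (1.0 - damping) / n for page in graph}
        for page, out_links in graph.items():
            if out_links:
                share = damping * ranks[page] / len(out_links)
                for target in out_links:
                    new_ranks[target] += share
            else:  # dangling page: spread its rank over all pages
                for target in graph:
                    new_ranks[target] += damping * ranks[page] / n
        ranks = new_ranks
    return ranks

web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print({k: round(v, 3) for k, v in pagerank(web).items()})
```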


2021 ◽  
Vol 17 (19) ◽  
pp. 100
Author(s):  
Endurance Uzobo ◽  
Stanley E. Boroh

Many studies have focused solely on the negative impact of the coronavirus while ignoring the fact that it was also a blessing in disguise for certain institutions. This study explores some of the environment-related benefits accruing from the outbreak of the coronavirus, which eventually led many countries of the world to declare national lockdowns. The study utilised secondary sources from 21 articles gleaned from hand-searched literature across various web pages and online news, accessed through the Google web page (google.com) between March and September 2020. Key search terms were: COVID-19 and the environment, benefits of COVID-19 to the environment, the environmental impact of COVID-19, and environmental behaviour during COVID-19. The study reports some positive environmental benefits of COVID-19 in China, the United States of America, Europe, and Africa. Findings from the review indicate that almost all the continents of the world have experienced improved environmental quality as a result of the outbreak of the coronavirus. The study further adds that one of the most important dividends arising from the outbreak is the positive change in people's behaviour towards the environment. It is therefore recommended that nations of the world leverage the window of opportunity provided by the coronavirus to encourage green economic behaviour to save the environment.


2019 ◽  
Vol 12 (2) ◽  
pp. 110-119 ◽  
Author(s):  
Jayaraman Sethuraman ◽  
Jafar A. Alzubi ◽  
Ramachandran Manikandan ◽  
Mehdi Gheisari ◽  
Ambeshwar Kumar

Background: The World Wide Web houses an abundance of information that is used every day by billions of users across the world to find relevant data. Website owners employ webmasters to ensure their pages rank at the top of search engine result pages. However, understanding how a search engine ranks a website, which comprises numerous web pages, among the top ten or twenty websites is a major challenge. Although systems have been developed to understand the ranking process, a specialized tool-based approach has not been tried. Objective: This paper develops a new framework and system that processes website contents to determine search engine optimization factors. Methods: To analyze web pages dynamically by assessing site content against specific keywords, an elimination method was used in an attempt to reveal various search engine optimization techniques. Conclusion: Our results lead us to conclude that the developed system is able to perform a deeper analysis and find the factors which play a role in bringing a site to the top of the list.
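
As an illustration of what keyword-based on-page analysis can look like (the paper's actual factor set and elimination method are not detailed in the abstract), the following sketch pulls a few common SEO signals from raw HTML; the page and keyword are placeholders.

```python
# Hedged sketch of on-page factor extraction: title, meta description, and
# keyword density are common SEO signals, used here as assumed examples.
import re

def extract_seo_factors(html: str, keyword: str) -> dict:
    title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    meta = re.search(
        r'<meta\s+name=["\']description["\']\s+content=["\'](.*?)["\']',
        html, re.I)
    text = re.sub(r"<[^>]+>", " ", html)             # strip tags
    words = re.findall(r"[a-z0-9]+", text.lower())
    hits = words.count(keyword.lower())
    return {
        "title": title.group(1).strip() if title else None,
        "meta_description": meta.group(1) if meta else None,
        "keyword_density": hits / len(words) if words else 0.0,
        "keyword_in_title": bool(title and keyword.lower() in title.group(1).lower()),
    }

page = "<html><head><title>Cheap Flights</title></head><body>Book cheap flights today.</body></html>"
print(extract_seo_factors(page, "flights"))
```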


Author(s):  
Kai-Hsiang Yang

This chapter addresses Uniform Resource Locator (URL) correction techniques in proxy servers. Proxy servers are increasingly important in the World Wide Web (WWW): they provide Web page caches for browsing Web pages quickly, and they reduce unnecessary network traffic. Traditional proxy servers use the URL to identify cache entries, and a cache miss occurs when the requested URL is not present in the cache. However, general users tend to browse the Web with some regularity and within some scope. It would be very convenient if users did not need to enter a whole long URL, or could still see the Web content even after forgetting part of the URL, especially for personal favorite Web sites. We introduce a URL correction mechanism into the personal proxy server to achieve this goal.
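
One plausible realization of such a mechanism, not necessarily the chapter's, is to fuzzy-match a missed URL against the proxy's cached URLs; Python's stdlib difflib makes the idea concise. The cache contents and cutoff are illustrative.

```python
# Sketch: when a requested URL misses the cache, suggest the closest
# cached URL by string similarity (an assumed approach, for illustration).
import difflib

cached_urls = [
    "http://example.com/news/index.html",
    "http://example.com/sports/scores.html",
    "http://example.com/weather/today.html",
]

def correct_url(requested: str, cache: list, cutoff: float = 0.6) -> str | None:
    """Return the best cached match for a possibly mistyped URL."""
    matches = difflib.get_close_matches(requested, cache, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A mistyped path still resolves to the cached page it most resembles.
print(correct_url("http://example.com/nws/index.html", cached_urls))
```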


2020 ◽  
Vol 23 (3) ◽  
pp. 494-513
Author(s):  
Evgeny L’vovich Kitaev ◽  
Rimma Yuryevna Skornyakova

The semantic markups of the World Wide Web have accumulated a large amount of data, and their number continues to grow. However, the potential of these data is, in our opinion, not fully utilized. The contents of semantic markups are widely used by search systems, and partly by social networks, but the usual approach for application developers is to convert the data to the RDF standard and execute SPARQL queries, which requires good knowledge of that language and programming skills. In this paper, we propose to leverage the semantic markups available on the Web to automatically incorporate their contents into the content of other web pages. We also present a software tool implementing such incorporation that does not require the web page developer to know any programming languages other than HTML and CSS. The tool requires no installation; the work is performed by JavaScript plugins. Currently, the tool supports semantic data contained in the popular markup types "microdata" and JSON-LD, in the tags of HTML documents, and in the properties of Word and PDF documents.
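
The tool itself is a JavaScript plugin, but the JSON-LD markup it consumes is easy to illustrate; the following stdlib-Python sketch extracts JSON-LD blocks from an HTML document. The class name and page content are illustrative.

```python
# Illustrative reader for JSON-LD semantic markup of the kind the
# described tool consumes (the tool itself runs as JavaScript in the page).
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []  # parsed JSON-LD objects

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self.in_jsonld = True

    def handle_data(self, data):
        if self.in_jsonld and data.strip():
            self.blocks.append(json.loads(data))

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

page = '''<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}
</script></head><body></body></html>'''

extractor = JsonLdExtractor()
extractor.feed(page)
print(extractor.blocks[0]["headline"])  # -> Example
```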


2013 ◽  
Vol 347-350 ◽  
pp. 2666-2672
Author(s):  
Kai Lei ◽  
Guang Yu Sun ◽  
Lian En Huang

Delta compression techniques are commonly used in the context of version control systems and the World Wide Web. They compactly encode the differences between two files or strings in order to reduce communication or storage costs. In this paper, we study the use of delta compression for compressing massive numbers of web pages according to the similarity of their templates. We propose a framework for template-based delta compression which uses template-based clustering to find web pages with similar templates and then encodes their differences with delta compression techniques to reduce storage cost. We also propose a filter-based optimization of the Diff algorithm to improve the efficiency of the delta compression approach. To demonstrate the efficiency of our approach, we present experimental results on massive web pages. Our experiments show that template-based delta compression achieves significant improvements in compression ratio compared to compressing each web page individually.
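
The core delta-encoding step can be sketched with Python's stdlib difflib: keep one page of a template cluster as a reference and encode similar pages as diffs against it. The clustering and the paper's filter-based Diff optimization are beyond this sketch, and the pages are toy examples.

```python
# Sketch of the core delta-encoding step: pages sharing a template differ
# in few lines, so the diff against a reference page is small to store.
import difflib

reference = """<html><body>
<div class="story"><h1>Title A</h1><p>First article body.</p></div>
</body></html>""".splitlines()

similar_page = """<html><body>
<div class="story"><h1>Title B</h1><p>Second article body.</p></div>
</body></html>""".splitlines()

delta = list(difflib.unified_diff(reference, similar_page, lineterm=""))
print("\n".join(delta))
print(f"delta lines: {len(delta)}, page lines: {len(similar_page)}")
```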

