web pages
Recently Published Documents





2022 ◽  
Vol 22 (1) ◽  
pp. 1-25
Ryan Dailey ◽  
Aniesh Chawla ◽  
Andrew Liu ◽  
Sripath Mishra ◽  
Ling Zhang ◽  

Falling costs for Network Cameras, along with a rise in connectivity, enable entities all around the world to deploy vast arrays of camera networks. Network cameras offer real-time visual data that can be used for studying traffic patterns, emergency response, security, and other applications. Although many sources of Network Camera data are available, collecting the data remains difficult because programming interfaces and website structures vary. Previous solutions rely on manually parsing the target website, which takes many hours. We create a general and automated solution for aggregating Network Camera data spread across thousands of uniquely structured web pages. We analyze heterogeneous web page structures and identify common characteristics among 73 sample Network Camera websites (each website has multiple web pages). These characteristics are then used to build an automated camera discovery module that crawls and aggregates Network Camera data. Our system successfully extracts 57,364 Network Cameras from 237,257 unique web pages.

Meftah Mohammed Charaf Eddine

In machine translation of texts, lexical (dictionary) and structural ambiguity remains one of the difficult problems. Researchers in this field use different approaches, the most important of which is machine learning in its various forms. The goal of the approach proposed in this article is to define a new concept of electronic text that is free from any lexical or structural ambiguity. We use a semantic coding system that attaches to the original electronic text (via the text editor interface) the meanings intended by the author. The author specifies the intended meaning of each word that could be a source of ambiguity. The proposed approach can be used with any type of electronic text (text processing applications, web pages, email text, etc.). In the experiments we conducted with this approach, we obtained a very high accuracy rate, and we argue that the problem of lexical and structural ambiguity can be completely solved. With this new concept of electronic text, the text file contains not only the text but also, in the form of symbols, the exact meaning intended by the writer. These semantic symbols are used during machine translation to obtain a translated text free of lexical and structural ambiguity.

حنان الصادق بيزان

Social networking is among the most widely used recent technologies because of its advantages, reach and interactivity. It is one of the most prominent applications of the second-generation Web (Web 2.0), which has effectively imposed itself on internet users. Facebook ranks second globally, after the search engine Google. Such networks are noted to be highly efficient at providing information services and at representing information institutions and facilities in the virtual world. It is agreed that the progress of societies is measured by their ability to access information freely and quickly and to use it to generate knowledge that leads to wisdom, progress and excellence. This shows the importance of information studies in general and of Webometrics in particular, that is, the set of statistical methods and measurements used to study the quantitative and qualitative aspects of information resources, structures, uses and techniques on the web. Bibliometric studies, designed to study and analyze reference citations, can be applied to information resources available on the web, such as the links between web pages and the use of those sites. The study therefore aims to monitor students' attitudes towards the use of social networking sites in general, and the Facebook page of the department of information studies of the Libyan Academy in particular. It also seeks to identify how satisfied students of the information management and archive management divisions are with the information services provided by the page, how well they know its links to electronic information sources, how far it meets their needs and scientific interests, and how closely it relates to their academic and research interests.

AI & Society ◽  
2022 ◽  
Lise Jaillant ◽  
Annalina Caputo

Abstract: Co-authored by a Computer Scientist and a Digital Humanist, this article examines the challenges faced by cultural heritage institutions in the digital age, which have led to the closure of the vast majority of born-digital archival collections. It focuses particularly on cultural organizations such as libraries, museums and archives, used by historians, literary scholars and other Humanities scholars. Most born-digital records held by cultural organizations are inaccessible due to privacy, copyright, commercial and technical issues. Even when born-digital data are publicly available (as in the case of web archives), users often need to physically travel to repositories such as the British Library or the Bibliothèque Nationale de France to consult web pages. Provided with enough sample data from which to learn and train their models, AI, and more specifically machine learning algorithms, offer the opportunity to improve and ease access to digital archives by learning to perform complex human tasks. These range from providing intelligent support for searching the archives to automating tedious and time-consuming tasks. In this article, we focus on sensitivity review as a practical solution to unlock digital archives that would allow archival institutions to make non-sensitive information available. This promise to make archives more accessible does not come free of warnings about potential pitfalls and risks: inherent errors, "black box" approaches that make the algorithm inscrutable, and risks related to biased, fake, or partial information. Our central argument is that AI can deliver on its promise to make digital archival collections more accessible, but it also creates new challenges, particularly in terms of ethics. In the conclusion, we insist on the importance of fairness, accountability and transparency in the process of making digital archives more accessible.

First Monday ◽  
2022 ◽  
Antoinette Fage-Butler ◽  
Loni Ledderer ◽  
Niels Brügger

This article uses Internet archives to explore the emergence and spread of the term ‘mHealth’ (mobile health technologies) in the Danish Web domain from 2006 to 2018, focusing on the actors that contributed to its evolution. We propose three methods for investigating the Web pages and Web sites that employed the term ‘mHealth’. Our findings highlight temporal developments in the use of ‘mHealth’, with diverse actors using it, though none clearly dominated. The article attends to challenges in working with Web archive data, and presents methods that can be used by others wishing to engage empirically with Internet archives, which remain vast, but largely under-exploited resources.

2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Geeta Marmat

Purpose: This paper aims to empirically explore the influence of website aesthetic attributes (classical and expressive) on customer brand engagement (CBE) intention.
Design/methodology/approach: This research develops a framework and a few research hypotheses based on the available literature on aesthetics, the aesthetic attributes of websites and CBE, as well as other reliable resources and relevant theories where required. It tests them on data collected from 400 respondents of Generation Y (Gen Y) in India by means of structural equation modelling using SPSS AMOS 21.
Findings: The findings indicate that the expressive aesthetics of the brand Web pages of beauty products is positively associated with drawing attention. Expressive aesthetics and classical aesthetics together explained 16% of the variance in attention. This indicates that aesthetic attributes indeed play a role in drawing the attention of the customer. However, mere attention is not sufficient to form the customer's behavioural intention to engage with that particular brand unless the customer becomes fully absorbed in the aesthetic attributes of the brand Web pages.
Research limitations/implications: The outcome of this research is based on the views of only 400 Gen Y individuals from the city of Indore in India. This limits its generalizability across India and to other country contexts. This study makes an important contribution to the brand website aesthetics and CBE literature by empirically investigating the concept of brand website aesthetics as important in an interactive marketing approach to initiating CBE intention formation. It further argues that cognitive engagement is the first and foremost engagement dimension and underscores aesthetic attributes as important in forming the customer's first perception, on which subsequent CBE behavioural intention develops.
Originality/value: This research adds novel insight into the relationship between brand website aesthetic attributes and CBE by studying the impact of the aesthetic attributes of brand Web pages on two cognitive elements, namely attention and absorption, and their further effect on brand behavioural intention, taking Gen Y in India as the sample.

2022 ◽  
pp. 913-934
Ryoichi Ishitobi ◽  
Fumio Nemoto ◽  
Youko Sugita ◽  
Susumu Nakamura ◽  
Toru Iijima ◽  

Most of the present authors, teachers at the School for the Mentally Challenged at Otsuka, University of Tsukuba, have been creating original teaching aids and materials using low-tech and high-tech methods. Original teaching aids created with woodworking and metalworking are usually used for students with an intellectual disability. The original teaching materials with Grid Onput dot code, which can link to multimedia such as audio, movies, web pages, HTML files, and PowerPoint files, were created in collaboration with one of the present authors, Professor Shigeru Ikuta, who organized a large research project, and with Gridmark Inc., which developed Grid Onput dot code. The present authors have recently developed a new software program, SmileNote, to help students create presentation slides that express their feelings, will, and desires to classmates, teachers, and parents. Basic information on these materials and their use in schools is presented in this chapter.

2022 ◽  
Vol 12 (1) ◽  
pp. 1-18
Umamageswari Kumaresan ◽  
Kalpana Ramanujam

The intent of this research is to develop an automated web scraping system capable of extracting structured data records embedded in semi-structured web pages. Most automated extraction techniques in the literature capture repeated patterns among a set of similarly structured web pages, deduce the template used to generate those pages, and then extract the data records. All of these techniques rely on computationally intensive operations such as string pattern matching or DOM tree matching and then require manual labeling of the extracted data records. The technique discussed in this paper departs from the state-of-the-art approaches by determining informative sections in the web page through repetition of informative content rather than syntactic structure. In the experiments, the system identified the data-rich region with 100% precision for web sites belonging to different domains. The experiments conducted on real-world web sites demonstrate the effectiveness and versatility of the proposed approach.

2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

Understanding the user's actual need from a question is crucial in non-factoid why-question answering, as why-questions are complex and involve ambiguity and redundancy. The precise requirement is to determine the focus of the question and reformulate it accordingly to retrieve the expected answers. The paper analyzes different types of why-questions and proposes an algorithm for each class to determine the focus and reformulate the question into a query by appending focal terms and the cue phrase ‘because’. Further, a user interface is implemented that takes a why-question as input, applies the different question-processing components, reformulates the question and finally retrieves web pages by posing the query to the Google search engine. To measure the accuracy of the process, users are asked to score from 1 to 10 how relevant the retrieved web pages are according to their understanding. The results show that a maximum precision of 89% is achieved for informational why-questions and a minimum of 48% for opinionated why-questions.
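As a rough illustration of the reformulation step described in this abstract, the sketch below strips a why-question down to candidate focal terms and appends the cue phrase "because". It is not the paper's per-class algorithm: the function name `reformulate_why_question` and the stop-word list are assumptions made for this example only.

```python
import re

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"why", "do", "does", "did", "is", "are", "was", "were", "the", "a", "an", "so"}

def reformulate_why_question(question: str) -> str:
    """Build a keyword query from a why-question, ending in the cue phrase 'because'."""
    # Lowercase and split into simple word tokens, dropping punctuation.
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    # Treat every non-stop-word as a focal term (a deliberate simplification).
    focal_terms = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(focal_terms + ["because"])
```

For example, `reformulate_why_question("Why is the sky blue?")` yields the query string `sky blue because`, which could then be passed to a search engine.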

2022 ◽  
pp. 394-414
Mohamed ElSayed ElAraby ◽  
Ahmed M. Anter

Web content is diverse and is regarded as the primary source of information that can be accessed through reference links. Web facial images are one type of web content that relates to important web pages and is considered important information for individuals. This chapter proposes a face-recognition-as-a-service architecture based on real-world images from the web. The proposed service is offered to third parties via cloud computing; its architecture is built in the cloud using virtual machines that can be expanded based on resource demands. Web crawlers crawl web pages and retrieve images into elastic cloud storage. Human faces are then extracted from the collected images, and the face images are prepared for identification and matched against the stored set through successive phases. This chapter uses PCA for feature extraction and KNN for identification. Experiments show that increasing the number of crawler instances improves crawling speed, and that Euclidean distance gives better face recognition accuracy than other metrics.
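The identification step named in this abstract (KNN with Euclidean distance) can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the PCA step is omitted, the feature vectors are assumed to be already extracted, and the function name `knn_identify`, the labels, and the toy data are all hypothetical.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)

def knn_identify(gallery, query, k=3):
    """Return the majority label among the k gallery vectors closest to query.

    gallery: list of (feature_vector, label) pairs; the vectors are assumed
    to be already-extracted features (for example, PCA projections).
    """
    # Sort gallery entries by Euclidean distance to the query vector.
    nearest = sorted(gallery, key=lambda item: dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy, made-up 2-D feature vectors for two hypothetical identities.
gallery = [
    ((0.0, 0.1), "alice"),
    ((0.2, 0.0), "alice"),
    ((0.1, 0.2), "alice"),
    ((5.0, 5.1), "bob"),
    ((5.2, 4.9), "bob"),
    ((4.9, 5.0), "bob"),
]
```

Calling `knn_identify(gallery, (0.1, 0.1))` votes among the three closest gallery vectors. Swapping `dist` for another distance function is the one-line change needed to compare metrics.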
