Readability Enhancement for PDF Documents

Readability has been studied for decades, from traditional paper reading to digital document reading and Web page reading. Different audiences have different needs, and those needs drive researchers to investigate innovative solutions. For example, in recent years, researchers have studied readability enhancement of English articles for non-native English readers, both on paper and in hypertext documents. Using a variety of methods, such as changing content presentation with the visual-syntactic text formatting (VSTF) format or the Jenga format, researchers were able to improve reading comprehension and user satisfaction in hypertext document reading. In terms of dynamically changing content presentation for reading, one less explored format is Portable Document Format (PDF), which has traditionally been viewed within a modern Web browser or Adobe Acrobat Reader on the desktop. PDF was standardized as an open format in 2008 and has been widely used to preserve fixed-layout content. However, a fixed-layout document makes it challenging to apply existing transformation methods, not to mention doing so on mobile devices. In this paper, we not only present a system that uses a novel algorithm to decode PDF documents and apply content transformation to enhance their readability, but also generalize it into a framework that allows users to apply customizations and developers to tailor it to their needs. Although we used the Jenga format as an example to enhance the readability of PDF documents, we envision that the proposed framework can accommodate different customizations and transformation methods. The current result is promising, and we believe it is worth further investigation to make PDF documents readable and accessible for different populations, such as non-native English readers and people with dyslexia or special needs.

Semi-Automatic Online Tagging with K-Medoid Clustering

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194014400075 ◽

2014 ◽

Vol 24 (08) ◽

pp. 1115-1130 ◽

Cited By ~ 1

Author(s):

He Hu ◽

Xiaoyong Du

Keyword(s):

Clustering Algorithm ◽

Prototype System ◽

Web Pages ◽

Web Page ◽

Web Browser ◽

User Input ◽

Browser Extension ◽

Annotation Process ◽

Efficiency And Effectiveness ◽

Automatic Mechanism

Online tagging is crucial for the acquisition and organization of web knowledge. We present TYG (Tag-as-You-Go) in this paper, a web browser extension for online tagging of personal knowledge on standard web pages. We investigate an approach to combine a K-Medoid-style clustering algorithm with the user input to achieve semi-automatic web page annotation. The annotation process supports user-defined tagging schema and comprises an automatic mechanism that is built upon clustering techniques, which can automatically group similar HTML DOM nodes into clusters corresponding to the user specification. TYG is a prototype system illustrating the proposed approach. Experiments with TYG show that our approach can achieve both efficiency and effectiveness in real world annotation scenarios.
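
As a rough illustration of the clustering idea in this abstract, the sketch below runs a plain K-Medoid loop seeded by user-chosen items. It is not the TYG implementation: the feature tuples standing in for HTML DOM nodes, the Manhattan distance, and the names `assign`, `k_medoids`, and `seeds` are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def manhattan(a: Point, b: Point) -> float:
    """Distance between two feature tuples (an assumed stand-in for DOM node similarity)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points: Sequence[Point], medoids: List[int],
           dist: Callable[[Point, Point], float]) -> List[List[int]]:
    """Put every point index into the cluster of its nearest medoid."""
    clusters: List[List[int]] = [[] for _ in medoids]
    for i, p in enumerate(points):
        nearest = min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points: Sequence[Point], seeds: Sequence[int],
              dist: Callable[[Point, Point], float] = manhattan,
              max_iter: int = 100) -> Tuple[List[int], List[List[int]]]:
    """Alternate assignment and medoid update, starting from user-provided seed indices."""
    medoids = list(seeds)
    for _ in range(max_iter):
        clusters = assign(points, medoids, dist)
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total distance to its cluster.
            best = min(members, key=lambda m: sum(dist(points[m], points[o]) for o in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)


# Hypothetical feature tuples for six nodes; the user "tags" items 1 and 4 as seeds.
nodes = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_medoids, final_clusters = k_medoids(nodes, seeds=[1, 4])
```

Seeding the medoids from user input is one simple way to combine manual tags with automatic grouping; the paper may combine them differently.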

Content-Determined Web Page Segmentation and Navigation for Mobile Web Searching

Result Page Generation for Web Searching - Advances in Web Technologies and Engineering ◽

10.4018/978-1-7998-0961-6.ch007 ◽

2021 ◽

pp. 88-108

Keyword(s):

Web Pages ◽

Web Searching ◽

Web Page ◽

Page Segmentation ◽

Web Browser ◽

Mobile Web ◽

Desktop Computers ◽

Equal Importance ◽

Web Contents ◽

Music Player

Nowadays mobile phones are a widespread part of our lifestyle; we use cell phones as a camera, a radio, a music player, and even a web browser. Since most web pages are created for desktop computers, navigating through them on a phone is highly fatiguing. Hence, there is great interest in computer science in adapting such content-rich pages to the small screens of mobile devices. Moreover, a web page has many different parts that are not of equal importance to the end user. Consequently, the authors propose a mechanism that identifies the part of a web page most useful to a user with regard to his or her search query while avoiding information loss. The challenge comes from the fact that long web contents cannot be easily displayed in both vertical and horizontal ways.

Recognize Flexible Malevolent Web pages in Real Time

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1158.09811s19 ◽

2019 ◽

Vol 8 (11S) ◽

pp. 877-881

Keyword(s):

Web Sites ◽

Mobile Internet ◽

Web Pages ◽

Web Page ◽

Web Browser ◽

Assessment Technique ◽

Browser Extension ◽

Information Strategies ◽

Mobile Information ◽

Cellular Mobile

Mobile-specific web sites differ drastically from their desktop equivalents in content, layout, and functionality. Consequently, existing techniques for detecting malicious web sites are unlikely to work for such webpages. In this paper, we design and implement a mechanism that distinguishes between malicious and benign mobile web sites. The mechanism makes this determination based on static features of a web page, ranging from the number of iframes to the presence of known fraudulent phone numbers. First, we experimentally demonstrate the need for mobile-specific techniques and then identify a range of new static features that highly correlate with mobile malicious pages. We then apply the mechanism to a dataset of over 350,000 known benign and malicious mobile webpages and demonstrate 90% accuracy in classification. In addition, we discover, characterize, and report a number of websites missed by Google Safe Browsing and VirusTotal but detected by the mechanism. Lastly, we build a web browser extension using the mechanism to protect users from malicious mobile web sites in real time. In doing so, we provide the first static analysis technique to detect malicious mobile webpages.
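
To make the kind of page feature mentioned in this abstract concrete (the number of iframes, phone numbers already known to be bad), here is a minimal sketch using only Python's standard `html.parser`. It is an illustration, not the paper's feature extractor: the class `StaticFeatureCounter`, the function `extract_features`, the choice to read phone numbers from `tel:` links, and the sample number are all assumptions.

```python
from html.parser import HTMLParser
from typing import Dict, List, Set


class StaticFeatureCounter(HTMLParser):
    """Collects two simple page features without executing any script."""

    def __init__(self) -> None:
        super().__init__()
        self.iframes = 0
        self.phone_numbers: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "iframe":
            self.iframes += 1
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().startswith("tel:"):
                self.phone_numbers.append(href[4:])


def extract_features(page_html: str, known_bad_numbers: Set[str]) -> Dict[str, int]:
    """Return a small feature dictionary that a classifier could consume."""
    parser = StaticFeatureCounter()
    parser.feed(page_html)
    parser.close()
    bad = sum(1 for n in parser.phone_numbers if n in known_bad_numbers)
    return {"iframe_count": parser.iframes, "known_bad_number_count": bad}


# Hypothetical page and blocklist, for illustration only.
sample = (
    '<html><body><iframe src="a"></iframe><IFRAME src="b"></IFRAME>'
    '<a href="tel:+15550100">call</a><a href="/x">x</a></body></html>'
)
features = extract_features(sample, {"+15550100"})
```

A real system would feed many such features into a trained classifier; this sketch stops at feature extraction.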

Understanding Traversal of a Packet in Internet

10.34048/2020.4.f3 ◽

2020 ◽

Author(s):

Ram Rustagi P

Keyword(s):

Experiential Learning ◽

Network Protocols ◽

Network Activity ◽

Local Network ◽

Application Layer ◽

Web Page ◽

Web Browser ◽

Holistic View ◽

Packet Delivery ◽

Networking Technologies

In this series of articles on Experiential Learning of Networking Technologies, we have discussed a number of network protocols, starting from HTTP [7] at the application layer, the TCP [3] and UDP [1] protocols at the transport layer that provide end-to-end communication, and IP addressing [2] and routing for packet delivery at the network layer. We have defined a number of experiential exercises for each underlying concept which provide a practical understanding of these protocols. Now, we would like to take a holistic view of the protocols we have learned so far and look at how they all come into play when an internet user makes a simple web request, e.g., what happens from the network perspective when a user enters google.com in the URL bar of a web browser [12]. From the user's perspective, the web page of Google’s search interface is displayed in the browser window, but inside the network, both in the user’s local network and on the internet, a lot of network activity takes place. The focus of this article is to understand the traversal of packets in the network triggered by any such user activity.

A PDF Tile Model for Geographic Map Data

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8090373 ◽

2019 ◽

Vol 8 (9) ◽

pp. 373

Author(s):

Xiaodong Zhou ◽

Tinghua Ai ◽

Nina Meng ◽

Peng Xie

Keyword(s):

Portable Document Format ◽

Web Map ◽

Establishment Method ◽

Software And Hardware ◽

Tile Mapping ◽

Map Data ◽

Pdf Format ◽

Map Service ◽

Quality Map ◽

Tile Model

Vector tile mapping is an important issue in web map research. At present, vector tile mapping requires the symbolization of geographic information, as supported by cartographic software, and the development of a corresponding symbolic database when web map services are provided for users. The development of PDF (portable document format) mapping makes it possible to use symbolized map data directly for web map services. This paper presents a PDF tile map model suitable for web map services, in order to provide a new solution for vector tile mapping. This paper details the construction and establishment method of the PDF tile map model and verifies that the model has the characteristics of a vector tile map and that this approach can provide web map services. In addition, this model has some other noticeable characteristics. First, the map service supported by this model does not require the support of cartographic software, and thus has low system software and hardware requirements. Second, the maps displayed in the service system can be directly used for high-quality map publishing. Third, the high popularity of the PDF format can promote the sharing of PDF tile maps and lower the barrier to use for ordinary users.

Extensions of Web Browsers useful to Knowledge Workers

Next Generation Search Engines ◽

10.4018/978-1-4666-0330-1.ch011 ◽

2012 ◽

pp. 239-273

Author(s):

Sarah Vert

Keyword(s):

Knowledge Workers ◽

Knowledge Worker ◽

Working Environment ◽

The Internet ◽

Web Browsers ◽

Web Page ◽

Web Browser ◽

Internet Explorer ◽

The One ◽

The Web

This chapter focuses on the Internet working environment of Knowledge Workers through the customization of the Web browser on their computer. Given that a Web browser is designed to be used by anyone browsing the Internet, its initial configuration must meet generic needs such as reading a Web page, searching for information, and bookmarking. In the absence of a universal solution that meets the specific needs of each user, browser developers offer additional programs known as extensions, or add-ons. Among the various browsers that can be modified with add-ons, Mozilla’s Firefox is perhaps the one that first springs to mind; indeed, Mozilla has built the Firefox brand around these extensions. Using this example, and also considering the browsers Google Chrome, Internet Explorer, Opera and Safari, the author will attempt to demonstrate the potential of Web browsers in terms of the resources they can offer when they are customizable and available within the working environment of a Knowledge Worker.

Nessi: An EEG-Controlled Web Browser for Severely Paralyzed Patients

Computational Intelligence and Neuroscience ◽

10.1155/2007/71863 ◽

2007 ◽

Vol 2007 ◽

pp. 1-5 ◽

Cited By ~ 51

Author(s):

Michael Bensch ◽

Ahmed A. Karim ◽

Jürgen Mellinger ◽

Thilo Hinterberger ◽

Michael Tangermann ◽

...

Keyword(s):

Self Regulation ◽

Web Pages ◽

Web Page ◽

Slow Cortical Potentials ◽

Web Browser ◽

Brain Responses ◽

Wide Range ◽

Alphabetical List ◽

E Mail ◽

Opening Up

We have previously demonstrated that an EEG-controlled web browser based on self-regulation of slow cortical potentials (SCPs) enables severely paralyzed patients to browse the internet independently of any voluntary muscle control. However, this system had several shortcomings, among them that patients could only browse within a limited number of web pages and had to select links from an alphabetical list, causing problems if the link names were identical or if they were unknown to the user (as in graphical links). Here we describe a new EEG-controlled web browser, called Nessi, which overcomes these shortcomings. In Nessi, the open source browser, Mozilla, was extended by graphical in-place markers, whereby different brain responses correspond to different frame colors placed around selectable items, enabling the user to select any link on a web page. Besides links, other interactive elements are accessible to the user, such as e-mail and virtual keyboards, opening up a wide range of hypertext-based applications.

Multimedia and You

1001 Computer Words You Need to Know ◽

10.1093/oso/9780195167757.003.0017 ◽

2004 ◽

Author(s):

Jerry Pournelle

Keyword(s):

Local Network ◽

Multimedia Content ◽

Web Page ◽

Web Browser ◽

Hard Drive ◽

Media Player ◽

Media Formats

While we haven’t yet reached the age of true interactive Internet, there are many video, audio and multimedia formats you will encounter that can enrich your computing experience. The big three are Windows Media Player, Quicktime, and RealPlayer. All three are compatible with Macintosh and Windows, although you may have to download them. Each handles streaming audio and video, and downloadable formats such as MP3 and MPEG. All three programs will automatically download software, if available, when they need it to play a file, and each has plugins which are automatically installed for use by your web browser, so you can play multimedia content directly from a web page. Be aware that the three programs will battle for control over which program plays which files. All three have preference settings which will allow you to make that program the default player for your chosen formats. However, each also handles a couple of proprietary formats which the others do not, so it’s good to have all three. Besides playing audio and video, the latest versions of Windows Media Player (http://windowsmedia.com/) can help you make audio CDs or import music from CDs to your hard drive. Be aware that if you rip music from your CD collection to certain Windows Media formats, those files might not be playable on other computers. WMP plays files ending in the suffixes .wmv and .wma, among others. Quicktime (http://www.apple.com/quicktime/) and iTunes (http://www.apple.com/itunes/) together handle audio and video playing, as well as advanced functions of CD-burning and music-importing. iTunes also offers the ability to share song lists over your local network, even between Macs and PCs, to convert music to MP3s, and to interface with an iPod. Quicktime plays files ending in the suffix .mov, among others. RealOne (http://www.real.com/) is the latest version of RealPlayer, which plays its own proprietary streaming formats, as well as many of the standard formats. It comes in free and pay versions, although you may have to dig for the free version. RealOne plays files ending with the suffixes .rm and .ram, among others.

Ontology for Cross-Site-Scripting (XSS) Attack in Cybersecurity

Journal of Cybersecurity and Privacy ◽

10.3390/jcp1020018 ◽

2021 ◽

Vol 1 (2) ◽

pp. 319-339

Author(s):

Jean Rosemond Dora ◽

Karol Nemoga

Keyword(s):

Web Application ◽

Credit Cards ◽

Web Page ◽

Sensitive Data ◽

Web Browser ◽

User Input ◽

Ontology Model ◽

Core Meaning ◽

Frequent Problem ◽

Cross Site

In this work, we tackle a problem that frequently occurs in the cybersecurity field: the exploitation of websites by XSS attacks, which are nowadays considered a complicated attack. These types of attacks aim to execute malicious scripts in the client's web browser by including code in a legitimate web page. A serious matter is when a website accepts the “user-input” option. Attackers can exploit the web application (if vulnerable), and then steal sensitive data (session cookies, passwords, credit cards, etc.) from the server and/or from the client. However, the difficulty of the exploitation varies from website to website. Our focus is on the usage of ontology in cybersecurity against XSS attacks, on the importance of the ontology, and its core meaning for cybersecurity. We explain how a vulnerable website can be exploited, and how different JavaScript payloads can be used to detect vulnerabilities. We also enumerate some tools to use for an efficient analysis. We present detailed reasoning on what can be done to improve the security of a website in order to resist attacks, and we provide supportive examples. Then, we apply an ontology model against XSS attacks to strengthen the protection of a web application. However, we note that the mere existence of an ontology does not improve security by itself; it has to be properly used, and a maximum of security layers should be taken into account.
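
The risk around user input described in this abstract can be illustrated with a small sketch that contrasts echoing input verbatim with escaping it first. This is a generic output-encoding example using Python's standard `html.escape`, not the ontology model from the paper; the function names `render_comment_unsafe` and `render_comment` and the sample payload are assumptions made for the example.

```python
import html


def render_comment_unsafe(user_input: str) -> str:
    """Echoes input verbatim: a script tag in the input becomes live markup."""
    return "<p>" + user_input + "</p>"


def render_comment(user_input: str) -> str:
    """Escapes HTML metacharacters so the browser shows the input as text."""
    return "<p>" + html.escape(user_input, quote=True) + "</p>"


# A classic probe payload of the kind used to test for reflected XSS.
payload = "<script>alert(1)</script>"
unsafe_page = render_comment_unsafe(payload)
safe_page = render_comment(payload)
```

Escaping at output time is only one layer; as the abstract notes, a site should combine several security layers rather than rely on a single measure.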

The application of instrumentation system on a contactless robotic triage assistant to detect early transmission on a COVID-19 suspect

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v22.i3.pp1334-1344 ◽

2021 ◽

Vol 22 (3) ◽

pp. 1334

Author(s):

Niko Azhari Hidayat ◽

Prisma Megantoro ◽

Abdufattah Yurianta ◽

Amila Sofiah ◽

Shofa Aulia Aldhama ◽

...

Keyword(s):

Medical Staff ◽

Vital Signs ◽

Portable Document Format ◽

Hand Gesture ◽

Questions And Answers ◽

Specific Care ◽

Contact Free ◽

Pdf Format ◽

Document Format

This article discusses the instrumentation system of Airlangga robotic triage assistant version 1 (ARTA-1), a robot used as a contact-free triage assistant for Coronavirus disease (COVID-19) suspects. The triage process consists of an automatic vital signs check-up as well as the suspect’s anamnesis, which in turn will determine whether the suspect will get specific care or not. Measurements of a suspect’s vital conditions, i.e. temperature, height, and weight, are carried out with sensors integrated with Arduino boards, while a touch-free, hand-gesture question-and-answer session is carried out to complete the anamnesis process. A portable document format (PDF) version of the triage report, which recommends what to do for the suspect, will then be automatically generated and emailed to a designated member of the medical staff.
