Flint: From Web Pages to Probabilistic Semantic Data

Leveraging Semantic Markups for Incorporating External Resources Data to the Content of a Web Page

Russian Digital Libraries Journal ◽ 10.26907/1562-5419-2020-23-3-494-513 ◽ 2020 ◽ Vol 23 (3) ◽ pp. 494-513

Author(s): Evgeny L’vovich Kitaev ◽ Rimma Yuryevna Skornyakova

Keyword(s): Programming Languages ◽ World Wide ◽ Software Tool ◽ Web Pages ◽ Web Page ◽ External Resources ◽ Semantic Data ◽ The World ◽ Programming Skills ◽ Application Developers

The semantic markup of the World Wide Web has accumulated a large amount of data, and the volume of markup continues to grow. In our opinion, however, the potential of these data is not fully utilized. Markup content is widely used by search engines and, to some extent, by social networks, but application developers usually access it by converting the data to the RDF standard and executing SPARQL queries, which requires programming skills and a good knowledge of SPARQL. In this paper, we propose leveraging the semantic markup available on the Web to automatically incorporate its content into other web pages. We also present a software tool that implements such incorporation and does not require the web page developer to know any programming language other than HTML and CSS. The tool requires no installation; the work is performed by JavaScript plugins. Currently, the tool supports semantic data contained in two popular types of semantic markup, microdata and JSON-LD, as well as in the tags of HTML documents and the properties of Word and PDF documents.
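To make the kind of data involved concrete, here is a minimal sketch of pulling JSON-LD markup out of a page. It is not the tool described in the abstract (which is said to run as JavaScript plugins); it only illustrates, using the Python standard library, how JSON-LD blocks can be read from HTML. The class name `JsonLdExtractor`, the helper `extract_jsonld`, and the sample page are assumptions made for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed <script type="application/ld+json"> blocks (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # The type attribute may be missing or valueless, so guard against None.
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks; buffer until the end tag.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_jsonld(html):
    """Return a list of decoded JSON-LD objects found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up sample page, used only to exercise the sketch.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example headline"}
</script>
</head><body><p>Hello</p></body></html>
"""

items = extract_jsonld(SAMPLE_HTML)
```

The decoded objects are plain dictionaries, so a page template could read fields such as `items[0]["headline"]` without RDF conversion or SPARQL; microdata would need a separate attribute-based walk that this sketch does not cover.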

Geographic Information Retrieval and Text Mining on Chinese Tourism Web Pages

Models for Capitalizing on Web Engineering Advancements ◽ 10.4018/978-1-4666-0023-2.ch012 ◽ 2012 ◽ pp. 219-239

Author(s): Ming-Cheng Tsou

Keyword(s): Text Mining ◽ Data Gathering ◽ Geographic Information ◽ Structured Data ◽ Web Pages ◽ Chinese Word Segmentation ◽ Geographic Information Retrieval ◽ Semantic Data ◽ Amount Of Knowledge ◽ Searching Method

The World Wide Web (WWW) offers an enormous wealth of information and data, and assembles a tremendous amount of knowledge. Much of this knowledge, however, consists of unstructured or semi-structured data. To make more efficient use of these unexploited or underexploited resources, the management of information and data gathering has become an essential task for research and development. In this paper, the author examines the task of searching for a hostel or homestay using the Google search web service as the base search engine. Location and semantic data were mined, retrieved, and sorted from the search results by combining the Chinese Word Segmentation System with text mining technology to find geographic information in web pages. The results obtained with this search method brought users closer to the answers they sought, with greater accuracy, because the results included both graphical and textual geographic information. In the future, this method may be applicable to various types of queries and analyses, to geographic data collection, and to the management of spatial knowledge related to different keywords within a document.
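The abstract does not say how segmentation and place-name lookup are combined, so the following is only a generic sketch of one common dictionary-based approach: forward maximum matching followed by a gazetteer filter. It is not the Chinese Word Segmentation System named above; the function names, the tiny lexicon, the gazetteer, and the sample sentence are all hypothetical.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Segment text greedily, taking the longest lexicon entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def find_places(text, lexicon, gazetteer, max_len=4):
    """Return the segmented tokens that also appear in the place-name gazetteer."""
    return [t for t in forward_max_match(text, lexicon, max_len) if t in gazetteer]


# Hypothetical lexicon and gazetteer, for illustration only.
LEXICON = {"台北", "車站", "台北車站", "附近", "民宿"}
GAZETTEER = {"台北", "台北車站"}

tokens = forward_max_match("台北車站附近的民宿", LEXICON)
places = find_places("台北車站附近的民宿", LEXICON, GAZETTEER)
```

Because the longest match wins, the station name is kept as one token rather than being split into a city name and a generic noun, which is the property that makes a gazetteer lookup on the resulting tokens meaningful.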

ASHA News: New Web Pages for State Telepractice and Ethics Regulations

ASHA Leader ◽ 10.1044/leader.an7.18112013.63 ◽ 2013 ◽ Vol 18 (11) ◽ pp. 63

Keyword(s): Web Pages

Appropriate Targets for Search Advertising as Part of Online Gatekeeping for Suicide Prevention

Crisis ◽ 10.1027/0227-5910/a000486 ◽ 2018 ◽ Vol 39 (3) ◽ pp. 197-204 ◽ Cited By ~ 1

Author(s): Hajime Sueki ◽ Jiro Ito

Keyword(s): Suicide Prevention ◽ Daily Life ◽ Web Pages ◽ Service Users ◽ Online Questionnaire ◽ Consultation Service ◽ Self Disclosure ◽ Search Advertising ◽ Consultation Services ◽ Age Range

Abstract. Background: Gatekeeper training is an effective suicide prevention strategy. However, the appropriate targets of online gatekeeping have not yet been clarified. Aim: We examined the association between the outcomes of online gatekeeping using the Internet and the characteristics of consultation service users. Method: An advertisement to encourage the use of e-mail-based psychological consultation services among viewers was placed on web pages that showed the results of searches using suicide-related keywords. All e-mails received between October 2014 and December 2015 were replied to as part of gatekeeping, and the obtained data (responses to an online questionnaire and the content of the received e-mails) were analyzed. Results: A total of 154 consultation service users were analyzed, 35.7% of whom were male. The median age range was 20–29 years. Online gatekeeping was significantly more likely to be successful when such users faced financial/daily life or workplace problems, or revealed their names (including online names). By contrast, the activity was more likely to be unsuccessful when it was impossible to assess the problems faced by consultation service users. Conclusion: It may be possible to increase the success rate of online gatekeeping by targeting individuals facing financial/daily life or workplace problems with marked tendencies for self-disclosure.

ADEAR Clinical Trials web pages redesigned

PsycEXTRA Dataset ◽ 10.1037/e492642006-002 ◽ 2004

Keyword(s): Clinical Trials ◽ Web Pages

Gender Differences in First Impressions of Web Pages: The Role of Attractiveness, Complexity, and Brightness on Perceived Design Quality

PsycEXTRA Dataset ◽ 10.1037/e572172013-333 ◽ 2012 ◽ Cited By ~ 1

Author(s): Jo R. Jardina ◽ Mikki Phan ◽ Duy Nguyen ◽ Barbara S. Chaparro

Keyword(s): Gender Differences ◽ First Impressions ◽ Web Pages ◽ Design Quality

Template Detection Technique From Assorted Web Pages

International Journal of Scientific Research ◽ 10.15373/22778179/sep2013/54 ◽ 2012 ◽ Vol 2 (9) ◽ pp. 148-150 ◽ Cited By ~ 1

Author(s): Marriboyina Rajendra ◽ S. Suresh Babu

Keyword(s): Detection Technique ◽ Web Pages

A FRAME WORK FOR WEB INFORMATION EXTRACTION AND ANALYSIS

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽ 10.24297/ijct.v7i2.3459 ◽ 2013 ◽ Vol 7 (2) ◽ pp. 574-579 ◽ Cited By ~ 3

Author(s): Dr Sunitha Abburu ◽ G. Suresh Babu

Keyword(s): Information Extraction ◽ Data Extraction ◽ Research Work ◽ Web Pages ◽ Web Documents ◽ E Learning ◽ Structured Information ◽ Frame Work ◽ Effective Decision ◽ The Web

The volume of information available on the web is growing significantly day by day. Information on the web takes several forms: structured, semi-structured, and unstructured. The majority of it is presented in web pages, where it is semi-structured. However, the information required for a given context is scattered across different web documents. It is difficult to analyze the large volumes of semi-structured information presented in web pages and to make decisions based on the analysis. The current research work proposes a framework for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis. This simplifies data extraction, data consolidation, data analysis, and decision making based on the information presented in web pages. The proposed framework integrates web crawling, information extraction, and data mining technologies for better information analysis that supports effective decision making. It enables people and organizations to extract information from various web sources and to analyze the extracted data for effective decision making. The proposed framework is applicable to any application domain; manufacturing, sales, tourism, and e-learning are a few examples. The framework was implemented and tested for effectiveness, and the results are promising.
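The three stages named in the abstract (crawling, information extraction, and analysis) can be sketched as a small pipeline. This is an illustrative toy under stated assumptions, not the authors' framework: the crawler is replaced by an in-memory mapping of made-up URLs to HTML, extraction is plain text stripping, and the "mining" step is a simple word count. All names here (`crawl`, `extract_text`, `analyze`, `PAGES`) are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gathers the visible text fragments of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def crawl(pages):
    """Stand-in for a crawler: yields (url, html) pairs from an in-memory mapping."""
    yield from pages.items()


def extract_text(html):
    """Extraction stage: reduce a page to its text content."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def analyze(texts, top_n=3):
    """Analysis stage: report the most frequent words longer than three letters."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3)
    return counts.most_common(top_n)


# Made-up pages standing in for crawled documents.
PAGES = {
    "https://example.com/a": "<html><body><h1>Course catalog</h1><p>Online course on data mining.</p></body></html>",
    "https://example.com/b": "<html><body><p>Data mining course schedule.</p></body></html>",
}

texts = [extract_text(html) for _, html in crawl(PAGES)]
report = analyze(texts)
```

Keeping each stage as a separate function mirrors the separation the abstract describes, so a real crawler or a richer mining step could replace one stage without touching the others.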
