web document Latest Research Papers

Does More Context Help? Effects of Context Window and Application Source on Retrieval Performance

ACM Transactions on Information Systems ◽ 10.1145/3474055 ◽ 2022 ◽ Vol 40 (2) ◽ pp. 1-40

Author(s): Tung Vuong ◽ Salvatore Andolina ◽ Giulio Jacucci ◽ Tuukka Ruotsalo

Keyword(s): Web Search ◽ User Behavior ◽ Contextual Information ◽ Search Performance ◽ Context Condition ◽ Experimental Conditions ◽ Search Queries ◽ Web Document ◽ Context Models ◽ Search History

We study the effect of contextual information obtained from a user’s digital trace on Web search performance. Contextual information is modeled using Dirichlet–Hawkes processes (DHP) and used to augment Web search queries. The context is captured by monitoring all naturally occurring user behavior through continuous 24/7 screen recordings and associating the context with the queries issued by the users. We report a field study in which 13 participants installed a screen recording and digital activity monitoring system on their laptops for 14 days, resulting in data on all Web search queries and the associated context data. A query augmentation (QAug) model was built to expand the original query with semantically related terms. The effects of context window and source were determined by training context models with temporally varying context windows and varying application sources. The context models were then used to re-rank the QAug model's results. We evaluate the context models by using the Web document rankings of the original query as a control condition compared against various experimental conditions: (1) a search context condition in which the context was sourced from search history; (2) a non-search context condition in which the context was sourced from all interactions excluding search history; (3) a comprehensive context condition in which the context was sourced from both search and non-search histories; and (4) an application-specific condition in which the context was sourced from interaction histories captured on a specific application type. Our results indicated that incorporating more contextual information significantly improved Web search rankings, as measured by the positions of the documents on which users clicked in the search result pages. The effects and importance of different context windows and application sources, along with different query types, are analyzed, and their impact on Web search performance is discussed.
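The re-ranking and click-position evaluation described in this abstract can be illustrated with a minimal sketch. It does not implement the paper's Dirichlet–Hawkes context model or its QAug model; it swaps in plain relative term frequencies over a context window as a stand-in. All function names (`context_weights`, `rerank`, `clicked_rank`) and the sample data are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def context_weights(context_texts):
    """Relative term frequencies over the context window.

    Assumption: a stand-in for a learned context model, not the DHP model.
    """
    counts = Counter(tok for text in context_texts for tok in tokenize(text))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


def rerank(documents, weights):
    """Order documents by the summed context weight of their distinct terms.

    Python's sort is stable, so documents with equal scores keep their
    original (control-condition) order.
    """
    def score(doc):
        return sum(weights.get(tok, 0.0) for tok in set(tokenize(doc)))

    return sorted(documents, key=score, reverse=True)


def clicked_rank(ranking, clicked_doc):
    """1-based position of the clicked document in a ranking."""
    return ranking.index(clicked_doc) + 1


# Hypothetical data: the control ranking for an ambiguous query and a
# context window drawn from earlier screen content.
control = [
    "jaguar car dealership prices",
    "jaguar habitat and diet in the wild",
    "jaguar football team schedule",
]
context = ["wildlife documentary rainforest habitat", "big cats diet and habitat"]
clicked = "jaguar habitat and diet in the wild"

reranked = rerank(control, context_weights(context))
print(clicked_rank(control, clicked), clicked_rank(reranked, clicked))
```

A lower clicked position under the re-ranked list than under the control list is the kind of improvement the abstract reports, here on toy data only.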

An Algorithm of Scene Information Collection in General Football Matches Based on Web Documents

Security and Communication Networks ◽ 10.1155/2021/5801631 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Bin Li ◽ Ting Zhang

Keyword(s): Semantic Analysis ◽ Computational Cost ◽ Web Crawler ◽ Web Documents ◽ Web Document ◽ Football Game ◽ Average Accuracy ◽ Local Contribution ◽ Scene Information ◽ The Web

To obtain scene information for ordinary football games more comprehensively, an algorithm that collects this information from web documents is proposed. The commonly used T-graph web crawler model collects the sample nodes of a specific topic in the football game scene information and, after the crawling stage, collects the edge document information for that topic. A feature item extraction algorithm based on semantic analysis then extracts the feature items of the football game scene information according to their similarity, forming a web document. By constructing a complex network and introducing the local contribution and overlap coefficient of the community discovery feature selection algorithm, the features of the web document are selected to realize the collection of football game scene information. Experimental results show that the algorithm has high topic collection capability and low computational cost, that the average accuracy of equilibrium stays around 98%, and that it has strong quantification capabilities for web crawlers and communities.
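The abstract does not define its overlap coefficient. As a minimal sketch, assuming it refers to the standard Szymkiewicz–Simpson overlap coefficient between two sets of feature items, the measure could be computed as follows. The function name and the sample feature sets are hypothetical.

```python
def overlap_coefficient(a, b):
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|).

    Assumption: this is the standard set-overlap measure, which may differ
    from the variant used in the paper's feature selection algorithm.
    Returns 0.0 when either set is empty.
    """
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical feature items extracted from two web documents.
doc_a = {"goal", "corner", "penalty"}
doc_b = {"goal", "penalty", "offside", "referee"}
print(overlap_coefficient(doc_a, doc_b))
```

Because the denominator is the size of the smaller set, the coefficient reaches 1.0 whenever one feature set is contained in the other.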

Semantic Conceptual Relational Similarity Based Web Document Clustering for Efficient Information Retrieval Using Semantic Ontology

KSII Transactions on Internet and Information Systems ◽ 10.3837/tiis.2021.09.001 ◽ 2021 ◽ Vol 15 (9)

Keyword(s): Information Retrieval ◽ Document Clustering ◽ Web Document ◽ Relational Similarity ◽ Efficient Information ◽ Web Document Clustering ◽ Semantic Ontology

Web Document Encoding for Structure-Aware Keyphrase Extraction

Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval ◽ 10.1145/3404835.3463067 ◽ 2021

Author(s): Jihyuk Kim ◽ Young-In Song ◽ Seung-won Hwang

Keyword(s): Keyphrase Extraction ◽ Web Document

Extraction of Effective Textual and Semantic Features in Learning to Rank for Web Document Retrieval

Iranian Journal of Information Processing and Management ◽ 10.52547/jipm.36.4.1081 ◽ 2021 ◽ Vol 36 (4) ◽ pp. 1081-1112

Author(s): Mohaddeseh Mahjoob ◽ Faezeh Ensan ◽ Sanaz Keshvari ◽ Parastoo Jafarzadeh ◽ Mohammadamin Keyvanzad ◽ ...

Keyword(s): Learning To Rank ◽ Document Retrieval ◽ Semantic Features ◽ Web Document

Web document classification using topic modeling based document ranking

International Journal of Electrical and Computer Engineering (IJECE) ◽ 10.11591/ijece.v11i3.pp2386-2392 ◽ 2021 ◽ Vol 11 (3) ◽ pp. 2386

Author(s): Youngseok Lee ◽ Jungwon Cho

Keyword(s): Topic Modeling ◽ High Speed ◽ Data Retrieval ◽ Web Pages ◽ Web Documents ◽ Document Ranking ◽ Web Document ◽ New Information ◽ Data Update ◽ Ranking Technique

In this paper, we propose a web document ranking method using topic modeling for effective information collection and classification. The proposed method is applied to the document ranking technique to avoid duplicated crawling when crawling at high speed. With the proposed document ranking technique, it is feasible to remove redundant documents, classify the documents efficiently, and confirm that the crawler service is running. The proposed method enables rapid collection of many web documents, and users can efficiently search web pages whose data is constantly updated. In addition, the efficiency of data retrieval can be improved because new information can be automatically classified and transmitted. By expanding the scope of the method to big-data-based web pages and improving it for application to various websites, more effective information retrieval is expected to become possible.
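The idea of dropping redundant documents during a crawl can be sketched in a few lines. This is not the paper's topic-modeling-based ranking; it substitutes a simpler technique, cosine similarity over bag-of-words counts, to flag near-duplicates. The function names and the 0.9 threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def drop_near_duplicates(docs, threshold=0.9):
    """Keep a document only if it is below the similarity threshold
    against every document kept so far (first occurrence wins)."""
    kept, vectors = [], []
    for doc in docs:
        vec = Counter(doc.lower().split())
        if all(cosine(vec, seen) < threshold for seen in vectors):
            kept.append(doc)
            vectors.append(vec)
    return kept


# Hypothetical crawled pages; the second differs from the first only in case.
pages = [
    "breaking news about the match",
    "Breaking news about the match",
    "weather forecast for tomorrow",
]
print(drop_near_duplicates(pages))
```

Each new page is compared against all kept pages, so the cost grows quadratically with the number of kept documents; a crawler operating at high speed would need an index or hashing scheme instead.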

CDA: a Cost Efficient Content-based Multilingual Web Document Aligner

10.18653/v1/2021.eacl-main.266 ◽ 2021

Author(s): Thuy Vu ◽ Alessandro Moschitti

Keyword(s): Web Document ◽ Cost Efficient

Visual Query Posing in Multimedia Web Document Retrieval

2021 IEEE 15th International Conference on Semantic Computing (ICSC) ◽ 10.1109/icsc50631.2021.00086 ◽ 2021

Author(s): Antonio M. Rinaldi ◽ Cristiano Russo ◽ Cristian Tommasino

Keyword(s): Document Retrieval ◽ Web Document ◽ Visual Query

Web Document Categorization Using Knowledge Graph and Semantic Textual Topic Detection

10.1007/978-3-030-86970-0_4 ◽ 2021 ◽ pp. 40-51

Author(s): Antonio M. Rinaldi ◽ Cristiano Russo ◽ Cristian Tommasino

Keyword(s): Knowledge Graph ◽ Topic Detection ◽ Web Document ◽ Document Categorization

How do Lawyers Engineer and Develop LegalTech Projects? A Story of Opportunities, Platforms, Creative Rationalities, and Strategies

Law, Technology and Humans ◽ 10.5204/lthj.v3i1.1558 ◽ 2020 ◽ Vol 2 (2)

Author(s): Christophe Dubois

Keyword(s): Qualitative Study ◽ Technological Innovation ◽ Research Methodology ◽ Scientific Literature ◽ Legal Services ◽ Limited Knowledge ◽ Web Document ◽ Start Up ◽ Literature Reviews ◽ Start Ups

Over the last 15 years, the working context of lawyers has undergone many changes. Evolving in an increasingly competitive, deregulated, and globalized market, they are subject to higher tax pressure while being exposed to unbridled technological innovation. Indeed, a growing number of entrepreneurs are using digital solutions to provide online legal services that are supposed to be faster and cheaper. While many of them are nonlawyer legal entrepreneurs, many lawyers are also engineering innovative projects and launching their own start-up companies, known as “LegalTech” or “LawTech.” However, few studies (or none, to our limited knowledge) provide an empirically grounded analysis of such projects, leaving some questions unanswered. Who are these entrepreneurial lawyers? How and why do they engineer and develop LegalTech projects? How do they challenge the legal profession? To answer these questions, this article draws on a qualitative study of three contrasting start-ups that Belgian lawyers have recently developed. The research methodology combines gray and scientific literature reviews, web-document (hereafter “manifestos”) analysis, and semi-directive interviews conducted with the start-ups’ founders (n = 5), the Bar Association’s representatives (n = 3), and some members of the main Belgian LegalTech network (n = 4).
