web document
Recently Published Documents


TOTAL DOCUMENTS

356
(FIVE YEARS 33)

H-INDEX

20
(FIVE YEARS 2)

2022 ◽  
Vol 40 (2) ◽  
pp. 1-40
Author(s):  
Tung Vuong ◽  
Salvatore Andolina ◽  
Giulio Jacucci ◽  
Tuukka Ruotsalo

We study the effect of contextual information obtained from a user’s digital trace on Web search performance. Contextual information is modeled using Dirichlet–Hawkes processes (DHP) and used in augmenting Web search queries. The context is captured by monitoring all naturally occurring user behavior using continuous 24/7 recordings of the screen and associating the context with the queries issued by the users. We report a field study in which 13 participants installed a screen recording and digital activity monitoring system on their laptops for 14 days, resulting in data on all Web search queries and the associated context data. A query augmentation (QAug) model was built to expand the original query with semantically related terms. The effects of context window and source were determined by training context models with temporally varying context windows and varying application sources. The context models were then used to re-rank the QAug model. We evaluate the context models by using the Web document rankings of the original query as a control condition compared against various experimental conditions: (1) a search context condition in which the context was sourced from search history; (2) a non-search context condition in which the context was sourced from all interactions excluding search history; (3) a comprehensive context condition in which the context was sourced from both search and non-search histories; and (4) an application-specific condition in which the context was sourced from interaction histories captured on a specific application type. Our results indicated that incorporating more contextual information significantly improved Web search rankings as measured by the positions of the documents on which users clicked in the search result pages. The effects and importance of different context windows and application sources, along with different query types, are analyzed, and their impact on Web search performance is discussed.
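
The evaluation idea in this abstract, comparing rankings by the position of the clicked document, can be illustrated with a small sketch. The abstract does not name the exact metric, so mean reciprocal rank is used here purely as an assumed stand-in; the function names and the toy document IDs are hypothetical and do not come from the paper.

```python
from typing import Sequence


def clicked_position(ranking: Sequence[str], clicked: str) -> int:
    """Return the 1-based position of the clicked document in a ranking."""
    return list(ranking).index(clicked) + 1


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], clicks: Sequence[str]
) -> float:
    """Average 1/position of the clicked document over all queries.

    Assumed metric for illustration only; the abstract states just that
    rankings were compared by clicked-document positions.
    """
    positions = [clicked_position(r, c) for r, c in zip(rankings, clicks)]
    return sum(1.0 / p for p in positions) / len(positions)


# Toy data (hypothetical): two queries, one clicked document each.
baseline = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]  # control condition
reranked = [["d2", "d1", "d3"], ["d6", "d4", "d5"]]  # a context condition
clicks = ["d2", "d6"]

baseline_mrr = mean_reciprocal_rank(baseline, clicks)
reranked_mrr = mean_reciprocal_rank(reranked, clicks)
```

A higher value for a context condition than for the control would indicate that clicked documents moved closer to the top of the result list.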


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Bin Li ◽  
Ting Zhang

To capture the scene information of ordinary football games more comprehensively, an algorithm for collecting this information from web documents is proposed. The commonly used T-graph web crawler model first collects the sample nodes of a specific topic in the football game scene information; after the crawling stage, the edge document information for that topic is collected. A feature item extraction algorithm based on semantic analysis then extracts the feature items of the football game scene information according to their similarity, forming a web document. Finally, a complex network is constructed, and the local contribution and overlap coefficient of the community discovery feature selection algorithm are introduced to select the features of the web document, which completes the collection of football game scene information. Experimental results show that the algorithm has high topic collection capability and low computational cost, that the average accuracy of equilibrium is always around 98%, and that it has strong quantification capabilities for web crawlers and communities.
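
The abstract mentions an overlap coefficient without defining it. As a hedged illustration, the sketch below uses the standard set-based overlap coefficient, |A ∩ B| / min(|A|, |B|); whether the paper uses exactly this definition is an assumption, and the feature sets shown are invented examples.

```python
def overlap_coefficient(a: set, b: set) -> float:
    """Standard set overlap coefficient: |A & B| / min(|A|, |B|).

    Returns 0.0 when either set is empty to avoid division by zero.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical feature items extracted from two web documents.
doc_a = {"goal", "pass", "foul"}
doc_b = {"goal", "foul", "corner", "offside"}

similarity = overlap_coefficient(doc_a, doc_b)  # 2 shared items / min(3, 4)
```

Because the denominator is the smaller set, a small feature set fully contained in a larger one scores 1.0, which is one reason this measure is used when grouping nodes into communities.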


2021 ◽  
Vol 36 (4) ◽  
pp. 1081-1112
Author(s):  
Mohaddeseh Mahjoob ◽  
Faezeh Ensan ◽  
Sanaz Keshvari ◽  
Parastoo Jafarzadeh ◽  
Mohammadamin Keyvanzad ◽  
...  

Author(s):  
Youngseok Lee ◽  
Jungwon Cho

In this paper, we propose a web document ranking method that uses topic modeling for effective information collection and classification. The proposed method is applied as a document ranking technique to avoid duplicate crawling when crawling at high speed. With this technique, it is feasible to remove redundant documents, classify documents efficiently, and confirm that the crawler service is running. The method enables rapid collection of many web documents, so users can efficiently search web pages whose data is constantly updated. In addition, the efficiency of data retrieval can be improved because new information can be automatically classified and transmitted. By extending the method to big-data-based web pages and adapting it to various websites, more effective information retrieval is expected to become possible.


2021 ◽  
pp. 40-51
Author(s):  
Antonio M. Rinaldi ◽  
Cristiano Russo ◽  
Cristian Tommasino

2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Christophe Dubois

Over the last 15 years, the working context of lawyers has undergone many changes. Evolving in an increasingly competitive, deregulated, and globalized market, they are subject to higher tax pressure while being exposed to unbridled technological innovation. Indeed, a growing number of entrepreneurs are using digital solutions to provide online legal services that are supposed to be faster and cheaper. While many of them are nonlawyer legal entrepreneurs, many lawyers are also engineering innovative projects and launching their own start-up companies, known as “LegalTech” or “LawTech.” However, few studies, or none to our knowledge, provide an empirically grounded analysis of such projects, leaving some questions unanswered. Who are these entrepreneurial lawyers? How and why do they engineer and develop LegalTech projects? How do they challenge the legal profession? To answer these questions, this article draws on a qualitative study of three contrasting start-ups that Belgian lawyers have recently developed. The research methodology combines gray and scientific literature reviews, web-document (hereafter “manifestos”) analysis, and semi-directive interviews conducted with the start-ups’ founders (n = 5), the Bar Association’s representatives (n = 3), and some members of the main Belgian LegalTech network (n = 4).

