Working with Semistructured Data

2019 ◽  
pp. 147-175
Author(s):  
Dmitry Anoshin ◽  
Dmitry Shirokov ◽  
Donna Strok

1997 ◽  
Vol 26 (4) ◽  
pp. 24-31 ◽  
Author(s):  
Jason McHugh ◽  
Jennifer Widom




2021 ◽  
Vol 1 (2) ◽  
pp. 65-77
Author(s):  
T. E. Vildanov ◽  
N. S. Ivanov

This article explores both popular and newly invented tools for extracting data from websites and converting it into a form suitable for analysis. The paper compares Python libraries, with performance as the key criterion. The results are grouped by site, tool used, and number of iterations, and then presented in graphical form. The scientific novelty of the research lies in the area where the data extraction tools are applied: we retrieve and transform semistructured data from the websites of bookmakers and betting exchanges. The article also describes new tools that are not yet in great demand in the field of parsing and web scraping. As a result of the study, quantitative metrics were obtained for all the tools used, and the libraries best suited to rapidly extracting and processing large quantities of information were selected.



Author(s):  
Kartik Menon ◽  
Sanjay Madria ◽  
A. Badia


2018 ◽  
pp. 1553-1555
Author(s):  
Gillian Dobbie ◽  
Tok Wang Ling




Author(s):  
Tetsuhiro Miyahara ◽  
Yusuke Suzuki ◽  
Takayoshi Shoudai ◽  
Tomoyuki Uchida ◽  
Sachio Hirokawa ◽  
...  

