CENTAURUS: A Dynamic Parser Generator for Parallel Ad Hoc Data Extraction

2020 ◽  
Vol 28 (0) ◽  
pp. 724-732
Author(s):  
Shigeyuki Sato ◽  
Hiroka Ihara ◽  
Kenjiro Taura
2020 ◽  
pp. 5-9
Author(s):  
Manasvi Srivastava ◽  
Vikas Yadav ◽  
Swati Singh ◽  
...  

The Internet is the largest source of information created by humanity, containing a variety of materials in formats such as text, audio, and video. Web scraping is one way to harvest it: a set of techniques for obtaining information from websites automatically instead of copying the data manually. Many web-based data extraction methods are designed to solve specific problems and work on ad hoc domains. Various tools and technologies have been developed to facilitate web scraping; unfortunately, the appropriateness and ethics of using these tools are often overlooked. Hundreds of web scraping packages are available today, most of them designed for Java, Python, and Ruby, with both open-source and commercial options. Web-based tools such as Yahoo Pipes, Google Web Scrapers, and the OutWit extension for Firefox are good starting points for beginners. Web extraction essentially replaces this manual extraction and editing process, providing an easier and better way to collect data from a web page, convert it into the desired format, and save it to a local directory or archive. In this paper, among the various kinds of scraping, we focus on techniques that extract the content of a web page. In particular, we apply these scraping techniques to gather information on a variety of diseases, with their symptoms and precautions.
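As an illustration of the kind of content-extraction pipeline this abstract describes (fetch a page, extract structured records, save them in the desired format), here is a minimal Python sketch using requests and BeautifulSoup; the URL and CSS selectors are hypothetical placeholders, not the authors' implementation.

```python
# Minimal web-scraping sketch in Python (requests + BeautifulSoup).
# The URL and CSS selectors are hypothetical placeholders; a real
# scraper would target the markup of the specific site.
import csv

import requests
from bs4 import BeautifulSoup

URL = "https://example.org/diseases"  # hypothetical source page

def scrape_diseases(url: str) -> list[dict]:
    """Fetch a page and extract disease name, symptoms, and precautions."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for entry in soup.select("div.disease"):  # hypothetical markup
        records.append({
            "name": entry.select_one("h2").get_text(strip=True),
            "symptoms": entry.select_one(".symptoms").get_text(strip=True),
            "precautions": entry.select_one(".precautions").get_text(strip=True),
        })
    return records

if __name__ == "__main__":
    rows = scrape_diseases(URL)
    # Convert the extracted data into the desired format (CSV here)
    # and save it to a local directory, as the abstract describes.
    with open("diseases.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "symptoms", "precautions"])
        writer.writeheader()
        writer.writerows(rows)
```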


2020 ◽  
Vol 10 (18) ◽  
pp. 6291
Author(s):  
Dorina Lauritano ◽  
Giulia Moreo ◽  
Luisa Limongelli ◽  
Michele Nardone ◽  
Francesco Carinci

(1) Introduction: The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the agent of coronavirus disease 2019 (COVID-19), is rapidly spreading in many countries and represents a public health emergency of international concern. SARS-CoV-2 transmission mainly occurs from person to person via respiratory droplets (direct transmission route), leading to the onset of mild or severe symptoms or even causing death. Since SARS-CoV-2 is also able to survive on inanimate surfaces for extended periods, constituting an indirect transmission route, contaminated surfaces in healthcare settings should be subjected to specific disinfection protocols. Our review aimed to investigate the existing disinfection measures for healthcare setting surfaces in order to prevent the nosocomial transmission of SARS-CoV-2. (2) Materials and Methods: We conducted an electronic search of PubMed, Scopus, Science Direct, and the Cochrane Library, and 120 items were screened for eligibility. Only 11 articles were included in the review and selected for data extraction. (3) Results: All the included studies proposed the use of ethanol at different concentrations (70% or 75%) as a biocidal agent against SARS-CoV-2, with the capacity to reduce viral activity by 3 log10 or more after 1 min of exposure. Other disinfection protocols involved the use of chlorine-containing disinfectants, 0.1% and 0.5% sodium hypochlorite, quaternary ammonium in combination with 75% ethanol, 70% isopropyl alcohol, 2% glutaraldehyde, ultraviolet light (UV-C) technology, and many others. Two studies suggested using Environmental Protection Agency (EPA)-registered disinfectants, while one article chose to follow the WST-512-2016 Guidance of Environmental and Surfaces Cleaning, Disinfection and Infection Control in Hospitals. (4) Conclusion: Different surface disinfection methods proved to reduce the viral activity of SARS-CoV-2, preventing its indirect nosocomial transmission. However, more specific cleaning measures, tailored ad hoc to the different settings of the healthcare sector, still need to be formulated.
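For reference, the "3 log10" figure reported above is the standard log-reduction measure; writing N_0 for the viral titre before exposure and N for the titre after, the relation is:

```latex
\mathrm{LR} = \log_{10}\!\left(\frac{N_0}{N}\right),
\qquad \mathrm{LR} \ge 3 \;\Longleftrightarrow\; \frac{N}{N_0} \le 10^{-3}
```

i.e., a 3 log10 reduction corresponds to inactivating at least 99.9% of the infectious virus.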


2021 ◽  
Vol 8 (3) ◽  
pp. 140-144
Author(s):  
G Midhu Bala ◽  
K Chitra

Web scraping is the process of automatically extracting data from multiple web pages on the World Wide Web. It is a field under active development that shares goals with text processing, the semantic web vision, semantic understanding, machine learning, artificial intelligence, and human-computer interaction. Current web scraping solutions range from ad hoc approaches requiring considerable human effort to fully automated systems that can extract the required unstructured information and convert it into structured form, each with its own limitations. This paper describes a method for developing a web scraper in R that locates files on a website, extracts the filtered data, and stores it. The modules used and the algorithm for automating navigation of a website via links are described; the extracted data can further be used for data analytics.
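The paper's implementation is in R; purely to illustrate the link-navigation algorithm it describes (locate files on a site by following links, then extract and store the filtered data), here is a minimal sketch in Python, where the seed URL, the same-site restriction, and the .csv filter are all illustrative assumptions.

```python
# Sketch of automated link navigation: breadth-first crawl from a seed
# page, collecting links to files that match a filter. Python stands in
# for the paper's R implementation; URL and filter are hypothetical.
from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def crawl_for_files(seed: str, suffix: str = ".csv", max_pages: int = 50):
    """Follow same-site links from `seed`, returning URLs ending in `suffix`."""
    seen, found = {seed}, []
    queue = deque([seed])
    site = urlparse(seed).netloc
    while queue and len(seen) <= max_pages:
        page = queue.popleft()
        soup = BeautifulSoup(requests.get(page, timeout=10).text, "html.parser")
        for a in soup.find_all("a", href=True):
            url = urljoin(page, a["href"])
            if url.endswith(suffix):
                found.append(url)          # a file to extract and store
            elif urlparse(url).netloc == site and url not in seen:
                seen.add(url)
                queue.append(url)          # keep navigating via links
    return found
```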


Author(s):  
Chris Radbone ◽  
James Farrow

ABSTRACT Objectives: SA NT DataLink's Next Generation Linkage Management System (NGLMS) provides a novel approach to handling privileged or sensitive data for certain projects without having to replicate or duplicate databases or repeat the work needed to protect privacy. The NGLMS is a collection of records (nodes) and relationships (edges) that forms a graph (in the computer science sense) and is designed to support a mix-and-match, layered approach to data linkage projects. The NGLMS allows the needs of different clients to be met from the one graph data set while preserving privacy and honouring the requirement to protect sensitive information, without having to relink or duplicate data. Approach: The NGLMS uses a layer-based approach to project description and design. Projects, such as a specific data linkage request, are composed of various data layers. The data layers consist of data sets and link information in the form of pairwise relationships. These layers are coupled with quality information, e.g. acceptable similarity thresholds and/or the types of relationships to consider as 'linking' two records, to construct an effective virtual data set which may be different for each project. A project can be constructed by composing existing linkage data (where it already exists) without having to perform new linkage comparisons. Results: A case study will be discussed in which a data set containing extremely sensitive information (record pairings revealing name changes due to family court proceedings and protection orders) was received for incorporation into the data pool. The data custodian who supplied this sensitive information wished its conditions to be honoured: the records were to be incorporated only into approved analyses and otherwise excluded from non-authorised analyses. By placing these data into a separate layer, to be included in some projects and not others, the sensitive nature of the data can be accommodated and its effects 'turned on and off' at will. Conclusion: The flexible, on-demand nature of data extraction and late clustering in the NGLMS graph-based approach to linkage allows ad hoc project construction and the dynamic inclusion and exclusion of data without the overhead of relinking.
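As a sketch of the layered-graph idea, the toy Python model below represents records as nodes, pairwise links as edges carried in named layers, and a project as a choice of layers plus a similarity threshold; all class and field names are invented for illustration and are not the actual NGLMS API.

```python
# Toy model of a layered linkage graph: records are nodes, pairwise
# links are edges carried in named layers, and a project is a choice of
# layers plus a similarity threshold. Names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Layer:
    name: str
    sensitive: bool = False
    # edges: (record_a, record_b, similarity score)
    edges: list[tuple[str, str, float]] = field(default_factory=list)

@dataclass
class Project:
    layers: list[Layer]          # which layers this client may see
    threshold: float             # acceptable similarity threshold

    def linked_pairs(self):
        """Effective virtual data set: union of the permitted layers' edges."""
        for layer in self.layers:
            for a, b, score in layer.edges:
                if score >= self.threshold:
                    yield a, b

# Sensitive links (e.g. name changes) live in their own layer, so they
# can be 'turned on and off' per project without relinking anything.
public = Layer("routine_links", edges=[("r1", "r2", 0.97)])
court = Layer("court_orders", sensitive=True, edges=[("r2", "r9", 0.99)])

approved = Project(layers=[public, court], threshold=0.95)
restricted = Project(layers=[public], threshold=0.95)
print(list(approved.linked_pairs()))    # includes the sensitive pairing
print(list(restricted.linked_pairs()))  # excludes it
```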


Author(s):  
F. Bruno ◽  
A. Lagudi ◽  
G. Ritacco ◽  
M. Muzzupappa ◽  
R. Guida

Remotely Operated underwater Vehicles (ROVs) play an important role in a number of operations conducted in shallow and deep water (e.g. exploration, survey, intervention) in several application fields such as marine science, offshore construction, and underwater archaeology. ROVs are usually equipped with different imaging devices, both optical and acoustic. Optical sensors are able to generate better images at close range and in clear water conditions, while acoustic systems are usually employed for long-range acquisition and do not suffer from the presence of turbidity, a well-known cause of coarser resolution and harder data extraction. In this work we describe the preliminary steps in the development of an opto-acoustic camera able to provide an on-line 3D reconstruction of the acquired scene. Taking full advantage of the benefits arising from opto-acoustic data fusion techniques, the system was conceived as a support tool for ROV operators during navigation in turbid waters or in operations conducted by means of mechanical manipulators. The paper presents an overview of the device, an ad hoc methodology for the extrinsic calibration of the system, and custom software developed to control the opto-acoustic camera and supply the operator with visual information.
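The abstract does not detail the calibration procedure; as a generic illustration of what an extrinsic calibration delivers, the sketch below applies a rigid transform (rotation R plus translation t) mapping points from an acoustic-sensor frame into an optical-camera frame. The numeric extrinsics are made-up placeholders, not calibrated values.

```python
# Generic illustration of extrinsic calibration between two sensors:
# a rigid transform (rotation R, translation t) maps points expressed
# in the acoustic-camera frame into the optical-camera frame. The
# numbers below are made-up placeholders, not calibrated values.
import numpy as np

# Example extrinsics: 10-degree rotation about the z-axis plus an offset.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.15, 0.0, -0.05])  # metres between sensor origins

def acoustic_to_optical(points_ac: np.ndarray) -> np.ndarray:
    """Map Nx3 points from the acoustic frame to the optical frame."""
    return points_ac @ R.T + t

sonar_points = np.array([[2.0, 0.5, 4.0]])   # a 3D return from the sonar
print(acoustic_to_optical(sonar_points))     # same point, optical frame
```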


2021 ◽  
Author(s):  
Marc-André Maheu-Cadotte ◽  
Véronique Dubé ◽  
Sylvie Cossette ◽  
Alexandra Lapierre ◽  
Guillaume Fontaine ◽  
...  

BACKGROUND Based on ethical and methodological arguments, numerous calls have been made to increase end-user involvement in serious game (SG) development. Involving end-users is seen as a way to give them power and control over educational software designed for them. It can also help identify areas for improvement in SG design and improve efficacy on targeted learning outcomes. However, no recognized guidelines or framework exist to guide end-user involvement in SG development. OBJECTIVE To describe how end-users are involved in the development of SGs for healthcare professions education. METHODS We examined the literature presenting the development of 45 SGs that had reached the stage of efficacy evaluation in randomized trials. One author performed data extraction using an ad hoc form based on an SG design framework. Data were then coded and synthesized based on similarities. The coding scheme was refined iteratively with the involvement of a second author. Results are presented as frequencies and percentages. RESULTS End-user involvement was mentioned in the development of 21 of the 45 SGs. The number of end-users involved ranged from 12 to 36. End-users were often involved to answer specific concerns that arose during SG development (n = 6) or to test a prototype (n = 12). In many cases, researchers solicited input from end-users regarding the goals to reach (n = 10) or the functional esthetics of the SGs (n = 7). Most researchers used self-reported questionnaires (n = 7). CONCLUSIONS End-user involvement is mentioned, and then only poorly described, in the development of fewer than half of the SGs identified. This significantly limits both the evaluation of the impact of end-user involvement on SG efficacy and the making of recommendations.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hua Liang ◽  
Yanhong Shang ◽  
Sha Wang

The vehicle-mounted self-organizing network is a part of the MANET family. It sits between roadside vehicles and fixed communication equipment, can serve as a hub for road vehicles, and enables multi-hop wireless mechanisms to exchange data between vehicles. This article studies a DTN routing protocol based on machine learning in the vehicular self-organizing network. When data is forwarded, a node selects the forwarding route according to its own coordinates, the coordinates of neighbouring nodes, and the coordinates of the destination node. Typically, the geographic coordinates of the destination node are stored in the data packet, and beacon packets are periodically exchanged between nodes, so that each node, by broadcasting its own coordinates, allows nearby nodes to keep their location information up to date. Routing has become one of the most important challenges in vehicular self-organizing networks, for many reasons, including frequent changes in network topology and fast-moving mobile nodes. The experimental results in this paper show that more than 67% of the network data was processed with the Gawk data extraction tool to quantify GPSR performance indicators and obtain the average driving speed of the vehicle nodes; as that speed increases, the average end-to-end transmission delay of the GPSR routing protocol increases and the average transmission rate decreases.
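GPSR's greedy mode chooses, among the neighbours learned from periodic beacons, the one geographically closest to the destination carried in the packet; the Python sketch below illustrates that next-hop rule with invented node IDs and coordinates (GPSR's perimeter-mode fallback is not shown).

```python
# Sketch of greedy geographic forwarding (the greedy mode of GPSR):
# each node forwards to the neighbour geographically closest to the
# destination, using the coordinates carried in the packet and the
# neighbour table built from periodic beacons. Coordinates invented.
import math

def dist(a: tuple, b: tuple) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_next_hop(self_pos, neighbours, dest_pos):
    """Pick the neighbour strictly closer to the destination than we are.

    neighbours: {node_id: (x, y)} learned from beacon packets.
    Returns None at a local maximum (GPSR would then switch to
    perimeter/face routing, not sketched here).
    """
    best_id, best_d = None, dist(self_pos, dest_pos)
    for node_id, pos in neighbours.items():
        d = dist(pos, dest_pos)
        if d < best_d:
            best_id, best_d = node_id, d
    return best_id

table = {"v1": (120.0, 40.0), "v2": (80.0, 95.0)}  # from beacons
print(greedy_next_hop((100.0, 50.0), table, dest_pos=(200.0, 60.0)))
```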


Author(s):  
W.J. de Ruijter ◽  
M.R. McCartney ◽  
David J. Smith ◽  
J.K. Weiss

Further advances in resolution enhancement of transmission electron microscopes can be expected from digital processing of image data recorded with slow-scan CCD cameras. Image recording with these new cameras is essential because of their high sensitivity, extreme linearity, and negligible geometric distortion. Furthermore, digital image acquisition allows for on-line processing, which yields virtually immediate reconstruction results. At present, the most promising techniques for exit-surface wave reconstruction are electron holography and the recently proposed focal variation method. The latter method is based on image processing applied to a series of images recorded at equally spaced defocus values. Exit-surface wave reconstruction using the focal variation method as proposed by Van Dyck and Op de Beeck proceeds in two stages. First, the complex image wave is retrieved by data extraction from a parabola situated in three-dimensional Fourier space. Then the objective lens spherical aberration, astigmatism, and defocus are corrected by simply dividing the image wave by the wave aberration function, calculated with the appropriate objective lens aberration coefficients, which yields the exit-surface wave.
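As a schematic sketch of that final correction step (not the authors' code), the Python fragment below divides the retrieved image wave by a transfer function exp(-i*chi(k)) in Fourier space, with chi containing only defocus and spherical aberration terms and all optical parameters chosen as illustrative values.

```python
# Schematic sketch of the aberration-correction step: divide the
# retrieved image wave by the wave aberration (transfer) function
# exp(-i*chi(k)) in Fourier space to obtain the exit-surface wave.
# Optical parameters below are illustrative, not from the paper.
import numpy as np

N, px = 256, 0.05e-9          # grid size, pixel size (m)
lam = 2.51e-12                # electron wavelength at ~200 kV (m)
Cs, df = 1.2e-3, -60e-9       # spherical aberration (m), defocus (m)

# Spatial-frequency grid k (1/m) and aberration phase chi(k),
# keeping only the defocus and spherical-aberration terms.
k = np.fft.fftfreq(N, d=px)
kx, ky = np.meshgrid(k, k)
k2 = kx**2 + ky**2
chi = np.pi * lam * df * k2 + 0.5 * np.pi * Cs * lam**3 * k2**2

def correct_aberrations(image_wave: np.ndarray) -> np.ndarray:
    """Divide the image wave by the transfer function in Fourier space."""
    transfer = np.exp(-1j * chi)
    return np.fft.ifft2(np.fft.fft2(image_wave) / transfer)

exit_wave = correct_aberrations(np.ones((N, N), dtype=complex))
```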

