It's more than just overlap: Text As Graph

Author(s):  
Ronald Haentjens Dekker ◽  
David J. Birnbaum

The XML tree paradigm has several well-known limitations for document modeling and processing. Some of these have received a lot of attention (especially overlap), and some have received less (e.g., discontinuity, simultaneity, transposition, white space as crypto-overlap). Many of these have work-arounds, also well known, but—as is implicit in the term “work-around”—these work-arounds have disadvantages. Because they get the job done, however, and because XML has a large user community with diverse levels of technological expertise, it is difficult to overcome inertia and move to a technology that might offer a more comprehensive fit with the full range of document structures with which researchers need to interact both intellectually and programmatically. A high-level analysis of why XML has the limitations it has can enable us to explore how an alternative model of Text as Graph (TAG) might address these types of structures and tasks in a more natural and idiomatic way than is available within an XML paradigm.
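As a minimal illustration of the overlap problem (this is not the published TAG data model), the sketch below attaches two markup layers to a shared sequence of text tokens; one annotation crosses a line boundary, which a single XML tree cannot express without fragmentation or milestone workarounds.

```python
# Minimal illustration of overlapping markup over shared text tokens
# (not the published TAG data model). The "phrase" annotation crosses
# the boundary between the two "line" annotations, which a single XML
# tree cannot represent without fragmentation or milestones.
tokens = ["Sing", ",", "O", "goddess", ",", "the", "anger", "of", "Achilles"]

# Each annotation: (layer, label, first_token, last_token), inclusive.
annotations = [
    ("verse",  "line",   0, 4),   # first metrical line
    ("verse",  "line",   5, 8),   # second metrical line
    ("syntax", "phrase", 3, 6),   # crosses the line boundary: true overlap
]

def markup_at(i):
    """All annotations covering token i, across every layer."""
    return [(layer, label) for layer, label, lo, hi in annotations if lo <= i <= hi]

for i, tok in enumerate(tokens):
    print(f"{tok!r:12} {markup_at(i)}")
```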

Author(s):  
Elvira Albert ◽  
Pablo Gordillo ◽  
Benjamin Livshits ◽  
Albert Rubio ◽  
Ilya Sergey

2019 ◽  
Vol 632 ◽  
pp. A72
Author(s):  
L. Mohrmann ◽  
A. Specovius ◽  
D. Tiziani ◽  
S. Funk ◽  
D. Malyshev ◽  
...  

In classical analyses of γ-ray data from imaging atmospheric Cherenkov telescopes (IACTs), such as the High Energy Stereoscopic System (H.E.S.S.), aperture photometry, or photon counting, is applied in a (typically circular) region of interest (RoI) encompassing the source. A key element in the analysis is to estimate the amount of background in the RoI due to residual cosmic-ray-induced air showers in the data. Various standard background estimation techniques have been developed in recent decades; most of them rely on a measurement of the background from source-free regions within the observed field of view. However, particularly in the Galactic plane, source analysis and background estimation are hampered by the large number of, sometimes overlapping, γ-ray sources and by large-scale diffuse γ-ray emission. For complicated fields of view, a three-dimensional (3D) likelihood analysis has the potential to be superior to the classical analysis. In this analysis technique, a spectromorphological model, consisting of one or multiple source components and a background component, is fitted to the data, resulting in a complete spectral and spatial description of the field of view. For the application to IACT data, the major challenge of such an approach is the construction of a robust background model. In this work, we apply the 3D likelihood analysis to various test data recently made public by the H.E.S.S. collaboration, using the open analysis frameworks ctools and Gammapy. First, we show that, when using these tools in a classical analysis approach and comparing to the proprietary H.E.S.S. analysis framework, virtually identical high-level analysis results, such as field-of-view maps and spectra, are obtained. We then describe the construction of a generic background model from data of H.E.S.S. observations, and demonstrate that a 3D likelihood analysis using this background model yields high-level analysis results that are highly compatible with those obtained from the classical analyses. This validation of the 3D likelihood analysis approach on experimental data is an important step towards using this method for IACT data analysis, and in particular for the analysis of data from the upcoming Cherenkov Telescope Array (CTA).
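The core of such a fit can be sketched without either framework's API: a spectromorphological model (here just a source template plus a flat background template, both invented for illustration) predicts the expected counts in each energy–longitude–latitude bin, and the Poisson (Cash) likelihood is minimized over the model normalizations. Real analyses with ctools or Gammapy additionally fold in the instrument response functions.

```python
# Sketch of a binned 3D (energy x lon x lat) likelihood fit, in the spirit
# of the ctools/Gammapy analyses described above. Model shapes and numbers
# are invented; real analyses fold in effective area, PSF, and energy
# dispersion.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
ne, nx, ny = 5, 20, 20  # energy bins, spatial bins

# Fixed spectral/spatial templates.
x, y = np.meshgrid(np.linspace(-1, 1, nx), np.linspace(-1, 1, ny), indexing="ij")
source_map = np.exp(-(x**2 + y**2) / (2 * 0.1**2))          # point-like source
spectrum = np.geomspace(1.0, 0.1, ne)                        # falling spectrum
source_tpl = spectrum[:, None, None] * source_map            # shape (ne, nx, ny)
bkg_tpl = np.ones((ne, nx, ny)) * spectrum[:, None, None]    # flat background

def expected(params):
    src_norm, bkg_norm = params
    return src_norm * source_tpl + bkg_norm * bkg_tpl

# Simulate counts from "true" normalizations, then fit them back.
counts = rng.poisson(expected([3.0, 2.0]))

def cash(params):
    """Cash statistic: C = 2 * sum(mu - n * ln(mu))."""
    mu = expected(params)
    return 2.0 * np.sum(mu - counts * np.log(mu))

fit = minimize(cash, x0=[1.0, 1.0], bounds=[(1e-6, None)] * 2)
print("fitted (source, background) norms:", fit.x)
```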


2019 ◽  
Vol 9 (9) ◽  
pp. 1827 ◽  
Author(s):  
Je Yeon Lee ◽  
Seung-Ho Choi ◽  
Jong Woo Chung

Precise evaluation of the tympanic membrane (TM) is required for accurate diagnosis of middle ear diseases. However, making an accurate assessment is sometimes difficult. Artificial intelligence is often employed for image processing, especially for high-level analysis tasks such as image classification, segmentation, and matching. In particular, convolutional neural networks (CNNs) are increasingly used in medical image recognition. This study demonstrates the usefulness and reliability of CNNs in recognizing the side of the TM and the presence of a perforation in medical images. A CNN with six layers was constructed. After random assignment of the available images to the training, validation, and test sets, training was performed. The accuracy of the CNN model was then evaluated on a new dataset. A class activation map (CAM) was used to evaluate feature extraction. The CNN model's accuracy in detecting the TM side in the test dataset was 97.9%, whereas its accuracy in detecting the presence of a perforation was 91.0%. Both the side of the TM and the presence of a perforation affected the activation sites. The results show that CNNs can be a useful tool for classifying TM lesions and identifying TM sides. Further research is required to enable real-time analysis and to improve classification accuracy.
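The abstract fixes only the depth of the network (six layers) and the two binary tasks; the sketch below is therefore a hypothetical reading of such an architecture in Keras, with the input size, filter counts, and exact layer mix all assumed for illustration.

```python
# Hypothetical six-layer CNN for one of the binary tasks (e.g. TM side,
# left vs. right). Only the depth comes from the abstract; everything
# else is an assumption.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(128, 128, 3)),       # otoendoscopic image (assumed size)
    layers.Conv2D(16, 3, activation="relu"),   # layer 1
    layers.MaxPooling2D(),                     # layer 2
    layers.Conv2D(32, 3, activation="relu"),   # layer 3
    layers.MaxPooling2D(),                     # layer 4
    layers.Flatten(),
    layers.Dense(64, activation="relu"),       # layer 5
    layers.Dense(1, activation="sigmoid"),     # layer 6: P(class)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```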


2014 ◽  
Vol 556-562 ◽  
pp. 3949-3951
Author(s):  
Jian Xin Zhu

Data mining is a technique that aims to analyze and understand large volumes of source data and to reveal the knowledge hidden in those data. It has been viewed as an important evolution in information processing. It has attracted growing attention from researchers and practitioners because of the wide availability of huge amounts of data and the imminent need to turn such data into valuable information. Over the past decade and more, data mining concepts and techniques have been developed, and some of them have been the subject of higher-level discussion in recent years. Data mining involves an integration of techniques from databases, artificial intelligence, machine learning, statistics, knowledge engineering, object-oriented methods, information retrieval, high-performance computing, and visualization. Essentially, data mining is a high-level analysis technology with a strong orientation toward business profit. Unlike OLTP applications, data mining should provide in-depth data analysis and support for business decisions.
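As one concrete instance of the in-depth analysis that distinguishes data mining from record-at-a-time OLTP, the sketch below counts frequent item pairs across a set of transactions, the core step of the Apriori algorithm for market-basket analysis; the transaction data is invented.

```python
# One concrete data-mining step: counting frequent item pairs across all
# transactions (the core of the Apriori algorithm). Data is invented.
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

min_support = 2  # a pair must appear in at least 2 transactions
pair_counts = Counter(
    pair for basket in transactions for pair in combinations(sorted(basket), 2)
)
frequent = {pair: n for pair, n in pair_counts.items() if n >= min_support}
print(frequent)  # e.g. ('bread', 'milk') occurs in 3 baskets
```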


2015 ◽  
Vol 51 ◽  
pp. 1373-1382 ◽  
Author(s):  
Marcelo de Paiva Guimarães ◽  
Bruno Barberi Gnecco ◽  
Diego Roberto Colombo Dias ◽  
José Remo Ferreira Brega ◽  
Luis Carlos Trevelin

Author(s):  
Nieraj Singh ◽  
Dean Pucsek ◽  
Jonah Wall ◽  
Celina Gibbs ◽  
Martin Salois ◽  
...  

2007 ◽  
Vol 6 (3) ◽  
pp. 215-232 ◽  
Author(s):  
Niklas Elmqvist ◽  
Philippas Tsigas

We present CiteWiz, an extensible framework for visualization of scientific citation networks. The system is based on a taxonomy of citation database usage for researchers, and provides a timeline visualization for overviews and an influence visualization for detailed views. The timeline displays the general chronology and importance of authors and articles in a citation database, whereas the influence visualization is implemented using the Growing Polygons technique, suitably modified to the context of browsing citation data. Using the latter technique, hierarchies of articles with potentially very long citation chains can be graphically represented. The visualization is augmented with mechanisms for parent–child visualization and suitable interaction techniques for interacting with the view hierarchy and the individual articles in the dataset. We also provide an interactive concept map for keywords and co-authorship using a basic force-directed graph layout scheme. A formal user study indicates that CiteWiz is significantly more efficient than traditional database interfaces for high-level analysis tasks relating to influence and overviews, and equally efficient for low-level tasks such as finding a paper and correlating bibliographical data.
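As a minimal stand-in for the concept map's layout (not the CiteWiz implementation itself), the sketch below lays out a small, invented co-authorship graph with the Fruchterman–Reingold force-directed algorithm as implemented in networkx.

```python
# Minimal stand-in for CiteWiz's force-directed concept map (not its
# actual implementation): lay out a small co-authorship graph with the
# Fruchterman-Reingold algorithm from networkx.
import networkx as nx

G = nx.Graph()
# Edge weight = number of co-authored papers (invented data).
G.add_weighted_edges_from([
    ("Elmqvist", "Tsigas", 3),
    ("Elmqvist", "Dragicevic", 2),
    ("Tsigas", "Papatriantafilou", 4),
])

pos = nx.spring_layout(G, weight="weight", seed=42)  # force-directed layout
for author, (x, y) in pos.items():
    print(f"{author:18} ({x:+.2f}, {y:+.2f})")
```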


2021 ◽  
Vol 251 ◽  
pp. 03021
Author(s):  
Yo Sato ◽  
Sam Cunliffe ◽  
Frank Meier ◽  
Anze Zupanc

The Belle II experiment is an upgrade to the Belle experiment, located at the SuperKEKB facility at KEK in Tsukuba, Japan. The Belle II software is completely new and is used for everything from triggering data and generating Monte Carlo events to tracking, clustering, and high-level analysis. One important feature is the matching between combinations of reconstructed objects that form particle candidates and the underlying simulated particles from the event generators. This matching is used to study detector effects, analysis backgrounds, and efficiencies. This document describes the algorithm used by Belle II.
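The actual matching lives in the Belle II software (basf2); the sketch below is a deliberately simplified, generic version of the idea, not the basf2 algorithm: a candidate is considered truth-matched when each of its final-state daughters can be paired, one-to-one, with a generated particle of the same species.

```python
# Much-simplified illustration of MC truth matching (not the basf2
# algorithm): a reconstructed candidate matches when each final-state
# daughter pairs with a distinct generated particle of the same species
# within a small angular distance.
from dataclasses import dataclass
import math

@dataclass
class Particle:
    pdg: int          # PDG particle code
    theta: float      # polar angle (rad)
    phi: float        # azimuthal angle (rad)

def angular_distance(a, b):
    dphi = (a.phi - b.phi + math.pi) % (2 * math.pi) - math.pi  # wrap phi
    return math.hypot(a.theta - b.theta, dphi)

def is_matched(reco_daughters, gen_particles, max_dist=0.05):
    """Greedy one-to-one matching of reconstructed daughters to MC particles."""
    unused = list(gen_particles)
    for reco in reco_daughters:
        candidates = [g for g in unused if g.pdg == reco.pdg
                      and angular_distance(reco, g) < max_dist]
        if not candidates:
            return False
        unused.remove(min(candidates, key=lambda g: angular_distance(reco, g)))
    return True

# A D0 -> K- pi+ candidate checked against the generated event record.
reco = [Particle(-321, 1.20, 0.50), Particle(211, 0.80, -1.10)]
gen  = [Particle(-321, 1.201, 0.502), Particle(211, 0.801, -1.099),
        Particle(22, 0.30, 2.00)]
print(is_matched(reco, gen))  # True
```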

