Proceedings of Balisage: The Markup Conference 2011

Published by Mulberry Technologies, Inc.
ISBN: 9781935958031

Author(s):  
Ronald P. Reck ◽  
Kenneth B. Sall ◽  
Wendy A. Swanbeck

As music is a topic of interest to many, it is no surprise that developers have applied web and semantic technology to provide various RDF datasets describing relationships among musical artists, albums, songs, genres, and more. As avid fans of blues and rock music, we wondered whether we could construct SPARQL queries that examine properties and relationships between performers in order to answer global questions such as "Who has had the greatest impact on rock music?" Our primary focus was Eric Clapton, a musical artist with a decades-spanning career who has enjoyed both a very successful solo career and stints in several world-renowned bands. The application of semantic technology to a public dataset can provide useful insights into how similar approaches can be applied to realistic domain problems, such as finding relationships between persons of interest. A clear understanding of the semantics of the available RDF properties is of course crucial, but it is a substantial challenge, especially when leveraging information from similar yet different data sources. This paper explores the use of the DBpedia and MusicBrainz data sources with OpenLink Virtuoso Universal Server and a Drupal frontend. Much attention is given to the challenges we encountered, especially with respect to relatively large datasets drawn from community-entered open data sources of varying quality, and to the strategies we employed or recommend to overcome them.
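A minimal sketch of the kind of query the abstract describes: asking DBpedia which bands an artist has been associated with. The property and resource names follow DBpedia conventions (`dbo:associatedBand`, `dbr:Eric_Clapton`), but this is an illustrative assumption, not a query taken from the paper.

```python
# Build a SPARQL query against DBpedia listing bands associated with an
# artist. Property/resource names are illustrative DBpedia conventions,
# not the authors' actual queries.

def bands_query(artist: str) -> str:
    """Return a SPARQL query for bands associated with `artist` (a DBpedia resource name)."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT DISTINCT ?band WHERE {{
    dbr:{artist} dbo:associatedBand ?band .
}}
"""

print(bands_query("Eric_Clapton"))
```

Such a query would typically be POSTed to a SPARQL endpoint (for example, a Virtuoso instance as used in the paper); chaining properties like this across artists is what makes "influence" questions answerable, and is also where the data-quality issues the authors describe surface.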


Author(s):  
Walter E. Perry

The IPSA RE (Identity, Provenance, Structure, Aptitude, Revision, and Events) architecture describes and processes the linkages that relate document resources within a network or graph of nodes. The architecture can be formalized as the manipulation of document resource nodes with each of the four fundamental RESTful verbs. The system under development handles identity authentication by realizing identity as a chain of provenance or as a unique structure of context, rather than relying on any internal attributes of a node to establish either that node's identity or the identity of any entity for which that node is a proxy.


Author(s):  
David A. Lee

JSON and XML are seen by some as competing markup formats for content and data. JSON has become predominant in the mobile and browser domains, while XML dominates the server, enterprise, and document domains. Where these domains meet and need to exchange information, there is pressure for each domain to impose its markup format on the other. JXON is an architecture that addresses this problem by providing high-quality bidirectional transformations between XML and JSON. Previous approaches provide only a single mapping intended to cover all cases, but generally cover few cases well. JXON uses schemas and annotations to allow highly customizable transformations that can be tuned for individual schemas, elements, attributes, and types, yet still be easily configured.
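To see why a single fixed mapping "covers few cases well", consider a naive XML-to-JSON conversion. The sketch below (an illustration, not JXON's actual mapping) must pick arbitrary conventions for attributes, text content, and repeated children; any fixed choice suits some schemas and fails others, which is the gap JXON's per-schema annotations address.

```python
# A naive, one-size-fits-all XML-to-JSON mapping: attributes are
# prefixed with '@', text content goes under '#text', and every child
# element becomes a list (since repetition can't be known without a
# schema). Illustrative only; not JXON's actual mapping.

import json
import xml.etree.ElementTree as ET

def xml_to_json(elem):
    """Map an Element to a dict using fixed, schema-unaware conventions."""
    node = {f"@{k}": v for k, v in elem.attrib.items()}
    for child in elem:
        node.setdefault(child.tag, []).append(xml_to_json(child))
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    return node

doc = ET.fromstring('<book id="b1"><title>JXON</title><author>Lee</author></book>')
print(json.dumps({doc.tag: xml_to_json(doc)}, indent=2))
```

Note the costs of the fixed conventions: `title` becomes a one-element list even though the schema may guarantee it is singular, and typed values (numbers, booleans) stay as strings. Schema-driven annotations let a converter make these decisions per element and per type.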


Author(s):  
O'Neil Davion Delpratt ◽  
Michael Kay

This paper analyzes the performance benefits achievable by adding a code generation phase to an XSLT or XQuery engine. This is not done in isolation, but in comparison with the benefits delivered by high-level query rewriting. The two techniques are complementary and independent, but they can compete for resources in the development team, so it is useful to understand their relative importance. We use the Saxon XSLT/XQuery processor as a case study, in which we can now translate the logic of queries into Java bytecode. We provide an experimental evaluation of the performance of Saxon with the addition of this feature compared to the existing Saxon product. Saxon's Enterprise Edition already delivers a performance benefit over the open source product through its join optimizer and other features. What can we learn from these optimizations to achieve further performance gains through direct bytecode generation?


Author(s):  
Ravit H. David ◽  
Shahin Ezzat Sahebi ◽  
Bartek Kawula ◽  
Dileshni Jayasinghe

This paper summarizes the experience of locally loading XML Ebooks onto the Scholars Portal Ebook platform. It discusses the problems and potential of local loading that emerged from a pilot of over 500 titles; the first stage of this pilot was completed in February 2011. More specifically, the paper reviews the difficulties encountered during the various stages of loading, starting with loading files onto our MarkLogic server, continuing with the presentation of content via XSLT, and ending with transforming the table of contents to support functionality such as single-chapter downloading. We also touch upon our web reader and the features developed to enhance the reading experience of XML Ebooks. Our conclusion is that as publishers gradually switch from PDF to XML, the need for a standard for XML Ebooks increases as well; local loading of XML Ebooks in their current form suggests that much programming work will be needed to arrive at the best presentation of the content. Finally, we suggest that even once a satisfactory web-based presentation of XML Ebooks is achieved, there will still be an urgent need to develop good readers in order to provide a friendly reading experience.


Author(s):  
Daan Broeder ◽  
Oliver Schonefeld ◽  
Thorsten Trippel ◽  
Dieter Van Uytvanck ◽  
Andreas Witt

XML has been designed for creating structured documents, but the information that is encoded in these structures is, by definition, out of scope for XML. Additional sources that are normally not easily interpretable by computers, such as documentation, are needed to determine the intention of specific tags in a tag set. The Component Metadata Infrastructure (CMDI) takes a rather pragmatic approach to fostering interoperability between XML instances in the domain of metadata descriptions for language resources. This paper gives an overview of this approach.


Author(s):  
Allen H. Renear ◽  
Richard J. Urban ◽  
Karen M. Wickett

What does markup mean? We may have some intuitive sense of the intent when markup uses terminology that looks familiar, but does it really say what we think it does? Looking at a specialized case, metadata, the authors examine the formal logical implications behind seemingly familiar metadata records.


Author(s):  
Jacques Durand ◽  
Hyunbo Cho ◽  
Dale Moberg ◽  
Jungyub Woo

XML has proved to be a scalable archival format for messages of various kinds (e.g. email, with MarkMail). It is also increasingly used as the format of choice for several event models and taxonomies (XES, OASIS/SAF, CEE, XDAS) that need to be both processable and human-readable. As many eBusiness processes also rely on XML for message content and/or protocol, there is a need to monitor and validate the messages and documents being exchanged, as well as their sequences. XTemp is an XML vocabulary and execution language that is event-centric and intended for the analysis of sequences of events that represent traces of business processes. It is designed for both log analysis and real-time execution, and it leverages XPath and XSLT.


Author(s):  
Eric van der Vlist

We all know (and worry) about SQL injection, but should we also worry about XQuery injection? With the power of extension functions and the implementation of XQuery update features, the answer is clearly yes! We will see how an attacker can send information to an external site or erase a collection through XQuery injection on a naive and unprotected application using the eXist-db REST API. That's the bad news... The good news is that it's quite easy to protect your application from XQuery injection after this word of warning. We'll discuss a number of simple techniques (escaping string literals, wrapping values in elements, or moving them out of queries and into HTTP parameters) and see how to implement them in different environments, covering traditional programming languages, XSLT, XForms, and pipeline languages.
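The first mitigation listed, literal string escaping, can be sketched briefly. In XQuery, a quotation mark inside a double-quoted string literal is escaped by doubling it, so escaping user input before interpolation prevents it from breaking out of the literal. The helper names below are illustrative assumptions, not code from the paper.

```python
# Sketch of literal string escaping for XQuery. In XQuery, `"` inside a
# double-quoted string literal is written `""`, so doubling quotes keeps
# untrusted input confined to the literal. Illustrative helper, not the
# paper's code.

def escape_xquery_literal(value: str) -> str:
    """Escape a value for use inside a double-quoted XQuery string literal."""
    return value.replace('"', '""')

def find_user_query(name: str) -> str:
    """Build an XQuery path expression matching a user by name."""
    # Without escaping, a name like  x"] | doc("secret.xml")//*["x" = "
    # would terminate the literal and splice attacker XQuery into the query.
    return f'//user[@name = "{escape_xquery_literal(name)}"]'

print(find_user_query("alice"))
print(find_user_query('x"] | doc("secret.xml")//*["x" = "'))
```

Escaping addresses only string-literal contexts; the other techniques mentioned (wrapping values in elements, or passing them as HTTP parameters bound to external variables) keep untrusted data out of the query text entirely, which is the more robust option where the environment supports it.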


Author(s):  
Julien Seinturier ◽  
Elisabeth Murisasco ◽  
Emmanuel Bruno

This paper presents an XML engine defined to model and query multimodal concurrent annotated data. This work stands in the context of the OTIM (Tools for Multimodal Annotation) project, which aims at developing conventions and tools for multimodal annotation of a large conversational French speech corpus and brings together social science and computer science researchers. Within OTIM, our objective is to provide linguists with a unique framework to encode and manipulate numerous linguistic domains: morpho-syntax, prosody, phonetics, disfluencies, discourse, gesture, and posture. For that, it must be possible to bring together and align all the different pieces of information (called annotations) associated with a corpus. We propose a complete pipeline from the annotation step to the management of the data within an XML information system. This pipeline first relies on the formalization of the linguistic knowledge and data in an OTIM-specific XML format. A Java framework is proposed for interfacing with both the linguists' specific annotation tools and the XML information system. Finally, the querying of multimodal annotations within the XML information system using XQuery is presented. As annotations are time-aligned, an extension of XQuery to Allen's temporal relations is proposed. The paper concludes with a discussion of the merits of a pure XML approach for a linguistic annotation information system and the question of integrating semantics within the pipeline.
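Allen's interval algebra, which the proposed XQuery extension draws on, relates two time intervals through a fixed set of relations. A minimal sketch of the core relations over (start, end) pairs (the function names follow Allen's terminology; how the extension itself names them is an assumption):

```python
# Core relations from Allen's interval algebra over (start, end) pairs.
# Names follow Allen (1983); the XQuery extension's own vocabulary may differ.

def before(a, b):   return a[1] < b[0]                      # a ends before b starts
def meets(a, b):    return a[1] == b[0]                     # a ends exactly where b starts
def overlaps(a, b): return a[0] < b[0] < a[1] < b[1]        # partial overlap, a first
def during(a, b):   return b[0] < a[0] and a[1] < b[1]      # a strictly inside b
def starts(a, b):   return a[0] == b[0] and a[1] < b[1]     # same start, a ends first
def finishes(a, b): return a[1] == b[1] and b[0] < a[0]     # same end, a starts later
def equal(a, b):    return a == b

# e.g. testing whether a gesture annotation falls during a speech turn:
gesture, turn = (2.0, 3.5), (1.0, 5.0)
print(during(gesture, turn))  # True
```

In the multimodal setting this is exactly the kind of question linguists ask, e.g. "find all gestures that occur during a disfluency", which a plain XQuery engine cannot express directly over time-stamped annotations.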

