Proceedings of Balisage: The Markup Conference 2011

Published By Mulberry Technologies, Inc.

ISBN: 9781935958031

Author(s):  
Ronald P. Reck, Kenneth B. Sall, Wendy A. Swanbeck

As music is a topic of interest to many, it is no surprise that developers have applied web and semantic technology to provide various RDF datasets describing relationships among musical artists, albums, songs, genres, and more. As avid fans of blues and rock music, we wondered whether we could construct SPARQL queries to examine properties and relationships between performers in order to answer global questions such as "Who has had the greatest impact on rock music?" Our primary focus was Eric Clapton, a musical artist with a decades-spanning career who has enjoyed a very successful solo career as well as performing in several world-renowned bands. The application of semantic technology to a public dataset can provide useful insights into how similar approaches can be applied to realistic domain problems, such as finding relationships between persons of interest. A clear understanding of the semantics of the RDF properties available in the dataset is of course crucial, but achieving it is a substantial challenge, especially when leveraging information from similar yet different data sources. This paper explores the use of the DBpedia and MusicBrainz data sources with OpenLink Virtuoso Universal Server and a Drupal front end. Much attention is given to the challenges we encountered, especially with respect to relatively large, community-entered open datasets of varying quality, and to the strategies we employed or recommend to overcome them.
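
The abstract does not reproduce the authors' queries. As a minimal sketch of the kind of SPARQL involved, the following Python snippet asks the public DBpedia endpoint which bands list Eric Clapton as a member; the property names dbo:bandMember and dbo:formerBandMember are assumptions about the current DBpedia ontology, not taken from the paper.

    # Minimal sketch (not the authors' code): query DBpedia for bands that
    # list Eric Clapton as a current or former member.
    from SPARQLWrapper import SPARQLWrapper, JSON

    QUERY = """
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT DISTINCT ?band WHERE {
      { ?band dbo:bandMember dbr:Eric_Clapton }
      UNION
      { ?band dbo:formerBandMember dbr:Eric_Clapton }
    }
    """

    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)

    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["band"]["value"])

Queries like this are only the starting point; answering a question such as "who has had the greatest impact?" then requires deciding which properties (band membership, associated acts, influences) actually carry the intended meaning, which is exactly the semantic challenge the authors describe.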


Author(s):  
Walter E. Perry

The IPSA RE (Identity, Provenance, Structure, Aptitude, Revision, and Events) architecture describes and processes the linkages that relate document resources within a network or graph of nodes. The architecture can be formalized as the manipulation of document resource nodes by each of the four fundamental RESTful verbs. The system under development handles identity authentication by realizing identity as a chain of provenance, or as a unique structure of context, rather than relying on any internal attributes of a node to establish either that node's identity or the identity of any entity for which that node is a proxy.
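
The abstract gives no implementation details. Purely as an illustration of the general idea of deriving identity from a chain of provenance rather than from a node's internal attributes, a hash-chained record might look like the following sketch; the field names and events are hypothetical, not drawn from IPSA RE.

    # Hypothetical sketch: a node's identity is the digest of its provenance
    # chain (who derived it, from what, and when), not any attribute stored
    # inside the node itself.
    import hashlib
    import json

    def chain_id(parent_id: str, event: dict) -> str:
        """Derive a new identity from the parent identity plus the event
        (actor, action, timestamp) that produced this revision."""
        payload = json.dumps({"parent": parent_id, "event": event}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    root = chain_id("", {"actor": "registrar", "action": "create", "ts": "2011-08-02"})
    rev1 = chain_id(root, {"actor": "editor", "action": "revise", "ts": "2011-08-03"})

    # Two nodes with identical content but different histories get different
    # identities, and a node cannot claim an identity without the chain that
    # produces it.
    print(root)
    print(rev1)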


Author(s):  
David A. Lee

JSON and XML are seen by some as competing markup formats for content and data. JSON has become predominant in the mobile and browser domains, while XML dominates the server, enterprise, and document domains. Where these domains meet and need to exchange information, there is pressure for each domain to impose its markup format on the other. JXON is an architecture that addresses this problem by providing high-quality bidirectional transformations between XML and JSON. Previous approaches provide only a single mapping intended to cover all cases, but generally cover few cases well. JXON uses XML Schema and annotations to allow highly customizable transformations that can be tuned for individual schemas, elements, attributes, and types, yet still be easily configured.
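
JXON's mapping rules are not spelled out in the abstract. The deliberately naive conversion below (standard-library Python, not JXON) shows the decisions that any single fixed XML-to-JSON mapping has to guess at, and that schema-driven annotations can instead decide per element, attribute, and type.

    # Not JXON: a naive XML-to-JSON conversion that illustrates the choices
    # a fixed mapping must guess at.
    import json
    import xml.etree.ElementTree as ET

    def naive(elem):
        children = list(elem)
        if not children:
            return elem.text        # string? number? boolean? a schema would know
        out = {}
        for child in children:
            value = naive(child)
            if child.tag in out:    # a repeated element becomes an array...
                if not isinstance(out[child.tag], list):
                    out[child.tag] = [out[child.tag]]
                out[child.tag].append(value)
            else:                   # ...but only when it happens to repeat
                out[child.tag] = value
        return out

    doc = ET.fromstring("<order><item>widget</item><qty>2</qty></order>")
    print(json.dumps({doc.tag: naive(doc)}, indent=2))
    # Result (condensed): {"order": {"item": "widget", "qty": "2"}}
    # The quantity stays a string, attributes are dropped, and an order with a
    # single item never becomes an array; per-schema annotations are one way to
    # make these decisions explicit and reversible.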


Author(s):  
O'Neil Davion Delpratt, Michael Kay

This paper attempts to analyze the performance benefits that are achievable by adding a code generation phase to an XSLT or XQuery engine. This is not done in isolation, but in comparison with the benefits delivered by high-level query rewriting. The two techniques are complementary and independent, but they can compete for resources in the development team, so it is useful to understand their relative importance. We use the Saxon XSLT/XQuery processor as a case study, since it can now translate the logic of queries into Java bytecode. We provide an experimental evaluation of the performance of Saxon with this feature added, compared with the existing Saxon product. Saxon's Enterprise Edition already delivers a performance benefit over the open-source product through its join optimizer and other features. What can we learn from these techniques to achieve further performance gains through direct bytecode generation?
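
Saxon's bytecode generator is not shown in the abstract. As a language-neutral illustration of the underlying distinction, interpreting an expression tree for every item versus generating code for it once, compare the two evaluation strategies in this toy sketch (a made-up expression language, not XSLT or XQuery).

    # Toy illustration of interpretation vs. code generation (not Saxon).
    import timeit

    # Expression tree for: price * qty > 100
    TREE = (">", ("*", ("var", "price"), ("var", "qty")), ("const", 100))

    def interpret(node, env):
        op = node[0]
        if op == "const":
            return node[1]
        if op == "var":
            return env[node[1]]
        left, right = interpret(node[1], env), interpret(node[2], env)
        return left * right if op == "*" else left > right

    def generate(node):
        """Translate the tree once into Python source, then compile it."""
        def emit(n):
            op = n[0]
            if op == "const":
                return repr(n[1])
            if op == "var":
                return f"env[{n[1]!r}]"
            return f"({emit(n[1])} {op} {emit(n[2])})"
        return eval(compile(f"lambda env: {emit(node)}", "<generated>", "eval"))

    env = {"price": 12, "qty": 10}
    compiled = generate(TREE)
    assert interpret(TREE, env) == compiled(env) == True
    print("interpreted:", timeit.timeit(lambda: interpret(TREE, env), number=100_000))
    print("generated:  ", timeit.timeit(lambda: compiled(env), number=100_000))

The interesting question the paper raises is how much of this kind of gap remains once high-level rewriting has already removed the most expensive work.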


Author(s):  
Ravit H. David, Shahin Ezzat Sahebi, Bartek Kawula, Dileshni Jayasinghe

This paper is a summary of our experience of locally loading XML Ebooks on the Scholars Portal Ebook platform. It discusses the problems and potential of local loading that emerged from a pilot load of over 500 titles; the first stage of this pilot was completed in February 2011. More specifically, the paper will review the difficulties encountered during the various stages of the loading, starting with loading files onto our MarkLogic server, continuing with the presentation of content via XSLT, and ending with transforming the table of contents to support functionality such as single-chapter downloading. We will also touch upon our web reader and the features developed to enhance the reading experience of XML Ebooks. Our conclusion is that as publishers gradually switch from PDF to XML, the need for a standard for XML Ebooks increases as well; local loading of XML Ebooks in their current format suggests that much programming work will be called for in order to arrive at the best presentation of the content. Finally, we will suggest that even once a satisfactory web-based presentation of XML Ebooks is achieved, there will still be an urgent need to develop good readers in order to provide a friendly reading experience.
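
The paper's XSLT is not included in the abstract. As a rough sketch of the table-of-contents transformation described, deriving per-chapter download links from a book's TOC, here is an lxml version; the element names and URL pattern are invented for illustration, not Scholars Portal's actual markup.

    # Illustrative only: turn a TOC into per-chapter download links.
    from lxml import etree

    TOC = etree.fromstring("""
    <toc book="ebook-042">
      <chapter id="c1"><title>Introduction</title></chapter>
      <chapter id="c2"><title>Methods</title></chapter>
    </toc>""")

    nav = etree.Element("ul")
    for chap in TOC.iter("chapter"):
        li = etree.SubElement(nav, "li")
        a = etree.SubElement(li, "a")
        a.set("href", f"/books/{TOC.get('book')}/chapters/{chap.get('id')}.xml")
        a.text = chap.findtext("title")

    print(etree.tostring(nav, pretty_print=True).decode())

The abstract describes this kind of derivation as part of an XSLT presentation layer; the sketch above only shows the shape of the mapping.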


Author(s):  
Daan Broeder, Oliver Schonefeld, Thorsten Trippel, Dieter Van Uytvanck, Andreas Witt

XML has been designed for creating structured documents, but the information that is encoded in these structures is, by definition, out of scope for XML itself. Additional sources that are not normally easy for computers to interpret, such as documentation, are needed to determine the intention of specific tags in a tag set. The Component Metadata Infrastructure (CMDI) takes a rather pragmatic approach to fostering interoperability between XML instances in the domain of metadata descriptions for language resources. This paper gives an overview of this approach.
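
The abstract does not show CMDI markup. The essential move, grounding each element of a metadata component in a shared concept registry so that differently named tags remain comparable, might be sketched as follows; the component name, fields, and registry URIs are invented examples in the spirit of CMDI, not actual CMDI profiles.

    # Hypothetical sketch in the spirit of CMDI: every field in a metadata
    # component carries a link to a concept-registry entry, so projects that
    # choose different tag names can still be mapped to the same concept.
    import xml.etree.ElementTree as ET

    CONCEPTS = {
        "resourceTitle": "http://example.org/concept-registry/2345",
        "contactPerson": "http://example.org/concept-registry/2521",
    }

    def component(name, fields):
        comp = ET.Element(name)
        for tag, value in fields.items():
            el = ET.SubElement(comp, tag, {"ConceptLink": CONCEPTS[tag]})
            el.text = value
        return comp

    record = component("CorpusDescription", {
        "resourceTitle": "Dutch Spoken Corpus",
        "contactPerson": "Jane Doe",
    })
    print(ET.tostring(record, encoding="unicode"))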


Author(s):  
Allen H. Renear, Richard J. Urban, Karen M. Wickett

What does markup mean? We may have some intuitive sense of the intent when markup uses terminology that looks familiar, but does it really say what we think it does? Looking at a specialized case, metadata, the authors examine the formal logical implications behind seemingly familiar metadata records.
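
The authors' formalization is not reproduced in the abstract. As a hypothetical illustration of the kind of question at stake, a record asserting dc:creator = "Herman Melville" for a resource r admits at least two first-order readings, and they license different inferences; only the second supports identifying the creator with other references to the same person.

    % Hypothetical example, not taken from the paper.
    % Reading 1: a claim about a name string attached to r
    \mathrm{creatorName}(r, \text{``Herman Melville''})
    % Reading 2: an existential claim about a person who created r
    \exists x\,\bigl(\mathrm{Person}(x) \wedge \mathrm{created}(x, r)
        \wedge \mathrm{name}(x, \text{``Herman Melville''})\bigr)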


Author(s):  
Jacques Durand, Hyunbo Cho, Dale Moberg, Jungyub Woo

XML has proved to be a scalable archival format for messages of various kinds (e.g., email with MarkMail). It is also increasingly used as the format of choice for several event models and taxonomies (XES, OASIS/SAF, CEE, XDAS) that need to be both machine-processable and human-readable. As many eBusiness processes also rely on XML for message content and/or protocol, there is a need to monitor and validate the messages and documents being exchanged, as well as their sequences. XTemp is an event-centric XML vocabulary and execution language intended for the analysis of sequences of events that represent traces of business processes. It is designed for both log analysis and real-time execution, and it leverages XPath and XSLT.
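
XTemp itself is not shown in the abstract. As a rough illustration of the kind of sequencing rule it targets, expressed here with ordinary XPath over an XML event log rather than in XTemp, the following checks that every order event is later followed by an acknowledgement with the same message id; the element and attribute names are invented.

    # Illustrative only, not XTemp: check a sequencing rule over an event log.
    from lxml import etree

    LOG = etree.fromstring("""
    <events>
      <event type="order" msg="m1" ts="2011-08-02T10:00:00Z"/>
      <event type="ack"   msg="m1" ts="2011-08-02T10:00:05Z"/>
      <event type="order" msg="m2" ts="2011-08-02T10:01:00Z"/>
    </events>""")

    violations = [
        order.get("msg")
        for order in LOG.xpath("//event[@type='order']")
        if not order.xpath("following-sibling::event[@type='ack'][@msg=$m]",
                           m=order.get("msg"))
    ]
    print("unacknowledged orders:", violations)   # ['m2']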


Author(s):  
Eric van der Vlist

We all know (and worry) about SQL injection, but should we also worry about XQuery injection? With the power of extension functions and the implementation of XQuery update features, the answer is clearly yes! We will see how an attacker can send information to an external site or erase a collection through XQuery injection on a naive and unprotected application using the eXist-db REST API. That's the bad news... The good news is that it is quite easy to protect your application from XQuery injection after this word of warning. We'll discuss a number of simple techniques (literal string escaping, wrapping values in elements, or moving them out of queries and into HTTP parameters) and see how to implement them in different environments, covering traditional programming languages, XSLT, XForms, and pipeline languages.
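
The talk's demonstration code is not reproduced here. A minimal sketch of the first technique mentioned, literal string escaping, applied before interpolating a user-supplied value into an XQuery sent to the eXist-db REST interface, might look like this; the server URL and collection are placeholders.

    # Minimal sketch of literal string escaping for XQuery (placeholder URL).
    import requests

    def xq_string(value):
        """Return value as an XQuery string literal: wrap in double quotes and
        double any embedded double quotes (XQuery's escape rule)."""
        return '"' + value.replace('"', '""') + '"'

    user_input = 'Clapton" or doc-available("secret.xml")'   # attempted injection

    query = "//artist[name = " + xq_string(user_input) + "]"
    # //artist[name = "Clapton"" or doc-available(""secret.xml"")"]
    # The whole payload remains a single harmless string value inside the query.

    resp = requests.get(
        "http://localhost:8080/exist/rest/db/music",
        params={"_query": query},
    )
    print(resp.status_code)

The other techniques listed, wrapping values in elements or keeping them outside the query as HTTP parameters, avoid string surgery altogether.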


Author(s):  
Wendell Piez

Markup languages that attempt not only to support particular applications, but to provide encoding standards for decentralized communities, face a particular problem: how do they adapt to new requirements for data description? The most usual approach is a schema extensibility mechanism, but many projects avoid them, since they fork the local application from the core tag set, complicating implementation, maintenance, and document interchange and thus undermining many of the advantages of using a standard. Yet the easy alternative, creatively reusing and abusing available elements and attributes, is even worse: it introduces signal disguised as noise, degrades the semantics of repurposed elements and hides the interchange problem without solving it. This dilemma follows from the way we have conceived of our models for text. If designing an encoding format for one work must compromise its fitness for any other – because the clean and powerful descriptive markup for one kind of text is inevitably unsuitable for another – we will always be our own worst enemies. Yet texts “in the wild” are purposefully divergent in the structures, features and affordances of their design at both micro and macro levels. This suggests that at least in tag sets intended for wide use across decentralized communities, we must support design innovation not only in the schema, but in the instance – in particular documents and sets of documents. By defining, in the schema, a set of abstract generic elements for microformats, we can appropriate tag abuse (at one time making it unnecessary and capturing the initiative it represents), expose significant and useful semantic variation, and support bottom-up development of new semantic types.
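
The paper's proposed tag set is not given in the abstract. Purely as a sketch of the idea of abstract generic elements whose specific semantics are declared in the document instance, the fragment below binds local type tokens to definitions and then checks their use; the element and attribute names are invented, not the author's.

    # Hypothetical sketch: a generic inline element <seg type="..."> whose type
    # tokens are defined in the instance itself, so new semantics can be
    # introduced per document without changing the schema.
    import xml.etree.ElementTree as ET

    DOC = ET.fromstring("""
    <article>
      <defs>
        <typedef name="stage-direction" gloss="performance instruction in a play"/>
      </defs>
      <p>Enter <seg type="stage-direction">three witches</seg>.</p>
      <p>Rhymes follow an <seg type="rhyme-scheme">ABAB</seg> pattern.</p>
    </article>""")

    declared = {t.get("name") for t in DOC.iter("typedef")}
    for seg in DOC.iter("seg"):
        status = "declared" if seg.get("type") in declared else "UNDECLARED"
        print(seg.get("type"), "->", status)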

