Proceedings of Balisage: The Markup Conference 2021


Published By Mulberry Technologies, Inc.

9781935958222

Author(s): C. M. Sperberg-McQueen

Eventually, things reach their limit…


Author(s): Gerrit Imsieke, Nina Linn Reinhardt

A JATS customization with a restricted vocabulary that is suitable for publishing metadata has been a desideratum in the JATS community. For some members, the JATS publishing customization (“Blue”) has acquired too many JATS archiving (“Green”) vocabulary items over time. Others want to have a straightforward editing schema without too many alternatives, similar to the authoring (“Pumpkin”) customization, but with support for publishing metadata. This work is an attempt to identify a commonly used subset of Blue (goal: 60% of its elements and attributes) that is able to support at least 90% of the JATS articles found in the wild, where “wild” means several hundred thousand articles sourced from publishers directly and from PubMed Central’s vast collections. In addition, this subset should support the elements and attributes that have been added to JATS only recently and that are therefore unlikely to be found in large numbers within the articles analyzed. An attempt has been made to scrutinize vocabulary items that have been adopted from Green: Is the adoption merely a sign of the creeping “aquafication” of Blue that some suspect, or have these items really been missing from a more prescriptive and widely applicable journal tag set? Items that are considered important to modern publishing for several reasons – accessibility, open access, machine processability – have been included in this proposed subset. Items that were underrepresented in the analyzed set of articles, but are considered fundamental to JATS, have also been retained.
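The kind of corpus analysis this abstract describes can be illustrated with a minimal sketch. It is not the authors' tooling: the function names, the toy three-document corpus, and the "used in at least two documents" threshold are all assumptions made for illustration. The sketch counts how many documents use each element name, picks a candidate subset, and reports what fraction of documents that subset fully supports.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_names(xml_text):
    """Return the set of element names used in one XML document."""
    root = ET.fromstring(xml_text)
    return {el.tag for el in root.iter()}


def document_frequency(corpus):
    """Count, for each element name, how many documents use it."""
    counts = Counter()
    for xml_text in corpus:
        counts.update(element_names(xml_text))
    return counts


def coverage(corpus, subset):
    """Fraction of documents whose element names all fall inside `subset`."""
    supported = sum(1 for xml_text in corpus if element_names(xml_text) <= subset)
    return supported / len(corpus)


# Hypothetical toy corpus; these are not real JATS articles.
corpus = [
    "<article><front><article-meta/></front><body><p/></body></article>",
    "<article><front><article-meta/></front><body><p/><sec><p/></sec></body></article>",
    "<article><front/><body><p/><verse-group/></body></article>",
]

freq = document_frequency(corpus)
# Assumed selection rule: keep names that appear in at least two documents.
subset = {name for name, n in freq.items() if n >= 2}
ratio = coverage(corpus, subset)
print(sorted(subset), ratio)
```

A real analysis would also have to count attributes, handle namespaces, and weigh rarely used but recently added items, as the abstract notes.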


Author(s): Janelle Jenstad, Tracey El Hajj

In late 2018, the Internet Shakespeare Editions (ISE) software experienced catastrophic code failure. In this paper, we describe the boutique markup language used by the ISE (known as IML for ISE Markup Language), various fundamental differences between IML and TEI, and the challenging work of converting and remediating the ISE’s IML-encoded files. Our central question is how to do this work in a principled, efficient, well-documented, replicable, and transferable way. We conclude with recommendations for re-encoding legacy projects and stabilizing them for long-term preservation.


Author(s): Patrick Durusau

This proposal emerges out of conversations about introducing collaborative editing into OpenDocument Format (ODF) applications, as a type of change tracking. Vis-a-vis a document, a lone author is a lesser and included case of collaborative editing. In either case, changes have to be captured, along with their metadata, and reconciled, in the case of conflicting edits. Despite progress on the software side of collaborative editing for a variety of formats, there has been no visible progress on the capturing of changes, or their reconciliation, in OpenDocument Format documents. Being habituated, not to say addicted, to markup approaches, it's understandable that I find the lack of format discussions disquieting. It's all well and good to have change tracking/collaborative editing working successfully in software, but what the hell am I going to write down in ODF? How to capture changes, from one or many authors, and how to capture reconciliations are the focus of this proposal. That requires unique identification of changes (one or many authors), identifying where changes may be applied, and recording the application of changes (the resulting document).


Author(s): Simon St.Laurent

In the late 1990s, multiple groups had plans to transform the technology world, and especially the World Wide Web, with semantic techniques. Over the last two decades, however, semantics seem less and less eager to present themselves as markup.


Author(s): Martin Latterner, Dax Bamberger, Kelly Peters, Jeffrey D. Beck

PubMed Central® (PMC) is a free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine. PMC receives about 70,000 XML articles every month and uses XSLT to convert them into its preferred format. In 2021, PMC started to explore options to modernize its extensive conversion codebase by leveraging XSLT 3.0. This paper describes XML conversion and its challenges at PMC. It then outlines the first approach that PMC is evaluating: breaking a single conversion operation into multiple, dynamic transformations using fn:transform, one of the powerful new tools available with XSLT 3.0.


Author(s): Tony Graham

Moby-Dick by Herman Melville is frequently used as the example document for EPUB and CSS applications. At around 670 pages, it is also a good choice for demonstrating the automated analysis features of AH Formatter. This presentation describes features of working with – and sometimes augmenting, sometimes correcting – the TEI source for the American first edition of Moby-Dick to create a PDF version in the style of the 1851 original.


Author(s): Liam Quin

An important principle of writing, and of programming in particular, is that one should be able to understand any particular passage without having to look elsewhere. Of course, there may be concepts that one needs, but in literature having to consult a dictionary several times in every sentence is tedious; in software engineering, having to read function definitions before understanding the code that calls them can be dangerous. This paper describes experiments with CSS Within (a method of embedding CSS style rules into XSLT transformations) and discusses how the proximity of the rules to the corresponding element generation affects maintenance.


Author(s): Robin La Fontaine

"Which came first," begins an old joke. But the more interesting question might be, "does it even matter?" There are many obvious and several not-so-obvious ways in which the order of items (be they XML elements or attributes, or JSON maps or arrays) can be understood to be significant or insignificant. These are not new questions, and how they’re answered plays out across vocabulary design, schema design, and individual documents. They are important questions when it comes to deciding if two documents are “the same” or “different” and to what extent. This paper challenges the one-size-fits-all decree in XML that order needs to be preserved and reviews the implications of 'order'. When ordered elements can be moved, we have something that has some common ground with orderless information. This paper establishes a continuum between ordered information and orderless information and proposes that these are not as far apart as they might at first appear.
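How the answer to "does order matter?" changes whether two documents count as “the same” can be shown with a small sketch. This is not the comparison method of the paper: the `canonical` and `same` helpers are hypothetical names, and the sketch simplifies by ignoring tail text, so it is not suitable for mixed content.

```python
import xml.etree.ElementTree as ET


def canonical(el, ordered):
    """Build a comparable representation of an element.

    Attributes are always sorted by name, because XML treats attribute
    order as insignificant. Child order is kept only when `ordered` is true;
    otherwise children are sorted so that reorderings compare equal.
    Simplification: tail text (mixed content) is ignored.
    """
    children = [canonical(child, ordered) for child in el]
    if not ordered:
        children.sort()
    text = (el.text or "").strip()
    return (el.tag, tuple(sorted(el.attrib.items())), text, tuple(children))


def same(a, b, ordered=True):
    """Compare two XML strings, treating child order as significant or not."""
    return canonical(ET.fromstring(a), ordered) == canonical(ET.fromstring(b), ordered)


doc_a = '<list><item id="1">x</item><item id="2">y</item></list>'
doc_b = '<list><item id="2">y</item><item id="1">x</item></list>'

print(same(doc_a, doc_b, ordered=True))   # child order treated as significant
print(same(doc_a, doc_b, ordered=False))  # child order treated as insignificant
```

Here one flag applies to the whole tree. Deciding per element whether order is significant, for example from a schema, would sit between these two extremes.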


Author(s): Michael Kay

This paper describes a novel data structure for the representation of Unicode strings, designed to efficiently support the usage patterns that arise when processing XML using languages such as XSLT, XPath, and XQuery.

