Proceedings of Balisage: The Markup Conference 2019

Total documents: 24 (five years: 24)
H-index: 2 (five years: 2)
Published by: Mulberry Technologies, Inc.
ISBN: 9781935958208

Author(s):
Alexander B. Schwarzman, Jennifer Mayfield

As is typical of many society publishers, OSA (The Optical Society) has both a journal program and a conference program. Integrating journal articles and conference papers within a single data source opens a pathway to business intelligence analysis over the entire corpus of published research material, which can benefit both programs and advance the society’s mission. In 2017, having successfully completed a project to convert almost 100 years of its journal legacy material to JATS XML, OSA decided to convert its conference content as well, to tag it in a JATS-compatible way, and to combine both content segments in a single MarkLogic database. While it is well accepted that JATS and BITS cover the markup needs for journal and book content, respectively, it is less clear which Tag Set is most suitable for tagging conference proceedings. Even though we thought “we had seen it all” in converting journal content, in the course of the project we learned that handling conference metadata and handling journal metadata present very different challenges. In this paper, we share our experience using BITS to mark up individual conference papers and describe how our business decisions shaped the way we structure the XML. We demonstrate that because BITS was explicitly designed to enable the construction of books composed of units that can be part of many collections, the BITS metadata model is well suited to representing a conference paper’s nested collections, both event-related and publication-related. To ensure data quality, we built workflows, designed XML tools (e.g., a Tag Subset and Schematron rules), and instituted visual QC procedures that allowed us to achieve our objective. We conclude with lessons learned from this project and the new opportunities its successful completion has opened up.
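
To make the idea concrete, here is a rough, hypothetical sketch of that metadata pattern (the content is invented, the nesting is simplified relative to the real BITS content models, and it is not OSA’s actual tagging): a conference paper distributed as a BITS book-part-wrapper can carry one collection-meta for the event at which it was presented and another for the publication series in which it appears.

<!-- Hypothetical sketch only: content is invented, the nesting is
     simplified relative to the BITS content models, and this is not
     OSA's actual tagging. -->
<book-part-wrapper>
  <!-- Event-related collection: the conference at which the paper was presented -->
  <collection-meta collection-type="conference">
    <title-group>
      <title>Example Optics Conference 2017</title>
    </title-group>
  </collection-meta>
  <!-- Publication-related collection: the proceedings in which the paper appears -->
  <collection-meta collection-type="proceedings">
    <title-group>
      <title>Example Proceedings Series</title>
    </title-group>
  </collection-meta>
  <book-part book-part-type="conference-paper">
    <book-part-meta>
      <title-group>
        <title>An Example Conference Paper</title>
      </title-group>
    </book-part-meta>
  </book-part>
</book-part-wrapper>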


Author(s):  
James David Mason

Markup fanatics have long cried, “We need to see the markup!” Yet since the earliest stages of developing the SGML standard, there has been an urge even among standards developers to avoid having to write tags everywhere. The recent urge to create “Invisible XML” is but the latest symptom of a smoldering disease, from which I too suffer.


Author(s):  
Sam Wilmott

The field of programming languages is in continual flux: new languages come along every few years. In the field of text and markup processing languages, things have settled down somewhat, with XSLT dominating and a few other languages like OmniMark filling in the gaps, but it is no more exempt from change than any other application area. Change is not always improvement, but we must always strive for improvement, so one hopes that improvement is still possible in the text and markup language field. This paper starts with an overview of some existing text and markup processing languages and concludes with an outline and examples of a new programming language that I hope will make text and markup processing easier than it is now, or at least suggest directions in which things can go.


Author(s):
Norman Walsh, Achim Berndzen

XProc 3.0 is an XML pipeline language for constructing markup-centric workflows. With a rich vocabulary of steps and modern control structures, it allows authors to build complex pipelines easily.
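
As a small taste of the language (a minimal sketch only; the stylesheet URI is a placeholder), the following pipeline expands XIncludes in its input and then applies an XSLT transformation:

<!-- Minimal XProc 3.0 sketch: expand XIncludes, then transform.
     The stylesheet URI is a placeholder. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Expand XInclude references in the source document -->
  <p:xinclude/>

  <!-- Transform the expanded document with the (placeholder) stylesheet -->
  <p:xslt>
    <p:with-input port="stylesheet" href="style.xsl"/>
  </p:xslt>
</p:declare-step>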


Author(s):  
Chandi Perera

Around 15% of the global population has a permanent disability, including approximately 285 million people with a visual impairment and an estimated 700 million people with dyslexia, the most common form of learning disability. The World Blind Union estimates that less than 10% of published works are made available in accessible formats in developed countries, a figure that drops to less than 1% in developing countries. As markup professionals and content-modeling experts, there is a lot we can do to make a positive impact toward making more content accessible. This session will look at accessibility; our social, ethical, and legal responsibilities around content accessibility; and what we can do to make content more accessible.


Author(s):  
Jean Paoli

Some of us building software need to take a hard look in the mirror. For years, we have promised that technology would solve the world’s information management problems, but 85% of business information is still “dark data,” with potentially useful insights lost in a rising tide of disconnected documents, emails, Slack conversations, voice-to-text messages, etc. We need an effective approach to documents and want to start a public conversation about these issues. We believe that effective solutions should be based on: Declarative Markup; AI sympathetic to “Small Data”; a focus on company-specific documents; applying AI to documents as a whole; and approaches that do not disrupt existing workflows or require massive investment. The future is not about AI making human beings obsolete; the future is about AI making human beings and companies more productive, effective, and creative.


Author(s):  
Joshua Lubell

The Security Content Automation Protocol (SCAP) schema for source data stream collections standardizes the requirements for packaging Extensible Markup Language (XML) security content into bundles for easy deployment. SCAP bundles must be self-contained, such that each bundle contains all necessary information without external references, and reversible, such that XML components are unmodified when unbundled and re-bundled into new collections. These requirements (along with the need for very long, globally unique identifiers) make authoring and bundling the content a challenge. SCAP Composer, a software application that uses a Darwin Information Typing Architecture (DITA) specialized element type for source data stream collections, makes the authoring process easier. SCAP Composer takes an incremental approach to aiding SCAP content authors: it helps only with creating source data stream collections; it does not offer any help with creating the XML resources encapsulated in a data stream collection. SCAP Composer is implemented using the DITA Open Toolkit and can be used with any DITA authoring software that includes the Toolkit, or with a standalone Toolkit.
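
In outline, a source data stream collection looks something like the skeleton below, which also shows the long, globally unique identifiers those requirements call for (the identifiers and references here are invented for illustration, and the component payloads and most other required content are omitted).

<!-- Abbreviated, illustrative skeleton of a SCAP source data stream
     collection. Identifiers and references are invented, and the
     component payloads are omitted. -->
<ds:data-stream-collection
    xmlns:ds="http://scap.nist.gov/schema/scap/source/1.2"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    id="scap_org.example_collection_sample"
    schematron-version="1.3">
  <ds:data-stream id="scap_org.example_datastream_sample"
                  scap-version="1.3" use-case="CONFIGURATION"
                  timestamp="2019-07-30T00:00:00">
    <ds:checklists>
      <!-- Reference to the XCCDF checklist component carried below -->
      <ds:component-ref id="scap_org.example_cref_xccdf"
                        xlink:href="#scap_org.example_comp_xccdf"/>
    </ds:checklists>
    <ds:checks>
      <!-- Reference to the OVAL definitions component carried below -->
      <ds:component-ref id="scap_org.example_cref_oval"
                        xlink:href="#scap_org.example_comp_oval"/>
    </ds:checks>
  </ds:data-stream>
  <!-- The referenced XML resources are carried, unmodified, as components -->
  <ds:component id="scap_org.example_comp_xccdf"
                timestamp="2019-07-30T00:00:00">
    <!-- XCCDF benchmark content goes here -->
  </ds:component>
  <ds:component id="scap_org.example_comp_oval"
                timestamp="2019-07-30T00:00:00">
    <!-- OVAL definitions go here -->
  </ds:component>
</ds:data-stream-collection>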


Author(s):  
Jeffrey Beck

Maximal flexibility of rules, or ease of reuse — choose one. The tighter the rules, the more consistent documents will be and the easier it will be to reuse them, but only if the rules are reasonable enough to be adopted. (If all the data creators ignore the rules, reuse doesn’t get easier.) JATS4R (JATS for Reuse) is a NISO working group devoted to optimizing the reusability of scholarly content by developing best-practice recommendations for tagging content in JATS XML. The group has devoted particular attention to the flexibility/reuse tradeoff for rules on attribute use and controlled values, and we eventually decided that we needed some rules for ourselves: rules on how to write rules for attribute values in our recommendations. In developing that guidance document, we learned (or at least articulated) some things along the way.
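
As a concrete (and purely hypothetical, not an actual JATS4R recommendation) example of the kind of rule in question, a requirement that an attribute be present and drawn from a controlled list can be expressed as a Schematron assertion:

<!-- Hypothetical example of an attribute-value rule; it is not an
     actual JATS4R recommendation. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="article-id">
      <!-- Require @pub-id-type and restrict it to a controlled list -->
      <sch:assert test="@pub-id-type = ('doi', 'publisher-id')">
        An article-id element must have a pub-id-type attribute whose
        value is "doi" or "publisher-id".
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>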


Author(s):
C. M. Sperberg-McQueen
Keyword(s):
XML Data

Aparecium is an XQuery / XSLT library for reading non-XML data as XML, under the control of an “invisible XML” grammar describing the structure of the input. The use of the library is illustrated with an application, and some technical issues in the development of the library are discussed.
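
To illustrate the idea (with an invented example that is not drawn from the paper): given a small invisible XML grammar describing a date as a year, a month, and a day separated by hyphens, the non-XML input string 2019-07-30 can be read as though it were the XML document below.

<!-- Invented illustration of the invisible XML idea, not an example
     from the paper: the input string "2019-07-30", parsed against a
     small date grammar, is seen by the XQuery or XSLT application as
     the following XML. -->
<date>
  <year>2019</year>
  <month>07</month>
  <day>30</day>
</date>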


Author(s):  
C. M. Sperberg-McQueen

Can we have rules for our documents that we cannot write down in a schema language? If a conformance requirement is not mechanically checkable, is it a conformance requirement? If a rule is not testable, is it a rule?

