Markup languages that attempt not only to support particular applications, but to provide encoding standards for decentralized communities, face a particular problem: how do they adapt to new requirements for data description? The most common approach is a schema extensibility mechanism, but many projects avoid such mechanisms, since they fork the local application from the core tag set, complicating implementation, maintenance, and document interchange and thus undermining many of the advantages of using a standard. Yet the easy alternative, creatively reusing and abusing available elements and attributes, is even worse: it introduces signal disguised as noise, degrades the semantics of repurposed elements, and hides the interchange problem without solving it.
This dilemma follows from the way we have conceived of our models for text. If designing an encoding format for one work must compromise its fitness for any other – because the clean and powerful descriptive markup for one kind of text is inevitably unsuitable for another – we will always be our own worst enemies. Yet texts “in the wild” are purposefully divergent in the structures, features and affordances of their design at both micro and macro levels. This suggests that, at least in tag sets intended for wide use across decentralized communities, we must support design innovation not only in the schema, but in the instance – in particular documents and sets of documents. By defining, in the schema, a set of abstract generic elements for microformats, we can appropriate tag abuse (at once making it unnecessary and capturing the initiative it represents), expose significant and useful semantic variation, and support bottom-up development of new semantic types.