Sustainability of Linguistic Resources Revisited
Data providers, users, and funders alike want and need sustainable language resources (e.g. language corpora and grammars). Sustainability requires making the resources available through defined processes, platforms, or archives in a reproducible and reliable way. A three-year project on sustainability of linguistic resources conducted at Tübingen, Hamburg, and Potsdam illuminates some of the difficulties: the prevalence of stand-off markup (which requires a layer of specialized tools atop the XML stack), machine-generated XML of low clarity, ad hoc non-standard tag sets, the difficulty of discovering resources, and the choice of selection criteria for long-term archiving. XML and other standards are necessary but not sufficient ingredients in the mix.