Introducing Citation Structures

Author(s):  
Hugh Cayless ◽  
Thibault Clérice ◽  
Jonathan Robie

Text Encoding Initiative documents are notoriously heterogeneous in structure, since the Guidelines are intended to permit the encoding of any type of text, from tax receipts written on papyrus to Shakespeare plays or novels. Citation Structures are a new feature in the TEI Guidelines that provide a way for documents to declare their own internal structure along with a way to resolve citations conforming to that structure. This feature will allow systems like the Distributed Text Services (DTS) API, which process heterogeneous TEI documents, to handle tasks like automated table of contents generation, the extraction of structural metadata, and the resolution of citations without prior knowledge of document structure.

2020 ◽  
pp. 232-238
Author(s):  
Michelle Taylor ◽  
Andrew Keck

The Text Encoding Initiative (TEI), a branch of XML, is a mature standard for encoding texts that was developed three decades ago and continues to be improved and expanded upon today. Learn about how TEI was centrally imagined for a project devoted to a corpus of John Wesley material. We will begin by explaining why we chose to use TEI for the project and reviewing the considerations inherent in transitioning from a longstanding print-based project to a digital project, including the challenges of converting thousands of pages of text across different file types into rudimentary TEI. Next, we will move into topics specific to TEI encoding practices, including the creation of XML tagsets designed to maximize the use value of the Wesley Works for its various audiences: scholars, librarians, and clergy. Finally, we will show the TEI in action by sharing an example of an XML file from our first round of encoding.


2018 ◽  
Vol 6 (2) ◽  
pp. 221-242 ◽  
Author(s):  
Mohamed A. H. Ahmed

Abstract: The main aim of this study is to introduce a model of TEI (Text Encoding Initiative) annotation of Hebrew elements in Judeo-Arabic texts, i.e., code switching (CS), borrowing, and Hebrew quotations. This article will provide an introduction to using XML (Extensible Markup Language) to investigate sociolinguistic aspects in medieval Judeo-Arabic texts. Accordingly, it will suggest to what extent using XML is useful for investigating linguistic and sociolinguistic features in the Judeo-Arabic paradigm. To provide an example for how XML annotation could be applied to Judeo-Arabic texts, a corpus of 300 pages selected from three Judeo-Arabic books has been manually annotated using the TEI P5. The annotation covers all instances of CS, borrowing, and Hebrew quotations in that corpus.


Author(s):  
Ryan Cordell ◽  
Benjamin J. Doyle ◽  
Elizabeth Hopwood

Ryan Cordell, Benjamin Doyle, and Elizabeth Hopwood’s essay seizes a nineteenth-century invention, the kaleidoscope, as a model and metaphor for pedagogical practices and learning spaces that encourage play and experimentation. Through examples that involve setting letterpress type, the Text Encoding Initiative (TEI) encoding of nineteenth-century texts as an interpretive process, and the collaborative creation of Wikipedia pages, the authors describe how experiments with contemporary technologies help students claim scholarly agency over the texts and tools central to their study of the nineteenth century. Kaleidoscopic pedagogy encourages students to discover how C19 competencies like close reading and contemporary methods of coding and data analysis have the potential to be mutually constitutive, inspiring a more nuanced understanding of both periods.


2008 ◽  
Vol 18 (1) ◽  
pp. 103-119
Author(s):  
JANICE CARRUTHERS

ABSTRACT: The objective of this paper is to describe and evaluate the application of the Text Encoding Initiative (TEI) Guidelines to a corpus of oral French, this being the first corpus of oral French where the TEI has been used. The paper explains the purpose of the corpus, both in creating a specialist corpus of néo-contage that will broaden the range of oral corpora available, and, more importantly, in creating a dataset to explore a variety of oral French that has a particularly interesting status in terms of factors such as conception orale/écrite, réalisation médiale and comportement communicatif (Koch and Oesterreicher 2001). The linguistic phenomena to be encoded are both stylistic (speech and thought presentation) and syntactic (negation, detachment, inversion), and all represent areas where previous research has highlighted the significance of factors such as medium, register and discourse type, as well as a host of linguistic factors (syntactic, phonetic, lexical). After a discussion of how a tagset can be designed and applied within the TEI to encode speech and thought presentation, negation, detachment and inversion, the final section of the paper evaluates the benefits and possible drawbacks of the methodology offered by the TEI when applied to a syntactic and stylistic markup of an oral corpus.


Author(s):  
Raffaele Viglianti

TEI, the Text Encoding Initiative, was founded in 1987 to develop guidelines for encoding machine-readable texts of interest to the humanities and social sciences. The TEI is a text-centric community of practice in the academic field of digital humanities, operating continuously since the 1980s. The community currently runs several mailing lists, holds an annual conference, and maintains an eponymous technical standard, an online journal, a wiki, a GitHub repository, and a toolchain. The TEI Guidelines, which collectively define an XML format, are the defining output of the community of practice. The format differs from other well-known open formats for text (such as HTML and OpenDocument) in that its main mission is to encode “extant” texts such that they are amenable to scholarly processing. After a brief introduction to the TEI, we will discuss the mechanisms built into the TEI for customization.

