What is it? Using corpora to teach languages is nothing new and, while
the term corpus linguistics hails from the 1940s, most language learning
before the 20th century adopted a corpus approach – using a series of texts
in the language under study as a type of corpus on which to base
acquisition. With the advent of widespread computing in the latter half of
the 20th century, corpora began to be digitised, rendering interrogation of
large amounts of data a much simpler and more appealing prospect. Today,
languages in all forms (written, spoken, performed, formal, informal, etc.)
are captured continuously through online and digital platforms, apps, etc.,
meaning that the wealth of language data literally at our fingertips is
enormous. This has triggered the development of appropriate tools to explore
these vast data sets.