Week 4: TEI in the Wild – Dictionary Edition

One TEI-using project close to home is the Dictionary of Old English Web Corpus (DOE), which has its physical headquarters on the 14th floor of Robarts. The corpus compiles all surviving Old English texts, some in more than one copy. Each text has been encoded in XML and complies with the TEI guidelines. In effect, the DOE contains the entire surviving vocabulary of the Old English period (around 600-1150 CE). It can be searched in a variety of ways and is one of the best resources for the study of Old English.

Unfortunately, the website says very little about its encoding strategies beyond noting that they are compatible with the TEI P5 (2007) guidelines. It also does not make its code available to others.
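
Since the DOE's own markup is not public, the following is only a hypothetical sketch of what a TEI-style encoding of one Old English verse line might look like and how such a file could be searched. The sample fragment, the choice of the `l` and `w` elements, and the `find_lines` helper are all my assumptions for illustration. None of it is taken from the DOE itself.

```python
import xml.etree.ElementTree as ET

# The standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample, NOT the DOE's actual encoding: the opening line
# of Beowulf, with each word wrapped in a <w> element.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Hypothetical sample</title></titleStmt>
      <publicationStmt><p>Illustration only</p></publicationStmt>
      <sourceDesc><p>Opening line of Beowulf</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <l n="1"><w>Hwæt</w> <w>we</w> <w>Gardena</w> <w>in</w> <w>geardagum</w></l>
    </body>
  </text>
</TEI>"""


def find_lines(xml_text, word):
    """Return (line number, line text) for each <l> containing `word`.

    Matching is case-insensitive and compares whole <w> elements only.
    """
    root = ET.fromstring(xml_text)
    target = word.lower()
    hits = []
    for line in root.iterfind(".//tei:l", TEI_NS):
        words = [(w.text or "") for w in line.iterfind("tei:w", TEI_NS)]
        if target in (w.lower() for w in words):
            hits.append((line.get("n"), " ".join(words)))
    return hits


if __name__ == "__main__":
    for number, text in find_lines(SAMPLE, "gardena"):
        print(number, text)
```

A real corpus search would also have to deal with spelling variants and inflected forms, which this sketch ignores.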

The project has also not published anything about its methods or challenges. The DOE’s editor, Antonette diPaolo Healey, wrote an article about the move from “manuscripts to megabytes” and the digital tools used by the DOE, yet she does not go so far as to talk about the code behind it.

However, while looking for information on the DOE, I did come across a short paper on the use of XML to create electronic texts of medieval manuscripts. The author works through a few examples, and the paper showcases a current use for XML. It also has a great title.

That article can be found in the UTL catalogue: Powell, Kathryn. “XML and Early English Manuscripts: Extensible Medieval Literature.” Literature Compass 1 (2003): 1-5. doi: 10.1111/j.1741-4113.2004.00061.x.


DOE website: http://tapor.library.utoronto.ca.myaccess.library.utoronto.ca/doecorpus/index.html

DOE About Page: http://www.doe.utoronto.ca/pages/pub/web-corpus.html

Healey, Antonette diPaolo. “The Dictionary of Old English: From Manuscripts to Megabytes.” Dictionaries: Journal of the Dictionary Society of North America 23 (2002): 156-179. doi: 10.1353/dic.2002.0009.
