Multimedia Annotation of Classical Texts: What Do We Need?

The imminent creation of the Digital Latin Library under the auspices of the SCS and other institutions and based at the University of Oklahoma raises two of the key problems of digital annotation: selection and visual design. With theoretically limitless space, what resources should scholars provide for readers, and how are they to be presented? Many innovative approaches are currently being tried, from treebanking, to hyper-linked vocabulary, automatic grammatical analysis tools, video read-throughs, crowd-sourced commentary, and text visualization. I would like to argue for the importance of two specific elements that have so far not been the focus either of established projects like Perseus Digital Library, or of other emerging modes of digital edition of classical texts: author-specific lexica, and direct linking by humans to grammatical reference works. These are elements of traditional Latin school editions that can be usefully re-imagined in a digital environment, and will in some ways work better there than they do in books.

Author-specific lexica have the advantage of giving the reader a spectrum of definitions that are known to apply to the passages he or she is reading. They greatly reduce the frustration and errors caused both by the over-richness of a large dictionary and by the poverty of a short definition that does not contain the contextually appropriate meaning. For commonly taught school authors there is an abundance of such material available in most modern European languages, waiting to be properly digitized. By editing existing definition data and marrying it with fully parsed texts such as those produced by the Laboratoire d’Analyse Statistique des Langues Anciennes (LASLA), we could have the further advantage of creating author-specific lexica that accurately tabulate word frequencies and help readers prioritize vocabulary acquisition. But even without that, accurate running lists can be created that would substantially ease the reading process.
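To make the idea of marrying definitions with a parsed text concrete, here is a minimal sketch in Python. The sample tokens, the definitions, and the function name `build_frequency_lexicon` are all invented for illustration; they do not reflect the actual LASLA file format or any existing tool.

```python
from collections import Counter

# Hypothetical sample data: (line number, lemma) pairs of the kind a fully
# parsed text might provide. This is NOT actual LASLA output.
parsed_tokens = [
    (1, "arma"), (1, "vir"), (1, "-que"), (1, "cano"),
    (2, "vir"), (3, "arma"), (3, "vir"),
]

# Hypothetical author-specific definitions, keyed by headword.
definitions = {
    "arma": "arms, weapons; warfare",
    "vir": "man, hero",
    "cano": "to sing, celebrate in song",
}


def build_frequency_lexicon(tokens, defs):
    """Count each lemma and attach its definition, if one exists.

    Entries are sorted by descending frequency, then alphabetically.
    Lemmas with no matching definition get None, so that gaps in the
    lexicon are easy to spot and fix by hand.
    """
    counts = Counter(lemma for _, lemma in tokens)
    entries = [
        {"headword": lemma, "count": n, "definition": defs.get(lemma)}
        for lemma, n in counts.items()
    ]
    entries.sort(key=lambda e: (-e["count"], e["headword"]))
    return entries


lexicon = build_frequency_lexicon(parsed_tokens, definitions)
for entry in lexicon:
    print(entry["count"], entry["headword"], entry["definition"])
```

In practice the hard part is the human editing: headwords in an old school lexicon and lemmas in a parsed text rarely agree perfectly, and the unmatched entries (those with `None` above) are where that editorial work would begin.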

Online grammars of Latin and Greek exist, but are often difficult to search and to read. One of the key things that intermediate and even advanced readers of a Latin or Greek text need to know, when confronted with an unusual construction, is what rule or principle the passage in question exemplifies. The authors of print textbooks will frequently give a specific reference to a chapter in a grammar book, both to elucidate the passage and to stimulate the student to learn the relevant rule. If we had truly attractive and navigable grammars of Greek and Latin (ideally several of each), they could be linked directly to problematic passages quite unobtrusively, but with the advantage of immediate consultation via a single click. This kind of simple annotation, with a bare letter abbreviating the name of the grammar and the chapter number, would make the process of annotation simpler than it can usually be in books, since the annotator would often be freed of the need to re-explain the principle involved. This kind of work obviously cannot be done by machine, but treebanking and other forms of syntactical tagging could speed the process.
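As a rough illustration of how such a bare reference might be turned into a link, the following Python sketch resolves an abbreviation plus section number into a URL. The abbreviations, the `example.org` addresses, and the function name `resolve_reference` are placeholders chosen for the example, not the scheme of any existing site; the human annotator still decides which rule applies to which passage.

```python
import re

# Hypothetical abbreviations and placeholder base URLs. A real project
# would substitute its own grammars and its own addresses.
GRAMMARS = {
    "AG": ("Allen and Greenough", "https://example.org/allen-greenough/"),
    "S": ("Smyth", "https://example.org/smyth/"),
}

# A reference is an uppercase abbreviation, a space, and a section number.
REF_PATTERN = re.compile(r"^([A-Z]+)\s+(\d+)$")


def resolve_reference(ref):
    """Turn a note such as 'AG 540' into a grammar title, section, and URL."""
    match = REF_PATTERN.match(ref.strip())
    if match is None:
        raise ValueError(f"not a grammar reference: {ref!r}")
    abbreviation, section = match.groups()
    if abbreviation not in GRAMMARS:
        raise KeyError(f"unknown grammar abbreviation: {abbreviation}")
    title, base_url = GRAMMARS[abbreviation]
    return {
        "grammar": title,
        "section": int(section),
        "url": f"{base_url}{section}",
    }


link = resolve_reference("AG 540")
print(link["url"])
```

Keeping the annotation itself to a short string like `AG 540` means the note stays as unobtrusive on screen as it is in a printed school edition, while the lookup table can be changed if a grammar moves to a new address.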

A database of re-edited author-specific dictionaries, and a series of attractively presented Latin and Greek grammars: these are not impossible dreams, because a great deal of such material exists in the public domain. The challenge will be to extract it accurately from the often poor optical character recognition that lies behind the deceptively smooth surface of a .pdf, and then to provide it in a pleasing interface, like that of Logeion in the case of lexical resources. The best visual design of grammars in a digital environment is a problem still to be worked out.

A complete vocabulary of the Aeneid

I am pleased to announce that the DCC Aeneid vocabulary is now up and running. Based on Henry S. Frieze, Vergil’s Aeneid Books I-XII, with an Introduction, Notes, and Vocabulary, revised by Walter Dennison (New York: American Book Co., 1902), it includes frequency data derived from a human inspection and analysis of every word in the Aeneid (Perret’s text) carried out by teams at the Laboratoire d’Analyse Statistique des Langues Anciennes (LASLA) at the Université de Liège.

Users can search both Latin and English words, and display items alphabetically or by frequency. By using The Bridge, users can create custom lists for line ranges in the Aeneid, including or excluding vocabulary from the DCC core, or from several introductory Latin textbooks.
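The logic of a custom list for a line range can be sketched in a few lines of Python. This is only an illustration of the idea under invented sample data; it is not how The Bridge is actually implemented, and the function name `custom_list` and the small `core` set are hypothetical.

```python
# Hypothetical sample data: (line number, lemma) pairs in text order.
occurrences = [
    (1, "arma"), (1, "vir"), (1, "cano"),
    (2, "Lavinius"), (3, "litus"), (3, "arma"),
    (4, "superus"),
]

# A hypothetical set of "core" words the reader is assumed to know already.
core = {"arma", "vir"}


def custom_list(tokens, start, end, exclude=frozenset()):
    """List each lemma once, in order of first appearance, for lines
    start..end inclusive, leaving out anything in `exclude`."""
    result = []
    for line, lemma in tokens:
        if start <= line <= end and lemma not in exclude and lemma not in result:
            result.append(lemma)
    return result


print(custom_list(occurrences, 1, 3, exclude=core))
print(custom_list(occurrences, 1, 3))
```

Passing a different `exclude` set, for example the vocabulary of a particular introductory textbook, would give a list tailored to what a given class has already learned.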

This data will form the basis for complete running lists for the whole poem, to be created in the coming years as part of a larger multimedia edition of the Aeneid.

Henry Simmons Frieze (1817-1889) (University of Michigan Faculty History Project)

The Frieze-Dennison lexicon was revised and combined with the LASLA frequency data in the summer of 2014 at Dickinson College. Derek Frymark edited the OCR of Frieze-Dennison using ABBYY FineReader, and created a spreadsheet in Excel. Tyler Denton created a preliminary match between Frieze’s headwords and those of LASLA. The interface was built in Drupal by Ryan Burke. Christopher Francese edited the whole, is responsible for remaining errors, and would appreciate being notified of such. Support for the revision and digitization was provided by the Roberts Fund for Classical Studies at Dickinson, and the Andrew W. Mellon Foundation, through a grant for digital humanities at Dickinson College.

I would like to express my heartfelt thanks to LASLA, Bret Mulligan (who created The Bridge in summer of 2014 at Haverford College), and to all those who helped with this project. It would not have been possible without the great dedication and scholarly acumen of Henry Simmons Frieze (1817-1889), whose work I have found on close inspection to be worthy of the highest respect. The obituary written by M.L. D’Ooge and published in The Classical Review 4.3 (Mar., 1890), pp. 131-132, is a fitting tribute, and there is further information about him to be found here.