Johan Winge’s New Latin Macronizer

Inscription_latine_avec_apex_extrait

image credit: Vincent Ramos via Wikimedia Commons

A new Latin macronizer has come on the scene, and it is superb. It should become an essential tool for Latin teachers and editors of Latin texts. The author is Johan Winge, who just completed his undergraduate studies in the Language Technology Programme at Uppsala University, supervised by Joakim Nivre. The macronizer is the result of his thesis work for the degree. I had the opportunity to give it a good test run recently, as I read the Ilias Latina along with about twenty Latin teachers at the Dickinson Summer Latin Workshop. I took the PHI text (Vollmer’s Teubner from 1913) of this 1070-line condensation of the Iliad into Latin hexameters, put it in a Word document, and ran it through Winge’s macronizer. We read the text together and spotted the cases where corrections were needed.

The claim on the site that “The expected accuracy on an average classical text is estimated to be about 98% to 99%” seems like no exaggeration. What makes Winge’s macronizer more effective than other tools such as Kevin Ryan’s Macron Helper or Felipe Vogel’s māccer is that it does not work on the basis of a database of previously macronized forms. Rather, it uses a part-of-speech tagger (RFTagger) trained on the Latin Dependency Treebank, with macrons provided by a customized version of the Morpheus morphological analyzer.

You’ll have to read Johan’s thesis, Automatic Annotation of Latin Vowel Length, to get all the technical details. I’ll just say that it performed splendidly on the Ilias Latina. Here is a typical stretch, lines 344-374, with the errors highlighted:

dumque inter sēsē procerēs certāmen habērent,
concilium omnipotēns habuit rēgnātor Olympī 345
foederaque intentō turbāvit Pandarus arcū,
tē, Menelāe, petēns; latērīque volātile tēlum
incīdit et tunicam ferrō squāmīsque rigentem
dissecat: excēdit pugna gemebundus Atrīdēs
castraque tūta petit; quem doctus ab arte paternā 350
Paeōniīs cūrat iuvenis Podalīrius herbīs
itque iterum in caedēs horrendaque proelia victor.
armāvit fortēs Agamemnonis īra Pelasgōs
et dolor in pugnam cūnctōs commūnīs agēbat.
bellum ingēns oritur multumque utrimque cruōris 355
funditur et tōtīs sternuntur corpora campīs;
inque vicem Trōumque cadunt Danaumque catervae.
nec requiēs datur ūlla virīs; sonat undique Mavors
tēlōrumque volant cūnctīs ē partibus imbrēs.
occīdit Antilochī rigidō dēmersus in umbrās 360
ēnse Thalysiadēs optātaque lūmina linquit.
inde manū fortī Grāiōrum terga prementem
occupat Anthemiōne satum Telamōnius Aiāx
et praedūrātō trānsfīxit pectora tēlō:
purpureō vomit ille animam cum sanguine mixtam, 365
ōra rigat moriēns. tum magnīs Antiphus hastam
vīribus adversum cōnātūs corpore tōtō
torquet in Aeacidēn: tēlumque errāvit ab hoste
inque hostem cecidit, trānsfīxit et inguine Leucōn:
concīdit īnfēlīx prōstrātus vulnere fortī 370
et carpit viridēs moribundus dentibus herbās.
†impiger †Atrīdēs cāsū concussūs amīcī
Democoonta petit tēlōque adversā trabālī
tempora trānsadigit …

You will note that of the 11 “mistakes” on this page, only one (Mavors) is a genuine error. All the others are simply ambiguous forms, issues that need to be decided by a human. Virtually all of the cases that did not fall into the category of “ambiguous forms that need to be decided by a human” were Greek proper names, in which this text abounds. For some reason the form Achillis consistently came out with a long mark on the final vowel. Paris came out with a final macron twice, but without it three times. There were quantity issues with Nereus and his daughters. The strange form mēō emerged at line 851. But virtually all the time, with all ordinary Latin words, the macronizer performed brilliantly. The greatest delight was seeing it correctly macronize the phrase rēbus in artīs (line 968), where the final word almost always has a short “i”–but not here. That will have been the result of the Treebank data, I am guessing.
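To see why a tagger can succeed where a lookup table of macronized forms cannot, here is a toy sketch of the idea. It is not Winge's code: the tag strings, the `CANDIDATES` table, and the `macronize` function are all invented for illustration, and the real system takes its tags from RFTagger and its vowel lengths from Morpheus.

```python
# Toy illustration only: hypothetical data and tag names, not the real macronizer.

# Each unmarked form maps to its possible analyses: (morphological tag, macronized form).
CANDIDATES = {
    "artis": [
        ("noun.gen.sg", "artis"),   # from ars, "skill"
        ("adj.abl.pl", "artīs"),    # from artus, "tight, narrow"
    ],
    "rebus": [("noun.abl.pl", "rēbus")],
    "in": [("prep", "in")],
}


def macronize(tagged_tokens):
    """Pick, for each (form, tag) pair, the macronized form whose analysis matches the tag."""
    result = []
    for form, tag in tagged_tokens:
        analyses = CANDIDATES.get(form.lower(), [])
        match = next((marked for t, marked in analyses if t == tag), None)
        # Fall back to the first known analysis, or leave the form unmarked.
        if match is None:
            match = analyses[0][1] if analyses else form
        result.append(match)
    return " ".join(result)


# The tagger's output (assumed here) decides between artis and artīs.
print(macronize([("rebus", "noun.abl.pl"), ("in", "prep"), ("artis", "adj.abl.pl")]))
```

A form-by-form database has no way to choose between the two analyses of artis; once the context has been tagged, the choice is a simple match.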

Mr. Winge, I salute you!

Postscript 7/21/15: Johan writes that his source code is now available. Also, the picture I posted originally is not of him but of his friend Francesco Veneziano. Apologies to both Johan and Francesco for that one!

Report on the DCC Shanghai Seminar

Marc Mastrangelo and I traveled to Shanghai in June to meet some leading Chinese scholars of the Greek and Roman classics, with a view to exploring possibilities for collaboration on a Chinese version of the Dickinson College Commentaries websites. Our contacts in China were made via Jinyu Liu, who is Associate Professor and Classics Department Chair at DePauw University, and also Shanghai “1000 plan” Expert/Distinguished Guest Professor at Shanghai Normal University, where she resides in the summer months. The conference was jointly sponsored and funded by Dickinson College, thanks to Dean and Provost Neil Weissman, and by Shanghai Normal University, thanks to Chen Heng, Professor of Humanities and Communications there. Participants included Liu Chun (Peking University), Chen Wei and Bai Chunxiao (Zhejiang University), Zhang Wei and Huang Yang (Fudan University), Xu Xiaoxu (Renmin University of China), Xiong Ying (Nanjing University), Zhang Qiang and Wang Shaohui (Northeast Normal University), and a contingent from Shanghai Normal itself: Kang Kai, Li Shangjun, and Yi Zhaoyin. Unable to attend but interested in the project were Li Yongyi (Chongqing University), and Michele Ferrero (Beijing Foreign Studies University).

The meetings took place in a seminar room in the humanities building at Shanghai Normal University. We were assisted by a wonderful group of SHNU students.

Prior to the seminar itself Marc and I gave public lectures attended by students and faculty at SHNU. Marc spoke on June 9, on the topic of “Plato’s View of Poetry and the Early Christian Poets.” I spoke on June 10 on the topic of “Sebastian Brant: An Early Modern Editor of Vergil and Multimedia Text Annotation.”

The conference itself began on Friday, June 12. It started with a presentation from me on the topic “Digital Commentary on Classical Texts: Problems and Prospects,” which outlined the goals of the current DCC project within the context of unsolved problems of text annotation in a digital environment. I ended by emphasizing the collaborative nature of this kind of work, and urged the group to think about what kinds of resources are most needed for Chinese students and scholars. Throughout the seminar we talked with Chinese students as well, learned about their needs, and heard about current teaching practices and materials.

Friday afternoon we were treated to a field trip to see the Bibliotheca Zikawei (Xujiahui Library), a historic collection of western and Chinese books and manuscripts, including an impressive collection of Greek and Roman materials, gathered by the Jesuits and now maintained in their original setting by the Shanghai Library. Thanks to Prof. Chen we received rarely-given access to the sections closed to the public. (The library is the subject of an excellent article by Gail King, pdf)

On Saturday morning work began in earnest translating the Greek Core Vocabulary into Chinese, starting with the grammatical terms and categories. The Chinese scholars appreciated this exercise in particular, since the special terms to describe Greek and Latin grammar have yet to be fully standardized in Chinese. They repeatedly said that the opportunity to discuss such issues as a group was very valuable. Saturday afternoon, while work continued, Marc and I took the participants outside one by one and interviewed them on their hopes for the project, and on their views on the importance of the Greek and Roman classics in contemporary Chinese intellectual and cultural life. This video was captured by Eleanor Yan (Dickinson ’18). Her father, who works for a Chinese television station, provided the camera. We plan to edit this video into an introduction for the project on the website when it is developed.

Since the participants arrived having previously done translations of a subset of the Greek and Latin core lists, the editing work proceeded quickly once they got going. Part of Saturday afternoon and most of Sunday was devoted to the Latin list. The latter part of Sunday afternoon was spent on a discussion of what direction they would like the project to go.

The most urgent immediate needs identified were:

  • Reliable lexica
  • Introductory readers based on good pedagogy, with accurate translations, and high quality audio recordings
  • Intermediate readers that included key ancient passages dealing with particular themes, such as Athenian Democracy, Roman history, and Greek philosophy
  • A glossary of unfamiliar terms from Greek and Roman culture

Greek and Latin grammars were also identified as an important project, though one that may take longer to complete. And it was agreed that the long term goal would be to produce reliable translations and commentaries on all the major works of the Greco-Roman classical canon, an undertaking that will take many years.

As will be apparent from the video, enthusiasm for the project was very high. The group worked together with splendid collegiality, humor, and good will, and with a sense that this is the beginning of something very important for the field. The climax of the event was the agreement on a new Chinese name for the Project and the formulation of a Chinese logo for the new “Dickinson Classics Online.”

Chinese name logo

The team that met in this seminar now constitutes our Editorial Board, the team of classicists who will oversee the development of essential infrastructure such as lexica and grammars, high quality language teaching tools for Latin and Greek, and expert commentaries and translations by Chinese scholars that make the classics fresh, relevant, and interesting to Chinese students. All resources will be provided free of charge on the internet, giving direct access to the words and ideas of the Greeks and Romans to millions of people for the first time. A reasonably priced mobile application will allow serious students to learn on a convenient and portable platform.

This initial meeting included a concrete beginning, the production of a communally edited Chinese version of the DCC Greek and Latin Core Vocabularies, which is one of the most widely used features of the DCC site. We plan to have that up as a website this summer, and will work with computer science students to begin creating the mobile application.

In the meantime some prominent western scholars have signed on to be part of an Advisory Board: Shadi Bartsch-Zimmer (University of Chicago), Walter Scheidel (Stanford University), and Jeremy McInerney (University of Pennsylvania). With a distinguished team on both sides of the Pacific, we hope to be in a good position to raise substantial outside funds to make the ambitious project a reality. Our hope is that DCO can bring Chinese scholars to Dickinson to work alongside each other and with the scholarly and web development team that creates the DCC.

Left to right: Bai Chunxiao (Zhejiang University), Zhang Wei (Fudan University), Li Shangjun (Shanghai Normal University), Chen Wei (Zhejiang University), Chris Francese, Xiong Ying (Nanjing University), Jinyu Liu (DePauw University), Marc Mastrangelo, Xu Xiaoxu (Renmin University of China), Zhang Qiang (Northeast Normal University), Huang Yang (Fudan University), Liu Chun (Peking University), Wang Shaohui (Northeast Normal University)

–Chris Francese

Resources for studying Latin and Greek in Chinese

Gu Zhiying, now a senior at Shanghai University, and soon to be studying classics at the graduate level at Renmin University in Beijing, passes on this list of resources for the study of Latin and Greek for Chinese speakers. If you are interested in more information please send me an email and I can put you in touch with Zhiying.

Dictionaries

  1. Dictionarium Latino-Sinicum, 《拉丁汉文辞典》, 1965 / 1980, by WuJinrui (吴金瑞). Wu was a Catholic priest in Taichung (台中); he says in the preface that this dictionary cost him 15 years. Most of the illustrative sentences are from Cicero and Caesar. (Difficult to buy.)
  2. Dictionarium Latino-Sinicum, 《拉丁语汉语词典》, 1988, by XieDaren (谢大任). Xie is said to have majored in medical science, but there seems to be no further information about him. This dictionary is based on a Latin-Russian dictionary (Латинско-Русский Словарь, by И. Х. Дворецкий and Д. Н. Корольков, published in 1949; I can’t read Russian, but a friend who reads both Latin and Russian told me this). To some extent Wu’s dictionary may be a little better than Xie’s, because in Xie’s we cannot know from which work an illustrative sentence comes. 辞典 (cidian) and 词典 (cidian) mean the same in Chinese. (Also difficult to buy.)
  3. Dictionarium Parvum Latino-Sinicum, 《拉丁语汉语词典》, 1988, by XieDaren (谢大任). It’s an abridged edition of Xie’s Dictionarium Latino-Sinicum. (Difficult to buy.)
  4. Dictionarium Sinico-Latinum, 《汉洋字典》, 1853. It’s a CHINESE-LATIN dictionary, rare and interesting. The preface is in Latin; no more introduction is necessary.
  5. Lexicon Magnum Latino-Sinicum and Lexicon Manuale Latino-Sinicum Sinico-Latinum, mini-dictionaries.

An Austrian professor at RUC named Leopold Leeb (his Chinese name is 雷立柏 [LeiLibo]) is popularizing Latin and Greek for undergraduates as well as senior high school students in Beijing. He has compiled a Dictionarium Parvum Latino-Sinicum (《拉丁语汉语简明辞典》, 2011). Leeb’s mini-dictionary is perhaps easier to use, and it is very easy to buy on Amazon.

Grammars

  1. Syntaxis Linguae Latinae Grammatica, 《拉丁文句学》, 1942, by missionaries. The preface is also in Latin.
  2. Basic Course of Latin, 《拉丁语基础》, 1983, by XiaoYuan (肖原). It does not have an original Latin or English name; the title is my translation.
  3. Lingua Latina pro Auto-Studio, 《拉丁语自学读本》, 1989, by XieDaren (谢大任).

Some Latin and Greek courses have been translated or published in China in recent years, such as Wheelock’s Latin (6e., 2009) and Professor LiuXiaofeng (刘小枫)’s Καιρός: Reading Greek [revised edition] (《凯若斯:古希腊文读本[增订版]》, 2013). LuoNiansheng (罗念生) and ShuiJianfu (水建馥)’s Classical Greek-Chinese Dictionary (《古希腊语汉语词典》, 2004) has also been published, but there are many misprints produced in the course of importing Greek letters; I have already found more than 300. Luo and Shui passed away before the dictionary was published.

2015 Dickinson Summer Latin Workshop Information

Dickinson College Summer Latin Workshop
July 13-18, 2015
LOGISTICAL INFORMATION

MAP OF CAMPUS LOCATIONS SPECIFIC TO THE WORKSHOP: http://goo.gl/9jNnt4

DIRECTIONS TO CARLISLE AND MAPS OF THE DICKINSON COLLEGE CAMPUS: are available on the Dickinson College web site.

ARRIVAL: arrive no earlier than 1:00 p.m., no later than 6:00 p.m. Monday, July 13. Our first meeting will be dinner, Monday at 6:00. Meet in the lobby of the Holland Union Building (map). Check in at the Department of Public Safety at 400 W. North St. (See map. Their phone number is 717-245-1349). There you will receive a key and directions to your residence, along with a card which will allow you to get meals, use the library and the Kline Center athletic facilities and pool, as well as other useful information about the campus and the town of Carlisle.

PARKING: park free on the streets around campus. Public Safety asks that you register your car with them at arrival. A map of parking on campus is available here.

DEPARTURE: the final event will be the farewell lunch, 12:00 Saturday, July 18. Please let me know as soon as possible if you will need lodging on the night of July 18th.

MEETING SCHEDULE: the group will meet in the morning (8:30 a.m. to 12:00 p.m.). Meetings will take place in East College building on Dickinson’s campus (map).

TEXTS:

  • We will read the Latin text of Plessis (1885). You can download the .pdf here for free and print out the text pages (pp. 67-85 of the pdf). Plessis can also be had as a print-on-demand book via Amazon for about $20.
  • The best commentary is the dissertation of Tilroe from 1939, available here. It also includes an English translation. To download it in full simply click on the ‘Save’ button located at the upper right corner of the screen and select the ‘Download’ option from the drop-down menu.
  • Please bring an English translation of the Iliad, preferably that of Robert Fagles.

MEALS: will be taken in the Dickinson College Cafeteria (“the caf”) in the Holland Union Building on the first block of North College Street (map). Vegetarian dishes are available. The Quarry is a coffee bar right across the street from the cafeteria, but your meal card will not work there, only cash.

WI-FI ACCESS: You will be issued a group password that will allow you to log on to the campus wireless network. There is also guest access, which lasts for a few hours before requiring a log in.

THINGS TO BRING: participants from previous years have suggested that you may want to bring: a desk lamp, an extra blanket, a swimsuit.

FACEBOOK GROUP: for convenient communication among the group we have started a Facebook group. If you are on Facebook, please ask to join!

Guangqi Lecture and Seminar Series

Our friend and collaborator Jinyu Liu passes on the following exciting announcement:

Dear Classics friends: On behalf of the newly founded Shanghai Normal University Guangqi International Center for Scholars, we are greatly pleased to announce the launch of the Guangqi Classics Lecture and Seminar Series. Aiming at promoting Classical Studies in China and fostering trans-lingual and trans-cultural conversations about Classics, the Guangqi Lecture and Seminar Series invites Classics scholars from around the world to share their cutting-edge research, provide master classes, and organize international conferences and workshops on diverse aspects of the ancient world. We also warmly welcome resource sharing and collaborative endeavors in various forms.

We are very grateful to Christopher A. Francese and Marc Mastrangelo of Dickinson College, who have been instrumental in putting together the program for Season I, and Lisa Mignone and Richard Billows for enriching the academic events. We also wish to acknowledge the generous support from Dickinson Classics, Shanghai 1000 Plan and Shanghai Normal University. Season II is being planned, which will feature Walter Scheidel.

Please help spread the word, and join us in this long-term endeavor in globalizing Classics.

For DCC Shanghai Seminar, please see http://blogs.dickinson.edu/…/dickinson-college-commentarie…/

Thank you,

Heng Chen (Shanghai Normal University) and Jinyu Liu (Classical Studies at DePauw University)

Note: The Guangqi Lecture and Seminar Series is named after XU Guangqi (1562-1633), one of the first literati Christians in China and the great collaborator of Matteo Ricci (1552-1610), a Jesuit missionary whose role in bringing Western Learning to China can hardly be overstated.

DCC Core Latin and Greek Vocabularies now available in Polish

The DCC Core Latin and Greek Vocabularies are now available in Polish translation, thanks to the efforts of a wonderful Association of Classical teachers called Ship of Phaeacians. They can be followed on Twitter at @statekfeakow. The work was carried out by Statek Feaków, Agnieszka Walczak, and Marcin Kołodziejczyk. Thanks are due to them, and also to our Drupal developer Ryan Burke, who made the translation module work so that all translations can be linked to the same nodes, and created the views. This is the second of our international collaborations for translating the core vocabularies. The Greek core is already available in Portuguese thanks to Caio Camargo. Next month we plan to put up the Chinese translations as well, following the DCC seminar in Shanghai June 12-14. Would you like to help create a new translation in another modern language? Please let us know!

Greek Core Vocabulary in Polish

Latin Core Vocabulary in Polish

DCC Shanghai Seminar June 12-14

A stellar line up of Chinese scholars of the western classical tradition will meet in Shanghai next month to create Latin-Chinese and Greek-Chinese versions of the DCC Core vocabularies, and to form a plan for future collaboration and resource creation. Can’t wait! Thanks to Jinyu Liu of DePauw University for coordinating the event, and Shanghai Normal University for hosting!

2015上师大DCC注疏项目Seminar poster

Searching in Lewis and Short

A little known but extremely useful resource for Latin composition is the digitization of Lewis & Short done at the University of Chicago:

Lewis and Short

http://perseus.uchicago.edu/Reference/lewis.html

Search the FULL TEXT of this dictionary for an English word to find every entry in Lewis & Short where that English word appears. Sort through the various Latin words that pop up until you find one that approximates what you want to say.

For example, a search for “silk” yields the following cornucopia of terms, all classically attested, or at least attested within the generous limits of L&S.

  1. birrus n., Aug. Serm. Divers. 49), = πυρρός (of yellow color), a cloak to keep off rain (made of silk or wool)
  2. bombycinus , a, um, adj. bombyx, of silk, silken (cf. sericus): vestis, Plin. 11, 22, 26, § 76: panniculus,
  3. bombycinus (page 243) 14, 24; Dig. 34, 2, 23, § 1.—Subst.: bombȳcĭna, ōrum, n., silk garments, Mart. 11, 50, 5; 8, 68, 7; App. M. 8, p. 214, 6.—And
  4. bombycinus (page 243) 11, 50, 5; 8, 68, 7; App. M. 8, p. 214, 6.—And bombȳcĭnum, i, n., a silk texture or web, Isid. Orig. 19, 22, 13. bombylis , is,
  5. bombyx (page 243) (f., Plin. 11, 23, 27; Tert. Pall. 3), = βόμβυξ. The silk-worm, Plin. 11, 22, 25, § 75 sqq.; Mart. 8, 33, 16; Serv. ad Verg. G. 2,
  6. bombyx (page 243) Verg. G. 2, 121; Isid. Orig. 12, 5, 8; 19, 27, 5.— Meton. That which is made of silk, a silken garment, silk: Arabius, Arabian (the best), Prop. 2, 3, 15: Assyria bombyx,
  7. bombyx (page 243) Orig. 12, 5, 8; 19, 27, 5.— Meton. That which is made of silk, a silken garment, silk: Arabius, Arabian (the best), Prop. 2, 3, 15: Assyria bombyx, Plin. 11, 23, 27, §
  8. holosericus (page 859) holosericus , a, um, adj., = ὁλοσηρικός, all of silk: vestis, Lampr. Heliog. 20; Vop. Aur. 45; id. Tac. 10; Cod. Th. 15, 9, 1. —Collat. form,
  9. metaxa (page 1140) metaxa or mătaxa, ae, f., = μέταξα and μάταξα, raw silk, the web of silkworms. Lit., Dig. 39, 4, 16; Cod. Just. 11, 7,
  10. necydalus (page 1196) necydalus , i, m., = νεκύδαλος (deathlike), the larva of the silk-worm, in the stage of metamorphosis preceding that in which it receives the name of bombyx: primum eruca fit, deinde, quod vocatur bombylius, ex eo necydalus, ex hoc in sex mensibus bombyx,
  11. Seres (page 1678) 11, 27, 11; Claud. in Eutr. 2.— sērĭ-cum, i, n., Seric stuff, silk, Amm. 23, 6, 67; Sol. 50; cf. Isid. Orig. 19, 17, 6; 19, 27, 5;
  12. sericarius (page 1678) or belonging to silks: textor, Firm. Math. 8: NEGOCIATOR, Inscr. Orell. 1368; 4252.—As substt. SERICARII, silk- dealers, Inscr. Fabr. p. 713, 346.— SERICARIA, ae, f., a slave who took care of silk, Inscr. Orell.
  13. sericarius (page 1678) SERICARII, silk- dealers, Inscr. Fabr. p. 713, 346.— SERICARIA, ae, f., a slave who took care of silk, Inscr. Orell. 2955. sericatus , a, um, adj.
  14. sericoblatta (page 1679) sericoblatta , ae, f. Sericus, a garment of purple silk, Cod. Just. 11, 8, 10; Cod. Th. 10, 20, 13; 10, 20, 18. sericum , i,
  15. vellus (page 1965) Calp. Ecl. 2, 7.— Of woolly material. Wool, down: velleraque ut foliis depectant tenuia Seres, i. e. the fleeces or flocks of silk, Verg. G. 2, 121.— Of light, fleecy clouds: tenuia nec lanae per caelum vellera ferri,
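The idea behind this kind of search is a simple reverse lookup: scan the English text of every entry and return the Latin headwords where the word turns up. The following is only a minimal sketch of that idea, not how the Chicago site is implemented. The `ENTRIES` table is a hypothetical miniature with definitions abbreviated from the results above (plus one invented non-matching entry), and `reverse_search` is a name made up for the example.

```python
# Hypothetical miniature of a Latin-English dictionary: headword -> English entry text.
ENTRIES = {
    "bombycinus": "of silk, silken",
    "bombyx": "the silk-worm; that which is made of silk, a silken garment",
    "holosericus": "all of silk",
    "metaxa": "raw silk, the web of silkworms",
    "lana": "wool",  # invented entry, included only so that something fails to match
}


def reverse_search(english_word, entries):
    """Return the headwords whose entry text contains the English word (case-insensitive)."""
    needle = english_word.lower()
    return sorted(head for head, text in entries.items() if needle in text.lower())


print(reverse_search("silk", ENTRIES))
```

Because the match is on substrings, "silk" also finds "silken" and "silkworms", which is usually what a composer hunting for vocabulary wants.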

English search is not, I believe, available on Logeion, which is the successor to this site. Hopefully this older one will be kept around for a good while longer. I don’t know anything as good online for searching in English for Latin words. Thanks to Helma Dik and the U. Chicago team for all their work!

Learning from older school editions of classical texts

Here is my talk from CAMWS 2015 on digital text annotation, for those many who were unable or disinclined to come to an 8:00 p.m. (!) session. Please leave a comment if you like.

The field of digital classics is very focused right now on the unsolved problem of how to present scholarly editions of Latin and Greek texts online, and in particular how to represent the apparatus criticus and link to manuscript evidence. Not enough attention, I would suggest, has been paid to the question of how to best present classical texts for ordinary students and readers of Latin and Greek in the digital realm.

Isidore of Seville, Etymologiae, Book 1, ed. Max Bänziger (Monumenta Informatik) with links to manuscripts and citation links. http://monumenta.ch/latein/

A deluge of plain text is not going to do it, I think we would all agree. Readers and learners typically want to know “what does this word mean?” and “what’s going on with the grammar here?” Plain digitized texts, or texts enhanced with links to manuscript images like this one, provide no guidance on these matters, since they are designed for advanced scholars.

some parsing and vocabulary tools

The main focus in tool development for readers to deal with these questions has been automatic parsers linked to dictionaries, like nodictionaries.com, the Alpheios plugin, or the Perseus Word Study Tool. But the unreliability of those tools, not to mention their perceived role as crutches, has given them a poor reputation among teachers and thoughtful students alike.

Perseus on Vergil: John Conington. P. Vergili Maronis opera. (London. Whittaker and Co., Ave Maria Lane. 1876), ad Aen. 1.352.

Another tack has been to digitize older, public domain commentaries like Conington’s Vergil. Perseus includes several such works. But whatever the merits of these works in their day, they are often opaque to learners and readers now. Older commentators tend to assume an audience that has already been well-trained in Latin, and just needs a little reminder of a common construction, or might enjoy an apposite quotation from Keats.

Perseus Latin Word Study tool on coit (Vergil, Aeneid 3.30)

The Perseus Word Study tool is designed to provide more basic information. It does its best to guess which dictionary head word a given form derives from, then gives a brief definition, with links to Lewis & Short, and a series of suggested parsings. It guesses the correct parsing based on frequency, and includes a voting feature that lets you select which parsing you think is best, and what definition you think best for the context.
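As a rough illustration of how frequency and reader votes might be combined to rank parsings, here is a minimal sketch. How Perseus actually weighs the two is not stated here, so simply adding them is an assumption; the function name `best_parse`, the labels, and the numbers are invented.

```python
def best_parse(candidates, votes=None):
    """Rank candidate parsings by corpus frequency plus any reader votes.

    candidates: dict mapping a parsing label to its frequency count.
    votes: optional dict mapping a parsing label to the number of reader votes.
    Adding the two numbers is an assumption made for this sketch.
    """
    votes = votes or {}
    return max(candidates, key=lambda label: candidates[label] + votes.get(label, 0))


# Invented numbers for two readings of one form.
candidates = {"parse A": 120, "parse B": 95}
print(best_parse(candidates))                    # frequency alone
print(best_parse(candidates, {"parse B": 40}))   # reader votes tip the balance
```

The weakness is visible even in the toy: neither frequency nor votes looks at the sentence the form stands in, so the most common parsing wins until enough readers vote for another.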

Lewis and Short on Coeo, screen 1

Lewis and Short on Coeo, screen 2, with the relevant definition highlighted

This voting feature is slowly improving the accuracy of the Word Study Tool, but even if the parsing happens to be right (which it is not in this particular case), the dictionary data itself is often not helpful, because the choices are the very brief “short defs” and the full fire hose of Lewis & Short.

A good but under-used solution to this problem of too little or too much is the author-specific dictionary of the type that is contained in many older school editions, such as Henry Simmons Frieze’s editions of the works of Vergil.

This is his version of the Aeneid, revised in 1902 by Walter Dennison. Frieze published a full dictionary to all the works of Vergil in various revisions over the 1880s and 90s, and Dennison revised it slightly to focus on the Aeneid material only for this edition.

Shortdef vs. Frieze-Dennison on Coeo

Note that Frieze includes all the principal parts, with macrons; a number of Vergilian definitions not included in the Short Def; and citations for all the particular senses. Frieze spent his career at the University of Michigan teaching Vergil and other Latin authors, and working on his Vergil editions. His translations are expert, and his philological acumen is of a very high standard.

Lewis & Short vs. Frieze on Orodes

Frieze vs. Perseus on Orodes

Frieze does equally well in the sphere of proper names, where automatic tools are often helpless to distinguish between homonymous figures. This is precious intelligence for readers of Vergil. In 2014 I set out to properly digitize Frieze’s Vergilian dictionary with the ultimate goal of creating running vocabulary lists for the whole Aeneid.

Flowchart: Digitizing Frieze's Vergilian Dictionary

The process went as follows. The .pdf scan from the Internet Archive went through an OCR program called ABBYY FineReader. The resulting text went into an Excel spreadsheet. To Frieze’s definitions we added frequency data derived from a human inspection and analysis of every word in the Aeneid. This work was carried out by teams at the Laboratoire d’Analyse Statistique des Langues Anciennes (LASLA) at the Université de Liège in Belgium.
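The step of joining definitions to frequency data comes down to matching two tables on the lemma. Here is a minimal sketch of that join. The column names, the sample rows, and the counts are all hypothetical stand-ins, and the function `merge_frequency` is invented for the example; the real work was done in a spreadsheet, not with this code.

```python
import csv
import io

# Hypothetical rows standing in for OCR'd dictionary entries and lemma counts.
DICTIONARY_CSV = """lemma,definition
coeo,to come together
arma,arms
"""
FREQUENCY_CSV = """lemma,count
arma,100
coeo,7
"""


def merge_frequency(dictionary_text, frequency_text):
    """Attach a count to each dictionary row, matching on the lemma column."""
    counts = {row["lemma"]: int(row["count"])
              for row in csv.DictReader(io.StringIO(frequency_text))}
    merged = []
    for row in csv.DictReader(io.StringIO(dictionary_text)):
        # Lemmas missing from the frequency table get a count of zero.
        merged.append({**row, "count": counts.get(row["lemma"], 0)})
    # Most frequent lemmas first.
    return sorted(merged, key=lambda r: r["count"], reverse=True)


for row in merge_frequency(DICTIONARY_CSV, FREQUENCY_CSV):
    print(row["lemma"], row["count"], row["definition"])
```

The join only works if both tables spell their lemmas the same way, which is why keying the dictionary to one agreed set of headwords matters so much.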

DCC version of Frieze-Dennison has search, download, and frequency data

Once this process was complete and the spreadsheet made, it was uploaded into Drupal, where the database version on DCC can now be searched, ordered by frequency, and downloaded in various formats. It can be reused at will under a Creative Commons license. It will also form the basis for the running lists in our edition of the Aeneid now in development.

The Bridge, developed at Haverford, allows the making of accurate vocabulary lists for custom ranges of text

Putting the information in a spreadsheet keyed to LASLA lemmata made it possible to share with The Bridge, a new tool developed by Bret Mulligan at Haverford College. This allows the user to specify a particular line range and get vocabulary lists, either all words, or with certain words excluded, like the DCC Core vocabulary, or the vocabulary of a common introductory textbook.
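The principle of a range-based list with exclusions can be sketched in a few lines. This is not The Bridge's code: the `LEMMATIZED` table is a hand-made stand-in for lemmatized text (Aeneid 1.1-3), the `KNOWN` set is a made-up stand-in for a core list or textbook vocabulary, and the function name `vocabulary` is invented.

```python
# Hypothetical lemmatized text: line number -> lemmas occurring in that line.
LEMMATIZED = {
    1: ["arma", "vir", "que", "cano", "Troia", "qui", "primus", "ab", "ora"],
    2: ["Italia", "fatum", "profugus", "Lavinius", "que", "venio"],
    3: ["litus", "multus", "ille", "et", "terra", "iacto", "et", "altum"],
}

# Made-up stand-in for words the reader is assumed to know already.
KNOWN = {"que", "qui", "ab", "et", "ille"}


def vocabulary(lemmatized, first, last, exclude=frozenset()):
    """List the distinct lemmas in lines first..last, in order of first appearance."""
    seen = []
    for line in range(first, last + 1):
        for lemma in lemmatized.get(line, []):
            if lemma not in exclude and lemma not in seen:
                seen.append(lemma)
    return seen


print(vocabulary(LEMMATIZED, 1, 2, exclude=KNOWN))
```

Since the list is built from the lemmas recorded for each line, it contains exactly the words of the chosen range and nothing else.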

Detail of The Bridge

The important thing to emphasize is that the lists include not the headwords that are statistically likely to appear in the passage, but (barring minor textual difficulties) those that actually do, and no others. I also put a column in the spreadsheet listing the headwords as used by Logeion, and thanks to Helma Dik it is also available there.

Frieze-Dennison is now available on Logeion

The facilities are now starting to exist by which accurate lexical information such as this can be shared by the community of classicists, and the Bridge and Logeion are in the vanguard of this development. By excavating and reclaiming more author-specific dictionaries we can all contribute to this positive change and get the resources that students and readers need. Careful digitization of older hand-made tools can be more effective than the creation of new automatic tools.

II. Goodell’s Greek Grammar

Harper and Wallace's edition of the Anabasis annotates with references to four different school grammars

Detail of Harper and Wallace’s Xenophon (1893)

Perseus cross-references to grammars: not tied to specific words, and pointing to advanced grammars only

Another nice feature of older school editions that can be usefully recuperated in the digital realm is reference to grammars. A learner or reader is likely to ask, what’s going on with the grammar in this passage? What rule covers this? Or is it somehow exceptional? This question was sometimes dealt with in older school editions by simply giving a citation from a widely-used grammar book (or from four of them, as in the case of Harper and Wallace’s edition of Xenophon’s Anabasis) and relying on the student to go look it up. In olden days, perhaps they did. The internet allows for much easier cross-referencing of this kind. Sometimes Perseus operates in this way: with T. Rice Holmes’ Caesar, Gallic War, for example, Perseus makes the cross-references clickable.

More common in Perseus is a kind of general reference to grammars for an entire page, typically to a large discussion of “the tenses” or “the cases.” Here we see a page of Thucydides with references to Smyth’s discussion of the article and the cases, and similar links to Kühner-Gerth and Goodwin’s Moods and Tenses. None of this is keyed to a particular word or phrase in the text. Another issue here is that all the Greek grammars at Perseus are advanced.

Thomas Dwight Goodell, A School Grammar of Attic Greek (New York: D. Appleton, 1902)

In fact none of the more elementary Greek grammars like those referred to by Harper and Wallace have been digitized properly, to my knowledge. While reading Harper and Wallace’s edition of the Anabasis two summers ago I became aware of Thomas Dwight Goodell’s excellent School Grammar of Attic Greek (1902), whose dedication speaks to the attitude of a gifted teacher.

Flowchart: digitizing Goodell

With help from Bruce Robertson at Mount Allison University and some of my students I set about digitizing Goodell. This involved the hand-correction and tagging of the raw OCR output provided by Robertson’s Lace, which in turn went back into Lace and improved its accuracy. This corrected output was then tagged in XML using Oxygen, and converted into html. The html pages were edited by Meagan Ayer, and the navigation created by Ryan Burke at Dickinson.

The DCC version of Goodell has search, page thumbnails, XML download, and linked cross-references.

Now we have an easily navigated, attractive Greek grammar, including page images, downloadable XML, and linked cross-references. We can now link directly to that in the notes fields of DCC.

Sample annotation using links to grammars.

The aim here is to simplify annotation, and obviate the need to re-explain grammatical features. It has the pedagogical value of not being a crutch, in that the reader must make his or her own connection between the passage at hand and the relevant rule. The typical annotation of this type has four elements: the lemma, the name of the construction, the grammar cross-reference, and a partial translation. One could remove the second and fourth of these elements.
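The four-element structure can be pictured as a small record in which the construction name and the translation are optional. The Python sketch below is only an illustration, not how DCC notes are actually stored: the class name `Annotation`, its field names, the section label "Goodell §000", and the example.org URL are all placeholders invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    lemma: str                          # the word or phrase being annotated
    grammar_ref: str                    # label of the grammar cross-reference
    grammar_url: str                    # link target for the cross-reference
    construction: Optional[str] = None  # name of the construction (optional)
    translation: Optional[str] = None   # partial translation (optional)

    def render(self) -> str:
        """Render the note as a line of HTML, skipping any missing elements."""
        link = f'<a href="{self.grammar_url}">{self.grammar_ref}</a>'
        pieces = []
        if self.construction:
            pieces.append(self.construction)
        pieces.append(link)
        if self.translation:
            pieces.append(f'"{self.translation}"')
        return f"{self.lemma}: " + ", ".join(pieces)


# Placeholder reference and URL; not a real Goodell section number.
full = Annotation(
    lemma="Δαρείου",
    grammar_ref="Goodell §000",
    grammar_url="https://example.org/goodell/000",
    construction="genitive of source",
    translation="of Darius",
)
minimal = Annotation("Δαρείου", "Goodell §000", "https://example.org/goodell/000")

print(full.render())
print(minimal.render())
```

The minimal form leaves the reader with only the lemma and the link, which is the version that asks the most of the student.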

The internet has made all of us potential publishers, and there are many classical teachers out there creating resources for their own students and sharing them with the world. The future, I believe, lies in collaboration, but not just collaboration between ourselves. We should also open ourselves to collaboration with men like Henry Simmons Frieze and Thomas Dwight Goodell, and adapt their durable work to the needs of contemporary readers and students.

Liberating the Text

Gregory Crane has written a fierce new manifesto directed at editors of classical texts, in which he urges scholars to “liberate textual data from corporate control” by publishing editions only in open (Creative Commons) licensed venues and only in TEI-XML tagged formats, thus making them interoperable and freely accessible to a global audience. He laments the lack of progress in this direction, noting that TEI encoding has been around since the 1980s, and open licenses since the 1990s. The main culprit, he says, is academic politics, and the perceived need to publish under an established university press to receive formal academic credit. The publishing of critical texts only in book form is “preventing Classical Greek and Latin from shifting to a fully open intellectual ecosystem.”

source: http://goo.gl/yPDNIw

The solution he proposes is for scholarly editors to publish their work themselves:

If editors wish to work on their own to create editions of Greek and Latin texts, they should buy a TEI-aware XML editor and learn how to produce a modern edition. Anyone smart enough to edit an edition of Greek and Latin is smart enough to understand the necessary TEI XML.

Why TEI? Working in interoperable TEI XML will allow for competing editions to be compared:

Here the goal is to have as many TEI XML transcriptions as possible and to help researchers visualize the degree to which different editions differ and to be able to compare different editions.
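To make the idea of comparison concrete, here is a minimal Python sketch of my own; it is not taken from Crane’s essay. It assumes two hypothetical transcriptions that both mark verse lines with the TEI `<l>` element and an `n` attribute. The fragments are stripped down (no `teiHeader`, so not complete TEI documents), "edition A" and "edition B" stand for no particular published editions, and the function names are invented for the example. The variant shown, Laviniaque versus Lavinaque at Aeneid 1.2, is a familiar one.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # the TEI namespace

# Two stripped-down, hypothetical transcriptions of Aeneid 1.1-2.
EDITION_A = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div type="book" n="1">
<l n="1">Arma virumque cano, Troiae qui primus ab oris</l>
<l n="2">Italiam fato profugus Laviniaque venit</l>
</div></body></text></TEI>"""

EDITION_B = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div type="book" n="1">
<l n="1">Arma virumque cano, Troiae qui primus ab oris</l>
<l n="2">Italiam fato profugus Lavinaque venit</l>
</div></body></text></TEI>"""


def verse_lines(tei_xml):
    """Map each line number (the n attribute of <l>) to its plain text."""
    root = ET.fromstring(tei_xml)
    return {
        line.get("n"): "".join(line.itertext()).strip()
        for line in root.iter(TEI + "l")
    }


def differing_lines(a_xml, b_xml):
    """List (line number, text in A, text in B) where the two editions differ."""
    a, b = verse_lines(a_xml), verse_lines(b_xml)
    return [(n, a[n], b[n]) for n in a if n in b and a[n] != b[n]]


for n, text_a, text_b in differing_lines(EDITION_A, EDITION_B):
    print(n, "|", text_a, "|", text_b)
```

The comparison works only because both files use the same element and the same line numbering; editions tagged with different conventions would first need to be reconciled.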

The ideal of a universal, interoperable apparatus criticus that collates all textual variants and conjectures of scholars based on existing print editions is probably, he admits, an unattainable one. He argues instead for a more pragmatic approach to apparatus, one that allows for word search that links to page images of the original print resources:

Here our goal is to have a maximally clean searchable text but not to add substantive TEI XML markup that captures the structure of the textual notes — the structure of these notes tend to be complicated and inconsistent. Our pragmatic goal is to support “image front searching,” so that scholars can find words in the textual notes and then see the original page images.

Another proposal is to create a series of open-licensed textual commentaries that collate the textual variants that are deemed most significant:

Strategy one: Support advanced graduate students and a handful of supervisory faculty to go through reviews of recent editions, identifying those editorial decisions that were deemed most significant. The output of this work would be an initial CC-BY-SA series of machine-actionable commentaries that could automatically flag all passages in the CC-BY-SA editions where copyrighted editions made significant decisions. In effect, we would be creating a new textual review series. Because the textual commentaries would be open and available under a CC-BY-SA, members of the community could suggest additions to them or create new expanded versions or create completely new, but interoperable, textual commentaries that could be linked to the CC-BY-SA texts. Here the goal is to create an initial set of data about textual decisions in copyrighted editions and a framework that members of the community can extend.

Crane imagines the objection that all this infrastructure is not really needed, since those who use critical editions of classical texts have access to all that they need, and that nobody else really needs scholarly critical editions of classical authors. But this view he sees as essentially suicidal for advanced research that is publicly funded:

If we think that specialists at well-funded academic institutions alone need access to the best textual data, we should express that position clearly so that the federally funded agencies and private foundations know where we stand.

Rather, scholars have an obligation (the word occurs four times) to share their publicly funded work with the public that has ultimately paid for it. The driving force behind this passionately argued essay is a profound sense of duty, a commitment to “our obligation as humanists to advance the intellectual life of humanity.”

My questions and comments are as follows:

  • As someone dedicated to creating high quality CC-BY-SA digital commentaries on classical texts I applaud the vision, clarity, and passion of this essay. I believe with Crane that, as he has expressed in other venues, digitization is philology in the truest and highest sense. Digitization is a central intellectual and (again, Crane is correct) moral challenge facing our profession right now. If his essay shakes loose a few more philologists from unthinking acquiescence in the status quo, then it will be a victory.
  • Why are scholarly editions and apparatus criticus the highest priority? Why not work on wresting better translations and commentaries from copyright, and from the brains of working scholars? Though I hesitate to say it for fear of being seen as lacking scholarly seriousness, we already have digitized texts that are good enough for most purposes, and for most authors significant textual issues can usually be dealt with in the context of an explanatory commentary. There is a significant need for new translations, however. For example, neither Livy nor Polybius has ever been translated into Chinese. This means that two of the seminal and central texts for the study of the Roman Republic are simply not available at all to a large portion of humanity. Even in the much better-served realm of English, public domain translations are often all but unreadable, if not downright misleading. Why not direct some funding and some of the scholarly energies of classicists in that direction?
  • If we can think of the translation audience as the biggest and (arguably) most important circle, then the next concentric audience ring must be ancient language learners. What this group needs above all are well annotated editions with linguistic explanations, interpretations, and links to grammatical and historical reference works. One of the best ways for classical scholars to fulfil their duty to openly disseminate their findings would be to apply those findings to texts, summarizing research findings found in articles and monographs and making them directly relevant to the serious students who take the time to work their way through a dialogue of Plato or a book of Homer in Greek or a speech of Cicero in Latin. Existing open resources for this are woefully inadequate.
  • Finally, if we progress to the innermost circle of textual editors and research scholars, I would like to have some more specificity and examples of the ways in which TEI-XML will allow for interoperability. A recent article in the Journal of TEI by the classically trained Desmond Schmidt suggested that true interoperability of digital scholarly editions via TEI is not really possible, given the subjectivity of tagging. But even if we can all stick strictly to the EpiDoc standards, how does this benefit us in practice? Can we see an example of a pair of correctly tagged editions of the same text from different sources, and what benefit this interoperability provides? It seems that the minimal tag set proposed for apparatus criticus in the current EpiDoc standards for external apparatus criticus should make this theoretically feasible. But when it comes to in-line commentary, to the actual connecting of a scholarly discourse to a particular passage in a classical text via TEI-XML, the EpiDoc guidelines are a stub. And in the XML tagged commentaries on Perseus, like that of Greenough et al. on Caesar’s Gallic War, there doesn’t seem to be any clear interoperable linking with the Latin text itself. But maybe I’m misunderstanding the tags. I would love to be able to see a few examples of TEI-compliant commentaries on classical texts, and then a demonstration of how the effort needed to produce such bears actual fruit. Then I would consider the large investment of time and money required to put the DCC commentaries into TEI-XML.

Thank you, Dr. Crane, for this bracing and inspiring essay!