Digitizing Gonçalves’ Lexicon Magnum Latino-Sinicum

I’ve been working with others for several years now to digitize a large Latin-Chinese dictionary, but I realized that I have never blogged about the effort or publicly recognized the people involved. I just got back from China, where I discussed the project at a colloquium in Beijing, and work will intensify this summer, so now seems like as good a time as any to let people know about this exciting project.

Book title pages in Chinese and Latin

The goal is to create a large Latin-Chinese dictionary as a standalone mobile application and as a database freely available on the website Dickinson Classics Online, which collects resources for Chinese readers of Greek and Latin texts. The source of the dictionary data is the Lexicon Magnum Latino-Sinicum of Joaquim Affonso Gonçalves, first published in 1841. The author was a Portuguese Jesuit professor working with Chinese collaborators in Macau. No similar resource exists, and the increasing numbers of students of Latin in China have little access to the books and reference resources familiar to students in the West. The overall goal of the DCO project, of which this is a part, is to globalize the study of classical texts and the pre-modern humanities.

sample of dictionary

A sample showing Gonçalves’ distinctive lemmatizations

Work Already Done

  • The book itself is very rare. In 2016 Don Sailer and the staff at the Waidner-Spahr Library at Dickinson photographed a borrowed copy. (Thank you, Princeton libraries! They had one of the three existing copies of the last edition, and it was checked out at the time.) In 2016-17 Dickinson students Siyun Yan and Seth Levin ran the scans through the text recognition program ABBYY, hand-corrected the output, and created an Excel spreadsheet of the result.
headshot photos of two students

Dickinson Students Siyun Yan and Seth Levin carried out initial editing of the ABBYY output.

  • Seth Levin began coordinating Gonçalves’ headwords with the large list of Latin dictionary headwords known as Morpheus, used in the Perseus Project. The purpose was to make it easier to share and coordinate the data with other large Latin dictionaries, like those available on Logeion. At the same time the headwords were coordinated with the lemma list of The Bridge, a dictionary application created by Bret Mulligan that can create custom vocabulary lists for classical texts. The Bridge list largely overlaps with the Morpheus list, but includes some better definitions and “display lemmas,” the full forms of the dictionary headwords.
screenshot of ABBYY

correcting ABBYY output

Corrected output from ABBYY: single column


screenshot of spreadsheet

Combining Gonçalves’ definitions with Morpheus lemmata and shortdefs

screenshot of spreadsheet

lemmatization problems

  • In 2017 Qizhen Xie, a classics graduate student at the University of New Hampshire, continued the editing of the Chinese and the Latin headwords, and made considerable headway on this very large set of lemmas.

headshot photo of Qizhen Xie

  • English definitions and display lemmas for most items were added from the Morpheus and Bridge data sets. As part of the digitization I made the decision not to preserve Gonçalves’ display lemmas, since they are idiosyncratic.
  • In spring 2019 Eli Goings (Dickinson ’18) and I worked on adding missing display lemmas and English definitions for words that are in Gonçalves but not in the Morpheus list, or for which the Morpheus display lemmas are inadequate.
  • Most recently, developer Lara Frymark (Dickinson ’12) created the Android mobile application that will carry the data, and Ryan Burke, our Dickinson Drupal specialist (without whom DCC and DCO could not exist), created a content type for it on DCO.
website screenshot

DCO sample

Android mobile app screenshot
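
For readers curious about what coordinating headwords with a reference lemma list involves, here is a minimal sketch in Python. It is not the project’s actual workflow, which was carried out by hand in Excel. The function names, the normalization rules (dropping macrons, treating j/i and v/u as the same letter), and the sample rows with placeholder definitions are all assumptions made up for illustration.

```python
import unicodedata


def normalize(headword: str) -> str:
    """Reduce a headword to a comparison key (hypothetical rules)."""
    # Decompose accented letters, then drop combining marks such as macrons.
    decomposed = unicodedata.normalize("NFD", headword)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat j/i and v/u as the same letter for matching purposes.
    return bare.lower().replace("j", "i").replace("v", "u")


def match_headwords(source_rows, reference_lemmas):
    """Pair each (headword, definition) row with a reference lemma.

    Rows with exactly one candidate are matched automatically; rows with
    zero or several candidates are returned separately for a human editor.
    """
    index = {}
    for lemma in reference_lemmas:
        index.setdefault(normalize(lemma), []).append(lemma)

    matched, unmatched = [], []
    for headword, definition in source_rows:
        candidates = index.get(normalize(headword), [])
        if len(candidates) == 1:
            matched.append((headword, candidates[0], definition))
        else:
            unmatched.append((headword, candidates, definition))
    return matched, unmatched


# Invented sample data: the definitions are placeholders, not Gonçalves' text.
rows = [
    ("Juvenis", "(definition 1)"),
    ("Abacus", "(definition 2)"),
    ("Liber", "(definition 3)"),
]
reference = ["iuvenis", "abacus", "liber", "līber"]

matched, unmatched = match_headwords(rows, reference)
for row in unmatched:
    print("needs an editor:", row)
```

In this sketch “Liber” is left for an editor because two reference lemmas differ only in vowel length, which is the kind of homonym problem that makes hand correction necessary.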

Work Remaining

  • This summer we plan to finish creating the missing display lemmas, which number in the several thousand. They need to be created in a specific standard format, used in the Bridge, based on information in Gonçalves’ book itself, and added to the Excel spreadsheet. There will also be proofreading to do.
  • If time allows, we will edit and improve the Morpheus English definitions, which are often missing or faulty.

Gonçalves digitization workflow chart

For those who are interested, here is some English bibliography about western classics in China, and details about Gonçalves’ work.

Western Classics in China

Bartsch, Shadi. “The Ancient Greeks in Modern China: Interpretation and Metamorphosis.” In The Reception of Greek and Roman Culture in East Asia: Texts & Artefacts, Institutions & Practices, ed. A-B. Renger. Forthcoming from Brill. Pre-print available on Academia.edu.

Coleman, Kathleen. “Nondum Arabes Seresque Rogant: Classics Looks East.” Society for Classical Studies Blog, October 16, 2016. https://classicalstudies.org/scs-blog/kcoleman/blog-nondum-arabes-seresque-rogant-classics-looks-east

Li, Yongyi. “A New Incarnation for Latin in China.” Amphora, October 4, 2014. https://classicalstudies.org/amphora/new-incarnation-latin-china-yongyi-li

Liu, Jinyu. “Virgil in China in the Twentieth Century.” Sino-American Journal of Comparative Literature I (2015): 67–105. Available on Academia.edu.

Gonçalves, Macau, and Missionary Scholarship

Gonçalves, Joaquim Affonso. Vocabularium latino-sinicum: pronuntiatione mandarina latinis literis expressa. Macao: A Lauriano Hippolyto typis mandatum, 1836. 246 pages; 17 cm. Repr. 1886. 246 p.; 17 cm. (OCLC: 419787323)

Gonçalves, Joaquim Affonso. Lexicon manuale latino sinicum continens omnia vocabula latina utilia et primitiva, etiam Scripturae Sacrae. Macai, in Collegio S. Joseph ab E. Rosa typis mandatum, 1839. ii-vii, 498 pages; 23 cm. (OCLC: 7482643). Available on Hathi Trust and Google Books. Approximately 10,500 lemmas. 6th ed. Pekini: Typis Lazaristarum, 1937. viii, 446 pages; 22 cm.

Gonçalves, Joaquim Affonso. Lexicon magnum latino-sinicum ostendens etymologiam, prosodiam, et constructionem vocabulorum. Macai, in Collegio sancti Joseph. ab E. Rosa typis mandatum, 1841. iv, 779 pages; 32 cm. (OCLC: 39488723). Available on Google Books. 3rd edition, Pekini: Typis Congregationis Missionis, 1892 (OCLC: 663670553). Repr. 1936 (OCLC: 42878372).

Lach, Donald. Asia in the Making of Europe, vol. II: A Century of Wonder. Book 3: The Scholarly Disciplines. Chicago: University of Chicago Press, 2010.

Tang, Kaijian. Setting off from Macau: Essays on Jesuit History during the Ming and Qing Dynasties. Leiden: Brill, 2016. Ch. 2: “Macau and the Spread of Catholicism in Mainland China during the Late Ming and Early Qing Dynasties.”

Tiedemann, R.G. Handbook of Christianity in China. Leiden: Brill, 2010.

Latin, Chinese, and Baked Goods

A nice article was recently published on Concord Academy’s website about their successful collaborative project to translate the DCC Caesar into Mandarin. The project was led by CA’s Latin teacher Liz Penland, with help from their Mandarin teacher and many students. The article quotes Liz saying some very nice things about DCC:

Penland believes a classical education should not just be the mark of the elite. “Anyone should be able to study Latin,” she says. With its peer-reviewed, crowd-sourced approach, DCC is leading a charge to make the classics accessible to anyone with an internet connection. And despite an international trend of declining study of the classical humanities, thousands of students in China are learning Latin and ancient Greek.

Many high schools, colleges, and universities rely on DCC commentaries, as does Penland. By aggregating generations of contextual notes, they reveal “a chain of interpretation, of teaching, and of use,” she says. “They help the text feel more like a cultural object that many people have read.”

A little further down we see how many people were involved, students, administrators, and teachers:

Once Penland had recruited students, Adam Bailey, head of modern and classical languages, and John Drew, assistant head of school and academic dean, offered their support. It seemed the perfect project to encourage research and independent thinking. Mandarin teacher Wenjun Kuai agreed to consult with students. “Wenjun is such a generous colleague and a wonderful teacher,” Penland says. “She did so much work on the Mandarin. The students had responsibility and a voice in how the project ran. Their group work was self-directed. It was a highly collaborative process, a model of linguistic research.”

And then there is the crucial role of baked goods:

 A friendly but intense competition emerged, thanks to weekly “brownie challenges” that earned baked goods from Penland. Lin, who completed numerous translations, says, “I’m not going to lie. It really motivated me.”

Liz put the fundamental purposes of DCC better than I could: access, community, intellectual inquiry. I am so proud of the folks at CA who used DCC in such creative ways, as a learning resource, but also as a way to share knowledge with others and have fun themselves. It shows the potential power of getting students involved in scholarly digital projects at every appropriate level. Here’s hoping DCC can be part of more wonderful projects like this in the future!

The Concord Academy Latin-Mandarin Project team. Photo (by Rebecca Lindegren, use only with permission): Top row from left: Ben Zide, Tenzin Rosson, Ken Lin (林鸿燊), Michael Qiu (邱阳), Anna Dibble, Lysie Jones, Elizabeth Penland. Bottom row from left: Nora Zhou (周安琪), Helen Wu (吴颖怡), Rebecca Yang (杨若祺)

Workshop: Commenting on Latin Poetic Texts

I am both pleased and daunted to be leading a workshop on writing commentaries on Latin poetic texts, a full-day affair to be held on June 30, 2016 at the Guangqi Center at Shanghai Normal University. Here is an abstract:

Ut tibi sit legisse voluptas: Commenting on Latin Poetic Texts

This workshop will consider the art of commenting on Latin poetic texts, first as it has been done in recent years for English-speaking audiences, and then, in open discussion, considering how it might be done in the future for Chinese-speaking audiences. While scholars sometimes think of commenting on a text as an objective process of collecting the facts necessary for full understanding, in practice, the question of audience is paramount. Commentators mediate a text for an imagined reader, and must have a sympathetic awareness of what that reader needs, desires, and can process or understand. In addition to supplying felt needs, however, the commentator can actively lead and model humanistic practices: the precise appreciation of poetic language, close reading, cultural literacy, and skill in translation. The workshop will analyze some good examples of this kind of commentary in English on Ovid, then invite a forward-looking brainstorming session on how best to enhance the experience of reading Ovid for Chinese readers of Latin literature. Topics will include the art of the interpretive paraphrase, gloss, and summary; some reliable resources for finding information about geography, mythology, grammar, Roman customs, and rhetorical and literary devices; and techniques of commenting on style and tone.

(The Latin tag in the title comes from the epigram to Ovid’s Amores.) The workshop is part of the festivities for the second annual Shanghai Normal University Guangqi International Center for Scholars Classics Lecture and Seminar Series, organized by the wonderful team of Jinyu Liu 刘津瑜 and Heng Chen 陈恒.


Prof. Liu is the Principal Investigator of “Translating the Complete Corpus of Ovid’s Poetry into Chinese with Commentaries,” a multi-year project sponsored by a Chinese National Social Science Foundation Major Grant (2015-2020). She is collaborating with more than a dozen scholars from four countries. A full conference with a very impressive roster of speakers will be held in Shanghai on May 31-June 2, 2017.

I am not directly involved with this project, but it served as a useful handle to think about a commentary-writing workshop in Shanghai, helping achieve a more concrete focus for what is a rather terrifying topic. My own activity as an editor on DCC has given me lots of particular ideas and preferences, but the last thing I would want to do is foist those on a Chinese audience. The really exciting thing here is the opportunity to reinvent the genre in a different context, taking the best aspects from the traditions of European commentary and liberating new energies. My goal is to show a few examples of what I think are particularly good recent instances in English, and let the discussion go where it will. Looking forward to a stimulating discussion!

–Chris Francese

Summer 2016 Paid Research Internships in Classical Studies

Dickinson students are encouraged to apply for any of three 8-week paid research internships in Classical Studies in summer 2016 (the second of these positions is contingent on a pending funding decision by the Dickinson Research and Development Committee). The pay is $350 per week, plus housing on Dickinson’s campus. The work will be carried out under the supervision of Prof. Francese, and result in substantial credited contributions to the Dickinson College Commentaries and Dickinson Classics Online Projects.

Dates: May 30–July 22, 2016

Location: Carlisle, PA

Application deadline: March 11, 2016

Positions 1 and 2 Description: Digital Latin-Chinese Lexicon

Work on the digitization of the Latin-Chinese dictionary of Joaquim Affonso Gonçalves (Lexicon magnum latino-sinicum, 1841, 779 pp.), which will eventually result in a mobile application and a database that will form an essential part of the infrastructure of the project Dickinson Classics Online. Begun in 2015, DCO is intended to provide better access to the Greco-Roman classics for Chinese speakers. One student (position 1) will edit Gonçalves’ Chinese definitions to make sure they are properly transcribed and modernized; the other (position 2) will edit the Latin headwords to make them correspond to those of the base dictionary published by the Laboratoire d’Analyse Statistique des Langues Anciennes (LASLA). In many cases Gonçalves’ headwords will have to be split or combined to conform to the LASLA headwords, and in every case the format of the Latin headwords will have to be expanded to meet modern lexicographical standards.
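
As a rough illustration of what splitting and combining headwords means for the data, the sketch below re-keys (headword, definition) pairs to a target lemma list. Everything in it is hypothetical: the function name, the two mapping tables, the numbered homonym labels, and the sample entries are invented for this example, and in the real project each correspondence would be decided case by case by an editor working in Excel.

```python
from collections import defaultdict


def conform_entries(entries, split_map, merge_map):
    """Re-key (headword, definition) pairs to a target lemma list.

    split_map: source headword -> list of target lemmas (one entry becomes several)
    merge_map: source headword -> single target lemma (several entries become one)
    Both maps are hypothetical, editor-supplied tables.
    """
    grouped = defaultdict(list)
    for headword, definition in entries:
        if headword in split_map:
            # Copy the full definition under every target lemma; an editor
            # would later trim each copy to the senses that belong there.
            for target in split_map[headword]:
                grouped[target].append(definition)
        else:
            # Unmapped headwords keep their own form as the target lemma.
            target = merge_map.get(headword, headword)
            grouped[target].append(definition)
    # Join definitions that ended up under the same target lemma.
    return {lemma: "; ".join(defs) for lemma, defs in grouped.items()}


# Invented sample data with placeholder definitions.
entries = [
    ("abdico", "(definition A)"),
    ("amans", "(definition B)"),
    ("amo", "(definition C)"),
]
split_map = {"abdico": ["abdico_1", "abdico_2"]}  # one source entry, two target homonyms
merge_map = {"amans": "amo"}  # participle filed under its verb

result = conform_entries(entries, split_map, merge_map)
for lemma, definition in result.items():
    print(lemma, "->", definition)
```

The split case deliberately duplicates the definition rather than guessing which senses belong to which homonym, since that judgment requires reading the entry.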

Positions 1 and 2 Requirements

Position 1 requirements:

  • strong written Chinese, familiarity with both classical and simplified characters
  • attention to detail
  • interest in languages
  • facility with Excel

Position 2 requirements:

  • upper-intermediate or advanced Latin
  • attention to detail
  • interest in languages
  • facility with Excel

Positions 1 and 2 Schedule

Week 1 (May 30-June 3): orientation to the project:

  • The basics of Latin lexicography, and the similarities and differences between existing dictionaries and their source material
  • Introduction to primary resources that will be used in this project: Joseph Denooz, Nouveau lexique fréquentiel de latin, Logeion, and Gonçalves’ Lexicon Magnum Latino-Sinicum.
  • Explanation of LASLA’s working methods and their style of lemmata
  • Examination of the LASLA list of homonyms, and explanation of their labeling conventions and French abbreviations
  • Practice creating dictionary forms in Excel in the existing DCC style, based on
    • LASLA lemma
    • Gonçalves’ lemma
    • Lemmas available in Logeion, especially Woordenboek Latijn/Nederlands (2011)
  • Practice typing Latin characters with macra (long marks over vowels) using the Maori keyboard in Windows, and explanation of where that is necessary, and where to find accurate information about vowel quantity
  • Analysis of the Chinese OCR to determine the extent of the revisions needed to modernize it
  • Practice editing Chinese definitions to conform with edited Latin lemmata, splitting and combining as needed.
  • Practice formatting Chinese definitions to include Latin idioms as in Gonçalves

Weeks 3-8: work on creating the database, going alphabetically.

Position 3 Description: Multimedia Edition of the Aeneid

Work on a forthcoming DCC multimedia edition of the Aeneid, which will include

  • Notes, drawn mostly from older school editions, that elucidate the language and the context
  • Images, art, and illustrations, annotated to make clear how they relate to the text
  • Complete running vocabulary lists for the whole poem
  • Audio recordings of the Latin read aloud, and videos of the scansion
  • A full Vergilian lexicon based on that of Henry Frieze
  • Recordings of Renaissance music on texts from the Aeneid
  • Comprehensive linking to Allen & Greenough’s Latin Grammar
  • Comprehensive linking to Pleiades for all places mentioned in the text

Position 3 requirements:

  • familiarity with the Aeneid in Latin
  • attention to detail
  • familiarity with Adobe Photoshop

Position 3 Schedule

  • Weeks 1–3: gathering, editing, and posting of images from medieval manuscripts of the Aeneid
  • Weeks 4–5: transcription, upload, and linking of Aeneid scholarship excerpts
  • Weeks 6–8: creation of RDF file for linked data synching with Pelagios Project, for all places mentioned in the notes

TO APPLY: please send a letter of interest with a curriculum vitae to francese@dickinson.edu by March 11, 2016.