Digital Commentaries and Generative AI

Posted on January 14, 2025 by Chris Francese

A fascinating recent paper by Sarah Abowitz, Alison Babeu, and Gregory Crane, all of Tufts University, asks whether “general advances in machine learning could power alternative digital aids” to reading foreign-language sources more easily, “requiring less labor” than the immense effort required to create traditional print commentaries. In other words, how can we best use technology to transcend the limits of print in helping readers of classical works in the digital age? It’s a question at the heart of DCC, and of the other digital commentary projects they mention, the Ajax Multi-commentary Project and New Alexandria. After surveying classical commentary traditions past and present, print and digital (with some very kind words about DCC), they discuss a study conducted on sample commentaries aimed at different audiences on sections from Thucydides’ History and from the Iliad (Book 6 of both works). The authors do not consider the rich tradition of Hebrew and Christian exegesis, but their comments on the classical tradition could perhaps be extended to other traditions as well.

Unknown artist/maker, illuminator, Elijah ben Meshallum, scribe, Elijah ben Jehiel, scribe, et al. Decorated Text Page from the Rothschild Pentateuch, 1296. J. Paul Getty Museum, Los Angeles

For the study, two coders tagged the comments and categorized them by function into six varieties: syntactic aid, translation, semantic aid, inconsistency alerts (which informed the reader of changes the editor made to the original text), stylistic claims (which assert that a certain linguistic feature occurs in a certain way in the text, for example, “this is one of Nicias’ favorite adjectives”), and finally reference pointers, which refer to any specific reference work, primary source, or scholarly paper.

The authors acknowledge that these categories are not watertight, and that quite a few comments straddled more than one of the categories. Unsurprisingly, commentaries oriented towards scholars have a higher percentage of reference pointers and less in the way of translation and syntactic aids. Those aimed at students have more in the way of syntactic aids and translation. The many bare reference pointers characteristic of scholarly commentaries (I call them “cf.” notes and attempt to rigorously exclude them from DCC) are often dead ends in an open digital medium, since the material referred to is copyrighted and behind paywalls.
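
To make the typology concrete, here is a minimal Python sketch of how tagged notes might be tallied. It is an illustration only, not the authors' actual coding scheme or data: the `NoteType` names simply mirror the six categories described above, and the four sample notes are invented. Each note carries a set of tags, so a comment that straddles two categories is counted under both.

```python
from collections import Counter
from enum import Enum


class NoteType(Enum):
    """The six comment categories described in the paper (names are mine)."""
    SYNTACTIC_AID = "syntactic aid"
    TRANSLATION = "translation"
    SEMANTIC_AID = "semantic aid"
    INCONSISTENCY_ALERT = "inconsistency alert"
    STYLISTIC_CLAIM = "stylistic claim"
    REFERENCE_POINTER = "reference pointer"


def category_shares(tagged_notes):
    """Return each category's share of all tags, as a fraction from 0 to 1.

    tagged_notes is a list of sets of NoteType; a note may carry several tags.
    """
    counts = Counter(tag for tags in tagged_notes for tag in tags)
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in NoteType}
    return {t: counts[t] / total for t in NoteType}


# Hypothetical tags for four notes from an imaginary student commentary.
student_notes = [
    {NoteType.SYNTACTIC_AID},
    {NoteType.TRANSLATION, NoteType.SYNTACTIC_AID},  # straddles two categories
    {NoteType.SEMANTIC_AID},
    {NoteType.REFERENCE_POINTER},
]

shares = category_shares(student_notes)
for note_type, share in shares.items():
    print(f"{note_type.value}: {share:.0%}")
```

Running the same tally over a scholarly commentary and a student commentary would give the kind of side-by-side percentages the paper reports.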

The authors point out that there is very little research about user experience of commentaries, of the kind that would be standard procedure in creating any self-respecting digital interface. What do we really know about how students use commentaries on average, and which types of notes are most helpful? We all have opinions on this topic, but apparently little or nothing has been done formally to investigate the question.

DCC’s practice is now to road-test commentaries with students before publication. In some cases, students are involved in choosing notes from existing public domain commentaries for variorum editions. A new cohort of DCC high school interns is on the way for summer 2025. We’re not gathering any data about student preferences from this process, but we could.

Another interesting point the authors make, coming from a computer science perspective, is that the stylistic claims in commentaries are often based on data (for example the number of occurrences of a given word or phrase), yet classical commentators almost never provide the data to back these assertions up. (Some notes in the Cambridge Green & Yellow series do, and Ronald Syme’s Tacitus is a notable pre-digital exception.) In a digital medium, it would be quite possible to provide the data, for example using treebank data.
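
As a rough sketch of what "providing the data" could look like, the Python below counts adjectives per speaker in a tiny lemmatized sample. Everything in it is an assumption made for illustration: the token records are invented and transliterated, they are not drawn from any real treebank, and a real treebank would need its own parsing step to produce records of this shape.

```python
from collections import Counter

# Invented token records: (speaker, lemma, part of speech).
# These are NOT real counts from Thucydides; they only show the shape of the data.
tokens = [
    ("Nicias", "asphales", "adj"),
    ("Nicias", "polis", "noun"),
    ("Nicias", "asphales", "adj"),
    ("Alcibiades", "asphales", "adj"),
    ("Alcibiades", "neos", "adj"),
    ("Nicias", "megas", "adj"),
]


def adjective_counts(tokens, speaker):
    """Count how often each adjective lemma occurs in one speaker's words."""
    return Counter(
        lemma for who, lemma, pos in tokens if who == speaker and pos == "adj"
    )


nicias = adjective_counts(tokens, "Nicias")
print(nicias.most_common(3))
```

A note claiming that a word is a speaker's "favorite adjective" could then link to a table like this one rather than asking the reader to take the claim on trust.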

The goals of this kind of discussion, it seems to me, are two: first to serve readers better, and second to use computational means to create more content with less labor. To what extent can artificial intelligence and generative AI aid in this enterprise? This paper suggested to me an interesting approach to testing and moving forward. The creation of a typology of comments is an important advance. This could be refined by analyzing a larger sample of older commentaries, akin to the corpus collected for the Ajax project on Sophocles’ Ajax, though these are all very scholarly, and not very useful for most readers, in my opinion. It would be better to work with authors for whom multiple parallel school commentaries exist, say Caesar, Cicero, Vergil, Xenophon, Homer, or Lysias.

Once a refined and well-understood typology of notes is ready, the types can be evaluated in terms of their relative utility for different audiences, and serve as a basis for creating prompts for generative AI. A specific, well-designed prompt might elicit from AI a certain type of comment, and a combination of those could be used to create a draft commentary on new texts, to be evaluated and edited by humans.
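
One way to picture this is a small table of prompt templates keyed by note type. The Python sketch below is hypothetical throughout: the template wording, the `build_prompts` helper, and the sample passage are my own illustrations, and no AI service is actually called. It only assembles the text that would be sent.

```python
# Hypothetical prompt templates, one per note type from the typology.
NOTE_PROMPTS = {
    "syntactic aid": (
        "Explain the construction of any difficult clause in this {language} "
        "passage for {audience}. Do not translate the whole passage."
    ),
    "translation": (
        "Give a close English rendering of the hardest phrase in this "
        "{language} passage, suitable for {audience}."
    ),
    "semantic aid": (
        "Explain any word in this {language} passage whose meaning here "
        "would surprise {audience}."
    ),
}


def build_prompts(passage, language, audience, note_types):
    """Return one prompt string per requested note type."""
    prompts = []
    for note_type in note_types:
        if note_type not in NOTE_PROMPTS:
            raise ValueError(f"no template for note type: {note_type!r}")
        instruction = NOTE_PROMPTS[note_type].format(
            language=language, audience=audience
        )
        prompts.append(f"{instruction}\n\nPassage:\n{passage}")
    return prompts


drafts = build_prompts(
    passage="Gallia est omnis divisa in partes tres.",
    language="Latin",
    audience="second-year students",
    note_types=["syntactic aid", "translation"],
)
for prompt in drafts:
    print(prompt)
    print("---")
```

Keeping the audience as a parameter reflects the point that the useful mix of note types differs between students and scholars; the responses would still need human evaluation and editing.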

Another key piece of the puzzle is something that Gregory Crane mentioned at the recent SCS panel, the fact that HathiTrust now has marvelously good multilingual OCR in its back end. In theory it will be possible to do a much larger harvesting of existing public domain commentaries, to tag that data and use it to create a more extensive collection of note types, and to do a kind of sifting operation in which users select the notes that they find particularly helpful. The knowledge thus gained could be used to craft AI prompts. Obviously, anything produced by generative AI would have to be extensively edited by humans. But getting us part way would be extremely helpful. My own experiments on generating vocabulary lists with ChatGPT seem promising.

SCS 2025 Panel on Using AI in Classical Philology

Posted on January 7, 2025 by Chris Francese

Every year the Digital Classics Association (DCA) sponsors a panel at the meetings of the Society for Classical Studies (SCS), and this year, appropriately, it was all about the potential of LLMs and AI in classical philology. Organized by Neil Coffee, the driving force behind DCA, it was a star-studded panel. Here is (was) the program:

Opening Up Classics with AI (organized by the Digital Classics Association)
Neil Coffee, University at Buffalo, SUNY, Organizer

Neil Coffee, University at Buffalo, SUNY
Introduction
Samuel Huskey, University of Oklahoma
Opening Up Bottlenecks in Digital Classics Workflows with Human-in-the-Loop AI
Patrick Burns, New York University
Prompt Engineering for Latin Teachers
Edward Ross, University of Reading, and Jackie Baines, University of Reading
Generative Image AI and Teaching Classics: A Case of Exaggeration
Gregory Crane, Tufts University
AI, Machine Actionable Publication and Assigning Credit
Joseph Dexter, Harvard University, and Pramit Chaudhuri, University of Texas at Austin
Benchmarking Generative AI Models for Classical Literary Criticism

Slide listing Gregory Crane's goals for his talk at the SCS

The abstracts of the talks are posted here, so I won’t try to summarize them. My favorite quote came from the ever-polemical Gregory Crane, who referred to the monographs being sold nearby in the SCS book publishers’ display as “a dark archive,” and said “publications/datasets matter insofar as they fire the human mind.” That is a scholarly goal I can get behind, firing the human mind.

Huskey is working on gathering comprehensive metadata for the Digital Latin Library, sucking in library records from all over the world and trying to disambiguate author names and work titles, many of which have multiple variants, confusing overlaps, and vagueness in the existing records (opera omnia? opera selecta? Bucolica? Eclogae?).

Burns is working on creating extensive reading material for Latin learners, as we go from the extreme scarcity of comprehensible texts for beginners to a world where we can have essentially infinite amounts of Latin pitched at any level. Charmingly, he had an LLM create a story about Odysseus and the Cyclops from the perspective of the sheep. From this talk I learned that prompts can be very large. A human will be confused by a question that is 300 pages long; AIs can easily take it in and synthesize. His main message is that you can get a basic understanding of how these things work without being a computer scientist, and it is helpful to have such an understanding.

Ross and Baines are keeping track of AI-generated images that have something to do with the ancient world, and ferreting out distorted history, incorrect information, and modern biases. They showed an amusing image of “Nike the Greek Goddess” flying around wearing a pair of Nike sneakers. Images of Hades draw extensively from Disney’s Hercules. They believe scholars have a duty to keep track of the craziness that is out there, if only to help the image tools get gradually more historically accurate.

Dexter and Chaudhuri just finished teaching a seminar on Latin literary history using only fragmentary authors, and are trying to use AI to craft a new narrative about Latin literary history based on this material.

Crane wants to leverage AI to give people without extensive knowledge of historical languages better access to the classics of the world, the whole world, through enhanced translations and reference tools, and to serve audiences in their own languages (e.g. Persian), not just English. I was particularly taken with his effort to use machine translation to translate the examples in Kühner-Gerth’s Ausführliche Grammatik der griechischen Sprache into English, thus unlocking this fundamental reference work for a broader audience. He’s training AI to figure out which of the 26 uses of ὡς is active in a particular passage of Greek. He also pointed out that the multilingual OCR at HathiTrust is off-the-chain good at this point.

What struck me was the way that using AI tools requires scholars to be explicit about their goals, what they really want to do, in a way that writing a journal article does not. These papers used AI tools for different, all legitimate, philological and scholarly goals. Do you want to

critique historical bias and inaccuracy on the web? (Ross & Baines)
tell a story about literary history? (Dexter and Chaudhuri)
help people learn Latin? (Burns)
catalogue published texts? (Huskey)
attribute passages correctly? (Dexter and Chaudhuri)
fire the human mind? (Crane)

AI can help. Notably, none of these goals is rewarded by the academic world as currently constituted. Which is one more reason I respect these scholars for doing interesting work despite the professional incentives to churn out another article or book for the dark archive.

I went into this panel rather repelled by AI, more aware of it as a tool for cheating on college writing assignments, and a potential menace to humanity, than as a potential aid in my beloved philology. I came out intrigued with the possibilities and wanting to try to apply it to the workflows of DCC (see this post on my first attempts to create DCC style vocabulary lists with ChatGPT and Claude).

Yuval Noah Harari’s fascinating 2024 book Nexus convinced me that there is no pre-determined end to the AI story, and that we need to be actively engaged in thinking about it and guiding its trajectory for human goals. Harari, a historian, talks about the ways that every new information technology brought good things and bad things. Printing enabled both the scientific revolution and witch hunts. What matters is how we use it and shape it. These papers all showed scholarly uses of AI that seem to me both interesting and productive.

Public Speaking: Secrets from the Classical Tradition

Posted on August 15, 2023 by Chris Francese

Fighting racism, or any wicked, or simply wrongheaded, idea, ultimately demands attempts at persuasion, person to person. All non-violent activism and efforts at social change depend on rhetoric. It is fashionable now to believe that persuasion—the political kind, anyway—is something of a mirage, that much of our thinking is “motivated,” driven primarily not by argument and evidence but by self-interest, tribal loyalties, enduring personality traits, and demographic facts. Identity comes first; the rationalizations that make us feel that we are correct in our prejudices hobble along after. So argues Ezra Klein, for example, based on many psychological and political science studies, in Why We’re Polarized (2020). The role of the art of rhetoric in this model is not to persuade, but to activate and weaponize identities and their powerful latent drives. Politics in this view is best understood not as reasoned civic dialogue but as a high-stakes all-in partisan combat. Persuasion exists, but as a dog tied to the cart of identity group competition—so say the studies.

Classical authors from Aristotle to Demosthenes, Cicero to Quintilian, understood that the antithesis between identity and reason posed by such focus-group-and-psychological-study-wielding social scientists is entirely false. Common sense chimes in with Aristotle’s Rhetoric, which is really a brilliant exploration of psychology and emotion: persuasion is real, but not entirely rational. Eloquence uses reason and emotion, responds to identity and trades in argument. It is founded on the audience’s predispositions, its prejudices and existing opinions, but lives in the art of the orator. The orator’s moral responsibility as a citizen is significant because persuasion has real consequences, sometimes life and death. And the weapon of demagoguery is always at hand. Virtually every classical historian explores this dynamic, not to speak of the orators themselves and the rhetorically trained and gifted classical poets and dramatists. There is no more central topic in the classical canon than the techniques and ethics of persuasion, and no more burningly relevant aspect of the classical tradition today.

The power, delight, and social utility of eloquence, the universal desire among educated people to possess it, and the perception that the classical texts had unique keys to understanding it, lie behind the dominance of classical Greek and Latin in antebellum educational curricula. Caroline Winterer’s The Culture of Classicism: Ancient Greece and Rome in American Intellectual Life, 1780–1910 (2002) describes how students were willing to put up with punishing pedagogical regimes of memorization and humiliation to acquire access to the “world of words.” In the pre-industrial economy, classical study was the main route away from agricultural work to professional distinction as a lawyer, doctor, or preacher. But as Winterer emphasizes, the classical texts were not just a toolbox for professional success. They came with a set of values seen as key for maintenance of a republic, values that put checks on self-interest and party passion. Later, as grueling preparation in Greek and Latin proved inessential for success (hello, Andrew Jackson and Abraham Lincoln), the rationale for the classics shifted to their more ineffable aesthetic qualities, the wisdom and inner perfection to be found in the deep study of classical culture. The practical, rhetorical-political rationale for the classics shifted to the background. This inwardly directed self-cultivating focus of the classics as it developed in the later nineteenth century was the legacy of classical teaching to the humanities in the modern academy, argues Winterer.

Why not revive the tradition of classics as a route to effectiveness in the world via eloquence, minus the Precambrian teaching methods? Many students are anxious about speaking in public, though they know the ability to do so is valuable for almost every profession, career, or ambition. Despite its importance, public speaking is absent from most college curricula. It falls in the cracks between academic disciplines. Classical studies is well placed to meet this educational need. A judicious selection of classical theory and models, combined with modern insights and examples and abundant practice, will improve students’ skills, deepen their appreciation of effective speaking, and help them critique unprincipled persuasion and demagoguery. Perhaps most importantly, it will help them get attention for ideas and causes they care about. Classical texts could help them change the world.

One problem is that classicists don’t consider themselves qualified to teach “speech.” Another is that, for many students, speech carries unpleasant reminders of being forced to watch the greatest hits of American political oratory and encouraged to speak in public in pompous platitudes. Then there is simple ignorance of what classical rhetoric is actually about. The peddling of that trio of abstractions, logos, ethos and pathos—terms dimly understood but somehow profound—and the focus on rhetorical devices (more recherche Greek terms) represent all that is irritating and pretentious in classical teaching. Then again, Aristotle’s Rhetoric is no easy read, and ancient rhetorical manuals are forensic in orientation and remote from the needs of the English language. Unfortunately, modern speech textbooks do little to improve on the pedantry of some of their ancient predecessors.

Luckily, materials are starting to become available that could form the basis of a contemporary public speaking class with a classical spin. James May’s How to Win an Argument: An Ancient Guide to the Art of Persuasion (2017) well translates key passages from the oratorical works of Cicero, helpfully introduced and annotated, and (bonus) it includes the Latin texts. Veteran journalist and teacher Roy Peter Clark publishes “x-ray readings” of contemporary speeches, like Greta Thunberg’s UN Speech and Obama’s Philadelphia speech on race, which are essentially classical-style rhetorical analyses without the intimidating verbiage. The Harvard Business Review has for years been publishing brilliant, undogmatic essays on persuasion in a business context, many of them with unacknowledged classical content, such as Jay Conger’s “The Necessary Art of Persuasion.”

One way to avoid the platitudinous reputation of “speech” is to focus on real life rhetorical challenges, like giving a pep talk (Sallust’s Catiline delivers two excellent ones), motivating people to take a looming threat seriously (Demosthenes’ life’s work), or apologizing (Aristotle has excellent advice, Rhet. Book 2, section 3). One can then pair classical precepts with modern examples, which students can find themselves and contribute to the discussion. Ditch logos, ethos and pathos (essentially an analytical framework) in favor of the practical trio of inventio, elocutio, and actio, that is, framing (coming up with arguments to suit a particular situation and audience), style (using memorable language), and delivery. This is Conger’s model, a stripped down, non-forensic version of the classical system. Students tend to be fixated on actio and neglect inventio and elocutio. Conger puts these in balance and adds the insight that an effective persuader/manager must listen as well as talk.

Classically informed analyses of modern speeches, such as Clark’s, or the wonderful essay on Kennedy’s Inaugural by Burnham Carter, Jr.,[1] can help to focus attention on tailoring a message to a specific audience and paying close attention to word order, metaphor, sound, clause length, and the like. The classical stylistic criteria of correctness (words in common use, properly designating the things you want to say), clarity (meaning is immediately understandable, avoids excessive abstraction and euphemism), ornamentation (use of tropes and figures to add vitality and polish), and propriety (parts make a whole and the whole fits the occasion) apply to every speech and serve nicely as part of a rubric.

One way to keep the classical content lively is to read about famously high stakes rhetorical moments: the Mytilenaean debate (Johanna Hanink’s How to Think about War: An Ancient Guide to Foreign Policy [2019] excerpts and translates this and all the key speeches from Thucydides), the conspiracy of Catiline, Caesar and the mutiny at Vesontio, Marc Antony at the funeral of Caesar. Truly strong translations of key speeches from classical orators and historians, read aloud and recorded by good actors, would be a great help. Samuel Rowe has made a start by recording the first half of Cicero’s first Catilinarian in a compelling style, though the translation is the nineteenth century one by Yonge.

A syllabus constructed along these lines worked well for me, and the class drew a group more diverse in every way than the ones I teach in a normal classical civilization class. Since some of their speeches were about their own lives, experiences, and interests, I got to know the students better than in any class I have ever taught. Every teacher will have favorite speeches from classical works, so the problem is more one of choice and presentation than of finding suitable material. The balance of ancient and modern, of Aristotle and TED talk, will depend on what the students are ready for. But I am convinced that the vitality of classical rhetoric, its powerful conceptual framework, its ethic of public service, and its stylistic excellence, can speak effectively to contemporary problems and inspire today’s students.

[1] “President Kennedy’s Inaugural Address,” College Composition and Communication 14 (1963), 36–40.

Concordance Liberated: Apuleius

Posted on January 14, 2019 by Chris Francese

About a year ago Bret Mulligan and I started on a project to liberate the data contained in concordances of classical authors, by digitizing the concordance, then unscrambling it to produce a fully lemmatized text. This lemmatized text was then to be combined with dictionary head words and definitions to create a full lexicon. The idea is that those who want to read the author could create full, accurate vocabulary lists based on this data, using The Bridge.

In April 2018 we received a Pedagogy Grant from the Society for Classical Studies (see “Flight of the Concordances”) to begin with the Index Apuleianus by William Abbott Oldfather et al. (published in 1934 by the American Philological Association). Today I am proud to report on the successful completion of that part of the project.

A website describing the broader Concordance Liberation Project is now live.
The GitHub repository contains the plain text of the concordance and the lemmatized text with full dictionary forms and definitions.
The searchable interface at The Bridge makes this data available to teachers and others who want to create vocabulary lists for works of Apuleius.

The digitization was performed by NewGen Knowledge Works. Chris Francese and Bret Mulligan performed the data analysis prefatory to processing and conversion. Michael Skalak wrote the code and transformed the plain text to a spreadsheet. Post-processing involved creating equivalencies between the lemmas used by Oldfather and his team and the lemmas or “titles” used by The Bridge; making sure that dictionary forms or display lemmas matched those; and then equipping the dictionary headwords with appropriate definitions. This difficult and meticulous work was carried out by Eli Goings (Dickinson ’18) and John Burgess (Haverford ’19), with funding from Dickinson and Haverford Colleges. As those who know Apuleius are aware, his vocabulary is immense. This work effectively creates a full lexicon of his works with definitions for even the most obscure words.
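
For readers curious about what "unscrambling" a concordance involves, here is a toy Python sketch of the general idea. It is not the code Michael Skalak wrote, and the data layout is an assumption: I imagine each concordance entry as a lemma with a list of (form, location) pairs, where the location tuple is detailed enough to fix word order. The word positions and short glosses below are invented for the example.

```python
# Toy concordance: lemma -> list of (attested form, location).
# Location is an assumed (book, chapter, word index) tuple; the real index
# cites passages in its own way and would need its own parsing step.
concordance = {
    "intendo": [("intende", (1, 1, 2))],
    "laetor": [("laetaberis", (1, 1, 3))],
    "lector": [("lector", (1, 1, 1))],
}

# Illustrative short definitions keyed by lemma.
definitions = {
    "lector": "reader",
    "intendo": "stretch, direct; pay attention",
    "laetor": "rejoice",
}


def unscramble(concordance):
    """Turn an alphabetical concordance back into running (form, lemma) pairs."""
    rows = [
        (location, form, lemma)
        for lemma, occurrences in concordance.items()
        for form, location in occurrences
    ]
    rows.sort()  # sorting by location restores the order of the text
    return [(form, lemma) for _, form, lemma in rows]


lemmatized_text = unscramble(concordance)
for form, lemma in lemmatized_text:
    print(f"{form}\t{lemma}\t{definitions.get(lemma, '?')}")
```

The alphabetical index and the running text hold the same information in different orders; sorting by location is what recovers the text, and joining on the lemma is what attaches the definitions. The harder part in practice, as described above, is reconciling the concordance's lemmas with the headwords used by The Bridge.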

“Concordance Liberation” is now an ongoing project, and the SCS grant gave it an important impetus, for which we are very grateful. The next author we are tackling is Eutropius, and we have many others in the queue. Please let us know if you have any comments or suggestions.

Classicists without Borders

Posted on March 30, 2016 by Chris Francese

Photo: Quinn Dombrowski, via flickr

Classical outreach programs are proliferating. See, for example, the ones at Oxford, the University of Cincinnati, the Classics in Communities Project in the UK, and the variety of outreach initiatives at the SCS. The problem with the term outreach is the slight air of desperation. There must be people “out” there who have never heard our message, who need to be “reached.” Hands extend into a void, waving cheerfully at passersby, signaling for attention, anxious not to be ignored. I believe we should think less in terms of reaching out and more in terms of service, of finding places where our skills are needed or welcome, even when those are not the places that our ordinary professional lives typically take us. Possibly the best current example of this is the series of workshops run by Classics in Communities, bringing support to those in schools with no Latin programs who want nonetheless to teach Latin. I can think of two other areas where there is a certain void, a space where the voices of Classicists without Borders would potentially be welcome, even useful, but have not so far been heard very much.

The first is podcasting. The podcast medium is widely enjoyed as recreation by people as they exercise, walk, travel, go about housework routines, etc. This is an audience hungry for new content, eager to explore new ideas, and interested in all sorts of things. Perhaps they studied Latin at school, or have always had a love of mythology. The mechanics of producing and delivering podcasts to this audience are well within the technological competence of most classicists. Success in the medium, as with much teaching, requires a conversational style, a sense of humor, and an ability to tell stories.

A second area is that of digital project reviews. The vast majority of people who are not professional classicists find their information about the classical world on the internet, and there is a heartening proliferation of good quality digital projects about the ancient world. Still, there is a good deal that is slapdash and ill-informed. Who can tell the difference? Classicists can. Where is there a reliable venue for critiquing, evaluating, and commenting on digital resources? Nowhere. The SCS Communications Committee (which I currently chair), among its other activities, is creating just such a venue as part of the SCS website and blog. When qualified review of open digital resources becomes as routine as it is for monographs, the prestige and the quality of open online publications will rise. The SCS Communications Committee has created a clear set of guidelines for such reviews, and is actively soliciting reviewers and projects to review. Please leave a comment if you have any suggestions for this, or ideas about other “Classicists without Borders” initiatives.

The Society for Classical Studies and Digital Publication

Posted on August 22, 2014 by Chris Francese

Every year at this time I have a look at the statements of candidates for leadership offices in the Society for Classical Studies (known until recently as the American Philological Association) to see what kind of positions they take on matters relating to digital humanities and digital publication. Two years ago the Digital Classics Association had just been approved as a Type II Affiliated group, and there were plans for a new multi-million dollar portal of classics digital outreach. Last year the latter initiative was rightly being abandoned, and the discussion was more about the role of our professional association in the world of academic publishing. While some wanted to defend the status and importance of the print monograph, others hoped the APA would help guide web users to quality resources on the internet. In last year’s post I made the point that to focus on the delivery method (paid print vs. open electronic) is to miss a key potential role of the professional association: to foster networks of peer review for scholarship, no matter how it appears.

This year’s candidate statements share a sense of anxiety about the future of the field and the status of the humanities in the academy. Several make the excellent point that more can be done to foster Latin in secondary schools, “literally our lifeline,” as presidential candidate Peter Burian says. As for digital publication, presidential candidate Roger Bagnall is reticent, which is odd given his key role in the development of online scholarly publication of papyri. But Peter Burian emphasizes the key issue, it seems to me, peer review:

The APA has a strong track record, and it could be used to help our profession (and others) move toward full recognition of on-line publication and various kinds of digital scholarship. Works of scholarship that are crucial for specialists are becoming increasingly difficult to get into print, and there are many kinds of scholarship for which print is not the best, or even a satisfactory, medium. A strong, well-understood peer-review process governed by our internationally recognized professional association could make the difference in how such works are weighed by tenure and promotion committees.

Publications and Research is, of course, the committee where the changes in scholarly publishing are at the center. Here there are two candidates, Emily Greenwood and Nita Krevans. Greenwood urges the association “to explore new avenues for open digital publication in Classics and to support and promote excellent existing sites.” Krevans’ comments are altogether more edgy. She says that electronic publication is “still the elephant in the room.” Krevans continues:

On the one hand, the natural ‘gate-keeping’ function of limited print space is disappearing; this means that scholarly associations like ours are becoming the source of new guidelines for peer review and publication. On the other hand, commercial publishers of academic journals are fighting desperately to preserve their turf as learned society e-publishing emerges as a partial solution to strained library acquisition budgets (witness the battle between Elsevier and the mathematicians). Academic presses are currently caught in the middle of these conflicting imperatives. In addition to setting field-wide standards for electronic journals AND monographs, I believe the APA/SCS can play an important role advocating for the electronic archiving and dissemination of smaller scholarly journals in our field, which are currently not easily available online. These days, if you are not in JSTOR, you are invisible.

I think it is optimistic to say that scholarly associations are becoming the source of peer review guidelines. In any case it’s not so much guidelines that are needed as mechanisms for actual peer review. Only rigorous editing and review of digital publications will generate the prestige that will motivate more good scholars to improve the quality of open resources. As Sander Goldberg put it recently in BMCR, it is up to us to insist on the combining of the “accuracy and clarity of [traditional print publication] with the flexibility and accessibility of the [web].” Goldberg also makes the point that many of the most fundamental and traditional activities of classical scholarship, such as the close analysis of syntax, and other tools for close reading, are actually better suited to the web than to print. In some ways the more specialized and technical the issues, the more data that can be put before the reader, the more desirable is a digital presentation.

The SCS as an archiver and provider of access to lesser-known journals not in JSTOR is an idea I find very appealing, and hopefully one that the publishers of such journals would also embrace.

Dickinson College Commentaries

digital commentaries on classical texts
