The Plebs, Roman Snobbery, and Mary Beard

I love it that Mary Beard is using her Don’s Life pulpit to attack the use of the word plebs as if it were plural, and to combat the incorrect use of “pleb,” which is not  . . a . . . word. I thought I would add some observations on the word plebs as a complement to her excellent post. My only quibble with her is that the word is in fact seldom used as a term of abuse in surviving Latin (as are turba, multitudo, etc.). Its dignity is what is so interesting about it, given the extensive Latin lexicon of snobbery.

First, a few quotations:

(Lucius Ampelius, referring to events of 494/3, 449, 445 and 376-367 BC, Liber Memorialis 25:)

There were four secessions of the plebs from the fathers (i.e. the senate): the first secession because of the abuses of the moneylenders, when the plebs withdrew under arms to the Sacred Hill; the second because of the abuses of the Board of Ten when, after the murder of his daughter, Virginius surrounded Appius and his whole faction on the Aventine Hill and brought it about that Appius abdicated his magistracy and that those accused and condemned were punished by various penalties; the third was because of marriage, that plebeians be allowed to marry patricians, which Canuleius incited on the Janiculum Hill; the fourth secession, which Sulpicius Stolo incited, was in the forum because of magistracies, so that plebeians could become consuls.

(Julius Caesar, The Gallic War 6.13:)

(In Gaul) the plebs is held in a condition of near-slavery; they dare nothing on their own initiative, and are included in no decision-making. Most of them, oppressed as they are either by debt or heavy taxation or by the injustices inflicted by the powerful, consign themselves to servitude, and the nobles exercise over them all the rights of masters over slaves.

(Horace, expressing lack of interest in a political career, Letters 1.19.37-8:)

I don’t go hunting for votes cast by the fickle (ventosa) plebs by paying for their dinners and giving them used clothes.

Open class struggle was endemic to the early Roman Republic. The plebs, seeing itself shut out of priesthoods and magistracies by the patricians, and overwhelmed with debts held by wealthy landowners, responded by politicizing itself and forming its own organization. It was a phenomenon unparalleled in ancient history. Through strikes, demonstrations, and their trademark gesture of departing in a body to a hill and refusing to fight in the army (secessio), the plebs over the course of two hundred years of constant conflict with the senate achieved an end to debt-slavery, won official recognition for its representatives (the tribuni plebis, tribunes of the plebs), its own assembly (the comitia tributa, whose laws were made binding on everyone) and gained access to all the most coveted magistracies, even the consulship. The contemporaries of Machiavelli found all this class-based hostility and dissension deplorable and out of place in a well-ordered state. But Machiavelli himself, in his great commentary on the early books of Livy, disagreed. The lesson he drew from the Roman “struggle of the orders” was that in every Republic there are two opposed factions, that of the people and that of the rich, “and that all the laws made in favor of liberty result from their discord.” (Discourses 1.4) The place of the plebs in the Roman constitution was not as dominant as that of the Athenian demos in their democracy, but it was significant. Caesar draws an implicit contrast with the Roman way when he describes the supine condition of the plebs in Gaul.

As a result of this early history of political struggle and success, the word plebs never had the inbuilt sneer of other words for the non-rich, like turba (“mob”), multitudo (“rabble”), or vulgus (“the common herd”). Livy, who tells the story of the early struggles, speaks of the plebs with considerable respect. And even through the much more violent clashes of the late Republic, rhetorical invective against the plebs itself (as opposed to their self-appointed elite representatives, the populares) is rare. The main criticism we hear is that the plebs is fickle, mobilis, or in the unusual phrase of Horace, ventosa, “windy,” that is, turned by every breeze. Orators and candidates had to cater to the plebs to get elected, and this naturally rankled the aristocrats. An orator is supposed to have said to a military man, when the two were competing for the office of consul, that the latter’s chances were slim, “especially because—a thing which above all offends the minds of the plebs—you do not know how to beg.” (Calpurnius Flaccus, Declamations 47)

Under the principate things changed substantially. The Roman plebs lost its right to elect magistrates, and started receiving occasional distributions of grain. The emperors took a decidedly paternalistic attitude. The story goes that when an inventor offered Vespasian a device that would allow him to raise large columns with much less expense and manpower than the usual labor-intensive methods, he gave the man a reward for the device, but decided not to use it, allegedly saying, “let me feed my little plebs” (Suetonius, Vespasian 18). It is at this point that we start to hear denunciations of the plebs as a lazy urban rabble, addicted to free grain and chariot races (bread and circuses), the amenities provided by, or some would say extorted from, the government. In the later imperial historians the meaning of plebs becomes indistinguishable from that of turba or vulgus. To believe them, the disciplined political force of the early Republic has become a gawking mob. At the same time, Roman law was delimiting an ever-stricter barrier between elite and commons, so that the plebs was subject to certain “plebeian” punishments (flogging, torture, consignment to the mines) from which the upper classes were legally exempt. A late Roman compendium of law, the Codex Theodosianus, uses the word plebs to refer to the serfs irrevocably assigned to North African estates in the fourth century. This kind of wretched plebs was a long, long way from the fighting plebs of the early Roman Republic, eight hundred years earlier.

Still, the essential dignity of the word made it appropriate in the first Latin translation of the Hebrew Bible for amo, the “people” of God, i.e. the Jews, and (from the fourth century on) an apt word for the Christian faithful (plebs Domini), and finally for a Christian “congregation,” the “laity,” as opposed to the clergy (clerus).

From Christopher Francese, Ancient Rome in So Many Words (New York: Hippocrene, 2007).

The Future of Ancient Greek

“The print textbook will be gone in ten years. What’s the Greek classroom going to look like?”  This is the question that Tom Sienkewicz put to Greek scholar and pedagogical innovator Wilfred Major of Louisiana State University. Major’s response, first given at a 2012 CAMWS panel he co-organized, has just been published in the latest issue of Classical Outlook (“Teaching and Testing Classical Greek in a Digital World,” CO 89.2 [2012], pp. 36-39). It’s an important article that should be read by anyone interested in the teaching of ancient Greek, and since it’s (ironically) not on line, I take the liberty of quoting in extenso.

“A future where digital platforms are the standard mechanism for teaching ancient Greek is nearly in sight,” he says. Crucial advances are being made. Advanced Greek readers are already very well-served on line by Perseus and the TLG. Intermediate Greek is also increasingly well-served by digital resources.

Computerized analysis of the lemmas and morphology of Greek texts has made it possible to prioritize the assistance new readers need at their fingertips, as they make the transition from beginners to intermediate and then to independent readers. Support for this transition includes providing vocabulary (entries appropriate to their level) and morphological data (in the form of parsing information).

Major points to developing projects like the DCC, Geoffrey Steadman’s downloadable Greek readers, and the ongoing series by Evan Hayes and Steve Nimis, which

make texts, facing vocabulary, and other support information accessible at a glance to intermediate students, saving the time and drudgery of flipping through pages and allowing both students and teachers to stay focused on the comprehension and benefits of what they are reading.

The stabilization of the core intermediate vocabulary in the DCC, he argues, means that advanced students can also get involved by generating running vocabulary in a clear, straightforward manner, and have the satisfaction of producing lasting pedagogical materials for other students.

The bottleneck, he argues, is in Introductory Greek, where high-quality but in some ways antiquated print resources have not yet been fully matched by digital counterparts.

with no disrespect to the authors and publishers of these volumes, in terms of presentation, information, layout and design, standard word processing programs can produce virtually everything found in these books. With the addition of images and slide programs (such as Power Point), a teacher can do more, and better, than anything in these books.

Such materials, he insists, must take full advantage of computerized analysis of Greek texts to help make students effective intermediate and advanced readers of digital Greek. This means taking into account vocabulary frequency and density of texts, and also statistical data about the frequency of morphology and syntactical structures (here Major cites Anne Mahoney, “The Forms You Really Need to Know,” Classical Outlook 81 (2004): 101–05, also ironically not on line!).

Beginning Greek must be reconceived as it moves to digital platforms. Merely transferring current print presentations to digital display monitors will strangle the learning of Greek, a shameful prospect when such treasures now loom just beyond the beginning stages.

Another interesting point in the article has to do with the typing of Greek. Students must be helped to become proficient in typing Greek as soon as possible, and must not be required to buy a new piece of software to do so. He urges keyboard designers to work with standard Modern Greek keyboards as a basis.

Both Windows and Apple devices now have polytonic Greek keyboards and inputs built in at the system level, which need only be activated. Both incorporate the Modern Greek keyboard. While the Apple system has more flexible input options, it includes all the same input options as the default Windows system. As things stand, therefore, we should promote this system for its widespread accessibility and compatibility. Expecting or requiring students to purchase and install additional software will inevitably lead to problems as they move from computers to phones, tablets, and so on.

Most important, Major stresses that digital platforms are ideal for encouraging steady practice, repetition, and feedback with the core material of Greek, in a way that best addresses the frustration and attrition that plague beginning classes.

The vocabulary and parsing tools already established for advanced and intermediate digital materials also provide a goal and clear purpose of method for introducing vocabulary and morphological identity from the earliest stages of beginning Greek. Doing so means we can dispense with relying on the dozens of pages of charts and paradigms that we, explicitly or implicitly, expect students to memorize as a precondition of just beginning to read the simplest continuous Greek passage.

If you are not familiar with Major’s work on this kind of pedagogy, I urge you to check out his articles “On Not Teaching Greek,” Classical Journal 103 (2007): 93–98, and “Teaching Greek Verbs: A Manifesto,” Teaching Classical Languages 3 (2011): 23–42 (the latter co-authored with B. Stayskal), and the superb resources available on his frequently updated Greek resources page http://www.dramata.com/. My own thoughts about using the DCC and its core vocabulary in a sight reading-based approach can be found in an earlier post.

The APA and Digital Populism

It’s election season at the American Philological Association. I opened up my ballot for next year’s officers and discovered that digital classics and digital outreach have vaulted to the forefront of debate in ways that would have been hard to predict only a few years ago. The APA is known as one of the more conservative of academic professional associations when it comes to matters digital, and its own web presence has until recently been very minimal. But now not only are they on Facebook and Twitter, but the new Digital Classics Association has been approved as a Type II Affiliated group, and there are plans for a new multi-million dollar portal of classics digital outreach. This is new and big and exciting, so I wanted to offer a little outsider’s guide to the candidates’ personal statements which, combined with some recent posts on the APA website, help to predict the near future of things digital at the APA.

A dash of context: Since 1989 the APA has been a major force behind the digitization of the nonpareil bibliographic research tool L’Année Philologique. This is primarily for advanced scholars. But the APA has been much slower to engage with social media and the popular ferment of classics on the web. Previous efforts at outreach to secondary teachers and the popular audience (long a strong point of the AIA, for example), have generally been viewed as well-meant but not very successful. The APA has traditionally been a group that focuses on advanced research, and very successfully so. Big new initiatives are now aimed at speaking to the popular audience. A successful capital campaign is well underway to raise $4 million, part of that to be devoted to a new web portal called “The American Center for Classics Research and Teaching.” Ward Briggs’ capital campaign video predicts that the new portal will be “the authoritative site on the web for classical study,” and “will offer the highest quality information about classical civilization to the widest possible audience in the format best suited to each segment of that audience.” It will open the gates of classical learning, so that “a privileged background and an elite education will no longer be requirements . . . in this digital age.”

Briggs seems to see the site as encyclopedic, a kind of more open version of Perseus, with links to a broader variety of approved sites, organized by topic. This would obviously be a massive undertaking. Current President Jeffrey Henderson, putting a different spin on the project, connects an APA web presence with public advocacy and improvements in pedagogy. The APA should “build information paths that connect professionals in the field and the lay public to data and information about the state and value of Classics, to 21st century resources for research, and about materials for pedagogical development.” The website, he hopes, will “make full use of social media on all media platforms, so that users can find information, follow developments in the field, enjoy presentations and other learning opportunities, and connect with colleagues.”

This year’s candidates for President and VP for Publications and Research do not exactly endorse all of these extremely ambitious goals. All support some kind of portal or gateway to classics as a way of bringing to bear the scholarly expertise of the APA membership in the digital realm. But how is this actually supposed to work?

Presidential candidate Kathryn Gutzwiller embraces neither Briggs’ expansive Perseus-like vision of an omnibus reference site, nor Henderson’s focus on professional networking and pedagogy. She sees it as more about announcing discoveries and aggregating (free? paid?) research tools: “The APA website should be a place where important discoveries in classics are announced and through which there is easy access to information about publications, to electronic resources, and to research tools. A well-constructed and accessible portal should appeal to classicists and non-classicists alike.”

Gutzwiller’s fellow-candidate for the presidential spot, James Tatum, sees the web site more in terms of teacher development: “It will enable us to support teaching at every level, even more than we do now.” He highlights the need for more dialogue between secondary teachers and college faculty, and thinks the site could help “Increas[e] collaboration between college and university teachers and teachers in secondary schools.” The site will “make it clear that the road between university and secondary education can run in both directions.”

David Blank, a candidate for VP for Publications and Research, helpfully acknowledges the difficulty of these tasks. It will not necessarily be easy to “mak[e] ourselves heard by scholars, students, and the general public amidst the profusion of digital divulgation, diversion, and distraction.” He floats the idea of adding scholarly video content to the APA website, although his specific example–reports from the TLL fellows on their experiences in Munich–would seem to have limited popular appeal.

Blank’s fellow-candidate for Publications and Research VP, Michael Gagarin, wants to make knowledge of the classical world as widely available as possible, but acknowledges that the digital is something of a problem for current modes of production. He identifies “dealing with digitization” as “the biggest long-term challenge for both research and publication.” I’m not sure if he means dealing with the problem of properly evaluating digital work (something the MLA has been grappling with), or if the challenge has to do with the economics of scholarly publishing, or intellectual property, or what. In any case, he is loath to have the APA take on dissemination roles traditionally assumed by print publishers: “my preference is that the APA should play largely a support role, working with publishers and libraries to promote digital publication and with universities, foundations, individuals, and others to produce digitized resources and make them as widely available as possible.” The distinction between digital “publication” (presumably peer reviewed in some traditional way, and paid for), versus digital “resources” (presumably free but evaluated only after the fact by the APA) is implicit in Gutzwiller’s remarks as well. In both cases the APA should identify “the most reliable and useful” electronic resources and “provide access to these materials for all our members,” presumably through the new portal.

These are all exciting ideas, but implementation would probably take ten or twenty times the money being raised by the capital campaign. Moreover, none of the candidates or current officers mentions a model in another field for the kind of site they have in mind, or mentions any current classics sites that they could build on.

Perhaps a good model would be physics.org. Rather than trying to be the final authority on all matters physical, it merely attempts to keep tabs on what is happening in physics on the web from day to day. It is jaunty and fresh and delightful to explore. This kind of crucial curatorial work is now being done in classics by lone heroes such as David Meadows and Charles Jones. It is highly useful, very popular, and not unduly resource-intensive. It seems to me that the APA could usefully give such work a more well-appointed home. The current APA Facebook page is mostly press releases. Going out and aggregating classical news and projects from around the world would be a great service to the profession.

As for professional networking and pedagogy, a good professional website model is probably the American Mathematical Association or the American Physical Society. But I’m not sure we need a new APA-sponsored teacher community. The best thing would be for APA members to become more active on existing listservs and social media platforms for teachers, and of course to get out and visit classrooms and meet teachers in their own areas. The College Board’s AP Latin teacher community is a very impressive example of collaboration between college, university, and secondary teachers, backed by some (not-for-profit!) corporate web development heft. And there is now Romae.org, a kind of Facebook for classical languages and studies. This is still small, but it is populated by college faculty, secondary and primary teachers, and some students as well.

So it’s great to see this kind of energy coming from the APA. Thanks to all the candidates for volunteering to run and help pilot our profession through the tricky waters that lie ahead.

Summer Accomplishments, part 2

Jimmy Martin (’13) passes on this summary of his work over the summer:

He focused on creating the Greek Core list organized by parts of speech and by TLG frequency, and the Latin list as organized by parts of speech.
He read through the Amores, helping create the vocabulary lists as he went. He read through Cicero’s Pro Caelio, creating vocabulary lists for his assigned sections. He read through most of Book 5 of the Gallic Wars, adding and subtracting vocabulary according to the updated Core Latin Vocabulary List.

Thanks, Jimmy, for all your important contributions to the project!

–Chris Francese

Summer 2012 accomplishments

We had a productive summer at DCC, thanks largely to the four wonderful Dickinson students who worked for eight weeks on the project. Here’s a roundup from Alice Ettling (’12) about her accomplishments.

Along with the rest of the group she started out the summer editing the various Latin and Greek vocabulary lists. The Core Greek and Latin lists are now sliced and diced various ways: alphabetically, by parts of speech, by tiered frequency, and by semantic groups.  Alice worked particularly on the Latin semantic grouping list and on organizing the Latin and Greek morphology lists.

With the rest of the crew she read Ovid, Amores 1, to prepare for the compilation of its vocabulary lists. She put together PDFs of all the core vocabulary lists, and kept these updated during the last minute editing process.

As the Amores 1 commentary developed, she checked the Allen and Greenough references already in Prof. Turpin’s commentary and added a few more. She read through a small portion of the Pro Caelio to create vocabulary lists for Prof. Reedy’s Latin 111 course in the fall. She put the Amores introduction online, and finally returned to the Greek list. Once that was done, she refined most of the vocabulary lists in Book 6 of Caesar and implemented some edits I made to the Amores commentary.
Alice is also the one who put together the map animations for Caesar, which have gotten so much positive attention. Thanks, Alice, for all your great contributions to the project, and good luck in all your future endeavors!

–Chris Francese

A Sight Reading Approach to Using the DCC

One of the key features of the DCC site is that each text comes equipped with hand-made running vocabulary lists, containing the main definitions for each word, but also the particular one relevant to the context. Very common words are excluded. These take a lot of effort to prepare, of course, so I thought it would be good to explain why we do this.

The point is not just to make it easier for readers to find the correct lemma behind a given form (something automated tools are still very bad at). It also allows for a way of teaching that focuses students’ out-of-class efforts on vocabulary acquisition and comprehension, rather than the (much harder) task of translation. A vocabulary-focused sight reading approach can help fight the bane of Latin and Greek pedagogy: students writing down the “correct” translation in class, and giving it back on tests, which improves their ability to memorize English, but doesn’t do much for their Latin or Greek.
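For readers who like to see the mechanics, here is a minimal Python sketch of how a running list of this kind could be assembled once a passage has been lemmatized by hand. It is only an illustration, not the procedure actually used for the DCC: the function name, the tiny "core" set, and the glosses are all invented for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple


def running_vocabulary(
    lemmas: Iterable[str],
    glosses: Dict[str, str],
    core: Set[str],
) -> List[Tuple[str, str]]:
    """Return (lemma, gloss) pairs in order of first appearance.

    Lemmas found in `core` (the very common words) are left out,
    and each remaining lemma is listed only once.
    """
    seen: Set[str] = set()
    entries: List[Tuple[str, str]] = []
    for lemma in lemmas:
        if lemma in core or lemma in seen:
            continue
        seen.add(lemma)
        # A missing gloss is flagged rather than guessed at.
        entries.append((lemma, glosses.get(lemma, "[gloss needed]")))
    return entries


# Hypothetical sample data: hand-assigned lemmas for
# "Gallia est omnis divisa in partes tres".
sample_lemmas = ["Gallia", "sum", "omnis", "divido", "in", "pars", "tres"]
# Illustrative stand-in for a core list, NOT the real DCC core vocabulary.
sample_core = {"sum", "omnis", "in", "pars", "tres"}
sample_glosses = {"Gallia": "Gaul", "divido": "divide, separate"}

for lemma, gloss in running_vocabulary(sample_lemmas, sample_glosses, sample_core):
    print(f"{lemma}: {gloss}")
```

The design choice worth noticing is that the lemmas are supplied by a human reader; the code only handles the bookkeeping of order, duplicates, and core-word exclusion.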

In essence this is what is now fashionably called a flipped classroom approach, where easier rote tasks are put outside class time, and the hardest tasks are done inside class, collaboratively. In my view the positive psychological effects of this are well worth the effort. Many classical teachers have used this kind of approach over the years. My own particular inspiration is Edwin Post, a professor at DePauw around the turn of the 20th c., and author of the wonderful Latin at Sight (1895). I know many teachers out there are doing similar things, and would love to hear suggestions and refinements, especially things that DCC could do to better enable this kind of pedagogy.

The routine as I have worked it out in my own classes (one which of course admits of many variations) is as follows:

Students’ class preparation consists of a mix of
• vocabulary memorization for passages to be read at sight in class, and
• comprehension/grammar worksheets on other passages (ones not dealt with normally in class).
Class itself consists mainly of
• sight translation, and
• review and discussion of previously sight-read passages
• grammar review as needed
Testing consists of
• sight passages with comprehension and grammar questions (like the worksheets), and
• vocabulary quizzes.

Textual analysis is done orally in class, through more interpretive worksheets on previously read passages, and in paper assignments.

The rationale behind doing things this way is that:
• students become good at reading Latin or Greek ex tempore. They lose their fear of it. They start to recognize word groupings and syntactical relationships, rather than isolated vocabulary items.
• students learn to guess at unknown words based on context rather than becoming stuck on the first unfamiliar word, or relying too much on the dictionary
• students have no incentive to memorize English translations; the incentive is to master high frequency vocabulary that is likely to be seen again in a new context. These items are learned contextually.
• students get used to identifying grammatical features that actually occur in the text, rather than isolated grammar lessons that don’t always have a clear relationship to reading. Grammar becomes less a burdensome extra than a tool for extracting sense from a text.
• total quantity of text covered may be somewhat less in class, but worksheets allow at least as much reading total as in the traditional method, probably more

To implement this it is important to
• Have vocabulary lists made up ahead of time. If working toward a high frequency master list, separate the lists into high and non-high frequency portions. Otherwise, just have reasonably comprehensive lists made up. Put it all on a web site for them to study before class. Quiz these occasionally first thing in class. No need to do this every day. They have an incentive to learn vocab. so as not to look too clueless in class. Midterm and final involve comprehensive vocabulary review of words already seen.
• Have worksheets made up ahead of time. Comprehension questions can be written in Latin or Greek, and call for responses in Latin or Greek. This is very difficult at first, but helpful in the long run. Comprehension questions in English are somewhat easier, but make it possible at times for students to merely skim the text looking for key words. But one needs to be resigned to the fact that they will not glean every single nuance of these passages. This is ok. More exposure is better. For the grammar questions, have them spot several instances of a particular construction; or manipulate things, e.g., find several verbs in the imperfect and put in all six tenses and translate (this is a mini synopsis). Focus on pronouns, relative pronouns, reflexives, participles, transitive vs. intransitive verbs, finding word groupings like transitive verbs and their direct object. This kind of grammatical analysis powerfully reinforces sight reading skill.
• When sight reading in class it is essential to do “pre-reading.” Give a little talk about what the passage is about, point out proper names, unusual vocabulary, tricky constructions ahead of time. That way they go in knowing what it is basically about, and will not be fazed by knotty bits.
• Make a point of reviewing everything. This gives lots of confidence, reading fluency, vocabulary reinforcement.
• Progress to more sophisticated worksheets that include interpretive tasks, like picking out the most significant or emphatic words, judging the tone, finding literary and rhetorical techniques, inferring what the author wants you to think about what is being said.
• Throughout it is important to communicate with the students what you are doing and why. The notions of high frequency vocabulary, guessing, getting the gist and not worrying so much about the details, these are things the students can get behind. With this good will you can do a lot of more detailed grammatical discussion and textual analysis.
• Grading should be low stakes on the worksheets, at least initially

The feedback from my students on this has been good. Certainly the relationship to grammar is transformed. They suddenly become rather curious about grammatical structures that will help them figure out what is going on. With the worksheets the assumption is that the text makes some kind of sense, rather than what used to be the default assumption, that it’s Latin (or Greek), so it’s not really supposed to make that much sense anyway, right?

–Chris Francese

Latin, visualized

I’m dreaming of an infographic. It shows the top 1000 most common Latin words, broken into groups by semantic category. In each group, you can see at a glance the relative frequency of the words–which ones are the most common. Maybe the extremely common ones are bigger, maybe they rise higher into the third dimension. This information has been culled from the data assembled painstakingly by hand by the Belgian LASLA group, and in Diederich’s Frequency of Latin Words and Their Endings (1939).

Better than that, one can also see at a glance which words are predominantly poetic, which are predominantly prosaic, and which are neither particularly prosaic nor poetic. Maybe that’s a color thing–redder for more poetic, bluer for more prosaic. Or maybe the poetic words are higher, up among the clouds, while the prosaic words tramp on the ground. This too has been determined based on the excellent data of LASLA and Diederich, which enumerate occurrences by poetry and prose.
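To make the idea a little more concrete, here is a rough Python sketch of the arithmetic behind such a graphic: a "poetic share" for each word, a red-to-blue color derived from it, and a grouping by semantic category. Everything in it is an assumption made for illustration. The class and function names are mine, the counts are invented rather than taken from LASLA or Diederich, and normalizing by the size of the poetry and prose samples is just one reasonable way to keep a larger prose corpus from swamping the comparison.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class WordCount:
    lemma: str
    category: str  # semantic group, e.g. "The body"
    poetry: int    # occurrences in the poetry sample
    prose: int     # occurrences in the prose sample


def poetic_share(word: WordCount, poetry_total: int, prose_total: int) -> float:
    """Share of a word's normalized frequency that comes from poetry (0.0 to 1.0)."""
    poetry_rate = word.poetry / poetry_total
    prose_rate = word.prose / prose_total
    if poetry_rate + prose_rate == 0:
        return 0.5  # no occurrences at all: treat as neutral
    return poetry_rate / (poetry_rate + prose_rate)


def share_to_colour(share: float) -> str:
    """Map a poetic share to a hex colour: pure blue at 0.0, pure red at 1.0."""
    red = round(255 * share)
    blue = 255 - red
    return f"#{red:02x}00{blue:02x}"


def group_by_category(words: Iterable[WordCount]) -> Dict[str, List[WordCount]]:
    """Group words by semantic category, most frequent first within each group."""
    groups: Dict[str, List[WordCount]] = defaultdict(list)
    for word in words:
        groups[word.category].append(word)
    for members in groups.values():
        members.sort(key=lambda w: w.poetry + w.prose, reverse=True)
    return dict(groups)


# Invented counts, for illustration only.
sample = [
    WordCount("corpus", "The body", poetry=40, prose=60),
    WordCount("ensis", "Violence", poetry=30, prose=0),
    WordCount("gladius", "Violence", poetry=5, prose=45),
]
POETRY_TOTAL = 10_000  # hypothetical size of the poetry sample, in words
PROSE_TOTAL = 10_000   # hypothetical size of the prose sample, in words

for category, members in group_by_category(sample).items():
    for word in members:
        share = poetic_share(word, POETRY_TOTAL, PROSE_TOTAL)
        print(category, word.lemma, share_to_colour(share))
```

The output of a script like this would still need a designer's hand; it only supplies the numbers a landscape, library, or temple could be built on.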

And since the words are grouped by semantic categories, you can see what the main topics of preserved Latin literature are, its main preoccupations. The body. The house. Violence. Writing. Knowledge. Speech. The elements.

Here’s the question: what should be the visual theme? Should it be a landscape? A library? A Roman temple? Those of you who are visually inclined, help me out. You can probably tell that I have been reading Edward Tufte (Envisioning Information). What if we could make the awesome frequency data we have come alive in graphic form? How cool a pedagogical tool would that be?

–Chris Francese

Perseus and Classics DH

I am extremely grateful to the folks at Perseus, and to all the others who took the time to reply to my earlier, rather dyspeptic, posts about the Perseus Word Study Tool. A valuable blog post by Bridget Almas, the Senior Programmer at Perseus, is up on the front page of Perseus right now. And the conversation on Google+ was animated as well.

Bridget Almas’ central point is that resources are limited, and we need to prioritize. Is morphology per se really worth the investment, in comparison with other more urgent needs? And would a truly open, distributed editing environment for the annotation of classical texts (like that already in place for papyri on Papyri.info, and planned by Perseus) attract enough people interested in that particular issue? Probably not, she suggests.

Helma Dik, heavily involved in improving and re-purposing the Perseus data at the University of Chicago, points out that with any tool, including the LWST, we need to teach students how to use it properly. She insists that the large-scale results produced by the Perseus parser are valuable, and that its accuracy can be substantially improved incrementally by means of hand parsing. She has made good progress on the Greek side with Perseus at Chicago. The Latin parser lags because no one has yet taken the time to improve it, and she suggests that that–rather than grousing about the current inadequacy of the tool–should be the focus of our efforts.

Laura Gibbs of the University of Oklahoma, a pioneer and dynamo in digital pedagogy in classics, makes the (to me) central point that what students most need is not full parsing of every word, but to know from what word a given form comes. From a pedagogical standpoint, the correct dictionary head word is the only crucial information. The process of intelligent glossing and annotation is greatly aided by having a core vocabulary, a list of very common items that will not be glossed at all. She argues that this process of pedagogically-driven glossing must be human-created, not machine-generated. She would like to see a collaborative digital environment for reading Latin and Greek together asynchronously, hopefully one that does not focus exclusively on translation as the goal, but on comprehension and reading.

Justin Schwamm, the driver behind Tres Columnae and another pioneer and expert in digital classical pedagogy, helpfully focuses the discussion on the pedagogical goal, which for him is getting people to read Latin, not just translate it. If the tools don’t contribute to reading fluency, we shouldn’t use them. He also points out that from a user’s perspective, the LWST provides too much information. Students’ eyes tend to glaze over when presented with a solid mass of new information, whether in a print textbook or on a web page.

I think that Bridget Almas has put her finger on the central problem we face right now. What should be the priorities, and (a closely related question) how are we to marshal the labor, and raise the level of interest among classicists in improving current digital tools and creating new ones? This profession is full of amazing, learned, selfless, phenomenally hard-working people. Why are so few of them putting energy into digital collaboration, teaching and dissemination, as opposed to traditional print monographs and articles? If we could get 20% of those PhD’s toiling away on university press monographs to work on digital editions, where would we be?

The value of DH for humanists lies in its collaborative nature and the transformation of scholarly communication it enables; in the innovative and effective pedagogy it facilitates; and in the vast increase in access to information and learning it makes possible. Why are more classicists not excited by this? Things are of course changing slowly, and we’re all working on this in our own ways, but to accelerate the acceptance of digital classics in the profession and bring in more labor to fix things like the LWST, a few things are especially important, I think. None of this is news, just trying to articulate it as clearly as I can:

First, it’s important to keep exploring modes of peer review. Classicists are very sensitive to what Dan Cohen has called the social contract in scholarly communication represented by presses, proof-reading, peer review, and also design aesthetics. Digital publishing has a severe, nay, crippling deficit in prestige. Second, and of course related, we have to keep focusing on the quality of the content. Classicists have a very low tolerance for error, and thus distrust the internet more than most. But quality is not enough, as the lackluster start of the Princeton/Stanford working papers shows. Next, and this is more relevant to the issue at hand, we should make tools like the LWST respond to current pedagogy and reading practices. The tools should be aimed laser-like at the real needs of users, and respond to their culture of reading. Pedagogy, rather than computational linguistics, must be central to the iterative design process. Finally, something that is not really in the debate as far as I know, we should focus on the scholarly voice. Classicists’ reading culture places a high value on the expert, and prizes the trained scholar above all. Most digital tools currently are either highly impersonal (as with the LWST) and thus would be viewed with suspicion even if they were more accurate; or they try to rely on crowd-sourcing, which goes rather against the grain of the classical mind. Reverence for expert opinion both inhibits ordinary readers from contributing to crowd-sourced annotation (compare the rather slow start of The Open Utopia), and prevents most readers from taking it seriously. The good news is that the digital environment allows closer contact with scholars through blogging and especially through audio recordings (check out the classical material on New Books Network).

Ok, I have strayed rather far afield from the Latin Word Study Tool. My grand scheme was to create a distributed editing environment for creating vocabulary lists, like the environment the papyrologists have at Papyri.info, but the minds at Perseus were way, way ahead of me, and have something like that in the works already. In a future post I will think about what kinds of features I might want to see in such a thing.

Do the Flaws in the Perseus Word Study Tool Matter?

In a recent post I tried to categorize the problems of the Perseus Word Study Tool, as tested on a section of Vergil. More surprising to me than the overall rate of error (about one in three words was misidentified in some way) was the fact that many of the errors were not subject to correction by means of Perseus’ “voting” system; and that even when voting was in operation, it often did not correct the error. Sometimes the correct choice was not an available option; other times, unanimous correct votes were ignored, and unanimous incorrect votes were accepted. At Aen. 5.17, to add another example to those mentioned in the earlier post, the vocative magnanime was incorrectly called an adverb on the basis of six incorrect user votes.

The inadequacy of the LWST will not have been news to anyone who has used it. The question is, is the level of error pedagogically significant? Is the LWST good enough for the purposes of a typical Latin student? In other words, should the average Latinist care? It is not good enough, and the level of error and the specific types of errors in this flagship classical DH project are pedagogically significant and worthy of attention, I believe, for several reasons.

1. Words that give students the most trouble–relative pronouns, demonstratives, quam, ut, modo, Q-words in general–are exactly those least likely to be handled well by the LWST. The earlier post has some examples from my small sample, but I’ll add here that in Aen. 5.30 (magis . . . ) quam, when it comes to that quam, the LWST offered no fewer than seven possible quams to choose from (all numbered quam 1-7), none of which has the correct definition in the context (“than”).

2. The LWST is of course helpless when it comes to unusual or idiomatic expressions, of which there is a good example in my sample at 5.6, where notum must be translated “the knowledge that.”

3. The tool naturally can analyze only what is there. It cannot tell when something is left out or assumed.

4. A major structural problem is represented by bad short definitions of the type (to choose again from examples offered by my sample) iubet = “imposed,” iam = “are you going so soon,” frustra = “in deception, in error,” or more subtly, the fact that the common meaning of tendere, “direct one’s course,” does not appear in the short def. for that word. This is important because, even though one can click on and read the full Lewis & Short dictionary definition, intermediate students are very unlikely to click through and sift through long entries in search of the correct definition.

5. Moreover, the LWST obscures the relationships between words, which is key to learning to read Latin. This is why seemingly minor accidence mistakes are meaningful. Misled on a part of speech, or the gender of an adjective or the case of a noun, the student will likely not see the syntactical connection between words, and thus the tool reinforces the urge to produce the dreaded “word salad” translations.

6. More broadly, with its cryptic statistical data and jumbled pseudo-information, the LWST reinforces the impression that many students have: that Latin isn’t really supposed to make sense anyway, that it’s all some kind of fiendish crossword puzzle.

Gregory Crane, in an important article and apologia for Perseus, has said that the goal of the Perseus Project is to provide “machine-actionable knowledge.”

Reference materials, in particular, are structured to support automatic systems (e.g., the morphological analyzer learns Greek and Latin morphology from a machine actionable grammar) and to be decomposed into small chunks and then recombined to provide dynamic commentaries. If you retrieve a book in a language that you cannot read or on a topic that you cannot understand, the system can find translations where these already exist, machine translation and translation support systems, reference works, and general background information suited to the general background and immediate purposes of the reader. In knowledge bases, the boundaries between books begin to dissolve.

But clearly machines are spectacularly bad at understanding Latin at the moment. Crane thinks in terms of many decades, and is waiting for massive improvements in artificial intelligence, or teams of graduate students to encode correct grammatical analysis in texts. But such a prospect seems increasingly far off, and given the size of the Perseus Digital Library (10.5 million words at the moment), it seems unlikely that the millions of errors can be corrected any time soon, if ever. Indeed, would it be worth the huge investment of time and money? In the meantime, we need to create a collaborative tool for generating reasonably correct and reliable vocabulary lists for Latin (and Greek) authors that will be helpful for students and teachers around the world. Why we should do this, and what kind of tool I have in mind, will be the subjects of future posts.

–Chris Francese

 

Types of Error in the Perseus Latin Word Study Tool

The Perseus Latin Word Study Tool (LWST) is intended to provide dictionary definitions and grammatical analysis of all words in the Latin texts available in the Perseus Digital Library, currently 10.5 million words.

A check of the definitions and grammatical analysis of an arbitrarily chosen chunk of Vergil’s Aeneid (5.1-34, 223 words) found the tool incorrect in 79 instances, or 35.4% of the time (and correct 64.6% of the time). The most common type of error (21 instances, 26.6% of all errors, 9.4% of all words) was a mistake of accidence, for example duri (5.5) was taken as genitive singular instead of nominative plural. In 17 cases (21.5% of errors, 7.6% of all words) words were assigned to the wrong lemma, as when quoque (“and whither”) was derived from quoque (“also, too”), or venti (“winds,” 5.20) was assigned to the verb venio, “come,” as if it were the perfect participle. This particular mistake occurred three times in this passage, and the correct lemma was not listed as a possible option. In 14 instances (17.7% of errors, 6.3% of all words) the dictionary definitions provided were wildly wrong. This was true of some very common words. iam was glossed as “are you going so soon,” nec as “and not yet,” ab as “all the way from.” Elissae (5.3) was glossed as “Hannibal.” In every case this type of error was seen to come from the pulling, seemingly at random, of a word or phrase from the dictionary of Lewis & Short on which the LWST is based. In 11 instances (13.9% of errors, 4.9% of all words), the relevant definition in the context at hand was not provided (though it could be found by clicking to and reading through the full Lewis & Short dictionary entry). For example, cerno was glossed as “separate, part, sift,” but not “perceive,” or infelicis (5.3) glossed as “unfruitful, not fertile barren,” rather than “unfortunate.” More seriously, all relative pronouns were glossed as interrogatives (“who? which? what? what kind of a?”), and described simply as “pron.” The word “relative” did not appear on the page.
In 8 instances (10% of errors, 3.6% of all words) a word was assigned to the incorrect part of speech, as when medium (5.1) was called a noun rather than an adjective, or locutus (5.14) assigned to the rare 4th decl. noun “a speaking” rather than to loquor. In 4 cases (5% of errors, 1.8% of all words), there was no definition available. And in all cases deponent verbs were incorrectly labeled passive (4 instances in this particular section, or 5% of errors, 1.8% of all words).
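
For anyone who wants to check or extend the tally above, here is a minimal Python sketch that recomputes the percentages from the raw counts. The counts are the ones reported in this post; the category labels and the function name `summarize` are just my own shorthand, not anything taken from Perseus.

```python
# Error counts reported above for Aeneid 5.1-34 (223 words).
# Category labels are informal shorthand, not Perseus terminology.
TOTAL_WORDS = 223
error_counts = {
    "accidence": 21,
    "wrong lemma": 17,
    "wildly wrong definition": 14,
    "relevant definition missing": 11,
    "wrong part of speech": 8,
    "no definition": 4,
    "deponent labeled passive": 4,
}


def summarize(counts, total_words):
    """Return the error total and, per category, its share of errors and of words."""
    total_errors = sum(counts.values())
    rows = []
    for label, n in counts.items():
        share_of_errors = round(100 * n / total_errors, 1)
        share_of_words = round(100 * n / total_words, 1)
        rows.append((label, n, share_of_errors, share_of_words))
    return total_errors, rows


total_errors, rows = summarize(error_counts, TOTAL_WORDS)
overall_rate = round(100 * total_errors / TOTAL_WORDS, 1)
print(f"{total_errors} errors in {TOTAL_WORDS} words ({overall_rate}%)")
for label, n, share_of_errors, share_of_words in rows:
    print(f"{label:30s}{n:4d}{share_of_errors:7.1f}%{share_of_words:7.1f}%")
```

The same function could be pointed at counts from a prose passage to compare the distribution of error types.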

Now, the makers of Perseus are perfectly aware of the flaws in LWST, and attempt to use the power of social media to help remedy the situation. Subjoined to the analysis of every ambiguous word, after an explanation of the methodology used, one finds a plea to help by voting.

The possible parses for this word have been evaluated by an experimental system that attempts to determine which parse is correct in this context. The system is composed of a number of “evaluators”–each of which uses different criteria to score the possibilities–whose votes are weighted to determine the best answer. The percentages in the table above show each evaluator’s score for each form, which are then combined to determine each form’s overall score.
This selection used the following evaluators:
• User-voting evaluator: Scores parses based on the number of votes each one has received from users. Weighted more heavily as more users vote for a given word in a text.
• Prior-form frequency evaluator: Evaluates forms based on the preceding word in the text; finds the most likely parse among this word’s possible morphological features and the preceding word’s possible features based on the frequency of each possible pair.
• Word-frequency evaluator: Scores parses based on how often the dictionary word appears in the Perseus corpus. Only used when a given form could be from more than one possible word.
• Tagger evaluator: Evaluator based on pre-computed automatic morphological tagging
• Form frequency evaluator: Scores parses based on how often their morphological features (first-person, indicative, plural, and so on) occur among all the words in the Perseus corpus.
User votes are weighted more heavily than the other methods, which are all treated equally.
Don’t agree with the results? Cast your vote for the correct form by clicking on the [vote] link to the right of the form above!
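
To make the scheme described in that notice concrete, here is a minimal Python sketch of a weighted combination of evaluator scores. It is only an illustration under stated assumptions: Perseus does not publish its weights or its exact combining formula in this notice, so the weighted average, the evaluator keys, the scores, and the weights below are all hypothetical.

```python
def combine_scores(scores_by_evaluator, weights):
    """Weighted average of each parse's score across evaluators (assumed formula)."""
    total_weight = sum(weights[name] for name in scores_by_evaluator)
    parses = {p for scores in scores_by_evaluator.values() for p in scores}
    return {
        parse: sum(
            weights[name] * scores.get(parse, 0.0)
            for name, scores in scores_by_evaluator.items()
        ) / total_weight
        for parse in parses
    }


def best_parse(combined):
    """Pick the parse with the highest combined score."""
    return max(combined, key=combined.get)


# Hypothetical scores for an ambiguous form with two candidate parses.
# Users vote unanimously for the nominative; the automatic evaluators
# all lean toward the accusative.
scores = {
    "user_voting": {"nominative": 1.0, "accusative": 0.0},
    "prior_form_frequency": {"nominative": 0.3, "accusative": 0.7},
    "tagger": {"nominative": 0.3, "accusative": 0.7},
    "form_frequency": {"nominative": 0.3, "accusative": 0.7},
}

# Hypothetical weights: users counted double versus users counted equally.
heavy_user_weights = {"user_voting": 2.0, "prior_form_frequency": 1.0,
                      "tagger": 1.0, "form_frequency": 1.0}
equal_weights = {name: 1.0 for name in scores}

print(best_parse(combine_scores(scores, heavy_user_weights)))
print(best_parse(combine_scores(scores, equal_weights)))
```

With these made-up numbers the outcome flips depending on how heavily the user votes are weighted, which shows why the size of that weight matters in any scheme of this kind.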

But here too, some problems arose in my sample. First of all, only a handful of doubtful words had any votes. Second, many of the error types identified above do not admit of voting. And third, those that did have votes did not always benefit from having them. Here is the entry on the word rates in ut pelagus tenuere rates (5.8), showing a preference for the (incorrect) accusative, despite nine user votes for the (correct) nominative.

 

On the word pater in Quidve, pater Neptune, paras? (5.14), ten incorrect user votes for the nominative win out over the (obviously correct) vocative.

More common, however, is the lack of any user votes at all, as in this very confusing jumble of information on the word hoc (5.18). Note that the correct lemmatization (> hic) has a nonsensical definition; that the morphological analysis states it can only be a pronoun (“pron.”) whereas here, as often, it is a demonstrative adjective; and finally that the LWST incorrectly concludes that the form derives from the lemma huc.

Another odd and thankfully rare genre of error occurs in the case of deinde (5.14), which is correctly analyzed, but put beside a fictional alternative, the present imperative of a verb *deindo.

I would like to know if the same level of error and types of errors occur when LWST is unleashed on a prose text. Perhaps there the idea of a “prior-form frequency evaluator” would make more sense.

It is not my intent to denigrate the huge achievements of Perseus in our field. It is certainly better to have the LWST than not to have it. My purpose here is just to investigate the nature and extent of its errors. If this sample is at all representative, something along the lines of 3.5 million errors exist in the current database. I would also like to ask, is it realistic to think that qualified people can be found to correct the mistakes of the LWST? What is the incentive for professional Latinists to do so?
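
The corpus-wide figure above is a back-of-the-envelope extrapolation, and a few lines of Python make the arithmetic explicit. The sketch below scales the sample error rate (79 of 223 words) to the 10.5 million words in Perseus, and adds a rough range using a normal approximation. That range is my own assumption and is optimistic at best: it treats the 223 words as a simple random sample, which one continuous passage of Vergil plainly is not.

```python
import math

CORPUS_WORDS = 10_500_000  # size of the Perseus Latin corpus cited in this post
SAMPLE_WORDS = 223         # Aeneid 5.1-34
SAMPLE_ERRORS = 79


def extrapolate(errors, sample_size, corpus_size, z=1.96):
    """Scale the sample error rate to the corpus, with a rough z-interval.

    Assumes (unrealistically) that the sample is a simple random sample.
    """
    rate = errors / sample_size
    std_error = math.sqrt(rate * (1 - rate) / sample_size)
    low = max(0.0, rate - z * std_error)
    high = min(1.0, rate + z * std_error)
    return rate * corpus_size, low * corpus_size, high * corpus_size


point, low, high = extrapolate(SAMPLE_ERRORS, SAMPLE_WORDS, CORPUS_WORDS)
print(f"point estimate: {point / 1e6:.1f} million errors")
print(f"rough range:    {low / 1e6:.1f} to {high / 1e6:.1f} million")
```

The point estimate lands a little above 3.5 million, and the width of the range is a reminder of how much depends on whether this one passage is representative.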

I also have a proposal for a different kind of tool, which I will save for another post, since this one is already too long. Your thoughts?

–Chris Francese