Greek Core Vocabulary: A Sight Reading Approach

(This is a slightly revised version of a talk given by Chris Francese on January 4, 2013 at the American Philological Association Meeting, at the panel “New Adventures in Greek Pedagogy,” organized by Willie Major.)

Not long ago, in the process of making some websites of reading texts with commentary on classical authors, I became interested in high-frequency vocabulary for ancient Greek. The idea was straightforward: define a core list of high-frequency words that would not be glossed in running vocabulary lists to accompany texts designed for fluid reading. I was fortunate to be given a set of frequency data from the TLG by Maria Pantelia, with the sample restricted to authors up to AD 200, in order to avoid distortions introduced by church fathers and Byzantine texts. So I thought I had it made. But I soon found myself in a quicksand, slowly drowning in a morass infested with hidden, nasty predators, until Willie Major threw me a rope, first via his published work on this subject, and then with his collaboration in creating what is now a finished core list of around 500 words, available free online. I want to thank Willie for his generosity, his collegiality, his dedication, and for including me on this panel. I also received very generous help, data infusions, and advice on our core list from Helma Dik at the University of Chicago, for which I am most grateful.

What our websites offer that is new, I believe, is the combination of a statistically-based yet lovingly hand-crafted core vocabulary, along with handmade glosses for non-core words. The idea is to facilitate smooth reading for non-specialist readers at any level, in the tradition of the Bryn Mawr Commentaries, but with media—sound recordings, images, etc. Bells and whistles aside, however, how do you get students to actually absorb and master the core list? Rachel Clark has published an interesting paper on this problem at the introductory level of ancient Greek that I commend to you. There is also of course a large literature on vocabulary acquisition in modern languages, which I am going to ignore completely. This paper is more in the way of an interim report from the field about what my colleague Meghan Reedy and I have been doing at Dickinson to integrate core vocabulary with a regime based on sight reading and comprehension, as opposed to the traditional prepared translation method. Consider this a provisional attempt to think through a pedagogy to go with the websites. I should also mention that we make no great claim to originality, and have taken inspiration from some late nineteenth century teachers who used sight reading, in particular Edwin Post.

In the course of some mandated assessment activities it became clear that the traditional prepared translation method was not yielding students who could pick their way through a new chunk of Greek with sufficient vocabulary help, which is our ultimate goal. With this learning goal in mind we tried to back-design a system that would yield the desired result, and have developed a new routine based around the twin ideas of core vocabulary and sight reading. Students are held responsible for the core list, and they read and are tested at sight, with the stipulation that non-core words will be glossed. I have no statistics to prove that our current regime is superior to the old way, but I do know it has changed substantially the dynamics of our intermediate classes, I believe for the better.

Students’ class preparation consists of a mix of vocabulary memorization for passages to be read at sight in class the next day, and comprehension/grammar worksheets on other passages (ones not normally dealt with in class). Class itself consists mainly of sight translation, and review and discussion of previously read passages, with grammar review as needed. Testing consists of sight passages with comprehension and grammar questions (like the worksheets), and vocabulary quizzes. Written assignments focus on textual analysis as well as literal and polished literary translation.

The concept (not always executed with 100% effectiveness, I hasten to add) is that for homework students focus on relatively straightforward tasks they can successfully complete (the vocabulary preparation and the worksheets). This preserves class time for the much more difficult and higher-order task of translation, where they need to be able to collaborate with each other, and where we’re there to help them—point out word groups and head off various types of frustration. It’s a version, in other words, of the flipped classroom approach, a model of instruction associated with math and science, where students watch recorded lectures for homework and complete their assignments, labs, and tests in class. More complex, higher-order tasks are completed in class, more routine, more passive ones, outside.

There are many possible variations of this idea, but the central selling point for me is that it changes the set of implicit bargains and imperatives that underlie ancient language instruction, at least as we were practicing it. Consider first vocabulary: in the old regime we said, essentially, “know for the short term every word in each text we read. I will ask you anything.” In the new regime we say, “know for the long term the most important words. The rest will be glossed.” When it comes to reading, we used to say or imply, “understand for the test every nuance of the texts we covered in class. I will ask you any detail.” In the new system we say, “learn the skills to read any new text you come across. I will ask for the main points only, and give you clues.” What about morphology? The stated message was, “You should know all your declensions and conjugations.” The unspoken corollary was “But if you can translate the prepared passage without all that you will still pass.” With the new method, the daily lived reality is, “If you don’t know what endings mean you will be completely in the dark as to how these words are related.” When it comes to grammar and syntax, the old routine assumed they should know all the major constructions as abstract principles, but with the tacit understanding that this is not really likely to be possible at the intermediate level. The new method says, “practice recognizing and identifying the most common grammatical patterns that actually occur in the readings. Unusual things will be glossed.” More broadly, the underlying incentive of our usual testing routines was always, “Learn an English translation of assigned texts and you’ll be in pretty good shape.” This has now changed to: “know core vocabulary and common grammar cold and you’ll be in pretty good shape.”

Now, every system has its pros and cons. The cons here might be a) that students don’t spend quite as much time reading the dictionary as before, so their vocabulary knowledge is not as broad or deep as it should be; b) that the level of attention to specific texts is not as high as in the traditional method; and c) that not as much material can be covered when class work is done at sight. The first of these (not enough dictionary time) is a real problem in my view that makes this method not really suitable at the upper levels. At the intermediate level the kind of close reading that we classicists value so much can be accomplished through repeated exposure in class to texts initially encountered at sight, and through written assignments and analytical papers. The problem of coverage is alleviated somewhat by the fact that students encounter as much or more in the original language than before, thanks to the comprehension worksheets, which cover a whole separate set of material.

On the pro side, the students seem to like it. Certainly their relationship to grammar is transformed. They suddenly become rather curious about grammatical structures that will help them figure out what the heck is going on. With the comprehension worksheets the assumption is that the text makes some kind of sense, rather than what used to be the default assumption, that it’s Greek, so it’s not really supposed to make that much sense anyway. While the students are still mastering the core vocabulary, one can divide the vocabulary of a passage into core and non-core items, holding the students responsible only for core items. Students obviously like this kind of triage, since it helps them focus their effort in a way they acknowledge and accept as rational. The key advantage to a statistically based core list in my view is really a rhetorical one. It helps generate buy-in. The problem is that we don’t read enough to really master the core contextually in the third semester. Coordinating the core with what happens to occur in the passages we happen to read is the chief difficulty of this method. I would argue, however, that even if you can’t teach them the whole core contextually, the effort to do so crucially changes the student’s attitude to vocabulary acquisition, from “how can I possibly ever learn this vast quantity of ridiculous words?” to “Ok, some of these are more important than others, and I have a realistic numerical goal to achieve.” The core is a possible dream, something that cannot always be said of the learning goals implicit in the traditional prepared translation method at the intermediate level.

The question of how technology can make all this work better is an interesting one. Prof. Major recently published an important article in CO that addresses this issue. In my view we need a vocabulary app that focuses on the DCC core, and I want to try to develop that. We need a video Greek grammar along the lines of Khan Academy that will allow students to absorb complex grammatical concepts by repeated viewings at home, with many, many examples, annotated with chalk and talk by a competent instructor. And we need more texts that are equipped with handmade vocabulary lists that exclude core items, both to facilitate reading and to preserve the incentive to master the core. And this is where our project hopes to make a contribution. Thank you very much, and I look forward to the discussion period.

–Chris Francese


Greek Core Vocabulary Acquisition: A Sight Reading Approach

American Philological Association, Seattle, WA

Friday January 4, 2013

Panel: New Adventures in Greek Pedagogy

Christopher Francese, Professor of Classical Studies, Dickinson College


Dickinson College Commentaries:

Latin and Greek texts for reading, with explanatory notes, vocabulary, and graphic, video, and audio elements. Greek texts forthcoming: Callimachus, Aetia (ed. Susan Stephens); Lucian, True History (ed. Stephen Nimis and Evan Hayes).

DCC Core Ancient Greek Vocabulary

About 500 of the most common words in ancient Greek, the lemmas that generate approximately 65% of the word forms in a typical Greek text. Created in the summer of 2012 by Christopher Francese and collaborators, using two sets of data:  1. A subset of the comprehensive Thesaurus Linguae Graecae database, including all texts in the database up to AD 200, a total of 20.003 million words (of which the period AD 100–200 accounts for 10.235 million). 2. The corpus of Greek authors at Perseus Chicago, which at the time our list was developed was approximately 5 million words.
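One way to read the coverage figure above is as a share of running words: add up the occurrences of every lemma on the core list and divide by the total number of words in the corpus. The following is only a minimal sketch of that arithmetic in Python. The function names are hypothetical, and the lemma counts are invented placeholders chosen to sum to 100, not TLG or Perseus figures.

```python
from collections import Counter


def coverage(lemma_counts, core):
    """Share of running words whose lemma belongs to the core list."""
    total = sum(lemma_counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for lemma, n in lemma_counts.items() if lemma in core)
    return covered / total


def smallest_core(lemma_counts, target):
    """Take lemmas in descending frequency until their share reaches target."""
    total = sum(lemma_counts.values())
    core, running = [], 0
    for lemma, n in Counter(lemma_counts).most_common():
        if total and running / total >= target:
            break
        core.append(lemma)
        running += n
    return core


# Invented placeholder counts (NOT real corpus data); they sum to 100.
toy_counts = {
    "ὁ": 40, "καί": 20, "δέ": 15, "εἰμί": 10,
    "λέγω": 5, "ἄλγος": 4, "μῆνις": 3, "ἑλώριον": 3,
}

toy_core = smallest_core(toy_counts, 0.65)
print(toy_core, coverage(toy_counts, set(toy_core)))
```

With real frequency data the same two functions would show how quickly coverage rises as the list grows, which is the reasoning behind stopping at a few hundred lemmas rather than a few thousand.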

Rachel Clark, “The 80% Rule: Greek Vocabulary in Popular Textbooks,” Teaching Classical Languages 1.1 (2009), 67–108.

Wilfred E. Major, “Teaching and Testing Classical Greek in a Digital World,” Classical Outlook 89.2 (2012), 36–39.

Wilfred E. Major, “It’s Not the Size, It’s the Frequency: The Value of Using a Core Vocabulary in Beginning and Intermediate Greek,” CPL Online 4.1 (2008), 1–24.



Read Iliad 1.266-291, then answer the following in English, giving the exact Greek that is the basis of your answer:


  1. (lines 266-273)  Who did Nestor fight against, and why did he go?





  2. (lines 274-279) Why should Achilles defer to Agamemnon, in Nestor’s view?




  3. (lines 280-284) What is the meaning of, and the difference between, καρτερός and φέρτερος as Nestor explains it?




  4. (lines 285-291) What four things does Achilles want, according to Agamemnon?



Find five prepositional phrases, write them out and translate, noting the line number, and the case that each preposition takes.







Find five verbs in the imperative mood, write them out and translate, noting the line number and tense of each.






How principal are Greek principal parts?

I just finished adding the principal parts to the DCC ancient Greek core vocabulary list, something I meant to do last summer, but which got lost in the shuffle. So that’s done, and up. Phew. Anybody who has tried to learn ancient Greek knows what a big hurdle the principal parts are: absolutely essential, but a beastly task of brute memorization. I am here to say that, as one who focuses more on Latin than on Greek, I have to re-learn some of them on a regular basis if I want to read (or teach) Greek well. This is not the fun, life-affirming, profound, aesthetically enriching part of Greek. This is the boot camp, the weight-lifting one must do to get there.

The idea behind principal parts is to put in your hands, and hopefully in your brain, all the different stems of a verb, so that (theoretically) any inflected form can be derived from, or traced back to, one of them. But of course it’s not quite that simple.

On the one hand, some verb forms and related things are extremely common, but not really directly derivable from the principal parts as they are traditionally presented. εἰκός, for example, is a very common participial form meaning “likely, plausible” that is not immediately apparent from the principal parts of ἔοικα. It’s in the dictionary, of course, but somewhat buried in the entry on ἔοικα.

On the other hand, many Greek verbs have principal parts whose stems are only very rarely employed. πέφασμαι, for example, is a perfect tense principal part of a very common verb, φαίνω. But forms derived from it are rare. πέφαγκα, another perfect form listed by Smyth among the “principal” parts, is very rare indeed, with only seven attestations in the TLG, almost all of those from late antique grammarians and lexica. I guarantee you will never encounter it outside a grammar book.

Part of the problem here is that our apparatus for learning ancient Greek is largely derived from big, comprehensive, scientific grammars of the 19th century, and thus has a tendency toward completism rather than toward conveying what is most essential. This is a general problem that does not only affect the issue of principal parts.

Enter into this picture the database, specifically the TLG and its lemmatizer tool. This is the tool that attempts to determine from what dictionary head word (or lemma), a given form derives. I have complained elsewhere about the impotence of existing lemmatizers when it comes to determining the meaning of homographs–forms that are spelled the same but derive from different lemmas, or forms derived from a single lemma, but which could have more than one grammatical function. This is a serious and as yet unsolved problem when it comes to asking a computer to analyze a given chunk of Greek or Latin. And the homograph problem also substantially compromises frequency data based on machine-analyzed large corpora of Greek and Latin.

But one thing at which the lemmatizers are extraordinarily good (theoretically flawless) is telling how many occurrences of a certain word form there are in a given corpus. And by examining that data you can get in most cases a very accurate picture of how common are the forms derived from a particular stem or principal part of a Greek verb. In other words, the TLG Lemma Search (which is what I have been working with in making the principal parts lists for our site) helps us see more clearly than has ever been possible which principal parts of each verb are the most important, and which very common forms lie slightly outside the traditional lists of principal parts. It has the potential to make principal parts lists far more informative and helpful to the language learner even than the information found in Smyth, LSJ, or any of the current textbooks.

I can think of a couple ways in which TLG lemmatizer data could be used to enhance the presentation of Greek principal parts. One could, for example, have a second list of, say, the five most statistically common forms of a given verb. In the case of πάρειμι, for example, that would be the following (with the total raw occurrences in TLG as of today):

παρόντος (8587), παρόν (5406), παρόντα (4920), παρόντων (4442), παρόντι (3451)

In fact the top 10 or so are all participial. παρών παροῦσα παρόν: that’s what I call a principal part!
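A "most common forms" list of this kind is a simple sort once the per-form counts are in hand. Here is a rough sketch in Python, using only the five πάρειμι counts quoted above. The function name `top_forms` is hypothetical, and the sketch assumes the counts have already been copied out of a lemma search by hand; it does not query the TLG.

```python
def top_forms(form_counts, n=5):
    """Return the n most frequent (form, count) pairs, most frequent first."""
    ranked = sorted(form_counts.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


# Raw counts for forms of πάρειμι as quoted in the post, entered in
# arbitrary order. Only these five forms are given in the source.
pareimi = {
    "παρόντι": 3451,
    "παρόν": 5406,
    "παρόντων": 4442,
    "παρόντος": 8587,
    "παρόντα": 4920,
}

# Format the ranking as "form (count)" pairs, as in the list above.
print(", ".join(f"{form} ({count})" for form, count in top_forms(pareimi)))
```

Because of the homograph problem mentioned earlier, any list produced this way would still want a human check before it goes in front of students.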

Another way to do it would be to print in bold the principal part from which the most forms derive, or even use a couple different font sizes to reflect how commonly used each principal part is. For σῴζω, save, the figures are (roughly) as follows: σῴζω (8600), σώσω (1300), ἔσωσα (5500), σέσωκα (400), σέσωσμαι (700), ἐσώθην (8800). Interesting to see the aorist passive stem beat out the present stem. The top vote-getters in terms of forms are σωθῆναι, ἔσωθεν, σώζεται/σῴζεται, σῶσαι, and σῶσον.
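To make the bold-the-winner idea concrete, here is a small Python sketch that works from the rough σῴζω figures just given. It computes each principal part's share of the total and marks the most frequent part with double asterisks as a plain-text stand-in for bold. The function names are hypothetical, and the counts are the approximate ones from this post, not a fresh TLG query.

```python
def rank_parts(part_counts):
    """Return (part, count, share) tuples, most frequent stem first."""
    total = sum(part_counts.values())
    ranked = sorted(part_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(part, count, count / total) for part, count in ranked]


def render(part_counts):
    """Keep the traditional order; wrap the most frequent part in ** **."""
    top = max(part_counts, key=part_counts.get)
    return ", ".join(f"**{p}**" if p == top else p for p in part_counts)


# Approximate counts of forms per principal part of σῴζω, as given above,
# listed in the traditional order of the principal parts.
sozo = {
    "σῴζω": 8600,
    "σώσω": 1300,
    "ἔσωσα": 5500,
    "σέσωκα": 400,
    "σέσωσμαι": 700,
    "ἐσώθην": 8800,
}

print(render(sozo))
for part, count, share in rank_parts(sozo):
    print(f"{part}: {count} ({share:.1%})")
```

On these figures the aorist passive and present stems each account for roughly a third of the forms, and the two perfect stems together for under five percent. Mapping shares to font sizes instead of bold would be a small change to `render`.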

People who are better at Greek and spend more time with large corpora and their analysis than I do have probably thought of all this long ago, and there may be some principal parts lists that incorporate some of this data. If so, I would love to hear about it.

Before closing I should give a huge thank you to Prof. Stephen Nimis from Miami University of Ohio and his collaborator Evan Hayes, whose principal parts list in their edition of Lucian’s A True Story (soon to be re-published on our site with extra features) was of great assistance as I was making our list. And I should mention here also the crucial help I have had all along with our Greek list from the great Wilfred Major, of Louisiana State University.




The Future of Ancient Greek

“The print textbook will be gone in ten years. What’s the Greek classroom going to look like?”  This is the question that Tom Sienkewicz put to Greek scholar and pedagogical innovator Wilfred Major of Louisiana State University. Major’s response, first given at a 2012 CAMWS panel he co-organized, has just been published in the latest issue of Classical Outlook (“Teaching and Testing Classical Greek in a Digital World,” CO 89.2 [2012], pp. 36-39). It’s an important article that should be read by anyone interested in the teaching of ancient Greek, and since it’s (ironically) not on line, I take the liberty of quoting in extenso.

“A future where digital platforms are the standard mechanism for teaching ancient Greek is nearly in sight,” he says. Crucial advances are being made. Advanced Greek readers are already very well-served on line by Perseus and the TLG. Intermediate Greek is also increasingly well-served by digital resources.

Computerized analysis of the lemmas and morphology of Greek texts has made it possible to prioritize the assistance new readers need at their fingertips, as they make the transition from beginners to intermediate and then to independent readers. Support for this transition includes providing vocabulary (entries appropriate to their level) and morphological data (in the form of parsing information).

Major points to developing projects like the DCC, Geoffrey Steadman’s downloadable Greek readers, and the ongoing series by Evan Hayes and Steve Nimis, which

make texts, facing vocabulary, and other support information accessible at a glance to intermediate students, saving the time and drudgery of flipping through pages and allowing both students and teachers to stay focused on the comprehension and benefits of what they are reading.

The stabilization of the core intermediate vocabulary in the DCC, he argues, means that advanced students can also get involved by generating running vocabulary in a clear, straightforward manner, and have the satisfaction of producing lasting pedagogical materials for other students.

The bottleneck, he argues, is in Introductory Greek, where high-quality but in some ways antiquated print resources have not yet been fully matched by digital counterparts.

with no disrespect to the authors and publishers of these volumes, in terms of presentation, information, layout and design, standard word processing programs can produce virtually everything found in these books. With the addition of images and slide programs (such as Power Point), a teacher can do more, and better, than anything in these books.

Such materials, he insists, must take full advantage of computerized analysis of Greek texts to help make students effective intermediate and advanced readers of digital Greek. This means taking into account vocabulary frequency and density of texts, and also statistical data about the frequency of morphology and syntactical structures (here Major cites Anne Mahoney, “The Forms You Really Need to Know,” Classical Outlook 81 (2004): 101–05, also ironically not on line!).

Beginning Greek must be reconceived as it moves to digital platforms. Merely transferring current print presentations to digital display monitors will strangle the learning of Greek, a shameful prospect when such treasures now loom just beyond the beginning stages.

Another interesting point in the article has to do with the typing of Greek. Students must be helped to become proficient in typing Greek as soon as possible, and must not be required to buy a new piece of software to do so. He urges keyboard designers to work with standard Modern Greek keyboards as a basis.

Both Windows and Apple devices now have polytonic Greek keyboards and inputs built in at the system level, which need only be activated. Both incorporate the Modern Greek keyboard. While the Apple system has more flexible input options, it includes all the same input options as the default Windows system. As things stand, therefore, we should promote this system for its widespread accessibility and compatibility. Expecting or requiring students to purchase and install additional software will inevitably lead to problems as they move from computers to phones, tablets, and so on.

Most important, Major stresses that digital platforms are ideal for encouraging the steady practice, repetition, and feedback with the core material of Greek in a way that best addresses the frustration and attrition that plague beginning classes.

The vocabulary and parsing tools already established for advanced and intermediate digital materials also provide a goal and clear purpose of method for introducing vocabulary and morphological identity from the earliest stages of beginning Greek. Doing so means we can dispense with relying on the dozens of pages of charts and paradigms that we, explicitly or implicitly, expect students to memorize as a precondition of just beginning to read the simplest continuous Greek passage.

If you are not familiar with Major’s work on this kind of pedagogy, I urge you to check out his articles “On Not Teaching Greek,” Classical Journal 103 (2007): 93–98, and “Teaching Greek Verbs: A Manifesto,” Teaching Classical Languages 3 (2011): 23–42 (the latter co-authored with B. Stayskal), and the superb resources available on his frequently updated Greek resources page. My own thoughts about using the DCC and its core vocabulary in a sight reading-based approach can be found in an earlier post.