Summer Accomplishments, part 2

Jimmy Martin (’13) passes on this summary of his work over the summer:

He focused on creating the Greek Core list, organized by parts of speech and by TLG frequency, and the Latin list, organized by parts of speech.
He read through the Amores, helping create the vocabulary lists as he went. He read through Cicero’s Pro Caelio, creating vocabulary lists for his assigned sections. He read through most of Book 5 of the Gallic Wars, adding and subtracting vocabulary according to the updated Core Latin Vocabulary List.

Thanks, Jimmy, for all your important contributions to the project!

–Chris Francese

4 thoughts on “Summer Accomplishments, part 2”

  1. Will the Amores and Pro Caelio sections be added to the website? I’d love to read through those with my high school students.

    • Any day now, Michael! Thanks for your interest. Just finishing up a few last-minute touches. I’ll announce on the blog and various listservs.

  2. Hi there! Congrats on the latest topics, they’re pretty interesting, guys. I managed to read the resources page of the web site, the one listing the books, papers, etc. used to come up with the word lists, and I still wonder about the methods used to get the Latin one done. I mean the technical method, the programming used to gather the statistics and all. I’m really thrilled by it, as I study Classical Latin too, mostly by myself, and I have just presented a paper about using NLP on Cicero’s texts to create word lists for study: (sorry, it’s in Portuguese; the English abstract is on page 9, but the technical bit and references should be universal) 🙂

    • Thanks very much for your comment, Caio. I am very interested in software that will aid in the creation of vocabulary lists, but so far I have not found any that saves much time. Please keep me updated on your work. So far, in my experience, everything has to be hand edited. This very much applies to the Latin frequency list as well. The Perseus automated tools at this point are far less useful than the hand-made data of Diederich and the LASLA group. They simply parsed texts and tabulated the results by hand, and the magic of that is that we can have real confidence that the lemmatization is substantially correct. But of course there is no lemmatized sample of all Latin, and the results will necessarily vary somewhat based on the sample. So we used the two main samples, checked against a Perseus-generated postclassical sample, and applied some judgment to come up with a full list. Not scientific by any means, but I think reliable enough for our purposes.
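
For readers curious what a cross-check of this kind might look like in code, here is a minimal Python sketch. It is not the procedure actually used for the list, which was done by hand and with editorial judgment. The function names, the cutoff `top_n`, and the toy counts are all hypothetical, invented only for illustration. The idea is simply: lemmas that rank high in both main samples are kept, and lemmas that appear in only one are flagged for a human to review, with a note on whether the third (check) sample supports them.

```python
def rank_lemmas(freqs):
    """Return lemmas ordered from most to least frequent (ties broken alphabetically)."""
    return [lemma for lemma, _ in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))]


def combine_samples(sample_a, sample_b, check_sample, top_n):
    """Compare the top_n lemmas of two main samples against a third check sample.

    Returns (agreed, review):
      agreed - sorted list of lemmas in the top_n of BOTH main samples
      review - dict mapping each lemma found in only ONE main sample to
               True/False, depending on whether the check sample's top_n
               also contains it. These still need human judgment.
    """
    top_a = set(rank_lemmas(sample_a)[:top_n])
    top_b = set(rank_lemmas(sample_b)[:top_n])
    top_check = set(rank_lemmas(check_sample)[:top_n])

    agreed = top_a & top_b      # both main samples agree
    disputed = top_a ^ top_b    # present in exactly one main sample
    review = {lemma: lemma in top_check for lemma in disputed}
    return sorted(agreed), review


# Toy counts, invented for illustration only (not real frequency data).
sample_a = {"sum": 100, "et": 90, "qui": 80, "amor": 10}
sample_b = {"sum": 95, "et": 85, "res": 70, "qui": 5}
check_sample = {"et": 50, "sum": 40, "res": 30}

agreed, review = combine_samples(sample_a, sample_b, check_sample, top_n=3)
print(agreed)   # lemmas both main samples place in the top 3
print(review)   # disputed lemmas, with whether the check sample backs them
```

A script like this only narrows down where the samples disagree. Deciding what to do with the disputed lemmas is still the hand-editing step described above.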

