Gregory Crane has written a fierce new manifesto directed at editors of classical texts, in which he urges scholars to “liberate textual data from corporate control” by publishing editions only in open (Creative Commons) licensed venues and only in TEI-XML tagged formats, thus making them interoperable and freely accessible to a global audience. He laments the lack of progress in this direction, noting that TEI encoding has been around since the 1980s, and open licenses since the 1990s. The main culprit, he says, is academic politics, and the perceived need to publish under an established university press to receive formal academic credit. The publishing of critical texts only in book form is “preventing Classical Greek and Latin from shifting to a fully open intellectual ecosystem.”
The solution he proposes is for scholarly editors to publish their work themselves:
If editors wish to work on their own to create editions of Greek and Latin texts, they should buy a TEI-aware XML editor and learn how to produce a modern edition. Anyone smart enough to edit an edition of Greek and Latin is smart enough to understand the necessary TEI XML.
Why TEI? Working in interoperable TEI XML will allow for competing editions to be compared:
Here the goal is to have as many TEI XML transcriptions as possible and to help researchers visualize the degree to which different editions differ and to be able to compare different editions.
The ideal of a universal, interoperable apparatus criticus that collates all textual variants and conjectures of scholars based on existing print editions is probably, he admits, an unattainable one. He argues instead for a more pragmatic approach to apparatus, one that allows for word search that links to page images of the original print resources:
Here our goal is to have a maximally clean searchable text but not to add substantive TEI XML markup that captures the structure of the textual notes — the structure of these notes tend to be complicated and inconsistent. Our pragmatic goal is to support “image front searching,” so that scholars can find words in the textual notes and then see the original page images.
Another proposal is to create a series of open-licensed textual commentaries that collate the textual variants that are deemed most significant:
Strategy one: Support advanced graduate students and a handful of supervisory faculty to go through reviews of recent editions, identifying those editorial decisions that were deemed most significant. The output of this work would be an initial CC-BY-SA series of machine-actionable commentaries that could automatically flag all passages in the CC-BY-SA editions where copyrighted editions made significant decisions. In effect, we would be creating a new textual review series. Because the textual commentaries would be open and available under a CC-BY-SA, members of the community could suggest additions to them or create new expanded versions or create completely new, but interoperable, textual commentaries that could be linked to the CC-BY-SA texts. Here the goal is to create an initial set of data about textual decisions in copyrighted editions and a framework that members of the community can extend.
Crane imagines the objection that all this infrastructure is not really needed, since those who use critical editions of classical texts have access to all that they need, and that nobody else really needs scholarly critical editions of classical authors. But this view he sees as essentially suicidal for advanced research that is publicly funded:
If we think that specialists at well-funded academic institutions alone need access to the best textual data, we should express that position clearly so that the federally funded agencies and private foundations know where we stand.
Rather, scholars have an obligation (the word occurs four times) to share their publicly funded work with the public that has ultimately paid for it. The driving force behind this passionately argued essay is a profound sense of duty, a commitment to “our obligation as humanists to advance the intellectual life of humanity.”
My questions and comments are as follows:
- As someone dedicated to creating high-quality CC-BY-SA digital commentaries on classical texts, I applaud the vision, clarity, and passion of this essay. I believe with Crane that, as he has expressed in other venues, digitization is philology in the truest and highest sense. Digitization is a central intellectual and (again, Crane is correct) moral challenge facing our profession right now. If his essay shakes loose a few more philologists from unthinking acquiescence in the status quo, then it will be a victory.
- Why are scholarly editions and apparatus criticus the highest priority? Why not work on wresting better translations and commentaries from copyright, and from the brains of working scholars? Though I hesitate to say it for fear of being seen as lacking scholarly seriousness, we already have digitized texts that are good enough for most purposes, and for most authors significant textual issues can usually be dealt with in the context of an explanatory commentary. There is a significant need for new translations, however. For example, neither Livy nor Polybius has ever been translated into Chinese. This means that two of the seminal and central texts for the study of the Roman Republic are simply not available at all to a large portion of humanity. Even in the much better-served realm of English, public domain translations are often all but unreadable, if not downright misleading. Why not direct some funding and some of the scholarly energies of classicists in that direction?
- If we can think of the translation audience as the biggest and (arguably) most important circle, then the next concentric audience ring must be ancient language learners. What this group needs above all are well-annotated editions with linguistic explanations, interpretations, and links to grammatical and historical reference works. One of the best ways for classical scholars to fulfil their duty to openly disseminate their findings would be to apply those findings to texts, summarizing the research published in articles and monographs and making it directly relevant to the serious students who take the time to work their way through a dialogue of Plato or a book of Homer in Greek or a speech of Cicero in Latin. Existing open resources for this are woefully inadequate.
- Finally, if we progress to the innermost circle of textual editors and research scholars, I would like to have some more specificity and examples of the ways in which TEI-XML will allow for interoperability. A recent article in the Journal of TEI by the classically trained Desmond Schmidt suggested that true interoperability of digital scholarly editions via TEI is not really possible, given the subjectivity of tagging. But even if we can all stick strictly to the EpiDoc standards, how does this benefit us in practice? Can we see an example of a pair of correctly tagged editions of the same text from different sources, and of the benefit this interoperability provides? It seems that the minimal tag set proposed in the current EpiDoc standards for external apparatus criticus should make this theoretically feasible. But when it comes to in-line commentary, to the actual connecting of a scholarly discourse to a particular passage in a classical text via TEI-XML, the EpiDoc guidelines are a stub. And in the XML tagged commentaries on Perseus, like that of Greenough et al. on Caesar’s Gallic War, there doesn’t seem to be any clear interoperable linking with the Latin text itself. But maybe I’m misunderstanding the tags. I would love to be able to see a few examples of TEI-compliant commentaries on classical texts, and then a demonstration of how the effort needed to produce such bears actual fruit. Then I would consider the large investment of time and money required to put the DCC commentaries into TEI-XML.
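To make the question about interoperability concrete, here is a minimal Python sketch of the kind of comparison at issue. It is only an illustration under stated assumptions: the two TEI fragments are invented toy data (they differ in a single u/v spelling and are not taken from any real edition), the use of `div` and `seg` elements with `n` attributes as citation units is an assumed encoding choice rather than anything prescribed by Crane or by EpiDoc, and the helper names `words_by_ref` and `diff_editions` are hypothetical.

```python
import difflib
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Toy data, invented for illustration only (not real editions).
EDITION_A = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="textpart" subtype="chapter" n="1">
<p><seg n="1">Gallia est omnis divisa in partes tres</seg></p>
</div></body></text></TEI>"""

EDITION_B = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="textpart" subtype="chapter" n="1">
<p><seg n="1">Gallia est omnis diuisa in partes tres</seg></p>
</div></body></text></TEI>"""


def words_by_ref(tei_source):
    """Map a citation like '1.1' (div n + seg n) to the words of that segment.

    Assumes one level of <div>; nested divs would need extra handling.
    """
    root = ET.fromstring(tei_source)
    refs = {}
    for div in root.iter(TEI_NS + "div"):
        for seg in div.iter(TEI_NS + "seg"):
            ref = f"{div.get('n')}.{seg.get('n')}"
            refs[ref] = "".join(seg.itertext()).split()
    return refs


def diff_editions(tei_a, tei_b):
    """List (ref, words_in_a, words_in_b) wherever the two editions differ."""
    a, b = words_by_ref(tei_a), words_by_ref(tei_b)
    differences = []
    # Only citations present in both editions can be compared word by word.
    for ref in sorted(a.keys() & b.keys()):
        matcher = difflib.SequenceMatcher(a=a[ref], b=b[ref], autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op != "equal":
                differences.append((ref, a[ref][i1:i2], b[ref][j1:j2]))
    return differences


if __name__ == "__main__":
    for ref, in_a, in_b in diff_editions(EDITION_A, EDITION_B):
        print(ref, in_a, in_b)
```

The comparison works only because both fragments mark the same citation units in the same way. That shared convention is exactly what the subjectivity-of-tagging objection calls into question, so the sketch shows the best case rather than settling the matter.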
Thank you, Dr. Crane, for this bracing and inspiring essay!
These are all excellent points (some I was going to make myself, especially on the need to support the creation of more translations). These are not simply questions that need to be solved in the classics, but in the entire way we approach the humanities. It’s also a serious problem that producers of e-book software simply don’t allow for much beyond a flat, linear text. Even something like the in-progress HTMLBook specification doesn’t have provision for many elements of page layout commonly found in books before the creation of automated typesetting in the late nineteenth century. Indeed, this problem has much older roots than the creation of the Internet, and we need to stop pretending both that we can solve it ourselves and that someone else will fix it for us.
While everyone seems to be standardizing on CC BY-SA for the purposes of scholarly editions, I haven’t seen a rationale for this anywhere. I don’t think it’s necessarily wrong, and indeed one could argue that it is simply a codification of what editors have historically practised: it has always been assumed that new editions would take full account of the insights of past scholars, and quite often with minimal modification to their version of the text. Where accuracy is paramount, it also prevents certain forms of abuse, as for instance the case of the online Liddell-Scott-Jones by the TLG, where they claim to have made corrections to the text, and an early version of their informational page indicated that they would post a list of their modifications – but this never happened publicly, nor do they allow the underlying XML to be downloaded. Still, I wonder whether CC BY might be more appropriate in the long term, if our aim is simply to make our work as widely accessible as possible.
Thanks for this, Andrew. I’d love to hear you talk more about the hows, whys, and wherefores of your work editing Latin in XML, for example the Alexander Neckham text you pointed me to on Twitter: https://raw.githubusercontent.com/adunning/alexander-neckam/master/super-mulierem-fortem.xml
Have you been blogging about your workflow and methods? Maybe a guest post for this blog, please pleeeeease?
Interesting idea; EpiDoc is fairly straightforward, once you get the hang of it, but I certainly made a number of missteps along the way in trying to figure out the best workflow. My response to Crane touches on some of the theory behind what I’m doing.