{"id":922,"date":"2015-01-13T13:42:24","date_gmt":"2015-01-13T13:42:24","guid":{"rendered":"http:\/\/blogs.dickinson.edu\/dcc\/?p=922"},"modified":"2015-01-13T13:47:16","modified_gmt":"2015-01-13T13:47:16","slug":"a-new-latin-macronizer","status":"publish","type":"post","link":"https:\/\/blogs.dickinson.edu\/dcc\/2015\/01\/13\/a-new-latin-macronizer\/","title":{"rendered":"A New Latin Macronizer"},"content":{"rendered":"<p>Felipe Vogel has released a new Latin macronizer, <a href=\"http:\/\/fps-vogel.github.io\/maccer\/\">Maccer<\/a>, and I thought I would take it for a spin and share the results. It works based on a database of previously macronized Latin texts (some provided by DCC), and is still in development.<\/p>\n<p>For my test I figured I would use an unusual text I have been working on lately,\u00a0<em>Historiarum Indicarum Libri XVI<\/em>, about the Portuguese exploration of the Far East in the 16th century. It was published\u00a0by the Jesuit humanist Pietro Maffei in 1588, and the Latin is excellent and full of interest. Book 6 is a fascinating ethnography of China, informed by reports from Jesuit missionaries who visited and lived in China over a number of years. The last print edition was 1751: <em>Joannis Petri Maffeii Bergomatis E Societate Jesu Historiarum Indicarum Libri XVI<\/em> (Vienna: Bernardi, 1751), and thanks to a tip from Terence Tunberg (who introduced me to this text) I tracked it down on the site of the <a href=\"http:\/\/digital.slub-dresden.de\/werkansicht\/dlf\/57782\/1\/\">Dresden Library<\/a>. Since there is no fully digitized text, my students and I transcribed Book 6 this past fall. Here is an excerpt, with no macrons.<\/p>\n<blockquote><p>E Sinarum provinciis maxime occidua est Cantonia. Eo priusquam pervenias, multae occurrunt insulae; quas praefecti regii praesidiis et classibus tenent: neque ipsorum iniussu progredi advenas Cantonem est fas. 
Fernandus Andradius, ut exponere coeperam, cum ad Tamum insulam pervenisset, post diuturnam moram, transitu aegre tandem impetrato, cum duobus expeditis et egregie ornatis navigiis, cetera classe ad Tamum relicta, Cantonis portum invehitur, ac magistratuum permissu Thomam legatum exponit, cui aedes et lautia de more attributa. Ibi Fernandus, mira lenitate ac iustitia contrahendo cum incolis, haud ita difficili negotio aditum ad ea commercia nostris aperuit.<\/p><\/blockquote>\n<p>With Vogel&#8217;s macronizer this becomes<\/p>\n<blockquote><p>\u0112 \u2716Sinarum pr\u014dvinci\u012bs maxim\u0113 \u2716occidua \u272aest \u2716Cantonia. E\u014d priusquam perveni\u0101s, multae occurrunt \u012bnsulae; qu\u0101s \u2716praefecti \u2716regii praesidi\u012bs et classibus tenent: neque ips\u014drum \u2761iniuss\u016b pr\u014dgred\u012b \u2716advenas \u2716Cantonem \u272aest f\u0101s. \u2716Fernandus \u2716Andradius, ut exp\u014dnere \u2716coeperam, cum ad \u2716Tamum \u012bnsulam perv\u0113nisset, post di\u016bturnam moram, tr\u0101nsit\u016b aegr\u0113 tandem \u2716impetrato, cum du\u014dbus exped\u012bt\u012bs et \u0113gregi\u0113 \u2716ornatis n\u0101vigi\u012bs, c\u0113tera classe ad \u2716Tamum \u272arelict\u0101, \u2716Cantonis portum invehitur, ac magistr\u0101tuum \u2761permiss\u016b \u2716Thomam l\u0113g\u0101tum exp\u014dnit, cui aed\u0113s et \u2716lautia d\u0113 m\u014dre \u2761attrib\u016bta. Ibi \u2716Fernandus, \u2712m\u012br\u00e3 \u2716lenitate ac i\u016bstitia \u2716contrahendo cum incol\u012bs, haud ita \u2716difficili neg\u014dti\u014d aditum ad \u2712e\u00e3 commercia nostr\u012bs aperuit.<\/p><\/blockquote>\n<p>The symbols mean this:<\/p>\n<table id=\"key\">\n<tbody>\n<tr id=\"key\" class=\"odd\">\n<td id=\"key\" align=\"left\">\u2716<\/td>\n<td id=\"key\" align=\"left\">unknown word, i.e. 
not yet in Vogel&#8217;s database.<\/td>\n<\/tr>\n<tr id=\"key\" class=\"even\">\n<td id=\"key\" align=\"left\">\u2712<\/td>\n<td id=\"key\" align=\"left\">ambiguous: uncertain vowels marked with a tilde (~).<\/td>\n<\/tr>\n<tr id=\"key\" class=\"odd\">\n<td id=\"key\" align=\"left\">\u272a<\/td>\n<td id=\"key\" align=\"left\">guessed based on frequency.<\/td>\n<\/tr>\n<tr id=\"key\" class=\"even\">\n<td id=\"key\" align=\"left\">\u2761<\/td>\n<td id=\"key\" align=\"left\">prefix or enclitic detected attached to a known word.<\/td>\n<\/tr>\n<tr id=\"key\" class=\"odd\">\n<td id=\"key\" align=\"left\">\u261b<\/td>\n<td id=\"key\" align=\"left\">invalid characters detected.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>I made sixteen corrections in 92 words.<\/p>\n<p>21 words were flagged as unknown (Sin\u0101rum, occidua, Cantonia, praefect\u012b, regi\u012b, adven\u0101s, Cantonem, Fernandus, Andradius, coeperam, Tamum, impetr\u0101t\u014d, orn\u0101t\u012bs, Tamum, Cantonis, Thomam, lautia, Fernandus, l\u0113nit\u0101te, contrahend\u014d, difficil\u012b); 10 of those were proper names. I made 9 corrections in that group, leaving alone most of the proper names for now.<\/p>\n<p>3 words were guessed based on frequency, all correctly (est, est, relict\u0101).<\/p>\n<p>3 words were marked as &#8220;prefix detected,&#8221; all correctly macronized (iniuss\u016b,\u00a0permiss\u016b,\u00a0attrib\u016bta).<\/p>\n<p>2 words were marked as ambiguous (m\u012br\u0101, ea),\u00a0had tildes over the uncertain vowel, and had to be corrected by hand.<\/p>\n<p>Only two words were incorrect but not flagged as in any way problematic (c\u0113ter\u0101,\u00a0i\u016bstiti\u0101). In both cases the problem was an ambiguous first-declension -a. The other vowels in those words were correct.<\/p>\n<p>The hand-corrected result is as follows:<\/p>\n<p>\u0112 Sin\u0101rum pr\u014dvinci\u012bs maxim\u0113 occidua est Cantonia. 
E\u014d priusquam perveni\u0101s, multae occurrunt \u012bnsulae; qu\u0101s praefect\u012b regi\u012b praesidi\u012bs et classibus tenent: neque ips\u014drum iniuss\u016b pr\u014dgred\u012b adven\u0101s Cantonem est f\u0101s. Fernandus Andradius, ut exp\u014dnere coeperam, cum ad Tamum \u012bnsulam perv\u0113nisset, post di\u016bturnam moram, tr\u0101nsit\u016b aegr\u0113 tandem impetr\u0101t\u014d, cum du\u014dbus exped\u012bt\u012bs et \u0113gregi\u0113 orn\u0101t\u012bs n\u0101vigi\u012bs, c\u0113ter\u0101 classe ad Tamum relict\u0101, Cantonis portum invehitur, ac magistr\u0101tuum permiss\u016b Thomam l\u0113g\u0101tum exp\u014dnit, cui aed\u0113s et lautia d\u0113 m\u014dre attrib\u016bta. Ibi Fernandus, m\u012br\u0101 l\u0113nit\u0101te ac i\u016bstiti\u0101 contrahend\u014d cum incol\u012bs, haud ita difficil\u012b neg\u014dti\u014d aditum ad ea commercia nostr\u012bs aperuit.<\/p>\n<p>I would call these\u00a0very good results, and it should be possible to do even better given a larger database. In theory we could do better still by marrying\u00a0a parser and a dictionary like LaNe that has quantities accurately marked. If all goes well I hope to embark on such a project this fall with the help of a Dickinson Computer Science senior student. The other thing I would like to see is an editing environment that would make inserting macrons as easy as clicking on the vowel. This would really help in the inevitable process of hand correction.<\/p>\n<p>Thank you, Felipe, for this amazing tool!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Felipe Vogel has released a new Latin macronizer, Maccer, and I thought I would take it for a spin and share the results. 
It works based on a database of previously macronized Latin texts (some provided by DCC), and is &hellip; <a href=\"https:\/\/blogs.dickinson.edu\/dcc\/2015\/01\/13\/a-new-latin-macronizer\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":65,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[61804,1],"tags":[95792,95793,95791],"class_list":["post-922","post","type-post","status-publish","format-standard","hentry","category-collaborations","category-uncategorized","tag-felipe-vogel","tag-macronizers","tag-macrons"],"_links":{"self":[{"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/posts\/922","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/users\/65"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/comments?post=922"}],"version-history":[{"count":0,"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/posts\/922\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/media?parent=922"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/categories?post=922"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/dcc\/wp-json\/wp\/v2\/tags?post=922"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}