{"id":462,"date":"2021-12-08T22:25:52","date_gmt":"2021-12-09T03:25:52","guid":{"rendered":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/?page_id=462"},"modified":"2021-12-13T17:00:42","modified_gmt":"2021-12-13T22:00:42","slug":"data-visualizations","status":"publish","type":"page","link":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/digital-editions\/whodunnit-the-passionate-pilgrim\/data-visualizations\/","title":{"rendered":"Data Visualizations"},"content":{"rendered":"<p>One of the exciting benefits of analyzing literary data is that it can be visualized! I have included some of my favorite visualizations made with the data taken from the works that we used in our analysis. The visualizations were created in RStudio, with the R programming language.<\/p>\n<p>For reference, here is the data table I worked with:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-471\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-08-at-9.36.57-PM.png\" alt=\"Table of literary data from analyzed works\" width=\"513\" height=\"445\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-08-at-9.36.57-PM.png 1074w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-08-at-9.36.57-PM-300x260.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-08-at-9.36.57-PM-1024x889.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-08-at-9.36.57-PM-768x666.png 768w\" sizes=\"auto, (max-width: 513px) 100vw, 513px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Here, I plotted the unknown poems from\u00a0<em>The Passionate Pilgrim\u00a0<\/em>(listed by the number they appear under in the text) as well as the confirmed works of the four potential 
authors.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-465\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-labels.png\" alt=\"\" width=\"2010\" height=\"1368\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-labels.png 2010w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-labels-300x204.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-labels-1024x697.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-labels-768x523.png 768w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-labels-1536x1045.png 1536w\" sizes=\"auto, (max-width: 2010px) 100vw, 2010px\" \/><\/p>\n<p>I noticed that there seemed to be a negative correlation between word length and sentence complexity in the works. 
The red line in the graph below is a linear regression line that shows that correlation.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-466 size-full\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-regression-line.png\" alt=\"\" width=\"1626\" height=\"1430\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-regression-line.png 1626w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-regression-line-300x264.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-regression-line-1024x901.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-regression-line-768x675.png 768w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/2d-plot-w-regression-line-1536x1351.png 1536w\" sizes=\"auto, (max-width: 1626px) 100vw, 1626px\" \/><\/p>\n<p>The implication of this finding is that when the poets used longer words, they used less punctuation in their sentences. 
It is as if either punctuation or word length could make their writing more complex, so if one was high, the other didn&#8217;t have to be.<\/p>\n<p>The following bivariate boxplot illustrates how one poem, XVII (or 17), is a significant outlier in the comparison between sentence complexity and word length, with a very high average word length and a very low sentence complexity.<img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-468\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/bv-boxplot.png\" alt=\"\" width=\"2016\" height=\"1326\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/bv-boxplot.png 2016w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/bv-boxplot-300x197.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/bv-boxplot-1024x674.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/bv-boxplot-768x505.png 768w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/bv-boxplot-1536x1010.png 1536w\" sizes=\"auto, (max-width: 2016px) 100vw, 2016px\" \/><\/p>\n<p>Note: Bivariate boxplots represent 2-dimensional data. The inner dotted circle (called the hinge) marks the region containing the central 50% of the data, and the outer circle (called the fence) marks where the non-outlier data ends.<\/p>\n<p>To check whether poem XVII&#8217;s outlier status makes sense, here are its first 8 lines:<\/p>\n<blockquote><p>My flocks feed not,<br \/>\nMy ewes breed not,<br \/>\nMy rams speed not,<br \/>\nAll is amiss:<br \/>\nLove is dying,<br \/>\nFaith&#8217;s defying,<br \/>\nHeart&#8217;s denying,<br \/>\nCauser of this.<\/p><\/blockquote>\n<p>The lines in poem XVII are very short, but the author&#8217;s vocabulary is not noticeably different from that of the other poems. 
Hence, the low sentence complexity.<\/p>\n<p>Since the authorship analysis was based mostly on three variables (line length, word length, and sentence complexity), it is worth representing all three of them in one plot. That is what the 3D scatterplot below does. All of the unknown works are represented.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-467\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/3d-plot.png\" alt=\"\" width=\"1970\" height=\"1326\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/3d-plot.png 1970w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/3d-plot-300x202.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/3d-plot-1024x689.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/3d-plot-768x517.png 768w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/3d-plot-1536x1034.png 1536w\" sizes=\"auto, (max-width: 1970px) 100vw, 1970px\" \/><\/p>\n<p>Using the three attributes, poem\/author fingerprints can be plotted as vectors as well as points. Note how each vector points in a similar direction, indicating that the relationship between the three attributes is quite similar for all four authors. This makes sense, since they were all contemporaries.<\/p>\n<p>If you look closely, you can also see that Griffin and Barnfield appear to cluster together, and that Marlowe and Shakespeare do the same. 
I note in the project reflection that almost all (4\/5) of the differences between our classifications and those of Elliott and Valenza (1991) are that we attribute a poem to Marlowe when they found it within Shakespeare&#8217;s statistical range.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-545\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-10-at-4.02.44-PM.png\" alt=\"3d-plotted vectors of author fingerprints\" width=\"1630\" height=\"1084\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-10-at-4.02.44-PM.png 1630w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-10-at-4.02.44-PM-300x200.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-10-at-4.02.44-PM-1024x681.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-10-at-4.02.44-PM-768x511.png 768w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Screen-Shot-2021-12-10-at-4.02.44-PM-1536x1021.png 1536w\" sizes=\"auto, (max-width: 1630px) 100vw, 1630px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Now for some fun stuff: humans are quite good at identifying visual similarities between shapes.<\/p>\n<p>The following star plots display the three attributes in the data table by extending a line out to the magnitude of the attribute for the specific row. 
The size of the plot indicates the magnitude of all 3 attributes (e.g., Poem 4 has a relatively high word length, line length, and sentence complexity), while the shape indicates the relationship between the attributes.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-470\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Star-plots.png\" alt=\"\" width=\"1320\" height=\"1374\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Star-plots.png 1320w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Star-plots-288x300.png 288w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Star-plots-984x1024.png 984w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Star-plots-768x799.png 768w\" sizes=\"auto, (max-width: 1320px) 100vw, 1320px\" \/><\/p>\n<p>Plots that look similar have similar attributes. For example, Poems 12 and 14 look quite similar in these plots, and our model classified both as being written by Bartholomew Griffin.<\/p>\n<p>The final visualization is a series of Chernoff faces, a technique that maps each column of the data table to a characteristic of the face (e.g., the shade of the face color corresponds to the average word length). 
This allows the viewer to perceive patterns that would otherwise be difficult to see.<img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-469\" src=\"http:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Chernoff-faces.png\" alt=\"\" width=\"1428\" height=\"1146\" srcset=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Chernoff-faces.png 1428w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Chernoff-faces-300x241.png 300w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Chernoff-faces-1024x822.png 1024w, https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/files\/2021\/12\/Chernoff-faces-768x616.png 768w\" sizes=\"auto, (max-width: 1428px) 100vw, 1428px\" \/><\/p>\n<p>For example, the &#8220;Griffin&#8221; face is very similar to the face for Poem 12, and indeed, through our analysis we found with confidence that Poem 12 was written by Bartholomew Griffin.<\/p>\n<p><em><strong><a href=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/digital-editions\/whodunnit-the-passionate-pilgrim\/\">Whodunnit: The Passionate Pilgrim<\/a>\u00a0 \u00a0<\/strong><\/em><\/p>\n<p><a href=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/digital-editions\/whodunnit-the-passionate-pilgrim\/stylistic-analysis-and-findings\/\"><em><strong>Stylistic Analysis<\/strong><\/em><\/a><\/p>\n<p><a href=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/digital-editions\/whodunnit-the-passionate-pilgrim\/the-passionate-p\u2026m-annotated-text\/\"><em><strong>Annotated Text<\/strong><\/em><\/a><\/p>\n<p><em><strong>\u00a0<a href=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/digital-editions\/whodunnit-the-passionate-pilgrim\/reflection\/\">Reflection<\/a><\/strong><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the exciting benefits of analyzing literary data 
is that it can be\u00a0 \u00a0 visualized! I have included some of my favorite visualizations made with the data taken from the works that we used in our analysis. The\u00a0 \u00a0 visualizations were created in RStudio, with the R programming language. For reference, here is the &hellip; <a href=\"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/digital-editions\/whodunnit-the-passionate-pilgrim\/data-visualizations\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Data Visualizations<\/span><\/a><\/p>\n","protected":false},"author":4792,"featured_media":0,"parent":212,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-462","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/pages\/462","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/users\/4792"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/comments?post=462"}],"version-history":[{"count":0,"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/pages\/462\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/pages\/212"}],"wp:attachment":[{"href":"https:\/\/blogs.dickinson.edu\/digitalmethodsforthehumanities\/wp-json\/wp\/v2\/media?parent=462"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}