
“Digging” Deep: Data Mining, Science and digital tools fill in the gaps to the history of our past

Data mining and data mining tools present wonderful opportunities for digging deep into our historical past. This week, EWU students, including myself, had the opportunity to read about and observe data mining tools and examples of how such tools are being used.

What We Learned from 5 Million Books, TED Talks

One of the first examples we observed was the use of Google’s Ngram Viewer, from the TED Talk video “What We Learned from 5 Million Books.” Speakers Erez Lieberman Aiden and Jean-Baptiste Michel illustrated that by looking at n-grams (contiguous sequences of “n” items, such as words, drawn from a text) across 5 million books, cultural trends and patterns could be examined. One example they illustrated was the usage of “throve” and “thrived” over the course of the last two hundred years. What the Google Ngram Viewer showed was that “throve” was at its peak in texts from around 1800, then slid downward as it approached the year 2000. “Thrived,” on the other hand, went up like an escalator, from its lowest point (1800) to its highest (2000).
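To make the idea concrete, here is a minimal sketch of how an n-gram's relative frequency could be counted per year. It is only an illustration under stated assumptions: the two sample sentences in `toy_corpus` are invented, the function names are my own, and nothing here uses real Google Books data or the Ngram Viewer itself.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(text, phrase):
    """Share of the text's n-grams that match the phrase (n = words in phrase)."""
    tokens = text.lower().split()
    target = tuple(phrase.lower().split())
    grams = ngrams(tokens, len(target))
    if not grams:
        return 0.0
    return Counter(grams)[target] / len(grams)


# Invented sample sentences keyed by year; NOT real Google Books data.
toy_corpus = {
    1800: "the colony throve and the trade throve while the town thrived",
    2000: "the company thrived and the city thrived while nothing throve",
}

for year, text in sorted(toy_corpus.items()):
    throve = relative_frequency(text, "throve")
    thrived = relative_frequency(text, "thrived")
    print(year, round(throve, 3), round(thrived, 3))
```

Dividing by the total number of n-grams in each year matters because the amount of text per year differs; raw counts alone would mostly reflect how much was published, not how popular a word was.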

Next, we read Carly Minsky’s “How AI helps historians solve ancient puzzles.” Minsky argues that the adoption of artificial intelligence is speeding up historical research. Before artificial intelligence, uncovering evidence, theories and patterns was held back by slow processes such as inputting data from artefacts, older records and handwritten documents. Ayellet Tal, an archaeological and science researcher at Israel’s Technion, noted that “applying algorithmic techniques to historical research can improve AI’s capabilities.” Algorithms had not accounted for degrading fragments in artefacts, but A.I. models learned how to reverse the erosion process, which is why algorithms and A.I. are a good match. A tool called “Pythia,” led by Yannis Assael and Thea Sommerschield, was developed to fill in the gaps of ancient Greek inscriptions and to help researchers understand how algorithms work in their decision-making processes. The lesson is similar to that of the Ngram Viewer: by seeing the weightings the model applies to parts of the input text, historians can better evaluate its predictions.

There are, of course, a few problems with technological advancements such as artificial intelligence. Matthew Connelly, a digital history professor and investigator at the History Lab, states that “It is crucial historians recognize the role that algorithms play in historical understanding” and that historians must adapt and transform “their own methods of research and analysis.” Machine learning must be fused with data science research techniques, because what is left out can be very damaging. We must not rely too heavily on machine input, and we must be able to transform human understanding as it works alongside artificial intelligence.

Zoe Alker, IHR Digital History Seminar

Probably one of the coolest pieces of research I have personally observed in a while is shown by Zoe Alker in the YouTube video “Data mining convict tattoos, 1788-1925.” This seminar looks at the tattoos of convicts from the Victorian Age. Alker explains that the Digital Panopticon extracted data from over 50 datasets and 4 million records, covering over 25,000 individuals. The Panopticon uses automated record linkage to link together criminal justice, biometric and genealogical data. There are also data visualization features, which allow you to identify patterns in the large bodies of data. Alker shows that there are 5 key steps in the project methodology used to produce this data. First, the descriptions were divided into single words, multi-word phrases and body specifiers, which Alker referred to as “chunks.” Second, tattoos were linked to their associated body parts. Third, subjects were assigned to the designs (hearts, anchors, etc.). Fourth, tattoos were linked to other evidence. And lastly, the data was tabulated, visualized and analyzed.
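The steps above can be sketched in a few lines of Python. This is a rough illustration and not the Digital Panopticon's actual pipeline: the description format, the `BODY_PARTS` and `SUBJECTS` vocabularies, and the sample records are all invented for the example, and step four (linking tattoos to other evidence) is left out because it depends on the project's record linkage.

```python
import re
from collections import Counter

# Hypothetical vocabularies for illustration only; not the project's real lists.
BODY_PARTS = ["right arm", "left arm", "chest", "back"]
SUBJECTS = {"heart": "love", "anchor": "naval", "mermaid": "naval", "tombstone": "mourning"}


def chunk(description):
    """Step 1: split a free-text description into chunks at commas and semicolons."""
    return [c.strip().lower() for c in re.split(r"[;,]", description) if c.strip()]


def link_body_part(chunk_text):
    """Step 2: return the first known body part mentioned in the chunk, if any."""
    for part in BODY_PARTS:
        if part in chunk_text:
            return part
    return None


def assign_subjects(chunk_text):
    """Step 3: return the known design words that appear in the chunk."""
    words = re.findall(r"[a-z]+", chunk_text)
    return [w for w in words if w in SUBJECTS]


def tabulate(descriptions):
    """Step 5: count (design, body part) pairs across all descriptions."""
    counts = Counter()
    for description in descriptions:
        for piece in chunk(description):
            part = link_body_part(piece)
            for design in assign_subjects(piece):
                counts[(design, part)] += 1
    return counts


# Invented sample records, written in an assumed "design on body part" style.
records = [
    "Anchor on right arm, heart on left arm",
    "Heart and anchor on chest; tombstone on back",
]
table = tabulate(records)
for (design, part), count in sorted(table.items()):
    print(design, SUBJECTS[design], part, count)
```

Real convict descriptions are far messier than these samples (abbreviations, initials, misspellings), so simple substring matching like this would miss a great deal; it only shows how the chunking, linking and tabulating steps fit together.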

This study can be followed further on Digital Panopticon: Convict Tattoos. Essentially, the study was undertaken because criminologists and social investigators Cesare Lombroso and Henry Mayhew believed that tattoos were a secret code among the criminals of Victorian London. What Alker showed was that by the early 1800s there was a paranoia about a criminal class, which may parallel Lombroso and Mayhew’s hypothesis. However, Zoe Alker and Robert Shoemaker’s study actually showed that tattoos were a growing trend in the Victorian Age and were gradually accepted. Data-mining techniques helped link information from other fields with criminal records to provide the data for their study. To illustrate how the Digital Panopticon showed this gradual acceptance and these tattoo trends, they pointed to the “five dot” tattoo, which became popular throughout the Victorian Age. 19th-century opinion held that tattoos were savage, as demonstrated by the fear of groups of 40 people who were labeled “the Forty Thieves.” Alker and Shoemaker found that criminals weren’t just using tattoos to signal criminal fraternity or to be recognized as a criminal class, because for the next 50 years (after 1820) men and women had tattoos with love and naval themes, and things like tombstones, which suggest mourning a loss. By the 1880s, there were even tattoos of “Buffalo Bill” among men and women.

In my own studies using the Digital Panopticon, I found some trends of my own (if you want, you can try searching the dataset here!). I found that from 1841 to 1880, “heart” tattoos grew massively popular among male convicts, possibly reflecting a material culture centered on loss, which placed more emphasis on love as a result.

Lastly, the class read Candace Sutton’s “Marked men: The secret codes and hidden symbols of Australian convict tattoos.” This article, again, focuses on tattoos, but among Australia’s convicts. Out of 160,000 convicts sent to Australia, it was discovered that 37% of male and 15% of female convicts had tattoos. It was really interesting to see the meanings of the tattoos, especially as they applied to the convicts. For example, 19th-century Australian boxers had tattoos on their chests and biceps to illustrate their celebrity. Ann Gough, an Australian woman who spent 7 years in jail, had tattoos that perhaps told a life story: she had a mermaid, an anchor, a heart and 2 stars. According to Sutton, these tattoos represented “an undying affection” and an “erotic love.”

As you can see, our ability to dig up trends and patterns in historical data, and to fill in the gaps in our knowledge, is being enhanced by our ability to work with technology and applications. As technology transforms, so do we, especially if we take the time to learn what these programs have to offer. As technology advances, it is crucial that historians understand how programs do their analysis, so they can improve their own research. Humans are limited in how much material we can scan through, but technology continues to improve at analyzing large amounts of data. In the process, historians learn how to improve their techniques by analyzing the very tools they use for research. Data mining is a perfect example of how technology and human beings can co-exist, as long as we don’t depend solely on artificial intelligence.

It is through artificial intelligence, data visualization, data mining, and scientific and historical analysis that we can provide context for the past in a way that facilitates research in the present and future.

One thought on ““Digging” Deep: Data Mining, Science and digital tools fill in the gaps to the history of our past”

  1. Wow, you clearly have a much better understanding of data mining than I do. I agree that data mining is going to be a vital tool to historians today and more so in the future. Have you ever used any data mining website or program for research? It’s pretty wild. Great post!
