Data Mining Project: The “Thoreauian” Metaphor Revealed

Data mining surfaced a pattern I might not have seen otherwise. One of the words Thoreau uses most often is like. That seemed odd at first, and I thought the “stop word” function (which filters out unnecessary words) had “missed” it. With further research, though, I found that Thoreau’s refined way of describing landscapes was through metaphor. I looked at many instances of the word like, and once I set them in context I found that much of the usage served metaphor. Granted, one might have seen this by reading the text, but when the text is set in a unified perspective through data mining, the patterns emerge and the trends become much easier to see.
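Setting a word "in context" can be sketched in a few lines of Python. This is only an illustration of the general keyword-in-context idea, not the way Voyant Tools does it internally. The function name keyword_in_context, the window parameter, and the sample sentence are all made up for the example; the sample is not a quotation from Thoreau.

```python
import re

def keyword_in_context(text, keyword, window=4):
    """Return (left, keyword, right) for each occurrence of keyword,
    with up to `window` words of context on either side."""
    # Split into simple word tokens; punctuation is dropped.
    words = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, w in enumerate(words):
        if w.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, w, right))
    return hits

# Invented sample text (an assumption for illustration only).
sample = "The hills rose like waves. I like walking, and the fog lay like a blanket."

hits = keyword_in_context(sample, "like")
for left, kw, right in hits:
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>30} [{kw}] {right}")
```

Lining the hits up this way makes it easier to separate the comparisons ("rose like waves") from the plain verb ("I like walking"), which is the sorting I did by hand.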

Data Mining History & the Canadian Landscape

I will be posting my progress on a data mining project for my Historian’s Craft course. The data will be posted as the project progresses (so far it is preliminary). I am data mining Susanna Moodie’s Roughing It in the Bush (1852) and Henry David Thoreau’s An Excursion to Canada (1853). The purpose is to “not read,” or “distant read,” these two separate works of roughly the same time, place, and theme, in order to discover patterns between them. The platform I will be using is Voyant Tools. The new version is still in beta, but it is truly a great tool. For example, the word frequency tool has an option to take out unnecessary words such as “the” and “is”.
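For readers curious what a word frequency count with stop words removed looks like outside Voyant, here is a minimal Python sketch. It is an assumption-laden illustration rather than Voyant's actual method: the STOP_WORDS set is a deliberately tiny, hypothetical list (Voyant's own list is much longer), the function name word_frequencies is invented, and the sample sentence does not come from either book.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list for illustration only.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "it", "was"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase word tokens, skipping any found in stop_words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Invented sample text (an assumption, not a quotation).
sample = "The river is wide and the river is cold, like the sea."

freqs = word_frequencies(sample)
# Show the three most frequent remaining words with their counts.
print(freqs.most_common(3))
```

Note that a word such as "like" survives the filter unless it is on the stop-word list, so whether it shows up in the counts depends entirely on which list is used.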

I will post my data along the way, as well as my processes and final work. I predict some very interesting conclusions!

A list of links to the appropriate pages:

Word Frequency

Word Representation (frequency by size)

Raw Graph of Words

Case Study: Thoreau, Moodie, & Metaphors

The Grand Finale