Final Portfolio

Published on May 07, 2023
Part I (Assignment I)

The 19th Century Prison Reform Collection is a collection that consists of various papers that have been digitized that are related to 19th century prison reform. There are 449 items in the collection, and most of them are scraps of paper like bills, letters, and other random papers with little significance. However, there were some papers that gave some insight into a specific prison in New York.

Item number one in the collection (which I will probably focus on the most) is a pamphlet that was printed in 1826. This pamphlet essentially outlines how a 19th-century prison should be run and how this one was run at the time. It covers practical matters such as how much food and how much room each prisoner gets, but it also describes how prisoners were punished, with specific examples. The overall quality of the source is pretty good, although you can tell the pamphlet has been read repeatedly because of the discoloration and stains on some of the pages. On pages 16 and 17 you can see where stains have bled through, showing that the paper was very cheap and thin, and on some pages, like pages 30-40, wavy lines suggest that the source was bent while it was being read, creating lines that were picked up when it was digitized. Also, it is not hard to believe that this source came from 1826: it was most likely printed on a hand-operated printing press (typewriters were not yet in use), as there are some letters completely missing from words and smudges throughout the pamphlet, paired with consistent fading due to time.

This source was intended to train future prison wardens and guards, and that is evident from the handwriting scribbled on certain pages with notes for specific scenarios. While I was unable to find out who wrote these notes, it could logically be assumed that it was someone of high rank, potentially a warden or someone who had worked there long enough to understand how the prison operated. Given the time period, early 19th-century New York, it would be stereotypical to assume that a high-ranking prison operator would be a white male. Sadly, I could not find out more about who wrote and created many of these notes and documents, but it is interesting to read them from a racial perspective. The digitized documents specifically mention punishments for inmates, including whipping and other forms of punishment that would not be allowed today. After reading Data Feminism, I find it interesting that the race of some of these disciplined prisoners was left out. Realistically it makes sense that there would be racist wardens, but the omission still alters my interpretation of the sources because now I read and analyze them in many different ways.

Within this pamphlet is a report on Auburn Prison (a prison in New York), so it makes sense that this collection was published and digitized by Cornell University (a university in New York). One important question historians always have to ask is why something is being digitized. I’ll elaborate more in my reflection, but I’ve learned that digitizing properly is very expensive and time consuming. It takes a well-funded organization to pay for the workers and the equipment so that it is done right the first time. In Data Feminism, there is a section on Big Dick Data. As I understand it, the main idea of Big Dick Data is that companies will use manipulated data sets to ask for more money by overstating or understating what the data shows. The pamphlet contains numbers indicating how much money was allocated towards food, salaries, and building maintenance. Some of this data could have been digitized to make it easier for current prisons to ask for more money, by comparing how much funding federally run prisons receive today with what prisons of the past received. That is why it is crucial to ask why something is digitized.

When I searched for this source, I simply asked a fellow History major for a historical topic and he gave me prison reform, an oddly specific topic but one that led me directly to this collection. From there I clicked on the first collection that looked promising and looked in depth at the first item in it. The word “prison” appears multiple times within the source and “reform” occurs in the title, and I was searching for sources from before 1923 to find something in the public domain, so it makes sense that this source was presented to me.

These sources seem trustworthy, and I believe that all of these sources were digitized very well and can be taken at face value.

G. Powers (1789-1831). A Brief Account of the Construction, Management, and Discipline &c. &c. of the New-York State Prison at Auburn. 1826. Division of Rare and Manuscript Collections, Cornell University Library; Enos Thompson Throop Papers.

Part II (Assignment III)

The data I looked at today is about the Bechdel Test in film. The Bechdel Test is a sexism test that measures how women are represented in different types of film. In order to pass the Bechdel Test, a work of film must have two female characters talk to each other about something other than a man. This collection of data is great for evaluating movies based on how much money was spent, how the movies were rated, and whether or not they passed the Bechdel Test.

My first visual evaluation compares the average rating of movies that passed the Bechdel Test with the average rating of movies that failed it. I wasn’t sure what to expect, but the average rating for a movie that passed the Bechdel Test was 6.6175, and the average rating for a movie that did not pass was 6.8655. Compared to the rest of the data, these were oddly similar. In your comments you asked me to elaborate on why I thought the ratings would be dissimilar, and to explain what that means in context. Personally, I assumed that movies that passed the Bechdel Test would have a higher rating just because there might be more thought and social value incorporated into the movie, but there are multiple explanations as to why either group would have a higher rating. I’ve seen many historical war movies that focus on male soldiers and their struggles with war, and I always liked those movies (maybe because I’m a BA History major); however, there were not many women depicted in those films. Either way, I didn’t know exactly what to expect, but I think I was surprised when I first made this visualization and saw that movies that failed the Bechdel Test had a higher average rating. Side note: I am once again recognizing the importance of data visualization, because without this graph it would be very difficult to know which group is valued more by society (and by the companies that rate movies) and what that says about society.

Another way of evaluating this data was through the amount of money spent on these movies. The average budget for a movie that failed the Bechdel Test was approximately 16.6 million dollars more than a movie that passed the Bechdel Test. This is significant because in relation to the previous graph comparing ratings, movies with higher budgets that were underrepresenting women were rated very similarly to movies with lower budgets that were representing women better.
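For anyone who wants to reproduce these two comparisons outside of a spreadsheet, here is a minimal Python sketch of the calculation. The column names (bechdel, rating, budget) and the four sample rows are made up for illustration; they are not taken from the real dataset, and the numbers are not real results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real dataset's column names and values may differ.
movies = [
    {"title": "Movie A", "bechdel": "PASS", "rating": 6.0, "budget": 20_000_000},
    {"title": "Movie B", "bechdel": "PASS", "rating": 7.0, "budget": 30_000_000},
    {"title": "Movie C", "bechdel": "FAIL", "rating": 6.5, "budget": 40_000_000},
    {"title": "Movie D", "bechdel": "FAIL", "rating": 7.5, "budget": 50_000_000},
]

def average_by_result(rows, column):
    """Return {test result: mean of `column`} for the given rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["bechdel"]].append(row[column])
    return {result: mean(values) for result, values in groups.items()}

avg_rating = average_by_result(movies, "rating")
avg_budget = average_by_result(movies, "budget")

# Difference between the average budget of failing and passing movies.
budget_gap = avg_budget["FAIL"] - avg_budget["PASS"]

print(avg_rating)
print(budget_gap)
```

With the real file, the rows could be read with csv.DictReader instead of typed by hand, converting the rating and budget columns to numbers first.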

It’s also important when examining data to look at how trends have changed over the years. Although this next graph is admittedly difficult to read (you can click on the graph for a larger picture), it does show how many movies passed and failed the Bechdel Test each year. Starting in 1970, more movies failed the test than passed. This trend continued until about the mid-1990s, when the numbers of failed and passed movies began to level out. Then in the mid-2000s one can begin to see more movies pass the test than before. In some years, more movies even passed the test than failed it.
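The per-year counts behind a graph like this can also be tallied in a few lines of Python. This is only a sketch: the (year, result) pairs below are invented examples rather than rows from the actual dataset, and the PASS/FAIL labels are an assumption about how the results are coded.

```python
from collections import Counter

# Hypothetical (year, result) pairs; not real data.
records = [
    (1970, "FAIL"), (1970, "FAIL"), (1970, "PASS"),
    (1995, "FAIL"), (1995, "PASS"),
    (2008, "PASS"), (2008, "PASS"), (2008, "FAIL"),
]

# Counter maps each (year, result) pair to how often it occurs.
counts = Counter(records)

def yearly_summary(counts):
    """Return a sorted list of (year, number passed, number failed)."""
    years = sorted({year for year, _ in counts})
    return [(year, counts[(year, "PASS")], counts[(year, "FAIL")]) for year in years]

summary = yearly_summary(counts)

# Years in which more movies passed the test than failed it.
pass_majority_years = [year for year, passed, failed in summary if passed > failed]

for year, passed, failed in summary:
    print(year, "passed:", passed, "failed:", failed)
```

A table like this is easier to read than a crowded graph when the question is simply which years had more passes than fails.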

This slow shift is similar to the argument in Data Feminism, when the authors compare the sizes of men’s and women’s pockets from 17th-century Europe to now. Women were always given smaller purses and pockets for elite fashion reasons, and nothing has changed since then - not necessarily because all filmmakers (or pocket designers) are sexist, but because it is simply the way things were done before. This type of data can be easily overlooked and justified through the inclusion of female characters in movies, but it is one thing to include female characters, and another to make them relevant and independent in the story.

This dataset is very interesting. It includes all sorts of information on 1,795 popular movies released between 1970 and 2013, many of which I have seen, which makes it personally interesting beyond just the data. There is also data about what country each movie is from and what language it was originally produced in. I had originally wanted to create a network analysis with all this data to see if foreign films were more likely to pass the Bechdel Test because of different gender and social norms in other countries. Unfortunately, I was unable to figure out how to do that without manually counting all 1,795 movies and tallying which countries they were from and whether or not they passed the test. Either way, the potential is there, and it would be a great source for evaluating sexism between cultures because you could also look at the ratings for these same movies in addition to whether or not the movies pass the Bechdel Test.
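The manual counting described above is the kind of task a short script can take over. The sketch below is a plain tally by country, not a network analysis, and it rests on assumptions: the column names (country, bechdel), the PASS/FAIL labels, and the idea that a movie with several countries lists them separated by commas are all hypothetical, and the sample rows are invented.

```python
from collections import defaultdict

# Hypothetical rows; the real dataset may name and format these columns differently.
rows = [
    {"country": "USA", "bechdel": "PASS"},
    {"country": "USA", "bechdel": "FAIL"},
    {"country": "USA", "bechdel": "FAIL"},
    {"country": "France", "bechdel": "PASS"},
    {"country": "USA, UK", "bechdel": "PASS"},
]

def pass_rate_by_country(rows):
    """Return {country: share of that country's movies that passed}."""
    tally = defaultdict(lambda: {"PASS": 0, "FAIL": 0})
    for row in rows:
        # A co-production counts once for each country it lists.
        for country in row["country"].split(","):
            tally[country.strip()][row["bechdel"]] += 1
    return {
        country: t["PASS"] / (t["PASS"] + t["FAIL"])
        for country, t in tally.items()
    }

rates = pass_rate_by_country(rows)
for country, rate in sorted(rates.items()):
    print(country, round(rate, 2))
```

Counting a co-production once per country is a design choice; it keeps the code simple but means the per-country totals add up to more than the number of movies.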


Overall, I have learned a lot of things in this class. Looking back at the first thing we did in class I realize how little I knew about the digital humanities. My idea of “digitizing” was someone taking a picture of something with their iPhone and uploading it to some website. Now I know that:

  1. These processes have to be funded by an organization or through volunteer work.

  2. Digitization is an expensive process that takes a lot of time, effort, equipment, and money.

  3. Sometimes it is hard to digitize sources because they may be three dimensional or they may be in delicate condition.

  4. It is difficult to manage who has access to which source, and therefore which sources can be used by or loaned to different databases.

  5. It costs a lot more money than I thought to rent a database.

  6. Digital storage is not always available; it takes regular maintenance and is very expensive. It is also difficult to change hardware.

  7. Creating metadata is a complicated process that includes a lot more than just author and basic citation information, and this metadata takes up a lot of storage.

  8. More importantly, whoever completes all these processes is doing it for a reason and likely has some specific motive (personal research, corporate purposes, etc.).

Another significant thing I learned from this course is how difficult it is to record history online correctly, and after taking this class, I have a newfound respect for all digital humanists, especially those who work on recording current data as it happens. That work is so difficult, but it is also necessary for completely understanding what happened initially. This brings to mind both the SUCHO website and the chapter of Data Feminism about how crucial it is to record data exactly as it is initially. There are many past data sets that have inaccurate data because their definitions were incorrect. For instance, Facebook data from before 2014 would not include any genders besides male and female because no other option was offered. So just by looking at this data there would be no way of knowing that there were non-binary people using Facebook before 2014, which makes the data inaccurate. That is why it is so important to record data online swiftly, safely, and also accurately.

During our unit on digital visualizations, I learned how difficult it is to import data. I am very bad with computers, and this year was the first time I had ever downloaded something that wasn’t on a Chromebook into my Google Drive, so I was already behind coming into this class. However, I discovered a lot of appreciation for those who a) understand this process, and b) take the time to organize the data neatly. It is also crazy how difficult it is to organize and clean data. Even when I know what I am trying to do with my data, it is hard (for me personally) to turn it into something that is actually relevant and useful for research. All this being said, I do feel much more comfortable using tools such as QGIS and even tools as simple as Excel and Google Sheets. When I do research, I look at graphs and typically take them for granted as overall trends for time periods that can be vaguely analyzed. Now, however, I know that each data point was specifically entered and cleaned to make sure there were no mistakes.

Data is a very complex thing. Digital humanities are the future, and it is imperative that our past, present, and future history is preserved somehow in a way that is accurate, but also safe so nothing is corrupted through hardware changes. I think this class is very useful, and it should be required for all History majors. Not only did I learn a lot, but I enjoyed the things that I learned.
