
Data Visualization Blog. Praxis Visualization

My experience with Tableau Online was an interesting one. I had never used this type of application before, and I did not know that I could turn numbers into bars and graphs. When I first started, I wondered what type of information I could turn into a chart, and then I saw that we could explore the open datasets available online. I used the NYC Open Data website and searched for community gardens in my area. I was able to find a dataset that included a good deal of identifying information: https://data.cityofnewyork.us/Environment/ARCHIVED-NYC-Greenthumb-Community-Gardens/ajxm-kzmj

The data was in Excel format and included identifying fields such as the Borough in which the garden is located, Community Board, Council District, Garden Name, Address, and the size of the garden in acres. It had about 536 rows and 17 columns.

When I started working in Tableau Online, I was lost for some time trying to import this spreadsheet. After watching some YouTube videos, I was able to import the spreadsheet and create the bar graph below.

This bar chart shows how many community gardens there are in each borough. What I found amazing is that the tool can seamlessly redraw the same data as different types of graphs. It can also create a map visualization if one is needed.
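For readers who want to see the counting step behind a chart like this outside Tableau, here is a minimal Python sketch. It assumes the spreadsheet has been exported to CSV with a "Borough" column; the sample rows and garden names below are made up for illustration and are not from the real dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for the real spreadsheet export.
SAMPLE_CSV = """Borough,Garden Name,Size
Brooklyn,Garden A,0.10
Bronx,Garden B,0.25
Brooklyn,Garden C,0.05
Manhattan,Garden D,0.12
"""

def count_by_borough(csv_text, column="Borough"):
    """Return a Counter mapping each borough to its number of rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip() for row in reader)

def text_bar_chart(counts):
    """Render the counts as a plain-text bar chart, largest first."""
    lines = []
    for name, n in counts.most_common():
        lines.append(f"{name:<10} {'#' * n} {n}")
    return "\n".join(lines)

counts = count_by_borough(SAMPLE_CSV)
print(text_bar_chart(counts))
```

With the real file, the same function could be given the contents of the exported CSV instead of the sample string, provided the column name matches.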

As I was creating these charts, I couldn’t stop thinking about what Drucker says in her article, Humanities Approaches to Graphical Display. She describes how “data” is usually seen as a one-dimensional avenue of numbers and names. We often overlook the fact that datasets carry other, hidden messages that are not displayed when they are simplified into a bar chart. For example, when I uploaded this data, I wondered how many of these community gardens are famous, and whether any of them have a wonderful history that can’t be explained in a bar chart. Another thought that came to mind: how many fruits and vegetables were stolen today, and was that recorded?

Overall, my experience was fun, and the digital humanist in me sprouted ideas all over.

Final Project & Last Thoughts – Caitlin Cacciatore

Hi, class. I wish you all a Happy Solstice, a Good Yule, a belated Happy Hanukkah, a Merry Christmas, a Blessed Kwanzaa, and a very safe, healthy, and happy New Year.

It has been a pleasure to learn and grow with all of you.

For my final project, I decided to put my own spin on Urban Dictionary – specifically, an academic, humanistic, DH-centered spin. I proposed a Dictionary Academicus, an academic Urban Dictionary that is also a platform for debate, as well as a ledger of new concepts and terms in DH.

New academic terms, phrases, and concepts are continually being introduced to the literature and canon of Digital Humanities. Often, this influx of jargon can be overwhelming to newcomers to the field, and those in adjacent academic fields. The Dictionary Academicus can change this. In its capacity as an interactive glossary of terms, it has great potential for educating new students and creating consensus amongst the DH community. The Dictionary Academicus can also serve as a platform for debate amongst scholars of DH and related fields.

Caitlin Cacciatore

The Dictionary Academicus is intended to be scalable, and has the potential to expand into other humanistic fields and the other social sciences.

Just as a pearl begins with a single grain of sand, I see the Dictionary Academicus as a scalable website and platform for discourse. It has the potential to expand into other humanistic fields, such as anthropology, sociology, literary studies, history, and philosophy. The possibilities are limited only by the capacity of the human imagination and ability to record, examine, and analyze the facets of life and of culture. In a similar vein to how the Urban Dictionary made common vernacular accessible and immediate to its audience, so too can Dictionary Academicus accelerate the rate at which new concepts, ideas, and turns of phrase trickle into the academic lexicon we share as scholars of the Digital Humanities and adjacent fields.

Caitlin Cacciatore

Ultimately, my project was inspired by the reading we did this semester, and the new terms, concepts, ideas, and turns of phrase I found myself grappling with. At some point in the semester, Professor Gold suggested during a class session that we keep our own Excel document of the ideas we’ve encountered throughout the semester, and that we come across in graduate school. His words stuck with me, and while I was ruminating on the expansion of the English language, the idea of the Dictionary Academicus came to me.

I immensely enjoyed the process of fleshing out the idea of an academic Urban Dictionary, from the original seed of the idea to the seedling it has grown into.

I look forward to working with you all next semester. May your season be merry and bright.

My Thoughts About the Class and Final Project–by Lu

This semester I learned a lot about the analysis and applications of Digital Humanities. Every class session, debate, and reading helped me understand the concepts and different approaches to Digital Humanities.

For my last project, the final proposal assignment, I decided to expand the topic that I chose for the mapping praxis assignment, “Haunted Places in New York.”
At first, I wanted to cover all the haunted places in New York State. However, I had to narrow my list, and I decided to include only the top five most popular haunted places in New York City based on TripAdvisor ratings and reviews.

I really like the topic of my proposal paper because I believe that haunted places in New York City are an important subject to analyze in order to promote paranormal tourism. My final proposal project was therefore to create a website that uses digital humanities tools to promote paranormal tourism in New York City. First, my project aims to create an interactive StoryMap showing the top five paranormal locations in New York City. Second, it involves developing a section that describes the history of these places and their link to the paranormal phenomena observed in each of them. In addition, my project includes a gallery of images of the featured places, along with links to their official websites, to satisfy the curiosity of readers. Finally, I decided to incorporate a section with audio narrative, in the form of podcasts, that explains the historical context of these locations.

The idea for this project was born after I completed my mapping praxis assignment, when I noticed the absence of a comprehensive paranormal tourism guide to places offering historic and enchanting experiences in New York State. In fact, New York State’s official tourism site, I Love NY, does not have a section that provides information about paranormal tourism or haunted places in New York. I really hope to expand my project and add more haunted places to the list I created.

I feel really motivated to continue creating digital maps, analyzing texts, and incorporating visual tools in my academic research. Thanks to the GC Digital Initiatives, I had the opportunity to attend different workshops this semester. I learned about Python and Microsoft Excel analysis, and I was introduced to GIS. I feel grateful that I will be able to apply what I learned in these workshops to any project or academic field.
Thank you so much Professor Gold, and everybody that was part of this class. It was a pleasure to be part of this wonderful group. Happy Holidays!

Blog Post #6 – Text Mining Obsessive Compulsive Disorder

When reading about text mining, one of the first issues that stood out to me was that text mining for DH seems to be a method that allows us to reinforce our own biases, and is not always scientifically objective. It allows us to make connections and correlations with data we extract from texts; however, the methods used to extract the data are not always replicable. I’m mainly thinking about the method used by Matthew Jockers, stylometry, which is “the statistical analysis of variation between one writer or genre and another”. He used it to analyze variations between ‘male’ and ‘female’ writers of novels; however, identifying feminine and masculine styles of writing became increasingly problematic. The issue with using a dataset spanning over 200 years is that gender associations change over time: since gender is a social construct, it changes as society changes. The result is that Jockers confirmed not an objective reality, but the premise of his own study.

One example of data mining I’ve looked at is an analysis of the show Seinfeld. In an effort to understand the popularity of a show that is specifically about upper-middle-class people in New York City in the 1990s, and how it has connected with so many people across time, the analysts mined the scripts to look for patterns over 9 seasons. Ultimately, I found the findings less interesting than I expected. It’s hard to find nuance and complexity in the script of a show that is technically about ‘nothing’ and the banalities of everyday life. Link

It’s difficult picking a topic for text mining, knowing that I might end up reinforcing my own biases through a set of texts. One of my attempts at text analysis used a single article I found in JSTOR while researching the artist Balthus, who was the subject of my senior thesis at Pratt. After learning that one of his paintings, Thérèse Dreaming (1938), stayed on view at New York’s Metropolitan Museum of Art after Mia Merrill created a petition to convince the museum to remove the painting from gallery 907, or at least provide ‘more context’ (Link), I was curious to learn how some scholars interpreted Balthus’s body of work. In her petition, she says “I was shocked to see a painting that depicts a young girl in a sexually suggestive pose… It is disturbing that the Met would proudly display such an image… The Met is, perhaps unintentionally, supporting voyeurism and the objectification of children.” (Link) I decided to use just one text (‘The Virtues and Dangers of Connecting Art to Life: can Pragmatism Address Balthus?’ by Mary Magada-Ward) and see what came up as a result.

The Virtues and Dangers of Connecting Art to Life: Can Pragmatism Address Balthus? Mary Magada-Ward

Words I had to manually block out were: art, balthus, balthus’s, 05, 25.1_03_magada, ward.indd, 146.96.128.36, downloaded, university, fi, jsp, 2021, 02, about.jstor.org, https.
I wasn’t exactly sure what to make of the results. Words like girls, aesthetic, young, experience, philosophy, ethical, and speculative all made sense when describing Balthus’s overall work, but I didn’t really understand the point of text mining using this article.
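The manual blocking step described above can be sketched in a few lines of Python. This is only an illustration of the general idea (count words, skip a block list); the sample sentence and the short block list are invented for the example and are not the article's text or the full list used in the tool.

```python
import re
from collections import Counter

def top_words(text, blocked, n=5):
    """Return the n most frequent words in text, skipping blocked words."""
    # Letters only, optionally followed by an apostrophe and more letters,
    # so that a possessive like "balthus's" stays one token.
    words = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)?", text.lower())
    kept = [w for w in words if w not in blocked]
    return Counter(kept).most_common(n)

# Hypothetical sample text and block list, for illustration only.
sample = ("Balthus's art shows young girls. Art and aesthetic experience: "
          "Balthus, young girls, philosophy.")
blocked = {"art", "balthus", "balthus's", "and"}

print(top_words(sample, blocked, n=2))
```

Words that dominate only because of the article's title, author, or download stamp can be added to the block list in the same way.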

Thérèse Dreaming, 1938

My next attempt was to use the first 5 articles that came up when Googling “Obsessive Compulsive Disorder”. The idea mainly came to be when I realized most of my knowledge about mental health came from online resources using the Google search engine, through mental health blogs and articles, and less from mental health professionals directly. This only became problematic to me when I started to question who were the authors of such blogs. Are self-help gurus a credible source when dealing with mental health? Are they representing a correct ‘ideal’ and ‘successful’ person who handles their mental health responsibly? And am I achieving that goal? An excerpt from The Culture of Narcissism (1979) by cultural historian Christopher Lasch explains it better:

“The mass media, with their cult of celebrity and their attempts to surround it with glamour and excitement, have made Americans a nation of fans, moviegoers. The media give substance to and thus intensify narcissistic dreams of fame and glory, encourage the common man to identify himself with the stars and to hate the ‘herd’, and make it more and more difficult for him to accept the banality of everyday existence.

To the performing self, the only reality is the identity he can construct out of materials furnished by advertising and mass culture, themes of popular film and fiction, and fragments torn from a vast range of cultural traditions, all of them equally contemporaneous to the contemporary mind.

In order to polish and perfect the part he has devised for himself, the new Narcissus gazes at his own reflection, not so much in admiration as in unremitting search of flaws, signs of fatigue, decay. Life becomes a work of art, while ‘the first art work in an artist’, in Norman Mailer’s pronouncement, ‘is the shaping of his own personality.'”

– Christopher Lasch, The Culture of Narcissism (1979)

How is the media portraying the ‘ideal’ mentally healthy person? Am I constructing an identity furnished by mass culture, through themes of popular film and fiction? One example of how mental health classification can affect culture is the DSM-II, first published in 1968, which listed homosexuality as a mental disorder. The World Health Organization only removed homosexuality from its ICD classification with the publication of ICD-10 in 1992. Had it not been removed from the DSM, how would those who don’t identify as heterosexual deal with their perceived ‘mental illness’?

First 5 Google results from ‘obsessive compulsive disorder’ search
Most used words from first 5 google search results

Even after this search I wasn’t sure what the results meant. I think a lot of what I wanted to explore shouldn’t use the text mining method. Text mining is probably best used to categorize documents, or trace history of particular features (words or phrases) over time. I’m also hesitant about results from text mining, as I don’t want to repeat what Jockers did, which was confirm a premise of his own studies rather than an objective reality.

Text Mining “120 Days of Sodom”

Marquis de Sade’s 120 Days of Sodom describes the day-to-day depredations of four libertines who essentially quarantine in a remote castle with several men, women, boys, and girls. The text, never finished, is repulsive and hard to read, but it also says important things about power, fascism, and sexuality. The disturbing content of the narrative makes it difficult to analyze objectively or statistically, so I thought that mining the text might reveal patterns, trends, and structure that are difficult to discern in a text that affectively troubles the reader in its abjection.

I chose Voyant because it offers all the modes of analysis I wanted, and I used a pdf of the text digitized by Supervert 32C. Sade’s use of language is repetitive and even mechanical (although the vocabulary density is .067). I looked at which terms occur most frequently using the Cirrus tool. I added several trivial terms to the stopword list in order to focus on those that reveal the thematic and bodily fixations of the text. I was surprised to find that “little” is the most recurrent word. This fact demonstrates the extent to which hebephilia and ephebophilia are the dominant themes of the text, or the dominant perversions practiced by the libertines. We can see that the taboo the text most transgresses is that of the abuse of pubescent children.
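As far as I understand it, the vocabulary density figure is the number of unique word forms divided by the total number of words, so a low value means a lot of repetition. A minimal Python sketch of that ratio, using a made-up sentence rather than the actual text, and with a tokenizer that may not match Voyant's exactly:

```python
import re

def vocabulary_density(text):
    """Unique word forms divided by total words (0.0 for empty text)."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)

# Hypothetical sample: 7 words, 5 of them distinct.
sample = "the little room and the little door"
print(round(vocabulary_density(sample), 3))
```

A long text that keeps returning to the same small vocabulary will push this ratio toward zero.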

And, since the text is so concerned with the body, I was interested in which parts it most emphasizes. I found that the text most emphasizes “mouth,” then “ass,” then “prick.” The text’s anal and oral fixations may relate to its obsession with children. We can also see that the text’s eroticism is most dependent on non-genital organs. I would be interested in further analysis of the “body” of the text, such as a visualization of a homunculus whose body parts are sized in proportion to their appearance in the text.

Using the terms tool, I identified how often each major character appears. From greatest to least, the order is: the Duc, the aristocrat; Curval, the judge; Duclos, prostitute/story-teller; Durcet, the banker; and lastly, the Bishop. This order suggests a hierarchy among the four libertines and their corresponding social institutions, and it’s notable that Duclos, the chief madam, appears more often than two of the libertines.

The trends graph of the five major characters shows that they all follow fairly regular sine-wave oscillations, although the Duc and Curval occur more frequently in the last sections, while the Bishop and Durcet occur less frequently. Duclos follows the most regular oscillations, which reflects her role in the structure of the novel: she opens each day with an erotic story, then recedes into the background as the libertines reenact it, until the cycle repeats the next day.
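A trends graph of this kind can be approximated by cutting the text into equal segments and counting a term in each one. The sketch below is a simplified stand-in, not Voyant's implementation: it reports raw counts per segment, and the sample string is invented for the example.

```python
import re

def term_trend(text, term, segments=10):
    """Count occurrences of term in each of roughly `segments` equal chunks."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Ceiling division so every word lands in some chunk.
    size = max(1, -(-len(words) // segments))
    counts = []
    for start in range(0, len(words), size):
        chunk = words[start:start + size]
        counts.append(chunk.count(term.lower()))
    return counts

# Hypothetical sample: 8 words split into 4 chunks of 2 words each.
sample = "duclos speaks duc acts duclos speaks duc duc"
print(term_trend(sample, "duclos", segments=4))
print(term_trend(sample, "duc", segments=4))
```

Because whole tokens are compared, "duc" is not counted inside "duclos". A regular rise and fall in the resulting list would correspond to the oscillation visible in the graph.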

There are many other directions a mining analysis of this text could go (an analysis of the distribution of abstract concepts, seeing how gender correlates with age, comparing individual days).

“The Fetishisation of Excellence,”

Felicity Howlett

The authors portray “excellence” as “the gold standard of the University world.” They suggest that the overwhelming, overweening focus on the term (whether applied to academicians, to output, to guiding principles, institutions, or networks) has turned it into a “pernicious and dangerous rhetoric that undermines the very foundations of good research and scholarship” (1). How did this come to be? A trope that I remember from graduate studies and following is the threat/command “Publish or Perish.” At least “excellence” seems to suggest a positive direction. Abused now, the word may have begun its life in academic circles as an intentionally positive and uplifting principle. I have not attempted to study how the historical academic use of the word has changed or how it has morphed into having negative connotations implying a specific path to accomplishment.

While I have no quarrel with the authors’ argument nor with their reasons for making it, I found the article very difficult to digest. Part of the reason was the different voices that appeared within the discussion. These ranged from the first and only footnote on page 1, a glib/sarcastic/cute account of how the authors’ names had been ordered in the title information, to page 21 and the suggestion that, as it pertains to critics of disciplinary expansion, “such criticism is always more Canutian than effective.” Having skirted through a reference to Hippocrates in the preceding paragraph, I got stopped short by Canute and had to go to Google (see https://en.wikipedia.org/wiki/King_Canute_and_the_tide). All this creates some wobbly stones along the path of a discussion of (mis)applications of the word excellence.

Most of the first seventeen pages comprise a complaint about the use and abuse of the term in academic settings. Only at the halfway point are we told that the problem is not with the word “excellence,” but with its improper application: “. . . we diagnose the problem not as the pursuit of quality per se, but as the concentration of attention (and resources) on the intense competition to make it into the top few percent—it is not ‘excellence’ or its pursuit that is the problem; it is the concentration only on only the excellent” (18).

If that had been clarified on page one, a good amount of paper might have been saved. Later, concern focuses on how the field of Digital Humanities is affected by (the oppressiveness of) the wholesale embrace of “excellence.” Up to this point, the concerns seemed more appropriately focused on the conservative tendencies of the academy in general (and their negative effect on many areas).

Near the conclusion, the authors suggest that a rhetoric built around ‘soundness’ offers opportunities: “Soundness appears to be a plausible basis on which to build a new narrative, or rather to combine existing threads into a more consistent rhetorical framework” (29). Despite all the good reasons offered, it is surprising that, after all they have been through, the authors would suggest an alternative, substitute word for “excellence.” “Excellence” has been abused, misinterpreted, and forced into false representations. Once, it must have been a worthy word—its antonym is inferiority or mediocrity. How can we possibly imagine that the same fate would not befall the word “soundness?”

Throughout the article, I felt that the authors needed to liberate themselves from the conventional academic constraints they were arguing against, and that their arguments were hampered by the same context that caused their objections. Substituting “soundness” for “excellence” does not offer an alternative solution. It just offers an alternative noun. Perhaps the way the word “excellence” is scattered throughout academic life is more a symptom than a cause.

“Excellence” should not be a term used to protect older areas of the academy from its upstarts. Yet it seems that there will always be greater weight given to the most established areas than to the new. What is exciting is how quickly newer areas gain representation and become part of the foundation! We bear witness to this in the increasing role of the field of Digital Humanities and its contributions.

 Moore, Samuel; Neylon, Cameron; Eve, Martin Paul; O’Donnell, Daniel; Pattinson, Damian (2016): Excellence R Us: University Research and the Fetishisation of Excellence. figshare. Journal contribution.  https://doi.org/10.6084/m9.figshare.3413821.v1

Postscript to “Visualizing Trends in Language Usage: Important(ly),” Felicity Howlett

A couple of thoughts in retrospect about my experience with the visualization and/or text mining praxis assignment.

1) As I searched for ways to mine text for the terms “important” and “importantly” used in an adverbial context with and without modifiers such as “more” and “most,” my first success was at the New York Public Library’s digital newspaper archives. However, most of the archives I discovered did not extend into the 21st century, and many were unique to a specific newspaper. The Free Google Newspaper Archives presented similar obstacles. I was looking for a fat collection of diverse USA and English newspapers or literary texts that would demonstrate how words were used over a specific period, in this case, 1969-2021. Searching for a literature dataset, I soon understood that The Project Gutenberg Collection, which has been an invaluable collection for so much research, was not appropriate for this project given its concentration of pre-21st century literature. I found the Google Ngram Viewer that mined language up to 2012, and soon after that, the Google update, extending to 2019. Apart from the Google collection, I found no source that permitted a similar search for my specific dates. (Although I looked at various public information sites, the use of language in these sites did not seem to be representative of traditional, habitual language use.) Shortly after submitting my assignment, I met with Filippa Cordero to ask her, among other things, why it had been so difficult to locate an appropriate body of text. She explained that locating a data site (or creating one) can be among the more challenging aspects of the text mining experience.

2) I used Ngram to visualize text. It seemed too easy. Voyant had been recommended. When I cruised the site, it seemed that the visualizations were more complicated than the linear graph I had made. However, once I took my time going through Voyant, I saw that the “Trends” illustration in the visualization examples exactly matched what I was achieving through Ngram. The major obstacle was that I could not find a data site that would provide me with the collected information to reproduce in Voyant what I could achieve through Ngram (with its store of Google material).
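For anyone who does locate a dated body of text, the kind of line an Ngram graph draws can be approximated by computing, for each year, the share of word sequences that match the phrase. The sketch below is a rough illustration under that assumption, not Google's method; the two "years" of text are invented for the example.

```python
import re

def yearly_relative_frequency(texts_by_year, phrase):
    """For each year, return matching n-grams divided by all n-grams."""
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    result = {}
    for year, text in sorted(texts_by_year.items()):
        words = re.findall(r"[^\W\d_]+", text.lower())
        total = len(words) - n + 1  # number of n-grams in this year's text
        if total <= 0:
            result[year] = 0.0
            continue
        hits = sum(1 for i in range(total) if words[i:i + n] == phrase_words)
        result[year] = hits / total
    return result

# Hypothetical mini-corpus keyed by year, for illustration only.
corpus = {
    1969: "more important it was late",
    2019: "more importantly it was late more importantly",
}
freqs = yearly_relative_frequency(corpus, "more importantly")
print(freqs)
```

Plotting these values by year would give a line comparable in shape to the Ngram or Voyant "Trends" view, provided the corpus for each year is large enough to be meaningful.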

3) I haven’t had any luck seeking a data source for my feelings about this experience, but I did find a description in the first few seconds of an old tune: “It’s Easy When You Know How.” Clive Burke, 1948.

Final Project – Broken English: an ethnographic study of the challenges ESL schools in NYC faced during the Coronavirus pandemic in 2020

What is my project about?

Broken English is a web-based platform that presents ethnographic research on the lives of ESL (English as a Second Language) students and teachers in English schools in New York City, emphasizing the challenges they faced during the Coronavirus pandemic in 2020. This research will be presented to users through a narrative developed in two parts:

Part 1: ESL school challenges during the Pandemic

Users will start by learning about English as a Second Language (ESL) programs and the challenges that their communities (students, teachers, and staff) faced during the Pandemic. Texts, images, maps, and short stories in the form of videos will be presented to give users a deep understanding of the problem through different lenses.

Part 2: ESL reimagined

Based on suggestions from the participants in the ethnographic research, users will have access to a toolkit with different methods and activities that can be used in online, in-person, and hybrid ESL classes.

___

How do I plan to do it?

I intend to develop this platform in two phases:

Phase 1: Pilot research, prototype, and testing

Estimated time: 2 months

I will start this work by conducting pilot research, interviewing between 8 and 12 people, including students and teachers of two ESL schools based in NYC: New York Language Center and ZONI, both institutions that I have easy access to.

After that, I will create a prototype and present it to the participants to collect feedback. I should also do the work by myself to avoid unnecessary expenses.

The prototype should use two web applications: Story Maps, which is a free platform to present interactive and multimodal stories through the use of maps, videos, and other sorts of media; and Kumu, also a free application that helps you create system visualizations.

Phase 2: Complete research and platform evolution

Estimated time: 4 months

Based on what is learned in Phase 1, Phase 2 is the process of evolving the platform, including more research with the communities of more English schools in NYC. Updates to the application and constant testing should continue throughout the process. This phase should also involve the work of a dedicated team composed of myself and three other specialists in the fields of Digital Humanities, Design, and Education. I may use the same web applications used in Phase 1.

___

Why did I choose this idea?

I have three reasons why I want to develop this project:

1 – As I was an international English student in an ESL school in NYC, I can use my personal experience and network to recruit participants for my research;

2 – It gives me the possibility of analyzing the problems and opportunities of the phenomenon through the lens of Educommunication, which I studied during my first Master’s, in Communication and Education, at the University of São Paulo. For those unfamiliar with the term, Educommunication is a field of study and practice concerned with planning, implementing, and evaluating processes, programs, and products that create and strengthen communicative ecosystems in educational contexts. It is grounded in the Latin American theoretical currents of liberating pedagogy, popular communication, and cultural studies.

3 – I want to help ESL communities rethink their practices by providing visibility for the challenges they faced during the Pandemic and accessible resources for successful educational experiences.

Blog #6: Twenty life stories- a text analysis

Background

In 2012, along with DH software developers Alejandro Peña and Francisco Onielfa, I started to work on a digital oral history archive that gathers, preserves, and provides access to the testimonies of Spanish women who became adults and mothers during the Francoist dictatorship (1939-1975). Their daughters, who came of age during the Spanish transition to democracy and its subsequent democratic governments, interview them about their recollections of the pre-democracy years and the socio-cultural differences they perceive between the two generations. The interviews are recorded on video. The archive, Mothers and Daughters of the Spanish Transition to Democracy, has collected 51 interviews to date, and we continue expanding it.

Corpus

For this praxis assignment, I have used the first 20 interviews of our oral history archive.

Brief description of the sources:

  • The participating mothers were born between 1921 and 1942.
  • Their daughters were born between 1944 and 1977.
  • The interviews were conducted between January and June of 2012.
  • The interviews followed a semi-structured, open-ended format.
  • The average interview length is 92 minutes.
  • The 20 interviews have been combined in one document for a total of 246,659 words.

Tools

All the interviews in the archive are processed with Dédalo. After being transcribed, they already undergo a form of text analysis: we “index” them by linking different interview segments to the thesaurus descriptors that we have created for this specific project.

For this praxis activity, however, I have not used our thesaurus descriptors, and I have worked with the unindexed texts.

After a superficial exploration of Voyant, the tool that I decided to use, I was under the impression that it did not have multilingual functionality, so I consulted with Filipa Calado, our Digital Fellow, who used Python to clean my text.

Here’s the code that Filipa wrote to eliminate words that were irrelevant for my analysis:
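The original script is not reproduced in this post. As a stand-in, the following is a minimal Python sketch of this kind of cleaning, not Filipa's actual code: the tiny stopword set, the function name, and the sample sentence are all assumptions made for illustration.

```python
import re

# Hypothetical, very small Spanish stopword set; a real list would be longer.
STOPWORDS = {"el", "la", "de", "que", "y", "en", "pues", "eh"}

def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into words, and drop the stopwords."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [w for w in words if w not in stopwords]

# Invented sample sentence, not taken from the interviews.
sample = "Pues yo creo que la madre de mi madre..."
cleaned = remove_stopwords(sample)
print(cleaned)
print(len(cleaned), "words remain")
```

Counting the words before and after filtering is what produces a drop in corpus size like the one reported below for the full set of interviews.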

I actually made Filipa go through a good deal of unnecessary work because, upon a more thorough investigation of Voyant, I found that the tool is, indeed!, multilingual, and that it provides interesting options to users, such as the possibility to edit the stopword list, which I took advantage of.

After I applied the new stopword list to my text, the count went down to 165,215 words.

Process

Making decisions about which words should stay or leave was not easy. For example, after applying my first modified list to the text, the analysis showed that the most frequent word was “no.” I wondered: Was the presence and frequency of this adverb saying something about the project participants’ experience of repression under the dictatorship? I began playing with the stopword list to try different scenarios, and decided that the analysis was richer when “no” was absent.

Similar questions arose with words such as “bueno,” which in Spanish can be used as a filler that marks a moment of reflection or hesitation (“Well…”) or as the adjective “good,” in opposition to “bad.” Eliminating all the “bueno” words might hide important information. I began to see how digital text analysis needs a good amount of linguistic tweaking in order to guide interpretation in a reliable way.  

After playing with the stopword list for some time, I decided to keep this Cirrus visualization for the time being:

https://voyant-tools.org/?corpus=35b02a36f7be1d7db83bf3775994e054

“Yo” (“I” in English; 1348 occurrences) and “madre” (“mother”; 1148 occurrences) are the highest frequency words. One could formulate some preliminary interpretations based on this data. For instance, subject pronouns are generally implied in Spanish. Speakers do not need to insert the subject pronoun in every sentence because verb conjugations already indicate who or what the subject of the sentence is. The excessive presence of subject pronouns is redundant, unless it is used for clarification or reinforcement. Thus, the fact that “yo” is the most frequent word in the interviews might denote self-assertiveness: the mothers are asserting themselves as the protagonists of the interviews. If confirmed, this would be a positive outcome, as many of them expressed fears and insecurities before participating in the project. They often said that their lives were “normal and uninteresting,” and that they didn’t think they had anything to share with the larger public.

The Links tool of Voyant shows the occurrence of “yo” in connection with “creo” (“I think/believe”; 198 simultaneous occurrences) and “sé” (“I know”; 127 simultaneous occurrences), which would support the idea of the interview as a space for self-definition and self-determination.

Links tool: https://voyant-tools.org/?corpus=35b02a36f7be1d7db83bf3775994e054

However, because I had eliminated the word “no” from the analysis, I do not know whether the verbs “creo” and “sé” might have been used, at least in some instances, in the context of negative statements, as in “I don’t know.” An analysis of both scenarios should be made before arriving at conclusions.
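One rough way to check this outside Voyant would be to count how often each verb is directly preceded by “no.” The sketch below is my own simplification, not the Links tool's method. The function name and sample text are hypothetical, and it only catches an immediately preceding “no,” so a phrase like “no lo sé” would be missed.

```python
import re

def negated_share(text, verbs, negator="no"):
    """For each verb, count total uses and uses directly preceded by the negator."""
    tokens = re.findall(r"\w+", text.lower())
    counts = {v: {"total": 0, "negated": 0} for v in verbs}
    for i, tok in enumerate(tokens):
        if tok in counts:
            counts[tok]["total"] += 1
            # Only the immediately preceding token is checked (a simplification).
            if i > 0 and tokens[i - 1] == negator:
                counts[tok]["negated"] += 1
    return counts

# Hypothetical sample, not taken from the interviews.
sample = "Yo creo que sí. Yo no sé. No sé, yo creo que no. Yo sé que era así."
result = negated_share(sample, ["creo", "sé"])
print(result)
```

If a large share of the “sé” occurrences turned out to be negated, the “self-assertion” reading would need to be revised.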

There is another caveat to the “assertiveness” interpretation: the corpus contains the daughters’ questions too, and, in all probability, they have used “yo”. This distortion could be easily avoided by eliminating the daughters’ questions from the corpus before uploading it to Voyant.
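As a sketch of how that clean-up might look, suppose each turn in a transcript sits on one line and begins with a speaker label such as “HIJA:” or “MADRE:”. That format is an assumption for illustration, not the project's actual transcription scheme, and the function name is invented.

```python
def keep_speaker(transcript, speaker):
    """Return only the turns spoken by the given speaker, with labels removed.

    Assumes one turn per line, each starting with a label like "MADRE:".
    """
    prefix = speaker.upper() + ":"
    kept = []
    for line in transcript.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith(prefix):
            kept.append(stripped[len(prefix):].strip())
    return "\n".join(kept)

# Hypothetical transcript fragment.
transcript = (
    "HIJA: ¿Y yo qué hacía?\n"
    "MADRE: Yo trabajaba en casa.\n"
    "HIJA: ¿Y después?\n"
    "MADRE: Después vino la guerra."
)
mothers_only = keep_speaker(transcript, "madre")
print(mothers_only)
```

The filtered text could then be uploaded to Voyant as its own corpus. Turns that run over several lines would need a slightly more careful parser.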

The same caveat applies to my entire text analysis, which focuses on the mothers but has, nevertheless, included the daughters’ questions in the corpus to be analyzed. However, considering that the mothers’ narrative is a lot more extensive than their daughters’ interventions, my improvised interpretations might not be completely invalid.

The high presence of the word “mother” is intriguing. You might say that it is not a surprise: after all, the project is “all about mothers” (wink to Almodóvar). But are they speaking about their own maternal role, or are they referring to their mothers? I am inclined to think that they are speaking about their own mothers, which would show the presence of a matrilineal focus in the interviews.

The Links tool (see above) did not provide me with information about the presence of the mothers’ mothers in the interview, but did reveal that the term “mother” is collocated in the environment of “father,” which might indicate that the interviewee is, indeed, speaking about her parents when the term “mother” appears. The Links visualization also shows the terms “daughter” (“hija”) and “granddaughter” (“nieta”) in connection to “mother,” which could support the hypothesis of the matrilineal angle. Again, one could say that the project itself is matrilineal by design, but the interviewers and interviewees were not asked to focus on the grandmother-daughter-granddaughter line. If anything, the semi-structured interview-guide includes questions about family and children in general.

Returning to Voyant’s word-list options, I’d like to highlight the “White List” function (I wish they had called it something less racialized), which allows users to observe the behavior of terms of interest to them. In order to use the “White List” options, it is important to set the “Stopword” list to “None”:

I chose to look at the historical and political terms of the periods that the interviews cover: republic, war (“guerra”), dictatorship, democracy. I also inserted some terms frequently associated with them: repression, Church (“Iglesia”), sin (“pecado”), freedom (“libertad”), free (“libre”).

https://voyant-tools.org/?corpus=35b02a36f7be1d7db83bf3775994e054&stopList=&whiteList=keywords-10f35e54b6e5df854daf51cf4a366421&view=Cirrus
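The logic of a white list is the reverse of a stopword list: instead of naming what to ignore, you name the only terms to count. A minimal sketch of that idea in Python follows. It does not reproduce Voyant's implementation, and the function name and sample sentence are hypothetical.

```python
import re
from collections import Counter

def whitelist_frequencies(text, whitelist):
    """Count only the listed terms; terms that never occur are reported as 0."""
    tokens = re.findall(r"\w+", text.lower())
    wanted = {w.lower() for w in whitelist}
    counts = Counter(t for t in tokens if t in wanted)
    return {term: counts[term] for term in sorted(wanted)}

# Hypothetical sample, not taken from the interviews.
sample = (
    "La guerra fue dura. Después de la guerra, la Iglesia decía "
    "que todo era pecado. No había libertad."
)
terms = ["guerra", "dictadura", "Iglesia", "pecado", "libertad"]
freqs = whitelist_frequencies(sample, terms)
print(freqs)
```

Reporting zero counts explicitly is a deliberate choice here: for a memory project, the terms that are absent can matter as much as the ones that are frequent.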

“War” is the highest frequency term, which shows its robust presence in the collective memory of the project participants, a stronger presence than that of the 40-year dictatorship. A possible interpretation is that the questionable and imperfect nature of Spain’s democratic transition has failed to facilitate an unambiguous condemnation of the dictatorship, which might lead the participants to address the term indirectly, use euphemisms or avoid it altogether. By contrast, the “horrors-of-the-civil-war-narrative” does not carry any ambivalence in Spanish collective memory, which might account for the strong presence of the term “guerra.” Of interest, too, is that the “república,” the democratic period immediately preceding the war, has a minimal presence, which might corroborate the failure of the Spanish democratic transition to rehabilitate the memory of its pre-war democratic precedent: the much-demonized, very progressive, and short-lived Spanish Second Republic (1931-1936).

There are many other interesting observations based on this quick analysis. For example, the participants might have codified the term “repression” as “sin” and “Church,” judging from the disparate presence of those three terms in the Cirrus visualization. The terms “libertad” and “libre” are more frequent than “dictadura,” and about as frequent as “democracia,” perhaps signaling a more defined and stable presence in the collective memory of the participants. 

Possibilities

Voyant is a versatile tool that offers multiple possibilities for a project like mine. I could, for instance, separate the interviews to compare age and term frequency; I could analyze daughters and mothers separately; I could compare my interviews to other memory projects covering the same period; etc.

It is important to note, though, that variable control and a careful design of the analysis are necessary steps if we are to rely on Voyant’s data. For instance, we must be sure of the accuracy and homogeneity of the interview transcriptions (e.g., you cannot compare terms referring to time if dates have not been transcribed homogeneously). The stopword list is also of paramount importance because it has a direct impact on the type of information the analysis will yield. Additionally, in a project like the one I am working with, the data collection process must be taken into account as well: project design, interview format and questions, participants’ profiles, how interviews have been processed, etc.

Digital Textuality and Education

What Ted Underwood and Michael Witmore both call us to consider are the ways in which the practice of distant reading is continuous with practices that long precede the introduction of computational tools into the academy. Underwood is concerned with a very specific social-scientific way of reading literary texts that dates back to the sociologically and anthropologically-influenced projects of Raymond Williams and Janice Radway. This is an interesting and seemingly important historical tracing that can help distant reading position itself in the humanities. However, Witmore is concerned with a much broader genealogy that addresses some more general philosophical issues that I believe the digital humanities must grapple with if it is to become effective as a discipline. He gives us an epistemological framework, but it is one that I argue must be thought in relation to different organological scales (to use his own word), particularly that of the neuroscientific and the educational.

Witmore shows us that distant reading allows us to think something much deeper and overarching about the act of reading (and the technical constraints that have always conditioned it). It allows us to think distant reading as simultaneously continuous and discontinuous with hermeneutics as it preceded digitalization. He points to the fact that reading, as an addressing that can take place at many different levels, is always a historically and disciplinarily contingent mode of attention wherein a discipline, over time, determines what its object of attention is. This object of attention exists (ideally) across different scales. Witmore shows us that text can be addressed at the level of the book, genre, words, lemmatizations, and I might argue there are an infinite number of other ways it could be addressed. He also emphasizes the power of reading to read different texts and different scales together. In this way, different scales of text form the material basis for a mode of attention (a reading as a multiplicity of addresses) whereby meaning is produced. A mode of attention is what he calls the dispositif that the reader creates (as well as learns and transmits, in my opinion), creating connections between elements of the texts in/with the imagination.

Witmore calls for “a phenomenology of these acts [of reading], one that would allow us to link quantitative work on a culture’s ‘built environment’ of words to the kinesthetic and imaginative dimensions of life at a given moment.” However, we need to go beyond a mere phenomenology to construct what Bernard Stiegler calls an organology of reading, one that takes into account the biological organs, the technical organs, and the social organizations that condition the phenomenon of reading in the way that Witmore conceives of it. His analysis includes at least part of the technical and social dimension. For instance, he writes that addressing text at the level of the word “is itself an artifact of manuscript culture, one that could be perpetuated in print through the affordances of moveable type.” He also calls for a thinking of how the technical innovations of digital computation change the limits of what kinds of reading are possible. He hints at the biological when he references the kinesthesis of reading, but this leaves out the neuronal and the synaptic, which are essential in understanding how new connections can be made among textualities by creating new connections in the brain. The main thing I believe he leaves out, however, is a core social component, the most important social mechanism at play in reading being education. The dispositif that the reader uses to address textuality is constrained/produced by many differential forces of power, but the most important one in a post-Enlightenment world is the psychopower of educational institutions. The scalar concentrations of a reading are determined within the context of a discipline over the course of its history, between the play of the generations and all the contradictions therein. How to choose what elements of a text to capture and what to do with them follows rules established by the discipline that are then passed on to and eventually re-formed or trans-formed by students.

By calling for an ontology of digital objects of address, Witmore is laying out for us what these rules need to be for those of us studying, teaching, reading, and writing in the digital humanities. What are the objects of our inquiry, and how do our technical apparatuses change the conditions of our inquiry? What possibilities of address are opened up by the digital turn of distant reading and what dangers must we warn against? These are the questions that must be asked when we consider distant reading within the history of textuality, a history that is constituted by technical innovations and shifts in thinking, reading, writing, and teaching.