The history: https://
1 min read
Databases like END can act as 18th century index-style finding aids that help researchers find new texts to close read, as well as or instead of enabling distant reading projects.
2 min read
For anyone at NYU who's interested, this was our Theory Thursday reading assignment for today! All the files are available in the Dropbox: https://
I’m attaching a couple short readings that we’d like you to take a look at before Thursday afternoon. The file titled "Theory Thursday" has all the required reading in it, but if you'd like to see the full text of either reading, I've attached those too.
The first reading is from Leah Price’s The Anthology and the Rise of the Novel (2003), one example of literary criticism that combines close reading with an attention to the materiality and reception history of texts. We’ll be reading part of the intro (pp. 1-8), which lays out some of her major concerns and methodologies, and part of the first chapter, which focuses on Samuel Richardson’s work specifically (pp. 13-18). (We’ll be looking at physical copies of most of the texts she talks about on Thursday!)
The other reading is a very short excerpt (pp. 222-3) from Robin Valenza’s “How Literature Becomes Knowledge: A Case Study” (2009). This should get us thinking about possible connections between 18th century practices of indexing/excerpting and contemporary databases like END. For context, the rest of the article deals with Samuel Johnson, compiler of one of the first English-language dictionaries, and his use of excerpts from Richardson as evidence for his dictionary. (Feel free to read more of the article if you have the time or inclination - it's very relevant to END.)
We also recommend that you take a minute to read a short excerpt from one of Richardson’s novels (Pamela, Clarissa, or Sir Charles Grandison) on Project Gutenberg or ECCO, or in print. You can also check out this online edition of Clarissa compiled by two Penn professors: https://
1 min read
"Experts say that temperatures within the office can dip below 65 degrees during the summer months and well into the fall. It is likely that the female data entry clerk may have prolonged her life with frequent bathroom breaks and unnecessary trips to the copier; however, these reprieves were only temporary."
1 min read
Miriam Posner put the text of her keynote talk online. It's really great! She talks about ways to represent data that don't treat categories like race and gender as static, and the importance of working with messy data in DH.
9 min read
First presentation: “The Napoleonic Theater Corpus: towards a representative corpus of nineteenth-century French” by Angus B. Grieve-Smith, a linguist. His research looked at 19th century plays as a way of studying how French negation had changed over time. He randomly selected four plays, one from each quarter of the century, from two different sources. The first was the FRANTEXT corpus, a collection of 2900+ French texts that literary scholars viewed as significant, compiled in the 60s on punch cards and paper tape then converted into magnetic media in the 90s. The second source was a list of all French plays published in the 19th century, compiled by Beaumont Wicks, who spent about 30 years on the project—like a much more involved version of our 1760s project?
This study let Grieve-Smith see how “ne pas” was becoming an increasingly popular form of negation in French at this time, but it also provided an argument for the importance of looking at less canonical texts when making broader arguments about the literature or language of a particular period. “Ne pas” made up 87% of negations in the plays randomly sampled from the list of all plays published during the century, but only 49% of those in the plays chosen as significant by scholars. Grieve-Smith suggested that this could be because the plays now seen as canonical disproportionately deal with nobility who speak a more formal, traditional version of French, while the full list of plays includes more about ordinary people whose speech is more likely to reflect how the average person would have spoken at the time. This seemed in line with Moretti’s argument about the importance of distant reading, and how we may see particular texts as worthy of study specifically because they’re exceptional, which then makes it difficult to get a sense of what an “average” text from the period would have looked like.
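The sampling step described above (one randomly chosen play per quarter of the century) is a form of stratified random sampling. Here is a minimal sketch of the idea, not Grieve-Smith's actual procedure: the function name, the placeholder titles and years, and the simplification of treating the century as 1800-1899 are all my own assumptions.

```python
import random

def sample_one_per_quarter(plays, seed=0):
    """Pick one play at random from each quarter-century.

    plays: list of (title, year) pairs, with years assumed to fall in 1800-1899.
    Returns a dict mapping quarter index (0-3) to the chosen (title, year) pair.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_quarter = {}
    for title, year in plays:
        quarter = (year - 1800) // 25  # 0 for 1800-1824, 1 for 1825-1849, etc.
        by_quarter.setdefault(quarter, []).append((title, year))
    # One random pick from each quarter that has at least one play
    return {q: rng.choice(items) for q, items in sorted(by_quarter.items())}

# Hypothetical placeholder list, not real data from either source
hypothetical_plays = [
    ("play_a", 1805), ("play_b", 1819),
    ("play_c", 1830), ("play_d", 1848),
    ("play_e", 1851), ("play_f", 1870),
    ("play_g", 1880), ("play_h", 1899),
]
picks = sample_one_per_quarter(hypothetical_plays)
for quarter, (title, year) in picks.items():
    print(quarter, title, year)
```

Running the same function over a curated list and over a full bibliography would give two samples that can then be compared, which is the kind of contrast the 87% and 49% figures come from.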
Second presentation: “On Creating the Digital Joyce Word Dictionary” by Natasha Chenier (online at joycewords.com). Chenier started out by talking about how frequently James Joyce is cited in different editions of the Oxford English Dictionary. The first edition was compiled in 1928 and tended to draw its citations from canonical authors, but Joyce wasn’t yet famous enough to be included. In the next two editions, however, he became the most commonly cited modernist writer, with 1800 citations in the second edition (many of them for obscenities, which were added for the first time in this edition) and 2400 in the third. The vast majority of these citations were from Ulysses, suggesting that the OED’s focus was on “great works” rather than “great writers”.
Chenier then focused in on the inclusion (and exclusion) of Joyce’s neologisms in the OED. Again, the majority of these (74/89 in one edition) came from Ulysses; she suggested that while Finnegans Wake includes more neologisms, their meaning is usually so unclear that giving an accurate definition would be difficult. Chenier also noted that the OED’s choice of which neologisms to include seemed somewhat arbitrary, which was why she was beginning to create an online dictionary of Joyce’s neologisms.
I very much enjoyed hearing the way Chenier talked about the principles behind this project. She talked about wanting to make the site aesthetically appealing, and to present it in a way that made it look inviting to a non-academic audience. She also wanted the project to align with Joyce’s “or, more accurately Leopold Bloom’s politics” (referring to the protagonist of Ulysses; I understood this to mean something like openness/flexibility/acceptance of multiple interpretations and contradictory information). In practice, this meant that anyone could submit a definition to the dictionary, and the site would allow multiple definitions or interpretations to be displayed alongside each other rather than showing only one authoritative version. This also seemed like a good way around issues like the ones the OED might have faced with neologisms from Finnegans Wake, where there were so many possible interpretations of each word’s meaning that it was impossible to arrive at an authoritative one.
Third presentation: “Literary Periodization and the (D)evolution of Distinctive Gender Markers,” based on research done by Sean G. Weidman and James O’Sullivan, and presented by Weidman. According to Weidman, this study was meant to build off a paper by David Hoover titled “Textual Analysis.” Hoover compared 13 male and female contemporary poets and apparently used this to make claims about gendered differences in word use. Pulling a quote from the paper: “Relatively common words like mother are found in twenty women’s sections but only eleven men’s […] Female markers like children and mirrors and male markers like beer and lust seem almost stereotypical, but there are also surprises, like the female marker fist and the male markers song and dancing.” (Source: https://
Weidman said that he and O’Sullivan were skeptical of the conclusions Hoover drew, wondering whether they fit too closely with gender stereotypes, and set out to do a similar study with a larger sample size. They looked at prose instead of poetry and covered Victorian, modernist, and contemporary literature, using texts from 9 male and 9 female authors from each period. However, their findings so far have been similar to Hoover’s. Some of the findings they highlighted were that women use more “emotive” language, personal pronouns, and “relational” words like “wife” or “brother”, while men’s language becomes increasingly sexual and colloquial moving into the contemporary period, and male modernists in particular have a tendency to use quantitative language. Male and female gender markers were most similar in the Victorian period; in the modernist period, men’s texts stayed fairly similar to each other while women’s had a huge number of outliers; and the contemporary period had the most clearly defined gender markers, with both men’s and women’s texts clustered tightly together.
Weidman presented this study with a lot of disclaimers: for example, that it was only a preliminary study; that it relies on static or traditional ideas of canonicity and periodization when these categories are actually contested; that the appearance of a word like “mother” in a text doesn’t tell us the relationship the author has to that word or concept; and (paraphrasing this point, possibly inaccurately?) that Zeta, the program it uses, is set up to look for differences between two discrete groups of texts, which means that what stands out in this study will inevitably be difference, not similarity or overlap between categories. Audience members pointed out a number of other potential issues during the Q&A: that the study treats “male” and “female” as clear, fixed categories even though they might look different or need to be applied differently for (e.g.) trans authors; that the study needs to account more for historical limitations on what female authors could write about; and that this type of research doesn’t necessarily control for situations like female authors writing in the voice of male characters, and what effect this might or might not have on the vocabulary they use.
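To make the point about Zeta concrete, here is a minimal sketch of a Zeta-style score. It assumes one common formulation of the measure (the share of sections in one group that contain a word, plus the share of sections in the other group that lack it) and is not the program Weidman and O’Sullivan actually used. The function name and the toy sections are hypothetical, with the words borrowed from the Hoover quote purely for illustration.

```python
def zeta_scores(group_a, group_b):
    """Score each word by how strongly it marks group_a over group_b.

    group_a, group_b: lists of text sections, each section a list of word tokens.
    Returns a dict of word -> score between 0.0 and 2.0.
    A score near 2 means the word appears in most group_a sections and few
    group_b sections; a score near 0 means the reverse.
    """
    sets_a = [set(section) for section in group_a]
    sets_b = [set(section) for section in group_b]
    vocabulary = set().union(*sets_a, *sets_b)
    scores = {}
    for word in vocabulary:
        # Share of sections in each group that contain the word at least once
        share_a = sum(word in s for s in sets_a) / len(sets_a)
        share_b = sum(word in s for s in sets_b) / len(sets_b)
        scores[word] = share_a + (1 - share_b)
    return scores

# Toy sections, invented for illustration only
group_a = [["mother", "children"], ["mother", "fist"]]
group_b = [["beer", "lust"], ["mother", "song"]]
for word, score in sorted(zeta_scores(group_a, group_b).items()):
    print(word, score)
```

Note that a word found equally often in both groups always lands at the midpoint of 1.0, so ranking by this score surfaces only the words that separate the two groups. That is the limitation raised in the disclaimers: the measure is built to show difference, and overlap drops out of view.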
I appreciated hearing both the presenter’s and the audience’s reflections on the assumptions embedded in and potential limitations of these types of studies, even if they can be a valuable way of testing common assumptions about gender and writing. I’d also like to add that I’m frustrated by how often conversations about this issue seem to end up in an attempt to defend women writers against the claim that they disproportionately focus on (e.g.) emotions, domestic spaces, or family or romantic relationships, either by rejecting it entirely or by explaining what historical and social forces might have made it difficult for women to write on other topics. I very much support that type of historical inquiry and think it’s necessary as a corrective to the kind of thinking that wants to claim women are innately more relationship-focused than men (which everyone in the room seemed committed to avoiding), but I also worry that sometimes it ends up being framed in a way that implicitly accepts the idea that emotions/relationships/domestic space/etc. are less valuable topics than whatever men are supposedly writing about instead (in Hoover’s study, apparently lust and beer). Since this is a study of prose fiction specifically, why not reframe women writers’ focus on relationships, emotion, and domesticity to see it as indicative of their central role in the development of the novel, which is often defined by its ability to address family and romantic relationships and characters’ interiority in increasing detail? Why don’t we see studies like this and immediately rush to defend men against the charge that they can only write shallow texts about numbers, beer, and sex?
And what if we read this study with the assumption that its findings will refute gender stereotypes? If I understood the more technical part of the presentation correctly, Weidman and O’Sullivan found a much higher degree of variance in the language used by modernist women than the language used by modernist men; might this be used to argue that, contrary to the more common assumption, modernist women were more formally innovative than their male counterparts? The finding that women used more verbs than men on average also seems to challenge the type of discourse that I’ve seen in a lot of (questionable) advice for fiction writers that treats adjectives and adverbs as ornamental, flowery, or feminine, and verbs as action-oriented and masculine.
The question I take away from this is: one of the major advantages of DH work appears to be the possibility of engaging in more "scientific" studies than humanities work typically allows, in theory allowing us to challenge "common-sense" assumptions that may in fact be incorrect. But what happens when the findings of these studies (or the methods that produced those findings) themselves require subjective, and thus potentially biased, interpretation?
1 min read
From Pamela Censured (1741). I feel somewhat censured as well.
4 min read
I should probably find some way to distinguish between more general plot summary and more detailed reference to specific scenes in the sources I’m looking at. Using examples from Pamela Censured:
(1) “[Mr. B] tries all Arts to seduce her thereto, but finding them all ineffectual…”
(2) “instead of letting her return in Safety to her Father and Mother as he had promised her, and which more speciously to make her believe, he complements her with his own Chariot to carry her, but at the same Time gives private Orders to his Servants to convey her far from the Place she desires to go to…”
(3) “Then follows a Proposal at large to induce her to commence a kept Mistress: The Particulars of which, the Author hath fully set forth, in order to instruct the young Gentlemen of Fortune how to proceed in such a Case, and that young Girls of small Fortunes may see what tempting Things they have to trust to.”
I’d like to exclude something like (1) from consideration. I could easily pick out multiple scenes this type of statement could be said to refer to, but the point seems to be less to reference a scene or even a collection of scenes and more to convey a more general fact about what happens in the novel. Like saying something like “Pamela marries Mr. B” - while it’s true that this happens in a specific letter, is the actual passage in the text really the point?
(3) seems like the type of thing I want to include: reference to a specific event, followed by commentary on that section of the text. (2) is more borderline. It more clearly refers to a scene or sequence of scenes than does (1), but it still reads to me like summary more than anything else.
I guess I’m making these judgments based on level of detail in a few different senses: the use of particular details of plot, setting, characterization, etc. drawn from Pamela, but also whether the language of the original is imitated or (as in some parts of Pamela Censured) quoted directly. Maybe this could be summarized as a concern with the textuality of the text, or with the kind of detail that Watt sees as defining the novelistic form.
Identifying a reference seems like it should be easy in moments like those in Pamela Censured that quote from the actual text, usually with page and edition numbers included. Though even in a case that seems so clear-cut, it would probably be useful to find a way of distinguishing between more and less significant or detailed references (for example, Pamela Censured quotes what appears to be nearly all of the cross-dressing scene but doesn’t spend the same amount of time on many/any other sections).
Another question I should consider: if I do something like an annotated text, which edition of Pamela should I use? Keymer and Sabor make a convincing argument that Shamela should be read as a parody of the third edition specifically. Pamela Censured is explicit about the fact that it’s referring to multiple editions of Richardson’s text. Assuming I’m interested in responses to Pamela that deal with different editions of the text, or don’t clearly identify an edition they’re concerned with, can I treat a reference to (say) a scene in the first edition as basically the same as a reference to its equivalent in the third? Or do I need to find some better way to make that distinction? (Some of this must depend on how significant the revisions to each edition are. Some of the sources I’ve found on the history of Richardson’s revisions should help with that. Another consideration would be that if I’m interested in looking at Richardson’s revisions and responses to his own text and how this does or doesn’t correlate with reader reactions, then the edition number has to matter in some way.)
2 min read
"The secret history of Betty Ireland, who was trepann’d into marriage at the age of fourteen, and debauched by Beau M-te at Fifteen, by whom she had one Son; the vile Injury she did to that Gentleman, and her turning Prostitute; her Amour with the Lord M-d when she came to London; and her Ingratitude to that Noble Gentleman. Her Incest with her own Son, by whom she conceiv’d and brought forth a Daughter, on whom she settled a handsome Annuity; her taking a House and selling Punch, &c. her being carted for a Bawd; her Revenge on one of the Justices who was principally concerned in causing her to undergo that Shame. Her Amours with a Jew, whom she caused to be arrested for 300l. and with three Merchants (who were Brothers) to each of whom she was married in seven Days, without the Knowledge of either; and afterwards separated upon Articles of Agreement. Her Behaviour in Yorkshire; particularly in Relation to the aforesaid Justice of Peace; her Liberality in that County; her being robbed on Epping-Forest, having first shot one of the Highwaymen, and being afterwards shot in the Shoulder by another; her taking a House and intriguing with Smutty-Will, an Irishman, who lived by Sharping. His Tricks with several Tradesmen; his Confinement and Death in Newgate. Her associating with Shoplifters; her being taken in the Fact; and the Stratagem she used to escape a Prosecution. Her inveigling a young Man to sell his Patrimony before he came of Age; her turning a Strolling Player, with the Manner how she made herself Mistress of the Company; her enticing her Daughter to leave her Father; their Arrival at Cork in Ireland, after they had escaped a violent Storm; their Success there for some Years, with an Account of her sudden Death. The sixth edition." (ESTC, record T109761)