Recently I attended a colleague’s job talk concerning the integration of DH technology with theoretical hermeneutics, and he made a statement with which I very much agreed: paraphrased, that raw data sans some theoretical or analytical methodology for interpreting the data is meaningless. My experience with the Woodchipper project, sponsored by the MITH cohort at U. Maryland, provided me with an opportunity to test that belief. The Woodchipper tool is itself a remarkably promising conception; as the following screenshot shows, the interface attempts to isolate and graph the occurrence of certain verbal chains throughout the selected text or texts.
The initial guidelines for testing were somewhat vague—choose several texts of the same or different genres, plug them in and see what comes out. I tried this at first, but found myself confounded by the seeming randomness of the data.
There seems to be a stereotype of late of over-zealous scholars who tear the text asunder, hacking away at the text’s organic foliage for the sake of isolating one small twig, itself mercilessly subjected to overreading and underreading and misreading. One imagines the analyst in the garb of a Spanish Inquisitor, the text tied to the rack, mercilessly stretched beyond recognition for the sake of some theory or other.
I want to offer another viewpoint here. The literary text, as I see it, is a conglomeration of divergent elements; one doesn’t have to endorse Bakhtin here, as it is readily apparent even in the hierarchical structure of letter, word, phrase, sentence, paragraph, chapter, book. If they are all unified completely and exclusively, remove one and the text falls apart. But texts don’t emerge from a vacuum, or come fully formed from the author’s imagination. Texts emerge over time, as one concept is proffered, then interpolated by its neighbors; the text is a very effective network guided by the authorial intention (even if it manifests itself by its absence). So why shouldn’t we pluck off one component, see how it works with the rest, and not so much prune as remove seed and replant?
All this is by way of preface to the methodology I chose. Always fascinated by Eve Sedgwick’s Imagery of the Surface in the Gothic Novel, I re-read her essay, attempting to derive certain key concepts, from which I extracted certain word-patterns that one would expect to find in the text, should Sedgwick’s analysis be borne out. I then attempted to interpret the data along these lines and the results, included below, were far from conclusive. But I did see the possibility for theoretical interpretations of the data along the lines indicated by Sedgwick; it merely required, as with everything in the humanities, a little bit of faith and a large amount of interpretative work.
I decided to utilize Eve Sedgwick’s earth-shattering analysis of surfaces and veils in the Gothic (attached) to see if the output of woodchipper could provide a measure of verisimilitude for works of more abstract theory.
Without going too much into detail, Sedgwick’s analysis yielded the following thematic points of emphasis:
Opposition between surface and veils
Relationship between writing/inscription
Character or self as socially constructed or externally imposed rather than innate
Emphasis on the visual properties, especially facial characteristics, when determining character
Failure or doubt of the aforementioned identification
Semantic ambiguity, especially as regards the criterion for visual identification of character
Metonymic slippage of surfaces/the veil; that is, the veil seems to absorb some aspect of its wearer’s personality, and transmits this to other characters
‘Two dimensional,’ stock or underdeveloped characters
Repetition of motifs of landscape, color, music; these form a fixed, stable backdrop by dint of their repetition
The word ‘candor’ and the pallor of white are identified as terms of paramount importance
Relationship between blood and flesh
The rational (conscious) versus irrational (libidinal)—Sedgwick identifies this as the dominant tenor of critical theory concerning the Gothic, which she immediately dismisses
The erotic charge of veils and surfaces
II. The Experiment
From the above, I was looking to see what verbal chains woodchipper identified within two of the three texts Sedgwick cited, viz. The Mysteries of Udolpho and The Monk.
III. The Results
A1) felt made conduct received heart
A2) moment made escape found length
A3) god heaven life death man
A4) love heart loved world happy
A5) hand eyes face looked hands
B1) felt made conduct received heart
B2) trees woods mountains tree green
B3) mind heart tears grief seemed
B4) door room open opened light
B5) dear see young good man
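The matching step behind the interpretation that follows can be made explicit with a short Python sketch. The chains are copied from the list above; the theme keyword sets are hypothetical, hand-picked stand-ins for a few of Sedgwick's points and are not part of Woodchipper or of her essay. The sketch only counts literal word overlap, so it shows at most where a reading might begin.

```python
# Chains as listed in section III; "A" and "B" are the post's own labels.
CHAINS = {
    "A1": "felt made conduct received heart",
    "A2": "moment made escape found length",
    "A3": "god heaven life death man",
    "A4": "love heart loved world happy",
    "A5": "hand eyes face looked hands",
    "B1": "felt made conduct received heart",
    "B2": "trees woods mountains tree green",
    "B3": "mind heart tears grief seemed",
    "B4": "door room open opened light",
    "B5": "dear see young good man",
}

# ASSUMPTION: hypothetical keyword sets chosen by hand for illustration only.
THEMES = {
    "visual identification of character": {"eyes", "face", "looked", "see", "seemed"},
    "landscape and colour": {"trees", "woods", "mountains", "tree", "green"},
    "imposed character": {"conduct", "made", "received"},
}


def theme_hits(chains, themes):
    """Map each theme to {chain label: sorted words shared with the theme}."""
    hits = {}
    for theme, keywords in themes.items():
        matched = {}
        for label, chain in chains.items():
            overlap = sorted(set(chain.split()) & keywords)
            if overlap:
                matched[label] = overlap
        hits[theme] = matched
    return hits


def shared_chains(chains):
    """Return the chains that occur under both an A label and a B label."""
    a = {chain for label, chain in chains.items() if label.startswith("A")}
    b = {chain for label, chain in chains.items() if label.startswith("B")}
    return sorted(a & b)


if __name__ == "__main__":
    print("Shared by both texts:", shared_chains(CHAINS))
    for theme, matched in theme_hits(CHAINS, THEMES).items():
        print(theme, "->", matched)
```

Note that chain A2 registers under ‘imposed character’ on the strength of the single word ‘made,’ which is a fair illustration of how much interpretative work a bare word match leaves undone.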
IV. (Very Subjective) Interpretation of Results
A1/B1) The line
felt made conduct received heart
was common to both texts and rather frequent, if I understand the Woodchipper interface correctly; at any rate, it’s important. At first reading I understood ‘conduct’ to mean conductivity in a sense of heat or electrical exchange, which would lend support to Sedgwick’s point about the metonymic character of the veil. Closer examination (and common sense!) revealed, however, that ‘conduct’ refers here to one’s persona, one’s character. The word ‘conduct’ has a connotation of exteriority: it is how one comports oneself in public, rather than what one’s true self might be. This, in proximity with the words ‘made’ and ‘received,’ may be interpreted as supporting Sedgwick’s assertion that character is imposed from the outside rather than innate to the individual character.
A2) The chain ‘moment made escape found length’ is not surprising, given the association of the gothic with captivity and discovery. However, the prevalence of ‘moment’ and ‘length’ does hint at a text which is remarkably preoccupied with temporality and its measurement. I would have to look at a ‘control group’ of texts, however, to see if discussion of time in this fashion is merely a necessary symptom of novelistic discourse.
A3) The presence of ‘god heaven life death man’ in The Monk and not in Udolpho is unsurprising given the former’s supernatural elements in contrast to the latter’s ultimately natural explanation for events.
A4) love heart loved world happy
To be honest, I don’t quite know what to make of this one. The association of variations on love, heart and happy is hardly surprising; the intrusion of the word ‘world,’ though, might be a fluke. One could proffer a connection between intimate interior space (happiness and love) and the external world, but this assumes a definition of ‘world’ as umwelt in the philosophical sense, and this might appear a bit of a stretch.
A5) The line ‘hand eyes face looked hands’ most supports Sedgwick’s claims, illustrating an inordinate preoccupation with ocular descriptions of character; the association of these terms with ‘looked’ would seem to tie them to a discourse of characterizing and description.
B2) ‘trees woods mountains tree green’ would seem to confirm Sedgwick’s claim (which is by no means unique) about the omnipresence of the language of nature in the gothic, and does support her arguments about the presence of color.
B3) mind heart tears grief seemed
The close association of ‘mind’ with ‘heart’ lends itself to the notion of emotional depths, which is supported by the affective terms ‘tears’ and ‘grief;’ this would at first glance support the argument Sedgwick is writing against, viz. the existence of hidden emotional depths beneath a repressive ego/super-ego exterior. However, the outcropping of the fifth term ‘seemed’ can be explained by Sedgwick’s claim that appearances are frequently misleading. If one can infer from these results to the text, one could speculate that the terms ‘mind’ and ‘heart,’ which connote interiority, can only be manifested in outward shows of emotion (‘tears,’ ‘grief’); therefore their apprehension is always somewhat dubious: ‘seemed.’
B4) ‘Door room open opened light’ confirms the Gothic’s obsession with spatiality and the sense of unfolding rooms, a motif which lends itself, in my opinion, to the opening up (ha ha) of a surface/depth discourse.
B5) ‘dear see young good man’ does lend itself, although this is a bit of a stretch, to the argument about the limited nature of character development in the Gothic (‘young’ and ‘good’ aren’t exactly bursting with psychological complexity). This is a tenuous argument to make, so I’ll stick with the use of the word ‘see,’ and its confirmation of the importance of ocular description in the identification and semantic construction of character.
It would be madness to claim that woodchipper has ‘proved’ Sedgwick’s psychoanalytic reading of gothic fiction, and it is quite clear that the noted correlations do require a fair bit of ‘massaging’ to fit the theory to the empirical data. That being said, I do think some aspect of her argument may be supported by a good textual woodchipping. In particular, the linkage of the visual to establishment of character appears reasonably solid. Again, these correlations are not self-evident but must be interpreted by the experimenter, and I, personally, would not have it any other way. I would have to be better acquainted with woodchipper’s heuristic processing to feel confident in the results of this experiment but, to address Neil’s concern over the chasm between high theory and digital humanities, for me this is a promising, or at the very least provocative, start.
Now that I’ve made my point, I have a confession to make. The data have changed since I ran this experiment four weeks ago. I constructed an analysis based on the word associations generated within certain texts. I was under the impression that they were generated from the selected text alone, but was prepared for their emergence from a standardized ‘word bank’ culled at some point from the other texts; in that case the chains would reflect thematic association among gothic works, which would be just as good, but more difficult to interpret accurately. What I did not expect, however, was that these values vary with the texts in the machine; texts are not taken in isolation, or so I must assume. In comparison with the original data used for my analysis, the differences are not catastrophic (see screenshot below for a comparison), and I believe the key arguments of my interpretation still work. However, it is a troubling concern, and if nothing else a stirring reminder of the risks involved in a methodology that is increasingly data-driven.
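One hedged way to put a number on how far the chains drift between runs is to treat each chain as a set of words and compute Jaccard similarity (shared words divided by all distinct words) between chains from the earlier and later runs. The sketch below is a generic set comparison in Python, not a description of how Woodchipper itself works; the function names are hypothetical. The later-run chains are not reproduced in this post, so the usage lines only compare chains from the results list in section III.

```python
def jaccard(chain_a, chain_b):
    """Jaccard similarity of two chains, each treated as a set of words."""
    a, b = set(chain_a.split()), set(chain_b.split())
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_matches(old_run, new_run):
    """For each chain in old_run, return (score, chain) for its closest
    counterpart in new_run; (0.0, None) if new_run is empty."""
    return {
        old: max(
            ((jaccard(old, new), new) for new in new_run),
            key=lambda pair: pair[0],
            default=(0.0, None),
        )
        for old in old_run
    }


if __name__ == "__main__":
    # Stand-in inputs: chains A5 and B3 from section III, not a second run.
    a5 = "hand eyes face looked hands"
    b3 = "mind heart tears grief seemed"
    print(jaccard(a5, a5))  # identical chains
    print(jaccard(a5, b3))  # chains with no words in common
    print(best_matches([a5], [b3, a5]))
```

A score near 1 for every earlier chain would suggest the new output is a reshuffling of the old; low scores across the board would suggest the interpretation needs to be redone against the new data.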