Library Treasures, Rebooted

As we prepare to continue work on our class Omeka project, I’ve been thinking—as I’m sure we all have—about what impact the digitization of these books might have on a larger scale.  What does it mean to have such a large collection of marginalia?  How might one best make it available online?  And, perhaps most important, what purpose would such digitization serve?

I was still pondering this issue as I read “The Library Rebooted,” by Scott Corwin, Elisabeth Hartley, and Harry Hawkes.  I have to admit, I found the article immensely appealing, if only because—unlike several of the doom-and-gloom pieces we’ve discussed—it attempts to provide answers to the conundrums that it highlights.  Yes, libraries appear to have less value now than they did previously.  Funding, available space, and reader interest are all decreasing, while journal prices skyrocket and books gather dust on the shelves.  But rather than mourning the loss of a rosily-glowing past, Corwin, Hartley, and Hawkes turn their attention toward the future, encouraging libraries to rethink their operating models and innovate to better serve their users.

I will leave the merits of most of their suggestions for later discussion.  For now, I’d like to look at two particular points amid their much larger arguments, which, for me, sparked new thoughts on the ultimate value of our Omeka project.

First, Corwin, Hartley, and Hawkes draw attention to libraries’ abilities to bring their “treasures” to the people through digitization, highlighting the British Library’s guided web tour of its most prized possessions: the original manuscript of Alice in Wonderland, Leonardo da Vinci’s notebooks, Mozart’s notations on a piano concerto (6-7).  In contrast, when encouraging librarians to “rethink the operating model” of their libraries, Corwin et al. suggest that “keeping even infrequently loaned books on shelves” is “outmoded” (8).

The driving force behind such a decision is the concept of user value.  Transforming obvious treasures into an “engaging interactive experience” allows users to enjoy them digitally (and thus, to better recognize the library’s value), while getting rid of little-read books frees up room for more user-friendly spaces, such as “a lending library and areas for children, teens, and seniors” (6).  In making such suggestions, however, these authors reinforce a controversial underlying judgment: namely, that the manuscripts of Leonardo da Vinci are treasures and the “infrequently loaned” books in the stacks are not.

I won’t quarrel with the first part of this statement (how could I?), but the second brings us back to those books in the NINES office with which we have been working this semester.  These books have obvious value: historical, cultural, and sentimental.  Each presents a microcosm of the history of reading, of book making, and of book culture throughout the nineteenth century.  They give us a window into the past, into the lives of the people who wrote in them, and into the reception history of the authors in whose works the marginalia has been written.  Such a wealth of meaning makes these works treasures in their own right, even if they spend most of their time on the library’s shelves collecting dust (and not bar codes).

We, who have spent a semester working with these books, presumably do recognize their value.  We are, however, also English graduate students—book-loving by nature and also a very small subset of the general population.  It seems to me, though, that our viewpoint is one worth sharing. It also leads me to suggest that the goal for our project should not just be to digitize these books, but to do so in a way that allows us to convince the rest of the population—Corwin, Hartley, Hawkes, and all—of their value.

To do this, it will not simply be enough to post images of these books and their marginalia online because, as “The Library Rebooted” has shown us, such images will not be treated as terribly valuable.  In order to engage a larger audience, we must show why these books—and their marginalia—are interesting.  This would mean doing much of what we have already discussed in class: tracing the person, time, and place behind each note or doodle to give the book a history and a character; providing context for the newspaper clippings and letters tucked inside them; even offering a textual history of the book itself to highlight its value. 

In order to reach a large audience, though, we would have to include this information in a more compelling way than tags or lists of metadata.  Instead, we would have to follow the British Library’s example and make the presentation of our “treasures” an “engaging, interactive experience.”  An interesting interface would allow us to draw in readers not already interested in the history of books or marginalia.  Clear and comprehensive explanations of the materials would allow the digitized pages of these books to serve as educational tools, in addition to digital records.

Such a project is much bigger in scope than what we’ve done so far on Omeka.  Quite possibly, it is far too big a project to undertake at all.  Reading this article, though, it occurs to me that if we want to encourage people to “save the library,” so to speak, we have to show them why the library is worth saving.  That does not mean just showing them the obvious treasures—rare manuscripts, famous first editions, etc.—but encouraging them to see the rest of the library’s holdings as treasures, too.


Wolfram|Alpha and “analyzing” Shakespeare



Unfortunately I don’t have time to write a full post tonight, but since this seemed so timely I figured I should just share it with you all asap. The online “computational knowledge engine” Wolfram|Alpha announced today that it can now analyze Shakespeare’s plays. Screenshots from searches on Midsummer Night’s Dream, Romeo and Juliet, Julius Caesar, and other plays demonstrate the kinds of information you can expect Wolfram|Alpha to return for your queries. Definitely worth a look, even if you only skim the post.

Interesting intro-to-Wolfram|Alpha video here.

Also note that they do have one British C19 text in their database at the moment (Great Expectations) and they’re soliciting suggestions for what other texts ought to be added. What would you ask for? Does this seem like a useful (by which, of course, I mean “provocative”) tool?

Re-visualizing ‘Rear Window’

In light of our recent focus on (re-)visualization, I found an interesting video on Slate magazine and wanted to share it with you guys. It is not really literature-related, though the Manovich article did talk about film for a bit. The video creator gathered all the close-up apartment shots from Rear Window to create a panorama of the entire space so one can see all of the plot events happening, sped-up, from the point of view of an establishing shot. I don’t think the movie itself even has an angle quite as encompassing. Here it is (n.b.: the embedded screen is rather small, so making it bigger may or may not be helpful):

This is not meant to count as an official blog post; I do not have much to say beyond tangential thoughts and my initial reaction, which was: “That’s so cool!” I suppose that, in general, the video does seem to give the viewer a sense of critical distance through the re-mapping of both space and time. In terms of space: rather than forcing us to look through one particular window, this wider perspective allows us to choose where to gaze among various locations, albeit with the obvious loss of detail. I think Andre Bazin talked about this in relation to Citizen Kane: that the film’s deep focus and economy of montage allow the viewer to choose where to direct her attention and thus she edits the shot herself. That constituted a sophisticated kind of film realism. (Am I getting this right?! It’s been a while). This shot is not really deep focus–and the lighting and the fuzziness in certain spots work to direct your gaze–but the steady shot does seem to offer greater objectivity to the narrative. Realism? Maybe not.

I think that the speeding-up of the scenes also adds a unique and odd type of critical distance to the film. In the original version, Jefferies, the injured and immobile photographer-protagonist, directs the gaze–we see most of the scenes through his inflected eyes. The sped-up camera takes away some of the voyeuristic/emotional charge of the film (recall Hitchcock’s gauzy, slo-mo introduction to Grace Kelly in the original) and detaches us from Jefferies’s projections onto the apartment “windows” in front of him. Question: How does the part where GK crosses the threshold between buildings compare with the original version, given the two different perspectives and speeds?

I’m not sure how useful this video would be in a scholarly conversation, but I thought it was interesting, at least, and it would be fun to talk about!

Tools, Data, Moretti’s Models… and the tricky bits that get left out

It’s beginning to seem to me that the best thing these digital tools and models can do for us is offer a kind of critical distance. They let us view works from above, not from within; Elizabeth Gaskell might say, from a “mount of observation.” Voyant (and even more than Voyant, Woodchipper) gives us unbiased analyses of important themes and threads. Maps, graphs, and trees let us see data in new patterns that we weren’t necessarily looking for; the tools give us distance. They remove some of the subjectivity that has always been integral to literary criticism. But Daniel Rosenberg’s talk left me convinced that there was no such thing as raw, unbiased data. If data is rhetorical, then it is all biased or skewed in some way. Even by choosing what information to record, we skew the results. Analysing texts in Voyant only works well if we remove certain common words, but what do we miss when we remove “it” and “he” and “she” and “I”? Data models pretend to offer critical distance, but I’m not sure whether they can—or whether eventually they will, when digital humanists have developed a basic set of data points that can be gathered. When a chemist does an experiment, he measures temperature, time, mass, volume, color, and so forth; what are the literary scholar’s equivalents?

I’m not at all sure Franco Moretti gives us an answer to that question. The examples he sets out are suggestive, but to me, Moretti seems to be explaining only one part of a larger theory. I like to think of literature as a conversation of sorts. Moretti’s idea that form comes from force is amazing: social forces shape a writer, who creates a work. The work then goes out into the world and is mediated and received.

Moretti skips over half of that conversation: he stops once the work has been written and published. His graphs of genres, and all the reasoning from those graphs, are based on publication numbers—how many books in each genre have been published. But how many of those novels were flops? How many sold well? Were books read once and then discarded, or did readers reread novels—and how often? Did some novels—or some genres—have longer reading lifespans (as opposed to publishing lifespans) than others? That’s a much more complicated question, of course—but perhaps not an impossible one to tackle. Maybe there’s information about sales in booksellers’ or publishers’ archives, and information about circulation in circulating library archives. This is also, of course, where annotations and mediated copies come in—this is the sort of information that the books we’re putting on Omeka can give us, on the large scale.

I loved the section on maps, even though it did seem to cover only half of the conversation again. He talks about the way village story collections display the social geography and mindset (mentalité) of the time; but that’s an explanation of how social forces shaped the work. Did the works have any influence on society? Did they provoke any response? I can’t imagine how to figure this one out. In any case, this sort of intersection of real-world humans and art is exactly why I study English, and exactly why I think it’s an important discipline: the humanities show how we process our world, and are how we process our world. But that means that it’s even more necessary to study the second half of the conversation—not just what the author produces, and why and how it was produced, but what happened to it once it was produced. The trickiest bits might be the most important.

The Cather Archive

I probably should have taken some more time between the end of class and posting this blog, because there is no way that I will not sound like a lovesick teenager about the Cather Archive. It has basically everything that I could want in an archive, with a few exceptions I’ll mention later.

It is easy to get around. It looks pretty. And it has simply fantastic tools. Here are a few of them:

The Geographic tool, which maps her world using Google maps and awesomeness.

It is excellent that it even tells you what she did in the place she visited.

My actual text was “On the Gulls’ Road,” published in McClure’s Magazine in 1908. The entire story is laid out, and you can access scanned images of that edition (or at least, I’m assuming it is that edition). I decided to use the short story and play around with TokenX. TokenX is a really impressive (to me, at least) tool that allows readers to manipulate the tale in many, many ways. You can WordCloud it. You can search for keywords. You can even, for reasons I don’t quite understand, replace all of the words with blocks. I think the tool I liked the best was the Replace Words with Images tool. I think it is a fantastic and innovative way to visualize keywords. I searched “look”, “looked”, “gaze”, and “eyes” (for basically no reason), and got this. I think that is just a really interesting way to think about the frequency of a particular theme (perhaps not in this story, which I know nothing about, but in general).

So I had very few problems with this site. I do think that the images they use to replace words could be altered- for example, you can insert a fish but no mouth? And my object-specific problem is that I think ignoring the rest of the magazine the story appeared in is problematic. I do not know much about that specific magazine, but it must be important where the story was placed in the layout, what other stories were chosen, what ads were used in that magazine, etc, etc. I think those questions are crucial to determining how Cather’s own society categorized her writing.

That said, I absolutely loved the site. My agony at this not existing for the Alice novels or Emily Bronte is intense.

“The page that is only [?] text”



I wanted to share with you all a post I read this weekend on the New York Review of Books blog. Novelist, essayist, and translator Tim Parks writes about the differences between paper and e-books, claiming that we lose essentially nothing (and perhaps even gain something) when we make the switch from print to digital. What we gain, Parks says, is freedom from the “extraneous and distracting” materiality of a printed book: “In this sense the passage from paper to e-book is not unlike the moment when we passed from illustrated children’s books to the adult version of the page that is only text. This is a medium for grown-ups.”

The post arrested my attention all the more forcefully because I discovered it so shortly after our excursion to MITH, where we spent a good deal of energy working out how to record and preserve the physical features of the Frankenstein manuscript in XML: I hoped sharing it with you, my markup compatriots, would be cathartic. I would welcome a discussion particularly because I wonder how much my textual-criticism-and-book-history background skews or otherwise influences how I’ve interpreted Parks’s remarks.

One of the reasons I have trouble with the piece is that Parks fails to distinguish between born-digital texts and digital surrogates of print texts. He claims that Jane Austen, Dan Brown, and James Joyce read the same on an e-reader as they do in any printed book. I think that at least part of his argument is really about the difference between the work itself and the witnesses, or texts, through which we experience the work. The novel Pride and Prejudice does not exist in one particular printed instantiation, and Parks writes evocatively about this truth: “We all know that once the sequence of words is over and the book closed what actually remains in our possession is very difficult, wonderfully difficult to pin down, a richness (or sometimes irritation) that has nothing to do with the heavy block of paper on our shelves.” I, for one, always imagine that works float around up above our heads: ethereal, almost imaginary, and, as Parks says, “wonderfully difficult to pin down.”

However, Parks seems to insist that because works are necessarily immaterial, the medium through which we experience a version of that work does not affect “the [essential] literary experience.” Mediation is insignificant; or, rather, electronic mediation is somehow preferable, because it does away with those pesky, distracting physical trappings that make up the printed codex: paper, binding, type, layout, advertisements. “We can change everything about a text but the words themselves and the order they appear in,” he says. “The literary experience… lie[s] in the movement of the mind through a sequence of words from beginning to end.” To read Ezra Pound’s The Cantos, a Kindle (or for that matter a stock ticker!) should do the same kind of work for the reader that any printed copy of the poem does, because “unlike painting there is no physical image to contemplate, nothing that impresses itself on the eye in the same way, given equal eyesight. Unlike sculpture, there is no artifact you can walk around and touch. You don’t have to travel to look at literature.” Perhaps using a visually provocative poem like Pound’s is unfair to Parks, but I also wonder how he would answer my objection. Never mind that before the Blake Archive existed, one did “have to travel to look at literature,” because there was no way to access the work without doing so.

For those books originally printed in ink on paper, specific physical objects—witnesses of the un-pin-down-able work itself—do exist, on coffee tables and bookshelves and in libraries and private collections around the world. Even leaving aside books like Oscar Wilde’s The Sphinx, which we know only look the way they do because their authors insisted upon this binding, those illustrations, and that kind of paper, we are living, as Madonna has it, in a material world. Materiality always has and always will affect both paper and electronic reading experiences. Dust jackets and covers, which Parks dismisses as “repositor[ies] of misleading images and tediously fulsome endorsements,” influence our readings of novels and poems and essays, just as they influenced readers in Dickens’s nineteenth-century London. As technology changes, we will continue to encounter born-digital materials too through new mediums and electronic platforms. Twenty years from now our Kindles and Nooks and iPads will be electronic dinosaurs, and the way we read Joyce electronically will be foreign to later generations. Perhaps future scholars will study electronic forms of mediation with the same enthusiasm that some of us now apply to our study of print culture. In any case, I would like to insist that there is more to “the literary experience” than “only the sequence of the words.”

Just some things to keep in mind as we work on our next reading assignment, which is, delightfully enough, an illustrated children’s book: Alice’s Adventures Under Ground. What do you all think? Does medium matter? How are you encountering Alice? What would Tim Parks say?

Seeing Gender in Charlotte Bronte’s Villette and The Professor

After I finally got Cirrus to work on my computer, I found it to be mildly addicting. I did several novels and some poems that ultimately did not yield very interesting (or surprising, as Lingerr noted) results, but one novel’s word cloud that intrigued me was Bronte’s Villette, which you can view here. Apologies in advance for those who haven’t read the novel yet.

I found that the Voyant tool was useful in detecting names and words that fell below the radar as I was recollecting Villette. Since Bronte’s novel itself is so concerned with invisibility and surveillance, I thought it was interesting how my memory seemed to elide over certain characters’ ubiquity in the novel. A question for fellow Bronte-ists: if you had to make your own word cloud for Villette, would you immediately list “Madame” (331 times, 4th in overall frequency, 1st character) as one of the primary words? I suppose I would pick out John (163)/Graham (202), Paul Emmanuel (166), or Polly (didn’t even make the word cloud) before Madame Beck. But upon further reflection, her prevalence makes complete sense, given her omnipresence and omniscience, operating as the “thought” (307) police within the girls’ school that she runs. Did I just forget about her because the bildungsroman and romance plots stuck in my mind more than other aspects of the book a couple months later?

Her predominance also made me think about the similarities between Lucy and Madame Beck; I suppose that Mme Beck is Lucy’s closest female “double,” in a way. One could re-evaluate the power and gender dynamics of the novel — particularly in terms of Lucy herself — and how they manifest themselves in the story’s structure as a whole. Bronte demonizes Mme Beck in so many ways, but if one chose to be *generous* and look past her heartless disposal of employees or the drugging of Lucy into an opium frenzy…she does ultimately secure Lucy’s independence by the end of the story.

Since Villette is Bronte’s reconfiguration of her earlier novel, The Professor, I was interested in how issues of power and gender emerged in that respective word cloud. Here it is. One obvious difference between the two novels is that one is narrated by a man and the other by a woman, but it’s more interesting how Bronte subtly encoded gender in each story. (N.B. Villette is much longer than The Professor so the proportions might be slightly off.) A word that occurred more often than I remembered: “monsieur” (155 times, also the 4th in frequency). “Mdlle” (112) occurs pretty often, too, but that seems to be dispersed among multiple female characters, whereas “Madame” in Villette generally just refers to Madame Beck. I am interested to know what you guys think of these results — do they alter/re-adjust one’s perception of the novels?
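On the N.B. about length: one rough way to compare word counts across novels of different sizes is to turn raw counts into a rate per 1,000 words. The sketch below is only an illustration of that idea, not a description of what Voyant or Cirrus does internally. The function name `per_thousand` and the two toy strings are my own assumptions; they stand in for the novels and are not quotations from them.

```python
import re


def per_thousand(text, word):
    """Occurrences of `word` per 1,000 words of `text` (0.0 for empty text)."""
    # Lowercase the text and treat any run of letters/apostrophes as a word.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 1000 * words.count(word.lower()) / len(words)


# Toy stand-ins (hypothetical): a longer text and a shorter one,
# each containing its keyword exactly twice.
long_text = "madame watched " * 2 + "the school slept " * 6
short_text = "monsieur spoke and monsieur left"

print(per_thousand(long_text, "madame"))
print(per_thousand(short_text, "monsieur"))
```

Both toy texts contain their keyword twice, but the shorter one yields a much higher rate, which is the kind of distortion raw counts would hide when setting Villette beside The Professor.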

Other interesting and slightly amusing word frequencies: “little” is second in frequency for both novels. Going through Villette, the word seemed mostly to apply to Paulina, though there is also that scene when Lucy enters the “very tiny” salon with its “little couch” and “little chiffonniere” (p. 485). This maybe reflects Bronte’s preoccupation with compartmentalization and containment. As mentioned above, Bronte refers to Dr. John Graham Bretton more often as “Graham,” his name as a youth, than as “John.” Notable absences from Villette: “Emmanuel” and “Polly/Paulina.” Are Paul E. and Polly supposed to add up to each other in a way? And “Hunsden” figures prominently in The Professor’s word cloud, though he just seemed to be a disconcerting presence that showed up haphazardly at various points in the story.

Several Unsurprising Things I Realized Whilst Using Voyant

The least surprising thing is that I decided to use this tool to look at Emily Bronte’s Wuthering Heights; anyone who knows me knows my dangerous obsession with the novel (as I write this, I’m wearing a “Team Bronte” shirt).

So I copied and pasted the entire text from Project Gutenberg into the Voyant tool site. This was a bit tedious since I literally copy/pasted it all; you can use the URL, but I did not want all the excess data. The most frequent words were “a”, “the”, etc., so I removed those. These are my final results, and whilst they are not at all groundbreaking (none of us are going to faint away in horror that “Heathcliff” is the most frequently used word in Wuthering Heights), I’m interested in them nonetheless.
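For anyone curious about what removing “a” and “the” amounts to, here is a minimal sketch of a word count with stop words filtered out. It is an assumption-laden illustration, not Voyant’s actual method: the stop-word list is a tiny hypothetical one (Voyant’s is surely longer), the function name `top_words` is mine, and the sample sentence is invented rather than taken from the novel.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in", "it", "he", "she", "i"}


def top_words(text, stop_words=STOP_WORDS, n=10):
    """Return the n most frequent words in `text`, ignoring stop words."""
    # Lowercase the text and treat any run of letters/apostrophes as a word.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


# An invented sample sentence, not a quotation from Wuthering Heights.
sample = "Heathcliff said the moors and the Heights; Catherine said Heathcliff."
print(top_words(sample, n=3))
```

Passing a different `stop_words` set (for example, one that keeps “he”, “she”, and “I”) is an easy way to see how much the choice of list shapes the results.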

So to look at my results, we have the names: Heathcliff, Catherine, Linton. To a lesser extent we have Hareton, Edgar, Joseph. We also have a lot of words that have to do with speaking: said, say, replied. That may simply be a result of this being a novel, but I think it speaks to the greater focus on what the characters say to each other in Bronte’s work, and how important it is. We have a lot of master, Miss, Mrs, and Mr, again not entirely surprising but it certainly affirms the focus on status in Bronte’s novel. Who marries whom? Who gets respect? Heathcliff transitions between having one name to being both Mr. and Master Heathcliff.

A few intriguing words: father at 102 times. This is pretty interesting: is there a big focus on fatherhood in Wuthering Heights? There certainly is not one on motherhood: all of the Mrs. Earnshaws die pretty quickly (except, hopefully, Cathy II), Cathy II is never really a mother, and Isabella is not one. Of course Nelly Dean is a mother figure, though her take on this motherhood is dubious (she is pretty wretched to the young Heathcliff and Catherine, and she is not allowed to raise Hareton, though she unequivocally attempts to take care of Cathy II). But there is certainly a plethora of father figures: Mr. Earnshaw, Heathcliff himself, Hindley, Hareton, Edgar, Joseph.

Another one is eyes. Is there a detailed emphasis on looking in Wuthering Heights? There is certainly a focus on perception, and after double-checking Jstor there are several articles about the novel with “gaze”, “look”, “perception” or “eyes” in the title.

I’m interested in hearing your responses to this. Though, as I said, nothing is super groundbreaking, I think it highlights Voyant’s usefulness that I could find these words, and I can analyze them to draw conclusions. I’m also excited to see what the rest of you choose to analyze!

The Marriage of Digital and Paper, or, Without Contraries is no progression.



While this is probably my fourth or fifth close reading of Blake’s Marriage of Heaven and Hell, I must admit that I have never once managed to get through the entire thing exclusively on the Blake Archive. For this reading assignment I went back into the Alderman stacks for a copy of the Princeton edition of Blake’s illuminated books, knowing from experience that I wouldn’t be satisfied solely with the electronic archive. I think my reservations may have something to do with Eliza’s comment, below: “By transforming each plate of Blake’s books into an individual ‘object,’ … the Archive breaks down the sense of cohesion conveyed through an entire, bound work.” While an archive user can click through all of the plates of any given extant copy of the Marriage, the category “bound illuminated book” serves as only one among many ways to classify Blake’s plates. (As Eliza points out, one can search by terms or images: a search for all objects containing the word or image “eagle” returns a different kind of Blakean corpus altogether.)

This is not to say that I do not find the Archive useful, especially for a work like The Marriage of Heaven and Hell, which presents a somewhat complicated bibliographical situation. There are three* different plate orders in the nine extant (complete) copies. Trying to keep plate orders straight in one’s head (or even written down on paper) is difficult when one only has a single book, but it’s remarkably easy to compare my printed facsimile (Copy F, Pierpont Morgan Library) with the plates as ordered in the non-standard copies thanks to the online Archive.

Like Tess, I have always found the metacommentary in Blake’s Marriage striking–I am thinking here of the quotation she points to at the beginning of her post as well as of the “mighty Devil” who writes the sentence “with corroding fires”– the same sentence Blake has written himself with his brush and acid-resistant varnish before applying the corrosive acid to the copper plate. And of course we cannot forget the “Printing house in Hell.” These moments draw the reader/viewer’s attention to the fact that he or she is reading a book printed “in the infernal method by corrosives.” What happens when he or she views that book online? Obviously for most readers of Blake (and even for many very good Blake scholars) rarely will the opportunity to examine Blake’s physical copies present itself. Perhaps it’s a good thing that the electronic interface draws our attention to the digital facsimile’s status as a surrogate: I know it gives me pause and makes me reflect a bit more carefully on the status of the illuminated page (and book) as an object.

*The Blake Archive does not register the variant plate sequence of Copy E [: 1-3, 5-10, 4, 11, 14, 12-13, 16-27, 15]. It was rebound (into the normative sequence) in 1957 by Geoffrey Keynes.