Tools for a Digital Humanities
I’ve recently discovered Project Bamboo, an initiative that describes itself on the project website as a multi-institutional, interdisciplinary, and inter-organizational effort that brings together researchers in arts and humanities, computer scientists, information scientists, librarians, and campus information technologists to tackle the question:
“How can we advance arts and humanities research through the development of shared technology services?”
Come again? At first, the concept of shared technology services may seem a little vague. But a closer look at the full project proposal makes it fairly clear what is meant.
While academics use digital technology and the Net for a wide variety of things (research, teaching, publishing, communication), all of these uses have a degree of improvisation to them. Very few of the tools we use are developed specifically for the context of science and research, and sometimes this limitation shows.
For example, I’ve started to use del.icio.us to tag all the books I read in Google Books (see what I’ve recently tagged). Del.icio.us is an all-purpose bookmark management application, yet the ability to collaboratively create bibliographies with colleagues in the same subfield makes it a useful tool for researchers. Del.icio.us is not the only example: Google Documents can be used to work collaboratively on a publication, and SlideShare is great for making your presentations available directly and linking them to your CV (see my own), instead of just offering them for download. But for other, more specialized tasks there is still a severe lack of tools.
A few months ago, a colleague of mine needed a corpus (a collection of texts for linguistic analysis) for her research. Corpora exist in a wide variety of shapes and sizes, but the specific issue she was working on made it necessary for her to create an entirely new corpus (built from blog texts) instead of working with material from more traditional sources (newspapers, fiction etc). In addition, she also had only a basic working knowledge of corpora and the ways in which they can be used.
We approached the problem from two different angles. I helped her build a specialized corpus by using a piece of software that I had developed for my own work on blogs. To analyze the data, I pointed her to two interesting functions of Many Eyes, a web-based application for visualizing statistical information: tag clouds and word trees.
Tag clouds (or, in this case, word clouds) make it possible to visualize how often a word occurs in a piece of writing. Simply paste a text into the appropriate form field on the site and Many Eyes will do the rest (have a look at this cloud for Shakespeare’s complete works for a nice example).
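To make the idea concrete, here is a minimal sketch of the counting step behind a word cloud. It is not how Many Eyes itself works (I don't know its internals); the function name `word_frequencies` and the very simple tokenization are my own assumptions, chosen only for illustration.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=10):
    """Return the top_n most frequent words in text as (word, count) pairs."""
    # Assumption: a "word" is any run of letters or apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


sample = "To be, or not to be, that is the question."
for word, count in word_frequencies(sample, top_n=3):
    # A cloud would scale the font size by the count; here we just print it.
    print(word, count)
```

A real word cloud would typically also filter out very common function words, which this sketch does not attempt.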
Word trees visualize textual data in another way, allowing the reader in essence to navigate from one word to the next.
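The structure underneath a word tree can be sketched in a few lines as well. Again, this is only an illustration under my own assumptions, not the Many Eyes implementation: the hypothetical `build_word_tree` function collects, for a chosen root word, the words that follow each of its occurrences and merges shared continuations into branches. It ignores sentence boundaries for simplicity.

```python
import re


def build_word_tree(text, root, depth=3):
    """Build a nested dict of the words following `root`, up to `depth` words deep."""
    # Assumption: same simple tokenization as a basic word count.
    words = re.findall(r"[a-z']+", text.lower())
    tree = {}
    for i, word in enumerate(words):
        if word != root:
            continue
        node = tree
        # Walk the words after this occurrence, reusing branches that already exist.
        for follower in words[i + 1 : i + 1 + depth]:
            node = node.setdefault(follower, {})
    return tree


sample = "I have a dream today. I have a plan. I have no doubt."
print(build_word_tree(sample, "have", depth=2))
```

Each key is a word and each value holds the words that can come next, so "navigating from one word to the next" amounts to following the nested keys.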
There are of course specialized tools for corpus analysis that do a whole lot more than this in terms of statistics, and Many Eyes lacks a whole range of features that a genuine linguistic research tool would need (say, differentiating between different word classes). Yet Many Eyes has several advantages that the more specialized tools lack. It is
- web-based
- freely accessible
- easy to use
- versatile
In a sense, the points above make all the difference. Desktop-based software is under all sorts of constraints: you have to acquire it, install it, figure out how to get data into and out of it, keep it up to date and do all sorts of other “chores” that have little to do with your main objective. And then you can’t even share your data and collaborate as easily as you can on the Web. In other words, you’re using a program, not a service.
Of course Project Bamboo is not just about developing new tools (well, at least not in my mind). The assumption has long been that as soon as someone puts a useful service on the web, a user community will magically appear. This may be true of web video, blogging, wikis and many other services with a broad appeal, all of which can and should be used much more in academia. But with more specialized services, adoption is something that should be actively supported. In other words: we need to do more than just develop tools. We should work to popularize general-purpose services like del.icio.us and document ways in which they can be appropriated for research and teaching - and (most importantly) how they can be connected to one another. At the same time, just putting developers and researchers into a room together can produce impressive results.
A great example of both a mashup of services and a new way of looking at data is the Web version of the World Atlas of Language Structures (WALS). It’s a combination of Google Maps with the print version of the atlas, which shows the distribution of linguistic features across the world’s languages (say, which languages have definite articles). Not only is WALS Online more convenient to use than either the print version or the CD-ROM that comes with it (not to forget it is also free), but it makes entirely new uses possible. Think about collaborative annotation or linking research articles directly to WALS. Imagine a paper that lives on the Web and shows a map section from WALS in a side window, with the text flowing around it.
Developing services like WALS and getting them out there has the potential to completely transform academia in the long run, making it much more collaborative and transparent than it is today. It will be exciting to see what role Project Bamboo plays in that context.
Edit: I forgot to include a link to the project outline, plus a workshop transcript and some background information.



