
Rebuilding Rare Books in a Virtual Space: A Conversation with Curator Dot Porter

Posted by Rebecca Ortenberg

I like to joke that one of the reasons I decided to study history is that it was unlikely to involve very much math. So when Dot Porter, curator of digital research services in the Penn Libraries Schoenberg Institute for Manuscript Studies, started to tell me about manuscript collation formulas, I got worried. Formulas? That sounds pretty math-y to me. 

Thankfully, the math wasn’t as scary as I initially anticipated. “A collation formula is the traditional way of describing the structure of a codex,” she explains. Book-shaped manuscripts, which are known as codices, are made up of sheets of paper or parchment that have been stacked together and folded to make what’s called quires. Each sheet used to make a quire becomes two leaves, which are physically connected through the center of the quire. These quires are then sewn together to form the textblock. 

How do collation formulas work, exactly? Porter offers this example. “The formula ‘1(8), 2(6), 3(4)’ means that quire one has eight leaves—that is, four sheets folded together—and quire two has six leaves, and quire three has four leaves. A formula can also tell you if leaves are missing. It could say ‘-4.’ Or it might say ‘+1’ if a leaf has been added.” 
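
To make Porter's example concrete, here is a minimal Python sketch that reads a formula written in the simplified "quire(leaves)" style she quotes. The function names are hypothetical and this is not how VCEditor works internally. It assumes regular quires only, and it does not handle the "-4" or "+1" notations, since the exact syntax for those is not spelled out here.

```python
import re


def parse_collation_formula(formula):
    """Parse a simplified formula such as '1(8), 2(6), 3(4)'.

    Returns a list of (quire_number, leaf_count) pairs. This assumes the
    simplified notation quoted in the article, not a full cataloging standard.
    """
    quires = []
    for part in formula.split(","):
        match = re.fullmatch(r"\s*(\d+)\((\d+)\)\s*", part)
        if match is None:
            raise ValueError(f"Unrecognized quire description: {part!r}")
        quires.append((int(match.group(1)), int(match.group(2))))
    return quires


def sheets_in_quire(leaf_count):
    """Each folded sheet yields two leaves, so halve the leaf count."""
    if leaf_count % 2 != 0:
        raise ValueError("A regular quire has an even number of leaves")
    return leaf_count // 2


quires = parse_collation_formula("1(8), 2(6), 3(4)")
total_leaves = sum(leaves for _, leaves in quires)

for number, leaves in quires:
    print(f"Quire {number}: {leaves} leaves from {sheets_in_quire(leaves)} sheets")
print(f"Total leaves: {total_leaves}")
```

For the example formula, quire one comes out as eight leaves from four sheets, which matches Porter's description.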

Over the past few years, Porter has thought deeply about the information that those little strings of numbers can provide to librarians, manuscript researchers, and anyone interested in old books. To make this obscure information more accessible, readily available, and useful, she, along with the team behind the VisColl Project, developed a piece of software called VCEditor, which can create collation models: diagrams that show the user how a codex has been bound together, alongside the text and illustrations that appear on each page.  

Most recently, and with the help of a research fellowship from the University of Glasgow Libraries, Porter spent the month of April in the Glasgow Libraries Archives and Special Collections Reading Room, examining 119 manuscripts from their Hunterian Collection as part of her efforts to explore the possibilities offered by a tool like VCEditor.  

Math-phobia overcome, I was fascinated to learn more about her fellowship, so I recently sat down with her to talk about manuscript structures, digitization, and how she tries to bring very old books to life in the digital world. Here’s an excerpt from our conversation. 


Three colorful illustrations from an illuminated manuscript: A highly decorative D, a group of people on a gold leaf background gaze up at a pair of angels, and a smiling man wearing a robe and a crown
Illustrations from the Hunterian Psalter, University of Glasgow Libraries
To get us started, tell me a little bit about what you did during your fellowship in Glasgow.

The project that I had was to spend every day in the University of Glasgow Archives and Special Collections Reading Room, looking at their manuscripts and building collation models of them.

I was actually able to do a lot of work before I arrived because the Hunterian collection has an excellent printed catalog. It is old—published in 1908—but the descriptions of the manuscripts in that book have collation formulas and descriptions, and I was able to use that information to build collation models before I went to Glasgow.

When I arrived on my first day, I called a bunch of manuscripts, and then I was able to compare the actual manuscripts with the model that I built. Sometimes I only had to make minor changes to the formula. But sometimes it turned out the formula was completely wrong, and I had to redo it.

So I spent all month looking at manuscripts, which was fantastic.
That sounds amazing. Let’s back up for a second, though, and talk a little bit about collation models. What makes them so important, anyway?

One of the interesting things about physical collation information is that you can overlay it with things like textual contents and illustrations. For example, you might notice if some quires were written by one hand, and some quires were written by another hand, which might tell you how the text was divided up to be copied. Additionally, it was not uncommon at all for manuscripts to be taken apart, rebound, and pieces of them moved around, so you can also use this structural information to understand the history of the book.

But collation formulas alone are just not a great way to see that information.
And collation models can help with that?

Basically, when you open a collation model in a piece of software like VCEditor, it provides you with a diagram that shows you the organization of the leaves of the manuscript and the stuff on the leaves.

We put a lot of information in catalog records: we have collation formulas, we have lists of contents, we have lists of illustrations. A collation model puts that information together in a way that lets you see what’s happening in the book. Having models for every manuscript in a collection would be really helpful and potentially groundbreaking.

The Glasgow fellowship was my first chance to really spend a lot of time using VCEditor to build models and to try using them. I'm giving the data I collected back to Glasgow, and they're interested in incorporating that into their catalog records, which is the same thing I want to do at Penn.
Tell me a bit more about VCEditor.

VCEditor is built on the VisCodex system developed at the University of Toronto’s Old Books, New Science Lab, but updated and with new functionality added. It’s a piece of software that allows you to pretty easily build a collation model. You can say how many quires your manuscript has, and how many leaves are in each quire. You can also go in and indicate physical things that you just can’t show in a formula—like how leaves are attached. You can say there’s a leaf that’s stuck in and glued. You can show where the sewing is.

You can also create lists of terms that you can then map onto the model. For example, if you had a manuscript of the Bible, you could set up terms for each book of the Bible, and then you could show exactly where in the manuscript each one is. And if there are illustrations, you can show where the illustrations are.

And so essentially it gives you a visualization of the manuscript.
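
As a rough illustration of the kind of information Porter describes, the sketch below models quires, leaves, and terms mapped onto leaves in Python. Every name in it (Leaf, build_quire, tag, leaves_with, the "sewn" and "glued" labels, and the sample placement of Genesis and Exodus) is a hypothetical choice made for this example. It is not VCEditor's actual data model or file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Leaf:
    quire: int
    position: int                  # 1-based position within the quire
    conjugate: Optional[int]       # position of the leaf on the same sheet, if any
    attachment: str = "sewn"       # hypothetical label: "sewn" or "glued"
    terms: List[str] = field(default_factory=list)


def build_quire(quire_number, leaf_count):
    """Build a regular quire in which leaf i shares a sheet with leaf (n + 1 - i)."""
    return [
        Leaf(quire_number, pos, leaf_count + 1 - pos)
        for pos in range(1, leaf_count + 1)
    ]


def tag(model, quire, positions, term):
    """Map a term (a text, an illustration, a scribe) onto specific leaves."""
    for pos in positions:
        model[quire - 1][pos - 1].terms.append(term)


def leaves_with(model, term):
    """List (quire, position) for every leaf carrying the given term."""
    return [
        (leaf.quire, leaf.position)
        for quire in model
        for leaf in quire
        if term in leaf.terms
    ]


# The formula 1(8), 2(6), 3(4) from earlier in the article.
model = [build_quire(1, 8), build_quire(2, 6), build_quire(3, 4)]

# Something a formula cannot show well: a single leaf glued in, with no conjugate.
model[2].append(Leaf(quire=3, position=5, conjugate=None, attachment="glued"))

# Invented contents, purely for illustration.
tag(model, 1, range(1, 9), "Genesis")
tag(model, 2, [1, 2], "Genesis")
tag(model, 2, [2, 3, 4, 5, 6], "Exodus")

print(leaves_with(model, "Exodus"))
```

Because each leaf records its quire, its conjugate, and its terms, a question such as "which quires does this text span?" becomes a simple lookup, which is the sort of overlay a diagram makes visible.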
What are some things that you hope to see come out of this project?

The big thing that I would love to see come out of this is a normalization of including this sort of structural information in library catalogs. Right now, usually the only structural information included is a collation formula. And that’s...fine. It’s fine. But you can include more! You can include diagrams!

One of the things that I'm doing with this pilot project is showing how you can include more information without changing your whole library catalog system. What we're starting to do here at the Penn Libraries is to include links to collation models in our Franklin catalog records. That way, anyone viewing the record can look at the diagram of the manuscript. They can also take the data model file, and take a link to our digitized manuscript images, and they can load them into VCEditor, and then they can create a collation model themselves.

The structure of books is an important part of them, and it doesn't make sense to me that we're spending all this time and effort digitizing them and providing people with flat images, but not talking about the structure. It's like cutting pages out of a book and throwing them around the room, you know what I mean?
Completely. And what you’re doing seems to be making that structural information so much less abstract than a collation formula. With that in mind, how does this project relate to other work you’ve done at the Libraries, like working on major digitization projects, developing social media content, and hosting virtual programs?

My work is all about looking at different ways to bring this physicality of the manuscript into a virtual space. I was kind of pooh-poohing digital images earlier in our conversation, but they’re still important because you can see what's on the page, right? But you still need to put it together somehow.

When I talk about digitization, I always say that what we're doing is taking the book apart and then rebuilding it in a virtual space. My work is all about asking, what are all the different ways that we can rebuild it that will tell us something about this physical object? VCEditor is a really important part of that, but almost everything else I do is also about that.
Did you have any favorite manuscripts that you got to see in Glasgow?

One manuscript I got to see is called the Hunterian Psalter. Apparently, this is not a manuscript that just anybody gets to see! It's kept in a lock box that's in another locked box, in a safe, in a cage. And it’s just so freaking gorgeous. It is, by far, the most beautiful manuscript I have ever seen in person.

It’s also trimmed, which means that it was cut around the edges to make it fit into the binding at some point. When you work with manuscripts, you get really used to uneven edges. It's really unusual to have even edges, and this manuscript had the most even edges I have ever seen. I kept having to remind myself that it was real and not a facsimile because it was so perfect.

I will probably never ever get to like touch anything like that again. I was just really honored to be allowed to.