I thought it would be fun to see what Google poems come out of history, libraries and archives. So here you go. Curious to hear if these mean anything to you.
Bizarro World AHA would be totally into Open Access
The American Historical Association published a Statement on Policies Regarding the Embargoing of Completed History PhD Dissertations. I found myself wishing that there was some kind of bizarro world AHA. I imagine this bizarro world AHA might have made remarks based on these bullet points. These are just a rough draft. I encourage others to refine and further develop them.
Assert that the scholarly society’s goals are for the proliferation of knowledge not the proliferation of a particular kind of media (like monographs) or a particular business model (like selling academic monographs, primarily to university libraries).
Thank doctoral students who have made their dissertations accessible to anyone for supporting the value of sharing their research.
Note that dissertations are fundamentally different from the books a university press might edit, develop and revise based on them. Beyond that, assert that open access to dissertations in no way competes with books that are developed from dissertations.
Explain that the scholarly society would speak out against publishers who decided to blackball scholars who had made their dissertations publicly accessible through their universities’ repositories.
Suggest that it is fundamentally problematic that the tenure and promotion of historians is based directly on the commercial viability of academic books. Where scholars in other disciplines often control the primary means of tenure (journal articles), in fields like history that rely on book publication those decisions are (in large part) made by academic presses.
Call for members of the association to explore, and encourage the development of new models for the review and evaluation of a wide range of historical work, particularly those that make scholarship as widely accessible as possible.
Note that it is a fundamental problem that career development for historians in the academy is focused on the production of books that are read by few people, and encourage the community of historians to refocus their energy on how they can produce historical work that people will read and that can have an impact on society.
Often folks in the world of Libraries, Archives and Museums are asked to give an account of how many items they have. At which point the person asked must come up with a way of taking account. It’s good to take account. Decisions about what constitutes an item and what constitutes a collection or a series of items are often presented as if they were simple matters of fact when they are actually based on a set of decisions in a given context.
The point being, items don’t exist as much as they are made by applying a set of judgement calls to things that exist. Item-ness isn’t innate as much as it is the result of a process of making the world legible.
Items are made when we make judgement calls about the relative importance of these features:
Physical distinct-ness (The item represents a physical whole, a discrete physical or digital object)
Authorship/Creatorship (The item represents an intellectual whole, it comes from a particular author or creator or process)
(Are there other things you would add?)
Items and their item parts:
A book is an item, it is also a collection of pages which are themselves items, it might contain chapters, potentially written by different authors, each of those is an item, each of the figures printed in the book is an item.
An archive is an item, each folder in an archive is an item, the individual letters in that folder are items, the five letters that someone stapled together that are in that folder are an item too.
A videotape is an item, each of the individual recordings on that tape is an item.
A web archive is an item, each URL in the web archive is an item, each file in the web archive is an item, each directory is a kind of item.
A newspaper is an item, the articles in the newspaper are items, a year’s worth of newspapers bound together is an item.
24 hours of radio or television broadcast is an item, all of the individual shows are also items, each commercial is an item, each distinct block of air time with its individual commercials is an item.
A computer is an item, each of its hard drives is an item, the directories on the hard drive are items, the files in those directories are items, the sectors of the disk are items.
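To make the point about counting concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the Thing class, the count_items function, and the toy archive are hypothetical and not drawn from any real cataloging system. It shows the same nested material giving different item counts depending on which kinds of things we decide to call items.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Thing:
    """A hypothetical node: something that could be counted as an item."""
    kind: str
    parts: list[Thing] = field(default_factory=list)


def count_items(thing: Thing, item_kinds: set[str]) -> int:
    """Count the nodes whose kind we have decided to treat as an item."""
    own = 1 if thing.kind in item_kinds else 0
    return own + sum(count_items(part, item_kinds) for part in thing.parts)


# A toy archive (invented data): one collection, two folders, five letters.
archive = Thing("collection", [
    Thing("folder", [Thing("letter"), Thing("letter"), Thing("letter")]),
    Thing("folder", [Thing("letter"), Thing("letter")]),
])

# The answer to "how many items?" depends on the judgement call.
print(count_items(archive, {"collection"}))        # 1
print(count_items(archive, {"folder"}))            # 2
print(count_items(archive, {"letter"}))            # 5
print(count_items(archive, {"folder", "letter"}))  # 7
```

The material never changes between those four calls. Only the decision about what counts as an item does.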
Last week I was excited to participate as a panelist in a small conference at the Bard Graduate Center called Digital/Pedagogy/Material/Archives. The goal of the event was to bring together scholars working at the intersection of these four terms to think about how best to grapple with the challenge of archiving new forms of digital scholarship coming out of the classroom.
I’ve posted my slides on slideshare (and embedded them below) but in the course of our discussion I thought it would be helpful to share some links and brief points to round out my perspective both for those who participated in the conference and anyone who comes across my slides on the web.
Part of the goal of the meeting was to provide provocations on these terms. So what follows, and what is in my slides is intended to be a bit provocative.
I decided it would be best to share a particular case study of a student project as an example to think through what exactly someone might want to save in a digital object. I shared a place-based game that Laura Heiman & Caitlin Miller (students in my History and the Digital Age course) created. The course is a research methods course for doctoral students at American University and an elective for students in their Public History Master’s program. I picked these two blog posts to share for context on the project.
In my talk I tried to push up against some of the default assumptions that we bring to talking about preserving digital objects and artifacts. I tried to articulate a perspective on preservation that is grounded in first identifying what material objects (digital or analog) would be able to testify, or be the place where evidence exists, of the information you think is interesting. For background and context on where some of these remarks come from here are some pieces I’ve written and drawn from.
Significance is in the eye of the stakeholder: What’s important about this piece is the notion that objects don’t have significant properties. Instead, there are different properties of objects that are significant to different potential users/audiences/stakeholders.
Objects have an infinite number of properties, those properties include traces of the past that people can interpret as evidence for claims. With that said, to take a preservation action is to decide to act to ensure access to properties that one finds particularly important or significant.
The is of the digital object and the is of the Artifact: This is mostly a riff on the materiality of digital objects. What is important in this context is that there are a lot of different material artifacts that have traces of the things that we care about. I mention Ian Bogost’s example of the 11 different things that the video game E.T. the Extraterrestrial is to underscore that even defining what that game is requires an extensive set of decisions about what someone might want to do with it. Now if you want to preserve that video game you need to think about which of those 11 objects you want to have evidence about.
So, if you want to preserve E.T. the Extraterrestrial, or anything for that matter, it’s critical to realize that drawing boundaries around what the object itself is requires you to make decisions about the significance of particular properties for particular potential uses.
Glitching Files for Understanding: With all this said, digital objects are material objects and as such they have very real properties. Once you have identified an artifact that you care about you need to attend to the facts or features of its existence. In this post, I try to break apart many of the assumptions of screen essentialism, the idea that what digital objects look like on the screen is their essence. In contrast, digital objects are in fact bits of encoded information.
If we understand and respect the formal and forensic materiality of these objects in the context of the properties of them that we want to preserve then we are well on our way to making this work.
In this case, if you want to know about the Albert Einstein Memorial, the physical object of the statue is not where you find the evidence you need. This is not a particularly interesting point, but what is interesting is that the archival record of the memorial’s creation in the National Academy of Sciences archives is far less interesting as a source of information about what the memorial means to people than the very ephemeral information about the memorial in reviews of it on Yelp and TripAdvisor and the pictures people share of it on Flickr. This is to say that there is a wake of artifacts that exist in a network of meaning around this particular memorial and that many of the most interesting artifacts to get at what the memorial means to people are not the thing itself (the memorial) or the records about the thing itself (the documents in the archives) but are instead things people are saying about it (posts on Yelp & TripAdvisor) or doing with it (using it as a prop to take photographs). So, what you think matters about the object would push you to try to save different kinds of objects that can provide potential evidence on that subject.
Archives in Context and as Context: During our discussion we slid around a good bit in what we meant by archiving, archives, and preservation. So it’s worth linking out to Kate’s great piece on this subject as it provides good grist for pinning down what we mean at different moments about archives.
The main point here is that one needs to be careful in clarifying what one means with the word archive. Kate focuses on a few points, but one of the trickiest that comes up is the use of “archiving” as a verb or saying something “has been archived.” Archives are places, and similarly preservation is something that institutions do, not something that is accomplished. Nothing is preserved, there are only things that are being preserved. Nothing is archived, there are only things that are in archives.
If there is one thing we can count on it’s entropy. All material objects in the world are wearing down and degrading. Everything in the world eventually succumbs to its own inherent vices. At the end of the day the question is what traces of the world recorded on artifacts we want to commit ourselves to ensuring long term access to.
Wisdom of the Ancients: the web-comic-epigraph for my dissertation proposal, from XKCD
As of last Monday, I have successfully defended my dissertation proposal. In the context of my doctoral program, that means there is just one more hurdle to clear to finish. I’m generally rather excited about the project, and would be thrilled to have more input and feedback on it (Designing Online Communities Proposal PDF). I would be happy for any and all comments on it in the comments of this post.
Designing Online Communities: How Designers, Developers, Community Managers, and Software Structure Discourse and Knowledge Production on the Web
Abstract: Discussion on the web is mediated through layers of software and protocols. As scholars increasingly turn to study communication, learning and knowledge production on the web, it is essential to look below the surface of interaction and consider how site administrators, programmers and designers create interfaces and enable functionality. The managers, administrators and designers of online communities can turn to more than 20 years of technical books for guidance on how to design and structure online communities toward particular objectives. Through analysis of this “how-to” literature, this dissertation intends to offer a point of entry into the discourse of design and configuration that plays an integral role in structuring how learning and knowledge are produced online. The project engages with and interprets “how-to” literature to help study software in a way that respects the tension that exists between the structural affordances of software and the dynamic and social nature of software as a component in social interaction.
What’s Next?
At some point in the next year I will likely defend a completed dissertation. Places do dissertations differently; in my program the idea is that what I just defended is actually the first three chapters of a five chapter dissertation. So, at this point I need to follow through on what I said I would do in my methods section (to create chapter 4, results) and then write up how it connects with the conceptual context section (to create chapter 5, conclusions). So I should be able to grind this out in relatively short order.
At this point, I think this project should be interesting enough to warrant a book proposal. So I’ll likely start exploring putting together a book proposal for it in the next year as well. With that in mind, any suggestions for who might be interested in receiving a proposal on this topic are welcome.
I’m always interested to hear about how different scholarly communities are changing their communications practices. Things like PLOS ONE, and projects like PressForward are putting forward interesting and new models for when and where review happens and how we establish credibility and mark for quality. At the recent ScienceOnline conference I had the pleasure of chatting a bit with David Zureick-Brown, a mathematician and one of the founders of MathOverflow. Given how forward thinking much of the math community has been in this scholarly communication space I was thrilled to have a chance to pick his brain about similarities and differences between fields.
High Rates of Rejection in Math Journals: It works differently there
I was initially taken aback by something David said. It was something like “In my field, if you aren’t getting at least a 50% rejection rate on papers you submit to journals you aren’t aiming high enough.” The idea being that you should try to get your work into more prestigious journals, and many of these journals have two-year backlogs. In one situation, a paper that had largely positive reviews was rejected because it wasn’t important/exciting enough. This is exactly the thing that projects like PLOS ONE are set up to get around: to stop ranking papers by quality and instead do a minimal evaluation of whether they pass a minimum bar.
Publication happens before Publication
Initially I thought this sounds terrible! You submit your papers, wait for rejections, and then shift down a bit. Wouldn’t this hold up getting your work out there? But then I remembered that Math is different. At this point there is an expectation that you put all your work up on arXiv as soon as it is coherent enough to be called a paper. So this review process wasn’t holding up the publication process. As soon as work is done it’s published. People start reading it on arXiv. When I realized this I suggested “Oh, so publication in a journal is actually really just like a mark of quality, it’s like a merit badge.” Now, it’s a really important merit badge in the field, as the quality of the journals you are published in is a key factor for tenure and promotion. So getting a piece published in a particularly prestigious journal is effectively a seal of quality/approval that a given work matters and is significant for the field.
Small Pieces Loosely Kludged
This kludged together system seems like a great outcome. I can’t imagine anyone set out to make this system work this way. Anything can get published on arXiv, at which point anyone can see the work, cite the work, and reference it. The journals are now really just serving as amplifiers. The peer review of this work is actually post publication peer review. In this system it sort of doesn’t matter if journals want to become open access. If they let you put up pre-prints you’re good to go. The content of the journals is already published and open access. It only costs folks money to see the papers if they want to see the fancy PDFs.
It’s largely about when you call it a publication
So post publication peer review and pre-publication review are actually much more dependent on what we call the publication. Humanities and Social Science folks can just start to put all their stuff up in places like Academia.edu, or up on SSRN before submitting it. In many social sciences at this point this is a standard practice. While I’m a big fan of institutional repositories, I find the situations where the field specific platforms have emerged a bit more exciting. In these cases, the expectations and behaviors of scholars have shifted. It’s the norm to expect that you can see your colleagues’ work as quickly as it’s come together online in these spaces.
So why doesn’t this happen in History and the Humanities?
The fact that arXiv, SSRN and sites like RePEc and a few other disciplinary networks emerged for sharing scholarship in draft form and that nothing like them has taken off in the humanities is an indictment of the humanities. How come Mathematicians, Astronomers, Economists and a range of other fields could just set up places to share their work and humanists haven’t? As you can see from the Math situation, if a scholarly community just shifts to sharing pre-prints and everybody does it then it basically doesn’t matter what publishers want to do in terms of open access. This is to say that scholars have no one but themselves and their peers to point to if they don’t like how scholarly communication works. As the math case shows, we can patch our scholarly communication system one kludge at a time and end up with a system that embraces broad open access and rapid dissemination and retains merit badges for quality.
I may not be at AHA 2013, but that won’t stop me from participating on a panel. Below is a series of videos I created for an AHA 2013 panel, “Front Lines: Early-Career Scholars Doing Digital History.” Each video responds to a prompt for discussion. Both Miriam Posner and I are virtually participating, so I will be interested to hear how it ends up working out in meatspace. For those of you who stay up late, you can see me participate in the panel before it actually happens.
For starters it is probably a good idea for each of us to describe what it is we actually do and why we think what we do counts as digital history.
What is the relationship between your digital work and your larger body of historical scholarship?
How have digital projects changed your approach to degree requirements, publishing, promotion, (and tenure if relevant)?
Looking back at your education and training (both formal and informal) what are some of the most important experiences, the things that set you up with the skills you need to land the job you have?
What kinds of resources can institutions offer to early-career digital historians (especially institutions that are not home to DH centers)? Where can digital historians find important communities/resources outside of their institutions?
Here is the abstract for the session:
Front Lines: Early-Career Scholars Doing Digital History
Digital history’s growth in popularity has been accompanied by anxiety about how, and whether, these new methods and their practitioners will fit into traditional history departments. At the 2012 meeting of the American Historical Association, discussions of digital history often turned to questions about graduate education, the job market, publication, and promotion. This roundtable aims to approach these questions head-on, relaying experiences and recommendations from early-career scholars navigating these transitions.
Digital historians who elect to enter the professoriate often find themselves faced with a number of questions related to credentialing, tenure, and promotion. Many digital projects, for example, require publication venues other than the bound monograph. What sorts of avenues exist for digital publications? Will tenure committees be prepared to accept and evaluate these nontraditional projects? How many universities can be expected to offer the infrastructure and resources digital historians need?
The AHA’s leaders have suggested that for new Ph.D.s, one solution to the jobs crisis may lie in seeking careers outside of the professoriate — an option that digital historians have been particularly interested in pursuing. How can graduate students gain the experience to prepare themselves for these positions? If new Ph.D.s turn to these alternative academic careers, what can they expect? Can a historian in a nontraditional career expect to pursue a research agenda? What are these alternative jobs, and how well are new Ph.D.s adapting to them?
In this roundtable, a group of digital historians, in jobs both on and off the tenure track, will take up these questions, drawing on their own experience to suggest how we can prepare young digital scholars to enter various job markets, and how we can prepare employers to receive them.
Looking back on this year makes me exhausted. It looks like I managed to put up 34 posts on The Library of Congress Digital Preservation Blog as well as 11 posts on Play the Past and 24 posts here on my own blog. Seven different things I wrote ended up churning their ways through the process of becoming journal articles or book chapters, and by my count I was involved in 12 conferences (4 of which I was involved in planning). All of that led me to make the face below.
Photo of me from the OSI Newsletter
What follows is my attempt to make sense of it all and give anyone interested a rundown of what I’ve been up to. Looking back over what I have gotten into this year I think I can (broadly speaking) fit most of what I have worked on into one of two buckets: digital strategy for cultural heritage organizations and work trying to further advance digital history.
Digital Strategy for Cultural Heritage Organizations
Earlier this year I had a chance to interview Michael Edson from the Smithsonian for the LC blog. In working up one of my questions for that interview I think I’ve found one of the central questions that much of my work responds to.
Where do you think the home should be for digital media in a cultural heritage organization? Or, how do you think one should divide up roles and responsibilities when digital is increasingly becoming a key part of nearly every part of cultural heritage organizations? We are increasingly acquiring, preserving and exhibiting born-digital and digitized materials, using social media for outreach and public relations, supporting researchers and fielding reference questions through digital channels, and supporting all of that work with a substantive IT infrastructure. Who should be whose ramp and loading dock?
I was thrilled to have the opportunity to forward my own answer to this question when I was invited to keynote the Connecticut Digital Initiatives Forum. I think some of the features of the digital make it possible to apply a lot of the ideas that have come out of the open source software movement to how we do a lot of other work. I called this Do Less More Often: An Approach to Digital Strategy for Cultural Heritage Organizations. Everybody is trying to do too much at once. Find the low hanging fruit and pick it. Get the boxes off the floor. Release early and release often. Put things out there and find out how you should be doing things. I think this idea cuts across all parts of digital cultural heritage work. Everything from collecting, processing, arranging, preserving, making available, and exhibiting can be re-framed in this mindset.
As an example, alongside this year’s Digital Preservation Conference I helped to facilitate CURATEcamp processing, an unconference focused on bringing together notions of archival processing and computational processing. The event itself (minimally planned and programmed, and participant driven), to me, exemplifies do less more often. At the same time, some of the great work on applying More Product, Less Process for Born-Digital Collections and Born Digital Minimum Processing and Access is also a great fit, in that it offers ways to think about iteratively structuring work. A similar iterative approach is evident in the NDSA levels of digital preservation project, which went from a concept to a release candidate over the course of the year.
Another big area of strategy I did a good bit of thinking and writing about this year was crowdsourcing. You can see a recap of most of my Crowdsourcing Cultural Heritage posts here.
Advancing Digital History: Practices, Tools, and Data
This year I wrote a bit about how historical research is changing as a result of digital tools, I worked on building and designing a tool for historians, and I was thrilled to be able to participate in ongoing conversations about how historians thinking about
Where do we go from here?
So I think I’ve had a productive year. I imagine most of these threads will continue into the new year, but I am also excited about the prospect of getting involved in some other new and exciting projects both at LC and on my own.
The new ITHAKA report, Supporting the Changing Research Practices of Historians, is something that everybody working with cultural heritage collections should read. It’s full of good stuff, but in my opinion the key finding is that Google is now (by and large) the first step in historical research. Fred Gibbs and I reported on nearly the same finding in our recent paper on digital tools for historians. The Google search box is the first place historians go when they start their research; it plays a key role in their discovery process. This is particularly true for idiosyncratic terms, phrases and people’s names, which often turn up results from Google Books. So, the next time someone tells you that they want to make a “gateway,” a “portal” or a “registry” of some set of historical materials you can probably stop reading. It already exists and it’s Google.
The report makes some suggestions for what libraries and archives should do to help make their materials more accessible. Namely, that they work to integrate them with discovery tools and that they do what they can to make more finding aids accessible online. Both of these are valuable, but I think both goals fail to fully integrate the finding about Google and Google Books. If a library, archive, or museum wants its resources to be found as part of the discovery process, the initial phase of theory development, they need to be thinking about how they get their materials (or information about their materials) to show up in Google search results.
Are more and bigger online finding aids really an answer?
The report suggests that we cultural heritage organizations should be getting more finding aids up. That’s great, that would be useful. However, given the finding about Google, I think an even bigger potential lesson here is that if you want your collections to be used by researchers (digital or otherwise) the first thing you need to think about is not finding aids but making web pages about items, boxes, collections, etc. that will be discoverable in Google. In short, I would rather see a well-structured web page with a well-chosen title and persistent URL before one even begins to make a finding aid. This is not about SEO, it’s about doing very simple things that make for better HTML pages. Importantly, if an org makes a single PDF out of a finding aid for a collection and puts it on the web, that finding aid is almost useless as far as Google is concerned.
What would finding aids look like if they assumed the existence of the web and web search?
To me this raises a rather controversial question. If the goal of the finding aid is to help researchers find things and the way they do that is to search Google (which is really good at looking for particular things in HTML pages) then why is the HTML page a byproduct of the EAD XML finding aid and not the primary thing that the archivist authors? We designed an infrastructure around EAD and found ways to make that into HTML pages, but in the meantime Google came around and historians found that Google was so much more useful and powerful a way to search that they only consult the finding aids to round out the ideas they have already started developing. So, what would minimal archival processing for access look like if we thought first about creating an HTML web page for every collection or every box?
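As a rough illustration of how little an HTML-first approach would need, here is a minimal sketch in Python. The function name collection_page, its parameters, and the sample collection are all hypothetical, made up for this example; this is not a description of any existing archival system or of how anyone generates pages from EAD. It writes out one plain HTML page with a descriptive title, a summary, and a list of boxes.

```python
from html import escape


def collection_page(title: str, summary: str, boxes: list[str]) -> str:
    """Render one plain HTML page describing a collection."""
    # Escape every value so characters like & and < stay valid HTML.
    box_items = "\n".join(f"      <li>{escape(box)}</li>" for box in boxes)
    return f"""<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>{escape(title)}</title>
  </head>
  <body>
    <h1>{escape(title)}</h1>
    <p>{escape(summary)}</p>
    <ul>
{box_items}
    </ul>
  </body>
</html>
"""


# Invented sample data, for illustration only.
page = collection_page(
    title="Jane Example Papers, 1901-1950",
    summary="Correspondence and diaries (invented sample data).",
    boxes=["Box 1: Letters, 1901-1920", "Box 2: Diaries, 1921-1950"],
)
print(page)
```

Saving that output at an address that never changes would cover the persistent URL part. The design choice is simply that the page is the thing being authored, and anything fancier gets layered on later.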
I’ve been dabbling a bit with Cinemagram this week. It’s a free app that lets you create cinemagraphs. Their tagline is “Create a stunning hybrid between photo and video” and it does a nice job of letting you create something that does just that. It’s done a nice job of getting me to see my walk to and from work a little bit differently.
You record short 2 second videos and then draw a mask on the photo to identify the part of the image you want to be animated. The rest of the image stays still. The end product is an animated gif. For example, in the image above I set it to keep counting down at the end of the walk signal. You’re always just about to have the light switch to red.
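For the curious, the masking idea can be sketched in a few lines of Python. This is only my guess at the general technique, not how the app actually works, and the function name cinemagraph_frames and the tiny 2x2 "images" are invented for illustration. Each output frame takes its pixels from the video where the mask is on and from the still image everywhere else.

```python
def cinemagraph_frames(still, video_frames, mask):
    """Return frames in which only the masked pixels change over time.

    still, each video frame, and mask are equally sized 2D lists.
    """
    frames = []
    for frame in video_frames:
        composite = [
            [frame[y][x] if mask[y][x] else still[y][x]
             for x in range(len(still[0]))]
            for y in range(len(still))
        ]
        frames.append(composite)
    return frames


# Tiny made-up example: a 2x2 still, two video frames, one masked pixel.
still = [[0, 0], [0, 0]]
video = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]
mask = [[True, False], [False, False]]

frames = cinemagraph_frames(still, video, mask)
print(frames)  # [[[1, 0], [0, 0]], [[2, 0], [0, 0]]]
```

Only the top-left pixel moves from frame to frame. Looping those frames is what would give the animated gif its one small patch of motion.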
It’s an interesting process. It gets you to see spaces in different ways. It’s fun to look for things that can run as repetitive motions in scenes where a lot of other action is held still. For example, getting things like the car in the image to blur from motion while keeping the lightly flapping flag going.
It’s tricky to get them to pan out exactly right. But it is a lot of fun to try and find things that you can play back and forth with.
By focusing in on very little movements, like rustling leaves or the lights on a police car, you end up with things that have this strange quality of being something between a photograph and a video. Aside from being neat, it’s rather easy.
I think it’s always fun to get a new toy like this that prompts you to look around at the world a little differently, to try and see with a different eye.