Annotating PDFs Without URLs

By jeremydean | 13 June, 2016

For some time now, you’ve been able to annotate PDFs using Hypothesis, both on the web and locally, with hosted PDFs syncing with local instances and various local instances syncing with each other. Jon Udell wrote about this magical feature here over a year ago.

For those who tried it out, however, there was one annoying snag, especially if you were trying to lead a large group (of students, say) through the process: users had to create an annotation on a local PDF before they could view any pre-existing annotations created elsewhere, whether in other local instances or where the PDF was originally hosted. (The same would happen for identical PDFs hosted at two distinct URLs.)

So it was possible to be sent a PDF that had supposedly been annotated, open it, activate Hypothesis and not see any annotations. Even if you knew about the need to create an annotation to view annotations, you were entering that conversation blindly or else creating a dummy annotation to be deleted later. This added step admittedly took away some of the “mind-blowingness” of annotating PDFs across multiple locations using Hypothesis.

Now that step is no longer necessary.

What’s changed?

We used to use the URL as the primary identifier of PDFs. That’s what the Hypothesis client would search for in the database to anchor annotations on a page. Now we use the digital fingerprint that is baked into PDFs when they are generated, as part of the spec for the format. Previously we used this fingerprint only as a secondary identifier, to map local PDFs to hosted ones or PDFs hosted at different URLs to each other. That mapping only took effect once a new annotation was created on the copy, which is what caused the lag between new annotation creation and the appearance of pre-existing annotations. This shift from URL to PDF fingerprint makes annotations on PDFs far more portable across the web.

For example, if the same scholarly journal article is housed at two different repositories, annotations created at either location will show up at the other (assuming both PDFs have the same fingerprint, which is not always the case). Public annotations created on a local version of the same PDF will also be immediately viewable. If I annotate an essay at a permanent URL on JSTOR, then download the article and share it via email, or host it on my own WordPress site, my annotations will anchor through all incarnations of that PDF.
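To make the idea concrete, here is a minimal sketch in Python of what keying annotations by fingerprint rather than by URL looks like. It is a toy in-memory model, not the Hypothesis codebase: the `AnnotationStore` class, its methods, the fingerprint value and the example locations are all made up for illustration.

```python
from collections import defaultdict


class AnnotationStore:
    """Toy in-memory store (hypothetical) that indexes annotations
    by PDF fingerprint instead of by URL."""

    def __init__(self):
        self._by_fingerprint = defaultdict(list)

    def add(self, fingerprint, location, text):
        # The location (URL or local path) is kept only as metadata;
        # the fingerprint is the lookup key.
        self._by_fingerprint[fingerprint].append(
            {"location": location, "text": text}
        )

    def lookup(self, fingerprint):
        # Return every annotation made on any copy with this fingerprint.
        return [a["text"] for a in self._by_fingerprint.get(fingerprint, [])]


store = AnnotationStore()
FINGERPRINT = "a1b2c3"  # made-up value standing in for a real PDF fingerprint

# The same PDF annotated at a hosted URL and as a local file.
store.add(FINGERPRINT, "https://repo-one.example/article.pdf", "Key claim here")
store.add(FINGERPRINT, "file:///home/me/article.pdf", "Check this citation")

# Someone opening a third copy sees both annotations right away,
# without first creating one of their own.
notes = store.lookup(FINGERPRINT)
print(notes)
```

Because the lookup never consults the location, a copy at a brand-new URL or on someone's laptop resolves to the same set of annotations, provided its fingerprint matches. A copy with a different fingerprint would come back empty.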

Though a glitch, the old behavior was not technically a “bug,” and it had its benefits. Some users wanted to create different instances of the same PDF. The practice is common in classroom use of Hypothesis, as educators move from class to class and semester to semester. Of course, our groups feature can be used to create different layers on a single document. But we’re also looking into how to help folks “rewrite” PDFs easily so that new instances with new fingerprints can be generated. Stay tuned.
