Why does bioscience need open annotation?

By memartone | 21 January, 2016

Bioscience and biomedicine are vast and conducted across thousands of platforms and tools. The biomedical literature currently stands at over 20 million articles, with more than 1 million added per year. The biomedical corpus is scattered across multiple publishers, platforms and workspaces. The fragmentation of knowledge and information is recognized as a major barrier to discovery in biomedicine, impacting key challenges such as reproducibility and limiting our capacity to apply state-of-the-art computational techniques for complex data integration.

I joined Hypothesis because I believe that an open annotation layer can serve as a dynamic, unifying technology for addressing some of the structural weaknesses in our current biomedical platforms.

Annotation is the process of enriching research objects (narrative, data, code) through the addition of human knowledge. Annotation, particularly scholarly annotation, is distinct from current commenting systems in that annotations are addressed to a specific portion of a research object, e.g., a statement, an object in an image, a gene sequence.

Annotation is, in fact, integral to the biosciences and science in general. Every day, scientists are reading and underlining, taking notes and cross-referencing documents throughout the scientific workflow. Peer reviewers, editors, data analysts, technicians, PIs, funders: everyone annotates. Scientists, increasingly with the help of computers, are annotating data with meaning or details. Whether purely manual or algorithm-assisted, annotations capture that most precious of resources: human knowledge.

Web-based annotation

Current annotation systems tend to be built for specific applications using proprietary software and custom data formats. Unlike web pages, which can be globally searched and shared through a common format and protocol, annotations are currently locked within each system and are not easily shared or searched. And in the new paradigm of networked, open science, knowledge that cannot be shared is largely wasted.

Many scientists are now realizing the power of cloud-based platforms for collaborative knowledge creation, e.g., Google Docs, GitHub, Mendeley. The appeal of these platforms lies not only in their accessibility, but in the collaborative tools for editing and feedback. Few Google Docs are shared without a stream of annotations in the margins.

These annotations are more than just scribbles; they are dialog boxes that open up conversation channels around specific issues. Linked to information conduits like email or social media, they support asynchronous collaboration and highly efficient use of scarce attentional resources.

At Hypothesis, we are implementing a new paradigm, Web Annotations, to create an open knowledge layer on top of the web. With Hypothesis, any web page or PDF can become an interactive workspace. Although annotations are currently stored on Hypothesis’ servers, they are owned by whoever creates them, not Hypothesis.
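
To make the idea of an annotation addressed to a specific portion of a document more concrete, here is a minimal sketch in Python. It assumes the JSON shape described by the W3C Web Annotation Data Model (a body, plus a target with a text-quote selector). It is an illustration only, not a description of how Hypothesis stores annotations. The helper name make_annotation, the example URL and the example text are all hypothetical.

```python
import json


def make_annotation(source_url, exact, note, prefix="", suffix=""):
    """Build a dict shaped like a W3C Web Annotation (illustrative sketch).

    The annotation points at a quoted passage in a document rather than at
    the document as a whole. `prefix` and `suffix` are optional context
    that help locate the quote when the same words occur more than once.
    """
    selector = {"type": "TextQuoteSelector", "exact": exact}
    if prefix:
        selector["prefix"] = prefix
    if suffix:
        selector["suffix"] = suffix
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The human knowledge being added.
        "body": {"type": "TextualBody", "value": note, "format": "text/plain"},
        # The specific portion of the research object being annotated.
        "target": {"source": source_url, "selector": selector},
    }


# Hypothetical example: a note attached to one statement in an article.
annotation = make_annotation(
    source_url="https://example.org/article.html",
    exact="the antibody was validated in knockout tissue",
    note="Which knockout line was used? The methods section does not say.",
    prefix="We confirmed that ",
)

print(json.dumps(annotation, indent=2))
```

Quoting the annotated text, rather than recording a character offset, is one way an annotation can stay attached to the right statement if the surrounding page changes slightly. Because the result is plain JSON, it can in principle be shared and searched outside the system that created it.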

Our program in biosciences seeks to harness the power of annotation, re-engineered for the web, to improve communications across current silos in biomedicine. Whether simple note taking, making science more accessible, improving the effectiveness and efficiency of peer review or deploying a dedicated conversation channel for reproducibility, we think that open annotation in general, and Hypothesis in particular, is an important new capability that should be broadly deployed across biomedicine.
