Supporting the Annotation of Medical Reports with Biomedical Annotation Tools

Reducing Costs and Time of Medical Annotations

In this blog post we look at how biomedical annotation tools can support, organize, and speed up the annotation of medical reports. Which annotation tools are available for the biomedical domain? What advantages do they offer? Are there any cons? Let’s find out!

The need for human annotations

In recent decades, exascale volumes of biomedical data have been produced, a large part of which is available only as unstructured text. Traditionally, health-care professionals rely on free-text reporting to communicate patient information, such as diagnoses and treatments. Narrative clinical reports written as free text are human-readable but not machine-readable, which limits the effective reuse of data that is essential for medical decision making and support. To process the large amount of unstructured biomedical data in clinical reports and Electronic Health Records (EHRs), Information Extraction (IE) algorithms and Natural Language Processing (NLP) techniques have been developed. However, most of these methods require manually annotated datasets, which are expensive and time-consuming to obtain because they demand expert annotators with extensive experience in biomedical content. Annotation tools capable of supporting, organizing, and speeding up the annotation process are therefore fundamental to advancing NLP research in the biomedical domain. This is even more true for the histopathology domain, which requires fine-grained annotation systems that can be customized to the needs of physicians and experts.

So, let’s look at some of the available biomedical annotation tools, along with their pros and cons!

Biomedical annotation systems

brat: a well-known general-purpose text annotation tool that has been widely used in the biomedical domain. brat features a high-quality annotation visualization based on vector graphics that provides scalable detail and rendering. Furthermore, brat comes with an intuitive annotation interface and with versatile annotation support – which makes it fully configurable and capable of supporting most text annotation tasks. On top of this, brat supports standard approaches to integrate the results of fully automatic NLP techniques into the annotation workflow.
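To make the integration of automatic NLP output more concrete, here is a minimal Python sketch that writes predicted entity spans in brat's standoff format, where each text-bound annotation is a line of the form ID, TAB, type and character offsets, TAB, covered text. The `Entity` class, the `to_brat_standoff` helper, the entity labels, and the example report are all assumptions made for illustration; they are not part of brat itself.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """A predicted span: label plus character offsets (end is exclusive)."""
    label: str
    start: int
    end: int


def to_brat_standoff(text: str, entities: list[Entity]) -> str:
    """Serialize spans as brat text-bound annotations, one 'T' line each."""
    lines = []
    for i, ent in enumerate(entities, start=1):
        covered = text[ent.start:ent.end]
        # Format: T<id> TAB <type> <start> <end> TAB <covered text>
        lines.append(f"T{i}\t{ent.label} {ent.start} {ent.end}\t{covered}")
    return "\n".join(lines)


# Hypothetical report and predictions, e.g. produced by an NER model.
report = "Adenocarcinoma of the colon with mild dysplasia."
predicted = [Entity("Diagnosis", 0, 14), Entity("AnatomicalSite", 22, 27)]

ann = to_brat_standoff(report, predicted)
print(ann)
```

Saving this output as a `.ann` file next to a `.txt` file with the same base name lets annotators review and correct the suggestions in brat rather than starting from a blank document. The labels used here would also need to be declared in the project's annotation configuration.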

Even though brat has been used in several biomedical projects, it was designed for general-purpose annotation. As a result, its features are not always well suited to physicians and other experts in the medical domain.

TeamTat: a web-based biomedical text annotation tool for managing team annotation projects. TeamTat handles multi-user, multi-label document annotation and reflects the full production life cycle. Project managers can specify the annotation schema for entities and relations, select the annotators, and distribute documents anonymously to prevent bias. Furthermore, TeamTat displays figures from the full text so that annotators can consult them easily. With TeamTat, multiple users can work on the same document independently in their own workspaces, while the team manager tracks task completion.

A downside of TeamTat is that it requires the user to manually install and configure some frameworks and software packages (e.g., Ruby, Rails, and MySQL) – a process that reduces its ease of use from a user’s perspective.

MedTAG: a web-based collaborative biomedical annotation tool for diagnostic reports that is open source, platform-independent, and free to use and distribute. MedTAG provides four annotation types: Mentions, Concepts, Linking, and Labels. Each type corresponds to a different kind of semantic annotation, at a coarser or finer level of granularity than the others. In terms of functionality, MedTAG supports multi-user, multi-label document annotation, as well as different ontologies and concepts that can be used in the annotation process. On top of this, MedTAG supports multiple languages. Finally, MedTAG comes with a built-in automatic annotation tool that covers three histopathology use cases: colon, uterine cervix, and lung diagnostic reports.
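One way to picture how the four annotation types relate is as a small data model. The Python sketch below is only an illustration under stated assumptions: it treats Labels and Concepts as report-level annotations, Mentions as text spans, and Linking as pairs that connect a mention to a concept. The class names, fields, and the concept identifier `EX:0001` are hypothetical and do not reflect MedTAG's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mention:
    """A span of report text (finest granularity); end is exclusive."""
    start: int
    end: int
    text: str


@dataclass(frozen=True)
class Concept:
    """An ontology concept; the identifier scheme here is hypothetical."""
    concept_id: str
    name: str


@dataclass
class ReportAnnotation:
    report_text: str
    labels: set = field(default_factory=set)      # Labels: report-level classes
    concepts: list = field(default_factory=list)  # Concepts: report-level concepts
    mentions: list = field(default_factory=list)  # Mentions: text spans
    links: list = field(default_factory=list)     # Linking: (mention, concept) pairs

    def add_mention(self, start: int, end: int) -> Mention:
        """Record a span and return it so it can be linked later."""
        mention = Mention(start, end, self.report_text[start:end])
        self.mentions.append(mention)
        return mention

    def link(self, mention: Mention, concept: Concept) -> None:
        """Attach a concept to a specific mention."""
        self.links.append((mention, concept))


# Hypothetical example report.
report = ReportAnnotation("Adenocarcinoma of the colon with mild dysplasia.")
report.labels.add("Cancer")            # coarse, whole-report label
colon = Concept("EX:0001", "Colon")    # made-up concept identifier
report.concepts.append(colon)
site = report.add_mention(22, 27)      # the span "colon"
report.link(site, colon)               # fine-grained: span tied to a concept
```

Read this way, a label says something about the whole report, while a link ties a single span to a single concept, which is the coarser-to-finer progression described above.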

One limitation of MedTAG concerns the file format of input documents: it currently supports only plain-text documents. Support for PDF annotation would be particularly useful when annotating scientific papers.

GitHub repositories

So these were some great annotation tools!

If you are interested in (any of) them you can also check their GitHub repositories, which are listed below.