March 22, 2011 — Ottawa, Ontario
A team from the NRC Institute for Information Technology (NRC-IIT), led by researcher Berry de Bruijn, posted outstanding results in the most recent medical text mining competition, organized by i2b2 (Informatics for Integrating Biology and the Bedside), an NIH-funded National Center for Biomedical Computing. At an i2b2 workshop in November 2010, it was announced that NRC-IIT had outperformed 40 teams from 4 continents.
The Fourth i2b2/VA Challenge evaluation used authentic but de-identified patient data from four hospitals in the USA and focused on the automatic extraction of medically relevant information from that data. More specifically, the challenge involved three tasks in the electronic extraction of medical information:
- extracting medical concepts (problems, tests, and treatments);
- annotating medical problems as present, absent, possibly present, etc.; and
- establishing in what way two concepts (e.g., a problem and a treatment) are related.
For all three tasks, the NRC-IIT systems proved to be the state of the art. NRC-IIT Research Officer Colin Cherry’s system placed first on Task 1. The system created by Research Officer Berry de Bruijn placed first on Task 2. On Task 3, Research Officer Xiaodan Zhu’s system placed second, but close enough to first place that the difference was not statistically significant. Among the long list of fellow competitors were companies, universities, and medical organizations including Mitre, University of Leeds, Brandeis University, Arizona State University, Vanderbilt University, Oregon Health & Science University (OHSU), Veterans Affairs Canada, Kaiser Permanente, United States National Library of Medicine, Fraunhofer, Computer Sciences Laboratory for Mechanics and Engineering Sciences (LIMSI), and University of Sydney.
Despite their differences, the three NRC-IIT systems shared a few noteworthy strengths. All of them used machine-learning algorithms, designed to be sensitive to many more data features than the competing systems were. The algorithms achieved this sensitivity without needing excessive computing resources and without showing over-training effects.
Participation in the i2b2 challenge was a natural application of the NRC-IIT Interactive Information group’s expertise. In several projects, its researchers have developed intelligent computational methods for processing medical text. This world-leading work has also led to commercially licensed software for extracting information from patient data.