AMIA 2014 Affiliate Meeting: i2b2/UTHealth

Shared-Task and Workshop on Challenges in Natural Language Processing for Clinical Data

Friday, November 14
8:00 a.m. – 5:00 p.m.
For information on attending this meeting please contact

The 2014 i2b2 workshop brings together researchers interested in the four challenge tracks and gives the participants of those tracks the opportunity to present and discuss their systems.

The 2014 i2b2/UTHealth challenge consisted of two traditional NLP tracks and two applications tracks:

Track 1: De-identification

Removing protected health information (PHI) is a critical step in making medical records accessible to more people, yet it is a very difficult and nuanced task. This track addressed the problem of de-identifying medical records using a new set of more than 1300 patient records, with surrogate PHI for participants to identify.

Track 2: Identifying risk factors for heart disease over time

Medical records for diabetic patients contain information about heart disease risk factors such as high blood pressure and cholesterol levels, obesity, smoking status, etc. This track aimed to identify the information that is medically relevant to heart disease risk and to track the progression of these risk factors over sets of longitudinal patient records.

Track 3: Software Usability Assessment

This new track, introduced this year, tests the usability of software. It is meant to evaluate how easily users learn and use the i2b2 challenge software to achieve their goals.

Track 4: Novel Data Use

The data released for Tracks 1 (de-identification) and 2 (heart disease risk factors) of the 2014 i2b2 challenge are unique among publicly available clinical data sets in that they represent longitudinal data selected by an MD for the purpose of identifying risk factors in a diabetic population. However, these data can also be used to answer other questions about these patients.
This track is for participants who want to build on their existing systems, or on the systems developed for Tracks 1 and 2, in order to answer new questions with these data.

Some example questions include (but are not limited to):

  • Are the medications having the desired effect?
  • Is the patient responding to treatment for their hypertension? Is the patient responding to treatment for their lipids?
  • Is the patient experiencing an adverse effect from their medications?
  • Are some risk factors more highly correlated with CAD than others?

i2b2 links

Tracks 1&2
Tracks 3&4

Organizing Committee:

Ozlem Uzuner, co-chair, SUNY at Albany
Amber Stubbs, co-chair, SUNY at Albany
Hua Xu, co-chair, University of Texas, Houston
John Aberdeen, MITRE
Susanne Churchill, Partners Healthcare
Cheryl Clark, MITRE
Dina Demner-Fushman, NIH/NLM
Joshua Denny, Vanderbilt University
Bill Hersh, Oregon Health and Science University
Lynette Hirschman, MITRE
Isaac Kohane, Partners Healthcare
Vishesh Kumar, Massachusetts General Hospital
Anna Rumshisky, UMass Lowell
Stanley Shaw, Massachusetts General Hospital
Peter Szolovits, MIT
Meliha Yetisgen, University of Washington
Kai Zheng, University of Michigan