• March 12 - 15, San Francisco

    AMIA 2018 Informatics Summit

    Translational | Clinical Research | Implementation | Data Science

AMIA 2018 Informatics Summit Tutorials

as of January 2, 2018

8:30 a.m. – 12:00 p.m.

T01: Computational Resources for Personalized Genomics: High Performance Clusters and Bioinformatics Resources for Analysis and Functional Interpretation of Next-generation Sequencing Data

E. Crowgey, Nemours Alfred I. duPont Hospital for Children; S. Volchenboum, University of Chicago; J. Romano, Columbia University; K. Ross, Georgetown University Medical Center; S. Polson, University of Delaware; C. Wu, University of Delaware

Precision medicine continues to be a driving force for utilizing complex genomic data at the bedside, and analyses of these high-density data require high-performance computing workflows. Advancements in DNA sequencing technology and computing capabilities have propelled the use of genetic and genomic data in precision medicine efforts. The first part of this tutorial will review computational pipelines for processing Illumina paired-end whole exome sequencing data. Specifically, attendees will be provided access to biomix, a high-performance cluster hosted by the University of Delaware, as instructors review a pipeline for processing raw FASTQ files into variant call format (VCF) files: bwa, samtools, Picard, GATK, SnpEff / ClinEff. The NGS data are simulated for a clinically relevant autosomal recessive disorder with co-occurring common drug-metabolizing variant alleles. The second part of this tutorial will involve an overview of web-based bioinformatics resources, combining data mining, text mining, network analysis and visualization tools for translating variant data into biological knowledge.

T02: Making NCBI's GEO Open Data FAIR and Useful: Translating Big Data into Precision Medicine with STARGEO

M. Panahiazar, D. Hadley, University of California San Francisco

STARGEO.org makes GEO data findable, accessible, interoperable and reusable to ultimately facilitate knowledge discovery in precision medicine. STARGEO.org is a novel web-based application for describing GEO sample phenotypes uniformly across different studies and for defining robust differentially expressed gene signatures of disease by meta-analysis of gene expression. STARGEO.org is designed to be for crowd curation of open data what GitHub has been for open source code development: i.e., a community of curators that can openly build large sets of annotations together. The Search Tag Analyze Resource for GEO (STARGEO.org) is an open online platform funded by the NIH’s Big Data to Knowledge (BD2K) consortium to use open data to characterize the functional genomics of disease.

T03: Developing Natural Language Processing Solutions to Facilitate Clinical and Translational Research

H. Xu, The University of Texas Health Science Center at Houston; H. Liu, Mayo Clinic

Over the last few decades, growing adoption of Electronic Health Record (EHR) systems has made massive clinical data available electronically. However, over 80 percent of clinical data are unstructured (e.g., narrative clinical documents) and are not directly accessible to computerized clinical applications. Therefore, natural language processing (NLP) technologies, which can unlock information embedded in clinical narratives, have received great attention in the medical domain. Many NLP methods and systems have been developed, yet it is still challenging for new users to decide which methods or tools to pick for their specific applications. In fact, there is a lack of best practices for building successful NLP applications in the medical domain. In this tutorial, we will introduce methods, tools, and best practices for building NLP solutions for clinical and translational research. We will start with an introduction to basic NLP concepts and available tools, and then focus on important applications of NLP in the medical domain such as phenotyping. We plan to use lectures, demonstrations and hands-on exercises to cover the basic knowledge and tools, and use case studies to illustrate important trade-offs in the design and implementation of clinical NLP applications. Each instructor has over 10 years of experience in clinical NLP research and application, and they will share their recommendations for building successful NLP applications in clinical research.

T04: Blockchain for Secure Patient-centered Data Capture and Sharing

A. Das, IBM Research, Geisel School of Medicine; O. Choudhury, IBM Research

There is a growing flood of data being collected through mobile health (mHealth) devices, Internet of Things (IoT) platforms and direct-to-consumer genomics, all of which provide novel opportunities to improve health and revolutionize healthcare. These technologies, however, create new security and privacy issues for provisioning patient-collected data for research. In particular, ensuring the control, provenance, and traceability of data is critical for enabling trust and reliability in information managed across multiple parties. In this tutorial, we will present blockchain, a cryptographically secured distributed ledger, as a secure, reliable, robust, and decentralized framework for data sharing, and show why it is being rapidly adopted across industries. In the first half, we will provide an introduction to blockchain technology and present various implementation approaches. In the second half, we will focus on Hyperledger, the Linux Foundation's open-source blockchain project, and illustrate how Hyperledger Fabric provides a solution for common use cases in mHealth and health IoT. We will give a live demonstration of how to set up a blockchain network for a simple use case and simulate data sharing transactions. We will conclude by discussing open research issues in adopting blockchain as foundational data infrastructure for biomedical and healthcare research.
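As a rough illustration of the ledger idea described above, the following minimal Python sketch chains records together by hash so that later tampering can be detected. It is not Hyperledger Fabric and does not use any Fabric API; it has no network, consensus, or access control. The function names, the record fields, and the sample patient and study identifiers are all hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """Hash a block's contents deterministically with SHA-256."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block that commits to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_HASH
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["payload"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical data sharing transactions for a single patient.
ledger = []
append_block(ledger, {"patient": "p-001", "action": "share", "recipient": "study-A"})
append_block(ledger, {"patient": "p-001", "action": "revoke", "recipient": "study-A"})
```

Calling `is_valid(ledger)` recomputes the chain from the first block; editing any stored payload afterwards changes that block's hash and causes the check to fail, which is the provenance and traceability property the abstract refers to.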

T05: PCORnet: Infrastructure, Research Studies, Engagement

A. Solomonides, NorthShore University HealthSystem; A. Kho, Northwestern University; C. Bailey, Children’s Hospital of Philadelphia; K. Kim, University of California Davis; M. Zirkle, Patient-Centered Outcomes Research Institute; S. Barbash, Patient-Centered Outcomes Research Institute

The Patient-Centered Outcomes Research Institute (PCORI) has funded infrastructure development and demonstration studies in patient-centered comparative effectiveness research through the National Patient-Centered Clinical Research Network (PCORnet). The network encompasses 13 Clinical Data Research Networks and 20 Patient-Powered Research Networks. The present phase of the program includes support to add several health data networks to this large, collaborative initiative, which is designed to link researchers, patient communities, clinicians, and health systems in productive research partnerships that leverage the power of large volumes of health data maintained by the partner networks. This tutorial will demonstrate the range of architectures deployed, the informatics tools developed, and the many study designs adopted in research through this infrastructure. Studies include large longitudinal observational studies, pragmatic clinical trials conducted within delivery systems, rapid-cycle research in concert with health systems and plans, and surveillance studies across geographic areas and over time. The panel will highlight significant accomplishments across the PCORnet community.

T06: HL7 FHIR® for the Data Scientist

C. Jaffe, HL7; J. Mandel, Verily (Google Life Sciences); S. Huff, Intermountain Health; G. Alterovitz, Harvard Medical School; R. Leftwich, InterSystems Corporation

This workshop will provide attendees with the background, the rapidly evolving processes, the technical elements, and the innovative approaches to solving the complex problems of interoperable data exchange. In only seven years, FHIR® has been embraced by developers of technology solutions, by government regulatory bodies, by academic institutions, and by public health agencies worldwide. The adoption of FHIR®-based solutions has been accelerated by coalescence around a single API structure. The process has been supported by both public- and private-sector initiatives and by reliance on a highly consistent maturity model and a reliable strategic roadmap. Within the scope of this workshop, we will highlight the innovative approaches to these strategic goals and articulate the framework for achieving them. Unprecedented collaboration by private-sector companies and by broad-based coalitions has largely refined the business model for application development. Moreover, innovative government-based initiatives have fostered the sharing of genomic data for both applied and basic research.

8:30 a.m. – 12:00 p.m.

T07: Agile Clinical Decision Support Development and Implementation

M. Basit, V. Kannan, D. Willett, University of Texas Southwestern Health System

Designing effective Clinical Decision Support (CDS) tools in an Electronic Health Record (EHR) can prove challenging, due to complex real-world scenarios and newly discovered requirements. Deploying new CDS tools has much in common with new product development, where “agile” principles and practices consistently prove effective. Agile methods, including time-boxed “sprints” and lightweight requirements gathering with User Stories, can thus prove helpful on CDS projects. Modeling CDS behavior promotes an unambiguous shared understanding of desired behavior, but risks analysis paralysis: an Agile Modeling approach can foster effective rapid-cycle CDS design and optimization. The agile practice of automated testing for test-driven design and regression testing can be applied to CDS development using open-source tools. Ongoing monitoring of CDS behavior once released to production can identify anomalies and prompt rapid-cycle redesign to further enhance CDS effectiveness. Workshop participants will learn about these topics in interactive didactic sessions, with time for practicing the techniques taught.
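The automated testing practice mentioned above can be sketched in a few lines. The example below uses Python's standard `unittest` module as a stand-in for whatever open-source tools the workshop covers, and the rule itself is hypothetical: the function name, its parameters, and the age threshold of 65 are assumptions chosen for illustration, not clinical guidance and not the presenters' method.

```python
import unittest


def should_fire_reminder(age_years, vaccine_on_record):
    """Hypothetical CDS rule: remind when an older adult has no vaccine on record.

    The threshold of 65 is illustrative only.
    """
    return age_years >= 65 and not vaccine_on_record


class ReminderRuleTest(unittest.TestCase):
    """Tests written first, then kept as a regression suite for the rule."""

    def test_fires_for_older_adult_without_vaccine(self):
        self.assertTrue(should_fire_reminder(70, False))

    def test_silent_when_vaccine_is_recorded(self):
        self.assertFalse(should_fire_reminder(70, True))

    def test_silent_below_age_threshold(self):
        self.assertFalse(should_fire_reminder(64, False))

    def test_fires_exactly_at_threshold(self):
        self.assertTrue(should_fire_reminder(65, False))


def run_rule_tests():
    """Run the suite programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReminderRuleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a test-driven workflow, each new requirement from a User Story would first be added as a failing test case and the rule changed until the suite passes; rerunning `run_rule_tests()` after every change then serves as the regression check.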