SAVE THE DATE Machine Learning for Healthcare (MLHC)

August 18th - 19th, 2017
Northeastern University, Boston, MA


Machine Learning for Healthcare

MLHC is an annual research meeting that brings together two usually insular communities: computer scientists with expertise in artificial intelligence, machine learning, and big data; and clinicians and medical researchers. MLHC supports the advancement of data analytics, knowledge discovery, and meaningful use of complex medical data by fostering collaboration and the exchange of ideas between members of these too often separated communities. To this end, the symposium includes invited talks, poster presentations, panels, and ample time for thoughtful discussion and robust debate.

MLHC has a rigorous peer-review process and (optional) archival proceedings through the Journal of Machine Learning Research proceedings track. You can access the inaugural proceedings here:


Friday, August 18, 2017


Welcoming Remarks

Session 1

Deep Learning as an FDA-Cleared Product

Daniel Golden, Arterys

Radiological diagnosis and interpretation is ready for an overhaul. Radiologists spend countless hours on tasks that are onerous and error-prone, resulting in high costs and frequent misdiagnoses. Arterys is working to address these deficiencies, using deep learning to vastly improve the speed and consistency with which radiologists read cardiac MRI studies. Our first product, Arterys Cardio DL, is the first technology ever to be cleared by the FDA that leverages cloud computing and deep learning in a clinical setting. We discuss the technology behind the software and how we proved its safety and efficacy to secure FDA clearance in the United States and the CE Mark in Europe.

Coffee Break and Discussion

Session 2

How Can NLP Help Cure Cancer?

Regina Barzilay, MIT

The majority of cancer research today takes place in biology and medicine. Computer science plays a minor supporting role in this process, if at all. In this talk, I hope to convince you that NLP as a field has a chance to play a significant role in this battle. Indeed, free-form text remains the primary means by which physicians record their observations and clinical findings. Unfortunately, this rich source of textual information is severely underutilized by predictive models in oncology. In the first part of my talk, I will describe a number of tasks where NLP-based models can make a difference in clinical practice. For example, these include improving models of disease progression, preventing over-treatment, and narrowing down to the cure. This part of the talk draws on active collaborations with oncologists from MGH. In the second part of the talk, I will push beyond standard tools, introducing new functionalities and avoiding annotation-hungry training paradigms ill-suited for clinical practice. In particular, I will focus on interpretable neural models that provide rationales underlying their predictions, and semi-supervised methods for information extraction.


Coffee Break and Discussion

Dinner and Discussion

Saturday, August 19, 2017


Session 3

The algorithm for precision medicine

Matt Might, Harvard/Utah

Precision medicine requires an algorithmic approach to the delivery of care, and it encounters a wide range of computational challenges. This talk will center on an in-depth case study in precision medicine, highlighting computational challenges at the forefront of the field.


Session 4

Challenges in Developing Learning Algorithms to Personalize Treatment in Real Time

Susan Murphy, University of Michigan

A formidable challenge in designing sequential treatments is to determine when and in which context it is best to deliver treatments. Consider treatment for individuals struggling with chronic health conditions. Operationally, designing the sequential treatments involves the construction of decision rules that take the current context of an individual as input and output a recommended treatment. That is, the treatment is adapted to the individual's context; the context may include, for example, current health status, current level of social support, and current level of adherence. Data sets on individuals with records of time-varying context and treatment delivery can be used to inform the construction of the decision rules. There is much interest in personalizing the decision rules, particularly in real time as the individual experiences sequences of treatment. Here we discuss our work in designing online "bandit" learning algorithms for use in personalizing mobile health interventions.

Optimized risk stratification and treatment decisions with machine learning

Collin Stultz, MIT

The accurate assessment of a patient’s risk of adverse events remains a mainstay of clinical care for patients with cardiovascular disease. Sophisticated methods, such as those based on machine learning, form an attractive platform to build improved risk metrics because they can easily incorporate disparate pieces of data, yielding classifiers with improved performance. Using data from more than 5200 patients admitted with a non-ST segment elevation acute coronary syndrome, we constructed an artificial neural network that identifies patients at high risk of cardiovascular death 1 year after the index event. We further demonstrate how Q-learning can be used to find optimal treatment strategies for patients at high risk of death after an acute coronary syndrome.

Panel Discussion & Closing Remarks

Program Chairs

Assistant Professor in Computer Science, Harvard School of Engineering and Applied Sciences
Associate Professor, Departments of Anesthesiology/Critical Care Medicine and Pediatrics, Johns Hopkins University School of Medicine
PhD Student, Computer Science, Viterbi Dean's Doctoral Fellow, and Alfred E. Mann Innovation in Engineering Fellow at the University of Southern California
Assistant Professor in Computer and Information Science at Northeastern University, Boston, MA
Assistant Professor of Computer Science and Engineering (CSE) at the University of Michigan

Senior Advisory Committee:

Dean of the College of Computer and Information Science, Northeastern University
Associate Professor and Canada Research Chair in Computational Biology, University of Toronto
Associate Professor, Biomedical Informatics, Emory University
Associate Professor of Biomedical Informatics, Affiliated with Computer Science, Columbia University
Professor of Computer Science at Cornell Tech in New York City and a Professor of Public Health at Weill Cornell Medical College
Schlumberger Centennial Chair Professor of Electrical and Computer Engineering at The University of Texas at Austin
Professor of Computer Science at the University of Alberta
Dugald C. Jackson Professor MIT Department of Electrical Engineering and Computer Science
Professor of Computer Science, University of Pittsburgh
Technical Fellow and Managing Director, Microsoft Research
Lawrence J. Henderson Professor of Pediatrics, Boston Children's Hospital
HST Faculty, Distinguished Professor in Health Sciences and Technology and Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Professor of Medicine, Biomedical Engineering and Molecular Physiology and Biological Physics
Professor of Computer Science at the University of British Columbia
Senior Lecturer in Computer Science at Makerere University
Associate Professor at UC Riverside's Computer Science Department
Professor of Computer Science and Engineering in the MIT Department of Electrical Engineering and Computer Science
Associate Professor, Medicine - Biomedical Informatics Research, Stanford University
Founder’s Board Chair of Neurocritical Care, Professor in Pediatrics-Neurology, Neurology - Ken and Ruth Davee Department and Pharmacology, Northwestern
Chairman, Department of Anesthesiology Critical Care Medicine - Children's Hospital Los Angeles
Professor of Machine Learning, School of Informatics, University of Edinburgh

Accepted Papers

Visualizing Clinical Significance with Prediction and Tolerance Regions
Maria Jahja*, North Carolina State University; Daniel Lizotte, UWO
Understanding Coagulopathy using Multi-view Data in the Presence of Sub-Cohorts: A Hierarchical Subspace Approach
Arya Pourzanjani*, UCSB; Tie Bo Wu, UCSB; Richard M. Jiang, UCSB; Mitchell J. Cohen, Denver Health Medical Center; Linda R. Petzold, UCSB
Towards Vision-based Smart Hospitals: A System for Tracking and Monitoring Hand Hygiene Compliance
Albert Haque*, Stanford University; Michelle Guo, Stanford University; Alexandre Alahi, Stanford University; Amit Singh, Lucile Packard Children's Hospital; Serena Yeung, Stanford University; N. Lance Downing, Stanford; Terry Platchek, Lucile Packard Children's Hospital; Li Fei-Fei, Stanford University
Towards a directory of rare disease specialists: Identifying experts from publication history
Zihan Wang*, University of Toronto; Michael Brudno, U Toronto; Orion Buske, Centre for Computational Medicine, SickKids Hospital
Temporal prediction of multiple sclerosis evolution from patient-centered outcomes
Samuele Fiorini, University of Genoa; Andrea Tacchino, Italian Multiple Sclerosis Foundation - Scientific Research Area; Giampaolo Brichetto, Italian Multiple Sclerosis Foundation - Scientific Research Area; Alessandro Verri, University of Genova, Italy; Annalisa Barla*, Università degli Studi di Genova
Surgeon Technical Skill Assessment using Computer Vision based Analysis
Hei Law*, University of Michigan; Jia Deng, University of Michigan, Ann Arbor; Khurshid Ghani, University of Michigan
Spatially-Continuous Plantar Pressure Reconstruction Using Compressive Sensing
Amirreza Farnoosh, Northeastern University; Mehrdad Nourani, University of Texas at Dallas; Sarah Ostadabbas*, Northeastern University
ShortFuse: Biomedical Time Series Representations in the Presence of Structured Information
Madalina Fiterau*, Stanford University; Suvrat Bhooshan, Stanford University; Jason Fries, Stanford University; Charles Bournhonesque, Stanford University; Jennifer Hicks, Stanford University; Eni Halilaj, Stanford University; Christopher Re, Stanford University; Scott Delp, Stanford University
Reproducibility in critical care: a mortality prediction case study
Alistair Johnson*, MIT; Tom Pollard, MIT; Roger Mark, MIT
Quantifying Mental Health from Social Media using Learned User Embeddings
Silvio Moreira*, INESC-ID; Glen Copperfield; Paula Carvalho, INESC-ID; Mário Silva, INESC-ID; Byron Wallace, Northeastern
Prediction via clusters of CPT codes for improving surgical outcomes
Stephanie Brown*, Duke; Elizabeth Lorenzi, Duke; Katherine Heller, Duke University; Zhifei Sun, Duke
Predicting Surgery Duration with Neural Heteroscedastic Regression
Zachary Lipton*, UCSD; Nathan Ng, UCSD; Rodney Gabriel, UCSD; Charles Elkan, UCSD; Julian McAuley, UC San Diego
Predicting long-term mortality with first week post-operative data after Coronary Artery Bypass Grafting using Machine Learning models
José Forte*, University of Groningen; Marco Wiering, University of Groningen; Hjalmar Bouma, University Medical Center Groningen; Fred de Geus, University Medical Center Groningen; Anne Epema, University Medical Center Groningen
Piecewise-constant parametric approximations for survival learning
Jeremy Weiss*, Carnegie Mellon University
Patient Similarity Using Population Statistics and Multiple Kernel Learning
Bryan Conroy*, Philips Research North America; Minnan Xu-Wilson, Philips Research North America; Asif Rahman, Philips Research
Modeling Progression Free Survival in Breast Cancer with Tensorized Recurrent Neural Networks and Accelerated Failure Time Model
Yinchong Yang*, Siemens AG, LMU München; Volker Tresp, Siemens AG and Ludwig Maximilian University of Munich; Peter Fasching, Department of Gynecology and Obstetrics, University Hospital Erlangen
Marked Point Process for Severity of Illness Assessment
Kazi Islam*, UC Riverside; Christian Shelton, UC Riverside
Learning to Detect Sepsis with a Multitask Gaussian Process RNN
Joseph Futoma*, Duke; Sanjay Hariharan, Duke University; Katherine Heller, Duke University
Hawkes Process Modeling of Adverse Drug Reactions with Longitudinal Observational Data
Yujia Bao*, University of Wisconsin-Madison; Zhaobin Kuang, University of Wisconsin, Madison; Peggy Peissig, Marshfield Clinic Research Foundation; David Page, University of Wisconsin, Madison; Rebecca Willett, University of Wisconsin, Madison
Generating Multi-label Discrete Patient Records using Generative Adversarial Networks
Edward Choi*, Georgia Institute of Technology; Siddharth Biswal, Georgia Institute of Technology; Bradley Malin, Vanderbilt University; Jon Duke, Georgia Institute of Technology; Walter Stewart, Sutter Health; Jimeng Sun, CS
Diagnostic Inferencing via Improving Clinical Concept Extraction with Deep Reinforcement Learning: A Preliminary Study
Yuan Ling, Philips Research North America; Sadid A. Hasan*, Philips Research North America; Vivek Datla, Philips Research North America; Ashequl Qadir, Philips Research North America; Kathy Lee, Philips Research North America; Joey Liu, Philips Research North America; Oladimeji Farri, Philips Research North America
Continuous State-Space Models for Optimal Sepsis Treatment - a Deep Reinforcement Learning Approach
Aniruddh Raghu*, MIT; Marzyeh Ghassemi, MIT; Matthieu Komorowski, Imperial College London; Leo Celi, MIT; Pete Szolovits, MIT
Clustering Patients with Tensor Decomposition
Matteo Ruffini*, UPC; Ricard Gavaldà, UPC; Esther Limón, Institut Català de la Salut
Clinical Intervention Prediction and Understanding using Deep Networks
Nathan Hunt*, MIT; Marzyeh Ghassemi, MIT; Harini Suresh, MIT; Pete Szolovits, MIT; Leo Celi, MIT; Alistair Johnson, MIT
Classifying Lung Cancer Severity with Ensemble Machine Learning in Health Care Claims Data
Savannah Bergquist*, Harvard University; Gabriel Brooks, Dartmouth-Hitchcock Medical Center; Nancy Keating, Harvard Medical School, Brigham and Women's Hospital; Mary Beth Landrum, Harvard Medical School; Sherri Rose, Harvard Medical School
A Video-Based Method for Automatically Rating Ataxia
Ronnachai Jaroensri*, MIT CSAIL; Amy Zhao, MIT; Fredo Durand, MIT; John Guttag, MIT; Jeremy Schmahmann, Massachusetts General Hospital; Guha Balakrishnan, MIT; Derek Lo, Yale University

Accepted Clinical Abstracts

Visual Supervision of Unsupervised Clustering of Patients with Clustervision
Adam Perer*, IBM Research; Bum Chul Kwon, IBM Research; Janu Verma, IBM Research; Kenney Ng, IBM Research; Ben Eysenbach, MIT; Christopher deFilippi, INOVA; Walter Stewart, Sutter Health
Using Machine Learning to Recommend Oncology Clinical Trials
Anasuya Das, Memorial Sloan Kettering Cancer Center; Leifur Thorbergsson, Memorial Sloan Kettering Cancer Center; Aleksandr Grigorenko*, Memorial Sloan Kettering Cancer Center; David Sontag, MIT; Iker Huerga, Memorial Sloan Kettering Cancer Center
MS Mosaic: First Steps (and Stumbles) Toward a Patient-Centered Mobile Platform for Multiple Sclerosis Research and Care
Lee Hartsell*, Duke
Light Field Otoscope 3D Imaging of Diseased Ears in an Alaska Native Population
Manuel Martinello*, Ricoh Innovations
Extracting Information from Electronic Health Records Using Natural Language Processing - Knowledge Discovery from Unstructured Information
Vasu Chandrasekaran*, Merck & Co; Paul Dexter, Regenstrief Institute; Jinghua He, Merck & Co; Monica Chase, Merck & Co; Aman Bhandari, Merck & Co; Christopher Frederick, Regenstrief Institute
Accounting for diagnostic uncertainty when training a Machine Learning algorithm to detect patients with the Acute Respiratory Distress Syndrome
Michael Sjoding*, University of Michigan; Narathip Reamaroon, University of Michigan; Kayvan Najarian, University of Michigan

Important Dates

  • Paper Submission Deadline - April 24th 2017 at 6:00 PM (EDT)
  • Acceptance Notification - June 16th 2017
  • Conference: Aug 18th - 19th, 2017

Call for Papers

Researchers in machine learning --- including those working in statistical natural language processing, computer vision and related sub-fields --- when coupled with seasoned clinicians can play an important role in turning complex medical data (e.g., individual patient health records, genomic data, data from wearable health monitors, online reviews of physicians, medical imagery, etc.) into actionable knowledge that ultimately improves patient care. For the last seven years, this meeting has drawn hundreds of clinical and machine learning researchers to frame problems clinicians need solved and discuss machine learning solutions.

This year we are calling for papers in two tracks:

Research Track

We invite submissions that describe novel methods to address the challenges inherent to health-related data (e.g., sparsity, class imbalance, causality, temporal dynamics, multi-modal data). We also invite articles describing the application and evaluation of state-of-the-art machine learning approaches applied to health data in deployed systems. In particular, we seek high-quality submissions on the following topics:

  • Predicting individual patient outcomes
  • Mining, processing and making sense of clinical notes
  • Patient risk stratification
  • Parsing biomedical literature
  • Bio-marker discovery
  • Brain imaging technologies and related models
  • Learning from sparse/missing/imbalanced data
  • Time series analysis with medical applications
  • Medical imaging
  • Efficient, scalable processing of clinical data
  • Clustering and phenotype discovery
  • Methods for vitals monitoring
  • Feature selection/dimensionality reduction
  • Text classification and mining for biomedical literature
  • Exploiting and generating ontologies
  • ML systems that assist with evidence-based medicine

Research Track Proceedings and Review Process. Accepted submissions will be published through the proceedings track of the Journal of Machine Learning Research. All papers will be rigorously peer-reviewed, and research that has been previously published elsewhere or is currently in submission may not be submitted. However, authors will have the option of only archiving the abstract to allow for future submissions to clinical journals, etc.

Research Track Submission Details. Submissions should be no longer than 10 pages (excluding references). The review process is double blind. Please refer to the submission instructions on our website.

Clinical Abstracts Track

To expose open questions and celebrate the accomplishments of the community, we also invite submissions for late-breaking clinical podium abstracts and demos:

  • Open clinical questions: we seek viewpoints from clinicians and clinical researchers on important directions the MLHC community should tackle together.
  • Clinical/translational successes: we seek abstracts about data and data analysis that resulted in new understanding and/or changes in clinical practice.
  • Demonstrations: we seek exciting end-to-end tools that bring data and data analysis to the clinician/bedside.

We especially encourage submissions from clinical researchers working with large digital health data sets using modern computational methods. Submissions should be one page or less, and accepted submissions will be presented as late-breaking abstracts and demos at MLHC. Abstracts will be made available online, but will not be archived or indexed.

Proceedings and Review Process. Accepted submissions will be published through the proceedings track of the Journal of Machine Learning Research. All papers will be rigorously peer-reviewed, and research that has been previously published elsewhere or is currently in submission may not be submitted to MLHC. However, authors will have the option of only archiving the abstract to allow for future submissions to clinical journals, etc.


The maximum paper length is 10 pages, excluding references, acknowledgements, and supplementary materials. The maximum file size is 10 MB. We expect papers to be between 7 and 10 pages; shorter papers are acceptable as long as they fully describe the work.

Here is an example paper

LaTeX style files are available here

A Word template is available here

MLHC Style File is available here

While section headings may be changed, the margins and author block must remain the same and all papers must be in 11-point Times font. If supplementary materials are included, the paper must still stand alone; reviewers are encouraged but not required to look at the supplementary materials.

Context for Clinicians: We realize that conferences in medicine tend to be abstract-only, non-archival events. This is not the case for MLHC: to be a premier health and machine learning venue, all papers submitted to MLHC will be rigorously peer-reviewed for scientific quality -- and for that a suitably complete description of the work is necessary. So we call for submissions of 7-10 pages that describe your problem, cohort, features used, methods, results, etc. Multiple reviewers will provide feedback on the submission. If accepted, you will have the opportunity to revise the paper before submitting the final version.

Context for Computer Scientists: MLHC is a machine learning conference, and we expect papers of the same level of quality as those that would be sent to a conference (rather than a workshop). One may choose to have only the abstract of the paper archived, but it is a violation of the dual-submission policy to archive the full MLHC paper and then later submit the same paper to another conference.

Regardless of whether or not the full paper is archived, authors of accepted papers will be invited to present a spotlight and/or a poster on their work at the conference.

(Of course, we hope that many papers have both clinicians and computer scientists involved!)


The example paper contains sample sections. A more machine-learning oriented paper may include more mathematical details, while a more application-focused paper may include more detailed cohort and study design descriptions. In all cases, papers should contain enough information for the readers to understand and reproduce the results.

Double-Blind Reviewing

Reviewing for MLHC is double-blind: the reviewers will not know the authors’ identity and the authors will not know the reviewers’ identity. Do not include your names, your institution’s name, or identifying information in the initial submission; add them only in the camera-ready version. While you should make every effort to anonymize your work -- e.g. write “In Doe et al. (2011), the authors…” rather than “In our previous work (Doe et al., 2011), we…” -- we realize that a reviewer may be able to deduce the authors’ identities based on previous publications or technical reports on the web. This will not be considered a violation of the double-blind reviewing policy on the authors’ part.

Dual Submission and Archiving Policy

All submissions to MLHC must be novel work. You may not submit work that has been previously published, accepted for publication, or that has been submitted in parallel to other conferences. There are a few exceptions:

  1. You may submit a paper to MLHC and a journal at the same time.
  2. You may submit work that has only appeared at a conference or workshop without proceedings.
  3. You may submit work that has only been previously published as a technical report (e.g. on arXiv).

All submissions to MLHC must be full papers so that the work can be rigorously reviewed. Once your paper is accepted to MLHC, however, you may choose to only have the abstract archived to enable submission to a journal.

Please upload submissions here:

Need more information?

If you have any questions regarding the symposium, please send us an email.