Machine Learning for Healthcare (MLHC)

August 18th - 19th, 2017 - Northeastern University
Interdisciplinary Science and Engineering Complex (ISEC)
805 Columbus Avenue Boston, Massachusetts 02120


The Conference has been sold out.

Machine Learning for Healthcare

MLHC is an annual research meeting that brings together two usually insular communities: computer scientists with expertise in artificial intelligence, machine learning, and big data, and clinicians and medical researchers. MLHC supports the advancement of data analytics, knowledge discovery, and meaningful use of complex medical data by fostering collaborations and the exchange of ideas between members of these too often separated communities. To this end, the symposium includes invited talks, poster presentations, panels, and ample time for thoughtful discussion and robust debate.

MLHC has a rigorous peer-review process and (optional) archival proceedings through the Journal of Machine Learning Research proceedings track. You can access the inaugural proceedings here:

Directions to Northeastern University

For directions to Northeastern University, simply click here.
If you are driving, parking is available for $32/day in the Renaissance Parking Garage located off of Columbus Avenue. ISEC is one block away from the parking garage.

If you are taking the T, you can either get off at the Northeastern stop on the Green E line or the Ruggles stop on the Orange line. Please note that Ruggles is closest to ISEC if you exit on Columbus Avenue.

Conference Location

Northeastern University
Interdisciplinary Science and Engineering Complex (ISEC)
805 Columbus Avenue
Boston, Massachusetts 02120

The conference will take place in the ISEC Auditorium and Atrium.
ISEC is #83 on the Northeastern campus map





Friday August 18, 2017 - ISEC Auditorium and Atrium


Welcoming Remarks

Session 1

The BIDMC Explore IT Program

John Halamka, BIDMC

BIDMC has implemented Machine Learning functionality from Amazon and Google to support several use cases. In this presentation, the speaker will review the implementation experience and the outcomes achieved.

Deep Learning as an FDA-Cleared Product

Daniel Golden, Arterys

Radiological diagnosis and interpretation is ready for an overhaul. Radiologists spend countless hours on tasks that are onerous and error-prone, resulting in high costs and frequent misdiagnoses. Arterys is working to address these deficiencies, using deep learning to vastly improve the speed and consistency with which radiologists read cardiac MRI studies. Our first product, Arterys Cardio DL, is the first technology ever to be cleared by the FDA that leverages cloud computing and deep learning in a clinical setting. We discuss the technology behind the software and how we proved its safety and efficacy to secure FDA clearance in the United States and the CE Mark in Europe.

Coffee Break and Discussion

Session 2

How Can NLP Help Cure Cancer?

Regina Barzilay, MIT

The majority of cancer research today takes place in biology and medicine. Computer science plays a minor supporting role in this process if at all. In this talk, I hope to convince you that NLP as a field has a chance to play a significant role in this battle. Indeed, free-form text remains the primary means by which physicians record their observations and clinical findings. Unfortunately, this rich source of textual information is severely underutilized by predictive models in oncology. In the first part of my talk, I will describe a number of tasks where NLP-based models can make a difference in clinical practice. For example, these include improving models of disease progression, preventing over-treatment, and narrowing down to the cure. This part of the talk draws on active collaborations with oncologists from MGH. In the second part of the talk, I will push beyond standard tools, introducing new functionalities and avoiding annotation-hungry training paradigms ill-suited for clinical practice. In particular, I will focus on interpretable neural models that provide rationales underlying their predictions, and semi-supervised methods for information extraction.

Tools for Interpretable Machine Learning with Healthcare Applications

Cynthia Rudin, Duke

How do patients and doctors know that they can trust predictions from a model that they cannot understand? Transparency in machine learning models is critical in high stakes decisions, like those made every day in healthcare. My lab creates machine learning algorithms for predictive models that are interpretable to human experts. As it turns out, by using modern optimization tools, one often does not need to sacrifice accuracy to gain interpretability. We will focus mainly on the problem of building medical scoring systems using data. We provide applications to ADHD diagnosis, sleep apnea screening, EEG monitoring for seizure prediction in ICU patients, and early detection of cognitive impairments. Then, we switch to creating logical models, and in particular, rule lists, which are a form of decision tree. Finally we will discuss how to model recovery curves that have realistic shapes and realistic uncertainty bands, and show an application to modeling recovery curves for prostatectomy patients.
I will focus on work of students Berk Ustun, Hima Lakkaraju, William Souillard-Mandar, and Fulton Wang. Other collaborators include Brandon Westover, Matt Bianchi, Randall Davis, Dana L. Penney, Tyler McCormick, and Ronald C. Kessler.


Coffee Break and Discussion

Dinner and Discussion

Saturday August 19, 2017 - ISEC Auditorium and Atrium


Session 3

The algorithm for precision medicine

Matt Might, Harvard/Utah

Precision medicine requires an algorithmic approach to the delivery of care, and it encounters a wide range of computational challenges. This talk will center on an in-depth case study in precision medicine, highlighting computational challenges at the forefront of the field.

Opportunities to Apply Machine Learning in Neurocritical Care

Soojin Park, Columbia

The neurocritical care patient is monitored for reversible secondary brain injury. Timely personalized assessments of subclinical or early state changes in the neuroICU currently rely on vigilance and constant availability of expert interpretation. Those at most risk are obtunded or comatose patients, but state changes in even conscious patients may be clinically asymptomatic or subtly evade detection. With the proliferation of multimodality neuromonitoring and advances in data acquisition and analytics, the field of neurocritical care has generated studies in signal processing and machine learning, advancing the science of detection, prediction, and goal setting. There is growing demand for the implementation of these findings.


Session 4

Challenges in Developing Learning Algorithms to Personalize Treatment in Real Time

Susan Murphy, University of Michigan

A formidable challenge in designing sequential treatments is to determine when and in which context it is best to deliver treatments. Consider treatment for individuals struggling with chronic health conditions. Operationally designing the sequential treatments involves the construction of decision rules that input current context of an individual and output a recommended treatment. That is, the treatment is adapted to the individual's context; the context may include current health status, current level of social support and current level of adherence for example. Data sets on individuals with records of time-varying context and treatment delivery can be used to inform the construction of the decision rules. There is much interest in personalizing the decision rules, particularly in real time as the individual experiences sequences of treatment. Here we discuss our work in designing online "bandit" learning algorithms for use in personalizing mobile health interventions.

Optimized risk stratification and treatment decisions with machine learning

Collin Stultz, MIT

The accurate assessment of a patient’s risk of adverse events remains a mainstay of clinical care for patients with cardiovascular disease. Sophisticated methods, such as those based on machine learning, form an attractive platform to build improved risk metrics because they can easily incorporate disparate pieces of data, yielding classifiers with improved performance. Using data from more than 5200 patients admitted with a non-ST segment elevation acute coronary syndrome, we constructed an artificial neural network that identifies patients at high risk of cardiovascular death 1 year after the index event. We further demonstrate how Q-learning can be used to find optimal treatment strategies for patients at high risk of death after an acute coronary syndrome.

Panel Discussion & Closing Remarks

Program Chairs

Assistant Professor in Computer Science, Harvard School of Engineering and Applied Sciences
Associate Professor Departments of Anesthesiology/Critical Care Medicine and Pediatrics Johns Hopkins University School of Medicine
PhD Student, Computer Science, Viterbi Dean's Doctoral Fellow, and Alfred E. Mann Innovation in Engineering Fellow at the University of Southern California
Assistant Professor in Computer and Information Science at Northeastern University, Boston, MA
Assistant Professor of Computer Science and Engineering (CSE) at the University of Michigan

Senior Advisory Committee:

Dean of the College of Computer and Information Science, Northeastern University
Associate Professor and Canada Research Chair in Computational Biology, University of Toronto
Associate Professor, Biomedical Informatics Emory University
Associate Professor of Biomedical Informatics, Affiliated with Computer Science, Columbia University
Professor of Computer Science at Cornell Tech in New York City and a Professor of Public Health at Weill Cornell Medical College
Schlumberger Centennial Chair Professor of Electrical and Computer Engineering at The University of Texas at Austin
Professor of Computer Science at the University of Alberta
Dugald C. Jackson Professor MIT Department of Electrical Engineering and Computer Science
Professor of Computer Science, University of Pittsburgh
Technical Fellow and Managing Director, Microsoft Research
Lawrence J. Henderson Professor of Pediatrics, Boston Children's Hospital
HST Faculty, Distinguished Professor in Health Sciences and Technology and Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Professor of Medicine, Biomedical Engineering and Molecular Physiology and Biological Physics
Professor of Computer Science at the University of British Columbia
Senior Lecturer in Computer Science at Makerere University
Associate Professor at UC Riverside's Computer Science Department
Professor of Computer Science and Engineering in the MIT Department of Electrical Engineering and Computer Science
Associate Professor, Medicine - Biomedical Informatics Research, Stanford University
Founder’s Board Chair of Neurocritical Care, Professor in Pediatrics-Neurology, Neurology - Ken and Ruth Davee Department and Pharmacology, Northwestern
Chairman, Department of Anesthesiology Critical Care Medicine - Children's Hospital Los Angeles
Professor of Machine Learning, School of Informatics, University of Edinburgh

Accepted Papers

Jeremy Weiss*, Carnegie Mellon University
Amirreza Farnoosh, Northeastern University; Mehrdad Nourani, University of Texas at Dallas; Sarah Ostadabbas*, Northeastern University
Savannah Bergquist*, Harvard University; Gabriel Brooks, Dartmouth-Hitchcock Medical Center; Nancy Keating, Harvard Medical School, Brigham and Women's Hospital; Mary Beth Landrum, Harvard Medical School; Sherri Rose, Harvard Medical School
José Forte*, University of Groningen; Marco Wiering, University of Groningen; Hjalmar Bouma, University Medical Center Groningen; Fred de Geus, University Medical Center Groningen; Anne Epema, University Medical Center Groningen
Madalina Fiterau*, Stanford University; Suvrat Bhooshan, Stanford University; Jason Fries, Stanford University; Charles Bournhonesque, Stanford University; Jennifer Hicks, Stanford University; Eni Halilaj, Stanford University; Christopher Re, Stanford University; Scott Delp, Stanford University
Albert Haque*, Stanford University; Michelle Guo, Stanford University; Alexandre Alahi, Stanford University; Amit Singh, Lucile Packard Children's Hospital; Serena Yeung, Stanford University; N. Lance Downing, Stanford; Terry Platchek, Lucile Packard Children's Hospital; Li Fei-Fei, Stanford University
Hei Law*, University of Michigan; Jia Deng, University of Michigan, Ann Arbor; Khurshid Ghani, University of Michigan
Zachary Lipton*, UCSD; Nathan Ng, UCSD; Rodney Gabriel, UCSD; Charles Elkan, UCSD; Julian McAuley, UC San Diego
Samuele Fiorini, University of Genoa; Andrea Tacchino, Italian Multiple Sclerosis Foundation - Scientific Research Area; Giampaolo Brichetto, Italian Multiple Sclerosis Foundation - Scientific Research Area; Alessandro Verri, University of Genova, Italy; Annalisa Barla*, Università degli Studi di Genova
Matteo Ruffini*, UPC; Ricard Gavaldà, UPC; Esther Limón, Institut Català de la Salut
Aniruddh Raghu*, MIT; Marzyeh Ghassemi, MIT; Matthieu Komorowski, Imperial College London; Leo Celi, MIT; Pete Szolovits, MIT
Yinchong Yang*, Siemens AG, LMU München; Volker Tresp, Siemens AG and Ludwig Maximilian University of Munich; Peter Fasching, Department of Gynecology and Obstetrics, University Hospital Erlangen
Yujia Bao*, University of Wisconsin-Madison; Zhaobin Kuang, University of Wisconsin, Madison; Peggy Peissig, Marshfield Clinic Research Foundation; David Page, University of Wisconsin, Madison; Rebecca Willett, University of Wisconsin, Madison
Bryan Conroy*, Philips Research North America; Minnan Xu-Wilson, Philips Research North America; Asif Rahman, Philips Research
Ronnachai Jaroensri*, MIT CSAIL; Amy Zhao, MIT; Fredo Durand, MIT; John Guttag, MIT; Jeremy Schmahmann, Massachusetts General Hospital; Guha Balakrishnan, MIT; Derek Lo, Yale University
Maria Jahja*, North Carolina State University; Daniel Lizotte, UWO
Elizabeth C. Lorenzi, Stephanie L. Brown, Zhifei Sun, and Katherine Heller
Joseph Futoma, Sanjay Hariharan, Katherine Heller, Mark Sendak, Nathan Brajer, Meredith Clement, Armando Bedoya, and Cara O'Brien
Kazi Islam*, UC Riverside; Christian Shelton, UC Riverside
Yuan Ling, Philips Research North America; Sadid A. Hasan*, Philips Research North America; Vivek Datla, Philips Research North America; Ashequl Qadir, Philips Research North America; Kathy Lee, Philips Research North America; Joey Liu, Philips Research North America; Oladimeji Farri, Philips Research North America
Edward Choi*, Georgia Institute of Technology; Siddharth Biswal, Georgia Institute of Technology; Bradley Malin, Vanderbilt University; Jon Duke, Georgia Institute of Technology; Walter Stewart, Sutter Health; Jimeng Sun, CS
Silvio Moreira*, INESC-ID; Glen Copperfield; Paula Carvalho, INESC-ID; Mário Silva, INESC-ID; Byron Wallace, Northeastern
Nathan Hunt*, MIT; Marzyeh Ghassemi, MIT; Harini Suresh, MIT; Pete Szolovits, MIT; Leo Celi, MIT; Alistair Johnson, MIT
Arya Pourzanjani*, UCSB; Tie Bo Wu, UCSB; Richard M. Jiang, UCSB; Mitchell J. Cohen, Denver Health Medical Center; Linda R. Petzold, UCSB
Zihan Wang*, University of Toronto; Michael Brudno, U Toronto; Orion Buske, Centre for Computational Medicine, SickKids Hospital
Alistair Johnson*, MIT; Tom Pollard, MIT; Roger Mark, MIT

Accepted Clinical Abstracts

Vasua Chandrasekaran, Jinghua He, Monica Reed Chase, Aman Bhandari, Christopher Frederick, and Paul Dexter
Anasuya Das, Leifur Thorbergsson, Aleksandr Grigorenko, David Sontag, Iker Huerga
Adam Perer*, IBM Research; Bum Chul Kwon, IBM Research; Janu Verma, IBM Research; Kenney Ng, IBM Research; Ben Eysenbach, MIT; Christopher deFilippi, INOVA; Walter Stewart, Sutter Health
Manuel Martinello, Harshavardhan Binnamangalam, Philip Hofstetter, John Kokesh, Samantha Kleindienst, Tiffany Romain, Noah Bedard, and Ivana Tosic

Call for Papers

Researchers in machine learning --- including those working in statistical natural language processing, computer vision and related sub-fields --- when coupled with seasoned clinicians can play an important role in turning complex medical data (e.g., individual patient health records, genomic data, data from wearable health monitors, online reviews of physicians, medical imagery, etc.) into actionable knowledge that ultimately improves patient care. For the last seven years, this meeting has drawn hundreds of clinical and machine learning researchers to frame problems clinicians need solved and discuss machine learning solutions.

This year we are calling for papers in two tracks:

Research Track

We invite submissions that describe novel methods to address the challenges inherent to health-related data (e.g., sparsity, class imbalance, causality, temporal dynamics, multi-modal data). We also invite articles describing the application and evaluation of state-of-the-art machine learning approaches applied to health data in deployed systems. In particular, we seek high-quality submissions on the following topics:

  • Predicting individual patient outcomes
  • Mining, processing and making sense of clinical notes
  • Patient risk stratification
  • Parsing biomedical literature
  • Bio-marker discovery
  • Brain imaging technologies and related models
  • Learning from sparse/missing/imbalanced data
  • Time series analysis with medical applications
  • Medical imaging
  • Efficient, scalable processing of clinical data
  • Clustering and phenotype discovery
  • Methods for vitals monitoring
  • Feature selection/dimensionality reduction
  • Text classification and mining for biomedical literature
  • Exploiting and generating ontologies
  • ML systems that assist with evidence-based medicine

Research Track Proceedings and Review Process. Accepted submissions will be published through the proceedings track of the Journal of Machine Learning Research. All papers will be rigorously peer-reviewed, and research that has been previously published elsewhere or is currently in submission may not be submitted. However, authors will have the option of only archiving the abstract to allow for future submissions to clinical journals, etc.

Research Track Submission Details. Submissions should be no longer than 10 pages (excluding references). The review process is double blind. Please refer to the submission instructions on our website.

Clinical Abstracts Track

To expose open questions and celebrate the accomplishments of the community, we also invite submissions for late-breaking clinical podium abstracts and demos:

  • Open clinical questions: we seek viewpoints from clinicians and clinical researchers on important directions the MLHC community should tackle together.
  • Clinical/translational successes: we seek abstracts about data and data analysis that resulted in new understanding and/or changes in clinical practice.
  • Demonstrations: we seek exciting end-to-end tools that bring data and data analysis to the clinician/bedside.

We especially encourage submissions from clinical researchers working with large digital health data sets using modern computational methods. Submissions should be one page or less, and accepted submissions will be presented as late-breaking abstracts and demos at MLHC. Abstracts will be made available online, but will not be archived or indexed.

Proceedings and Review Process. Accepted submissions will be published through the proceedings track of the Journal of Machine Learning Research. All papers will be rigorously peer-reviewed, and research that has been previously published elsewhere or is currently in submission may not be submitted to MLHC. However, authors will have the option of only archiving the abstract to allow for future submissions to clinical journals, etc.


The maximum paper length is 10 pages, excluding references, acknowledgements, and supplementary materials. The maximum file size is 10 MB. We expect papers to be between 7 and 10 pages; shorter papers are acceptable as long as they fully describe the work.

Here is an example paper

LaTeX style files are available here

A Word template is available here

MLHC Style File is available here

While section headings may be changed, the margins and author block must remain the same and all papers must be in 11-point Times font. If supplementary materials are included, the paper must still stand alone; reviewers are encouraged but not required to look at the supplementary materials.

Context for Clinicians: We realize that conferences in medicine tend to be abstract-only, non-archival events. This is not the case for MLHC: to be a premier health and machine learning venue, all papers submitted to MLHC will be rigorously peer-reviewed for scientific quality -- and for that a suitably complete description of the work is necessary. So we call for submissions of 7-10 pages that describe your problem, cohort, features used, methods, results, etc. Multiple reviewers will provide feedback on the submission. If accepted, you will have the opportunity to revise the paper before submitting the final version.

Context for Computer Scientists: MLHC is a machine learning conference, and we expect papers of the same level of quality as those that would be sent to a conference (rather than a workshop). One may choose to only have the abstract of the paper archived, but it is a violation of dual-submission policy to archive the full MLHC paper and then later submit the same paper to another conference.

Regardless of whether or not the full paper is archived, authors of accepted papers will be invited to present a spotlight and/or a poster on their work at the conference.

(Of course, we hope that many papers have both clinicians and computer scientists involved!)


The example paper contains sample sections. A more machine-learning oriented paper may include more mathematical details, while a more application-focused paper may include more detailed cohort and study design descriptions. In all cases, papers should contain enough information for the readers to understand and reproduce the results.

Double-Blind Reviewing

Reviewing for MLHC is double-blind: the reviewers will not know the authors’ identity and the authors will not know the reviewers’ identity. Do not include your names, your institution’s name, or identifying information in the initial submission. Wait for the camera-ready. While you should make every effort to anonymize your work -- e.g. write “In Doe et al. (2011), the authors…” rather than “In our previous work (Doe et al., 2011), we…” -- we realize that a reviewer may be able to deduce the authors’ identities based on the previous publications or technical reports on the web. This will not be considered a violation of the double-blind reviewing policy on the author’s part.

Dual Submission and Archiving Policy

All submissions to MLHC must be novel work. You may not submit work that has been previously published, accepted for publication, or that has been submitted in parallel to other conferences. There are a few exceptions:

  1. You may submit a paper to MLHC and a journal at the same time.
  2. You may submit work that has only appeared at a conference or workshop without proceedings.
  3. You may submit work that has only been previously published as a technical report (e.g. on arXiv).

All submissions to MLHC must be full papers so that the work can be rigorously reviewed. Once your paper is accepted to MLHC, however, you may choose to only have the abstract archived to enable submission to a journal.

Please upload submissions here:



Need more information?

If you have any questions regarding the symposium, please send us an email.