School of Medicine introduces big data training

NYU Langone's new data program teaches students to interpret data, not just diagnose patients, using public domain data from the New York State Department of Health, Statewide Planning and Research Cooperative System.

NYU’s School of Medicine recently implemented a required big data training for its first-year medical students. Pairs of students are asked to formulate and answer a clinical question using the public domain data provided by the New York State Department of Health, Statewide Planning and Research Cooperative System.

Marc M. Triola, an associate professor of medicine at NYU School of Medicine, said big data has the potential to transform the health care system.

“Looking at the clinical data of people in a whole community, state or country is changing how we understand the quality of care we are delivering, where there are gaps in the healthcare system and how to make changes that will better prepare us for the future,” Triola said. “There is also an explosion in data coming from connected health devices such as Fitbits and Apple Watches, new consumer health technologies like iPhone-connected ECG machines and new scientific tests like personal genome sequencing.”

The project, titled “Health Care by the Numbers,” gives students access to over 5 million anonymous records on hospital patients from 2012 and 2013.


“Each record includes things like the race and ethnicity of the patient, which doctor/hospital cared for them, their diagnosis, how long they were in the hospital, how much the total bill was for the stay and much more,” Triola said. “It’s a tremendous data resource containing over 2.5 million records of hospital admissions per year.”

The students will also have access to another database — Lacidem Care Group — with records from NYU’s faculty practices.

School of Medicine student Noah Berland said he recognizes that while a patient’s perspective will be relatively unchanged from this project, big data has the ability to change the way innovators look at health care.

“It will allow us to see more patterns in the noise of illness and improve care and outcomes,” Berland said. “It also helps us better understand who our patients are and how we can better serve them and potentially prevent disease.”

Berland said the availability of this data is a recent phenomenon and argued that medical experts are often hesitant to adopt new ways of analyzing patients and health care.

“The data wasn’t available, and the trend in big data analysis has simply taken a little bit of extra time to trickle down to the health sphere, likely due to privacy concerns,” Berland said.

Triola attributes the rise in availability of public domain big data to President Barack Obama’s 2009 Transparency and Open Government memorandum.

“This initiative resulted in federal, state and local governments releasing data and information that was never before available,” Triola said. “In parallel, the health care system has become more data-driven and it’s clear that physicians need to be more skilled in the analysis and interpretation of these data.”

CAS freshman Hemanee Sharma, who intends to apply to medical school, believes the context in which students interpret the data will allow them to better relate it to the unique cases of each of their patients.

“Students will be able to find real-world applications for this data because they are interpreting it in an environment in which the data was initially observed,” Sharma said.

Triola emphasized how important it is for the next generation of doctors to be able to use the massive amounts of data at their fingertips.

“There is no doubt that this is an increasingly necessary skill — if physicians of tomorrow are not navigators of big data, they will become victims of it,” Triola said.

Email Greta Chevance at [email protected]


