AI and Machine Learning for Healthcare, Precision Wellness, and Genomic Data Visualization

Ann Nguyen:
Hi, this is Ann Nguyen, Senior Associate Conference Producer with Cambridge Healthtech Institute. We are here for a podcast for the Inaugural Machine Learning conference at Bio-IT World Conference & Expo 2018, happening this May 15-17 in Boston, Massachusetts.

We're honored to have Dr. James Hendler, Tetherless World Chair of Computer, Web and Cognitive Sciences; Director of the RPI-IBM Center for Health Empowerment by Analytics, Learning and Semantics at Rensselaer Polytechnic Institute.

Jim, thank you for chatting with us.

James Hendler:
My pleasure.

Ann Nguyen:
What led you to work on precision wellness as part of your research on artificial intelligence, agent-based computing, high-performance processing and the Semantic Web, of which you're one of the originators?

James Hendler:
Yeah, it's an interesting story because for many years of my career I wasn't involved in health research and then, I guess about two years ago, I got asked if I would sit on the National Academies Board on Research Data and Information. And, as I started working on that, I was exposed to more and more of the issues that were going on in the sharing of health information and what was going on in healthcare, wellness and also precision medicine, and realized that many of the problems that I was hearing about were similar to those that I had worked on in other fields.

So for example, the kinds of problems that we hear a lot about are that information that's in one part of the health system (say, the health records) and information in another part (say, images taken in the X-ray) are not always put together until they reach the doctor's level or some other level. And when we're actually doing research, those things are often in very different sets of information where you can't put them together. So that kind of information sharing, information integration has been really a hallmark of my own work.

And then the other thing was, I started learning a lot about issues of personalization. So if two people right now both put in the same blood sugar number and said, you know, is this a good one or a bad one, they would pretty much always get the same answer out. But actually, that needs to be contextualized. For somebody who has a family history of diabetes and is a little bit overweight, a number that would be sort of at the bottom of pre-diabetic would be a very good number, but for someone who is thin and exercising and didn't have a family history, that would be a bad indicator that something was going on.
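As a rough illustration of the blood sugar example above, the sketch below contrasts a context-free reading with a contextualized one. The 100 to 125 mg/dL band is the commonly cited fasting range for pre-diabetes; the `Person` fields, the `LOW_END_MARGIN` value and the wording of the labels are assumptions made up for this sketch and are not clinical guidance.

```python
from dataclasses import dataclass

# Commonly cited fasting glucose band for pre-diabetes, in mg/dL.
PREDIABETIC_LOW = 100
PREDIABETIC_HIGH = 125
# Hypothetical margin for "the bottom of pre-diabetic" (illustrative only).
LOW_END_MARGIN = 5


@dataclass
class Person:
    family_history: bool  # family history of diabetes
    overweight: bool
    exercises: bool


def generic_reading(glucose: int) -> str:
    """Context-free answer: the same number always gets the same label."""
    if glucose < PREDIABETIC_LOW:
        return "normal"
    if glucose <= PREDIABETIC_HIGH:
        return "pre-diabetic"
    return "diabetic range"


def personalized_reading(glucose: int, person: Person) -> str:
    """Toy contextual rule: reinterpret a low pre-diabetic number by risk profile."""
    label = generic_reading(glucose)
    near_bottom = glucose <= PREDIABETIC_LOW + LOW_END_MARGIN
    if label != "pre-diabetic" or not near_bottom:
        return label
    if person.family_history and person.overweight:
        return "pre-diabetic: encouraging for this risk profile"
    if person.exercises and not person.family_history and not person.overweight:
        return "pre-diabetic: unexpected, worth following up"
    return label


higher_risk = Person(family_history=True, overweight=True, exercises=False)
lower_risk = Person(family_history=False, overweight=False, exercises=True)
print(generic_reading(101))
print(personalized_reading(101, higher_risk))
print(personalized_reading(101, lower_risk))
```

The same input number produces one generic label but two different contextual readings, which is the point of the example.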

So these two issues, the information integration and the personalization issues, were both things that I've worked on and been excited by and things that AI and some of these other things we mentioned and specifically this Semantic Web work really were aimed at addressing. So I got more and more interested in doing that in the medical context.

Ann Nguyen:
Let's discuss emerging technologies from machine learning and artificial intelligence. What applications do they have in healthcare? What's their potential impact?

James Hendler:
Yeah, it's a good question because really what we're seeing is a huge potential impact and a lot of early application. It gets into a kind of complicated discussion about what's going on right now in machine learning and AI, but really what's happened is that several different AI technologies have kind of come around and … the curve. There's now enough computing power and enough data that many things that were hard to automate can be automated, and many of those give us some very low-hanging fruit in healthcare, things like image analysis.

However, some of the really hard problems looking further out will need new research, and I think that's where we're really going to see some exciting things happen. Some of the things I think a lot about are, for example, that the interaction between a patient and a doctor right now happens while you're in a medical situation. So if you've developed diabetes, then there's a fairly ongoing interaction, but beforehand, when you're in a pre-diabetic stage, for example, there really is no medical information there. It's lifestyle, it's wellness, it's care that has to be very personal. … in AI we're getting better at doing that.

In the analytic space, what we're seeing is a lot of new work which is potentially able to look at how you can differentiate different groups of people. In the past we've had to do that in advance, what's called cohort analysis. So you'd say, let's compare by gender, let's compare by age. But the new machine learning technologies are able to start saying, you know, if you separate out your groups this way, then you see these different sets of predictors.

So, for an example, we were working with a local hospital on emergency room readmissions and there was a very small number of patients who accounted for a very high number of the readmissions and they were pretty easy to predict. They were homeless or had diabetes or lived in certain places, things like shelters. But, when you moved past that cohort, the next group was much more complicated and really was where the hospital could focus some new efforts. So the fact that the data analytics can now get us to where we can see that kind of precision cohort or what's sometimes called the cadre is really an exciting new technology.
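The contrast between cohorts chosen in advance and cohorts suggested by the data can be sketched in a few lines. The snippet below is only a minimal illustration: the records are synthetic and invented for this sketch, the attribute names are hypothetical, and ranking attributes by the spread in readmission rate is a deliberately simple stand-in for the machine learning methods mentioned in the conversation, not the method used in the hospital project.

```python
from collections import defaultdict

# Synthetic records, invented purely for illustration.
records = [
    {"age_band": "under_50", "gender": "F", "housing": "shelter", "readmitted": True},
    {"age_band": "over_50", "gender": "M", "housing": "shelter", "readmitted": True},
    {"age_band": "under_50", "gender": "M", "housing": "housed", "readmitted": False},
    {"age_band": "over_50", "gender": "F", "housing": "housed", "readmitted": False},
    {"age_band": "under_50", "gender": "F", "housing": "housed", "readmitted": False},
    {"age_band": "over_50", "gender": "M", "housing": "housed", "readmitted": True},
    {"age_band": "under_50", "gender": "M", "housing": "housed", "readmitted": False},
    {"age_band": "over_50", "gender": "F", "housing": "housed", "readmitted": False},
]


def readmission_rates(rows, attribute):
    """Readmission rate for each value of one attribute."""
    counts = defaultdict(lambda: [0, 0])  # value -> [readmitted, total]
    for row in rows:
        bucket = counts[row[attribute]]
        bucket[0] += int(row["readmitted"])
        bucket[1] += 1
    return {value: hits / total for value, (hits, total) in counts.items()}


def rate_spread(rows, attribute):
    """Gap between the highest and lowest readmission rate across groups."""
    rates = readmission_rates(rows, attribute).values()
    return max(rates) - min(rates)


def most_separating_attribute(rows, attributes):
    """Let the data pick the grouping, instead of fixing it in advance."""
    return max(attributes, key=lambda name: rate_spread(rows, name))


best = most_separating_attribute(records, ["age_band", "gender", "housing"])
print(best, readmission_rates(records, best))
```

On this toy data the predefined groupings (age band, gender) separate the readmissions only weakly, while the housing attribute isolates the small, easy-to-predict group. Repeating the same step on the remaining records would correspond to moving past that first cohort.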

So, again, a lot of the focus at the moment has been primarily in the biomedical side. We're learning a lot about genomics, proteomics, things like that and, of course, there's tons of AI being applied in there because you have just this deluge of data and, a lot of machine learning techniques.

But where I think things are really getting exciting is, as we start to move out to these areas where we can start to see things that need to be put together, where AI can help us bring something together.

Another wonderful example I saw recently was, basically a CAT scan that most analysts would say has a tumor in it. It's a breast cancer case. But in fact, the health record says that this person has had a mastectomy so it clearly can't be a breast tumor. If you can get that information together, then the person looking at it can do it.
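The mastectomy example above comes down to joining two sources that usually live apart. A minimal sketch, assuming hypothetical record layouts (`imaging_findings`, `health_records`) and plain string matching in place of real clinical coding, might look like this:

```python
# Hypothetical data layouts, invented for illustration only.
imaging_findings = [
    {"patient_id": "p1", "finding": "suspected breast tumor"},
    {"patient_id": "p2", "finding": "suspected breast tumor"},
]
health_records = {
    "p1": {"procedures": ["mastectomy"]},
    "p2": {"procedures": []},
}


def flag_conflicts(findings, records):
    """Return patient ids whose imaging finding conflicts with their history."""
    flagged = []
    for item in findings:
        history = records.get(item["patient_id"], {}).get("procedures", [])
        # Toy rule: a breast tumor finding alongside a recorded mastectomy
        # should be put in front of a human reviewer, not auto-resolved.
        if "breast tumor" in item["finding"] and "mastectomy" in history:
            flagged.append(item["patient_id"])
    return flagged


print(flag_conflicts(imaging_findings, health_records))
```

The sketch only surfaces the conflict. Deciding what the image actually shows is left to the person looking at it, as in the example.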

Now, in a large hospital you might have a tumor board or group of people who do that as humans but if you're, you know, a clinical oncologist out in some community, it's really hard to get your hands on that kind of information all at one time and therefore patient care varies tremendously by where you are, by the capabilities of the practitioners, things like that.

I think the AI technologies and the machine learning technologies can really help us make some breakthrough cases in bringing the average doctor everywhere up to the performance of someone at a Mayo Clinic or a Dana-Farber.

Ann Nguyen:
What role might AI-driven technologies have in enhancing data visualization in genomics, drug discovery or clinical developments?

James Hendler:
Well that's another area that I think is really up and coming.

So, as you've sort of heard me implying from the examples I've been talking about so far (personalization, integration of data, things like that), that's kind of more like the clinician or patient end. But now, if we move down to the researcher end, the person who's really trying to do the biomedical research, the life science or even just the predicting healthcare, what we see is that many, many of the hard problems sort of look like a social network grid, if you've ever seen one: this person knows this person, this person and this person; this person bought this; this person likes that. That's the kind of information that has been getting Facebook in trouble with Cambridge Analytica and things like that. Our ability to do these large network analyses, much of which is done using data analytics and then presented in some kind of network visualization, is where the exploration can happen. People can look at that data and start to say, here's what we think is happening, or, we may see a problem over here, and then you sort of drill down into that cluster to start asking questions.

So the term data exploration is used when we say, taking this kind of data, pulling it together, looking at it in the early stages, forming some hypotheses and then going out and testing them.

So most of the machine learning work is essentially correlative. We see X and Y occurring together. Now, you know, it's a truism we all learned as kids that correlation doesn't imply causality. On the other hand, if you can't find correlation, the causal explanation may be pretty hard to defend. So what we can see is that we can rule out many possibilities…. But right now humans are still better at looking at this combination of data after it's been processed by the machine and saying, you know, I think something interesting is happening there.
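The idea of using correlation to rule possibilities out, and leaving the interpretation to people, can be sketched as a simple screen. The numbers below are made up, the factor names are placeholders, and the 0.5 threshold is an arbitrary assumption chosen for this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def candidates_worth_a_look(outcome, factors, threshold=0.5):
    """Keep factors that correlate with the outcome; drop the rest.

    Passing the screen does not show causation. It only means the factor
    has not been ruled out and may deserve a human's attention.
    """
    return [
        name
        for name, values in factors.items()
        if abs(pearson(values, outcome)) >= threshold
    ]


# Made-up measurements for illustration.
outcome = [1, 2, 3, 4, 5]
factors = {
    "factor_a": [2, 4, 6, 8, 10],  # moves with the outcome
    "factor_b": [3, 1, 3, 1, 3],   # no linear relationship to the outcome
}
print(candidates_worth_a_look(outcome, factors))
```

A factor that fails the screen is hard to defend as a causal explanation; one that passes is only a candidate for the kind of human inspection described above.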

So one of the things we see and are actually doing within an institute that I'm also helping to run at RPI, which is called the Institute for Data Exploration Applications, is we're looking at collaborative visualizations. We're looking at bringing a set of people so, for example, if you bring several cardiologists together and show them the hospital's cardiology data, they can see things the machine can't because they know things about the processes, about the policies. If they see that…back in our emergency room example, when the administrators looked at some of our data, they saw a pattern that we knew was a pattern but we didn't know what it meant and they said, oh, that tells us if we change our staffing in a certain way we could actually reduce this problem.

So that's the kind of thing that we don't see machines by themselves working out in the near future. But people looking at the results of these machines – and that means, of course, we have to find ways to show it to them, we have to develop new devices that will make it so that multiple people can look at data together rather than staring at a small screen – we're very interested at Rensselaer in large-scale visualization. So again, we as humans are used to looking at things at the scale of the outside world and when you look at them on a small screen, you have to sort of learn to look at things differently.

When some of those same things can be shown at human perceptual scales with multiple people who can see each other, see those things, pick up on the cultural cues we've all been taught, some really exciting things happen.

So it's really not so much that the data visualization itself is going to solve the genomics, drug discovery development type problems, it's that it becomes an enabler for humans together to start saying, hey, you know, I see this and you see that and maybe that explains something that's going on.

So I think that's a very, very fertile area and a very exciting area for new exploration.

Ann Nguyen:
Thank you, Jim, for sharing your background and insights today.

James Hendler:
It's my pleasure.

Ann Nguyen:
That was Dr. James Hendler of Rensselaer Polytechnic Institute. He'll be speaking during the Machine Learning track at Bio-IT World, which runs this May 15-17 in Boston.

To learn more from him, visit for registration info and enter the keycode “Podcast”.

This is Ann Nguyen. Thanks for listening.
