The Shaming of Watson

IBM’s AI tool for healthcare is solving problems, and getting better every year. So why is everyone acting like it’s a failure?

--

From Hero to Has-Been in Just 4 Years

If you’re at all interested in technology and healthcare, by now you’ve probably heard about IBM Watson, the artificial intelligence technology that went from winning on Jeopardy in 2011 to being marketed to healthcare organizations for a variety of purposes.

Watson on Jeopardy (Photo: IBM)

One of the earliest implementations was at MD Anderson Cancer Center (MDA) in Houston, where Watson was to help oncologists solve a big problem: too much data. From a press release in October 2013 entitled “MD Anderson Taps IBM Watson to Power ‘Moon Shots’ Mission”:

MD Anderson has accumulated an unprecedented breadth and depth of clinical oncology data and knowledge… Watson’s cognitive capability has been shown to be a powerful tool to extract valuable insights from such complex data and MD Anderson’s Oncology Expert Advisor capability can generate a more comprehensive profile of each cancer patient… MD Anderson’s Oncology Expert Advisor can provide evidence-based treatment and management options that are personalized to that patient, to aid the physician’s treatment and care decisions.

Pretty ambitious. Fast forward just 4 years to 2017, though, and the picture has changed.

So in only 4 years, MD Anderson went from christening the project to . . . shutting it down completely. That’s a shockingly short period of time to even get a project running, much less to be able to evaluate whether it’s working. Makes you think that Watson must have been a complete disaster!

Well, not so much. In fact, the program was closed down for contracting irregularities, according to an audit done by the University of Texas (the parent university of MD Anderson). Contracts were made without proper signatures and approval, money earmarked for the Watson program was spent elsewhere, and on and on.

The only thing that wasn’t a problem, according to that audit: Watson.

In fact, the auditors noted that “Medical oncology staff also told us that internal pilot testing of [Watson’s work with lung cancer treatment] achieved an accuracy of prediction near 90 percent, but advised that significant updating is needed before [Watson] can be tested further.”

The medical staff also told the auditors that Watson was not in any way integrated with the hospital’s electronic medical record (EMR) system — not surprising, since one of the main characteristics of EMR systems nationwide is the difficulty in getting them to connect to other systems.

The point is that there is no indication that anyone on the medical staff at MD Anderson felt that Watson itself was a problem, or overhyped, or failing to perform up to expectations.

Meanwhile, Progress

As a counterpoint to the MD Anderson collaboration, one can look to IBM’s work with its more than 230 partnering healthcare organizations worldwide.

One example is Memorial Sloan Kettering Cancer Center (MSKCC) in New York City, where medical staff have been working with IBM since 2012, using the AI technology in a variety of ways.

These systems are in wide use and have been found to be highly concordant with physician recommendations in studies in Korea, Thailand, Mexico, Arkansas, North Carolina, and elsewhere. UNC provides a particularly promising example:

In a study UNC conducted with 1,000 actual patient cases to compare Watson’s genomic analysis with the analysis of the center’s tumor board, the investigators found that Watson identified the same potential therapies as the tumor board 99% of the time. But what was more extraordinary, in about 300 patients, Watson found clinically actionable information that the tumor board had not identified.

For a variety of reasons, the systems recommending treatments are unlikely to achieve full concordance, particularly in international settings: the systems are trained on US data, for example, and US treatment protocols can differ significantly from those in other countries. Even so, the results are undeniably promising.
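To make numbers like “99% concordance” and “300 patients with additional findings” concrete, here is a minimal sketch of how such figures might be tallied from per-patient recommendation sets. This is purely illustrative: the function, variable names, and data below are hypothetical and are not drawn from the UNC study or from Watson itself.

```python
# Illustrative sketch only: a toy concordance calculation with made-up data,
# not code from the UNC study or from the Watson system.

def concordance_summary(ai_recs, board_recs):
    """Compare per-patient therapy recommendations from an AI system and a tumor board.

    Each argument maps a patient ID to the set of therapies recommended for that patient.
    Returns (share of patients where the AI covered the board's options,
             number of patients where the AI surfaced options the board had not listed).
    """
    matched = 0
    extra_findings = 0
    for patient_id, board in board_recs.items():
        ai = ai_recs.get(patient_id, set())
        if board <= ai:        # AI covered everything the board recommended
            matched += 1
        if ai - board:         # AI surfaced therapies the board did not list
            extra_findings += 1
    return matched / len(board_recs), extra_findings

# Toy usage with hypothetical data for three patients
ai    = {"p1": {"cisplatin"}, "p2": {"erlotinib", "clinical-trial-X"}, "p3": {"pembrolizumab"}}
board = {"p1": {"cisplatin"}, "p2": {"erlotinib"}, "p3": {"pembrolizumab"}}
rate, extra = concordance_summary(ai, board)
print(f"concordance: {rate:.0%}, patients with additional AI-identified options: {extra}")
```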

Still, the narrative has shifted from favorable to failure.

Online articles mentioning “IBM”, “Watson”, “Health”, and “Fail” (or “Failure”)

Watson is Bad

Leading the drumbeat of bad news on Watson has been STAT News, an online journal “about life sciences and the fast-moving business of making medicines”. In 2017 and 2018, it published a series of unflattering articles about Watson, with the most damning (“IBM pitched its Watson supercomputer as a revolution in cancer care. It’s nowhere close”) coming out in September 2017.

Some of the criticisms strike me as frankly silly. For example, STAT notes that “the actual capabilities of Watson for Oncology are not well-understood by the public…”, but I’m not quite sure why the public would be expected to have any in-depth understanding of an oncology data system.

STAT also says that Watson “is still struggling with the basic step of learning about different forms of cancer,” which should surprise no one. Cancer AI isn’t like self-driving cars — where at some point the systems may be good enough that the AI won’t need further training, because the system will know everything it needs to know. In medicine, and particularly in oncology, we do not know — and do not expect to ever know — everything we need to know.

Like they say, it’s a journey, not a destination.

Even When It’s Good

But the most “underwhelming” aspect of Watson, per the STAT authors, was that it agreed with the doctors’ treatment ideas:

On a recent morning, the results for a 73-year-old lung cancer patient were underwhelming: Watson recommended a chemotherapy regimen the oncologists had already flagged… [One of the oncologists] said later that the background information Watson provided, including medical journal articles, was helpful, giving him more confidence that using a specific chemotherapy was a sound idea. But the system did not directly help him make that decision, nor did it tell him anything he didn’t already know.

So we’re supposed to be disappointed because a computer sitting on a desk provided a treatment recommendation for a particular patient, taking into account that patient’s history, labs, type of cancer, etc . . . and it was the same as the one picked by the medical specialist who had trained for more than a decade to do the same thing?

Yes, says STAT: “… showing that Watson agrees with the doctors proves only that it is competent in applying existing methods of care, not that it can improve them.” Ho-hum.

Don’t Believe the Hype

It seems to me that the one truly valid criticism of the Watson system is that IBM hyped it relentlessly (a process you might have cottoned on to once you noticed them hawking Watson on Jeopardy). Guilty as charged: IBM certainly has worked to build expectations. But looking past the hype, there is a there there: Watson for Oncology is a widely used system that most of its user-doctors seem to find useful, with no evidence at all of widespread opposition or objection in that same population of providers.

Does it need more refinement, and more data, and especially more clinical validation and more peer-reviewed reporting in the medical literature? Yes, yes, yes, and yes. But let’s not overlook the fact that even today Watson is an electronic system that can more often than not look at the patient data and give us the same treatment recommendations as a highly trained oncologist with years of experience, which is—make no mistake—a goddamn miracle of technology.

This story was originally published at FutureHealth.

--

TED/Davos/keynote speaker, Lemelson/WSJ winner, Magpi co-founder, Ebola doc, Airbnb’er. AI, big data, digital health, global health.