In 1971 Godfrey Hounsfield and his team obtained the first clinical CT image of a living human brain at Atkinson Morley Hospital in London, in order to investigate a patient with a suspected frontal lobe tumour. In the same period Tim de Dombal, a surgeon at St. James Infirmary in Leeds, successfully carried out a trial, reported in 1972, of a “computer assisted diagnosis” system which used statistical techniques to assess the likelihood that patients presenting with acute abdominal pain were suffering from each of a number of conditions.

By 1975 the first commercial CT scanners were available (at about $1M a machine) and every major hospital in the Western world aspired to have one; by 2007 some 72 million CT scans were being performed annually in the USA alone (Wikipedia). By 1975 it was also possible to program de Dombal’s method on a $99 Hewlett-Packard calculator. However, by 2007 the use of statistical methods in diagnosis and treatment decisions had achieved almost no penetration into routine clinical practice.
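De Dombal’s method is generally described as a naive Bayes calculation, and the computation really is small enough for a $99 calculator. Here is a minimal sketch in Python to give a flavour of it; the conditions, findings and all probabilities below are invented for illustration, not taken from his database.

```python
# A minimal naive-Bayes diagnostic calculation of the kind de Dombal's
# system is usually described as performing. All conditions, findings
# and probabilities are invented for illustration only.

# Prior probability of each condition among patients with acute abdominal pain
priors = {"appendicitis": 0.25, "non-specific pain": 0.60, "cholecystitis": 0.15}

# P(finding present | condition) for two illustrative findings
likelihoods = {
    "appendicitis":      {"rebound tenderness": 0.70, "nausea": 0.75},
    "non-specific pain": {"rebound tenderness": 0.10, "nausea": 0.30},
    "cholecystitis":     {"rebound tenderness": 0.15, "nausea": 0.70},
}

def posterior(findings):
    """Return P(condition | findings), assuming findings are independent
    given the condition (the 'naive' assumption)."""
    scores = {}
    for condition, prior in priors.items():
        score = prior
        for f in findings:
            score *= likelihoods[condition][f]
        scores[condition] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

print(posterior(["rebound tenderness", "nausea"]))
# {'appendicitis': ~0.80, 'non-specific pain': ~0.11, 'cholecystitis': ~0.10}
```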

This is an amazing contrast, particularly when one considers that the main uses of CT and most other imaging systems are limited to investigating conditions that produce local anatomical abnormalities that can be rendered as images, while statistical decision analysis is potentially useful in any clinical decision – risk assessment, diagnosis, test selection, prognosis, treatment planning and many other routine clinical problems – and doesn’t require any special equipment. Nowadays clinical decision support systems are a natural class of applications for the tablet computer, smartphone and the like.

So why has statistical decision-making had so little impact on clinical practice? It’s not that the techniques are immature; Bayesian and decision-analysis methods are very well understood from a mathematical point of view and have been applied with success in many other settings. One possibility often mentioned for the lack of take-up in medicine is that the techniques are famously “data hungry”: in order to get enough information to diagnose a disease reliably, or to predict the success of a treatment or other clinical intervention, you need good data for a lot of patients. Such data can take a long time to collect and be problematic in a variety of ways.

The Glasgow Dyspepsia System, GLADYS, was an early computer-aided diagnosis project that used probabilistic methods for diagnosing upper abdominal pain. The GLADYS team rapidly accumulated records for patients presenting with common causes of dyspepsia (e.g. duodenal ulcer and “functional” disease), but the rate-limiting factor in obtaining good estimates of probabilities for the dyspepsia domain was the unusual conditions, notably gastric cancers. In fact it took about 10 years to compile a sufficiently large set of patients with cancer to fully populate the statistical database! Further complications arose from geographical and temporal variability, the lack of standardization of terms, and even vague understandings of clinical questions like “are you suffering from indigestion, and if so for how long?”
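A rough calculation shows why the rare conditions set the timetable. The sketch below uses invented numbers (not GLADYS data) to estimate how long a single clinic would take to accumulate 100 gastric cancer cases, and how uncertain a conditional probability estimated from even that many cases remains.

```python
import math

# Invented illustrative numbers, not GLADYS data.
cancer_prevalence = 0.02   # fraction of dyspepsia patients with gastric cancer
patients_per_year = 500    # new dyspepsia presentations per year at one clinic

# Years needed to accumulate 100 cancer cases
cases_needed = 100
years = cases_needed / (cancer_prevalence * patients_per_year)
print(f"~{years:.0f} years to collect {cases_needed} cancer cases")  # ~10 years

# Even then, P(finding | cancer) is estimated from only 100 cases.
# Approximate 95% confidence half-width for an observed proportion p:
p = 0.3  # suppose 30% of cancer patients report a given symptom
half_width = 1.96 * math.sqrt(p * (1 - p) / cases_needed)
print(f"95% CI roughly {p:.2f} +/- {half_width:.2f}")  # +/- 0.09
```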

If it isn’t practical to estimate statistical parameters on a large scale, perhaps we might turn to clinical experts to provide their subjective estimates of likelihoods? Unfortunately cognitive scientists have shown that experts can’t provide the required “short cut”, because their estimates of probability parameters are subject to various forms of bias, and clinicians are often reluctant to attempt such estimates anyway. These and other issues have made subjective estimates an impractical substitute for objective ones.

By 1980 a new option for helping clinicians with their decision-making had appeared, one which took a logical, rule-based approach rather than a statistical one. The hubristically named “expert systems” promised to “apply human knowledge” and “emulate expert reasoning” in solving problems in many fields, one of the most prominent of which was medicine. In the last 10 years or so rule-based techniques have become increasingly popular, and a number of commercial clinical decision systems built on them are now being promoted. Despite practical successes there are reasons to be cautious about their use in clinical applications, particularly ones where a lot is riding on making good decisions. Firstly, they are not grounded in a well-understood mathematical framework for clinical decision-making, in contrast with established numerical methods. Furthermore, it is very difficult to check and validate rules, and the old software adage “garbage in, garbage out” still applies.
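A toy example suggests why validating rules is hard. The two rules below (invented for illustration) are each individually plausible, but nothing in the rule base itself exposes that together they give contradictory advice:

```python
# Two invented, individually plausible-looking clinical rules.
# Each is easy to read; the conflict between them is easy to miss.

def rules(patient):
    advice = []
    # Rule 1: anticoagulate patients with atrial fibrillation
    if "atrial fibrillation" in patient["conditions"]:
        advice.append("start anticoagulant")
    # Rule 2: avoid anticoagulants in patients with a recent GI bleed
    if "recent GI bleed" in patient["conditions"]:
        advice.append("avoid anticoagulant")
    return advice

patient = {"conditions": ["atrial fibrillation", "recent GI bleed"]}
print(rules(patient))
# ['start anticoagulant', 'avoid anticoagulant'] -- contradictory advice,
# and nothing in the rule base itself detects or resolves the conflict.
```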

Big data 

Perhaps now is the time for mathematical methods to return to the stage – perhaps this time to achieve the kind of impact in routine clinical practice that CT and other medical imaging techniques have had? Many scientists and clinicians are taking this idea very seriously because of the rapid development of techniques for extracting useful information from the huge collections of structured and unstructured data to be found on the web. The appetite for large amounts of data characteristic of mathematical decision models may be about to be sated by the ability to rapidly estimate, and continuously update, reliable statistical parameters that can be fed into quantitative decision models. The new fields of “predictive analytics” and “prescriptive analytics” are growing rapidly.
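To make that idea concrete, here is one minimal sketch of what “continuously updated parameters feeding a quantitative decision model” could look like: a Beta-Bernoulli update of two treatments’ success rates as outcome records stream in, with the running estimates driving a simple expected-benefit comparison. Every number is invented for illustration.

```python
# Minimal sketch: continuously updating a success-rate estimate from
# streaming outcome data (Beta-Bernoulli), feeding an expected-benefit
# comparison. All numbers are invented for illustration.

successes_a, failures_a = 1, 1   # Beta(1, 1) prior for treatment A
successes_b, failures_b = 1, 1   # and for treatment B

def update(successes, failures, outcome):
    """Record one observed outcome (True = success)."""
    return (successes + 1, failures) if outcome else (successes, failures + 1)

# Outcomes arriving from routine records (invented stream)
for outcome in [True, True, False, True, True]:
    successes_a, failures_a = update(successes_a, failures_a, outcome)
for outcome in [True, False, False, True]:
    successes_b, failures_b = update(successes_b, failures_b, outcome)

# Posterior mean success rates
rate_a = successes_a / (successes_a + failures_a)
rate_b = successes_b / (successes_b + failures_b)

# Feed the current estimates into a simple decision model:
# expected benefit = P(success) * benefit - harm of side effects
benefit, harm_a, harm_b = 10.0, 2.0, 0.5   # invented utilities
print("A:", rate_a * benefit - harm_a)
print("B:", rate_b * benefit - harm_b)
```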

“Big data” seems to have captured people’s imagination far beyond the medical and clinical research world, in part perhaps because of our everyday experience with the magic of Google and other search engines. Google, Autonomy, SAS clinical analytics, GE Healthcare and Microsoft are just a few of the companies actively looking to apply big data technologies in clinical decision-making, along with the new kid on the block, IBM’s “Watson”, which famously beat two national champions on the US general knowledge game show Jeopardy and is now being refitted for healthcare applications. Watson’s advocates are promising revolutionary benefits to clinicians: “properly applied this has the potential to totally change the practice of medicine” (e.g. The Next Cancer Breakthrough http://www.youtube.com/watch?v=hMtXHvbecY0).

What does Watson do? In one IBM demonstration video[1] the voiceover says “Dr. Mark Norton, a clinical oncologist, logs into the electronic medical record for one of his patients. Instead of spending time trying to find relevant information he uses the IBM oncology diagnosis and treatment advisor and pushes the Ask Watson button and Watson analyses the patient data against tens of thousands of documents in its vast body of medical literature” (screen shows “3,469 textbooks; 69 guidelines; 247,460 journal articles; 61,540 clinical trials; 106,054 other clinical documents”).

The voiceover continues: “Dr. Norton starts with the case information tab and Watson pulls out the relevant information as well as suggestions for additional information to gather” [and] “tests that Watson suggests he consider ordering”. At the push of a button Watson can give specific text snippets to support the suggestions it makes, and pull up the evidence that underpins these snippets from the open research literature. Finally “he presses the treatment options tab to see a panel of confidence-scored suggested treatments” (see figure) “and a list of clinical trials to consider, and again can review the supporting evidence…”. The magic continues, with Dr. Norton speaking directly to Watson, to give additional information and ask questions.

This is surely amazing[2]. At this point the Watson example shown in the video is only a demonstration; the cancer application is not yet operational, the narrator tells us. If and when it is, we will of course want to see evidence that this new capability to process documents on a huge scale, and to estimate statistical parameters for assessing the predicted benefits of alternative treatments, is really going to deliver significant patient benefits. Nevertheless, despite the lack of firm and objective evidence to date, the potential implications are clear.

Caveats

There is a danger of getting a bit carried away, though. In another IBM marketing video (http://www.youtube.com/watch?v=8DBqLTdPolI) a senior clinician from Memorial Sloan Kettering says “we have the opportunity of going past intelligence to what I would call wisdom” and “Watson is going to enable us to … take that wisdom and put it in a way that people who don’t have that much experience in any individual disease can have a wise counselor at their side”.

The key thing to remember about Watson, however, is that despite its obvious power to crunch massive amounts of medical information and clinical data, its capabilities are not really comparable to the repertoire of abilities, skills and insights that a professional clinician calls clinical judgment. Watson is only what the Watson technical team designed it to be: “a smart question/answer system”. Being able to answer the range of general knowledge questions that arise in the Jeopardy game show is very impressive, and being able to rapidly come up with answers to important clinical questions like “what are the treatment options for my patient?” is certainly one of the core skills of the expert clinician.

But clinical expertise and judgment are much more than that; a key feature of clinical judgment, for example, is knowing what the important questions are in the first place. Clinical judgment also depends critically on being able to consider a patient’s personal circumstances, comorbidities and polypharmacy, and on knowing when to deviate from the standard or usual care pathway. It means being able to plan treatments as well as answer questions, and to help patients set up a personalized plan of care consistent with their individual needs and preferences. For many patients it includes the expectation that the clinician will explain things clearly – what is being recommended and why – and tailor the detail and depth of explanations to a patient’s goals, abilities and requirements for information.

A key question in thinking about how big data might address these challenges is how people in general, including doctors and their patients, actually make decisions. A further set of insights from cognitive science is that people don’t make their decisions on statistical grounds, but in very different ways (Daniel Kahneman, Thinking, Fast and Slow). Patients, and their doctors, don’t apply numerical models in their decision-making, and when they try they don’t do it very well. People do something else.

First, a skilled decision-maker establishes the key clinical goals and priorities, and may have to creatively “problem solve” when facing a situation that has not been encountered before. We may apply rough “rules of thumb” and “fast and frugal heuristics” to quickly come up with diagnosis or treatment options in routine situations, but people also cope with uncertainty in subtle ways, and may weigh up many different criteria and arguments for and against the options. Clinicians and patients also seek to understand those arguments – the pros and cons of the competing options and the many kinds of knowledge that underpin them in the individual case – and evaluate the options against many distinct objective and subjective criteria. Finally, actually taking a decision is frequently difficult: it is not just a matter of cool and rational assessment of data but is often bound up with personal aspirations, anxieties and cultural values. Good clinical judgment is about trying to accommodate all these issues in intelligible decision-making processes within a coherent patient narrative.
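One crude way to model this weighing of arguments is sketched below: each argument is attached to the option it supports or opposes, with a signed weight, and the options are compared on their net support. The options, arguments and weights are all invented, and the aggregation is a caricature of the subtle reasoning described above, but it shows how arguments, rather than raw probabilities, can serve as the unit of deliberation.

```python
# A caricature of argument-based deliberation: each option accumulates
# weighted arguments for and against it. Options, arguments and weights
# are invented for illustration.

arguments = [
    # (option, argument, weight: + supports, - opposes)
    ("surgery",      "best long-term survival evidence",      +3),
    ("surgery",      "patient is frail; operative risk high", -2),
    ("chemotherapy", "avoids operative risk",                 +2),
    ("chemotherapy", "side effects conflict with patient's "
                     "wish to keep working",                  -1),
]

totals = {}
for option, reason, weight in arguments:
    totals[option] = totals.get(option, 0) + weight

for option, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    pros = [r for o, r, w in arguments if o == option and w > 0]
    cons = [r for o, r, w in arguments if o == option and w < 0]
    print(f"{option}: net {total:+d}; for: {pros}; against: {cons}")
```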

Cognitive systems

In Watson IBM is promising to give us a “cognitive system”. However, humans are still the best cognitive systems we know in terms of the range and flexibility of our thinking. Humans also have another powerful capacity which is not characteristic of many computer systems: we are “meta-cognitive” creatures. We can reflect critically on our thinking, our knowledge and our assumptions, on the persuasiveness of evidence and the adequacy of arguments, and on whether our assumptions are still valid as the world changes. We can also deliberate on the trade-offs between costs and benefits and balance competing criteria. Is the cost of a treatment combined with the risk of an unsuccessful outcome more or less attractive than the impact on quality of life offered by a less effective but also less aggressive chemotherapy? It is a core clinical skill to be able to help patients arrive at a balance that they will not later come to regret.
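The chemotherapy question above can be turned into a small worked comparison. In this sketch (all probabilities and utilities invented) each regimen is scored on expected survival plus a weighted quality-of-life term; the point is that which option “wins” flips with the quality-of-life weight, a value only the patient can supply.

```python
# Worked version of the trade-off above: aggressive vs. gentler
# chemotherapy. All probabilities and utilities are invented; the
# quality-of-life weight is exactly the kind of value only the
# patient can supply.

def score(p_success, years_if_success, qol_during_treatment, qol_weight):
    """Expected benefit = survival term + weighted quality-of-life term."""
    return p_success * years_if_success + qol_weight * qol_during_treatment

aggressive = dict(p_success=0.55, years_if_success=5.0, qol_during_treatment=0.4)
gentler    = dict(p_success=0.40, years_if_success=5.0, qol_during_treatment=0.8)

for qol_weight in (1.0, 3.0):  # how much this patient values feeling well now
    a = score(**aggressive, qol_weight=qol_weight)
    g = score(**gentler, qol_weight=qol_weight)
    best = "aggressive" if a > g else "gentler"
    print(f"qol_weight={qol_weight}: aggressive={a:.2f}, "
          f"gentler={g:.2f} -> {best}")
# The preferred option flips as the quality-of-life weight rises.
```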

Patients and their doctors want to discuss their decisions, to review and compare the treatment options and how they might play out over their lives and personal circumstances. They want to know the provenance of a recommendation and the evidence it depends on. “Why should I trust this damn machine?” The trouble with big data and the search engines we all use is that right now they do not reflect in any depth on why this particular set of options appeared on the first page of Google results, or on the reasons a particular item is present and another is not. Indeed, are the options we see a dispassionate reflection of the success of a treatment for a patient like me? Or have the treatment providers massaged the data to “up” their Google rank? For the moment, big data seems not so different from a black box. The instant acceptance of early CT images, compared with the poor adoption of statistical decision methods into clinical practice, still holds a lesson in the age of big data – if computer advice isn’t intuitive and its provenance transparent, then claims of “revolutions in clinical practice” and of “going past intelligence to ‘wisdom’” seem somewhat premature.