Artificial intelligence and radiology – threat or tool?

In spite of alarm bells warning that artificial intelligence (AI) would decimate the radiology profession, a host of barriers – both technical and regulatory – makes this unlikely for the foreseeable future. Instead, over the coming decade, AI is, at best, likely to help radiologists do their jobs more quickly and to improve patient outcomes.

From CAD to AI
AI in radiology has, in some senses, tended to raise the same level of expectation as computer-aided detection (CAD) did for the profession in the 1990s. Indeed, there is now a distinction between computer-aided detection, which reduces observational oversights and false negatives in interpreting medical images, and computer-aided diagnosis (also called CAD), in which software analyses a radiographic finding to estimate the likelihood of a specific disease process (e.g. a benign versus a malignant tumour).
As a result, in spite of tens of thousands of machine-learning algorithms, there is little connection to clinical application; most remain confined to the realm of research.

The ‘black box’ barrier
Radiologists, for example, rely on visual pattern matching. However, few object-recognition algorithms have yet been tested on grey-scale images such as those widely used in radiology.
Though specific algorithms could in principle be tailored for specific tasks, they rely on different assumptions and targets, and are often written to work on different modalities. Consolidating a set of such algorithms into one package, and then using this to underpin image or data analysis, is not currently feasible.
In effect, the key problem with CAD is its ‘black box’ nature: it cannot explain why an object has been identified as abnormal. Many users remain suspicious about sharing the already-grey zone between detection and diagnosis with a machine that only provides probabilities.

Sensitivity and specificity
These kinds of issues also hinder AI. Nevertheless, the technology is evolving rapidly and may offer solutions to some of these challenges.
Like radiologists, AI faces the twin pulls of sensitivity and specificity – between false positives, which overcall disease, and false negatives, which undercall it. For the foreseeable future, AI is likely to favour sensitivity over specificity.
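
To make this trade-off concrete, the minimal Python sketch below uses invented counts (none of the figures come from any study cited here) to show how an aggressively tuned classifier gains sensitivity at the cost of specificity:

    # Minimal sketch of the sensitivity/specificity trade-off using made-up numbers.
    # Lowering a hypothetical AI classifier's decision threshold catches more disease
    # (higher sensitivity) but overcalls more normal studies (lower specificity).

    def sensitivity(tp, fn):
        return tp / (tp + fn)   # true-positive rate: how much disease is caught

    def specificity(tn, fp):
        return tn / (tn + fp)   # true-negative rate: how many normals are cleared

    # Hypothetical counts from the same 1,000 studies read at two thresholds
    aggressive = {"tp": 95, "fn": 5, "tn": 810, "fp": 90}     # favours sensitivity
    conservative = {"tp": 80, "fn": 20, "tn": 890, "fp": 10}  # favours specificity

    for name, c in (("aggressive", aggressive), ("conservative", conservative)):
        print(name,
              f"sensitivity={sensitivity(c['tp'], c['fn']):.2f}",
              f"specificity={specificity(c['tn'], c['fp']):.2f}")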

Technology creates its own momentum
In recent years, radiologists have had to cope with an explosion in the stock of medical images, thanks to modern imaging technologies and PACS storage capacity. In the UK, for example, the NHS performs almost 5 million CT scans per year. At the upper end, a single ‘pan scan’ CT of a trauma patient can generate about 4,000 images. Indeed, a busy radiologist can read about 20,000 studies a year.
To deal with this burden – both physical and visual – radiologists clearly need help, and AI appears to be one of the most promising sources of it.
There is, nevertheless, some irony here. Technology, in this case new imaging modalities, has increased the workload on radiologists. This is in spite of the fact that the disease burden has remained more or less the same, as has the prevalence, on imaging, of clinically significant pathology. However, the growth of the imaging stock has led to a sharp rise in the volume of detectable and potentially significant pathology, and radiologists face the massive challenge of finding ways to make use of these findings. This is where yet another technology, AI, steps in.

Industry push combines with radiologist pull
While the need to handle the imaging-data explosion will see radiologists ‘pulling’ AI, industry has chosen radiology as the field in which to ‘push’ for clinical validation. There are two reasons for this: the sheer volume of imaging data, and its continuing growth, make it a huge market, while the fact that it is stored in the structured, computer-readable DICOM format makes it a ready one.

AI’s own changing dynamics
Meanwhile, AI itself has seen some changes. Although, fuelled by science fiction and Hollywood, the popular imagination associates AI with self-awareness, what we actually have is more accurately described as machine intelligence. The implications of even such a toned-down definition should, however, not be underestimated. Neither should some recent developments.

From Deep Blue to AlphaGo
In the late 1990s, IBM’s Deep Blue supercomputer defeated grandmaster Garry Kasparov at chess. In March 2016, Google DeepMind’s AlphaGo defeated Lee Sedol, a 9-dan Go grandmaster, 4-1. For AI experts, the AlphaGo win is far more impressive than Deep Blue’s, because Go is far less rules-bound than chess.
Due to these constraints, Deep Blue analysed millions of potential combinations and outcomes, in what IT professionals call ‘brute force’ calculation. No computer can yet achieve this with Go, which, according to ‘Business Insider’ (March 10, 2016), has ‘more than 300 times the number of plays as chess. Alongside continuous scenario analysis, top Go players require both experience and intuition’. This is why AlphaGo’s win was seen as a paradigm shift in AI.

Deep learning
Unlike Deep Blue’s brute force, AlphaGo used a programming method called ‘deep learning’, based on so-called neural networks, which are far more similar to human thought processes than traditional computing. Rather than seeking to map out every possible combination of moves, deep learning (DL) is a relatively unconstrained process by which a computer works out why something is what it is after being shown several examples. It uses a large but still finite sample of data, draws conclusions from that sample and then, with some human input, repeats the process over and over again, simulating millions of games to build a decision-making system.
Technically, AlphaGo’s deep neural networks consisted of a 12-layer network of neuron-like connections, with a ‘policy network’ to select the next move and a ‘value network’ to predict the winner of the game.
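
As a purely illustrative sketch – not AlphaGo’s actual architecture or code – the toy Python example below shows the general idea of a shared stack of neuron-like layers feeding a ‘policy’ head that scores candidate moves and a ‘value’ head that estimates the chance of winning. The layer sizes and random weights are arbitrary assumptions:

    # Toy two-headed network: a shared trunk of layers, a policy head and a value head.
    # This only illustrates the idea; it is untrained and bears no relation to the real
    # AlphaGo networks beyond the policy/value split.
    import numpy as np

    rng = np.random.default_rng(0)

    def layer(x, w):
        return np.maximum(0.0, x @ w)       # linear transform followed by a ReLU

    board = rng.random(361)                 # a 19x19 Go board flattened to 361 inputs
    shared = [rng.standard_normal((361, 128)) * 0.05,
              rng.standard_normal((128, 128)) * 0.05]

    h = board
    for w in shared:                        # shared 'trunk' of neuron-like layers
        h = layer(h, w)

    policy_w = rng.standard_normal((128, 361)) * 0.05
    value_w = rng.standard_normal((128, 1)) * 0.05

    logits = h @ policy_w
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                  # probability over the 361 candidate moves
    value = float(np.tanh(h @ value_w)[0])  # estimated chance of winning, in [-1, 1]

    print("most promising move:", int(policy.argmax()), "value estimate:", value)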

A new benchmark

Neural network-based deep learning is now the benchmark for AI in radiology, with IBM’s poster child Watson leading the way. At the 2015 RSNA meeting, Watson showed its capacity to find clots in brightly shining pulmonary arteries.
Watson, however, has a DL rival in Australia’s Enlitic, which has developed a lung nodule detector claimed to achieve positive predictive values 50% higher than those of a radiologist. As the detection model analyses images, it learns from them. It not only finds lung nodules but also provides a probability score for malignancy. Enlitic is now trialling a model to detect fractures on X-ray images, overlaid with a heat map to highlight their location within a conventional PACS viewer. The clinical application will eventually encompass X-ray, CT and possibly MRI. At the moment, Enlitic is working to incorporate ACR guidelines into the tool.
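
Purely by way of illustration – this is not Enlitic’s published method – the Python sketch below shows one simple way a detector’s per-patch probabilities could be turned into highlighted ‘hot’ regions for a viewer; the patch grid, scores and threshold are all invented:

    # Hypothetical heat-map overlay: each image patch gets a nodule probability, and
    # patches above an assumed threshold are flagged for highlighting in the viewer.
    import numpy as np

    rng = np.random.default_rng(42)

    patch_probability = rng.random((8, 8))    # pretend per-patch nodule scores
    THRESHOLD = 0.9                           # assumed cut-off for highlighting

    heat_map = np.where(patch_probability >= THRESHOLD, patch_probability, 0.0)

    for row, col in zip(*np.nonzero(heat_map)):
        print(f"highlight patch ({row}, {col}) "
              f"with nodule probability {patch_probability[row, col]:.2f}")
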
Although both Watson and Enlitic use deep learning, their approaches differ: Watson seeks to ‘understand’ a disease, while Enlitic simply seeks to find the problem in the source data, solve it and produce a diagnosis.

Another DL developer is MetaMind, since last year part of CRM (customer relationship management) giant Salesforce.com. MetaMind has an alliance with teleradiology provider vRad to identify key radiology elements associated with critical medical conditions, especially in the latter’s focus area of emergency departments (EDs). The first tool to emerge from the partnership was an algorithm to identify intracranial haemorrhage (ICH), which is often seen in ED patients and requires prompt action. vRad, which has put the algorithm into a beta phase that will allow it to collect data to demonstrate outcomes, is adapting it to identify other critical conditions, such as pulmonary emboli and aortic tears.

Swarm AI
Apart from deep learning, radiology is also seeing the first successful experiments with swarm AI, which helps form a diagnostic consensus by turning groups of human experts into super-experts. The technology borrows from nature, where species accomplish more by participating in a flock, school or colony (a ‘swarm’) than they can individually. One study, published in the ‘Public Library of Science (PLOS)’, stated that swarm intelligence could improve mammography screening and has the potential to improve other types of medical decision-making, ‘including many areas of diagnostic imaging’. Another study found that accuracy in distinguishing normal from abnormal patients was significantly higher with swarm AI than the mean accuracy of the individual radiologists.
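
Real swarm-AI platforms converge on a consensus interactively and in real time, so the Python sketch below is a gross simplification; it merely illustrates the underlying idea of pooling several hypothetical readers’ calls into a single consensus, with all probabilities and confidence weights invented:

    # Grossly simplified pooling of several (hypothetical) readers into a consensus:
    # each reader's probability of abnormality is weighted by a self-reported
    # confidence, and the pooled call is compared with a plain majority vote.

    readers = [
        {"p_abnormal": 0.70, "confidence": 0.9},
        {"p_abnormal": 0.40, "confidence": 0.5},
        {"p_abnormal": 0.65, "confidence": 0.8},
        {"p_abnormal": 0.30, "confidence": 0.4},
    ]

    majority_vote = sum(r["p_abnormal"] > 0.5 for r in readers) > len(readers) / 2
    pooled = (sum(r["p_abnormal"] * r["confidence"] for r in readers)
              / sum(r["confidence"] for r in readers))

    print("majority vote abnormal:", majority_vote)
    print("confidence-weighted consensus p(abnormal):", round(pooled, 2))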

Challenges ahead
Nevertheless, there is much more to be achieved before AI becomes an everyday tool in radiology.
The biggest roadblock will be the regulators, who are unlikely to sanction the use or marketing of ‘intelligent’ machines. In the US, as the first of their kind, such machines lack the predicate devices needed for clearance under the FDA’s 510(k) rules, and it would take decades to obtain approval for each algorithm.
A second issue is the time and cost of assembling the datasets needed to fine-tune the algorithms. Watson, for example, has a backlog of 30 billion medical images to review.
Thirdly, the algorithms would also raise significant legal and ethical issues, such as knowing when they could be trusted.
Finally, even were such machines to become available, referring physicians are unlikely to accept conclusions or interpretations drawn solely by them.
The scale of such challenges has already been seen by developers of computer-aided detection (CAD) algorithms – and in the shift of CAD to mean ‘detection’ rather than ‘diagnosis’, as it was called in the early days.

Need and benefit, reality checks
In short, for now, radiologists need AI just as much as AI needs them.
Radiologists will have to begin to work with AI, both to improve the technology itself and to reduce routine, repetitive tasks such as confirming line placements and looking at scans to find nodules.
For its part, AI is likely to become an increasingly smart tool for improving efficiency – for example, by prioritizing cases, setting thresholds on data acquisition, improving workflow by escalating cases with critical findings to a radiologist’s worklist, and providing automatic alerts to both radiologists and other concerned clinicians.
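
As a hypothetical sketch of this workflow idea – the threshold, scores and study names are all invented – the Python snippet below shows how an AI score could push critical cases to the top of a reading worklist and flag them for an alert:

    # Hypothetical worklist prioritisation: studies whose AI score exceeds a critical
    # threshold jump the queue and trigger an alert; the rest stay in arrival order.

    CRITICAL_THRESHOLD = 0.85   # assumed cut-off for 'escalate and alert'

    worklist = [
        {"study": "CT head 1", "ai_score": 0.10, "arrived": 1},
        {"study": "CT head 2", "ai_score": 0.92, "arrived": 2},   # suspected bleed
        {"study": "CXR 1",     "ai_score": 0.40, "arrived": 3},
        {"study": "CTPA 1",    "ai_score": 0.88, "arrived": 4},   # suspected embolism
    ]

    def priority(study):
        critical = study["ai_score"] >= CRITICAL_THRESHOLD
        return (0 if critical else 1, study["arrived"])   # critical first, then FIFO

    for s in sorted(worklist, key=priority):
        flag = "ALERT" if s["ai_score"] >= CRITICAL_THRESHOLD else ""
        print(s["study"], s["ai_score"], flag)
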
In the longer term, DL algorithms are likely to be trained to recognize disease patterns; identify, outline and measure nodules; and possibly highlight suspicious areas in images. This is likely to be followed by the use of DL-based AI as a clinical decision tool, for example to help referring physicians select or narrow choices of scans based on clinical observations in an EMR. Such steps would not only free up resources for additional testing but also improve patient care, thereby making radiologists even more integral to the care-management process.

In the final analysis, a resonant reality check on AI has been provided by Eliot Siegel, MD, professor of radiology at the University of Maryland. He has offered to wash the car of anyone who develops a program that can segment the adrenal glands on a CT scan as reliably as a 7-year-old.