How Much Can We Trust AI?

External verification is a key element in dealing with AI-based diagnostic results.

With artificial intelligence (AI) systems providing diagnoses, “we are facing a verification problem”, said Saurabh Jha from the University of Pennsylvania. “The Why is fundamental in how we perform medicine.” 

The more sophisticated and complex an algorithm is, the less likely we are to believe it, said Jha, even if its predictions tend to be more accurate than conventional diagnostic procedures. So how should radiologists deal with believing and verifying the output of such algorithms? Jha sees both a methodological and a plausibility issue.

Verification is Key

In medicine, professionals have to make diagnostic decisions to guide further clinical management. “The verification is the key element”, said Jha. Performing many more randomized controlled trials to verify algorithm-based results would be an option. However, this is time-consuming and costly, and would therefore slow down AI progress.

“One problem with machines is that we don’t know whether they are right – but what if they get something profoundly wrong?”, he asked. As long as AI and conventional diagnoses match, there is no problem. “But what if AI starts telling us stuff we can’t see?” Jha gave an example: AI reports ‘thickening in the sigmoid colon’, but checking the images does not reveal anything suspicious. “Disconfirmation and plausibility are going to be big problems”, said Jha.

Verifier Must Have an External Perspective

Disconfirmation is not just a psychological problem (can I believe it or not?), but also a logical one: no system can confirm what it is doing from within itself. “If you want to know a system you need to know it from outside as well”, he explained. “If AI itself is going to be its own arbiter, it is going to be a circular argument.” This holds even if the AI is correct, and even if it is more correct than radiologists. Therefore, these systems always need an external verifier.

Public Discussion

“When does it become unethical for a radiologist to oppose AI?” asked an audience member. “If there is a common body of agreement that AI is better than the radiologist, then the sole objector would be unethical”, said Jha. He would not expect this case within the next 10 to 15 years, “but I am not saying this will never happen.” Science is much more of a social enterprise than most of us would expect, he added.

Session moderator Richard Gunderman discussed what ‘excellence’ in radiology really means. One concept could be radiology as a primarily computational process dealing with data. Another perspective says that radiology is mostly about situating the diagnostic outcome in an appropriate context. “But you could also say that radiology in the practice of medicine is mainly a relational endeavor”, based on good relationships with patients and referrers and on being trusted by them, said Gunderman. “When AI aspires to work autonomously, that would be not only a handicap but a fatal weakness”, he concluded.

The final comment came from an audience member, who said that discussions of AI often start with the false assumption that AI results fundamentally differ from any other type of data being used. “When we have algorithms providing us with more data, it is important to really look at their validation – not only when we install them, but continuously.”

Presentation Title: Will We Trust AI?
Speaker: Saurabh Jha, University of Pennsylvania, Philadelphia, USA
Date: 2018-11-29
Session code: RC724C