Artificial intelligence technologies for the detection of colorectal lesions: The future is now

2020-10-23 07:25
World Journal of Gastroenterology, 2020, Issue 37

Simona Attardo, Viveksandeep Thoguluva Chandrasekar, Marco Spadaccini, Roberta Maselli, Harsh K Patel, Madhav Desai, Antonio Capogreco, Matteo Badalamenti, Piera Alessia Galtieri, Gaia Pellegatta, Alessandro Fugazza, Silvia Carrara, Andrea Anderloni, Pietro Occhipinti, Cesare Hassan, Prateek Sharma, Alessandro Repici

Abstract Several studies have shown a significant adenoma miss rate of up to 35% during screening colonoscopy, especially in patients with diminutive adenomas. The use of artificial intelligence (AI) in colonoscopy has been gaining popularity as an aid to endoscopists in polyp detection, with the aim of increasing their adenoma detection rate (ADR) and polyp detection rate (PDR) in order to reduce the incidence of interval cancers. Deep convolutional neural network (DCNN)-based AI systems for polyp detection have been trained and tested in ex vivo settings such as colonoscopy still images or videos. Recent trials have evaluated the real-time efficacy of DCNN-based systems, showing promising results in terms of improved ADR and PDR. In this review we report data from the preliminary ex vivo experiences and summarize the results of the initial randomized controlled trials.

Key Words: Endoscopy; Colonoscopy; Screening; Surveillance; Technology; Quality; Artificial intelligence

INTRODUCTION

Colorectal cancer (CRC) remains one of the leading causes of mortality among neoplastic diseases in the world[1]. Adequate colonoscopy-based CRC screening programs have proved to be the key to reducing the risk of mortality, through early diagnosis of existing CRC and detection of pre-cancerous lesions[2-4]. Nevertheless, the long-term effectiveness of colonoscopy is influenced by a range of variables that make it far from a perfect tool[5]. The effectiveness of a colonoscopy mainly depends on its quality, which in turn depends on the skill and expertise of the endoscopist. In fact, several studies have shown a significant adenoma miss rate of 24%-35%, especially in patients with diminutive adenomas[6,7]. These data are in line with the incidence of interval cancers (I-CRC), defined as the percentage of cancers diagnosed after a screening program and before the intended surveillance duration, of approximately 3%-5%[8,9].

Adenoma detection rate (ADR), defined as the proportion of patients in whom at least one adenoma is detected (> 30% in men and > 20% in women), along with adequate bowel preparation rate (> 85% of all colonoscopies), cecal intubation rate (> 95% in screening colonoscopies) and withdrawal time > 6 min, have been identified as quality metrics in screening and diagnostic colonoscopies, to reduce the I-CRC incidence[5,10,11]. A 1% increase in ADR has been shown to decrease the risk of CRC incidence by 3%[9].
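The ADR definition above can be written down as a short calculation. The following is a minimal sketch in Python; the function names and the per-patient adenoma counts are hypothetical and serve only to illustrate the definition and the sex-specific benchmarks quoted in the text.

```python
def adenoma_detection_rate(adenomas_per_patient):
    """Proportion of patients in whom at least one adenoma was detected."""
    if not adenomas_per_patient:
        raise ValueError("at least one colonoscopy is required")
    with_adenoma = sum(1 for n in adenomas_per_patient if n >= 1)
    return with_adenoma / len(adenomas_per_patient)


def meets_adr_benchmark(adr, sex):
    """Check an ADR against the benchmarks quoted in the text
    (> 30% in men, > 20% in women)."""
    thresholds = {"M": 0.30, "F": 0.20}
    return adr > thresholds[sex]


# Hypothetical example: adenomas found in ten screening colonoscopies.
counts = [0, 2, 0, 1, 0, 0, 3, 0, 1, 0]
adr = adenoma_detection_rate(counts)
print(f"ADR = {adr:.0%}")
```

Note that ADR counts patients, not lesions: a patient with three adenomas contributes the same as a patient with one.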

Innovations such as virtual and dye-spray chromoendoscopy and add-on devices may help in improving ADR, particularly in low detectors[12-14]. However, all these strategies are operator-dependent tools requiring a learning curve. Furthermore, individual experience and preference may influence their use and efficacy.

The development of artificial intelligence (AI) applications in the medical field has attracted growing interest in the past decade. Their performance in automatic polyp and adenoma detection has shown promising results towards achieving a higher ADR[15]. The use of computer-aided diagnosis (CAD) for detection and further characterization of polyps was initially studied in ex vivo studies, but in the last few years, with the advancement in computer-aided technology and the emergence of deep learning algorithms, the use of AI during colonoscopy has been achieved and more studies have been undertaken[16].

The aim of this review is to provide an overview of the progress of AI with deep learning technologies, from the initial ex vivo studies to real-time adenoma detection in the most recent studies.

AI

AI is the result of the evolution of general software systems that take an input and produce an output through an algorithm. Machine learning is the ability of a program to learn, after adequate training, from the data initially entered, in order to obtain a model that can cope with scenarios it had not specifically been instructed for.

In gastroenterology, AI could be applied to tasks and clinical concerns faced by endoscopists every day. For instance, the human eye is capable of capturing only a fixed number of frames or images per second. AI can help the endoscopist by highlighting a specific region of interest that needs closer examination for the identification of polyps, or can assist with categorizing polyps as hyperplastic versus adenomatous, thus eventually improving the ADR[17].

In the endoscopic field, this innovative technology uses two principal Machine learning methods.

Handcrafted knowledge

In these systems, engineers create a set of rules that describe knowledge in a well-defined field. Historically, it is the first approach to artificial intelligence, which in the 1980s led to the development of the first Expert Systems based on an "If … Then" approach. The systems that fall into this category "reason" on a very specific problem, have no ability to learn, and have a poor ability to reason under uncertainty, when they do not have all the elements needed to make a decision. Object characteristics are extracted and selected manually and are used to create a model capable of categorizing objects through algorithms. With regard to the evaluation of polyps, such a system records a series of fixed parameters such as shape, size and texture, alone or in combination, from polyp image datasets in order to differentiate a polyp from the normal mucosa.
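The "If … Then" approach described above can be illustrated with a toy rule set. This is only a sketch: the feature names (roundness, size, texture contrast) and every threshold are hypothetical, chosen for illustration, and do not reproduce any of the published systems.

```python
def looks_like_polyp(roundness, size_mm, texture_contrast):
    """Toy handcrafted classifier: fixed rules over manually chosen features.

    roundness and texture_contrast are assumed to be scaled to [0, 1];
    all thresholds below are hypothetical.
    """
    # Rule 1: IF the region is round and not tiny THEN call it a polyp.
    if roundness > 0.7 and size_mm >= 3:
        return True
    # Rule 2: IF the surface texture stands out on a larger region THEN polyp.
    if texture_contrast > 0.5 and size_mm >= 5:
        return True
    # No rule fired: treat the region as normal mucosa.
    return False


print(looks_like_polyp(roundness=0.9, size_mm=4, texture_contrast=0.1))
print(looks_like_polyp(roundness=0.3, size_mm=8, texture_contrast=0.2))
```

The second call shows the weakness of fixed rules: a flat region with low texture contrast matches no rule, and the system cannot adapt unless an engineer writes a new one.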

Deep learning

In deep learning, large artificial neural networks are fed increasing amounts of data, constantly enhancing their ability to “think” and “learn”. The adjective “deep” refers to the many layers of the neural network, with performance improving as the depth of the network increases. Although most current deep learning is performed with human supervision, the goal is the creation of neural networks that can self-train and “learn” autonomously. Similar to our biological brain, which tries to formulate an answer to a question by deducing a logical hypothesis and arriving at a solution to a problem, deep learning sets neural connections in motion in a manner analogous to the human mind, improving its performance through continuous learning. It does so using convolutional neural networks (CNNs), mathematical and computational models based on the functioning of biological neural networks.

An artificial neural network receives external signals on a layer of input nodes (processing units), each of which is connected with numerous internal nodes organized in several levels. Each node processes the received signals and transmits the result to subsequent nodes working in parallel (Figure 1). Such networks need a training phase that fixes the weights of individual neurons, and this phase can take a long time if the number of records and variables to be analyzed is exceptionally large. As a result, the success of the network depends significantly on the creator’s experience[18].
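The flow of signals from input nodes through internal levels can be sketched as a forward pass. The example below is a minimal illustration in plain Python; the layer sizes, weights and sigmoid activation are assumptions made for the sketch, and in practice the weights would be fixed by the training phase rather than written by hand.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias and applies the activation.

    weights[j][i] is the weight of the connection from input i to node j.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the signal through the layers one after another."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical network: 3 input nodes -> 2 internal nodes -> 1 output node.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = network_forward([1.0, 0.5, -1.0], layers)
print(output)
```

Training consists of adjusting the numbers in `layers` until the outputs match the labeled examples, which is why large datasets make this phase slow.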

The following is a concrete example of how a deep neural network performs visual pattern recognition. In polyp or adenoma detection, the neurons of the first layer could learn to recognize edges, and the neurons in the second layer could learn to recognize elementary shapes, for example the round shape created by the edges. The third layer would recognize even more complex forms such as a 3D structure, the fourth would recognize further details such as a granular pattern, and so on.
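What a first-layer "edge" neuron computes can be sketched with a single convolution. The example below assumes a hand-written vertical-edge kernel and a tiny synthetic grayscale image; in a trained CNN the kernel values are learned from data rather than fixed in advance.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    This is the cross-correlation form commonly used in CNN layers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical kernel that responds to dark-to-bright transitions from left to right.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Synthetic image: dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

response = convolve2d(image, VERTICAL_EDGE)
for row in response:
    print(row)
```

The response is zero over the uniform regions and large where the window straddles the boundary. Deeper layers apply the same operation to these response maps, which is how edges are combined into shapes and shapes into more complex structures.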

EX VIVO STUDIES

Several ex vivo studies on AI have been published in the past 20 years (Table 1).

In the early 2000s, Karkanis et al[19] developed the first algorithm based on color analysis. They chose 180 polyp sample images and then randomly analyzed 1200 polyp frames from 5-10 s extracts of 60 videos. The results were promising, with a 93.6% sensitivity and 99.3% specificity for polyps, although limited by a long processing time and the use of still images[19]. To address these issues, Wang et al[20] introduced a new algorithm called “Polyp-Alert” that, by using edge detection, achieved near real-time video analysis at 10 frames per second on a sample of 43 polyp shots extracted randomly from 53 videos. They reported a 97.7% per-polyp sensitivity, making this algorithm one of the first able to compete with real-time speed. Many other methods have been proposed, such as the one focusing on elliptical shape features by Hwang et al[21] or the window median depth of valleys accumulation energy maps system by Fernàndez-Esparrach et al[22], which were able to detect the specific area of the image containing a polyp by treating polyps as mucosal protrusions with precise boundaries.

As already pointed out, a stepping stone in the progress of AI was the advent of CNN and deep learning for CAD of polyps. The real innovation is that CAD systems can recognize polypoid and non-polypoid features without continuous external input after an initial training. In 2017, Zhang et al[23] devised an algorithm able to outperform expert endoscopists in polyp detection accuracy (86% vs 74%). They also reported a 98% sensitivity using a dataset that contained millions of naturalistic images in addition to images of polyps. Misawa et al[24] trained a CNN with a dataset of 1.8 million frames of polyps from 73 colonoscopy videos, with a total of 155 polyps. Each frame was retrospectively evaluated by two expert endoscopists before being included in the dataset. The sensitivity achieved was 90%, with a specificity of 63.3% for the frame-based analysis[24]. Another interesting work was published in 2018 by Urban et al[25]. The group pre-trained the algorithm with a dataset of 8641 colonoscopy images from about 2000 patients, achieving an accuracy of 96.4%. Moreover, the authors confirmed the value of the CNN in enhancing the polyp detection potential of senior endoscopists. They also showed the possibility of reducing the miss rate by including 11 videos with 73 polyps deliberately missed by the endoscopist through a fast withdrawal. The CNN identified 67 of the 73 polyps with a false positive rate of 5%.

Another comparison between the human brain and AI was made by Hassan and colleagues with a new system, the GI-Genius (Medtronic)[26]. It uses a dataset of 2684 histologically confirmed neoplastic polyps, manually annotated by expert endoscopists and contained in 1.5 million frames. To quantify the reaction time in polyp detection for AI and for humans, five expert endoscopists were asked to observe 338 video clips and to press a button once a polyp was identified. The overall sensitivity per lesion was 99.7%. The AI detected the polyp earlier than the endoscopists in 82% of cases. They concluded that this result is probably due to the variability of endoscopist expertise, a greater detection of hyperplastic polyps in the AI group, and the benefits of AI outweighing human limitations such as fatigue and distraction.
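The sensitivity and specificity figures quoted in these studies follow from simple counts. The sketch below shows the arithmetic, using the 67 of 73 deliberately missed polyps reported by Urban et al[25] as a worked case; the function names and the counts in the specificity example are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Share of real polyps (or polyp frames) that the system flagged."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of polyp-free items that the system correctly left unflagged."""
    return true_negatives / (true_negatives + false_positives)


# From the text: the CNN identified 67 of 73 polyps missed during fast withdrawal.
detected, total = 67, 73
print(f"Per-polyp sensitivity: {sensitivity(detected, total - detected):.1%}")

# Hypothetical counts, only to show the calculation: 95 of 100 polyp-free
# frames left unflagged.
print(f"Specificity: {specificity(95, 5):.0%}")
```

Whether the unit of analysis is a frame, a polyp or a patient changes the result considerably, which is one reason the figures from different ex vivo studies are not directly comparable.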

Table 1 Ex vivo studies

Data from the main ex vivo experiences on AI systems are summarized in Table 1.

IN VIVO STUDIES

There is currently a lack of robust data on AI application in real-time colonoscopy and its utility compared to ex vivo studies, but more studies have recently been published. Registration data on different CAD systems are summarized in Table 2.

Klare et al[27] published a prospective study on CAD (KoloPal) to aid polyp detection in high definition (HD) white light endoscopy in 55 patients. Fifty-five HD white light colonoscopies were carried out by experienced endoscopists, who communicated the presence of any polyp through a verbal signal. In parallel, another independent operator observed the examination on two other screens projecting images with and without the automated polyp detection software (APDS). The system highlighted with a green ring particular areas of interest, chosen by a combination of color, structure, texture, and motion, with a delay of 50 ms. Between APDS and endoscopists, polyp detection rate (PDR) (50.9% vs 56.4%) and ADR (29.1% vs 30.9%) were comparable. In particular, a good performance was observed for larger (≥ 10 mm) polyps and for Paris morphology 0-Ip and 0-Is polyps. Detection rates by APDS were insufficient for smaller polyps and for flat morphology. However, no polyp was detected by the APDS before it was detected by the endoscopists, probably because of the software delay[27].

In the last year, randomized controlled trials (RCTs) on the use of AI in colonoscopy have been published (Table 3). In 2019 the first prospective RCT on a real-time automatic detection system using deep neural networks was conducted in China. Wang et al[15] enrolled 1058 patients undergoing routine colonoscopy, who were randomly assigned to either the CAD-assisted colonoscopy group or the control group. They reported a significantly higher ADR in the CAD group (29.1% vs 20.3%), primarily due to an increase in the detection of diminutive adenomas, without a significant difference in large adenomas. The detection of hyperplastic polyps was also significantly higher in the CAD group (43.6% vs 34.9%), which could potentially minimize the resection of these polyps and adverse events. There was also a higher mean number of adenomas detected (0.53 vs 0.31) in the CAD group[15].

Table 2 Artificial intelligence system country approval

Table 3 In vivo randomized control trials characteristics

Figure 1 Convolutional neural network design.

Liu et al[28] published their experience from China and reported that the average number of polyps detected was higher in the CAD group (0.87 vs 0.57, P < 0.001). The CAD group also achieved an ADR of 39% compared to 23% in the control group. The detection power was particularly improved for sessile serrated lesions and diminutive adenomas[28]. Su et al[29] developed an automatic quality control system (AQCS) using deep CNN and randomized 659 patients between AI and control groups. They reported significantly higher ADR, PDR, mean number of adenomas and mean number of polyps in the AQCS group[29].

The most recent eastern RCT was by Gong et al[30], using another CAD technology called ENDOANGEL. Apart from helping with polyp detection, ENDOANGEL was also trained to record, and possibly improve, colonoscopy quality indicators such as cecal intubation or withdrawal time through real-time signaling during the procedure[30]. They reported a significant improvement in ADR, but one of the principal limitations of this study is the lack of external validity due to very low ADRs in both groups (17% vs 8%, CAD vs control group). In countries like China, where the incidence of CRC is lower[1], this may be acceptable, but in Western countries, where rates are much higher, such low ADRs would indicate low-quality colonoscopy[5].

As a matter of fact, Repici et al[31] recently investigated the role of the GI-Genius CADe system (Figure 2) among expert endoscopists in a Western setting. The ADR of 40.4% in the control group was further increased to 54.8% using the CADe system. Diminutive adenomas (≤ 5 mm) were detected in a significantly higher proportion of subjects in the CADe group (33.7%) than in the control group (26.5%), as were adenomas of 6–9 mm (10.6% vs 5.8%), regardless of morphology or location. Based on colonoscopy videos from this trial, the authors developed a “false positive” classification aiming to standardize future reports on this topic for a better insight into AI systems[32].
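The ADR figures quoted for the randomized trials can be put side by side with a few lines of arithmetic. The sketch below uses only the percentages given in the text (Su et al[29] is omitted because no figures are quoted for it); the helper name `adr_gain` is hypothetical, and the calculation is descriptive only, not a pooled or statistical analysis.

```python
# ADR in percent as quoted in the text: (CAD group, control group).
trials = {
    "Wang et al[15]": (29.1, 20.3),
    "Liu et al[28]": (39.0, 23.0),
    "Gong et al[30]": (17.0, 8.0),
    "Repici et al[31]": (54.8, 40.4),
}


def adr_gain(cad_adr, control_adr):
    """Return the absolute gain (percentage points) and the relative gain."""
    absolute = cad_adr - control_adr
    relative = absolute / control_adr
    return absolute, relative


for name, (cad, control) in trials.items():
    absolute, relative = adr_gain(cad, control)
    print(f"{name}: +{absolute:.1f} points, {relative:.0%} relative increase")
```

Reporting both figures matters here: a trial with a very low control ADR, such as Gong et al[30], shows a large relative increase even though its absolute ADR with CAD remains below the control ADR of the other trials.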

FUTURE NEEDS

The utility of AI has come a long way, with computer-assisted technology making significant strides in colonoscopy and improving the outcomes for quality metrics. The initial studies were retrospective in nature, based primarily on feeding still images to the software. This could potentially introduce selection bias in polyp selection and hence influence the outcomes. Following these, there have been several prospective studies, designed for both polyp detection and characterization, which reduce the possibility of bias and help with better interpretation of the functioning and accuracy of the CAD system. In the last year, a few RCTs have been published on this subject, mainly from China. They have shown improvement in quality metrics like ADR and withdrawal time and in other outcomes like PDR and mean adenomas and polyps per person[33].

We believe that future studies should be directed towards the following goals. It is important to investigate the utility of AI in the diagnosis and characterization of flat lesions and sessile serrated lesions, which are known to have significant malignant potential. Only one of the RCTs has been performed in a multicenter setting, and more RCTs performed in various centers across the world are needed for generalizability and reproducibility of the results. Several different CAD systems have also been investigated, including laser-induced fluorescent spectroscopy, endocytoscopy, narrow band imaging and chromoendoscopy, but an ideal AI system combined with high definition white light endoscopy, aiding both polyp detection and characterization, is yet to be found and is currently a work in progress. Moreover, the impact of false positives (futile activations) on endoscopists' behavior was reported by only a single experience and deserves further insight. The ability of CAD to assist with accurate prediction of polyp surveillance intervals also needs to be investigated with longitudinal studies including follow-up data.

CONCLUSION

Figure 2 GI-Genius computer aided polyp detection system in high definition white light, and virtual chromoendoscopy with blue light imaging and linked color imaging. A: High definition white light; B: Virtual chromoendoscopy with blue light imaging; C: Virtual chromoendoscopy with linked color imaging.

Performance of a high-quality colonoscopy is essential in preventing the incidence of colorectal cancer. Significant progress has been made in the field of AI-assisted colonoscopy, especially with the advent of deep CNN, which helps in overcoming the limitations of traditional colonoscopy related to technical variations between operators and human error. Early evidence on AI application in colonoscopy has shown it to be an effective tool in increasing efficacy in adenoma detection. RCTs investigating these quality metrics have been published recently and more are in progress. However, the role of the endoscopist, and in particular the endoscopist's abilities and experience, cannot be overshadowed: (1) The detection ability of AI systems depends on the inspection of the mucosa exposed by the endoscopist during scope withdrawal, and an adequate technique and quality of bowel preparation are essential for their effective operation; and (2) The improvement in detection seems to involve even hyperplastic polyps with low malignant potential, and the endoscopist should be able to decide which need to be resected and which do not. However, the ability of optical diagnosis is still suboptimal compared to histopathological evaluation, obtaining valid results only in specific settings[55]. In this field too, CNN systems are being developed in order to further assist the endoscopist[56]. These would be key factors in deciding the efficacy and success of AI assistance and would also play an important role in cost-benefit related outcomes in the future. Longitudinal follow-up and evaluation of AI performance in different study populations are essential in future studies to assess its impact and the generalizability of its use in clinical practice.