PICCOLO White-Light and Narrow-Band Imaging Colonoscopic Dataset: A Performance Comparative of Models and Datasets

Colorectal cancer is one of the world's leading causes of death. Fortunately, an early diagnosis allows for effective treatment, increasing the survival rate. Deep learning techniques have shown their utility for increasing the adenoma detection rate at colonoscopy, but a dataset is usually required so that the model can automatically learn features that characterize the polyps. In this work, we present the PICCOLO dataset, which comprises 3433 manually annotated images (2131 white-light images and 1302 narrow-band images), originating from 76 lesions from 40 patients, which are distributed into training (2203), validation (897) and test (333) sets, assuring patient independence between sets. Furthermore, clinical metadata are also provided for each lesion. Four different models, obtained by combining two backbones and two encoder–decoder architectures, are trained with the PICCOLO dataset and two other publicly available datasets for comparison. Results are provided for the test set of each dataset. Models trained with the PICCOLO dataset have a better generalization capacity, as they perform more uniformly across the test sets of all datasets, rather than obtaining their best results on their own test set. This dataset is available at the website of the Basque Biobank, so it is expected to contribute to the further development of deep learning methods for polyp detection, localisation and classification, which would eventually result in a better and earlier diagnosis of colorectal cancer, hence improving patient outcomes.
Sánchez-Peralta, Luisa F., J. Blas Pagador, Artzai Picón, Ángel José Calderón, Francisco Polo, Nagore Andraka, Roberto Bilbao, Ben Glover, Cristina L. Saratxaga, and Francisco M. Sánchez-Margallo. “PICCOLO White-Light and Narrow-Band Imaging Colonoscopic Dataset: A Performance Comparative of Models and Datasets.” Applied Sciences 10, no. 23 (November 28, 2020): 8501. doi:10.3390/app10238501.