by Karolinska Institutet

In recent years, there have been rapid advancements in the field of computational pathology, which refers to the application of computational methods in pathology workflows. Traditional pathology involves the study of diseases by examining tissues, organs, and bodily fluids. In computational pathology, digital pathology images are analyzed using computer algorithms to extract meaningful information.

To this end, techniques from machine learning, image analysis, and data mining are used to assist pathologists in tasks such as disease diagnosis and prognosis. The adoption of digital pathology workflows that generate digital images of histopathological slides, the publication of large data sets of these images, and improvements in computing infrastructure have all contributed to the recent technological advancements.

Methods in computational pathology broadly serve one of two objectives: first, the automation of routine workflows that would otherwise be performed by pathologists, and second, the addition of novel capabilities.

In his thesis, Philippe Weitz, a Ph.D. student at the Department of Medical Epidemiology and Biostatistics, focuses on novel capabilities, i.e., the development, application, and evaluation of new methods, specifically the prediction of gene expression from pathology images and the registration of pathology images to one another. This work has the potential to advance the quality of and access to precision diagnostics.

What are the most important results in your thesis?

My thesis includes five studies. Two of these focus on the development and evaluation of methods for the prediction of gene expression from histopathology images. In these studies, we find that predicting gene expression in co-expressed clusters significantly reduces computational costs while potentially improving prediction performance. Furthermore, we find that attention-based multiple-instance learning does not appear to improve gene expression prediction performance, while potentially being more vulnerable to overfitting.
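To illustrate what a co-expressed cluster is, here is a minimal sketch that groups genes whose expression profiles across samples are strongly correlated. It is not taken from the thesis: the gene names, the toy values, the greedy grouping rule, and the 0.9 threshold are all assumptions made for illustration, and the article does not describe how the clusters were actually formed or used.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally long expression profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cluster_coexpressed(expression, threshold=0.9):
    """Greedy grouping (an assumption, not the thesis method): a gene joins the
    first cluster whose first gene it correlates with at or above `threshold`."""
    clusters = []
    for gene, profile in expression.items():
        for cluster in clusters:
            if pearson(profile, expression[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            # No existing cluster matched, so this gene starts a new one.
            clusters.append([gene])
    return clusters


# Hypothetical expression values for four genes measured in five samples.
expression = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 6, 8, 11],
    "GENE_C": [5, 3, 4, 1, 2],
    "GENE_D": [10, 6, 8, 2, 4],
}

clusters = cluster_coexpressed(expression)
print(clusters)  # [['GENE_A', 'GENE_B'], ['GENE_C', 'GENE_D']]
```

In this toy case, four genes collapse into two groups. Handling genes per group rather than one at a time is one way the number of separate prediction tasks, and with it the computational cost, could shrink.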

The three remaining studies focus on the registration of pathology images, which refers to the alignment of corresponding tissue across multiple images. One publication describes the ACROBAT data set, which we published to facilitate the ACROBAT registration challenge. It is currently the largest publicly available data set containing multiple differently stained images from the same tumor.
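As a rough illustration of what registration means in its simplest form, the sketch below aligns two tiny synthetic images by searching for the integer shift that minimizes their mean squared intensity difference. This is an assumption-laden toy and not a method from the thesis or the ACROBAT challenge: real multi-stain histopathology registration has to cope with different stains, tissue deformation, and very large images. The function names and the synthetic "tissue" square are hypothetical.

```python
from itertools import product


def mean_squared_difference(fixed, moving, dy, dx):
    """Mean squared intensity difference over the overlap when `moving`
    is shifted by (dy, dx) pixels."""
    height, width = len(fixed), len(fixed[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx  # pixel of `moving` that lands on (y, x)
            if 0 <= sy < height and 0 <= sx < width:
                diff = fixed[y][x] - moving[sy][sx]
                total += diff * diff
                count += 1
    return total / count if count else float("inf")


def register_translation(fixed, moving, max_shift=3):
    """Try every integer shift up to `max_shift` in each direction and return
    the (dy, dx) that best aligns `moving` with `fixed`."""
    shifts = product(range(-max_shift, max_shift + 1), repeat=2)
    return min(shifts, key=lambda s: mean_squared_difference(fixed, moving, s[0], s[1]))


def make_image(size, top, left):
    """Hypothetical toy 'tissue': a 3x3 bright square on a dark background."""
    return [[1.0 if top <= y < top + 3 and left <= x < left + 3 else 0.0
             for x in range(size)] for y in range(size)]


fixed = make_image(10, top=4, left=5)
moving = make_image(10, top=2, left=6)

shift = register_translation(fixed, moving)
print(shift)  # (2, -1): move two pixels down and one pixel left
```

Once such a transformation is known, anything attached to one image, such as an annotation outlining a tumor region, can be mapped onto the other image.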

The ACROBAT challenge itself establishes the current state of the art in multi-stain histopathology image registration. The final study is an application example of histopathology image registration, in which we demonstrate that cancer detection models trained with registered annotations are not inferior to those trained with annotations that were newly generated for a specific stain.

Why did you become interested in this topic?

Recent advances in computing infrastructure and the availability of large public data sets have enabled the development of machine learning and artificial intelligence models for medical diagnostics in research contexts. I believe that these methods have the potential to change many aspects of current clinical routines, and I would like to contribute to this development. It is very motivating for me to think that patients could benefit from my research results.

What do you think should be done in future research?

Future research in the field should focus on the rigorous examination of current methods using independent validation data. Currently, many studies rely on internal test sets to evaluate algorithm performance, which provides insufficient evidence for broad adoption of computational pathology methods in pathology departments.

More information: Artificial intelligence in histopathology image analysis for cancer precision medicine. openarchive.ki.se/xmlui/handle/10616/48753

Provided by Karolinska Institutet