RADIOGENOMICS AND DEEP LEARNING IN MANAGEMENT OF BRAIN CANCER

Sushmita Mitra, Professor, Machine Intelligence Unit, Indian Statistical Institute, Kolkata, FIEEE

Radiomics (or quantitative imaging) constitutes the high-throughput automated (or semi-automated) extraction of large amounts of quantifiable information (or image features) from radiographic images for improved management of the disease (or tumor) [1]. It requires the mining of large image datasets to reflect and quantify the inherent heterogeneities, for improved decision-making. Radiomic features provide rich information about intensity, shape, size or volume, and texture phenotype, and are complementary to the information provided by clinical reports, laboratory test results, and genomic or proteomic assays [2]. Feature selection aims at dimensionality reduction, in order to lower the computational cost of classification without hampering the discriminatory power of the system. Computer-based medical image analysis is thus becoming an important field, mainly because of the high rate at which images are produced, as well as the increasing reliance on them by the biomedical community. Today radiographic imaging modalities, like computed tomography (CT), positron emission tomography (PET), and magnetic resonance imaging (MRI), play a major role in the diagnosis and prognosis of cancer.
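As a toy illustration of such quantifiable information, the following sketch (in Python, on synthetic data) computes a few first-order intensity and shape features from a 2-D image and a binary tumor mask. The feature definitions here are simplified stand-ins for those of a full radiomics pipeline:

```python
import numpy as np

def radiomic_features(image, mask):
    """A few illustrative first-order intensity and shape features
    from a 2-D image and its binary tumor mask (simplified stand-ins
    for a full radiomics feature set)."""
    roi = image[mask > 0]                      # intensities inside the region of interest
    features = {
        "mean_intensity": float(roi.mean()),   # first-order statistics
        "std_intensity": float(roi.std()),
        "skewness": float(((roi - roi.mean()) ** 3).mean() / (roi.std() ** 3 + 1e-8)),
        "area": int(mask.sum()),               # shape: number of pixels in the mask
    }
    # crude perimeter estimate: foreground pixels with a background 4-neighbour
    padded = np.pad(mask, 1)
    neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                  padded[1:-1, :-2] + padded[1:-1, 2:])
    perimeter = int(((mask > 0) & (neighbours < 4)).sum())
    features["circularity"] = 4 * np.pi * features["area"] / (perimeter ** 2 + 1e-8)
    return features

# toy example: a bright disc on a noisy background
yy, xx = np.mgrid[:64, :64]
mask = ((yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2).astype(int)
image = 100 + 50 * mask + np.random.randn(64, 64) * 5
print(radiomic_features(image, mask))
```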

Cancer is a disease of genetic instability, often associated with genes directly involved in cell growth and proliferation, differentiation, survival and apoptosis, or indirectly involved through genes participating in cell signal-transduction pathways. Recent developments in genomics and proteomics have enabled molecular profiling of biological specimens by simultaneously revealing the expression levels of thousands of genes and proteins. Gene expression patterns of cancer tissues can reveal their etiology, prognosis, and response to therapy, and facilitate individualized selection of therapies. Here feature selection helps in filtering out, and focusing on, those sets of differentially expressed genes from the diseased tissue that can best be correlated with patient prognosis and clinical outcome [3].
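A minimal sketch of this filtering step, assuming expression data arranged as a samples × genes matrix with binary tissue labels, might rank genes by a two-sample t-like statistic; actual studies employ more rigorous tests with multiple-testing correction:

```python
import numpy as np

def top_differential_genes(expr, labels, k=10):
    """Rank genes by a two-sample t-like statistic between tumor
    (label 1) and normal (label 0) samples; return the indices of
    the k most differentially expressed genes."""
    tumor, normal = expr[labels == 1], expr[labels == 0]
    diff = tumor.mean(axis=0) - normal.mean(axis=0)
    pooled = np.sqrt(tumor.var(axis=0) / len(tumor) +
                     normal.var(axis=0) / len(normal)) + 1e-8
    return np.argsort(np.abs(diff / pooled))[::-1][:k]

# toy data: 40 samples x 1000 genes, genes 0-4 up-regulated in tumors
rng = np.random.default_rng(0)
expr = rng.normal(size=(40, 1000))
labels = np.repeat([0, 1], 20)
expr[labels == 1, :5] += 2.0
print(top_differential_genes(expr, labels, k=5))  # should recover genes 0-4
```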

It has become increasingly clear that patient-specific gene mutations and certain tumor characteristics result in each patient having her own disease, which in turn warrants specific treatment. The concept of using advanced diagnostic capabilities to tailor treatment to an individual’s genetic makeup is called personalized (or precision) medicine [1]. It aims to individualize treatment towards the specific characteristics of a patient and her tumor genotype. Associating molecular genotypes with imaging phenotypes (biomarkers) is termed radiogenomics, an exciting emerging field of research. It holds promise for personalized optimal treatment [4], while eliminating the hazards of over-diagnosis. Characteristic features from the segmented region of interest in an image can be correlated with the gene expression profile of the tumor to determine its non-invasive imaging surrogates (or substitutes) [5].
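A bare-bones illustration of this association step, on hypothetical per-patient imaging features and gene expression values, could screen feature–gene pairs by correlation strength; a real analysis would add significance testing and correction for multiple comparisons:

```python
import numpy as np

def radiogenomic_associations(img_feats, gene_expr, threshold=0.5):
    """Correlate each imaging feature with each gene across patients;
    return (feature, gene, r) triples whose correlation is strong."""
    pairs = []
    for f in range(img_feats.shape[1]):
        for g in range(gene_expr.shape[1]):
            r = np.corrcoef(img_feats[:, f], gene_expr[:, g])[0, 1]
            if abs(r) >= threshold:
                pairs.append((f, g, round(float(r), 3)))
    return pairs

# toy cohort: 30 patients, 4 imaging features, 50 genes;
# gene 7 is constructed to track imaging feature 2
rng = np.random.default_rng(0)
img_feats = rng.normal(size=(30, 4))
gene_expr = rng.normal(size=(30, 50))
gene_expr[:, 7] = 0.9 * img_feats[:, 2] + 0.3 * rng.normal(size=30)
print(radiogenomic_associations(img_feats, gene_expr, threshold=0.6))
# expected to flag the (feature 2, gene 7) pair as an imaging surrogate
```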

With the worldwide availability of “big” datasets in healthcare (encompassing images and patterns in digital form), high computing power, efficient low-cost algorithms, and easy access to cloud computing solutions, interest in artificial intelligence (AI) applications is growing rapidly. The automated mining of radiomic information from images (typically not discernible visually) is enhancing the diagnostic and prognostic benefit that patients can derive from imaging. Learning can be supervised (with class labels) or unsupervised (clustering/segmentation), the latter maximizing patient similarity within clusters while minimizing it between clusters. Attributes like age, gender, disease history, diagnostic imaging, gene expression, electrophysiological tests, physical examination results, clinical symptoms, and medication can serve as input to the learning algorithm. The output (knowledge) can consist of disease indicators, patient survival time, and quantitative disease measures (like tumor size or grade).
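As a small unsupervised example, the following sketch clusters a handful of hypothetical patients, described by invented attribute columns (age, tumor volume, mean enhancement, a gene-expression score), such that patient similarity is maximized within clusters; scikit-learn is assumed to be available:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# hypothetical patient-attribute matrix: one row per patient, with
# invented columns: age, tumor volume (cm^3), mean enhancement, gene score
patients = np.array([
    [62, 35.2, 0.81, 2.1],
    [58, 30.7, 0.76, 1.9],
    [41,  8.4, 0.22, 0.3],
    [37,  6.1, 0.18, 0.4],
    [65, 40.9, 0.88, 2.4],
    [35,  5.5, 0.25, 0.2],
])

# standardize so that no single attribute dominates the distance metric,
# then group patients so that within-cluster similarity is maximized
X = StandardScaler().fit_transform(patients)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clusters)  # e.g. [0 0 1 1 0 1]: two putative patient subgroups
```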

Objects of interest in medical images, in the context of cancer, correspond to lesions and tumors. These vary in shape and inhomogeneity, even including spiculation at the surface. Such volumes of interest often become too complex to be accurately represented by any simple equation and/or model; they require a complex, data-dependent model with a large number of parameters, which cannot be constructed manually. This is where machine learning becomes important in medical imaging, through mining tasks like feature extraction (e.g., contrast, area, circularity), feature selection, clustering or segmentation, and classification (e.g., cancerous or benign, tumor grading). The role of machine learning is to determine, through training, an optimal discrimination between the multiple output classes in the multi-dimensional feature space, in order to subsequently classify an unknown test image. Some of the popular learning algorithms include linear regression, logistic regression, naive Bayes, decision trees, nearest neighbors (NN), random forests, discriminant analysis, support vector machines (SVMs), and artificial neural networks (ANNs). However, sensitive issues of concern include basic pattern recognition requirements like feature selection and dimensionality reduction, generalization assessed over independent test sets or through cross validation, and the preservation of repeatability and reproducibility across observations collected in multi-institutional frameworks.
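The following sketch ties these steps together on synthetic data: three extracted lesion features (contrast, area, circularity) feed an SVM classifier, and cross validation estimates its generalization; the feature distributions are invented purely for illustration:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# synthetic stand-ins for extracted lesion features: contrast, area, circularity
rng = np.random.default_rng(1)
benign = rng.normal([0.3, 50, 0.9], [0.10, 15, 0.05], size=(100, 3))
malignant = rng.normal([0.7, 120, 0.5], [0.15, 30, 0.10], size=(100, 3))
X = np.vstack([benign, malignant])
y = np.array([0] * 100 + [1] * 100)  # 0 = benign, 1 = malignant

# SVM after feature scaling; cross validation estimates generalization
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```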

Deep learning is a kind of representation learning, where a machine learns from raw data to automatically discover the representations needed for detection or classification [6]. It involves multiple levels (depth) of representation, obtained by composing simple non-linear modules that each transform the representation at one level into another at a higher, slightly more abstract level. Deep learning, in the context of medical images, directly uses the pixel values of the images (instead of extracted or selected features) at the input, without involving object segmentation; it thereby avoids the manual errors caused by inaccurate segmentation and/or subsequent feature extraction. However, inherent limitations of deep learning include its high computational cost and its requirement of a large number of training images. Convolutional neural networks (CNNs) constitute one of the popular models of deep learning. The breakthrough for CNNs came with the ImageNet competition in 2012 [7], where the error rate for object recognition was almost halved. CNNs were revived through the efficient use of Graphics Processing Units (GPUs), Rectified Linear Units (ReLUs), dropout regularization, and data augmentation. Given that images are naturally complex and high volume, the use of deep learning has been reported mostly in imaging analysis. Some of the commonly used deep learning models in medical applications include CNNs, recurrent neural networks (RNNs), residual convolutional neural networks (ResNets), and deep belief networks. Recently, significant research has been undertaken on the application of deep learning to quantitative medical imaging [8] and genomics [9].
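For concreteness, a minimal CNN sketch in PyTorch (one of several possible frameworks) shows the ingredients named above: stacked convolutional modules with ReLU non-linearities and dropout regularization, operating directly on pixel values:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """A minimal CNN for binary slice classification, operating
    directly on pixel values without hand-crafted features."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(                 # two conv-ReLU-pool stages
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),                           # dropout regularization
            nn.Linear(32 * 16 * 16, 2),                # two output classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyCNN()
dummy = torch.randn(4, 1, 64, 64)  # a batch of four 64x64 grayscale slices
print(model(dummy).shape)          # torch.Size([4, 2]): per-class scores
```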

Gliomas constitute 70% of malignant primary brain tumors in adults [10] and are usually classified as High-Grade Gliomas (HGG) and Low-Grade Gliomas (LGG), with the latter being less aggressive and infiltrative than the former. Magnetic Resonance Imaging (MRI) has been extensively employed in diagnosing brain and nervous system abnormalities over the last few decades, due to its improved soft tissue contrast. The MR sequences include T1-weighted, T2-weighted, T1-weighted with gadolinium-based contrast enhancement (T1C), and T2-weighted with fluid-attenuated inversion recovery (T2-FLAIR). The rationale behind using these four sequences lies in the fact that different tumor regions may be visible in different sequences, thereby allowing a more accurate composite demarcation of the tumor with improved tissue discrimination. For example, T1C enhances the well-perfused tumor region of high tumor cell density where there is a breakdown of the blood-brain barrier. It can thereby delineate gross tumor margins and allow earlier detection of additional small metastatic lesions; necrosis and solid tumor can also be visually distinguished. Typically, T1C-weighted images are used for monitoring tumor response to therapy. The T2-weighted sequences are sensitive to the water content of tissue and can be used to estimate cellular density and the presence of edema. T2-FLAIR and T2-weighted images, in conjunction, help provide a better distinction between edema and solid tumor. Accurate delineation of the tumor region in MRI sequences is of great importance since it allows:

i) volumetric measurement of the tumor (see the sketch following this list),
ii) monitoring of tumor growth in a patient between multiple MRI scans, and
iii) treatment planning with follow-up evaluation.
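A small sketch of how such multi-modal data are typically prepared: the four co-registered sequences are stacked as input channels, and the volumetric measurement of item i) reduces to counting mask voxels and multiplying by the voxel volume; all arrays below are synthetic placeholders:

```python
import numpy as np

# hypothetical co-registered MR sequences, each a (depth, height, width) volume
shape = (24, 128, 128)
t1, t1c, t2, flair = (np.random.rand(*shape) for _ in range(4))

# stack the four sequences as channels: the usual input layout for
# multi-modal segmentation networks, (channels, depth, height, width)
multimodal = np.stack([t1, t1c, t2, flair], axis=0)
print(multimodal.shape)  # (4, 24, 128, 128)

# given a binary tumor mask and the voxel spacing in mm, the tumor
# volume is simply the voxel count times the volume of one voxel
mask = np.zeros(shape, dtype=bool)
mask[10:14, 40:80, 40:80] = True   # toy "tumor"
spacing = (5.0, 1.0, 1.0)          # 5 mm slice thickness, 1 mm in-plane
volume_mm3 = mask.sum() * np.prod(spacing)
print("tumor volume: %.1f cm^3" % (volume_mm3 / 1000.0))
```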

Fig. 1: Segmentation view of glioma sub-regions in multi-modal MRI [11]

Besides detection, localization, and classification, the most widely studied aspect in medical imaging applications has been segmentation. Tumor segmentation from brain MRI sequences is usually done manually by the radiologist. This is a highly tedious, time-consuming, and error-prone task, owing to factors such as human fatigue, the overabundance of MRI slices per patient, inter-observer variability, and the increasing number of patients; such manual operation often leads to inaccurate delineation, and thereby potentially to unstable results in the radiomic analysis.

The need for an automated or semi-automated Computer Aided Diagnosis thus becomes apparent. It can also improve the accuracy of automatic detection, assisting doctors in arriving at a faster and timely diagnosis. The segmentation of a glioma and its sub-regions, in multi-modal MRI, is depicted in Figure 1.
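When an automated delineation is compared against a manual reference (or two observers against each other), the Dice coefficient is a standard overlap score; a minimal sketch on toy masks:

```python
import numpy as np

def dice_coefficient(seg_a, seg_b):
    """Dice overlap between two binary segmentations: 1.0 is perfect
    agreement, 0.0 is no overlap; a standard score for comparing an
    automated delineation against a manual reference."""
    a, b = seg_a.astype(bool), seg_b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum() + 1e-8)

# toy example: two slightly shifted delineations of the same lesion
manual = np.zeros((128, 128), dtype=bool)
auto = np.zeros((128, 128), dtype=bool)
manual[40:80, 40:80] = True
auto[42:82, 42:82] = True
print("Dice: %.3f" % dice_coefficient(manual, auto))  # about 0.90
```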

REFERENCES

  1. S. Mitra and B. Uma Shankar, “Integrating radio imaging with gene expressions towards a personalized management of cancer,” IEEE Transactions on Human-Machine Systems, vol. 44, pp. 664–677, 2014.
  2. M. Avanzo, J. Stancanello, and I. El Naqa, “Beyond imaging: The promise of radiomics,” Physica Medica, vol. 38, pp. 122–139, 2017.
  3. T. R. Golub, D. K. Slonim, P. Tamayo, et al., “Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring,” Science, vol. 286, pp. 531–537, 1999.
  4. E. Segal et al., “Decoding global gene expression programs in liver cancer by non-invasive imaging,” Nature Biotechnology, vol. 25, pp. 675–680, 2007.
  5. C. Jaffe, “Imaging and genomics: Is there a synergy?” Radiology, vol. 264, pp. 329–331, 2012.
  6. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, 2015.
  7. A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105, 2012.
  8. D. Shen, G. Wu, and H. I. Suk, “Deep learning in medical image analysis,” Annu. Rev. Biomed. Eng., vol. 19, pp. 221–248, 2017.
  9. H. Zeng, M. D. Edwards, G. Liu, and D. K. Gifford, “Convolutional neural network architectures for predicting DNA–protein binding,” Bioinformatics, vol. 32, pp. i121–i127, 2016.
  10. S. Bauer, R. Wiest, L. P. Nolte, and M. Reyes, “A survey of MRI-based medical image analysis for brain tumor studies,” Physics in Medicine and Biology, vol. 58, pp. R97–R129, 2013.
  11. S. Banerjee and S. Mitra, “Novel volumetric sub-region segmentation in brain tumors,” Frontiers in Computational Neuroscience, vol. 14, article 3, doi: 10.3389/fncom.2020.00003, 2020.

Biography

Sushmita Mitra is a full professor at the Machine Intelligence Unit (MIU), Indian Statistical Institute, Kolkata. From 1992 to 1994 she was at RWTH Aachen, Germany, as a DAAD Fellow. She was a Visiting Professor in the Computer Science Departments of the University of Alberta, Edmonton, Canada; Meiji University, Japan; and Aalborg University Esbjerg, Denmark. Dr. Mitra received the National Talent Search Scholarship (1978-1983) from NCERT, India, the University Gold Medal in 1988, the IEEE TNN Outstanding Paper Award in 1994 for her pioneering work in neuro-fuzzy computing, the CIMPA-INRIA-UNESCO Fellowship in 1996, and the Fulbright-Nehru Senior Research Fellowship in 2018-2020. She was the INAE Chair Professor during 2018-2020. Dr. Mitra has been awarded the prestigious J. C. Bose National Fellowship, 2021.

Dr. Mitra is the author of several books. She has guest-edited special issues of several journals; is an Associate Editor of “IEEE/ACM Trans. on Computational Biology and Bioinformatics”, “Information Sciences”, “Neurocomputing”, “Fundamenta Informaticae”, and “SN Computer Science”; and is a Founding Associate Editor of “Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery (WIRE DMKD)”. She has more than 150 research publications in refereed international journals. According to the Stanford List, Dr. Mitra is ranked among the top 2% of scientists worldwide in the domain of Artificial Intelligence and Image Processing.

Dr. Mitra is a Fellow of the IEEE, the Indian National Science Academy (INSA), the International Association for Pattern Recognition (IAPR), the Indian National Academy of Engineering (INAE), and The National Academy of Sciences, India (NASI). She is an IEEE CIS Distinguished Lecturer, a Member of the Inter-Academy Panel for Women in STEMM, and the current Chair of the IEEE Kolkata Section.