Fakultät Statistik

I6: ImageOmics-ToxClass

Integration of different omics data with regression methods

In project I6 (Integration of image and omics data for toxicological classification tasks), we will develop methods for toxicological classification tasks, combining data from imaging techniques and high-dimensional molecular data.

The Leibniz institute IfADo (Jan Hengstler) has recently established a new research focus on precision imaging and big data integration and has coined the term tissue cartography. Tissue cartography refers to the characterisation of tissue by combining information on metabolites, ultrastructure, gene expression and functional aspects of the same cells and subcellular structures. One crucial subtask is the analysis of data generated by various imaging techniques, such as fluorescence microscopy, immunostaining, CARS (coherent anti-Stokes Raman scattering) microscopy, MALDI-MS (matrix-assisted laser desorption/ionization mass spectrometry; see, e.g., Chaurand et al., 2006), and spatial transcriptomics. The two main statistical challenges are to preprocess the data from each technology correctly and efficiently, and to integrate the different data types into an overall classifier.

Several imaging technologies are already available in-house at IfADo, while others will become available during the second phase of the RTG. Initial analyses for this project have been performed using integrated MALDI-MS and immunostaining data obtained from mouse experiments and human tissue (Ghallab et al., 2022; 2024). For example, for groups of tumour-affected mice with and without knockout of the enzyme EDI3, 1500 images representing different metabolites have been generated for each tumour (mouse). EDI3 was previously shown to be associated with changes in the metabolism of cancer cells (Keller et al., 2023). Here, the goal is to identify metabolites that differ between knockout and wildtype mice. This is a prototypical example of various other applications in which tumour types or differently treated experimental groups must be discriminated.

For the EDI3 data set, preliminary analyses were performed in two master's theses at TU Dortmund University, supervised by Jörg Rahnenführer, Jan Hengstler, and Franziska Kappenberg. Based on pixel intensities extracted from the MALDI images, the tumour area must first be identified; the distribution of the corresponding pixel intensities can then be characterised. An initial idea was to generate features from this distribution, e.g., by estimating a mixture distribution of intensities in the tumour and non-tumour areas of the image and then selecting quantiles of the intensities in the tumour area, or by directly selecting quantiles from the intensity distribution of the entire image.
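The mixture-based feature idea can be illustrated with a minimal sketch. It is not the method used in the theses: it assumes a two-component Gaussian mixture fitted by EM, assumes that the higher-intensity component corresponds to the tumour area (which need not hold for every metabolite), and uses hypothetical function names and quantile levels.

```python
import math
import random
import statistics


def fit_two_gaussians(values, n_iter=50):
    """Fit a two-component 1D Gaussian mixture by EM (illustrative only)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    # Initialise the components from the lower and upper half of the data,
    # so that component 1 starts as the higher-intensity component.
    mu = [statistics.fmean(ordered[:half]), statistics.fmean(ordered[half:])]
    sd = [statistics.pstdev(ordered[:half]) or 1.0,
          statistics.pstdev(ordered[half:]) or 1.0]
    weight = [0.5, 0.5]
    resp = [0.5] * len(values)
    for _ in range(n_iter):
        # E-step: posterior probability that a pixel belongs to component 1.
        resp = []
        for x in values:
            dens = [weight[k] / sd[k]
                    * math.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2)
                    for k in range(2)]
            total = dens[0] + dens[1]
            resp.append(dens[1] / total if total > 0 else 0.5)
        # M-step: update weight, mean and standard deviation per component.
        for k in range(2):
            r = resp if k == 1 else [1.0 - p for p in resp]
            n_k = sum(r)
            if n_k < 1e-9:
                continue  # Component is empty; keep its old parameters.
            mu[k] = sum(ri * x for ri, x in zip(r, values)) / n_k
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, values)) / n_k
            sd[k] = max(math.sqrt(var), 1e-6)
            weight[k] = n_k / len(values)
    return mu, sd, weight, resp


def tumour_quantile_features(values, probs=(0.25, 0.5, 0.75, 0.9)):
    """Quantiles of the pixels assigned to the higher-intensity component.

    Assumption: that component is the tumour area. Needs >= 2 such pixels.
    """
    _, _, _, resp = fit_two_gaussians(values)
    tumour = [x for x, p in zip(values, resp) if p > 0.5]
    cuts = statistics.quantiles(tumour, n=100, method="inclusive")
    return {p: cuts[round(p * 100) - 1] for p in probs}


if __name__ == "__main__":
    # Simulated intensities for one image (not real MALDI data).
    random.seed(1)
    pixels = ([random.gauss(10, 2) for _ in range(600)]
              + [random.gauss(30, 3) for _ in range(400)])
    print(tumour_quantile_features(pixels))
```

For the simpler variant, quantiles of the entire image, `statistics.quantiles` can be applied to all pixel intensities directly without the mixture step.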

For such features, univariate statistical tests were applied separately for each protein to compare the pixel intensity features between the experimental groups and thereby identify differentially expressed proteins. The next steps will be to apply classification algorithms from statistical learning to identify the best feature combinations, and then to increase complexity further by using images from multiple proteins simultaneously in order to model synergies and interaction effects between proteins. The final level of complexity will be the incorporation of different imaging techniques, for which separate feature generation strategies have to be developed for each data type.
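The univariate testing step can be sketched as follows. The text does not say which test was used, so this example substitutes a two-sided permutation test on the difference in group means. The Benjamini-Hochberg adjustment across features is an additional assumption, not something described above, and all function names are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the group labels and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a])
                   - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up feature values (e.g. a median tumour intensity per mouse)
    # for two hypothetical features, knockout versus wildtype.
    knockout = {"feature_1": [1.0, 1.1, 0.9, 1.2, 1.0, 0.8],
                "feature_2": [2.0, 2.3, 1.9, 2.1, 2.2, 2.0]}
    wildtype = {"feature_1": [3.0, 3.1, 2.9, 3.2, 3.0, 2.8],
                "feature_2": [2.1, 2.0, 2.2, 1.9, 2.3, 2.0]}
    names = sorted(knockout)
    raw = [permutation_p_value(knockout[n], wildtype[n]) for n in names]
    for name, p_raw, p_adj in zip(names, raw, benjamini_hochberg(raw)):
        print(name, p_raw, p_adj)
```

With only a few animals per group, the number of distinct label permutations is small, which limits the smallest attainable p-value. This is one reason why the choice of test would need care in practice.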

For interpretation purposes, we will also perform biological and toxicological plausibility checks regarding the selection and weight of individual proteins in the classifiers. In addition, such toxicological knowledge can be used to preselect proteins, so that the selection does not rest on statistical and algorithmic considerations alone.

A further important step will be to explore the potential of AI methods for classifying the images. It is well known that deep neural networks achieve high classification accuracy on image data, but only if the sample size is sufficiently large, which will not be possible in all of our applications, or if the networks were pre-trained on similar types of images. Given the rapid growth of AI-based image classification in various research fields, we expect that suitable network architectures will become available in the near future. In addition, we will investigate the potential of existing deep neural network architectures.

References

  • Chaurand P, Norris JL, Cornett DS, Mobley JA, Caprioli RM (2006). New developments in profiling and imaging of proteins from tissue sections by MALDI mass spectrometry. Journal of Proteome Research, 5(11):2889-2900. doi: 10.1021/pr060346u
  • Ghallab A, Hassan R, Hofmann U, …, Brecklinghaus T, Kappenberg F, Rahnenführer J, …, Jaeschke H, Hoehme S, Hengstler JG (2022). Interruption of bile acid uptake by hepatocytes after acetaminophen overdose ameliorates hepatotoxicity. J Hepatol, 77(1):71-83. doi: 10.1016/j.jhep.2022.01.020
  • Ghallab A, González D, Strängberg E, …, Duda JC, Drenda C, Kappenberg F, …, Rahnenführer J, …, Dawson PA, Lindström E, Hengstler JG (2024). Inhibition of the renal apical sodium dependent bile acid transporter prevents cholemic nephropathy in mice with obstructive cholestasis. J Hepatol, 80(2):268-281. doi: 10.1016/j.jhep.2023.10.035
  • Keller M, Rohlf K, Glotzbach A, …, Rahnenführer J, Overbeck N, Reinders J, Cadenas C, Hengstler JG, Edlund K, Marchan R (2023). Inhibiting the glycerophosphodiesterase EDI3 in ER-HER2+ breast cancer cells resistant to HER2-targeted therapy reduces viability and tumour growth. J Exp Clin Cancer Res, 42(1):25. doi: 10.1186/s13046-022-02578-w