MitoLearner: deep learning and explainable AI to understand the role of mitochondrial genes in cancer cells
Mitochondria are essential organelles that produce cellular energy in the form of ATP. They are also involved in many other cellular processes (1) and functionally adapt to cellular needs. In this project, we will combine deep learning with mitochondrial computational biology to develop "MitoLearner", a deep learning-based algorithm for predicting the mitochondrial functional states of tumors (taken from the TCGA database (2)). The expression states of genes involved in mitochondrial functions (mito-genes), represented as vectors, will be used to build image representations, allowing the use of image-based deep learning architectures (CNNs, Transformers). We want to test different representations and models, and to further develop visual interpretability methods that identify genes able to discriminate between healthy and tumor tissues, as well as between different tumor stages.
Deep learning, explainable AI, interpretability, data visualization, mitochondria, mitoXplorer
1. Build different image-based representations of RNA-seq data that allow both the recognition of tumors and the identification of genes and processes involved in tumor formation. Gene experts should also be able to identify genes and their links in these representations.
2. Apply and evaluate deep learning-based visual models to recognize different tumors and tumor stages from the produced representations.
3. Study visual interpretability methods to understand links between mito-genes and tumors.
4. Implement the deep learning-based methods in mitoXplorer.
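As an illustration of objective 1, the following is a minimal sketch of one possible image-based representation, not the representation the project will adopt. It assumes that each mito-gene is assigned to exactly one mitochondrial process and places one process per image row. The process names, the gene grouping, the log2 scaling and the helper name expression_to_image are all hypothetical choices made for this example.

```python
import numpy as np

# Hypothetical grouping of mito-genes by process (placeholder for illustration,
# not taken from the mitoXplorer interactomes).
PROCESS_TO_GENES = {
    "oxidative_phosphorylation": ["NDUFA1", "SDHA", "COX4I1", "ATP5F1A"],
    "tca_cycle": ["CS", "IDH2", "MDH2"],
    "fission_fusion": ["OPA1", "MFN2"],
}


def expression_to_image(expression, process_to_genes, fill_value=0.0):
    """Build a 2D array with one row per process and one column per gene slot.

    `expression` maps gene symbol -> non-negative expression value.
    Genes missing from `expression` are treated as 0; unused slots in shorter
    rows keep `fill_value`. Returns the image and a (row, col) -> gene layout.
    """
    processes = sorted(process_to_genes)
    width = max(len(genes) for genes in process_to_genes.values())
    image = np.full((len(processes), width), fill_value, dtype=np.float32)
    layout = {}
    for row, process in enumerate(processes):
        for col, gene in enumerate(process_to_genes[process]):
            # log2(x + 1) is an assumed scaling, chosen only for this sketch.
            image[row, col] = np.log2(expression.get(gene, 0.0) + 1.0)
            layout[(row, col)] = gene
    return image, layout


# Toy sample with made-up expression values.
sample = {"SDHA": 7.0, "CS": 3.0, "OPA1": 1.0}
image, layout = expression_to_image(sample, PROCESS_TO_GENES)
```

Keeping the (row, col) -> gene layout alongside the image is what would later let a saliency value at a pixel be read back as a gene name.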
Proposed approach (experimental / theoretical / computational)
Expression data from The Cancer Genome Atlas (TCGA), as well as from other population-based studies, will be used in this project. We will build, evaluate and optimize image-based representations of RNA-seq data. These representations should: 1) integrate the hierarchical links between genes, mitochondrial states (defined in the mitoXplorer interactomes (3)) and descriptions of mitochondria-associated genes (mito-genes); 2) be understandable and readable by gene experts; 3) offer levels of information that allow deep models to discriminate tumors and tumor stages. Once the deep models are trained, interpretability methods (4) will be applied to gain insight into the role of individual genes in discriminating tumor from healthy tissue, as well as early tumor stages. Successful algorithms will be implemented in the mitoXplorer platform to enable classification of unseen user data, resulting in MitoLearner.
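To make the interpretability step more concrete, here is a minimal sketch of how a saliency map over an image representation could be turned into a gene ranking. It uses plain occlusion sensitivity as a simple stand-in, not the CAM-based method of reference (4), and a toy linear scoring function in place of a trained deep model. The function names, the layout dictionary and all numeric values are assumptions made for this example.

```python
import numpy as np


def occlusion_saliency(image, score_fn, baseline=0.0):
    """Score drop observed when each pixel is replaced by `baseline`."""
    reference = score_fn(image)
    saliency = np.zeros_like(image, dtype=np.float64)
    for idx in np.ndindex(image.shape):
        occluded = image.copy()
        occluded[idx] = baseline
        saliency[idx] = reference - score_fn(occluded)
    return saliency


def rank_genes(saliency, layout):
    """Sort genes by absolute saliency, using a (row, col) -> gene layout."""
    scored = [(gene, float(saliency[pos])) for pos, gene in layout.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Toy stand-in for a trained classifier: a fixed linear "tumor score".
toy_weights = np.array([[0.5, -1.0], [2.0, 0.0]])


def toy_score(img):
    return float((toy_weights * img).sum())


# Toy 2x2 representation; the fourth pixel is padding and has no gene.
toy_image = np.array([[1.0, 2.0], [3.0, 0.0]])
toy_layout = {(0, 0): "GENE_A", (0, 1): "GENE_B", (1, 0): "GENE_C"}

toy_saliency = occlusion_saliency(toy_image, toy_score)
ranking = rank_genes(toy_saliency, toy_layout)
```

Occlusion needs one model evaluation per pixel, which is affordable for small gene images but is one reason gradient- or CAM-based methods may be preferred for larger representations.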
This project combines classical mitochondrial biology with computational biology and deep learning. The expertise in mitochondrial biology and computational biology comes from the team of Bianca Habermann, while the team of Ronan Sicre contributes the expertise in deep learning and its various techniques. The student is expected to spend time in both teams and to participate in lab activities.
The PhD candidate should have a background in computer science, applied mathematics, bioinformatics or computational biology. Ideally, they will already have experience with machine learning and/or deep learning. Experience with scikit-learn, TensorFlow or PyTorch will be beneficial.
Is this project the continuation of an existing project or an entirely new one? In the case of an existing project, please explain the links between the two projects
This is a new project.
2 to 5 references related to the project
1. Scheffler I.E. Mitochondria. 2007; Hoboken, NJ John Wiley & Sons, Inc.
2. Cancer Genome Atlas Research Network. Nat Genet. 2013 doi: 10.1038/ng.2764.
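3. Marchiano F, Haering M, Habermann BH. The mitoXplorer 2.0 update: integrating and interpreting mitochondrial expression dynamics within a cellular context. Nucleic Acids Res. 2022;50(W1):W490-9. doi: 10.1093/nar/gkac306.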
4. Zhang, Torres, Sicre, Avrithis, Ayache: “Opti-CAM: Optimizing saliency maps for interpretability”. Ongoing submission.
3 main publications from each PI over the last 5 years
- Marchiano F, Haering M, Habermann BH. The mitoXplorer 2.0 update: integrating and interpreting mitochondrial expression dynamics within a cellular context. Nucleic Acids Res. 50(W1):W490-9 (2022). doi: 10.1093/nar/gkac306.
- Pierrelée, M., Reynders, A., Lopez, F., Moqrich, A., Tichit, L., Habermann, B.H. Introducing the novel Cytoscape app TimeNexus to analyze time-series data using temporal MultiLayer Networks (tMLNs). Sci Rep 11, 13691 (2021). https://doi.org/10.1038/s41598-021-93128-5.
- Meiler A, Marchiano F, Haering M, Weitkunat M, Schnorrer F, Habermann BH. AnnoMiner is a new web-tool to integrate epigenetics, transcription factor occupancy and transcriptomics data to predict transcriptional regulators. Sci Rep. 2021 Jul 29;11(1):15463. doi: 10.1038/s41598-021-94805-1.
- Giffon, Ayache, Kadri, Artières, Sicre: "PSM-nets: Compressing Neural Networks with Product of Sparse Matrices." 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021.
- Ayache, Sicre, Artières: “Transfer Learning by Weighting Convolution”. 2020 International Joint Conference on Neural Networks (IJCNN). IEEE, 2020.
- Sicre, Avrithis, Kijak, Jurie: “Unsupervised part learning for visual recognition”. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.