PHD2022-05

Mito-learner, a deep learning based algorithm to classify functional mitochondrial states

Host laboratory and collaborators

Bianca HABERMANN (IBDM) / bianca.habermann@univ-amu.fr

Paul VILLOUTREIX (LIS) / paul.villoutreix@univ-amu.fr

Abstract

Mitochondria are essential organelles that produce cellular energy in the form of ATP. They are also involved in many other cellular processes (1) and functionally adapt to cellular needs. In this project, we will combine deep learning with mitochondrial computational biology to develop "mito-learner", a deep learning-based algorithm for predicting the mitochondrial functional states of tumors (taken from the TCGA database (2)) based on mito-gene expression states. We will test different visual representations of mito-gene expression data for classifying samples with image-based deep learning algorithms. In parallel, we will explore graph neural network methods (3), with the aim of encoding pairwise gene interactions in the most general manner. We will focus on 1200 mito-genes, using the functional information provided by the annotated mitoXplorer mito-interactomes (4), to classify the mitochondrial states of tumor samples. Ultimately, mito-learner will be implemented in the mitoXplorer platform to classify the mitochondrial states of user-provided data.

Keywords

Deep learning, data visualization, graph neural network, mitochondria, mitoXplorer

Objectives

1. test and optimize different image- and graph-based visualization techniques for classifying RNA-seq data using deep learning algorithms
2. apply deep learning-based classification of mitochondrial states in different tumors and tumor stages based on expression data from mitochondria-associated genes (mito-genes)
3. implement the deep learning-based classification algorithm in mitoXplorer
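
As an illustration of the first objective, the following is a minimal sketch of one possible image representation of a single sample. It is not the method the project will settle on. The function name `expression_to_image`, the log2(x + 1) transform, the per-sample scaling and the zero padding are all assumptions made for this example. The gene order is also hypothetical; ordering genes by functional category, so that related genes end up as neighbouring pixels, would be one option to test.

```python
import math


def expression_to_image(expression, gene_order):
    """Arrange one sample's mito-gene expression values into a square grid.

    expression: dict mapping gene name -> non-negative expression value
    gene_order: fixed list of gene names; it determines the pixel positions
    Returns a list of rows with values in [0, 1]. Genes missing from the
    sample and the padding cells are set to 0. (Hypothetical helper.)
    """
    # log2(x + 1) compresses the wide dynamic range of RNA-seq values
    logged = [math.log2(expression.get(g, 0.0) + 1.0) for g in gene_order]

    # Scale by the sample maximum so the brightest pixel is 1.0
    top = max(logged) if logged else 0.0
    scaled = [v / top if top > 0 else 0.0 for v in logged]

    # Smallest side length whose square holds all genes
    n = len(scaled)
    side = math.isqrt(n - 1) + 1 if n > 0 else 0

    # Pad with zeros and cut the flat list into rows
    padded = scaled + [0.0] * (side * side - n)
    return [padded[i * side:(i + 1) * side] for i in range(side)]


# Toy sample with three hypothetical genes
image = expression_to_image({"A": 3.0, "B": 0.0, "C": 1.0}, ["A", "B", "C"])
```

With 1200 genes this layout gives a 35 x 35 grid with 25 padding cells. Because the same `gene_order` is used for every sample, a given pixel always refers to the same gene, which an image-based classifier relies on.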

Proposed approach (experimental / theoretical / computational)

We will apply deep learning techniques to RNA-seq data to classify mitochondrial states. Both image-based and network-based deep learning methods for quantitative expression data will be tested, and the most successful method will be developed further and adapted to our needs. Only mitochondria-associated genes (mito-genes), as defined in the mitoXplorer interactomes (4), will be used to classify the different mitochondrial states. Expression data from The Cancer Genome Atlas (TCGA), as well as from other population-based studies, will be used for testing and classification. The algorithm that performs best in classifying mitochondrial states from gene expression data will be implemented in the mitoXplorer platform to enable the classification of unseen user data, resulting in mito-learner.
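
To make the network-based option more concrete, the sketch below shows one round of mean-aggregation message passing over a gene interaction graph, which is the unlearned core of many graph neural network layers. It is only an illustration under stated assumptions: the function name `message_passing_step`, the toy genes and the edge list are hypothetical, and a real graph neural network would add learned weights, a non-linearity and a final read-out for classification.

```python
def message_passing_step(expression, edges):
    """One round of mean aggregation over an undirected gene graph.

    expression: dict mapping gene name -> expression value (node feature)
    edges: iterable of (gene_a, gene_b) pairs, e.g. pairwise interactions
    Returns a dict in which each gene's value is the mean over itself and
    its direct neighbours. (Hypothetical helper, no learned parameters.)
    """
    # Every gene starts with a self-loop so it keeps part of its own signal
    neighbours = {gene: {gene} for gene in expression}

    # Add each interaction in both directions; skip genes not in the sample
    for a, b in edges:
        if a in neighbours and b in neighbours:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Average the feature over each neighbourhood
    return {
        gene: sum(expression[n] for n in nbrs) / len(nbrs)
        for gene, nbrs in neighbours.items()
    }


# Toy graph: A and B interact, C is isolated
smoothed = message_passing_step({"A": 1.0, "B": 3.0, "C": 5.0}, [("A", "B")])
```

Stacking several such steps lets information travel along longer paths in the interactome, which is one way pairwise gene interactions can enter the classifier.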

Interdisciplinarity

This project combines classical mitochondrial biology with computational biology and deep learning. The expertise in mitochondrial biology and computational biology comes from the team of Bianca Habermann, while the team of Paul Villoutreix contributes the expertise in deep learning and its different techniques. The student is expected to spend time in both teams and to participate in the lab activities of both.

Expected profile

The PhD candidate should have a background in computer science, applied mathematics, bioinformatics or computational biology. Ideally, s/he will already have experience with machine learning.

Is this project the continuation of an existing project or an entirely new one? In the case of an existing project, please explain the links between the two projects

This is a new project.

2 to 5 references related to the project

1. Scheffler I.E. Mitochondria. 2007; Hoboken, NJ John Wiley & Sons, Inc.
2. Cancer Genome Atlas Research Network. Nat Genet. 2013 doi: 10.1038/ng.2764.
3. Muzio G, O’Bray L, Borgwardt K. Biological network analysis with deep learning. Briefings in Bioinformatics. 2020
4. Yim A, Koti P, et al. Nucleic Acids Res. 2020 doi: 10.1093/nar/gkz1128

3 main publications from each PI over the last 5 years

• Pierrelée, M., Reynders, A., Lopez, F., Moqrich, A., Tichit, L., Habermann, B.H. Introducing the novel Cytoscape app TimeNexus to analyze time-series data using temporal MultiLayer Networks (tMLNs). Sci Rep 11, 13691 (2021). https://doi.org/10.1038/s41598-021-93128-5.

• Meiler A, Marchiano F, Haering M, Weitkunat M, Schnorrer F, Habermann BH. AnnoMiner is a new web-tool to integrate epigenetics, transcription factor occupancy and transcriptomics data to predict transcriptional regulators. Sci Rep. 2021 Jul 29;11(1):15463. doi: 10.1038/s41598-021-94805-1. PMID: 34326396.

• Yim A, Koti P, Bonnard A, Marchiano F, Dürrbaum M, Garcia-Perez C, Villaveces J, Gamal S, Cardone G, Perocchi F, Storchova Z, Habermann BH. mitoXplorer, a visual data mining platform to systematically analyze and visualize mitochondrial expression dynamics and mutations. Nucleic Acids Res. 2020 Jan 24;48(2):605-632. doi: 10.1093/nar/gkz1128

• What can machine learning do for developmental biology. Paul Villoutreix, Development, 2021

• Towards a general framework for spatio-temporal transcriptomics. Julie Pinol, Thierry Artières, Paul Villoutreix, NeurIPS, LMRL workshop, 2020

• Synthesizing developmental trajectories. Paul Villoutreix*, Joakim Andén*, Bomyi Lim, Hang Lu, Yannis Kevrekidis, Amit Singer, Stanislav Y. Shvartsman – PLoS Comput Biol 13(9): e1005742, 2017.