Genetic and phenotypic comorbidities between Mendelian and common diseases from a network perspective

Host laboratory and collaborators

Aitor González / TAGC

Anaïs Baudot / MMG

Scientific background

Mendelian diseases show discrete phenotypes, usually triggered by monogenic mutations, whereas common diseases show continuous phenotypes that depend on numerous polygenic variations of weak effect. However, Mendelian and common diseases share both molecular and phenotypic features, leading to comorbidity relationships [1]. For instance, mutations in the transcription factor (TF) GATA5 cause congenital heart defects (CHD), and SNP variations around the same gene are involved in hypertension [2]. Hypertension is also a classical complication of CHD. We propose here to study these relationships between Mendelian and common diseases from a biological interaction network point of view. To this end, we will map disease features (e.g., SNP variations, mutations, phenotypes) to networks containing interactions between diseases, interactions between genes/proteins, and interactions with non-coding genomic regions. We will then develop innovative algorithms to extract comorbidity subnetworks from these multiplex, multipartite networks.

PhD Objectives

The objective of this PhD project is to investigate the shared architecture between Mendelian and common diseases at both the phenotypic and molecular levels by identifying subnetworks enriched in mutations and variations. This approach will also help predict the relevant variations (SNPs, eQTLs) that are implicated in common polygenic diseases or that act as modifiers of Mendelian diseases.
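As an illustration of what "enriched" could mean in practice, the sketch below scores a candidate subnetwork with a one-sided hypergeometric test. This is only one possible choice and is our assumption, not a method fixed by the project. The function name and all the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, overlap):
    """Return P(X >= overlap) under a hypergeometric null model.

    population: number of nodes in the whole network
    annotated:  number of nodes carrying a disease mutation or variation
    sample:     number of nodes in the candidate subnetwork
    overlap:    number of annotated nodes found in the subnetwork
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as the one observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 nodes, 50 annotated, a 20-node subnetwork with 5 hits.
p_value = hypergeom_enrichment_pvalue(1000, 50, 20, 5)
print(f"enrichment p-value: {p_value:.4g}")
```

In a real analysis the p-values would need correction for multiple testing across candidate subnetworks, and the null model may have to account for node degree.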

Proposed approach

We will first create a disease-disease network containing links between diseases that share phenotypes. Then, we will build a classical multiplex network composed of different layers of biological relationships. It will contain protein-protein interactions, but also molecular complexes and pathway interactions [3; 4]. We will extend this multiplex framework to networks of relationships with non-coding DNA loci by including TF-DNA [5] and DNA-DNA (Hi-C) interactions. Mutation and variation loci linked to Mendelian and common polygenic diseases will then be mapped onto these networks. Dedicated algorithms, such as random walk with restart and community detection strategies, will be developed and adapted to explore these extended multiplex and multipartite networks. They will help define subnetworks enriched in comorbid associations, thereby predicting regulatory interactions and biological processes that link the different disorders.
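To make the random walk with restart idea concrete, here is a minimal sketch on a single undirected network. It is a deliberate simplification: the multiplex and heterogeneous version described in [4] also allows jumps between layers, which are not modelled here. The toy edges, the gene names other than GATA5, and the restart probability of 0.7 are assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency: dict mapping each node to the set of its neighbours (undirected)
    seeds:     nodes where the walker starts and restarts (e.g. known disease genes)
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            # Spread the walking share of u's mass equally over its neighbours.
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Hypothetical toy network; only GATA5 is a name taken from the text above.
edges = [
    ("GATA5", "GENE_A"),
    ("GATA5", "GENE_B"),
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

scores = random_walk_with_restart(adjacency, seeds={"GATA5"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(f"{gene}\t{scores[gene]:.4f}")
```

The resulting scores rank every node by its proximity to the seeds, so nodes close to known disease genes receive higher scores than distant ones. Candidate comorbidity subnetworks could then be taken from the top-ranked nodes.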

PhD student’s expected profile

The PhD student should have a Master’s degree in an area related to Bioinformatics, Computer Science or Mathematics, with an interest in data analysis, graph theory and human genetics.

The project and the PhD student will benefit from the expertise of numerous experimental biologists working on Mendelian and common diseases in both laboratories. In particular, we will have access to in-house exome and transcriptome datasets for some diseases of interest. The TAGC laboratory studies complex traits and diseases, with a particular focus on the analysis of gene regulatory regions [5]. Within the TAGC, Aitor González investigates and models the non-coding regions and variants of the genome [6]. Marseille Medical Genetics (MMG) is a research center located at the La Timone Faculty of Medicine. The research team "Networks and Systems Biology for Diseases", led by Anaïs Baudot, applies and develops network approaches to extract information from large-scale biological data in order to investigate human disorders [3; 4; 7]. The project will be developed in collaboration with Daniel Rico and his team at Newcastle University, who are experts in network approaches to the analysis of genomic data [8].


[1] Blair, D. R.; Lyttle, C. S.; Mortensen, J. M.; Bearden, C. F.; Jensen, A. B.; Khiabanian, H.; Melamed, R.; Rabadan, R.; Bernstam, E. V.; Brunak, S.; Jensen, L. J.; Nicolae, D.; Shah, N. H.; Grossman, R. L.; Cox, N. J.; White, K. P. and Rzhetsky, A. (2013). A nondegenerate code of deleterious variants in Mendelian loci contributes to complex disease risk. Cell 155: 70-80.

[2] Padang, R.; Bagnall, R. D.; Richmond, D. R.; Bannon, P. G. and Semsarian, C. (2012). Rare non-synonymous variations in the transcriptional activation domains of GATA5 in bicuspid aortic valve disease. Journal of molecular and cellular cardiology 53: 277-281.

[3] Didier, G.; Brun, C. and Baudot, A. (2015). Identifying communities from multiplex biological networks. PeerJ 3: e1525.

[4] Valdeolivas, A.; Tichit, L.; Navarro, C.; Perrin, S.; Odelin, G.; Levy, N.; Cau, P.; Remy, E. and Baudot, A. (2018). Random Walk with Restart on Multiplex and Heterogeneous Biological Networks. Bioinformatics (Oxford, England).

[5] Chèneby, J.; Gheorghe, M.; Artufel, M.; Mathelier, A. and Ballester, B. (2018). ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-seq experiments. Nucleic acids research 46: D267-D275.

[6] Seyres, D.; Darbo, E.; Perrin, L.; Herrmann, C. and González, A. (2016). LedPred: an R/bioconductor package to predict regulatory sequences using support vector machines. Bioinformatics (Oxford, England) 32: 1091-1093.

[7] Ibáñez, K.; Boullosa, C.; Tabarés-Seisdedos, R.; Baudot, A. and Valencia, A. (2014). Molecular evidence for the inverse comorbidity between central nervous system disorders and cancers detected by transcriptomic meta-analyses. PLoS genetics 10: e1004173.

[8] Pancaldi, V.; Carrillo-de-Santa-Pau, E.; Javierre, B. M.; Juan, D.; Fraser, P.; Spivakov, M.; Valencia, A. and Rico, D. (2016). Integrating epigenomic data and 3D genomic structure with a new measure of chromatin assortativity. Genome biology 17: 152.