# Computational Biology Major

Master 1 courses

# 1st Semester

## Shared courses

**Organisation:** TD (18h)

**Lecturers:** Julien Lefèvre

**Evaluation:** projects and oral presentation

This teaching unit presents the main professions involved in biological modelling. It takes two forms. First, seminars will be given by speakers from outside Aix-Marseille University, from both academia and industry. Second, students will be immersed in CENTURI laboratories, where they will discover multidisciplinary research topics. To conclude the unit, students will be asked to present a specific problem related to data processing and modelling. The scientific aspect must also be integrated into a reflection on the underlying professional issues, whether in the academic or the private sector.

**Organisation:** Lectures (6h), TD (6h), TP (6h)

**Lecturers:** Sylvain Sené, Elisabeth Remy

**Evaluation:** projects and final written exam

This course is an introduction to the basics of finite dynamical systems and automata networks (definitions of local functions, global function/relation, automata, interaction graph, transition graph) as well as their main static and dynamic properties. Part of the lectures will also focus on the parallel update mode.
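As an illustration of these notions, here is a minimal sketch of a small Boolean network updated in parallel. The three local functions are hypothetical and chosen only for the example; they are not taken from the course material. The code builds the transition graph of the global function and lists its fixed points.

```python
from itertools import product

# Hypothetical network of three Boolean automata (illustrative only).
# Each local function reads the global state x = (x0, x1, x2) and returns
# the next value of one automaton.
local_functions = [
    lambda x: int(not x[2]),  # automaton 0 is inhibited by automaton 2
    lambda x: x[0],           # automaton 1 is activated by automaton 0
    lambda x: x[1],           # automaton 2 is activated by automaton 1
]


def parallel_update(state):
    """Global function: apply every local function to the same current state."""
    return tuple(f(state) for f in local_functions)


# Transition graph under the parallel update mode: one outgoing edge per state.
states = list(product((0, 1), repeat=len(local_functions)))
transition_graph = {state: parallel_update(state) for state in states}

# Fixed points are the states that are their own successor.
fixed_points = [s for s, t in transition_graph.items() if s == t]

for state, successor in transition_graph.items():
    print(state, "->", successor)
print("fixed points:", fixed_points)
```

The interaction graph of this example is a single cycle containing one inhibition, so the network has no fixed point and every trajectory ends in a limit cycle.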

The lectures should give students the skills to implement modelling approaches (differential, logical, stochastic or deterministic equations) in order to develop mathematical models of a biological system, to analyze mathematical models and biological data in order to understand complex systems, to evaluate the fit between a biological question, the available data and mathematical formalisms, and to interpret and validate a study.

**Organisation:** Lectures (18h)

**Lecturers:** Thomas Lecuit, Jacques van Helden, Michael Kopp, Laurence Röder, François Muscatelli

**Evaluation:** final written exam

This course is divided into two parts taught during the 1st and 2nd semesters of CMB. The first part of this module is a presentation of the evolutionary theories that have founded modern biology (from Lamarck to Darwin), and a synthesis of the discoveries that have led to current concepts of molecular and cellular biology: the role of macromolecules in cell function (information transfer between DNA, RNA, proteins, regulation, etc.), heredity and cellular adaptation.

Example topics covered during the lectures:

- Information and the causes of evolution in living organisms
- Cellular information
- Epigenetics: phenomena, information, adaptation, mechanisms

**Organisation:** Lectures (14h), TP (16h)

**Lecturers:** Laurent Pézard

**Evaluation:** projects and final written exam

Computational biology will introduce the biological concepts necessary to model complex systems. Students will learn to implement modelling approaches (differential, logical, stochastic or deterministic equations) in order to develop mathematical models of a biological system, to analyze mathematical models and biological data in order to understand complex systems, and to assess the fit between a biological question, the available data and mathematical formalisms.

The course is divided into two sections:

- computational neuroscience: dynamic models of neuron function (simulation of dynamic behaviour, biological aspects, computational complexity, analytical aspects)
- bioinformatics: alignment, molecular phylogeny, prediction and modelling of structural aspects of proteins, cis-regulation

**Organisation:** Lectures (12h), TD (12h), TP (12h)

**Lecturers:** Sylvain Sené

**Evaluation:** continuous exams

This course will teach students how to link biological problems, available data and mathematical formalisms by tackling the important notions and concepts used in algorithms and programming. The course is divided into three sections:

- Unix: file system and basic shell commands, text utilities, redirections, pipes

- Programming language (Python): basic principles of imperative programming, control flow, basic data structures, local and external modules, functions

- Algorithms: arrays, sorting, lists, stacks, queues

**Organisation:** Lectures (12h), TD (12h), TP (12h)

**Lecturers:** Jacques van Helden, Charlotte Perrin

**Evaluation:** projects and final written exam

*This course is divided into two parts.*

**Probability and statistics for modeling 1**

The first part of this course is a quick revision of the basics of probability and statistics. The concepts will be taught in relation to concrete biological examples (genome analysis, complex systems). The following concepts will be taught:

- Combinatorial analysis
- Probability concepts
- Discrete distributions (Bernoulli, geometric, hypergeometric, binomial, Poisson)
- Quick review of the basic continuous distributions (normal, Student)
- Estimation and sampling
- Hypothesis tests

**Continuous dynamical systems and modeling, examples**

The purpose of this course is to introduce some of the simplest differential equations and systems of differential equations that underlie the main continuous models used in biology (dynamics of populations or cells, biochemical processes, etc.). We will address both qualitative (long-time behavior) and quantitative (positivity, parameter dependency) properties of the models considered. In parallel with this theoretical study, numerical simulations will be performed during the computer sessions. Practicals will use specialised Python libraries such as scipy.integrate to visualise trajectories and system behaviours.
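The kind of numerical simulation described above can be sketched as follows, assuming a logistic population model. The model and the parameter values `r`, `K` and `N0` are hypothetical choices made for the example, not part of the course description. The script integrates the equation with `scipy.integrate.solve_ivp`, checks positivity and looks at the long-time behaviour.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical parameters of a logistic growth model dN/dt = r * N * (1 - N / K).
r = 1.0      # growth rate
K = 100.0    # carrying capacity
N0 = 5.0     # initial population size


def logistic(t, N):
    """Right-hand side of the logistic equation."""
    return r * N * (1.0 - N / K)


# Integrate on [0, 20] and sample the trajectory on a regular time grid.
t_eval = np.linspace(0.0, 20.0, 201)
solution = solve_ivp(logistic, (0.0, 20.0), [N0], t_eval=t_eval,
                     rtol=1e-8, atol=1e-10)

trajectory = solution.y[0]
print("stays positive:", bool(np.all(trajectory > 0)))
print("final population:", trajectory[-1])
```

The arrays `solution.t` and `trajectory` can then be passed to a plotting library to visualise the trajectory, which should approach the carrying capacity `K`.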

## Bioinformatics courses

**Organisation:** Lectures (9h), TP (9h)

**Lecturers:** Andrew Saurin, Laurence Röder

Genomics is an interdisciplinary field of modern biology that studies the full genetic complement (the genome) of an organism. With nearly 3,500 genomes now sequenced, the post-genomic era has considerably improved our knowledge of the origin of diversity and the evolution of genomes, cellular pathways, organismal phenotypes and human diseases. The course combines lectures with practical, informatics-oriented sessions and is organised around three main topics: the methods of genome analysis, the evolution of eukaryotic genomes, and the power of comparative genomics to understand gene regulation. It will finish by exploring how genomics gives rise to epigenomics, and how both advance our understanding of gene regulation and its dysfunction in the development of human diseases.

**Organisation:** Lectures (6h), TD (6h), TP (6h)

**Lecturers:** Anna Frid

**Evaluation:** continuous monitoring and final written exam

This course will review some basic notions of algebra and analysis and teach how to program them in Python and visualize the results. The purpose is to set up the matrix tools necessary for regression:

- Matrix calculation: linear systems, matrix multiplication
- determinant calculation with Python, matrix inverse with Python
- calculation of eigenvalues/eigenvectors with Python
- R^d, norm, scalar product, transpose of a matrix
- multivariate linear regression
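The matrix tools listed above can be sketched in Python as follows, assuming NumPy is the library used (the course description does not name one). The matrix `A`, the simulated predictors and the coefficients `true_beta` are hypothetical values chosen for the example. The regression is solved through the normal equations, which combine the transpose, the matrix product and a linear system.

```python
import numpy as np

# Hypothetical 2x2 symmetric matrix used to illustrate the basic tools.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

det_A = np.linalg.det(A)                       # determinant
A_inv = np.linalg.inv(A)                       # matrix inverse
eigenvalues, eigenvectors = np.linalg.eig(A)   # eigenvalues and eigenvectors

# Multivariate linear regression on simulated, noise-free data:
# y = 1 + 2 * x1 - 3 * x2 (coefficients chosen for the example).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
true_beta = np.array([1.0, 2.0, -3.0])

# Design matrix with a leading column of ones for the intercept.
design = np.column_stack([np.ones(len(X)), X])
y = design @ true_beta

# Normal equations: (design^T design) beta = design^T y.
beta_hat = np.linalg.solve(design.T @ design, design.T @ y)

print("determinant:", det_A)
print("eigenvalues:", eigenvalues)
print("estimated coefficients:", beta_hat)
```

On real, noisy data, `np.linalg.lstsq` is usually preferred to the normal equations because it is numerically more stable.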

# 2nd Semester

## Shared courses

**Organisation:** Lectures (12h)

**Lecturers:** Bianca Habermann, Laurent Tichit

**Evaluation:** project and continuous monitoring

Scientific seminars constitute a good way to broaden your scientific horizon. In this regard, MSc students will frequently attend CENTURI seminars. At the end of the semester, students will be asked to write a summary of two seminars they have attended.

Students will learn to work in an interdisciplinary group, to explore a subject in depth and to communicate about it.

**Organisation:** Lectures (12h), TD (12h)

**Lecturers:** François Muscatelli, Thomas Lecuit, Dominique Payet, Guillaume Voisinne, Valery Matarazzo, Julie Koenig

**Evaluation:** final written exam

The second part of this module will show how these molecular mechanisms underlie the development and functioning of tissues and organisms. It will be structured around four areas: intergenerational transmission of traits, organism development, the immune system and the nervous system.

Example topics covered during the lectures:

- Information and organization: intergenerational transmission (cells, organisms)
- Development of organisms
- Information and organization of the immune system
- Information and organization of the nervous system

**Organisation:** Lectures (12h), TD (12h), TP (12h)

**Lecturers:** Victor Chepoi, Kolja Knauer

**Evaluation:** projects and final written exam

*This course is divided into two parts.*

**Statistics for biology**

Statistics for biology aims to give students a practical approach to the analysis of biological data with R, based on the concepts acquired in the course “Probability and statistics for modeling 1”. The associated mathematical foundations will be developed in the course “Advanced statistics”.

- Sampling and estimation (moments, robust estimators, confidence intervals)
- Fitting
- Additional distributions
- Hypothesis testing (mean comparison, goodness of fit, …)

**Graph theory and algorithms 1**

This introductory course focuses on graphs as mathematical objects and on some of their applications to biological networks. After introducing different classes of graphs and their properties, the following points will be developed:

- Planar graphs, graphs on a surface, Euler characteristic
- Interval graphs, perfect graphs

**Organisation:** Project

**Evaluation:** project

At the end of the courses Professional perspectives for biological systems modelling and Fundamentals of biology 1, students will choose a scientific article at the interface of several disciplines and work on it in groups. In a written report and an oral presentation, they will explain the biological context and the related basic concepts, explain the methods used to interpret the biological data, and summarise the results obtained in the article. Through this course, students will learn to work in an interdisciplinary group, to explore a subject in depth and to communicate about it.

**Organisation:** Lectures (6h), TD (6h), TP (6h)

**Lecturers:** Pierre Pudlo

**Evaluation:** projects and final written exam

This course will tackle advanced notions in statistics such as:

- Statistical inference (fundamental concepts, estimators, confidence intervals and tests, squared error, bias and variance)
- Likelihood (Fisher information, likelihood ratio)
- Exponential family
- Convergence
- Multivariate Gaussian distributions

**Lecturers:** Denis Puthier

**Evaluation:** projects and final written exam

*This course is divided into two parts.*

**Bioinformatic analysis:** TD (15h), TP (15h)

This course is a first approach to the analysis of high-throughput omics data (e.g. DNA sequencing). It will first introduce bioinformatics methods and their goals (e.g. quality control, mapping reads to a genome, searching for ChIP-seq peaks, etc.). The different file formats needed to carry out these analyses will be presented, in particular those used in epigenomic sequencing (e.g. ChIP-seq, RNA-seq, etc.).

**Statistical analysis:** TD (10h), TP (20h)

During this course you will learn more about concepts of probability and statistics applied to biological models (e.g. epigenomic data sets). These biological models will be picked amongst the models presented in other courses.

Here are some examples:

- Statistical concepts: Fisher's exact test, hypergeometric distribution, multiple testing, unsupervised classification
- Biological models: detection of differentially expressed genes with count data, functional enrichment analysis, classification of expression profiles
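To show how these statistical concepts connect to the biological models, here is a minimal sketch of a functional enrichment test based on the hypergeometric distribution. Its upper tail gives the one-sided p-value of Fisher's exact test for the corresponding 2x2 table. The function names and the gene counts are hypothetical, chosen only for the example.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) when drawing `draws` genes without replacement from a
    population of `population` genes, `successes` of which are annotated."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def enrichment_p_value(overlap, population, successes, draws):
    """One-sided p-value P(X >= overlap): probability of seeing at least
    `overlap` annotated genes among the selected genes by chance."""
    upper = min(successes, draws)
    return sum(hypergeom_pmf(k, population, successes, draws)
               for k in range(overlap, upper + 1))


# Hypothetical counts: 20 genes in total, 5 annotated to a pathway,
# 5 differentially expressed genes, 3 of which belong to the pathway.
p_value = enrichment_p_value(overlap=3, population=20, successes=5, draws=5)
print("enrichment p-value:", p_value)
```

When many pathways are tested at once, the resulting p-values need a multiple testing correction before they are interpreted.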

## Bioinformatics courses

**Organisation:** Lectures (6h), TD (6h), TP (6h)

**Lecturers:** Michael Kopp

**Evaluation:** projects and final written exam

This course is intended for students with a background in biology. It aims to reinforce their theoretical foundations so that they can approach advanced statistics.

- Numerical sequences and series (limits, convergence)
- Continuous probability distributions (normal, Student, chi-squared, Snedecor)
- Introduction to the different notions of convergence in probability theory and to the relationships between distributions (convergence of one distribution to another)
- Law of large numbers and central limit theorem
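The last item can be illustrated with a short simulation. This is a sketch that assumes uniform random draws; the sample size, the number of replicates and the seed are arbitrary choices for the example. Sample means of independent uniform variables concentrate around 0.5 (law of large numbers), and their spread matches the value predicted by the central limit theorem.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so that the run is reproducible
sample_size = 30         # number of draws averaged in each sample
replicates = 2000        # number of independent samples

# Mean of `sample_size` uniform draws on [0, 1), repeated `replicates` times.
sample_means = [
    statistics.fmean(rng.random() for _ in range(sample_size))
    for _ in range(replicates)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# A uniform variable on [0, 1) has variance 1/12, so the central limit
# theorem predicts a standard deviation of sqrt(1 / (12 * sample_size)).
expected_sd = (1 / 12 / sample_size) ** 0.5

print("mean of sample means:", mean_of_means)
print("observed sd:", sd_of_means, "predicted sd:", expected_sd)
```

A histogram of `sample_means` should look approximately normal, even though the individual draws are uniform.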