C2D3 Computational Biology

We are living in a very exciting time for biology: whole-genome sequencing has opened up the field of genome-scale biology, and with it a trend towards larger-scale experiments, whether based on DNA sequencing or on other technologies such as microscopy. However, it is also a time of great opportunity for small-scale biology, as there is a new wealth of data to build on: one can turn to a computer to ask questions that previously might have taken months to answer in the laboratory. One of the great challenges for the field is analysing the large amounts of complex data generated and synthesising them into useful systems-wide models of biological processes. Whether on a large or a small scale, mathematical and computational methods are becoming an integral part of biological research.

There remains a worldwide shortage of skilled computational biologists. An important part of C2D3 Computational Biology is an MPhil course based at the Centre for Mathematical Sciences. The 11-month course introduces students to bioinformatics and other quantitative aspects of modern biology and medicine. It is intended especially for those whose first degree is in mathematics and computer science, and for others wishing to learn about the subject in preparation for a PhD course or a career in industry. Complementing the MPhil course is the Wellcome Trust PhD programme in Mathematical Genomics and Medicine. Run jointly with the Wellcome Trust Sanger Institute, this programme provides opportunities for collaborative research across the Cambridge region at the exciting interfaces between mathematics, genomics and medicine.

History and financial support 

C2D3 Computational Biology came about through the merger of the Cambridge Computational Biology Institute (CCBI) into C2D3 in 2021. The CCBI was established in 2003 to promote computational biology, interpreted broadly, within the University and in the region. It established the MPhil in Computational Biology programme in 2004, founded the Wellcome Trust Mathematical Genomics and Medicine 4-year PhD programme in 2011, and, among other activities, started a popular annual computational biology symposium. The CCBI was involved in setting up and helping to run the Cambridge Big Data (CBD) Strategic Research Initiative, out of which the C2D3 Interdisciplinary Research Centre was formed. Similarly, the CCBI was part of the group that helped set up the Alan Turing Institute.

The CCBI received financial support equally from the four science schools of the University: 

  • The School of the Biological Sciences      
  • The School of Clinical Medicine      
  • The School of the Physical Sciences (via DAMTP, Physics, Chemistry)      
  • The School of Technology (via Engineering, Computer Science) 

Space was kindly provided by the Department of Applied Mathematics and Theoretical Physics, within the Centre for Mathematical Sciences. 

MPhil in Computational Biology  

The Cambridge-MIT Institute provided funds to establish the MPhil in Computational Biology and subsequently studentships have been provided by: 

  • Biotechnology and Biological Sciences Research Council      
  • Cancer Research UK      
  • Engineering and Physical Sciences Research Council      
  • Medical Research Council      
  • Microsoft Research 

MGM PhD Programme 

The PhD programme in Mathematical Genomics and Medicine is funded by the Wellcome Trust.

Mailing list

To sign up to the mailing list, with the option to join the C2D3 main mailing list, please complete the appropriate form here.


Does computational biology explain Fibonacci numbers in plants?

Wednesday, 1 February 2023, 2.00pm to 3.00pm
Speaker: Jonathan Swinton
Venue: CMS, Meeting Room 15

The appearance of Fibonacci numbers in plant spirals is one of the most famous of all the mathematical structures in the natural world, while the genes involved in plant pattern development (phyllotaxis) are increasingly well characterised. Since there are no genes coding for Fibonacci numbers, proponents of systems biology might well hope this is an ideal setting in which to demonstrate the power of a combination of computational modelling and molecular detail. Yet although there is now a standard and mathematically satisfying argument for why Fibonacci structure _might_ be seen in models of plant development, there remain a number of outstanding questions, both mathematical and biological, which need to be convincingly answered before the Fibonacci problem can take its place as a paradigm for mathematical modelling in molecular biology. Jonathan Swinton will show how to find Fibonacci numbers, outline the open questions in the field and introduce his new book Mathematical Phyllotaxis.

Research in the Goldman group: pandemic-scale phylogenetics, and optimizing new sequencing technologies

Wednesday, 8 February 2023, 2.00pm to 3.00pm
Speakers: Nick Goldman and Nicola De Maio, EMBL-European Bioinformatics Institute
Venue: CMS, Meeting Room 15

We will give a summary of the ongoing research in the Goldman group at EMBL-EBI, and of possible projects for the students. First, we will describe our work in large-scale phylogenetics, and in particular the development of a new algorithm for fast and accurate phylogenetic analysis of millions of SARS-CoV-2 genomes. Phylogenetic methods are essential in studying evolution, as exemplified by their use during the SARS-CoV-2 pandemic to identify new variants, reconstruct the origin of the virus, and trace transmission, among many other applications. Phylogenetic analyses have, however, been hindered by their high computational demand and low reliability, issues that we address with a new algorithm specifically developed for scenarios of low divergence between genomes. Secondly, we are working on optimizing the data-gathering capabilities of third-generation genome-sequencing technologies. Nanopore sequencing devices now have the ability to 'reject' DNA fragments they have started to read, and we have been working on methods for deciding which fragments to read in their entirety and which to reject. We will describe two experimental contexts in which our methods can be useful, one of which is working and recently published and one of which is still a work in progress.

Mammalian Synthetic Biology – Biomolecular Circuits as Medicine

Monday, 13 February 2023, 2.00pm to 3.00pm
Speaker: Xiaojing Gao, Stanford University
Venue: CRUK CI

Dr. Xiaojing Gao is an Assistant Professor of Chemical Engineering at Stanford University. He received a B.S. in Biology from Peking University and a Ph.D. in Biology from Stanford University, and completed his postdoctoral training in Biology and Biological Engineering at Caltech. His lab tackles fundamental engineering challenges across different levels of complexity, such as (1) protein components that minimize their crosstalk with human cells and their immunogenicity, (2) biomolecular circuits that function robustly in different cells and are easy to deliver, (3) multicellular consortia that communicate through scalable channels, and (4) therapeutic modules that interface with physiological inputs/outputs. The lab's engineering targets include biomolecules, molecular circuits, viruses, and cells, and its approach combines quantitative experimental analysis with computational simulation. The molecular tools they build will be applied to diverse fields such as immunology, neurobiology, and cancer therapy. In this talk, Xiaojing will share the lab's recently developed tools for controlling protein secretion and sensing RNA in living cells, and the applications they envision for basic research and therapeutics.

Mapping DNA replication stress in parasites and cancer cells with long-read sequencing and AI

Wednesday, 15 February 2023, 2.00pm to 3.00pm
Speaker: Dr. Michael A. Boemo, Research Group Leader, Department of Pathology
Venue: CMS, Meeting Room 15

Every time a cell divides, it must copy (or “replicate”) its genome exactly once, which it achieves through the parallel action of thousands of replication forks. One of the most serious errors in DNA replication occurs when replication forks stall, which happens when the fork encounters an obstacle that it cannot pass. The frequent slowing or stalling of replication forks, termed “replication stress”, is rare in healthy human cells but common in both cancer cells and parasites. Replication stress is therefore a common therapeutic target for anti-malarial and cancer chemotherapies, but we have a relatively poor understanding of where, when, why, and how often replication forks stall under these therapies. I will discuss our recent progress towards answering these questions, whereby we are using long-read nanopore DNA sequencing together with AI to measure the movement and stress of thousands of replication forks across the genomes of human cancer cells and the malaria parasite Plasmodium falciparum.

Seeing the Unseen (TBC)

Wednesday, 22 February 2023, 2.00pm to 3.00pm
Speaker: Jordan P. Skittrall, NIHR Clinical Lecturer in Virology and Honorary Specialty Registrar in Infectious Diseases and Medical Virology, Division of Virology, Department of Pathology
Venue: CMS, Meeting Room 15

Abstract not yet available.

About us

The Cambridge Centre for Data-Driven Discovery (C2D3) brings together researchers and expertise from across academic departments and industry to drive research into the analysis, understanding and use of data science and AI. C2D3 is an Interdisciplinary Research Centre at the University of Cambridge. The Centre:

  • Supports and connects the growing data science and AI research community 
  • Builds research capacity in data science and AI to tackle complex issues 
  • Drives new research challenges through collaborative research projects 
  • Promotes and provides opportunities for knowledge transfer 
  • Identifies and provides training courses for students, academics, industry and the third sector 
  • Acts as a gateway for external organisations 