
From Big Data to Data-Driven Discovery

Posted 15 June 2020, by Ellen Ashmore

The graduation from Strategic Research Initiative to Interdisciplinary Research Centre powered Cambridge Big Data to a bigger stage, bringing us the Cambridge Centre for Data-Driven Discovery (C2D3).

From Big Data 

Cambridge Big Data (CBD) was formed in 2013, chaired by Professor Paul Alexander, Professor in Astrophysics, and coordinated by Dr Clare Dyer-Smith.

Through the commitment of Paul, Clare and the Steering Committee, a large network of big data and data science experts was established across the University of Cambridge and with external collaborators. CBD enabled large-scale multidisciplinary research challenges by developing new research collaborations and supporting knowledge transfer across disciplines. It raised the profile of big data, data science, artificial intelligence and machine learning research at Cambridge. CBD hosted and supported numerous research workshops, showcases, conferences and small meetings spanning disciplines right across academia, from mathematics, biological sciences, engineering and manufacturing, and social science and policy to medicine.



The Alan Turing Institute

CBD was instrumental in bringing together the proposed team for the University’s successful bid to establish the Alan Turing Institute (ATI), building on CBD’s linked network of big data expertise. In 2015, Cambridge joined Edinburgh, Oxford, Warwick and UCL as founding university partners of the ATI, along with the UK Engineering and Physical Sciences Research Council. The ATI is the UK’s national institute for data science and recently added artificial intelligence to its portfolio. The ATI has opened doors for Cambridge staff and students, offering the opportunity to join ambitious research programmes and access training. Approximately 70 students, researchers and academics from Cambridge have now connected with the ATI, the majority through formal programmes and positions, including the student Enrichment programme, Doctoral Students, Visiting Researchers and Turing Fellows. The Turing University Lead for Cambridge is Professor Zoe Kourtzi, a member of the CBD/C2D3 Steering Committee.



To Data-Driven Discovery

Bolder ambitions and bigger impacts paved the way for a proposal to the University Research Policy Committee for the establishment of a new Interdisciplinary Research Centre (IRC). This was successful and CBD graduated to an IRC in October 2019.

As an IRC, the centre laid out four key, complementary elements: an academic research centre and leadership for interdisciplinary data science; technical, computing and data science services supporting the wider community in Cambridge; a community – a linked and interdisciplinary network of SRI members; and a secretariat to support the various planned activities.

To take forward the key elements, CBD transformed into the Cambridge Centre for Data-Driven Discovery (C2D3). The change of name better reflected the wide network and diverse research applications within C2D3. C2D3 seeks to provide an intellectual space for the University to further the theoretical foundations of data science, apply the methodology to real-world problems and work alongside industry leaders.

C2D3 was thrilled to announce the UK’s largest insurance company, Aviva, as its founding industry partner. Aviva matched C2D3’s bold ambitions from the start of the relationship: PhD students, collaborative research, training courses for Aviva staff, and support of the Cambridge University Data Science Society (predominantly aimed at students) were all achieved in just the first year of the partnership.




The first year as an IRC has presented a major practical and intellectual challenge for C2D3 researchers, and indeed for the world, in the form of COVID-19. The coronavirus pandemic of 2020 will be remembered for decades to come and analysed in great depth, ideally yielding lessons that help mitigate future outbreaks of such infectious diseases.

It is with great pride that we report that many of our members are actively involved in the global fight against COVID-19, contributing world-class research to the pool of information on the novel disease. Their contributions have included machine learning modelling of coughs and breathing (COVID-19 Sounds App); machine learning methods for healthcare infrastructure; membership of the Royal Society DELVE data analytics working group; work on understanding the role of intellectual property in the pandemic; policy responses to the pandemic; and optimising patient care through machine learning, mathematics and statistical methods.


Virtual symposium

Adapting to working from home and virtual communications will enable C2D3 to continue its research outreach and networking through the C2D3 Virtual Symposium – Research Rendezvous. On 21 October 2020, the C2D3 Research Rendezvous will showcase exciting research from across the University alongside talks from industry and independent organisations.

The C2D3 Research Rendezvous seeks to spark new research questions, create new collaborations and connect distant parts of the data science community at the University and beyond. Ultimately C2D3 aims to drive new research programmes. For example, one such research programme has already emerged based around the theme of Hierarchical Bayesian Modelling. A workshop in early 2020 brought together researchers and academics from a wide range of disciplines, including ecology, neuroscience, manufacturing, infectious diseases, chronic diseases, astronomy and cosmology, and education. Following the success of the workshop, a research grant application was developed between a new set of collaborators from three different University Schools.


The Future

The future of C2D3 is exciting as we work to bring on board further industry partners, enable new academic collaborations and provide career opportunities through Research Fellows and PhD studentships. C2D3 will continue to be a conduit for data science activity across the University, retaining strong links with the ATI and providing an academic centre and leadership for interdisciplinary aspects of data science.


About us

The Cambridge Centre for Data-Driven Discovery (C2D3) brings together researchers and expertise from across the academic departments and industry to drive research into the analysis, understanding and use of data science and AI. C2D3 is an Interdisciplinary Research Centre at the University of Cambridge.

  • Supports and connects the growing data science and AI research community 
  • Builds research capacity in data science and AI to tackle complex issues 
  • Drives new research challenges through collaborative research projects 
  • Promotes and provides opportunities for knowledge transfer 
  • Identifies and provides training courses for students, academics, industry and the third sector 
  • Serves as a gateway for external organisations 
