Research workshops and challenge areas | Cambridge Centre for Data-Driven Discovery

C2D3 works with groups of researchers, Departments, other Interdisciplinary Research Centres, Strategic Research Initiatives and Networks to promote collaboration, share ideas, scope out key research questions in data science and support the development of interdisciplinary funding proposals.

Our members are eligible to apply for small amounts of seed funding for the organisation of research workshops (for more information, see the call for proposals).

Research Workshops

Research Workshops are an important part of our programme. Workshops range from half-day meetings between researchers in Cambridge Departments to discuss specific research questions, to multi-day conferences with external speakers and delegates. Workshop outcomes include strengthened networks, intellectual exchange and the development of new project ideas.

We can also provide small amounts of seed funding for the organisation of research workshops (for more information, see the call for proposals).

Data Science School: Machine Learning applications for life sciences (Online)

17-22 September 2020

Hosted remotely by the Bioinformatics Training Facility, University of Cambridge, via Zoom and using hosted UNIX environments accessed via a browser. Organiser: Louisa Bellis, University of Cambridge.

Speakers: Marta Milo, AstraZeneca; John Thomas, University of Cambridge; Mario Guarracino, National Research Council, Naples, Italy; Javier Gonzalez Hernandez, Microsoft Research; Magdalena Strauss, The Sanger Institute, Cambridge; Neil Lawrence, University of Cambridge.

The course was attended by 46 participants, including 20 external participants from The Crick Institute, The Royal Veterinary Society, Kymab, Unilever, AstraZeneca, University of Oxford, UCL, KCL, Queens College London, Imperial, Sanger Institute, NHS Addenbrooke's Hospital and KTH Royal Institute of Technology, Sweden.

This School aimed to familiarise biomedical students and researchers with the principles of Data Science. Focusing on the use of machine learning algorithms to handle biomedical data, it covered the effects of experimental design, data readiness, pipeline implementations, machine learning in Python and related statistics, as well as Gaussian Process models. By providing practical experience in implementing machine learning methods relevant to biomedical applications, including Gaussian processes, the School illustrated best practices that should be adopted to enable reproducibility in any data science application.

C2D3 Hierarchical Modelling Workshop

25 February 2020

Held at the Maxwell Centre, West Cambridge, University of Cambridge, led by Sylvia Richardson and Mark Girolami.

Bayesian hierarchical modelling (BHM) is one of the most powerful modern statistical techniques. It provides a unifying framework for dealing with diverse sources of complexity arising from the structure (e.g. dependence) of the data and its associated measurement process. The hierarchical model-building strategy involves defining latent, unobserved quantities of interest, which are organised into a number of levels with distinct interpretations, and building probabilistic relationships between the latent quantities and the data. Bayesian hierarchical models coupled with efficient computational tools have been used successfully in a very wide range of application areas (e.g. epidemiology, social sciences, education, geography, environmental sciences, biomedicine, political sciences).
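
As a minimal illustration of this layered structure, consider a generic two-level normal model. This is a textbook-style sketch, not a model presented at the workshop, and every symbol is an assumption introduced here: observations y_ij in groups j, latent group means θ_j, and shared parameters μ, τ and σ.

```latex
\begin{aligned}
y_{ij} \mid \theta_j, \sigma &\sim \mathcal{N}(\theta_j, \sigma^2), && i = 1, \dots, n_j && \text{(data level)} \\
\theta_j \mid \mu, \tau &\sim \mathcal{N}(\mu, \tau^2), && j = 1, \dots, J && \text{(latent level)} \\
(\mu, \tau, \sigma) &\sim p(\mu)\, p(\tau)\, p(\sigma) && && \text{(prior level)}
\end{aligned}
```

Each level has its own interpretation: the first describes measurement within a group, the second describes how groups vary around a common mean, and the third encodes prior knowledge about the shared parameters.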

By its generic character, this modelling strategy has the potential to bring together scientists from a wide range of disciplines across the University. Computationally, it also raises a number of algorithmic challenges which could provide useful topics for interactions.

C2D3 hosted a workshop to bring together academics interested in BHM to develop research areas for the programme. Following an introduction to BHM from Sylvia Richardson and Mark Girolami, the largely unstructured half-day programme facilitated a flowing and open discussion among the academics present. The workshop led to a research grant application covering several application areas, bringing researchers from across the University to work together for the first time.

Autumn School: Data Science: Machine learning applications for life sciences

23-26 September 2019

Held at Craik-Marshall Building, Downing Site, University of Cambridge and organised by Dr Gabriella Rustici (Head of Bioinformatics Training, University of Cambridge; Associate Director of Training, HDR UK), and Dr Marta Milo (University of Sheffield).

The Autumn School provided 44 biomedical students and researchers with the opportunity to share the principles of Data Science, using a multidisciplinary approach. Focusing on utilising machine learning algorithms to handle biomedical data, the Autumn School covered: the effects of experimental design, data readiness, pipeline implementations, machine learning in Python, and related statistics, as well as Gaussian Process models. Providing practical experience in the implementation of machine learning methods relevant to biomedical applications, including Gaussian processes, the event illustrated best practices that should be adopted in order to enable reproducibility in any data science application.

Cambridge Networks Day 2019 (6th Edition)

29 August 2019

Network Science is an interdisciplinary field whose methods are widely applied to problems and datasets in fields as diverse as computer science, ecology, neuroscience, archaeology, medicine, economics, social sciences and engineering. Cambridge Networks Network (CNN) brings together academics from across the University and beyond who share an interest in Network Science.

The 140 registrations came from a diverse range of interdisciplinary backgrounds, with technology, physical sciences, biological sciences and the humanities well represented. Early Career Researchers were well represented in poster presentations, the poster prize and travel grants.

CNDay 2019 was kindly supported by The Alan Turing Institute, C2D3 and King's College Cambridge.

Machine Learning for Environmental Sciences 2019

17-18 June 2019

This workshop, jointly organised by the British Antarctic Survey (BAS) and the University of Cambridge, followed on from the 2017 conference on Environmental Science in the Big Data Era (also hosted by BAS and University of Canterbury).

An improved understanding of the natural environment and the ability to predict future changes are crucial for society and the global economy. With ever-growing volumes of data produced through both increased environmental modelling capability and technological advances in earth observation systems, techniques to harness the power of this data and extract useful information have never been more important. Recent years have seen an acceleration in the use of Data Science techniques within the environmental sciences. The application of Machine Learning to this new area has also identified a number of new and interesting challenges for the data science community, with new data challenges requiring bespoke machine learning tools to deliver the next wave of scientific breakthroughs.

Activities included in the workshop:

Recognising recent achievements and proven concepts in the research area, with keynote talks from Claire Monteleoni (University of Colorado) and Emily Shuckburgh
Presentations from workshop participants on preliminary work and results
Early Career Researchers were encouraged to attend and were offered a reduced price
Hands-on data challenge, aimed to be inclusive of all levels of expertise and to encourage participants to work together and share expertise
A conference dinner to further encourage networking and to enhance potential collaborations.

Advances and Challenges in Machine Learning Programming Languages

20-21 May 2019

The development of machine learning programming languages is critical to support the research and deployment of ML solutions as data size and model complexity grow. These languages often offer built-in support for expressing machine learning models as programs and aim to automate inference, either through probabilistic analysis and simulation or through back-propagation and differentiation. Machine learning languages enable models to be deployed, critiqued and improved, support reproducible research, and lower the barrier to using these methods.
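
To make the idea of expressing a model as a program and differentiating it automatically more concrete, here is a minimal sketch in Python. It is a toy illustration only and does not represent any language discussed at the workshop; the names `Var`, `backward` and `loss` are hypothetical and introduced purely for this example.

```python
class Var:
    """A scalar value that records how it was computed (hypothetical toy class)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # tuple of (parent Var, local derivative) pairs

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Order the computation graph so each node comes after its inputs,
        # then apply the chain rule from the output back to the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


def loss(w, x, y):
    # Squared error of a one-parameter linear model, written as an ordinary program.
    err = w * x + (-y)
    return err * err


w = Var(2.0)
out = loss(w, 3.0, 5.0)
out.backward()
print(w.grad)  # 6.0, matching the hand-derived gradient 2 * (w*x - y) * x
```

The model is ordinary code, and the gradient needed for fitting it is obtained without deriving it by hand, which is the kind of automation the paragraph above describes.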

This workshop brought together researchers from both academia and industry, to discuss recent advances and challenges in machine learning languages development and research.

The workshop was supported by the American Statistical Association, C2D3, The Alan Turing Institute, the Isaac Newton Institute (University of Cambridge) and the International Society for Bayesian Analysis.

Winter School: 5th International Winter School on Big Data

7-11 January 2019

Held at the Department of Engineering in collaboration with the Institute for Research Development, Training and Advice.

BigDat 2019 was a research training event with a global scope, aimed at updating participants on the most recent advances in the critical and fast-developing area of big data. The field covers a large spectrum of exciting current research and industrial innovation, with extraordinary potential for impact on scientific discoveries, medicine, engineering, business models and society itself. Renowned academics and industry pioneers lectured and shared their views with the audience.

Most big data subareas were covered, namely foundations, infrastructure, management, search and mining, security and privacy, and applications (to biological and health sciences, to business, finance and transportation, to online social networks, etc.). Major challenges in the analytics, management and storage of big data were identified through 2 keynote lectures, 24 four-hour courses, and 1 round table, which tackled the most active and promising topics.

An open session gave participants the opportunity to present their own work in progress in 5 minutes. There were also two special sessions with industrial and recruitment profiles.

The event provided an avenue to advertise C2D3, the Alan Turing Institute, University job opportunities, and Aviva job openings.

Cambridge Big Data Research Symposium

26 November 2018, Sainsbury Laboratory Cambridge University (SLCU)

This one-day Symposium showcased cross-disciplinary research, and also highlighted research challenges, with a particular focus on projects involving biosciences and clinical medicine.