
Events and Talks

 

In AI, Machine Learning and Data Science across the University and beyond.

Events

C2D3 event Conference In person

C2D3 Computational Biology Annual Symposium 2026

13 May 2026

Uni of Cambridge Training In person

CRIT Building computational pipelines with Nextflow

14 Apr 2026 - 15 Apr 2026

20 Apr 2026 - 21 Apr 2026

Uni of Cambridge Training In person

CRIT Programming in Python

23 Apr 2026 - 24 Apr 2026

Uni of Cambridge Training Online

CRIT Working on HPC clusters

29 Apr 2026 - 1 Jun 2026

C2D3 event Workshop In person

Google Cloud - Vertex AI Workshop

7 May 2026

6 Jul 2026 - 7 Jul 2026

13 Jul 2026 - 17 Jul 2026

14 Jul 2026 - 29 Jul 2026

EPSRC Centre for Mathematical and Statistical Analysis of Multimodal…
Ethics of Big Data in practice: Social media research
Ethics of Big Data in practice: Administrative data
Ethics of Big Data in practice: Patient record linkage in hospitals
Ethics of Big Data in practice: Health and Policy research in Africa
Workshop on Urban Data Science #wuds15
Neurocomputation: from brains to machines
Big Data for Small and Medium Enterprises - an Alan Turing Institute Summit
Inside Snowden’s suitcase
Regulation of medical research under European Data Protection: in theory and practice…
What is Big Data? Discovery through a Data Walkshop
Green Computing - Materials, Architectures and Applications
Big Data Methods for Social Science and Policy - Interdisciplinary Workshop Programme…
Data In Drug Discovery - Time To Get Honest!
Data and Sensing in Extreme Environments
Big Data in Medicine: Exemplars and Opportunities in Data Science
Policy-Making in the Big Data Era: Opportunities & Challenges
Economic and Econometric Applications of Big Data
Selling Science? News, public relations and communicating scientific research
Social Media and Qualitative Health Research: Big Data Seminar and Masterclass
Tenth Annual Symposium of the Cambridge Computational Biology Institute
Cambridge Networks Day 2015
Human-Data Interaction
Data and Digital Innovation in Enabling Servitization (ESRC/BAE Systems…
Data and Life on Tenison Road
The Vocabulary of Big Data External
The Future of Economics and Public Policy External
Privacy, public interest and the future of healthcare research External

Talks

Upcoming related talks from talks@cam

Date Title Speaker Abstract
BSU Seminar: "A unifying framework for generalised Bayesian online learning in non-stationary environments" Gerardo Duran-Martin, Oxford-Man Institute, University of Oxford We propose a unifying framework for methods that perform probabilistic online learning in non-stationary environments. We call the framework BONE, which stands for generalised (B)ayesian (O)nline learning in (N)on-stationary (E)nvironments. BONE provides a common structure to tackle a variety of problems, including online continual learning, prequential forecasting, and contextual bandits.
GraphNeuralRAG: On the Opportunities and Challenges of GNNs for GraphRAG, from Multi-Hop Question Answering to Perturbation Modelling Andrea Giuseppe Di Francesco, Sapienza University of Rome, ISTI-CNR Retrieval-augmented generation (RAG) has become the standard approach for grounding generative models in external knowledge. When that knowledge is structured as a graph, GraphRAG methods have emerged to leverage topology to boost retrieval. However, existing approaches predominantly rely on either LLM-based pipelines, which treat graph structure as text to summarise or traverse, or use graph algorithms and neural scoring only as an intermediate step before falling back to document-based retrieval, leaving much of the graph structure unexploited by the generative model.
BSU Seminar: "Nonparametric causal decomposition of group disparities" Ang Yu, Hong Kong University of Science and Technology We introduce a new nonparametric causal decomposition approach that identifies the mechanisms by which a treatment variable contributes to a group-based outcome disparity. Our approach distinguishes three mechanisms: group differences in: (1) treatment prevalence, (2) average treatment effects, and (3) selection into treatment based on individual-level treatment effects.
BSU Seminar: "Generating crossmodal gene expression from cancer histopathology improves multimodal AI predictions" Samiran Dey, Indian Association for the Cultivation of Science, Kolkata Transcriptomic profiling provides rich molecular insights for cancer diagnosis and prognosis, but its high cost limits routine clinical use, where histopathology remains the primary diagnostic modality. Recent advances in artificial intelligence suggest that molecular information can be inferred directly from digital pathology images. This talk discusses a generative multimodal framework that synthesizes transcriptomic features from whole-slide histopathology images and incorporates them to improve cancer grading and survival risk prediction across multiple cancer cohorts.
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Zhijiang Guo (HKUST (GZ) | HKUST) In this talk, I will present CodeScaler, a novel framework designed to overcome the scalability bottlenecks of Reinforcement Learning from Verifiable Rewards (RLVR) in code generation. While traditional RLVR relies heavily on the availability of high-quality unit tests—which are often scarce or unreliable—CodeScaler introduces an execution-free reward model that scales both training and test-time inference.
A Novel Diffusion Model based Approach for Sleep Music Generation Kevin Monteiro, Department of Computer Science and Technology Sleep disorders, particularly insomnia, and mental health conditions affect a significant fraction of adults worldwide, posing serious mental and physical health risks. Music therapy offers promising, low-cost, and non-invasive treatment, but current approaches rely heavily on expert-curated playlists, limiting scalability and personalisation. We propose a low-cost generative system leveraging recent advances in diffusion models to synthesize music for therapy. We focus on insomnia and curate a dataset of waveform sleep music to generate audio tailored to sleep.
Numerically verified proofs in pure maths Daniel Platt, Imperial College London What’s a numerically verified proof? In pure maths we want to prove theorems, usually using pen and paper. On the other side there exist hundreds of very elaborate ways to approximately solve equations, for example physics-informed neural networks. Due to the advent of greater computational power it has recently become possible to use such approximate solutions in theorem proofs. In the talk, I’ll explain how that works in a toy example and then briefly mention some applications of this in pure maths.
Representational Geometry of Language Models Matthieu Téhénan (University of Cambridge) Abstract not available
Life, death, and the discovery of PDAR: the Pol II Degradation-dependent Apoptotic Response Mike Lee PhD, Associate Professor, Department of Systems Biology, UMass Chan Medical School Many cellular functions are considered “life essential”, but why are they actually essential? Why does a cell die, for instance, when transcription or translation are inhibited, and can we improve cancer therapies by developing a more complete understanding of how cellular life/death decisions are made? To answer these questions, we developed a suite of new tools for studying all forms of cell death.
A Data-Centric Approach to AI Adaptation and Alignment Prof. Stephen Bach (Brown University) Training generative AI is not a one-step process. In the case of large language models (LLMs), self-supervision is often followed by supervised and reinforcement learning stages to improve instruction following, safety, and other desirable qualities. This multi-stage process that has emerged in the last 3 years has led to huge leaps in model capabilities. It has also led to new challenges and risks. In this talk, I will overview some of our group's work to identify and address such challenges by focusing on the training data used at different stages.
Understanding the Interplay between LLMs' Utilisation of Parametric and Contextual Knowledge Prof Isabelle Augenstein (University of Copenhagen) Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. The increasing scalability of LMs, however, poses significant challenges for understanding a model's inner workings and further for updating or correcting this embedded knowledge without the significant cost of retraining. Moreover, when using these language models for knowledge-intensive language understanding tasks, LMs have to integrate relevant context, mitigating their inherent weaknesses, such as incomplete or outdated knowledge.
Talk by Aaron Mueller (Boston University) Aaron Mueller (Boston University) Abstract not available
C2D3 Computational Biology Annual Symposium 2026 Keynote: Natasha Latysheva (Google DeepMind) We warmly invite you to the C2D3 Computational Biology Annual Symposium 2026. This event is open to everyone in the Computational Biology Community. https://www.c2d3.cam.ac.uk/events/comp-bio-2026 Early Career Researcher: Abstract Submission We are inviting Early Career Researchers to present their research during the symposium. Talks should be 17 minutes each, and a short Q&A will follow. Abstract submission - Deadline 9am 1st April 2026. Registrations Registration is essential. A waitlist will open if capacity is reached. Registrations - Deadline 9am Monday 4th May 2026.
Title to be confirmed Arduin Findeis (University of Cambridge) Abstract not available
The AI Ecosystem as a Reasoning Maze: How Collaborative Intelligence Accelerates Scientific Discovery Yuri Yuri (Oxford) Scientific discovery emerges not from isolated reasoning, but from the intersection of diverse epistemic traditions. This talk proposes that the modern AI ecosystem, a structured network of heterogeneous reasoning agents spanning approximate and rigorous inference, constitutes a new form of collaborative intelligence for scientific inquiry. Drawing on Simon's conception of reasoning as adaptive search, we argue that such ecosystems do not merely accelerate known reasoning pathways, but create conditions under which genuinely novel representations may emerge.
TBC Luke Gilbert, PhD, Associate Professor of Urology, University of California, San Francisco TBC
Talk by Aditi Raghunathan (CMU) Aditi Raghunathan (CMU) Abstract not available
Positional encodings in LLMs Valeria Ruscio Positional encodings are essential for transformer-based language models to understand sequence order, yet their influence extends far beyond simple position tracking. This talk explores the landscape of positional encoding methods in LLMs and reveals surprising insights about how these architectural choices shape model behavior. We begin with the fundamental challenge: why attention mechanisms require explicit positional information.