Events

Forthcoming events

This page lists C2D3 events, University events, and related external conferences and events of interest to our members.

ML
University of Cambridge event
Monday, 4 December 2023, 1.00pm to 5.00pm

Convenors: Dr Anne Alexander (CDH Director of Learning), Dr Emily Sandford (Research Fellow, Gonville & Caius) and Jarrah O’Neill (Reproductive Sociology Research Group)

neurips
University of Cambridge event
Friday, 8 December 2023, 9.00am to 10.00pm

An offline Cambridge-local meetup of the Neural Information Processing Systems (NeurIPS) conference.

The goal of this meetup is to bring students, researchers, and engineers from the greater Cambridge area (UK) together for an opportunity to meet and discuss machine learning research presented at NeurIPS. We also want to provide an opportunity for researchers to promote their work and meet people from their local community. The day will feature poster and panel sessions and in-person presentations.

cdh journalism
University of Cambridge event
Wednesday, 13 December 2023, 4.00pm to 5.30pm

An online public event at the Cambridge Social Data School

Can journalists use AI responsibly? If so, how? Join the discussion at Cambridge Social Data School's public event next month, featuring investigative journalist and media trainer Ruona Meyer, and Postdoctoral Fellow at the Leverhulme Centre for the Future of Intelligence, Dr Tomasz Hollanek.

Register here: cdh.cam.ac.uk/events/37325

Research cafe
University of Cambridge event
Wednesday, 17 January 2024, 11.30am to 3.00pm

The West Hub and the STEMM libraries, in collaboration with the Postdocs of Cambridge Society, are pleased to invite you to their new Research Cafés, held throughout the year. They will allow you to meet your peers outside of your immediate research group or discipline, to further understanding, collaboration, and engagement across the West Cambridge site in a relaxed environment.

Forthcoming talks

A collation of interesting data science talks from across the University.

Fairness Evaluation in Generative NLP

Friday, 1 December 2023, 12.00pm to 1.00pm
Speaker: Seraphina Goldfarb-Tarrant (Cohere)
Venue: Computer Lab, SS03

The largest shifts in NLP over the past five years have been the shift to reliance on large pre-trained models (with the advent of the Transformer), followed by the shift to using generative rather than discriminative language models. These shifts each come with serious challenges for ensuring fairness in an NLP system. For the first, the relationship between fairness during pretraining and downstream applications is tenuous and understudied. This causes challenges for where to apply mitigations, and also causes logistical challenges because a different set of engineers creates the pretrained model and the application. For the second shift, generative systems are notoriously hard to evaluate for anything, with poor correlation between automatic metrics and humans, and low agreement scores even among humans. In this talk, I'll present my own research into both of these areas, discuss an overview of current challenges, and make some suggestions for future promising directions of research.

Bio:

Seraphina Goldfarb-Tarrant is the Head of Safety at Cohere, where she works on both the practice and the theory of evaluating and mitigating harms from LLMs. She did her PhD under Adam Lopez in Fairness in Transfer Learning for NLP, at the Institute for Language, Cognition, and Computation (ILCC) in the Informatics department at the University of Edinburgh. She did her MSc in NLP, with a focus on Natural Language Generation, at the University of Washington under Fei Xia in collaboration with Nanyun Peng. Her research interests include the intersection of fairness with robustness and generalisation, cross-lingual transfer, and causal analysis. She had an industry career before her PhD, where she worked at Google in Tokyo, NYC, and Shanghai. She also spent two years as a sailor in the North Sea.

Sound Control Synthesis with Logics and Data

Friday, 1 December 2023, 2.00pm to 3.00pm
Speaker: Professor Alessandro Abate, Department of Computer Science, University of Oxford
Venue: Department of Engineering, JDB Seminar Room, and online (Zoom): https://newnham.zoom.us/j/92544958528?pwd=YS9PcGRnbXBOcStBdStNb3E0SHN1UT09

Abstract:
We are witnessing an inter-disciplinary convergence between scientific areas underpinned by model-based reasoning and by data-driven learning. Original technical work across these areas is justified by numerous applications, where access to information-rich data has to be traded off with a demand for safety criticality: cyber-physical systems are exemplar applications.

In this talk, I will report on ongoing research in this cross-disciplinary domain at OXCAV, the Oxford Control and Verification group.

I will, in particular, focus on control synthesis for complex objectives, and describe how techniques from formal verification (logics and SAT, automata theory, abstractions) and from learning (sample-driven approaches and neural architectures) can be leveraged together to attain both sound and effective synthesis outcomes.

More broadly, throughout this contribution I will argue that, on the one hand, control theory and formal methods can provide certificates to learning algorithms and, on the other hand, that learning can bolster formal verification and strategy synthesis objectives.

BSU Seminar: "Unravelling the mechanisms and decision-making logic of biological systems"

Tuesday, 5 December 2023, 2.00pm to 3.00pm
Speaker: Giorgos Minas, University of St Andrews
Venue: MRC Biostatistics Unit, East Forvie Building, Forvie Site, Robinson Way, Cambridge CB2 0SR.

This talk is divided into two parts. The first part considers systems of stochastic differential equations (SDEs) that are used to describe how a group of interacting populations evolve over time. These systems are widely used in molecular biology to study the mechanisms of gene regulation, cell signalling and development. They are also used as compartmental models in epidemiology and other fields. After a crash course on the topic, I will introduce an SDE model that speeds up standard methods for stochastic simulation while maintaining model accuracy. We will then discuss Bayesian computational methods for estimating the posterior distribution of model parameters using time-series observations, and how the (un)identifiability of model parameters affects estimation.
The second part of the talk considers a large-p-small-n dataset of gene expression. In particular, we will discuss the analysis of an atypical single-cell RNA sequencing dataset. This experiment attempts to study the response of single cells to a set of stress factors. We will discuss the use of supervised classification and information theoretic quantities to measure the predictive power of single cells to identify the stress factor based on the expression of their genes. We will also discuss methods for comparing the importance of features involved in those responses. Finally, we will discuss an attempt to learn the decision-making logic of cells when responding to multi-component, nested stress factors.

BSU Seminar: "Hyper-Localization and Predictive Modeling of Rapid Lung Function Decline in Cystic Fibrosis"

Tuesday, 12 December 2023, 2.00pm to 3.00pm
Speakers: Rhonda Szczesniak and Emrah Gecili, both from the Cincinnati Children's Hospital Medical Center and the University of Cincinnati
Venue: MRC Biostatistics Unit, East Forvie Building, Forvie Site, Robinson Way, Cambridge CB2 0SR.

Neighborhood/built environments (the areas in which people live, work, and play) and community context as social and environmental determinants of health have gained prominence with the changing care needs of people living with cystic fibrosis (CF) lung disease. Select measures of these social and environmental determinants of health (referred to as “geomarkers”) are also predictors of rapid decline, which is clinically defined as a prolonged drop in lung function relative to patient and/or center-level norms. The extent to which hyper-localization (defined as increasing the spatiotemporal precision of social and environmental exposures) aids in prediction of rapid decline remains unclear. Linear mixed effects (LME) models have been historically used for predicting rapid decline in CF, but there are few options to properly incorporate spatial correlation and induce simultaneous variable selection. The objective of this work is to develop a Bayesian spatial linear mixed effects model to predict rapid decline using geomarkers.

We describe an application of the proposed model for predicting rapid lung function decline (measured as FEV1% predicted/year) in a Midwest U.S. cohort of pediatric CF patients aged 6-20 years. We consider a breadth of demographic and clinical characteristics alongside geomarkers, which focus on neighborhood/built environments and social/community context. Our innovative Bayesian model uses a “spike and slab” prior, accounting for spatial correlation based on ZIP code distances. We evaluate model fits and prediction accuracies. Our proposed model results in improved model fit and predictive accuracy, compared to other Bayesian and frequentist LME models with different spatial correlation assumptions. We describe how a combination of demographic, clinical, and geomarker variables can be selected as optimal predictors based on the posterior inclusion probabilities and Bayesian false discovery rate controlling rule. Our findings suggest that incorporating spatiotemporal effects and geomarkers results in an improved prediction tool. We discuss how predicting the timing and extent of rapid lung function decline can help clinicians to proactively adjust treatment plans and improve patient outcomes.
