Cambridge-Turing sessions reloaded: collaborative data science and AI research

C2D3 event

Thu, 21 Oct 2021 10:00 AM - 12:15 PM

You are invited to the second series of Cambridge-Turing sessions reloaded, hosted by the Cambridge Centre for Data-Driven Discovery (C2D3). This series of online sessions reflects the strengths and collaborations of data science and AI research across the University, and many of the presentations will showcase the University’s partnership with The Alan Turing Institute. Our speakers will cover a wide range of themes and disciplines including predicting personalities, mental health, personalised healthcare, and data science for science and humanities.

We invite participants from academia, industry, government, third sector and anywhere in between. C2D3 are also keen to find collaborators and make connections in the East of England region.

Registration

To attend the event, please follow the registration link below. Attendance is free of charge. Please register in advance so there is time for you to be sent the event link.

Register here

Programme

We are delighted to have Paul Kirk as our event Chair. Paul is a group leader (Programme Leader Track) within the Precision Medicine and Inference for Complex Outcomes (PREM) theme at the MRC Biostatistics Unit.

10:00-10:05 Opening and Session 1: Presenting Turing Research Projects

10:05: AI-guided solutions for early detection of dementia - Professor Zoe Kourtzi (Turing University Lead for the University of Cambridge)
10:15: Q&A
10:25: The cooked and the raw; extracting and exploiting structured and unstructured clinical data from patient electronic health records - Dr Paul Schofield (Department of Physiology, Development and Neuroscience)
10:45: Q&A

10:55-12:10 Session 2: Research showcase

10:55: Session introduction
11:00: Data driven built environment design for decoupling energy and health burdens in poverty - Dr Ronita Bardhan (Department of Architecture)
11:20: Modelling the Impact of Climate Change on UK Agriculture - Dr Sebastian Ahnert (Department of Chemical Engineering and Biotechnology at Cambridge, and also seconded to the Turing as Senior Research Fellow in the Data Science for Science programme)
11:40: Group Q&A

12:10-12:15 Closing

All times above are in British Summer Time (BST).

Abstracts

Prof. Zoe Kourtzi

Head of the Adaptive Brain Lab; Turing Fellow; Fellow of Downing.

Title: AI-guided solutions for early detection of dementia

Alzheimer’s disease (AD) is characterised by a dynamic process of neurocognitive changes from normal cognition to mild cognitive impairment (MCI) and progression to dementia. However, not all individuals with MCI develop dementia. Predicting whether individuals with MCI will decline (i.e. progressive MCI) or remain stable (i.e. stable MCI) is impeded by patient heterogeneity due to comorbidities that may lead to MCI diagnosis without progression to AD. Despite the importance of early diagnosis of AD for prognosis and personalised interventions, we still lack robust tools for predicting individual progression to dementia. Here, we propose a novel trajectory modelling approach based on metric learning that mines multimodal data from MCI patients to derive individualised prognostic scores of cognitive decline due to AD. Our approach affords the generation of a predictive and interpretable marker of individual variability in progression to dementia due to AD based on cognitive data alone. Including non-invasively measured biological data (grey matter density, APOE 4) enhances predictive power and clinical relevance. Our trajectory modelling approach has strong potential to facilitate effective stratification of individuals based on prognostic disease trajectories, reducing MCI patient misclassification with important implications for clinical practice and discovery of personalised interventions.

Dr Paul Schofield

Reader in Biomedical Informatics; Department of Physiology, Development and Neuroscience

Title: The cooked and the raw; extracting and exploiting structured and unstructured clinical data from patient electronic health records

Electronic health records (EHRs) contain information critical to the realisation of the promise of personalised medicine, but also data essential for the discovery of the molecular basis of disease. Clinical information systems and EHRs were not developed for the discovery, integration and export of information, most being based on the concept of paper records going back to the 1990s. Consequently, we find in EHRs information contained in administrative, diagnostic and procedure codes, which are highly structured and standardised (pre-cooked); the results of investigative tests, ranging from blood chemistry to images, which might be regarded as partially structured information (lukewarm?); and finally narrative reports of clinical encounters and discharge letters, which are rich sources of information but completely unstructured (raw data). Reliably extracting and integrating these types of information is a huge challenge, but the ability to retrieve coded and quantitative data into a common symbolic framework opens up the possibility of connecting these data together with the large amounts of background knowledge now available, to begin to make semantic sense of our whole ‘menu’.

I will discuss three approaches to extracting and using EHR information. The first uses the Komenti platform, which is designed to extract information from free text into semantically formalised ontological annotations. The second is an approach to combining quantitative data into that same semantic framework. The third, a new resource, axiomatises ICD-10 terms using the Human Phenotype Ontology for integration with existing knowledge and, for example, patient classification. The promise of these multi-pronged approaches will be discussed.

Dr Ronita Bardhan

Assistant Professor of Sustainability in the Built Environment; Director, MPhil in Architecture and Urban Studies; Fellow of Architecture at Selwyn College.

Title: Data driven built environment design for decoupling energy and health burdens in poverty

The built environment is a significant modifiable factor that implicates health and energy decisions. Yet how, and to what extent, built environment design parameters affect quality of life remains unknown. The impacts of dysfunctional space design are most aggravated in poorer communities where the asymmetries are profound. This talk scientifically unfolds how various data streams, namely (i) quantitative data from environmental/energy sensors, (ii) qualitative data on agency and use of space, and (iii) big data on performance metrics like energy consumption, can enable understanding of the effects of building design parameters on quantifiable outcomes. It advances the innovative paradigm of data-driven design to decouple health and energy burdens from poverty. Using novel datasets from Mumbai, India, the talk demonstrates how design can help understand health metrics like walkability in cities, outdoor heat stress due to climate change and indoor environmental quality in slum transitional housing. One of the challenges of working in resource-constrained communities is the absence of data. This talk discusses how novel datasets like films, social dialogues and collective intelligence can be used in data-driven design for a sustainable and healthy future.

Dr Sebastian Ahnert

Department of Chemical Engineering and Biotechnology at Cambridge, and also seconded to the Turing as Senior Research Fellow in the Data Science for Science programme

Title: Modelling the Impact of Climate Change on UK Agriculture

Climate change is likely to have a large impact on UK agriculture, and on the viability of current crops in particular. This project aims to integrate data and models of plant development, plant pathology, crop yields, and climate science to form an integrated national crop modelling framework for the UK. This could allow us to predict the impact of climate change on UK agriculture and food security over the coming 50 years. The project brings The Alan Turing Institute, Rothamsted Research, John Innes Centre, and the University of Exeter together to address this challenge in close collaboration. Particular areas of focus are the integration of climate-disease interdependence and the genetics of temperature response into existing crop models, and an attempt to build large-scale machine learning models of crop growth based on satellite, weather, and soil data, with precision crop yield data as a training data set.

Sponsorship

Thank you to our sponsors for supporting this event.

The Isaac Newton Trust

Cambridge University Press & Assessment

Our sponsor Cambridge University Press & Assessment publishes three open access, peer-reviewed titles that explore the impact of data science: Data & Policy; Data-Centric Engineering; and Environmental Data Science. You can read more about each title – and some associated webinars and books – on this Data Science hub page at CUP. Authors affiliated with Cambridge University, like those at many other institutions, can, if their article is accepted, publish on an open access basis in these journals with no article processing charge, courtesy of an overarching Read & Publish agreement. Contact ahyde@cambridge.org if you would like to find out more.

Find out more about @CambridgeUP’s #OpenAccess data science titles @data_and_policy @dce_journal and @envdatascience and some related webinars and books here: https://bit.ly/3DJtvxa

Social Media

We will be using #CamTuringSessions on our social media.