Our Digital Future - Multidisciplinary Perspectives on Long Term Data Preservation and Access

Monday, 14 March 2016, 9.00am to Tuesday, 15 March 2016, 5.00pm
Location: Murray Edwards College, Cambridge

Download the abstracts and the programme at the bottom of this page.


As the worldwide volume of digital data undergoes exponential growth, Big Data technology allows unexpected value to be derived from existing and new datasets, and increasingly large datasets to be recorded across all areas of academic research. As data volumes grow and electronic storage deteriorates, the recoverability of this data depends on the curation of electronic archives and the replacement of storage media, along with the ability to discover and access data stored using technologies that may soon be obsolete. Decisions will need to be made about which data is kept, how it is stored, and how it can be accessed, so that the scientific and human record from the current digital age is appropriately preserved for the future.

With keynote speakers representing disciplines ranging from high energy physics to digital humanities, from bioinformatics to libraries, this two-day conference will address perspectives from technology, policy and the social sciences on data as our human record.


Areas of discussion will be wide-ranging, including:

  • Code, compilers, machines and emulators to read data
  • Standards, file formats, metadata and conventions
  • Access, networking and cloud storage
  • Technical issues in long-term data storage
  • Data protection, consent and copyright
  • Discoverability: metadata, link rot
  • Policies and practices – long term data preservation, open access
  • What can digital data preservers learn from archivists and librarians? Which data are important?


Who should attend?

This conference is open to researchers, students, practitioners and policymakers who are interested in the preservation of digital data and who wish to inform a roadmap for research and funding in this area.



Public Lecture

A public lecture, in conjunction with the Cambridge Science Festival, will be held on the evening of 14 March.


Call for Papers

The call for abstracts is now closed.


Conference structure

14 March 2016

Presentations and panel discussions on: Digital data as the human record; Systems, devices and infrastructures – storing, sharing and curating; and Data Preservation Policy

15 March 2016

Focused workshops aimed at understanding and generating research questions and collaborations

Workshop 1: What should we keep? Lessons from history and the shift to digital
Organiser: Dr Anne Alexander, Cambridge Digital Humanities Network

This workshop addresses the implications for human culture, science and memory of a generalised shift from paper versions of record to digital versions of record, and seeks to identify where and how this transition is taking place.

Workshop 2: Storing, sharing and curating digital (big) data
Organiser: Professor Val Gibson, Cavendish Laboratory

This workshop will address the technical challenges in data preservation and access, including storage media, metadata, provenance and remote access, aiming to produce a roadmap of the needs and challenges.


We acknowledge the kind support of EPSRC and Cambridge University Press in hosting this workshop.

Download the Programme (PDF, 489.61 KB)
Download the Abstracts (PDF, 293.31 KB)
Conference report (PDF, 626.12 KB)

Forthcoming talks

Achieving Consistent Low Latency for Wireless Real-Time Communications with the Shortest Control Loop

Thursday, 18 August 2022, 4.00pm to 5.00pm
Speaker: Zili Meng, Tsinghua University
Venue: FW11 and

Real-time communication (RTC) applications like video conferencing or cloud gaming require consistent low latency to provide a seamless interactive experience. However, wireless networks including WiFi and cellular, albeit providing a satisfactory median latency, drastically degrade at the tail due to frequent and substantial wireless bandwidth fluctuations. We observe that the control loop for the sending rate of RTC applications is inflated when congestion happens at the wireless access point (AP), resulting in untimely rate adaption to wireless dynamics. Existing solutions, however, suffer from the inflated control loop and fail to quickly adapt to bandwidth fluctuations. In this paper, we propose Zhuge, a pure wireless AP based solution that reduces the control loop of RTC applications by separating congestion feedback from congested queues. We design a Fortune Teller to precisely estimate per-packet wireless latency upon its arrival at the wireless AP. To make Zhuge deployable at scale, we also design a Feedback Updater that translates the estimated latency to comprehensible feedback messages for various protocols and immediately delivers them back to senders for rate adaption. Trace-driven and real-world evaluation shows that Zhuge reduces the ratio of large tail latency and RTC performance degradation by 17% to 95%.

Speaker Bio: Zili is a 3rd-year PhD student at Tsinghua University. His current research focuses on real-time video communications. He has published several papers in SIGCOMM / NSDI and received the Microsoft Research Asia PhD Fellowship, the Gold Medal of the SIGCOMM 2018 Student Research Competition, and two best paper awards.

BSU Seminar: "Genome-wide genetic models for association, heritability analyses and prediction"

Monday, 22 August 2022, 4.30pm to 5.30pm
Speaker: David Balding, Honorary Professor of Statistical Genetics at UCL Genetics Institute and University of Melbourne
Venue: Seminar Rooms 1 & 2, School of Clinical Medicine, Hills Road, Cambridge CB2 0SP

Although simultaneous analysis of genome-wide SNPs has been popular for over a decade, the problems posed by more SNPs than study participants (more parameters than data points), and correlations among the SNPs, have not been adequately overcome, so that almost all published genome-wide analyses are suboptimal. While there has been much attention paid to the shape of prior distributions for SNP effect sizes, we argue that this attention is misplaced. We focus on what we call the "heritability model": a low-dimensional model for the expected heritability at each SNP, which is key to both individual-data and summary-statistic analyses. The 1-df uniform heritability model has been implicitly adopted in a wide range of analyses. Replacing it with better heritability models, using predictors based on allele frequency, linkage disequilibrium and functional annotations, leads to substantial improvements in estimates of heritability and selection parameters over traits, and over genome regions, as well as improvements in gene-based association testing and prediction. Key collaborators: Doug Speed (Aarhus, Denmark) and Melbourne PhD student Anubhav Kaphle.

Statistics Clinic Summer 2022 III

Wednesday, 31 August 2022, 5.30pm to 7.00pm
Speaker: Speaker to be confirmed
Venue: Venue to be confirmed

If you would like to participate, please fill in the sign-up form. The deadline for signing up for a session is 12pm on Monday the 29th of August. Subject to availability of members of the Statistics Clinic team, we will confirm your in-person or remote appointment.

This event is open only to members of the University of Cambridge (and affiliated institutes). Please be aware that we are unable to offer consultations outside clinic hours.

Statistics Clinic Summer 2022 IV

Wednesday, 21 September 2022, 5.30pm to 7.00pm
Speaker: Speaker to be confirmed
Venue: Venue to be confirmed

Abstract not available

Title to be confirmed

Monday, 26 September 2022, 3.00pm to 4.00pm
Speaker: Christopher Yau, University of Manchester
Venue: CRUK CI Lecture Theatre

Abstract not available