Baal-ChIP: Allele-specific ChIP-seq analysis from cancer cell lines

Event: Big Data in Medicine: Exemplars and Opportunities in Data Science

Ines de Santiago, CRUK - Cambridge Institute

 Ines de Santiago*1, Wei Liu*1, Ke Yuan1, Kerstin B Meyer1, Bruce A Ponder1, Florian Markowetz1

1University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom

Abstract

Allele-specific measurements of transcription-factor binding from ChIP-seq data have provided important insights into the allelic effects of non-coding variants and their contribution to phenotypic diversity. However, such approaches are designed to examine allelic imbalances in diploid samples and do not address copy number differences between the two alleles, a known phenotypic feature of cancer cells. We describe the effect of allele-specific amplifications on ChIP-seq read densities obtained from cancer and non-cancer cell lines and develop a statistical approach called Baal-ChIP (Bayesian Analysis of Allelic imbalances from ChIP-seq data) to model the effect of relative allele frequency on the observed ChIP-seq read counts. Baal-ChIP allows the interrogation of multiple ChIP-seq datasets across a single variant simultaneously and performs well in simulations. We applied this method to 548 ENCODE data sets obtained from a panel of 8 cancer and 6 non-cancer cell lines and observed that the majority of the allelic imbalances in cancer cell lines can be explained by imbalances in the background allele frequency due to genomic copy number alterations rather than by true regulatory effects of the sequence. We find that 60% of the identified variants are non-coding, with 60% mapping to cell-type-specific enhancers. Baal-ChIP illustrates the value of taking structural genomic alterations into consideration in order to detect putative cis-acting regulatory variants in cancer cell lines.
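
The central point of the abstract, that the background allele frequency changes what counts as an allelic imbalance, can be illustrated with a small sketch. This is not the Baal-ChIP model, which is Bayesian and is not specified in detail here; it is a plain exact binomial test used only to show the effect. The read counts, the 3:1 copy ratio, and all function names are hypothetical assumptions made for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def two_sided_binom_pvalue(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial test: sum the probabilities of all
    outcomes that are no more likely than the observed count."""
    observed = binom_pmf(k, n, p)
    tol = observed * 1e-9  # tolerance for floating-point ties
    total = sum(
        prob
        for prob in (binom_pmf(i, n, p) for i in range(n + 1))
        if prob <= observed + tol
    )
    return min(1.0, total)


def background_ref_fraction(ref_copies: int, alt_copies: int) -> float:
    """Expected fraction of reference-allele reads given allele copy numbers
    (assumes reads are sampled in proportion to DNA copies)."""
    return ref_copies / (ref_copies + alt_copies)


# Hypothetical ChIP-seq read counts at one heterozygous variant.
ref_reads, alt_reads = 30, 10
total_reads = ref_reads + alt_reads

# Diploid assumption: both alleles are equally represented in the genome.
p_diploid = two_sided_binom_pvalue(ref_reads, total_reads, 0.5)

# Assumed amplification: three reference copies, one alternate copy.
p_adjusted = two_sided_binom_pvalue(
    ref_reads, total_reads, background_ref_fraction(3, 1)
)

print(f"assuming balanced alleles: p = {p_diploid:.4g}")
print(f"assuming 3:1 copy ratio:   p = {p_adjusted:.4g}")
```

Under the diploid assumption these counts give a small p-value and would be called an allelic imbalance. Once the assumed 3:1 copy ratio is used as the expected frequency, the same counts match the expectation and the p-value is close to 1, which is the kind of copy-number-driven signal the abstract describes.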