This repository contains code for benchmarking and analyzing Topologically Associating Domains (TADs) in genomic data. It includes implementations of various metrics for evaluating TAD callers and analyzing their characteristics.
TADs are self-interacting genomic regions that play important roles in gene regulation and 3D genome organization. This project provides a benchmarking framework for evaluating different TAD calling algorithms.
The analysis focuses on several key aspects of TAD caller performance:
- Histone Modification Enrichment - Evaluating the association of TADs with histone modifications (H3K27me3, H3K36me3)
- Structural Protein Enrichment - Analyzing enrichment of structural proteins (e.g., CTCF, RAD21, SMC3) at TAD boundaries
- Contact Enrichment - Measuring interaction frequencies within TADs
- Boundary Insulation - Quantifying the insulation strength of TAD boundaries
- Robustness - Assessing the consistency of TAD calls across different Hi-C sequencing depths
- Data/ - Contains some example data
- Fig1C_HistMod_FDR/ - Histone modification analysis
- Fig1D_StructProt_EnrichBoundaries/ - Structural protein enrichment at TAD boundaries
- Fig1E_Contact_Enrichment/ - Contact enrichment analysis
- Fig1F_Boundary_Insulation/ - Boundary insulation analysis
- Fig1G_Robustness/ - Robustness analysis
- utils/ - Common utility functions used across analyses
The code is written in R and requires the following packages:
- magrittr
- dplyr
- tibble
- readr
- GenomicRanges
- purrr
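As a convenience, the following sketch checks which of the packages listed above are already installed. It is not part of the repository; the variable names are illustrative. It relies on the fact that GenomicRanges is distributed through Bioconductor, while the other packages are on CRAN.

```r
# Packages listed in this README
required_pkgs <- c("magrittr", "dplyr", "tibble", "readr", "GenomicRanges", "purrr")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # CRAN packages can be installed with install.packages()
  message('CRAN: install.packages(c("magrittr", "dplyr", "tibble", "readr", "purrr"))')
  # GenomicRanges comes from Bioconductor and is installed via BiocManager
  message('Bioconductor: BiocManager::install("GenomicRanges")')
} else {
  message("All required packages are installed.")
}
```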
Each analysis component can be run independently by executing the main R script in the corresponding directory.
The TAD data is stored in TSV format with the following columns:
- chr - Chromosome name
- start - Start position of the TAD
- end - End position of the TAD
- meta.tool - Name of the TAD caller tool used
- meta.resol - Resolution of the TAD calls (in base pairs)
- meta.sub - Used in robustness analysis
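To make the column layout concrete, here is a minimal sketch that writes a small TAD table to a temporary TSV file, reads it back with readr, and checks that the expected columns are present. The rows are made up for illustration, "toolA" is a placeholder rather than a real caller name, and meta.sub is left as NA because its encoding is not described here. The size_bp column is an illustrative addition, not part of the format.

```r
library(readr)
library(tibble)

# Made-up example rows; "toolA" is a placeholder tool name
example_tads <- tibble(
  chr = c("chr1", "chr1", "chr2"),
  start = c(1000000, 1500000, 200000),
  end = c(1500000, 2250000, 750000),
  meta.tool = "toolA",
  meta.resol = 50000,
  meta.sub = NA_character_  # encoding not described in this README
)

# Round-trip through a temporary TSV file
tsv_path <- tempfile(fileext = ".tsv")
write_tsv(example_tads, tsv_path)

tads <- read_tsv(
  tsv_path,
  col_types = cols(
    chr = col_character(),
    meta.tool = col_character(),
    .default = col_guess()
  )
)

# Check that every documented column is present
expected_cols <- c("chr", "start", "end", "meta.tool", "meta.resol", "meta.sub")
missing_cols <- setdiff(expected_cols, names(tads))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Basic sanity check: each TAD must start before it ends
stopifnot(all(tads$start < tads$end))

# Illustrative derived column: TAD size in base pairs
tads$size_bp <- tads$end - tads$start
```

Replacing tsv_path with the path to a real TAD file should apply the same checks to actual data.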
Contributions to improve the benchmarking framework are welcome. Please feel free to submit pull requests or open issues for discussion.
This project is licensed under the terms of the included LICENSE file.