danbing-tk v1.0

@joyeuxnoel8 released this 12 Jan 04:33

Improvements:

  • Improved length estimation accuracy through multi-boundary expansion, which maps orthologous VNTRs across haplotypes more accurately.
  • More stringent QC on VNTR size, number of supporting haplotypes, consistency of liftover coordinates, etc.
  • Slightly expanded the VNTR set from 29,111 to 32,138 loci.
  • Added a more user-friendly length estimation script.
  • Added an option for alignment output: pass -a to danbing-tk align.
  • Created a DOI using Zenodo.

Additional resources:

  • Repeat-pangenome graph encoded as pan.tr.kmers, pan.ntr.kmers and pan.graph.kmers in RPGG.tar.gz
  • 84,411 raw VNTR coordinates tr.84411.bed
  • 32,138 raw VNTR coordinates (high-confidence genotypable set) tr.good.bed
  • 397 non-VNTR regions ctrl.bed
  • Locus-specific biases of VNTR and non-VNTR regions LSB.tsv
  • Summary of eGene discoveries Alltissue.egenes.tsv
  • Comprehensive VNTR statistics vntr.statistics.tsv vntr.statistics.README
  • 13 PacBio CLR assemblies (26 haplotypes) *.h?.fasta.gz
  • 32,138 boundary-expanded VNTR coordinates in the 26 haplotypes pan.tr.mbe.no_CCS.bed and pan.tr.mbe.no_CCS.README
  • 73,582 boundary-expanded VNTR coordinates pan.tr.73582.mbe.no_CCS.bed
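
The coordinate files listed above can be inspected with a few lines of Python. The sketch below is a minimal example, not part of danbing-tk: it assumes each BED file carries chromosome, start, and end in its first three tab-separated columns (the standard BED layout; the actual column layout of these files is not stated here and may include more fields). The function names `read_bed_lengths` and `summarize` are hypothetical, and the toy records use made-up coordinates.

```python
import io
from statistics import median


def read_bed_lengths(handle):
    """Yield (chrom, start, end, length) for each BED record in a text handle.

    Assumes the first three tab-separated columns are chrom, start, end.
    """
    for line in handle:
        line = line.strip()
        # Skip blank lines and common header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # BED intervals are 0-based and half-open, so length is end - start.
        yield chrom, start, end, end - start


def summarize(handle):
    """Return the locus count and basic length statistics for a BED handle."""
    lengths = [rec[3] for rec in read_bed_lengths(handle)]
    if not lengths:
        return {"n_loci": 0}
    return {
        "n_loci": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }


# Toy records standing in for a real file (made-up coordinates).
example = io.StringIO("chr1\t100\t250\nchr1\t1000\t1040\nchr2\t500\t800\n")
summary = summarize(example)
print(summary)
```

To run it on a downloaded file, open the file and pass the handle, for example `with open("tr.good.bed") as fh: summarize(fh)`. If the three-column assumption holds, the reported locus count should match the count given in the list above.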

Example analyses:

  • QC of multi-boundary expansion 202011.MultiBoundaryExpansion.QC.ipynb
  • Measuring length prediction accuracy 202012.Acc.pan.ipynb
  • Contrasting the most informative kmer between populations 202012.mikmer.ipynb
  • eQTL mapping 202012.eQTL.32138.ipynb
  • Sample QC on locus-specific bias LSB_analysis.ipynb
  • Heritability analysis of SNP vs. SNP+VNTR models 202011.sg.joint.ipynb
  • Miscellaneous analyses in the original manuscript 202012.revision.supp.ipynb