TT-Mars: Structural Variants Assessment Based on Haplotype-resolved Assemblies.
- Clone TT-Mars from GitHub and `cd TT-Mars`. Python >= 3.8 is preferred.
- Create environment and activate: `conda create -n ttmars` and `conda activate ttmars`.
- Run `dowaload_files.sh` to download required files to `./ttmars_files`.
- Run `download_asm.sh` to download assembly files of 10 samples from HGSVC.
- Install packages: `conda install -c bioconda pysam`, `conda install -c anaconda numpy`, `conda install -c bioconda mappy`, `conda install -c conda-forge biopython`, `conda install -c bioconda pybedtools`.
- Run TT-Mars with the following steps: `run_ttmars.sh` includes more instructions. Users can run it to run TT-Mars after setting up.
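
Before running TT-Mars, it can help to confirm that the packages installed above are importable in the active environment. The following is a minimal sketch, not part of TT-Mars: the helper name `missing_packages` is hypothetical, and the only mapping it relies on is that the `biopython` package is imported as `Bio`.

```python
import importlib.util
import sys

# conda package name -> Python import name.
# Note: biopython is imported as "Bio"; the others share their package name.
REQUIRED = {
    "pysam": "pysam",
    "numpy": "numpy",
    "mappy": "mappy",
    "biopython": "Bio",
    "pybedtools": "pybedtools",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Warning: Python >= 3.8 is preferred.")
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, rerun the corresponding `conda install` command inside the `ttmars` environment.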
The main program: run `python ttmars.py -h` for help.

`python ttmars.py output_dir files_dir centro_file vcf_file reference asm_h1 asm_h2 tr_file num_X_chr`
- `output_dir`: Output directory.
- `files_dir`: Input files directory, `./ttmars_files/sample_name`. The directory where you store required files after running `dowaload_files.sh`.
- `centro_file`: provided centromere file.
- `vcf_file`: callset file `callset.vcf(.gz)`.
- `reference`: reference file `reference_genome.fasta`.
- `asm_h1`: assembly file `assembly1.fa`, downloaded by running `download_asm.sh`.
- `asm_h2`: assembly file `assembly2.fa`, downloaded by running `download_asm.sh`.
- `tr_file`: provided tandem repeats file.
- `num_X_chr`: 1 for a male sample, 2 for a female sample.

Optional arguments:

- `-n/--not_hg38`: set if the reference is NOT hg38/chm13 (hg19).
- `-p/--passonly`: consider PASS calls only.
- `-s/--seq_resolved`: consider sequence-resolved calls.
- `-w/--wrong_len`: count wrong-length calls as True.
- `-g/--gt_vali`: conduct genotype validation.
- `-i/--gt_info`: index with GT info (for phased callsets).
- `-d/--phased`: take phased information (for phased callsets).
- `-v/--vcf_out`: output results as VCF files (tp (true positive), fp (false positive) and na).
- `-f/--false_neg`: output recall; must be used together with `-t/--truth_file`.
- `-t/--truth_file`: input truth VCF file; must be used together with `-f/--false_neg`.
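
To illustrate how the positional and optional arguments fit together, here is a small sketch that assembles the command line as a list suitable for `subprocess.run`. It only builds and prints the command. The helper `build_ttmars_command` is hypothetical, the file names `centromere_file.txt` and `tandem_repeats_file.bed` are placeholders for the provided files, and the sketch assumes that `-p` and `-g` are plain switches while `-t` takes a file path.

```python
import shlex
from typing import List, Optional


def build_ttmars_command(
    output_dir: str,
    files_dir: str,
    centro_file: str,
    vcf_file: str,
    reference: str,
    asm_h1: str,
    asm_h2: str,
    tr_file: str,
    num_x_chr: int,
    pass_only: bool = False,
    gt_vali: bool = False,
    truth_file: Optional[str] = None,
) -> List[str]:
    """Assemble the ttmars.py argument list in the documented order."""
    if num_x_chr not in (1, 2):
        raise ValueError("num_X_chr must be 1 (male) or 2 (female)")
    cmd = [
        "python", "ttmars.py",
        output_dir, files_dir, centro_file, vcf_file,
        reference, asm_h1, asm_h2, tr_file, str(num_x_chr),
    ]
    if pass_only:
        cmd.append("-p")  # assumed to be a switch without a value
    if gt_vali:
        cmd.append("-g")  # assumed to be a switch without a value
    if truth_file is not None:
        # -f and -t must be used together.
        cmd += ["-f", "-t", truth_file]
    return cmd


example = build_ttmars_command(
    "output_dir",
    "./ttmars_files/sample_name",
    "centromere_file.txt",      # placeholder name
    "callset.vcf.gz",
    "reference_genome.fasta",
    "assembly1.fa",
    "assembly2.fa",
    "tandem_repeats_file.bed",  # placeholder name
    num_x_chr=2,
    pass_only=True,
)
print(shlex.join(example))
```

Passing the returned list to `subprocess.run(example, check=True)` from inside the `TT-Mars` directory would launch the run; `run_ttmars.sh` remains the reference for the exact invocation.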
ttmars_combined_res.txt:
| SV index | relative length | relative score | validation result | chr | start | end | Type | Genotype Match |
|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 3.48 | True | chr1 | 249912 | 249912 | INS | True |
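
As a sketch of how the results could be post-processed, the snippet below reads rows with the columns shown in the table and counts validation results per SV type. The on-disk layout is an assumption: it treats `ttmars_combined_res.txt` as tab-separated with the columns in the order listed above, and it skips any line that does not start with a numeric SV index. The function and column key names are hypothetical.

```python
import csv
import io
from collections import Counter

# Column order taken from the table above; the key names are illustrative.
COLUMNS = [
    "sv_index", "relative_length", "relative_score", "validation_result",
    "chr", "start", "end", "type", "genotype_match",
]


def parse_results(handle):
    """Parse tab-separated result rows into dictionaries with typed values."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, headers, and rows that are too short.
        if len(fields) < 8 or not fields[0].strip().isdigit():
            continue
        row = dict(zip(COLUMNS, (field.strip() for field in fields)))
        row["sv_index"] = int(row["sv_index"])
        row["relative_length"] = float(row["relative_length"])
        row["relative_score"] = float(row["relative_score"])
        row["validation_result"] = row["validation_result"] == "True"
        row["start"] = int(row["start"])
        row["end"] = int(row["end"])
        rows.append(row)
    return rows


def count_by_type(rows):
    """Count rows per (SV type, validation result) pair."""
    return Counter((row["type"], row["validation_result"]) for row in rows)


# The single example row from the table above, written as tab-separated text.
sample = "0\t1.0\t3.48\tTrue\tchr1\t249912\t249912\tINS\tTrue\n"
rows = parse_results(io.StringIO(sample))
print(count_by_type(rows))
```

For a real run, replace the `io.StringIO` sample with `open("output_dir/ttmars_combined_res.txt")`, adjusting the delimiter if the file turns out to be formatted differently.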