ncc-gap/juncmut


juncmut

A tool for analyzing splicing-junction-associated variants from splice junction files (.SJ.out.tab files generated by STAR or bamTojunction).

Dependency

Binary programs

samtools, bedtools

Python

pysam, edlib, bio
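
Before installing, it can help to confirm that the two binary dependencies are available on PATH. The following is a minimal sketch using only the Python standard library; the helper name `check_binaries` is hypothetical and not part of juncmut.

```python
import shutil

# Binary programs listed under "Dependency" above.
REQUIRED_BINARIES = ("samtools", "bedtools")


def check_binaries(names=REQUIRED_BINARIES):
    """Return {name: path or None} for each required executable.

    Hypothetical helper for illustration; juncmut does not ship it.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in check_binaries().items():
        print(f"{name}: {path if path else 'NOT FOUND on PATH'}")
```

A value of `None` means the executable could not be found and should be installed before running juncmut.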

Install

Install the package with pip, then run prep.sh in the db directory of a local copy of this repository:

pip install git+https://github.com/ncc-gap/juncmut.git

cd db
bash prep.sh
cd ..

Commands

detect

usage: juncmut detect [-h]
                      [--control_file [CONTROL_FILE [CONTROL_FILE ...]]]
                      [--read_num_thres READ_NUM_THRES]
                      [--freq_thres FREQ_THRES]
                      [--mut_num_thres MUT_NUM_THRES]
                      [--mut_freq_thres MUT_FREQ_THRES]
                      [--support_read_rmdup_thres SUPPORT_READ_RMDUP_THRES]
                      [--debug]
                      input.SJ.out.tab input.bam juncmut.txt reference.fa gencode.v46.basic.annotation.gtf.gz

positional arguments:
  input.SJ.out.tab      Input splice junctions generated by STAR
  input.bam             Input RNA bam file
  juncmut.txt           Output file
  reference.fa          Reference genome
  gencode.v46.basic.annotation.gtf.gz
                        GENCODE gene file

optional arguments:
  -h, --help            show this help message and exit
  --control_file [CONTROL_FILE [CONTROL_FILE ...]]
                        Control data created by merge_control (default: None)
  --read_num_thres READ_NUM_THRES
                        Splicing junctions with reads >= read_num_thres are saved (default: 3)
  --freq_thres FREQ_THRES
                        Splicing junctions with frequency >= freq_thres are saved (default: 0.05)
  --mut_num_thres MUT_NUM_THRES
                        A mutation with mutation alleles >= mut_num_thres is a true candidate (default: 1)
  --mut_freq_thres MUT_FREQ_THRES
                        A mutation with frequency >= mut_freq_thres is a true candidate (default: 0.05)
  --support_read_rmdup_thres SUPPORT_READ_RMDUP_THRES
                        A mutation with supporting reads (after duplicate removal) >= support_read_rmdup_thres is a true candidate (default: 2)
  --debug               Keep the intermediate files.
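
To illustrate what `--read_num_thres` describes, here is a rough sketch that filters a STAR `SJ.out.tab` file by read count. It assumes the standard STAR column layout (column 7 is the number of uniquely mapping reads crossing the junction) and assumes the threshold is applied to that column; juncmut's own implementation may count reads differently. `filter_junctions` is a hypothetical helper, not a juncmut function, and the two junctions are made-up data.

```python
import csv
import io


def filter_junctions(handle, read_num_thres=3):
    """Yield (chrom, start, end, unique_reads) for junctions passing the threshold.

    Assumes STAR's SJ.out.tab layout: tab-separated, intron start/end in
    columns 2 and 3, uniquely mapping read count in column 7.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 7:
            continue  # skip blank or malformed lines
        unique_reads = int(row[6])
        if unique_reads >= read_num_thres:
            yield row[0], int(row[1]), int(row[2]), unique_reads


# Tiny made-up example: the first junction has 5 unique reads, the second 2.
example = io.StringIO(
    "chr1\t14830\t14969\t2\t2\t1\t5\t0\t30\n"
    "chr1\t15039\t15795\t2\t2\t1\t2\t1\t25\n"
)
kept = list(filter_junctions(example))
print(kept)  # [('chr1', 14830, 14969, 5)]
```

With the default threshold of 3, only the first junction is kept.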

filt_bam

usage: juncmut filt_bam [-h] juncmut.txt input.bam juncmut.filt.bam wgEncodeGencode.txt.gz

positional arguments:
  juncmut.txt           Input file (such as juncmut.txt generated by juncmut detect)
  input.bam             Input RNA bam file
  juncmut.filt.bam      Output bam file
  wgEncodeGencode.txt.gz
                        GENCODE gene file

optional arguments:
  -h, --help            show this help message and exit

sjclass

usage: juncmut sjclass [-h] [--gencode gencode.v46.basic.annotation.gtf.gz] [--depth_th depth_th] [--debug]
                       juncmut.annot.txt juncmut.annot.sjclass.txt juncmut.filt.bam input.SJ.out.tab reference.fa

positional arguments:
  juncmut.annot.txt     Input file (such as a file generated by juncmut annot)
  juncmut.annot.sjclass.txt
                        Output classified file
  juncmut.filt.bam      Input RNA bam file
  input.SJ.out.tab      Input splice junctions generated by STAR
  reference.fa          Reference genome

optional arguments:
  -h, --help            show this help message and exit
  --gencode gencode.v46.basic.annotation.gtf.gz
                        GENCODE gene file
  --depth_th depth_th   Depth in the intron for classification.
  --debug               Keep temporary files.

alu

usage: juncmut alu [-h] [--alu {Alu_Funakoshi,AluJ0,AluJ0pA,AluYb8pA}] [--debug]
                   juncmut.sjclass.txt juncmut.sjclass.alu.txt rmsk_sine.bed reference.fa

positional arguments:
  juncmut.sjclass.txt   Input file (generated by juncmut sjclass)
  juncmut.sjclass.alu.txt
                        Output mapped alu file
  rmsk_sine.bed         Input rmsk_sine.bed (db/rmsk_sine.bed)
  reference.fa          Reference genome

optional arguments:
  -h, --help            show this help message and exit
  --alu {Alu_Funakoshi,AluJ0,AluJ0pA,AluYb8pA}
                        Alu reference sequence to use
  --debug               Keep temporary files.

annot

usage: juncmut annot [-h] [--acmg_file ACMG_SF_v3.2.txt] [--cgc_file CancerGeneSensus_GRCh38_v97.txt] [--cgd_file CGD.txt] [--clinvar_file clinvar.vcf.gz]
                     [--clinvar_star234_file clinvar_star234_gene.txt] [--dosage_sensitivity_file clingen_dosage_sensitivity_230105.proc.txt]
                     [--gnomad gnomad.genomes.r3.0.sites.vcf.bgz] [--pancan_file PANAtlas_CellTableS1.txt] [--debug]
                     juncmut.txt juncmut.annot.txt reference.fa

positional arguments:
  juncmut.txt           Input file (such as juncmut.txt generated by juncmut)
  juncmut.annot.txt     Output annotated file
  reference.fa          Reference genome

optional arguments:
  -h, --help            show this help message and exit
  --acmg_file ACMG_SF_v3.2.txt
  --cgc_file CancerGeneSensus_GRCh38_v97.txt
  --cgd_file CGD.txt
  --clinvar_file clinvar.vcf.gz
  --clinvar_star234_file clinvar_star234_gene.txt
  --dosage_sensitivity_file clingen_dosage_sensitivity_230105.proc.txt
  --gnomad gnomad.genomes.r3.0.sites.vcf.bgz
  --pancan_file PANAtlas_CellTableS1.txt
  --debug               Keep the intermediate files.

filt

usage: juncmut filt [-h] juncmut.annot.txt juncmut.annot.filt.txt

positional arguments:
  juncmut.annot.txt     Input file (such as juncmut.annot.txt generated by juncmut annot)
  juncmut.annot.filt.txt
                        Output filtered file

optional arguments:
  -h, --help            show this help message and exit
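
The commands above can be chained, since the output of one is the input of another. The sketch below builds the six command lines in one plausible order and prints them as a dry run. The positional arguments follow the usage messages above, but the step order, the output file names, and the helper names `build_pipeline` and `run_pipeline` are assumptions made for illustration; see the tutorial for the authors' recommended workflow.

```python
import shlex
import subprocess


def build_pipeline(prefix, bam, sj_tab, reference, gtf, gencode_txt, rmsk_bed):
    """Return juncmut command lines in an assumed order (hypothetical helper)."""
    detect_out = f"{prefix}.juncmut.txt"
    annot_out = f"{prefix}.juncmut.annot.txt"
    filt_bam_out = f"{prefix}.juncmut.filt.bam"
    sjclass_out = f"{prefix}.juncmut.annot.sjclass.txt"
    alu_out = f"{prefix}.juncmut.annot.sjclass.alu.txt"
    filt_out = f"{prefix}.juncmut.annot.filt.txt"
    return [
        ["juncmut", "detect", sj_tab, bam, detect_out, reference, gtf],
        ["juncmut", "annot", detect_out, annot_out, reference],
        ["juncmut", "filt_bam", detect_out, bam, filt_bam_out, gencode_txt],
        ["juncmut", "sjclass", annot_out, sjclass_out, filt_bam_out, sj_tab, reference],
        ["juncmut", "alu", sjclass_out, alu_out, rmsk_bed, reference],
        ["juncmut", "filt", annot_out, filt_out],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


# Placeholder input paths; replace with real files before running for real.
commands = build_pipeline(
    prefix="sample",
    bam="input.bam",
    sj_tab="input.SJ.out.tab",
    reference="reference.fa",
    gtf="gencode.v46.basic.annotation.gtf.gz",
    gencode_txt="wgEncodeGencode.txt.gz",
    rmsk_bed="db/rmsk_sine.bed",
)
run_pipeline(commands, dry_run=True)
```

Keeping the default `dry_run=True` lets you review the generated command lines before anything is executed; optional flags such as `--control_file` or `--gencode` can be appended to the relevant list.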

Tutorial

See this page
