JackCurragh/RiboMetric

Repository files navigation

RiboMetric

A Python command-line utility that generates comprehensive quality reports for ribosome profiling (Ribo-Seq) datasets.

Installation

To install RiboMetric:

$ pip install RiboMetric

For PDF export support (adds ~30 dependencies):

$ pip install "RiboMetric[pdf]"

Usage

Create an annotation file from a GFF file:

$ RiboMetric prepare -g gff_file.gff

Use the annotation file to run RiboMetric on a BAM file:

$ RiboMetric run -b bam_file.bam -a annotation_RiboMetric.tsv

By default, RiboMetric calculates standard Ribo-Seq QC metrics. To enable optional (theoretical) metrics:

$ RiboMetric run -b bam_file.bam -a annotation_RiboMetric.tsv --enable-optional-metrics

Or enable specific metrics:

$ RiboMetric run -b bam_file.bam -a annotation_RiboMetric.tsv --enable-metric periodicity_fourier
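
When RiboMetric is called from a Python-based pipeline, the commands above can be assembled as argument lists and handed to `subprocess`. The following is a minimal sketch, not part of RiboMetric itself: the helper names `prepare_command`, `run_command` and `execute` are hypothetical, and only the subcommands and flags shown above are used.

```python
import subprocess
from typing import List, Optional


def prepare_command(gff_path: str) -> List[str]:
    """Build the `RiboMetric prepare` call for a GFF file."""
    return ["RiboMetric", "prepare", "-g", gff_path]


def run_command(
    bam_path: str,
    annotation_path: str,
    all_optional: bool = False,
    metric: Optional[str] = None,
) -> List[str]:
    """Build the `RiboMetric run` call, optionally enabling extra metrics."""
    cmd = ["RiboMetric", "run", "-b", bam_path, "-a", annotation_path]
    if all_optional:
        # Turn on every optional (theoretical) metric.
        cmd.append("--enable-optional-metrics")
    if metric is not None:
        # Turn on one named optional metric, e.g. "periodicity_fourier".
        cmd += ["--enable-metric", metric]
    return cmd


def execute(cmd: List[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

For example, `execute(prepare_command("gff_file.gff"))` followed by `execute(run_command("bam_file.bam", "annotation_RiboMetric.tsv", metric="periodicity_fourier"))` reproduces the shell commands above. Passing argument lists rather than a single string avoids shell quoting problems with file paths.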

For more information on how to use RiboMetric, see the documentation or use --help.

Features

RiboMetric calculates comprehensive quality metrics for Ribo-Seq data:

Default Metrics (Standard Ribo-Seq QC):
  • Read length distribution (IQR, coefficient of variation, max proportion)
  • Terminal nucleotide bias (5' and 3' ligation bias detection)
  • 3-nt periodicity (frame dominance and information content)
  • Metagene uniformity (entropy-based)
  • CDS coverage
  • Regional distribution (5'UTR, CDS, 3'UTR proportions and ratios)
Optional Metrics (Theoretical/Experimental):
  • Alternative periodicity methods (autocorrelation, Fourier transform, Trips-Viz)
  • Alternative uniformity methods (autocorrelation, Theil index, Gini index)
  • Additional read length metrics (bimodality, normality tests)
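
To give a feel for what the read length and periodicity entries above measure, here is an illustrative sketch using textbook definitions. It is an assumption that these match RiboMetric's own formulas; the function names are hypothetical and the code is not taken from the package.

```python
from collections import Counter
from statistics import mean, pstdev, quantiles
from typing import Dict, Sequence


def read_length_summary(lengths: Sequence[int]) -> Dict[str, float]:
    """Summarise a read length distribution (needs at least two reads)."""
    q1, _, q3 = quantiles(lengths, n=4)  # quartile cut points
    counts = Counter(lengths)
    return {
        # Spread of the middle 50% of read lengths.
        "iqr": q3 - q1,
        # Standard deviation relative to the mean length.
        "cv": pstdev(lengths) / mean(lengths),
        # Fraction of reads that share the most common length.
        "max_proportion": max(counts.values()) / len(lengths),
    }


def frame_dominance(frame_counts: Sequence[int]) -> float:
    """Fraction of reads assigned to the most populated reading frame.

    `frame_counts` holds read counts for frames 0, 1 and 2. A value near
    1/3 means no periodicity; a value near 1 means strong periodicity.
    """
    total = sum(frame_counts)
    if total == 0:
        raise ValueError("no reads assigned to any frame")
    return max(frame_counts) / total
```

A tight read length distribution gives a small IQR and coefficient of variation and a high maximum proportion, while strong 3-nt periodicity shows up as one frame holding most of the reads.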

Use --enable-optional-metrics to calculate all metrics, or --enable-metric <name> for specific ones.
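
The uniformity metrics listed above (entropy-based by default, Gini index as an option) both ask how evenly reads are spread across positions. The sketch below uses the standard definitions of normalised Shannon entropy and the Gini index on a list of per-position counts. Whether RiboMetric bins, normalises or scales its values the same way is an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence


def normalized_entropy(counts: Sequence[float]) -> float:
    """Shannon entropy of the count distribution, scaled to [0, 1].

    1.0 means perfectly even coverage; 0.0 means all reads at one position.
    """
    total = sum(counts)
    if len(counts) < 2 or total <= 0:
        raise ValueError("need at least two positions and a positive total")
    probs = [c / total for c in counts if c > 0]
    entropy = sum(p * math.log(1.0 / p) for p in probs)
    return entropy / math.log(len(counts))


def gini_index(counts: Sequence[float]) -> float:
    """Gini index of the counts: 0.0 is perfectly even, (n - 1) / n is maximal."""
    n = len(counts)
    total = sum(counts)
    if n == 0 or total <= 0:
        raise ValueError("need at least one position and a positive total")
    # Rank-weighted sum over counts sorted in ascending order (1-based ranks).
    weighted = sum(rank * x for rank, x in enumerate(sorted(counts), start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

The two measures point in opposite directions: even coverage gives an entropy near 1 and a Gini index near 0, while coverage piled onto a few positions gives a low entropy and a high Gini index.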

Output Formats

RiboMetric provides multiple output formats for different use cases:

For Pipeline Integration:
  • Summary TSV - One-line summary per sample for quick QC decisions
  • QC Status JSON - Machine-readable pass/warn/fail with thresholds
  • Comparison CSV - Wide format for multi-sample comparison
For Sample Review:
  • Interactive HTML - Professional reports with executive summary and searchable metrics
  • PDF - Archivable reports for documentation
  • Metrics Table CSV - Detailed metrics with read-length breakdowns

See REPORTING_GUIDE.md for complete documentation and examples.
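
For pipeline integration, the Summary TSV can be read with the standard library alone. This sketch assumes the file is tab-separated with a header row; it does not assume any particular column names, which are documented in REPORTING_GUIDE.md. The function name `load_summary` is hypothetical.

```python
import csv
from typing import Dict, List


def load_summary(path: str) -> List[Dict[str, str]]:
    """Read a tab-separated summary file into one dict per data row.

    Keys come from the header row; all values are kept as strings so the
    caller decides how to parse each column.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Each returned dict corresponds to one sample, so a pipeline step can loop over the rows and apply its own thresholds to the columns it cares about.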

Requirements

  • Transcriptomic alignments are required in BAM format
  • GFF annotations from Ensembl are also required
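
A quick pre-flight check can catch a wrong input file before a long run. BAM files are BGZF-compressed (a gzip-compatible format) and their decompressed content begins with the magic bytes `BAM\1`, so the container format can be verified with the standard library. This sketch is not part of RiboMetric, the name `looks_like_bam` is hypothetical, and the check cannot tell whether the alignments are transcriptomic.

```python
import gzip


def looks_like_bam(path: str) -> bool:
    """Return True if the file decompresses as gzip and starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Not gzip-compressed, unreadable, or truncated.
        return False
```

A SAM or CRAM file passed by mistake fails this check immediately.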

Testing

RiboMetric ships with a test suite. To run it, install the test requirements and invoke pytest:

$ pip install -r requirements_test.txt
$ pytest

For more information, see TESTING.md.

Credits

Lukas Wierdsma worked on this project during his internship at the UCC for Bioinformatics, Howest, in 2023.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
