nuhs-projects/faster-whisper-types

Pydantic Types for faster-whisper

Pydantic types that make it easier to serialize faster-whisper options and output to and from JSON.

This is required because dataclasses (which faster-whisper uses) are not recursively deserialized: rebuilding a dataclass from parsed JSON leaves any nested objects as plain dicts.
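The limitation can be reproduced with the standard library alone. The sketch below is only an illustration: `Word` and `Segment` here are simplified, hypothetical stand-ins, not the actual faster-whisper classes. It shows that a round trip through JSON does not restore nested dataclass instances.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


# Hypothetical, simplified stand-ins for illustration only;
# these are NOT the real faster-whisper classes.
@dataclass
class Word:
    start: float
    end: float
    word: str


@dataclass
class Segment:
    text: str
    words: List[Word]


original = Segment(
    text="hello world",
    words=[Word(0.0, 0.5, "hello"), Word(0.5, 1.0, "world")],
)

# asdict() recursively converts nested dataclasses to dicts, so dumping works.
payload = json.dumps(asdict(original))

# Loading does not reverse that: the constructor stores whatever it is given,
# so the nested entries remain plain dicts instead of Word instances.
restored = Segment(**json.loads(payload))
first_word = restored.words[0]
print(type(first_word).__name__)
```

A Pydantic model with the same fields would validate and rebuild the nested objects during parsing, which is the gap this package is meant to fill.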

Usage

Parse JSON into model parameters:

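from faster_whisper import WhisperModel
from faster_whisper_types.types import WhisperOptions

model = WhisperModel("tiny.en")
# some_json_data: JSON string of transcription options; audio_file: path to the audio
options = WhisperOptions.model_validate_json(some_json_data)
segments, info = model.transcribe(audio_file, **options.model_dump())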

Convert faster-whisper output to JSON:

from faster_whisper import WhisperModel
from faster_whisper_types.util import fw_transcribe_output_to_pydantic

model = WhisperModel("tiny.en", "cuda", compute_type="float16")

segments, info = fw_transcribe_output_to_pydantic(model.transcribe("tests/audio/short.flac"))

print(info.model_dump_json(indent=2))
# {
#   "language": "en",
#   "language_probability": 1.0,
#   "duration": 10.008,
#   "duration_after_vad": 10.008,
#   "all_language_probs": null,
# ...

Testing

coverage run --branch -m pytest && coverage html
