Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 30 additions & 18 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,21 +1,32 @@

# Useful Transformers
Useful Transformers is a library for efficient inference of Transformer models. The focus is on low cost, low energy processors to run inference at the edge. The initial implementation is aimed at running OpenAI's [Whisper](https://github.com/openai/whisper) speech-to-text model efficiently on the [RK3588](https://www.rock-chips.com/a/en/products/RK35_Series/2022/0926/1660.html) processors' based single-board computers. The tiny.en Whisper model runs transcribes speech at 30x real-time speeds, and 2x better than best [known](https://github.com/guillaumekln/faster-whisper) implementation.

## Getting started
Useful Transformers is a library for efficient inference of Transformer models. The focus is on low cost, low energy processors to run inference at the edge. The initial implementation is aimed at running OpenAI's [Whisper](https://github.com/openai/whisper) speech-to-text model efficiently on the [RK3588](https://www.rock-chips.com/a/en/products/RK35_Series/2022/0926/1660.html) processors' based single-board computers. The tiny.en Whisper model runs transcribes speech at 30x real-time speeds, and 2x better than the best [known](https://github.com/guillaumekln/faster-whisper) implementation.

## Getting Started

The easiest way to try out Whisper transcription is to install the [release](https://github.com/usefulsensors/useful-transformers/releases/download/0.1_rk3588/useful_transformers-0.1-cp310-cp310-linux_aarch64.whl) wheel package.

# Preferably inside a virtual environment
$ python -m pip install https://github.com/usefulsensors/useful-transformers/releases/download/0.1_rk3588/useful_transformers-0.1-cp310-cp310-linux_aarch64.whl
```bash
# Preferably inside a virtual environment
$ python -m pip install https://github.com/usefulsensors/useful-transformers/releases/download/0.1_rk3588/useful_transformers-0.1-cp310-cp310-linux_aarch64.whl
```

Try transcribing a WAV file.

Try transcribing a wav file.
```bash
$ taskset -c 4-7 python -m useful_transformers.transcribe_wav <wav_file> [output_file]
```

$ taskset -c 4-7 python -m useful_transformers.transcribe_wav <wav_file>
- `<wav_file>`: Path to the input WAV file.
- `[output_file]`: (Optional) Path to save the transcribed text. If omitted, the transcription is printed to the console.

If you don't have a wav file handy, running the above command will transcribe an example provided in the package.
If you don't have a WAV file handy, running the command without specifying a WAV file will transcribe an example provided in the package.

$ taskset -c 4-7 python -m useful_transformers.transcribe_wav
Ever tried, ever failed. No matter, try again. Fail again. Fail better.
```bash
$ taskset -c 4-7 python -m useful_transformers.transcribe_wav
Ever tried, ever failed. No matter, try again. Fail again. Fail better.
```

## Performance

Expand All @@ -25,19 +36,20 @@ The plot shows `useful-transformers` Whisper `tiny.en` model's inference times a

## TODO

- [x] Whisper tiny.en
- [x] Whisper base.en
- [ ] Larger Whisper models
- [ ] Use int8 matmuls from the librknnrt
- [ ] Use int4 matmuls (request Rockhip for int4 matmul kernels)
- [ ] Use asynchronous kernel launches (request Rockchip for better APIs in general)
- [ ] Decode with timestamps

- [x] Whisper tiny.en
- [x] Whisper base.en
- [ ] Larger Whisper models
- [ ] Use int8 matmuls from the librknnrt
- [ ] Use int4 matmuls (request Rockhip for int4 matmul kernels)
- [ ] Use asynchronous kernel launches (request Rockchip for better APIs in general)
- [ ] Decode with timestamps

## Contributors
* Nat Jeffries (@njeffrie)
* Manjunath Kudlur (@keveman)
* Guy Nicholson (@guynich)
* James Wang (@JamesUseful)
* Pete Warden (@petewarden)
* Ali Zartash (@aliz)
* Ali Zartash (@aliz)

----
24 changes: 16 additions & 8 deletions examples/whisper/transcribe_wav.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,25 @@
import os
import sys

import argparse
from .whisper import decode_wav_file


def main():
if len(sys.argv) < 2:
wav_file = os.path.join(os.path.dirname(__file__), 'assets', 'ever_tried.wav')
else:
wav_file = sys.argv[1]
text = decode_wav_file(wav_file)
print(text)
parser = argparse.ArgumentParser(description="Transcribe WAV file to text")
parser.add_argument('-i', '--input', type=str, help='Path to the input WAV file', default=os.path.join(os.path.dirname(__file__), 'assets', 'ever_tried.wav'))
parser.add_argument('-o', '--output', type=str, help='Path to the output text file')
args = parser.parse_args()

if not os.path.isfile(args.input):
print(f"The file {args.input} does not exist.")
sys.exit(1)

text = decode_wav_file(args.input)

if args.output:
with open(args.output, 'w') as file:
file.write(text)
else:
print(text)

if __name__ == '__main__':
main()