avro.py

A modern Pythonic implementation of the popular Bengali phonetic-typing software Avro Phonetic.



⚡ Overview

avro.py provides a full-featured, batteries-included text parser that converts English Roman script into its phonetic Bengali (Unicode) equivalent, reverses Bengali text back into Roman script, and converts between Unicode and the Bijoy Keyboard format. At its core, it implements an extensively modified version of the Avro Phonetic Dictionary Search Library by Mehdi Hasan Khan.

✨ Inspirations

This package is inspired by Rifat Nabi's jsAvroPhonetic library and derives from Kaustav Das Modak's pyAvroPhonetic.


🔨 Installation

This package requires Python 3.9 or higher.

# Install or upgrade.
pip install -U avro.py
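If you want to confirm the interpreter requirement before installing, a minimal sketch follows. The helper name `is_supported` and the constant `MIN_VERSION` are hypothetical and not part of avro.py; the only fact taken from this README is the 3.9 minimum.

```python
import sys

# Minimum version stated in this README.
MIN_VERSION = (3, 9)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_VERSION


if not is_supported():
    print("avro.py needs Python 3.9 or higher; please upgrade your interpreter.")
```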

Usage Guide

The examples directory contains the snippets below in runnable form, along with other use cases.

Parsing to Bengali

For a single block of text, use avro.parse():

# Import the package.
import avro

# Our dummy text.
dummy = 'ami banglay gan gai.'

# Parse a single string.
parsed = avro.parse(dummy)
print(parsed)  # আমি বাংলায় গান গাই।

If you have multiple strings, use avro.parse_iter() to get a list of parsed results:

texts = ['ami banglay gan gai.', 'tumi kOthay zao?']
parsed_list = avro.parse_iter(texts)
print(parsed_list)  # ['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?']
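Input read from a file or a form often contains blank lines and stray whitespace. The sketch below shows one way to handle that before transliterating. It assumes `parse` is any callable with the same shape as `avro.parse` (one string in, one string out); the helper name `parse_lines` is hypothetical and not part of avro.py.

```python
from typing import Callable, Iterable, List


def parse_lines(lines: Iterable[str], parse: Callable[[str], str]) -> List[str]:
    """Transliterate each non-blank line; blank lines become empty strings."""
    result = []
    for line in lines:
        stripped = line.strip()
        # Only call the parser when there is actual text to convert.
        result.append(parse(stripped) if stripped else "")
    return result
```

With the package installed, this could be called as `parse_lines(open("lyrics.txt", encoding="utf-8"), avro.parse)`, where `lyrics.txt` is a placeholder file name.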

Alternatively, set the bijoy flag to True to receive the output in the Bijoy Keyboard format:

bijoy_output = avro.parse(dummy, bijoy=True)
# Output: Avwg evsjvh় Mvb MvB।

Conversions

To convert a single Bengali string (Avro/Unicode) to the Bijoy Keyboard format:

bijoy_text = avro.to_bijoy("আমি বাংলায় গান গাই।")
print(bijoy_text)  # Avwg evsjvh় Mvb MvB।

To convert multiple strings at once, use avro.to_bijoy_iter():

bijoy_list = avro.to_bijoy_iter(['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?'])
print(bijoy_list)  # ['Avwg evsjvh় Mvb MvB।', 'tvmf wkrwb‡¶ jd?']

Conversely, to convert a single Bijoy string back to Unicode Bengali:

unicode_text = avro.to_unicode("Avwg evsjvh় Mvb MvB।")
print(unicode_text)  # আমি বাংলায় গান গাই।

For multiple strings, use avro.to_unicode_iter():

unicode_list = avro.to_unicode_iter(['Avwg evsjvh় Mvb MvB।', 'tvmf wkrwb‡¶ jd?'])
print(unicode_list)  # ['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?']

Reversing Back

To reverse a single Unicode Bengali string back to English Roman script:

reversed_text = avro.reverse("আমি বাংলায় গান গাই।")
print(reversed_text)  # ami banglay gan gai.

Warning

The reverse functions are lossy by nature: in favor of readability, they may not reproduce the exact Roman letters that were originally typed.

To reverse multiple strings at once, use avro.reverse_iter():

rev_list = avro.reverse_iter(['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?'])
print(rev_list)  # ['ami banglay gan gai.', 'tumi kothay zaw?']
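The examples in this guide already show the lossiness: 'tumi kOthay zao?' parses to 'তুমি কোথায় যাও?', which reverses to 'tumi kothay zaw?'. If exact round trips matter for your data, a small check like the following sketch can list the inputs that change. It assumes `parse` and `reverse` are callables shaped like `avro.parse` and `avro.reverse`; the helper name `find_lossy` is hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def find_lossy(
    romans: Iterable[str],
    parse: Callable[[str], str],
    reverse: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (original, round_tripped) pairs that differ after parse -> reverse."""
    changed = []
    for original in romans:
        round_tripped = reverse(parse(original))
        if round_tripped != original:
            changed.append((original, round_tripped))
    return changed
```

With the package installed, `find_lossy(texts, avro.parse, avro.reverse)` would report which Roman inputs do not come back unchanged.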

Remapped Exceptions

avro.py also ships with a built-in collection of words with pre-baked mappings, which are substituted directly instead of being processed phonetically. Both the parse and reverse functions use this collection, so you do not have to work out the phonetics for these words yourself:

Note

Remapping is a work-in-progress feature. Some words may still be missing.

avro.parse("ami Microsoft e kaj kori")
# আমি মাইক্রোসফট এ কাজ করি

Asynchronous Operations

Every function above has an asynchronous counterpart, named by adding the _async suffix, which can give a slight performance improvement in certain use cases. See the async examples for more about their usage.
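As a rough illustration of how the asynchronous variants might be combined, the sketch below runs one coroutine per input with `asyncio.gather`. It assumes `parse_async` is an awaitable callable taking one string, such as `avro.parse_async` under the naming rule described above; the helper name `parse_many` is hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def parse_many(
    texts: Iterable[str],
    parse_async: Callable[[str], Awaitable[str]],
) -> List[str]:
    """Schedule one parse coroutine per text; results keep the input order."""
    results = await asyncio.gather(*(parse_async(text) for text in texts))
    return list(results)
```

With the package installed, this could be driven by `asyncio.run(parse_many(texts, avro.parse_async))`.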


🛠️ Contributing

"Fork -> Do your changes -> Send a Pull Request, it's that easy!"

This project uses the uv package manager by Astral. To set up the development environment, run the following commands:

# (Optional) Install recommended Python version and
# setup virtual environment for development.
$ uv python install && uv venv
$ source .venv/bin/activate

# Install the project:
$ uv sync --all-extras --dev

# Build the project:
$ uv build --verbose

In order to run the tests, you can use the following command:

# Run unit tests:
$ uv run pytest .

❤️ Acknowledgements

avro.py would not be possible without the awesome minds behind the original Avro Keyboard software:

And, some awesome people:


License

This project is licensed under the MIT License.
