A modern Pythonic implementation of the popular Bengali phonetic-typing software Avro Phonetic.
avro.py provides a full-fledged, batteries-included text parser that can parse Roman-script (English) phonetic input into Unicode Bengali, reverse Bengali text back into Roman script, and convert between Unicode and the Bijoy Keyboard format. At its core, it implements an extensively modified version of the Avro Phonetic Dictionary Search Library by Mehdi Hasan Khan.
This package is inspired by Rifat Nabi's jsAvroPhonetic library and derives from Kaustav Das Modak's pyAvroPhonetic.
This package requires Python 3.9 or higher.
# Install or upgrade.
pip install -U avro.py
You can also check the examples directory to see the snippets below in action, along with other use cases.
For a single block of text, use avro.parse():
# Import the package.
import avro
# Our dummy text.
dummy = 'ami banglay gan gai.'
# Parse a single string.
parsed = avro.parse(dummy)
print(parsed) # আমি বাংলায় গান গাই।
If you have multiple strings, use avro.parse_iter() to get a list of parsed results:
texts = ['ami banglay gan gai.', 'tumi kOthay zao?']
parsed_list = avro.parse_iter(texts)
print(parsed_list) # ['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?']
Alternatively, set the bijoy flag to True to receive the output in the Bijoy Keyboard format:
bijoy_output = avro.parse(dummy, bijoy=True)
# Output: Avwg evsjvh় Mvb MvB।
To convert a single Bengali string (Avro/Unicode) to the Bijoy Keyboard format:
bijoy_text = avro.to_bijoy("আমি বাংলায় গান গাই।")
print(bijoy_text) # Avwg evsjvh় Mvb MvB।
To convert multiple strings at once, use avro.to_bijoy_iter():
bijoy_list = avro.to_bijoy_iter(['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?'])
print(bijoy_list) # ['Avwg evsjvh় Mvb MvB।', 'tvmf wkrwb‡¶ jd?']
Conversely, to convert a single Bijoy string back to Unicode Bengali:
unicode_text = avro.to_unicode("Avwg evsjvh় Mvb MvB।")
print(unicode_text) # আমি বাংলায় গান গাই।
For multiple strings, use avro.to_unicode_iter():
unicode_list = avro.to_unicode_iter(['Avwg evsjvh় Mvb MvB।', 'tvmf wkrwb‡¶ jd?'])
print(unicode_list) # ['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?']
To reverse a single Unicode Bengali string back to English Roman script:
reversed_text = avro.reverse("আমি বাংলায় গান গাই।")
print(reversed_text) # ami banglay gan gai.
Warning
The reverse functions are lossy by nature: for some letters, they may favor readability over outputting the exact replacement.
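One way to see where this lossiness shows up is to round-trip text through reverse and parse, then compare the result with the original. The helper below is a minimal sketch and not part of avro.py: find_lossy_round_trips is a hypothetical name, and the two conversion functions are passed in as arguments so the pattern can also be tried with stand-ins. The commented usage assumes avro.reverse and avro.parse behave as described in this document.

```python
from typing import Callable, Iterable


def find_lossy_round_trips(
    texts: Iterable[str],
    reverse: Callable[[str], str],
    parse: Callable[[str], str],
) -> list[tuple[str, str]]:
    """Return (original, round_tripped) pairs that did not survive reverse -> parse."""
    lossy = []
    for original in texts:
        # Reverse to Roman script, then parse back to Bengali.
        round_tripped = parse(reverse(original))
        if round_tripped != original:
            lossy.append((original, round_tripped))
    return lossy


# Hypothetical usage with avro.py installed (results depend on the input):
#   import avro
#   print(find_lossy_round_trips(["আমি বাংলায় গান গাই।"], avro.reverse, avro.parse))
```

An empty result means every string survived the round trip; any pair that is returned shows the original next to what came back.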
To reverse multiple strings at once, use avro.reverse_iter():
rev_list = avro.reverse_iter(['আমি বাংলায় গান গাই।', 'তুমি কোথায় যাও?'])
print(rev_list) # ['ami banglay gan gai.', 'tumi kothay zaw?']
avro.py also ships with a built-in collection of words that are remapped directly, without any phonetic processing. Both the parse and reverse functions use this collection, so you do not have to worry about the phonetics of such words:
Note
Remapping is a work-in-progress feature. Some words may still be missing.
avro.parse("ami Microsoft e kaj kori")
# আমি মাইক্রোসফট এ কাজ করি
All of the functions above have asynchronous counterparts, named with an _async suffix, which can give a slight performance boost in certain use cases. Please see the async examples to find out more about their usage.
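As a rough illustration of how an _async variant might be used, the sketch below runs several strings concurrently with asyncio.gather. It is not taken from the project's examples: parse_many is a hypothetical helper, and the commented usage assumes that avro.parse_async is a coroutine function accepting the same text argument as avro.parse. The coroutine is passed in as a parameter, so the pattern works with any awaitable parser.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def parse_many(
    texts: Sequence[str],
    parse_async: Callable[[str], Awaitable[str]],
) -> list[str]:
    """Run parse_async on every text concurrently, preserving input order."""
    return await asyncio.gather(*(parse_async(text) for text in texts))


# Hypothetical usage with avro.py installed
# (assumes avro.parse_async is awaitable):
#   import avro
#   texts = ['ami banglay gan gai.', 'tumi kOthay zao?']
#   print(asyncio.run(parse_many(texts, avro.parse_async)))
```

asyncio.gather returns results in the order the awaitables were passed, so the output list lines up with the input texts.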
"Fork -> Do your changes -> Send a Pull Request, it's that easy!"
This project uses the uv package manager by Astral. To set up and update the environment automatically, run the following commands:
# (Optional) Install the recommended Python version and
# set up a virtual environment for development.
$ uv python install && uv venv
$ source .venv/bin/activate
# Install the project:
$ uv sync --all-extras --dev
# Build the project:
$ uv build --verbose
To run the tests, use the following command:
# Run unit tests:
$ uv run pytest .
avro.py would not be possible without the awesome minds behind the original Avro Keyboard software, and some other awesome people.
This project is licensed under the MIT License.