Added multi-language support to the tokenizer #32
- Switch from badge.fury.io to shields.io for working PyPI badge
- Convert relative paths to absolute GitHub URLs for PyPI compatibility
- Bump version to 0.1.3

- Add GitHub Actions workflow for automated PyPI publishing via OIDC
- Configure trusted publishing environment for verified releases
- Update project metadata with proper URLs and license format
- Prepare for v1.0.0 stable release with production-ready automation

- Add pylibmagic>=0.5.0 dependency for bundled libraries
- Add [full] install option and pre-import handling
- Update README with troubleshooting and Docker sections
- Bump version to 1.0.1

Fixes google#6
Deleted an inline comment referencing the output directory in save_annotated_documents.
…ples.md docs: clarify output_dir behavior in medication_examples.md
Prevents confusion from default `test_output/...` by explicitly saving to current directory.
docs: add output_dir="." to all save_annotated_documents examples
feat: add code formatting and linting pipeline
Introduces a common base exception class that all library-specific exceptions inherit from, enabling users to catch all LangExtract errors with a single except clause.
Add LangExtractError base exception for centralized error handling
Fixes google#25: Windows installation failure due to pylibmagic build requirements.
Breaking change: LangFunLanguageModel removed. Use GeminiLanguageModel or OllamaLanguageModel instead.
fix: Remove LangFun and pylibmagic dependencies to fix Windows installation and OpenAI SDK v1.x compatibility
- Modified save_annotated_documents to accept both pathlib.Path and string paths
- Convert string paths to Path objects before calling mkdir()
- This fixes the error when using output_dir='.' as shown in the README example
…-mkdir Fix save_annotated_documents to handle string paths
feat: Add OpenAI language model support
Signed-off-by: shankeleven <[email protected]>
Manual validation results: Size: 205 lines, Run ID: 16790882125
Manual validation results: Size: 205 lines, Run ID: 16791204003
Manual Validation Results
Status: ❌ Failed
Errors:
Your branch is 24 commits behind.
git fetch origin main
git merge origin/main
git push
Note: Enable "Allow edits by maintainers" to allow automatic updates.
Solves #13
Regex patterns have been modified to be Unicode-aware.
Added tests for different languages.
Added a new dependency, `regex`, as it works better with Unicode than `re`.
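
As a rough illustration of what "Unicode-aware" means for a tokenizer pattern, here is a minimal sketch. It is not the code from this PR: the pattern names and the `tokenize` helper are hypothetical, and it uses only the standard-library `re` module so that it runs without extra dependencies.

```python
import re

# Hypothetical patterns for illustration only; not the PR's actual code.
# ASCII-only: letters outside A-Z/a-z split words apart or are dropped.
ASCII_WORD = re.compile(r"[A-Za-z]+|[0-9]+")

# Unicode-aware: for str patterns, \w and \d follow Unicode by default,
# so [^\W\d_] means "a word character that is not a digit or underscore",
# which covers letters in any script.
UNICODE_WORD = re.compile(r"[^\W\d_]+|\d+")


def tokenize(text, pattern):
    """Return all non-overlapping matches of `pattern` in `text`."""
    return pattern.findall(text)


sample = "na\u00efve caf\u00e9 42"  # "naïve café 42"
ascii_tokens = tokenize(sample, ASCII_WORD)
unicode_tokens = tokenize(sample, UNICODE_WORD)

print(ascii_tokens)    # words are broken at the accented letters
print(unicode_tokens)  # words stay intact
print(tokenize("Привет мир", UNICODE_WORD))
```

The standard-library `re` module does not support Unicode property classes such as `\p{L}`, while the third-party `regex` package does; that may be part of why `regex` handles multilingual text more conveniently.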