Demo - AI safety validators (lexical slurs and PII removal) #463
Caution
Review failed
The pull request is closed.
Walkthrough
Introduces a comprehensive guardrails safety system for content validation. Adds configuration models, a GuardrailsEngine orchestrator that builds and executes validators from config, multiple validator implementations (lexical slur detection, PII anonymization, ban lists), language detection utilities, and corresponding test coverage. Includes hub validator loader infrastructure and new project dependencies.
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant GuardrailsEngine
participant HubLoader
participant LanguageDetector
participant Validators as Validators<br/>(Lexical Slur,<br/>Ban List, PII)
participant ExternalServices as External<br/>(Guardrails Hub,<br/>Presidio, HF Model)
User->>GuardrailsEngine: init(GuardrailConfigRoot)
activate GuardrailsEngine
GuardrailsEngine->>HubLoader: ensure_hub_validator_installed(type)
activate HubLoader
HubLoader->>HubLoader: is_importable(module_path)?
alt Not Installed
HubLoader->>ExternalServices: guardrails hub install
ExternalServices-->>HubLoader: (installed)
end
HubLoader-->>GuardrailsEngine: (ready)
deactivate HubLoader
GuardrailsEngine->>HubLoader: load_hub_validator_class(type)
HubLoader-->>GuardrailsEngine: validator_class
GuardrailsEngine->>Validators: instantiate validators
Validators-->>GuardrailsEngine: validator instances
GuardrailsEngine-->>User: GuardrailsEngine ready
deactivate GuardrailsEngine
User->>GuardrailsEngine: run_input_validators(text)
activate GuardrailsEngine
GuardrailsEngine->>LanguageDetector: predict(text)
activate LanguageDetector
LanguageDetector->>ExternalServices: XLM-RoBERTa inference
ExternalServices-->>LanguageDetector: lang_label
LanguageDetector-->>GuardrailsEngine: {language, score}
deactivate LanguageDetector
rect rgb(200, 220, 255)
note right of GuardrailsEngine: Run validators based on language
end
GuardrailsEngine->>Validators: run(text)
alt Language == Hindi
Validators->>Validators: Hinglish path (lexical slur)
else Language == English
Validators->>ExternalServices: Presidio anonymize (PII)
ExternalServices-->>Validators: anonymized_text
end
Validators-->>GuardrailsEngine: validated_output
GuardrailsEngine-->>User: validated_result
deactivate GuardrailsEngine
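The flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code from this pull request: `GuardrailsEngineSketch`, `detect_language`, `make_ban_list_validator`, and `anonymize_email` are hypothetical names, the language detector is a toy script check standing in for the XLM-RoBERTa model, and the email regex stands in for Presidio. Only the names `is_importable` and `run_input_validators` come from the diagram, and their signatures here are guesses.

```python
import importlib.util
import re
from typing import Callable, Dict, List

# A validator takes text and returns the (possibly rewritten) text.
Validator = Callable[[str], str]


def is_importable(module_path: str) -> bool:
    """Return True if the module can be located without importing it.

    Assumed shape of the hub loader check; an installer would only be
    invoked when this returns False.
    """
    try:
        return importlib.util.find_spec(module_path) is not None
    except (ImportError, ValueError):
        return False


def detect_language(text: str) -> str:
    """Toy detector (NOT the model used in the PR): Devanagari means "hi"."""
    return "hi" if any("\u0900" <= ch <= "\u097f" for ch in text) else "en"


def make_ban_list_validator(banned: List[str]) -> Validator:
    """Build a validator that redacts any banned term, case-insensitively."""
    if not banned:
        return lambda text: text
    pattern = re.compile("|".join(re.escape(w) for w in banned), re.IGNORECASE)
    return lambda text: pattern.sub("[REDACTED]", text)


# Simplified email pattern; a stand-in for a real PII anonymizer.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def anonymize_email(text: str) -> str:
    """Replace email addresses with a placeholder token."""
    return EMAIL_RE.sub("<EMAIL_ADDRESS>", text)


class GuardrailsEngineSketch:
    """Hypothetical orchestrator: detect the language, then run its validators."""

    def __init__(
        self,
        validators_by_language: Dict[str, List[Validator]],
        detect: Callable[[str], str] = detect_language,
    ) -> None:
        self.validators_by_language = validators_by_language
        self.detect = detect

    def run_input_validators(self, text: str) -> str:
        language = self.detect(text)
        # Validators run in order; each sees the previous one's output.
        for validator in self.validators_by_language.get(language, []):
            text = validator(text)
        return text


engine = GuardrailsEngineSketch(
    {
        "en": [anonymize_email],
        "hi": [make_ban_list_validator(["badword"])],
    }
)
print(engine.run_input_validators("Mail jane@example.com"))
```

Injecting the detector as a callable keeps the dispatch logic testable without loading a model; a real detector that also returns a score, as in the diagram, would need a small adapter.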
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
📒 Files selected for processing (13)
Summary
Target issue is #PLEASE_TYPE_ISSUE_NUMBER
Explain the motivation for making this change. What existing problem does the pull request solve?
Checklist
Before submitting a pull request, please ensure that you mark these tasks.
fastapi run --reload app/main.py or docker compose up in the repository root and test.
Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit
New Features
Tests