
Conversation

@Pouyanpi
Collaborator

Description

Fixes a bug where stream_async() would hang or behave unpredictably when output rails are configured but rails.output.streaming.enabled is False or not set.

Problem

When users configured output rails but either:

  1. Set rails.output.streaming.enabled = False, or
  2. Didn't configure rails.output.streaming at all (defaults to disabled)

and then called stream_async(), the system would enter an incompatible state:

  • The streaming handler expected real-time token streaming
  • But output rails tried to run in blocking/non-streaming mode on the complete response
  • This resulted in hangs, undefined behavior, or silent failures

Fix

Added validation at the start of stream_async() to detect this misconfiguration and raise a clear, actionable ValueError with guidance on how to fix it.

The error message tells users to either:

  • Set rails.output.streaming.enabled = True in their configuration, or
  • Use generate_async() instead of stream_async()

…ut rails streaming

When output rails are configured but output.streaming.enabled is False (or not set),
calling stream_async() would result in undefined behavior or hangs due to the conflict
between streaming expectations and blocking output rail processing.

This change adds explicit validation in stream_async() to detect this misconfiguration
and raise a clear ValueError with actionable guidance:
- Set rails.output.streaming.enabled = True to use streaming with output rails
- Use generate_async() instead for non-streaming with output rails

Updated affected tests to expect and validate the new error behavior instead of
relying on the previous buggy behavior.
@Pouyanpi Pouyanpi added this to the v0.18.0 milestone Oct 24, 2025
@Pouyanpi Pouyanpi self-assigned this Oct 24, 2025
@Pouyanpi Pouyanpi added the bug Something isn't working label Oct 24, 2025

Copilot AI left a comment

Pull Request Overview

This PR fixes a critical bug where stream_async() would hang or behave unpredictably when output rails are configured but streaming is disabled. The fix adds validation at the start of stream_async() to detect this misconfiguration and raise a clear, actionable error message guiding users to either enable output rails streaming or use generate_async() instead.

Key changes:

  • Added validation logic in stream_async() to check for incompatible output rails configuration
  • Updated multiple test files to verify the new error is raised correctly instead of testing the old incorrect behavior
  • Error message provides clear remediation steps for users

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| nemoguardrails/rails/llm/llmrails.py | Added validation check in stream_async() that raises ValueError when output rails are configured but streaming is disabled |
| tests/test_streaming.py | Added new test case test_streaming_with_output_rails_disabled_raises_error() to verify the error is raised with an explicit enabled=False config |
| tests/test_streaming_output_rails.py | Updated three existing test cases to expect ValueError instead of testing the old incorrect behavior |
| tests/test_parallel_streaming_output_rails.py | Updated one test case to expect ValueError for the default config without explicit streaming settings |



@greptile-apps greptile-apps bot left a comment


Greptile Overview

Greptile Summary

This PR adds fail-fast validation to stream_async() to prevent a critical runtime bug where the method would hang indefinitely when output rails are configured without explicit streaming support. The fix adds a configuration check at the start of stream_async() in llmrails.py that raises a clear ValueError if output rails exist but rails.output.streaming.enabled is either False or unset. This prevents the incompatible state where the streaming handler waits for real-time tokens while output rails attempt to process the complete response in blocking mode. The validation fits into the existing configuration validation pattern and is positioned before any async task spawning occurs. Four test files were updated to verify the new error-raising behavior instead of testing the previous undefined behavior.

Important Files Changed

| Filename | Score | Overview |
| --- | --- | --- |
| nemoguardrails/rails/llm/llmrails.py | 5/5 | Added validation check to raise ValueError when stream_async() is called with output rails but streaming disabled |
| tests/test_streaming.py | 5/5 | Added test verifying stream_async() raises an error when output rails are configured with streaming.enabled=False |
| tests/test_streaming_output_rails.py | 4/5 | Updated three tests to expect ValueError instead of testing previous undefined behavior; two tests appear duplicative |
| tests/test_parallel_streaming_output_rails.py | 4/5 | Changed test to verify error-raising behavior instead of assuming default config streaming works |

Confidence score: 4/5

  • This PR is safe to merge with only minor concerns about test organization
  • Score reflects robust validation logic with clear error messages, comprehensive test coverage, and well-documented behavior change; deducted one point due to apparent test duplication in test_streaming_output_rails.py (lines 177-193 and 224-240 are nearly identical) which could be consolidated or better differentiated
  • Pay close attention to tests/test_streaming_output_rails.py for the duplicated test cases that should potentially be merged or have their distinct purposes clarified

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant LLMRails
    participant StreamAsync
    participant ValidationCheck
    participant GenerateAsync
    participant StreamingHandler
    participant OutputRails

    User->>LLMRails: stream_async(messages)
    LLMRails->>StreamAsync: Initialize streaming

    StreamAsync->>ValidationCheck: Check output rails config

    alt Output rails exist AND streaming disabled
        ValidationCheck->>ValidationCheck: len(config.rails.output.flows) > 0
        ValidationCheck->>ValidationCheck: NOT config.rails.output.streaming.enabled
        ValidationCheck-->>User: raise ValueError("stream_async() cannot be used...")
    else Output rails compatible OR no output rails
        ValidationCheck->>StreamingHandler: Create StreamingHandler
        StreamAsync->>GenerateAsync: asyncio.create_task(generate_async())

        par Parallel Execution
            GenerateAsync->>OutputRails: Process with rails
            OutputRails->>StreamingHandler: push_chunk(tokens)
        and
            StreamingHandler-->>User: yield token chunks
        end

        GenerateAsync->>StreamingHandler: push_chunk(END_OF_STREAM)
        StreamingHandler-->>User: Stream complete
    end
```

4 files reviewed, 4 comments


@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.


Collaborator

@tgasser-nv tgasser-nv left a comment


Looks good, just a couple of items before merging:

  • Could you add more tests to cover the 8 permutations of {stream_async, generate_async}, output rail streaming enabled, and RailsConfig.streaming? I expect some of these are already covered in other tests. For example (generate_async, output rail streaming.enabled False, RailsConfig.streaming False).
  • I understand we need to wait until inference-time for the call to either generate_async or stream_async to figure out whether the client actually wants to stream output or not. Are there any cases we can detect statically from Pydantic validators that we know wouldn't work regardless of which method the client calls?

@Pouyanpi
Collaborator Author

Looks good, just a couple of items before merging:

  • Could you add more tests to cover the 8 permutations of {stream_async, generate_async}, output rail streaming enabled, and RailsConfig.streaming? I expect some of these are already covered in other tests. For example (generate_async, output rail streaming.enabled False, RailsConfig.streaming False).

I checked, and we are currently testing all permutations:

  1. stream_async + enabled=True + config.streaming=True → test_streaming_output_rails_allowed

  2. stream_async + enabled=True + config.streaming=False → test_stream_async_with_output_rails_streaming_enabled_rails_config_streaming_false

  3. stream_async + enabled=False + config.streaming=True → test_streaming_with_output_rails_disabled_raises_error

  4. stream_async + enabled=False + config.streaming=False → test_streaming_with_output_rails_no_streaming_config_raises_error

generate_async tests (4 permutations):

  5. generate_async + enabled=True + config.streaming=True → test_generate_async_with_output_rails_streaming_enabled_rails_config_streaming_true

  6. generate_async + enabled=True + config.streaming=False → test_generate_async_with_output_rails_streaming_enabled_rails_config_streaming_false

  7. generate_async + enabled=False + config.streaming=True → test_generate_async_with_output_rails_streaming_disabled_rails_config_streaming_true

  8. generate_async + enabled=False + config.streaming=False → test_generate_async_with_output_rails_streaming_disabled_rails_config_streaming_false
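The permutation space above can be enumerated programmatically. A hypothetical sketch of which of the eight cases should raise, per the fix described in this PR (only the two stream_async cases with output-rail streaming disabled error out; the top-level config.streaming flag does not affect the check):

```python
import itertools

# The three binary dimensions from the permutation list above.
METHODS = ("stream_async", "generate_async")
OUTPUT_STREAMING_ENABLED = (True, False)
CONFIG_STREAMING = (True, False)

def should_raise(method: str, output_streaming_enabled: bool, config_streaming: bool) -> bool:
    """Per the fix, only stream_async() with output-rail streaming disabled errors."""
    return method == "stream_async" and not output_streaming_enabled

cases = list(itertools.product(METHODS, OUTPUT_STREAMING_ENABLED, CONFIG_STREAMING))
error_cases = [c for c in cases if should_raise(*c)]
```

A table like this could also drive a single parametrized test, though the repository keeps the eight cases as separately named test functions.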

> I understand we need to wait until inference-time for the call to either generate_async or stream_async to figure out whether the client actually wants to stream output or not. Are there any cases we can detect statically from Pydantic validators that we know wouldn't work regardless of which method the client calls?

No, that's all we can do for now. Later, when we drop the top-level streaming field, this will become much cleaner.


@greptile-apps greptile-apps bot left a comment


Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The developer has addressed all previous review feedback by:

  1. Refactored validation logic - Extracted the output-rails streaming validation into a dedicated _validate_streaming_with_output_rails() method in llmrails.py, improving code maintainability and testability
  2. Strengthened test assertions - Replaced fragile substring checks (in) with exact equality assertions (==) across all test files to ensure the complete error message with actionable guidance is validated
  3. Removed test duplication - Consolidated three redundant test cases in test_streaming_output_rails.py into a single comprehensive test
  4. Added missing test coverage - Created a new test case in test_streaming.py for the scenario where rails.output.streaming configuration is completely omitted (defaults to disabled)
  5. Fixed error message text - Corrected "output.streaming.enabled" to "rails.output.streaming.enabled" throughout for accuracy

These changes improve code quality, test coverage, and user experience without modifying the core validation behavior introduced in the original fix.
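The assertion-strengthening point (substring containment replaced by exact equality) can be illustrated as follows. The error text here is hypothetical wording, not the exact message from the codebase:

```python
# Hypothetical full error message (illustrative wording only).
EXPECTED_MESSAGE = (
    "stream_async() cannot be used when output rails are configured but "
    "rails.output.streaming.enabled is False or not set. Either set "
    "rails.output.streaming.enabled = True, or call generate_async() instead."
)

def fragile_check(message: str) -> bool:
    # Substring check: still passes even if the remediation guidance is dropped.
    return "stream_async() cannot be used" in message

def strict_check(message: str) -> bool:
    # Exact equality: fails if any part of the message regresses.
    return message == EXPECTED_MESSAGE
```

With pytest this is typically written as `assert str(exc_info.value) == EXPECTED_MESSAGE` inside a `pytest.raises(ValueError)` block.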

Important Files Changed

| Filename | Score | Overview |
| --- | --- | --- |
| nemoguardrails/rails/llm/llmrails.py | 5/5 | Extracted validation logic into dedicated _validate_streaming_with_output_rails() method for better maintainability |
| tests/test_streaming.py | 5/5 | Added test for missing streaming config and upgraded assertions to exact string equality |
| tests/test_streaming_output_rails.py | 5/5 | Removed three redundant test cases and strengthened the remaining assertion to check the full error message |
| tests/test_parallel_streaming_output_rails.py | 5/5 | Updated assertion to validate the complete error message with actionable guidance |

Confidence score: 5/5

  • This PR is safe to merge with minimal risk
  • All previous review feedback has been comprehensively addressed with appropriate refactoring, test improvements, and duplication removal. The changes are purely improvements to code quality and test robustness without any logic modifications to the core validation behavior.
  • No files require special attention

4 files reviewed, no comments


