feat: Test framework modernization BDD #331
…streamline and enhance our testing capabilities.

- Universal Framework: Works across all BDD and xtest suites from a single, shared core.
- Profile-Based Testing: Easily switch between test configurations like `cross-sdk-basic` and `no-kas`.
- Simplified Integration: A central pytest plugin removes the need for suite-specific setup.
- Key Capabilities: Includes service discovery, deterministic time/randomness control, and automated evidence collection.
- Future-Proof: Lays the groundwork for TestRail integration.
- Backward-Compatible: Existing xtest commands are unaffected, allowing for gradual adoption of new features.
- Requirement: Requires Python 3.12 or newer.
- Added `@pytest.mark.req()` and `@pytest.mark.cap()` markers to 4…
- Tagged all xtest functions with business requirement IDs (BR…)
- Added capability dimensions: sdk, format, feature, policy, e…
- Registered custom markers in conftest.py for pytest integration
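The marker tagging described above might look like the following sketch. The marker names come from this PR, but the exact signatures and the test body are illustrative assumptions:

```python
# Illustrative use of the req/cap markers; the real xtest functions and
# marker arguments in this PR may differ.
import pytest


@pytest.mark.req("BR-101")  # hypothetical business requirement ID
@pytest.mark.cap(sdk="go", format="ztdf")  # capability dimensions
def test_encrypt_decrypt_roundtrip():
    # Placeholder body standing in for a real cross-SDK test.
    assert True
```

Registering `req` and `cap` in `conftest.py` (via `pytest_configure` or the `markers` ini option) silences unknown-marker warnings and makes the markers discoverable with `pytest --markers`.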
## CoverageMatrixGenerator (Phase 1C)
- Created unified reporting system for ALL test suites (xtest, BDD)
- Single report aggregates coverage across multiple test suites
- Multi-dimensional coverage analysis:
- Business requirement coverage tracking
- SDK compatibility matrix (go×java, go×js, etc.)
- Capability coverage across 7+ dimensions
- Automatic gap detection and severity classification
## Report Formatters
- HTML: Interactive dashboard with charts and progress bars
- JSON: Machine-readable format for CI/CD integration
- Markdown: Documentation-friendly for PRs and wikis
- All formats generated from single Pydantic data model
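Generating every format from one shared data model can be sketched as follows. The `CoverageSummary` model and the two renderers are simplified assumptions, not the PR's actual Pydantic schema:

```python
# Minimal sketch: one data model, multiple report renderers.
import json
from dataclasses import dataclass, asdict


@dataclass
class CoverageSummary:
    total_tests: int
    covered_requirements: int
    total_requirements: int


def to_json(summary: CoverageSummary) -> str:
    """Machine-readable output for CI/CD integration."""
    return json.dumps(asdict(summary))


def to_markdown(summary: CoverageSummary) -> str:
    """Documentation-friendly output for PRs and wikis."""
    pct = 100 * summary.covered_requirements / summary.total_requirements
    return (
        "| Tests | Requirements covered |\n|---|---|\n"
        f"| {summary.total_tests} | {summary.covered_requirements}/"
        f"{summary.total_requirements} ({pct:.0f}%) |"
    )


summary = CoverageSummary(total_tests=57, covered_requirements=5, total_requirements=5)
```

Because every formatter consumes the same model, the HTML, JSON, and Markdown reports cannot drift out of sync with each other.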
## CLI Interface
- `python -m framework.reporting` command for report generation
- Configurable coverage thresholds for CI/CD gates
- Support for profile-based analysis
- Evidence-based test result loading
## Test Coverage Statistics
- Successfully discovered 57 tests (48 xtest, 9 BDD)
- Tracked 5 business requirements
- SDK matrix shows 34 tests per combination
- 17 different features tracked
- Reports generated in <1 second
The framework now provides comprehensive visibility into test coverage and helps identify gaps to prioritize future testing efforts.
Summary of Changes
Hello @pflynn-virtru, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly modernizes the BDD test framework by introducing robust environment setup, profile-based test configuration, and enhanced artifact management. It lays the foundation for deterministic test execution and seamless integration with external test management tools, aiming to improve the reliability and efficiency of the overall testing pipeline.
Highlights
- BDD Framework Enhancements: The new `bdd/environment.py` script provides a robust test setup, integrating a service locator for dynamic dependency resolution, deterministic time and randomness controllers for reproducible tests, and automated evidence collection for each scenario.
- Profile-Based Test Configuration: Introduction of a flexible profile system (`profiles/`) allows defining test configurations based on capabilities. This enables tailored test runs, such as a `cross-sdk-basic` profile for full functionality or a `no-kas` profile for testing scenarios without the Key Access Service.
- Comprehensive Documentation: The pull request includes comprehensive documentation in `REQUIREMENTS.md` and `DESIGN.md`, outlining the business and functional requirements, architectural overview, and detailed implementation plans for the test framework modernization.
- External Integration Readiness: New configuration templates (`.env.example`) and detailed design for TestRail and Jira integration lay the groundwork for centralized test management, enabling automatic test run creation, result publishing, and bug reporting.
- Pytest Plugin for Universal Support: A universal `framework/pytest_plugin.py` has been added to seamlessly integrate core framework features like profile-based filtering, evidence collection, and deterministic controls into any pytest-based test suite, including the existing `xtest` suite.
- Deterministic Testing Controls: The implementation of `TimeController` and `RandomnessController` ensures that test execution is deterministic and reproducible, which is critical for reliably identifying and addressing flaky tests.
- Automated Coverage Reporting: New modules under `framework/reporting/` enable the generation of detailed coverage matrices, including requirements coverage, capability coverage, and SDK compatibility, with output options for JSON, Markdown, and HTML.
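The deterministic controllers highlighted above can be sketched as below. `TimeController` and `RandomnessController` are the PR's class names, but the method names and seeding scheme here are assumptions:

```python
# Sketch of deterministic time/randomness control for reproducible tests.
import random
from datetime import datetime, timedelta


class TimeController:
    """Serve a frozen, manually-advanced clock instead of wall time."""

    def __init__(self, start: datetime):
        self._now = start

    def now(self) -> datetime:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += timedelta(seconds=seconds)


class RandomnessController:
    """Hand out seeded RNG streams so repeated runs are identical."""

    def __init__(self, seed: int):
        self._seed = seed

    def rng(self, stream: str) -> random.Random:
        # Derive a per-stream seed so independent components stay stable
        # even when new consumers are added.
        return random.Random(f"{self._seed}:{stream}")
```

Tests that read time via `context.time_controller.now()` and draw randomness from named streams become replayable byte-for-byte, which is what makes flaky-test triage tractable.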
Code Review
This pull request introduces a significant and well-designed modernization of the test framework, adding BDD support, profile-based configuration, and deterministic execution controls. The architecture is robust, and the use of Pydantic models and a dedicated pytest plugin is excellent. My review focuses on several critical and high-severity issues related to potential security vulnerabilities (use of eval), conflicting hooks that could break test execution, and duplicate data models that will cause maintenance problems. I've also included several medium-severity suggestions to improve maintainability by reducing hardcoded logic and making the framework more data-driven, which aligns with the overall design goals.
```python
# Cleanup
def after_scenario(context, scenario):
```
Defining an after_scenario hook here will conflict with the one defined in bdd/environment.py. Behave only executes one hook of each type per run. This will likely cause the main after_scenario hook in environment.py (which handles evidence collection) to be skipped for features that use these steps, leading to incomplete test results.
The cleanup logic should be moved into a separate function and attached to the context object in a before_scenario hook, then called from the main after_scenario hook in environment.py.
```python
def _evaluate_condition(self, condition: str, capabilities: Dict[str, str]) -> bool:
    """
    Evaluate a skip condition.

    Simple evaluation of conditions like:
    - "sdk == 'swift' and format == 'ztdf-ecwrap'"
    """
    if not condition:
        return False

    # Replace capability references with actual values
    eval_condition = condition
    for key, value in capabilities.items():
        eval_condition = eval_condition.replace(key, f"'{value}'")

    try:
        # Safe evaluation with limited scope
        return eval(eval_condition, {"__builtins__": {}}, {})
    except Exception as e:
        logger.warning(f"Failed to evaluate condition '{condition}': {e}")
        return False
```
The use of eval() on a string derived from configuration (condition) is a significant security risk. Even with a limited scope, it can be exploited if a malicious actor can influence the profile configuration files. A safer approach should be used, such as implementing a simple custom parser for the condition strings or using a dedicated safe evaluation library like asteval.
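An eval-free replacement could be a deliberately tiny parser that only understands the clause shape shown in the docstring (`key == 'value'` / `key != 'value'` joined by `and`). This is a sketch under that assumption about the condition grammar:

```python
# Sketch: parse skip conditions without eval(); anything outside the tiny
# grammar fails closed instead of being executed.
import re

_CLAUSE = re.compile(r"^(\w+)\s*(==|!=)\s*'([^']*)'$")


def evaluate_condition(condition: str, capabilities: dict) -> bool:
    if not condition:
        return False
    for clause in condition.split(" and "):
        match = _CLAUSE.match(clause.strip())
        if not match:
            return False  # unknown syntax: never executed, just rejected
        key, op, value = match.groups()
        matches = capabilities.get(key) == value
        if (op == "==") != matches:
            return False
    return True
```

Because unmatched input is rejected rather than interpreted, a hostile profile file can at worst cause a scenario not to be skipped, never code execution. `asteval` is the library alternative if richer expressions are needed.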
bdd/environment.py
Outdated
```python
    }

    # Save evidence to file
    evidence_dir = context.artifacts_dir / context.req_id / context.variant
```
context.req_id can be None if no req: tag is found on the scenario or its feature. Using it directly in a Path operation (context.artifacts_dir / context.req_id) will raise a TypeError. You should handle the case where req_id is None, for example, by using a default directory name like "unclassified".
Suggested change:

```python
evidence_dir = context.artifacts_dir / (context.req_id or "unclassified") / context.variant
```
```python
@dataclass
class ProfileConfig:
    """Configuration settings for a test profile."""

    roles: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)
    selection: Dict[str, Any] = field(default_factory=dict)
    matrix: Dict[str, Any] = field(default_factory=dict)
    timeouts: Dict[str, int] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'ProfileConfig':
        """Create ProfileConfig from dictionary."""
        return cls(
            roles=data.get('roles', {}),
            selection=data.get('selection', {}),
            matrix=data.get('matrix', {}),
            timeouts=data.get('timeouts', {})
        )


@dataclass
class ProfilePolicies:
    """Policy settings for a test profile."""

    waivers: List[Dict[str, str]] = field(default_factory=list)
    expected_skips: List[Dict[str, str]] = field(default_factory=list)
    severities: Dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'ProfilePolicies':
        """Create ProfilePolicies from dictionary."""
        return cls(
            waivers=data.get('waivers', []),
            expected_skips=data.get('expected_skips', []),
            severities=data.get('severities', {})
        )


@dataclass
class Profile:
    """Test profile configuration."""

    id: str
    capabilities: Dict[str, List[str]]
    config: ProfileConfig
    policies: ProfilePolicies
    metadata: Dict[str, Any] = field(default_factory=dict)
```
This file defines Profile, ProfileConfig, and ProfilePolicies using dataclass. However, framework/core/models.py defines the same (or very similar) models using Pydantic. Having duplicate model definitions is confusing and will lead to maintenance issues and bugs. The Pydantic models in models.py are more robust due to built-in validation. It's highly recommended to consolidate these models and use only the Pydantic versions throughout the framework.
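The consolidated Pydantic-only version might look like the sketch below. Field names mirror the dataclasses above, but this is an assumption about `framework/core/models.py`, whose real schema may differ:

```python
# Sketch: keep only Pydantic models and drop the dataclass duplicates.
# Pydantic's constructor replaces the hand-written from_dict() classmethods
# and adds type validation for free.
from typing import Any, Dict, List
from pydantic import BaseModel, Field


class ProfileConfig(BaseModel):
    roles: Dict[str, Dict[str, List[str]]] = Field(default_factory=dict)
    selection: Dict[str, Any] = Field(default_factory=dict)
    matrix: Dict[str, Any] = Field(default_factory=dict)
    timeouts: Dict[str, int] = Field(default_factory=dict)


class ProfilePolicies(BaseModel):
    waivers: List[Dict[str, str]] = Field(default_factory=list)
    expected_skips: List[Dict[str, str]] = Field(default_factory=list)
    severities: Dict[str, str] = Field(default_factory=dict)


class Profile(BaseModel):
    id: str
    capabilities: Dict[str, List[str]]
    config: ProfileConfig = Field(default_factory=ProfileConfig)
    policies: ProfilePolicies = Field(default_factory=ProfilePolicies)
    metadata: Dict[str, Any] = Field(default_factory=dict)


# Loading parsed YAML becomes a plain constructor call with validation:
profile = Profile(**{"id": "no-kas", "capabilities": {"sdk": ["go"]}})
```

A wrong type in a profile YAML file then fails at load time with a precise validation error, instead of surfacing later as a confusing runtime failure.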
bdd/environment.py
Outdated
```python
import logging

# Add framework to path
sys.path.insert(0, str(Path(__file__).parent.parent))
```
Modifying sys.path directly can be brittle. Since this project is set up with pyproject.toml, a better practice is to install the framework in an editable mode using pip install -e . from the root directory. This makes the framework package available throughout your environment without needing to manipulate the path in your code.
bdd/environment.py
Outdated
```python
try:
    context.profile = context.profile_manager.load_profile(profile_id)
    logger.info(f"Loaded profile: {profile_id}")
except Exception as e:
```
Catching a broad Exception can hide unexpected errors. ProfileManager.load_profile raises a ValueError if the profile is not found, so it would be more precise to catch that specific exception. This allows other unexpected errors (e.g., IOError, parsing errors) to fail fast, which can make debugging easier.
Suggested change:

```python
except ValueError as e:
```
bdd/environment.py
Outdated
```python
if context.profile.id == "no-kas" and cap_key in ["format", "encryption", "policy"]:
    scenario.skip(f"Capability '{cap_key}' not available without KAS")
    logger.info(f"Skipping '{scenario.name}': {cap_key} requires KAS")
    return
else:
    scenario.skip(f"Capability '{cap_key}' not available in profile")
    logger.info(f"Skipping '{scenario.name}': missing capability {cap_key}")
    return
```
This block contains hardcoded logic specific to the no-kas profile. This makes the framework less maintainable, as any new profile with similar skipping rules would require code changes here. This logic should be data-driven and defined within the profile's configuration files (e.g., using expected_skips or a new forbidden_capabilities section in policies.yaml). The existing profile.should_skip() method should be enhanced to handle this, centralizing all skip logic.
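The data-driven version could look like this sketch, where `forbidden_capabilities` is a hypothetical `policies.yaml` section (the dict below stands in for the parsed YAML) and `should_skip` is the enhanced check:

```python
# Sketch: skip rules live in the profile's policy data, not in code.
# "forbidden_capabilities" is an assumed policies.yaml section, e.g. for
# the no-kas profile.

policies = {
    "forbidden_capabilities": {
        # capability keys a scenario may not require under this profile
        "keys": ["format", "encryption", "policy", "kas_type"],
        "reason": "not available without KAS",
    }
}


def should_skip(required_caps: dict, policies: dict):
    """Return a skip reason if the scenario needs a forbidden capability, else None."""
    forbidden = policies.get("forbidden_capabilities", {})
    for cap_key in required_caps:
        if cap_key in forbidden.get("keys", []):
            return f"Capability '{cap_key}' {forbidden.get('reason', 'is forbidden')}"
    return None
```

With this shape, adding a new restricted profile is a YAML change only, and both the Behave hook and the pytest plugin can call the same `should_skip()`.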
```python
# Special handling for no-kas profile
if profile.id == "no-kas":
    # Skip any test that requires encryption capabilities
    if any(key in ["format", "encryption", "policy", "kas_type"] for key in required_caps):
        deselected.append(item)
        item.add_marker(pytest.mark.skip(
            reason=f"Profile '{profile.id}' does not support encryption capabilities"
        ))
        continue
```
This block contains hardcoded logic for the no-kas profile. This logic should be data-driven and defined within the profile's configuration to improve maintainability and align with the framework's design. The pytest_collection_modifyitems hook should interpret the profile's rules rather than containing profile-specific code.
```python
gaps = []

# Check requirement coverage
expected_requirements = ["BR-101", "BR-102", "BR-301", "BR-302", "BR-303"]
```
…e and improved error handling; introduced BDD tests for authorization-only scenarios; updated no-KAS profile with explicit capabilities; added mock implementations and steps for GetDecisions API.
- Replaced `requirements.txt` with `pyproject.toml` for dependency management using `uv`. - Introduced `run.py` script to centralize test setup, execution, and platform management. - Added `BDD` and `framework` modernized test suites with detailed README guides. - Updated CI workflows for Python 3.13 compatibility and `uv` integration. - Enhanced documentation to streamline and simplify testing workflows.
- Removed `requirements.txt` in favor of `pyproject.toml` for improved dependency handling. - Added `uv` dependency caching to optimize CI workflow performance. - Updated `.github/workflows/xtest.yml` to support `uv` and Python 3.13. - Enhanced README to include updated `uv` virtual environment creation instructions.
…onment and handle dependencies more effectively.
…ths and update dependency installation to use `requirements.txt` with `uv pip sync`.
…flows/xtest.yml`.
- Add logic to conditionally use `requirements.txt` from either the current branch or the main branch. - Implement error handling if no `requirements.txt` is found.
- Refine messaging for `requirements.txt` source (`next main` and `legacy main checkout`). - Add installation of `GitPython` when using `legacy main checkout`.
…flow - Removed `pwd` and `ls -la` commands from `.github/workflows/xtest.yml`. - Updated Python executable path to use the virtual environment in dependency resolution logic.
* Fix shellcheck * Add pytest, ruff, and black to "dev" group * Run dev tooling with 'uv' * Exit zero, despite ruff errors * Exit zero, despite black errors * Add 'pyyaml' dependency * Remove unused imports
* Remove 'project.optional-dependencies' table * Use 'uv run' for tooling
- Introduced a `clean` subcommand for `run.py` to clean up the test environment safely, including stopping the platform and removing untracked files (with exclusions). - Updated `run.py` to use the virtual environment's Python interpreter for improved portability. - Adjusted paths in `xtest/conftest.py` for consistency within `xtest` directory. - Updated README to reflect the new execution model using `./run.py` directly.
…ecution in `xtest` - Updated `.gitignore` to reflect new temporary file handling strategy with `pytest`. - Integrated session-scoped `tmp_path` management, replacing legacy hardcoded directories (`tmp/`). - Enhanced `run.py` to support parallel test execution with configurable worker counts - Added `pytest-xdist` dependency to `pyproject.toml` for parallelization. - Streamlined directory management in fixtures (`conftest.py`) for better test isolation and cleanup. - Updated documentation in `DESIGN.md` to explain the revised temporary file management strategy.
Introduce a root `conftest.py` for standard pytest configuration, including custom markers and CLI options. Centralize temporary files in a `work/` directory for improved IDE visibility and debugging. Migrate BDD tests to `pytest-bdd` for unified parallel execution with cross-SDK tests, enhance module structure, and update dependencies.
- Introduced `uv.lock` for dependency locking to ensure consistent development and deployment environments. - Updated dependencies to their latest compatible versions.
- Implemented a Go-based test helper server to facilitate integration testing for ABAC scenarios. - Introduced key management endpoints like attribute creation, namespace operations, and key assignments. - Added a Python `OpentdfHttpClient` for interacting with the test helper server, replacing subprocess overhead. - Improved test performance by integrating with the new HTTP-based approach.
Introduced a test helper HTTP server to simplify test workflows and enhanced error handling for certain subprocess calls. Also updated dependencies and improved the cleaning script by targeting specific generated artifacts instead of using `git clean`.
- Added Go, Java, and Node.js-based test helper servers to streamline cross-SDK integration testing. - Implemented endpoints for encryption, decryption, and policy management. - Improved performance and reduced reliance on subprocess calls in testing workflows.
…ak readiness checks - Improved SDK server build and start logic with better error handling. - Added Keycloak readiness validation to ensure smooth service provisioning. - Updated encryption/decryption flow in Go SDK server to support base64 input/output. - Simplified Python `run.py` script for managing test workflows and service setup.
Dependency Review: The following issues were found:
- Introduced an SSL context in `run.py` to bypass certificate verification for Keycloak readiness checks. - Enhanced Keycloak health check to handle 302 redirects as a valid response. - Updated README to include the new `./run.py clean` command.
- Added `filterwarnings` block in `pyproject.toml` to ignore warnings related to unknown pytest marks and gherkin deprecation.
- Added `stream_output` parameter to `run_command` to support real-time output streaming, particularly for pytest runs. - Updated logic to utilize virtual environment activation where necessary. - Refined pytest suite execution with clear type-specific logging. - Removed redundant `-q` flag from pytest `addopts` in `pyproject.toml` for better test output visibility.
… workflows - Introduced multi-KAS testing flow with dynamic configuration via `start_multi_kas` in `run.py`. - Added `generate-keys.sh` script for automated key generation in multi-KAS setups. - Created new `multi-kas` profile configuration files (`config.yaml`, `capabilities.yaml`) for split-key testing. - Refactored logging configuration to suppress verbose output and standardize levels across modules. - Improved cross-SDK tests with updated parameter names for clarity. - Modified HTTP client retry strategy to disable retries and enhance error responsiveness.
…eration - Introduced profile-specific `opentdf.yaml` loading for flexible configuration. - Added automatic KAS key generation using `generate-keys.sh`. - Created `multi-kas` profile with detailed setup instructions and testing support.
- Deleted `bdd/` test environment setup and test scripts, including associated utilities and fixtures. - Removed `run_bdd_tests.py`, including its setup logic for `behave` and artifact generation. - Extracted evidence management logic into a new `ArtifactManager` and `EvidenceManager` in `framework/core/evidence.py` for modularity and reuse. - Updated `DESIGN.md` to reflect changes in architecture and focus on a leaner testing philosophy.
This pull request introduces significant enhancements to the BDD test framework, focusing on deterministic test execution, profile-based configuration, artifact management, and integration with external test management tools. The main changes include the addition of a comprehensive requirements document, a new environment setup script for BDD tests, example feature files and step definitions, and configuration templates for TestRail integration.
Framework and Test Infrastructure Enhancements:
- `bdd/environment.py` to provide robust test environment setup, including service locator, deterministic time and randomness controllers, profile management, tag parsing, scenario skipping logic based on capabilities, and automated evidence/artifact collection for each scenario.
- `bdd/features/framework_demo.feature` as a demo feature file, showcasing framework integration scenarios for service resolution, deterministic controllers, and profile loading.
- `bdd/features/steps/common_steps.py` with reusable step definitions for profile-based scenario skipping, especially for scenarios requiring KAS (Key Access Service).
- `bdd/features/steps/__init__.py` to mark the steps directory as a Python package.

Documentation and Configuration:
REQUIREMENTS.md, a detailed requirements and acceptance criteria document for the test framework modernization initiative, covering business and functional requirements, artifact management, integrations, and success metrics..env.examplewith environment variable templates for TestRail configuration, performance, caching, and BDD settings.Example Test Coverage Report
Generated: 2025-08-14 19:04:32
Executive Summary
Test Suite Summary
Requirements Coverage
SDK Compatibility Matrix
Cross-SDK test coverage (encryption → decryption):
Capability Coverage
- Auth_Type
- Encryption
- Feature
- Feature2
- Format
- Framework
- Kas_Type
- Policy
- Sdk
Report generated by OpenTDF Test Framework