Historical Data Management OSS-Fuzz SDK Implementation #1150
- Add comprehensive data models for build, crash, corpus, and coverage history
- Implement HistoricalSummary model for aggregated statistics
- Add specialized error classes for SDK configuration and validation
- Include proper type hints and Pydantic validation

- Extend storage adapters with history-specific functionality
- Add support for time-series data storage and retrieval
- Implement environment variable utilities for configuration
- Improve error handling and logging in storage operations

- Add abstract HistoryManager base class with common functionality
- Implement BuildHistoryManager for build statistics and trends
- Add CoverageHistoryManager for coverage data analysis
- Include data validation and storage abstraction
- Add comprehensive logging and error handling

- Implement CorpusHistoryManager for corpus growth analysis
- Add CrashHistoryManager for crash tracking and statistics
- Include duplicate detection and data validation
- Complete the historical data management infrastructure

- Add OSSFuzzSDK class as main entry point for historical data
- Implement project report generation and analysis features
- Add fuzzing efficiency analysis and health scoring
- Include environment configuration and error handling
- Provide unified interface for all history managers

- Export OSSFuzzSDK and history managers in package __init__
- Add data models and error classes to public API
- Maintain backward compatibility with existing exports
- Complete integration of historical data functionality

- Add test suite for OSSFuzzSDK main functionality
- Include tests for all history managers (build, crash, corpus, coverage)
- Test configuration, error handling, and edge cases
- Ensure proper integration with storage and data validation
- Add mocking for external dependencies
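For readers skimming the commits, here is a minimal sketch of how an abstract HistoryManager with build and crash subclasses could fit together. It is not the code from this PR: the method names (`validate`, `store`, `all_records`, `success_rate`), the record fields (`success`, `signature`), and the in-memory list standing in for the storage layer are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class HistoryManager(ABC):
    """Hypothetical base class; keeps records in memory instead of real storage."""

    def __init__(self) -> None:
        self._records: List[Dict[str, Any]] = []

    @abstractmethod
    def validate(self, record: Dict[str, Any]) -> None:
        """Raise ValueError if the record is malformed."""

    def store(self, record: Dict[str, Any]) -> bool:
        """Validate and append a record; returns True when it was stored."""
        self.validate(record)
        self._records.append(record)
        return True

    def all_records(self) -> List[Dict[str, Any]]:
        """Return a copy so callers cannot mutate the history."""
        return list(self._records)


class BuildHistoryManager(HistoryManager):
    def validate(self, record: Dict[str, Any]) -> None:
        # 'success' is an assumed field name, not taken from the PR.
        if "success" not in record:
            raise ValueError("build record needs a 'success' field")

    def success_rate(self) -> float:
        """Fraction of successful builds, or 0.0 when there is no history."""
        records = self.all_records()
        if not records:
            return 0.0
        return sum(1 for r in records if r["success"]) / len(records)


class CrashHistoryManager(HistoryManager):
    def validate(self, record: Dict[str, Any]) -> None:
        # 'signature' is an assumed deduplication key.
        if "signature" not in record:
            raise ValueError("crash record needs a 'signature' field")

    def store(self, record: Dict[str, Any]) -> bool:
        """Skip crashes whose signature was already recorded."""
        self.validate(record)
        if any(r["signature"] == record["signature"] for r in self._records):
            return False
        self._records.append(record)
        return True
```

The shared `store` path keeps validation in one place, while the crash manager overrides it only to add the duplicate check.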
/gcbrun exp -n zewei -m vertex_ai_gemini-2-5-flash-chat -ag -b quick-test
Pull Request Overview
This PR implements a comprehensive Historical Data SDK for OSS-Fuzz, providing unified access to historical fuzzing data with specialized managers for builds, crashes, corpus, and coverage analysis. The implementation includes storage infrastructure, data models, and extensive testing capabilities.
Key Changes:
- Introduces the main OSSFuzzSDK facade class for unified historical data access
- Adds specialized history managers for builds, crashes, corpus, and coverage data
- Extends storage infrastructure with history-specific operations and multiple backend support
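To make the facade idea concrete, the following is a rough sketch of one entry point delegating to per-category history objects. The class name `OSSFuzzSDKSketch`, the `_InMemoryHistory` stand-in, and the `generate_project_report` method with its report keys are hypothetical and may not match the interface in ossfuzz_py/core/ossfuzz_sdk.py.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class _InMemoryHistory:
    """Stand-in for a history manager; the real managers would use storage."""

    records: List[dict] = field(default_factory=list)

    def get_history(self) -> List[dict]:
        return list(self.records)


class OSSFuzzSDKSketch:
    """Hypothetical facade: one object exposing all four history categories."""

    def __init__(self, project_name: str) -> None:
        if not project_name:
            raise ValueError("project_name is required")
        self.project_name = project_name
        self.build = _InMemoryHistory()
        self.crash = _InMemoryHistory()
        self.corpus = _InMemoryHistory()
        self.coverage = _InMemoryHistory()

    def generate_project_report(self) -> Dict[str, object]:
        """Aggregate simple counts from every manager into one report."""
        return {
            "project": self.project_name,
            "build_count": len(self.build.get_history()),
            "crash_count": len(self.crash.get_history()),
            "corpus_count": len(self.corpus.get_history()),
            "coverage_count": len(self.coverage.get_history()),
        }
```

Callers only hold the facade, so the individual managers can change without touching client code.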
Reviewed Changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
ossfuzz_py/utils/env_vars.py | Adds environment variables for historical data storage configuration |
ossfuzz_py/unittests/test_local_builder_pipeline.py | Updates path resolution for benchmark YAML file |
ossfuzz_py/unittests/test_historical_data_sdk.py | Comprehensive test suite for new SDK functionality |
ossfuzz_py/unittests/test_cloud_builder_pipeline.py | Updates path resolution for benchmark YAML file |
ossfuzz_py/history/*.py | New history manager classes and base functionality |
ossfuzz_py/errors/*.py | New error types for historical data operations |
ossfuzz_py/data/storage_*.py | Extended storage infrastructure with history operations |
ossfuzz_py/core/ossfuzz_sdk.py | Main SDK facade implementation |
ossfuzz_py/core/data_models.py | New data models for historical data structures |
ossfuzz_py/__init__.py | Updates to public API exports |
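As an illustration of the environment-based storage configuration listed for ossfuzz_py/utils/env_vars.py, a small loader could look like the sketch below. The variable names `EXAMPLE_HISTORY_BACKEND` and `EXAMPLE_HISTORY_PATH`, the defaults, and the function `load_history_config` are invented for this example; the PR's actual variable names are not shown in this conversation.

```python
import os
from typing import Dict, Mapping, Optional

# Backends mentioned in the PR description are file and GCS; the string
# values used here are assumptions.
_ALLOWED_BACKENDS = ("local", "gcs")


def load_history_config(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Read hypothetical history-storage settings from the environment."""
    env = os.environ if environ is None else environ
    backend = env.get("EXAMPLE_HISTORY_BACKEND", "local")
    if backend not in _ALLOWED_BACKENDS:
        raise ValueError(f"unsupported history backend: {backend!r}")
    return {
        "backend": backend,
        "path": env.get("EXAMPLE_HISTORY_PATH", "./history"),
    }
```

Passing the mapping explicitly keeps the loader easy to test without patching `os.environ`.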
…atures
- Update cloud builder pipeline tests for new SDK integration
- Modify local builder pipeline tests to work with enhanced functionality
- Ensure backward compatibility and proper error handling
- Fix any test conflicts with new historical data features
0a02fd8 to 29484f1
/gcbrun exp -n zewei -m vertex_ai_gemini-2-5-flash-chat -ag -b quick-test
Summary
This PR introduces a comprehensive Historical Data SDK for OSS-Fuzz, providing a unified interface for accessing, storing, and analyzing historical fuzzing data. The SDK enables researchers and developers to track fuzzing progress over time, analyze trends, and generate detailed reports across builds, crashes, corpus, and coverage data.
Features
- OSSFuzzSDK class providing unified access to all historical data functionality
- BuildHistoryManager - Build history, success rates, and artifacts
- CrashHistoryManager - Crash data, deduplication, and analysis
- CorpusHistoryManager - Corpus growth, statistics, and effectiveness
- CoverageHistoryManager - Coverage data, trends, and reporting
- StorageManager - Unified storage backend management
- StorageAdapter - Abstract interface with file and GCS implementations
Testing
- test_historical_data_sdk.py with comprehensive coverage of the new SDK functionality
- Existing tests (test_cloud_builder_pipeline.py, test_local_builder_pipeline.py) updated to use proper path resolution
None - This is a purely additive feature that extends the existing SDK without modifying existing APIs or functionality.
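The StorageAdapter abstraction with a file backend, combined with the time-series storage mentioned in the commits, might be sketched roughly as follows. The method names `append_history` and `get_history`, the JSON Lines layout, and the `FileStorageAdapter` class are assumptions for illustration rather than the PR's real interface; a GCS backend would implement the same two methods.

```python
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List, Optional


class StorageAdapter(ABC):
    """Hypothetical abstract interface for history storage backends."""

    @abstractmethod
    def append_history(self, category: str, name: str,
                       entry: Dict[str, Any]) -> None:
        """Append one time-series entry for a project under a category."""

    @abstractmethod
    def get_history(self, category: str, name: str,
                    limit: Optional[int] = None) -> List[Dict[str, Any]]:
        """Return entries in insertion order, optionally only the last N."""


class FileStorageAdapter(StorageAdapter):
    """Stores each series as a JSON Lines file: <root>/<category>/<name>.jsonl."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def _path(self, category: str, name: str) -> Path:
        return self.root / category / f"{name}.jsonl"

    def append_history(self, category: str, name: str,
                       entry: Dict[str, Any]) -> None:
        path = self._path(category, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(entry) + "\n")

    def get_history(self, category: str, name: str,
                    limit: Optional[int] = None) -> List[Dict[str, Any]]:
        path = self._path(category, name)
        if not path.exists():
            return []
        with path.open(encoding="utf-8") as fh:
            entries = [json.loads(line) for line in fh if line.strip()]
        if limit is None:
            return entries
        # Guard against limit <= 0, since entries[-0:] would return everything.
        return entries[-limit:] if limit > 0 else []
```

Appending one line per entry keeps writes cheap and preserves ordering, which is usually what trend analysis over builds or coverage needs.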