Alpha Release (v0.5): This library is in active development. While already functional and useful for real-world applications, some APIs may change, and additional cleanup and feature development are in progress. Feedback and contributions are welcome!
A new example, examples/debate_cyclical_improvement.py, submits a query, generates an answer, and then runs an iterative loop: a judging model critiques and scores the answer, and the answer is improved based on that critique. The loop ends when the judge rates the answer 10/10 or max_iterations is reached. Sample output from the final iteration:
RATING: 10
EXPLANATION: This response is comprehensive, well-organized, and addresses all aspects of the original query. It provides an informed perspective on the likelihood of achieving AGI in the next decade, potential societal implications, and various factors to consider, such as economic impacts, global disparities, intermediate milestones, biases in development, and ethical guidelines. The response is also written in a clear, concise manner that makes it easy for readers to understand. Overall, this is an exceptional response that deserves a perfect score of 10.
Rating: 10.0/10
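The critique-and-improve loop described above can be sketched as follows. This is a minimal illustration, not the code in examples/debate_cyclical_improvement.py: the function names (parse_rating, improve_until_target) and the three callables standing in for model calls are hypothetical, and the only detail taken from the sample output is the "RATING: <n>" line format.

```python
import re
from typing import Callable, Optional, Tuple

# Matches score lines such as "RATING: 10" or "Rating: 10.0/10".
RATING_RE = re.compile(r"RATING:\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_rating(critique: str) -> Optional[float]:
    """Extract the numeric score from a critique, or None if no score is present."""
    match = RATING_RE.search(critique)
    return float(match.group(1)) if match else None


def improve_until_target(
    query: str,
    answer_fn: Callable[[str], str],            # hypothetical: produces the first answer
    critique_fn: Callable[[str, str], str],     # hypothetical: judge returns "RATING: n ..."
    improve_fn: Callable[[str, str, str], str],  # hypothetical: rewrites using the critique
    max_iterations: int = 5,
    target: float = 10.0,
) -> Tuple[str, Optional[float], int]:
    """Critique and improve an answer until it scores `target` or the cap is reached.

    Returns the final answer, the rating of that answer, and the iterations used.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    answer = answer_fn(query)
    rating: Optional[float] = None
    for iteration in range(1, max_iterations + 1):
        critique = critique_fn(query, answer)
        rating = parse_rating(critique)
        if rating is not None and rating >= target:
            break
        # Skip the rewrite on the last pass so the returned rating matches the answer.
        if iteration < max_iterations:
            answer = improve_fn(query, answer, critique)
    return answer, rating, iteration
```

In a real run the three callables would wrap model calls; passing them in keeps the loop itself easy to test with canned critiques.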
AI Ensemble Suite is a Python framework that lets multiple GGUF language models collaborate on tasks using structured communication patterns and aggregates their outputs into coherent responses. GGUF is the first supported format; support for OpenAI-compatible servers is planned, which would allow local LLM servers and Internet APIs to be called.
- Installation
- Quick Start
- Key Features
- Ensembles Explained
- Why Ensembles Work
- Why This Library?
- Examples
- Decision Guide: Choosing Collaboration & Aggregation Patterns
- Collaboration Phases
- Aggregation Strategies
- YAML Configuration and Template Guide
- Architecture Diagram
- Class Hierarchy and Interaction Diagram
- Project File Structure
- API Reference
- Requirements
- Contributing
- Running Tests
- Roadmap
- License
- Special Thanks
- Python >= 3.10
- Sufficient RAM for running multiple language models
- Preferably a GPU with enough VRAM to hold multiple models; otherwise llama-cpp will keep the models in RAM and run inference on the CPU
For GPU acceleration with CUDA, set the build variables before installing: use export on Linux (first set of commands) and SET on Windows (second set). See the optional dependencies below before installing.
export CMAKE_ARGS="-DGGML_CUDA=on"
export FORCE_CMAKE=1
pip install ai-ensemble-suite
SET CMAKE_ARGS="-DGGML_CUDA=on"
SET FORCE_CMAKE=1
pip install ai-ensemble-suite
For Apple Metal (macOS):
CMAKE_ARGS="-DGGML_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python
pip install ai-ensemble-suite
The library has optional dependencies that can be installed with:
pip install "ai-ensemble-suite[rich]" # For enhanced console output
pip install "ai-ensemble-suite[dev]" # For development tools
To install the package with development dependencies:
git clone https://github.com/StephenGenusa/ai-ensemble-suite.git
cd ai-ensemble-suite
pip install -e ".[dev]"
Decide which GGUF models you want to work with, download them, and place them in a models/ directory under the project root. Alternatively, change the YAML configuration to point to wherever your models are stored.
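Before pointing a configuration at the models directory, it can help to confirm which GGUF files are actually there. The helper below is a small sketch using only the standard library; the function name find_gguf_models and the default "models" path are illustrative assumptions, not part of the library.

```python
from pathlib import Path
from typing import List


def find_gguf_models(models_dir: str = "models") -> List[Path]:
    """Return the .gguf files directly under `models_dir`, sorted by path.

    Returns an empty list if the directory does not exist.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(
        path for path in root.iterdir()
        if path.is_file() and path.suffix.lower() == ".gguf"
    )
```

For example, `for path in find_gguf_models(): print(path.name)` lists the model files you can reference from the YAML.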
Here's a minimal example to get started with AI Ensemble Suite:
import asyncio

from ai_ensemble_suite import Ensemble


async def main() -> None:
    # Initialize the ensemble with a configuration
    ensemble = Ensemble(config_path="path/to/config.yaml")
    # Initialize models (loads them into memory)
    await ensemble.initialize()
    # Ask a question
    try:
        response = await ensemble.ask("What are the key considerations for sustainable urban planning?")
        print(response)
    finally:
        # Release resources when done
        await ensemble.shutdown()

    # Alternatively, use the async context manager
    async with Ensemble(config_path="path/to/config.yaml") as ensemble:
        response = await ensemble.ask("What are the key considerations for sustainable urban planning?")
        print(response)


# `await` is only valid inside a coroutine, so run everything through asyncio.run()
asyncio.run(main())
- Multiple Collaboration Phases: Choose from over a dozen different collaboration methods to design your ensemble approach
- Specialized Model Roles: Assign different roles (critic, synthesizer, researcher, etc.) to models, allowing them to specialize in different aspects of task completion
- Flexible Aggregation Strategies: Use different methods for combining model outputs into coherent, high-quality responses
- Advanced Tracing: Get detailed traces of model interactions and decision-making processes
- Extensible Design: Easily add custom collaboration phases or aggregation strategies
- Built for GGUF Models: Optimized for running multiple smaller GGUF language models locally
- Confidence Estimation: Token probability analysis and self-evaluation capabilities
- Concurrent Model Management: Efficient loading and execution of multiple models
- Async-First Design: Native async/await support with context manager
- YAML Configuration for Models: YAML configuration files hold model parameters and queries
- Jinja2 Templating: The "templates" (query) portion of the YAML configuration file supports Jinja2
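The "Concurrent Model Management" and "Async-First Design" items can be illustrated with a standard asyncio pattern: blocking inference calls are pushed to a thread pool so the event loop stays free. The API Reference describes ModelManager.run_inference as using a thread pool, but the code below is only a sketch of that general pattern. blocking_generate, ask_all, and the shape of the returned dictionary are hypothetical stand-ins, not the library's internals.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from typing import Any, Dict, List


def blocking_generate(model_id: str, prompt: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for a blocking, CPU/GPU-bound model call."""
    time.sleep(0.05)  # simulate inference latency
    return {"model_id": model_id, "text": prompt.upper(), "params": kwargs}


async def run_inference(
    executor: ThreadPoolExecutor, model_id: str, prompt: str, **kwargs: Any
) -> Dict[str, Any]:
    """Run the blocking call in a worker thread so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so bind kwargs with partial.
    call = partial(blocking_generate, model_id, prompt, **kwargs)
    return await loop.run_in_executor(executor, call)


async def ask_all(prompt: str, model_ids: List[str], max_workers: int = 2) -> List[Dict[str, Any]]:
    """Query several models concurrently; results come back in `model_ids` order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        tasks = [
            run_inference(executor, model_id, prompt, temperature=0.2)
            for model_id in model_ids
        ]
        return await asyncio.gather(*tasks)
```

Calling `asyncio.run(ask_all("hello", ["model_a", "model_b"]))` runs both stand-in calls in parallel worker threads and returns one result dictionary per model.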
Think of AI ensembles like getting advice from a group of experts instead of just one person. Just as you might ask several friends for opinions before making an important decision, AI ensembles combine the strengths of multiple AI systems or approaches to produce better results than any single AI could achieve alone. Instead of relying on one AI that might have blind spots or make certain types of mistakes, ensembles bring together diverse AI perspectives that can check each other's work, complement each other's strengths, and collectively arrive at more reliable answers. It's similar to how a team of doctors with different specialties might collaborate on a difficult medical case: the combined expertise leads to better outcomes than what any individual doctor could provide on their own.
Brief Explanations of AI Ensemble Techniques
- AsyncThinking: Like a rapid brainstorming session where multiple people jot down ideas independently before sharing, this technique generates diverse initial thoughts quickly without influence from other perspectives.
- StructuredCritique: Similar to having your work reviewed by a tough but fair editor, this approach systematically evaluates ideas, identifies logical flaws, and improves the rigor of thinking.
- SynthesisOriented: Acts like a skilled mediator who finds common ground between opposing viewpoints, integrating different perspectives into a balanced, comprehensive analysis.
- RoleBasedDebate: Resembles a panel discussion with experts from different fields, each contributing specialized knowledge to address complex topics from multiple angles.
- HierarchicalReview: Works like a multi-stage editing process, where content is refined layer by layer, with each review focusing on different aspects for progressive improvement.
- CompetitiveEvaluation: Functions like a contest where multiple solutions compete, and the strongest approach wins based on objective criteria.
- PerspectiveRotation: Similar to walking around a sculpture to view it from all sides, this technique examines issues from different stakeholder perspectives, ethical frameworks, or creative angles.
- ChainOfThoughtBranching: Like mapping out a complex maze with multiple possible paths, this method explores different reasoning routes for problems with multiple decision points.
- AdversarialImprovement: Acts as a stress-test or devil's advocate, actively looking for weaknesses in a solution to strengthen it against potential problems.
- RoleBasedWorkflow: Operates like a production line with specialized stations, creating a structured process where different roles handle specific aspects of a multi-stage analysis.
- Bagging: Works like taking the average of multiple poll results to get a more stable prediction, reducing the impact of outliers or unusual patterns.
- UncertaintyBasedCollaboration: Similar to how a group might work together on a puzzle with missing pieces, this approach handles ambiguous questions by combining different levels of confidence.
- StackedGeneralization: Functions like a team of specialists with a coordinator, where outputs from different AI models are combined to leverage their unique strengths and minimize weaknesses.
Ensemble methods have a strong theoretical and empirical foundation in machine learning, and they're particularly effective with language models for several reasons:
- Diverse Knowledge and Perspectives: Different language models, even when trained on similar data, develop slightly different internal representations and "expertise areas." By combining multiple models, you access a broader knowledge base than any single model contains.
- Error Reduction Through Aggregation: Models tend to make different errors. When their outputs are combined intelligently, errors from one model can be corrected by others, leading to more accurate results.
- Specialization Through Roles: When models adopt specialized roles (like critic, researcher, or synthesizer), they can focus on specific aspects of a task. This division of cognitive labor mirrors effective human teams and leads to more thorough analysis.
- Iterative Refinement: Multi-step collaboration allows initial ideas to be critiqued, refined, and expanded. This resembles human drafting and editing processes, typically producing higher quality results than single-pass generation.
- Confidence Calibration: Ensemble techniques help identify areas of uncertainty or disagreement between models, leading to better-calibrated confidence in the final output.
Research on ensemble methods has repeatedly found that well-designed ensembles can outperform their strongest individual member, sometimes by a significant margin. AI Ensemble Suite provides the infrastructure to tap into these techniques.
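The "Error Reduction Through Aggregation" point can be made concrete with a small calculation. The sketch below assumes an idealized setting that real language models only approximate: each of n models answers a yes/no question correctly with the same probability p, and their errors are independent. Under those assumptions, the chance that a strict majority is correct follows from the binomial distribution. The function name is illustrative.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n independent voters is correct.

    Each voter is assumed to be correct with probability p, independently
    of the others. n must be odd so that ties cannot occur.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    threshold = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )
```

With p = 0.7, five voters reach roughly 0.837 accuracy compared with 0.7 for one. The gain shrinks when model errors are correlated, and it reverses when p is below 0.5, which is one reason model diversity matters.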
AI Ensemble Suite was created to address two key challenges:
- The Need for Human-Friendly Ensemble AI: After extensive searching for a comprehensive yet easy-to-use library for ensemble AI work that followed a "for humans" philosophy, nothing quite fit the bill. This library makes it easy to harness multiple smaller language models on local machines to produce enhanced AI responses.
- Structured Collaboration Patterns: Rather than just averaging model outputs, AI Ensemble Suite implements sophisticated collaboration patterns where models can critique, refine, and extend each other's work, resulting in higher quality responses that benefit from diverse model strengths.
The framework is designed with simplicity in mind for common use cases while providing comprehensive customization for advanced users.
Additionally, this project served as a meta-challenge: building a medium-sized Python library with AI assistance despite context window limitations, and developing techniques to work around those constraints.
The library includes several example scripts demonstrating different ensemble techniques:
- Basic Usage: Non-ensembled simple usage with default configurations
- Structured Debate: Models present opposing viewpoints to refine a conclusion
- Expert Committee: Specialized models contribute domain expertise
- Hierarchical Review: Progressive refinement through layers of specialized models
- Competitive Evaluation: Multiple solutions generated and evaluated against criteria
- Perspective Rotation: Problem analyzed through different framing lenses
- Chain-of-Thought Branching: Reasoning paths that branch and reconverge
- Adversarial Improvement: One model finds flaws in another's reasoning
- Role-based Workflow: Models adopt complementary roles in a structured process
- Bagging: Combines models to reduce prediction variance
Note on Examples: Some example files require additional libraries not included in requirements.txt. Several examples also generate graphic charts to disk, which were implemented to help visualize how the ensemble builds the final result. These visualization features are optional but helpful for testing and understanding the process.
When you need... | Use this collaboration pattern | Best for |
---|---|---|
Quick independent analyses | AsyncThinking | Simple questions, brainstorming, diverse initial ideas |
Critical evaluation of ideas | StructuredCritique | Evaluating arguments, finding flaws, improving rigor |
Balanced perspectives | SynthesisOriented | Finding common ground, integrating viewpoints, balanced analysis |
Multiple specialist perspectives | RoleBasedDebate | Complex topics requiring multiple forms of expertise |
Progressive improvement | HierarchicalReview | Content requiring layer-by-layer refinement or fact-checking |
Competition between solutions | CompetitiveEvaluation | Generating multiple solutions and selecting the best one |
Examining from different angles | PerspectiveRotation | Ethical analysis, stakeholder considerations, creative ideation |
Complex reasoning paths | ChainOfThoughtBranching | Mathematical problems, logic puzzles, decision trees |
Stress-testing solutions | AdversarialImprovement | Finding edge cases, improving robustness, anticipating objections |
Structured workflow process | RoleBasedWorkflow | Research projects, content creation, multi-stage analysis |
Stabilizing volatile outputs | Bagging | Reducing variance, improving prediction stability |
Handling uncertainty | UncertaintyBasedCollaboration | Questions with ambiguity, calibrating confidence |
Model stacking | StackedGeneralization | Leveraging strengths of different model types, boosting performance |
When you need... | Use this aggregation strategy | Best for |
---|---|---|
To prioritize some models over others | WeightedVoting | When certain models perform better for specific tasks |
To use the final result of a sequence | SequentialRefinement | When using phases that progressively refine content |
To choose the most confident output | ConfidenceBased | When models have reliable confidence estimation |
To evaluate along multiple criteria | MultidimensionalVoting | Complex evaluation requiring different quality dimensions |
To blend multiple perspectives | EnsembleFusion | Creating a coherent synthesis from diverse inputs |
Dynamic strategy selection | AdaptiveSelection | When different queries benefit from different aggregation approaches |
AI Ensemble Suite implements various collaboration phases that can be combined or used individually:
- AsyncThinking: Models work independently on a problem before combining insights
- Integration/Refinement: Models refine responses based on feedback and insights
- ExpertCommittee: Final processing/structuring of model outputs before aggregation
- StructuredCritique: Models evaluate others' responses using structured formats
- SynthesisOriented: Models focus on finding common ground and integrating perspectives
- RoleBasedDebate: Models interact according to assigned specialized roles
- HierarchicalReview: Content is progressively reviewed by models in a hierarchical structure
- CompetitiveEvaluation: Models are pitted against each other in a competition
- PerspectiveRotation: Models iterate on a problem by assuming different perspectives
- ChainOfThoughtBranching: Models trace through multiple reasoning paths
- AdversarialImprovement: Models improve a solution by seeking its weaknesses
- RoleBasedWorkflow: Models function in specialized roles like researcher, analyst, and writer
- Bagging: Models process different variations of the same input
- UncertaintyBasedCollaboration: Uncertainty measurements guide model interactions
- StackedGeneralization: Base models process input, then a meta-model combines their outputs
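Two of the phases above are simple enough to sketch with plain asyncio. The code below is not the library's implementation: it treats each model as a hypothetical async callable from prompt to text, and shows the idea behind AsyncThinking (independent parallel answers) and StackedGeneralization (a meta-model combining the base answers). The prompt wording passed to the meta-model is likewise made up for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical model interface: an async function from prompt text to answer text.
ModelFn = Callable[[str], Awaitable[str]]


async def async_thinking(query: str, models: Dict[str, ModelFn]) -> Dict[str, str]:
    """Send the same query to every model concurrently; return answers by model id."""
    names = list(models)
    answers = await asyncio.gather(*(models[name](query) for name in names))
    return dict(zip(names, answers))


async def stacked_generalization(
    query: str, base_models: Dict[str, ModelFn], meta_model: ModelFn
) -> str:
    """Let base models answer independently, then ask a meta-model to combine them."""
    drafts = await async_thinking(query, base_models)
    listing = "\n".join(f"[{name}] {text}" for name, text in drafts.items())
    prompt = (
        f"Question: {query}\n"
        f"Candidate answers:\n{listing}\n"
        "Write one combined answer."
    )
    return await meta_model(prompt)
```

Because no model sees another model's draft during the first step, the base answers stay independent, which is the property AsyncThinking is meant to preserve.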
The library provides several strategies for aggregating the outputs from multiple models:
- WeightedVoting: Models are assigned different weights based on performance or expertise
- SequentialRefinement: Assumes phases run in a sequence where later phases refine earlier ones
- ConfidenceBased: Selects output with the highest confidence score
- MultidimensionalVoting: Evaluates outputs along multiple dimensions
- EnsembleFusion: Uses a model to synthesize multiple outputs into one
- AdaptiveSelection: Dynamically selects and executes another aggregation strategy
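To make the first and third strategies concrete, here is a minimal sketch of weighted voting and confidence-based selection as plain functions. It is an illustration under simplifying assumptions, not the library's aggregators: answers are compared by exact match after trimming and lowercasing (so it suits short answers, not free-form text), and the input shapes are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Tuple


def weighted_vote(answers: Dict[str, str], weights: Dict[str, float]) -> str:
    """Return the normalized answer backed by the largest total model weight.

    `answers` maps model id to answer text; models missing from `weights`
    count with weight 1.0.
    """
    totals: Dict[str, float] = defaultdict(float)
    for model_id, answer in answers.items():
        totals[answer.strip().lower()] += weights.get(model_id, 1.0)
    return max(totals, key=lambda answer: totals[answer])


def most_confident(outputs: Dict[str, Tuple[str, float]]) -> str:
    """Return the text of the output with the highest confidence score.

    `outputs` maps model id to a (text, confidence) pair.
    """
    best_model = max(outputs, key=lambda model_id: outputs[model_id][1])
    return outputs[best_model][0]
```

With answers {"m1": "Paris", "m2": "Lyon", "m3": "paris "} and weights of 1.0, 1.5 and 1.0, the two agreeing models total 2.0 and outvote the single heavier model.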
Ensemble (main user-facing interface)
    Attributes:           config_manager, model_manager, template_manager,
                          _initialized, _initialization_lock
    Core methods:         __init__(), ask(), configure(), __aenter__(), __aexit__()
    Interaction methods:  initialize() -> ModelManager.initialize()
                          shutdown() -> ModelManager.shutdown()
                          _execute_collaboration_phases()
                          _aggregate_results()

Connections (top to bottom):
    Ensemble               -> ConfigManager (YAML config), ModelManager (GGUF models), TemplateManager
    ModelManager           -> Model Registry
    ConfigManager          -> BaseAggregator (abstract), BaseCollaborationPhase (abstract)
    Model Registry         -> BaseCollaborationPhase (abstract), TraceCollector
    BaseAggregator         -> Aggregation Implementations
    BaseCollaborationPhase -> Collaboration Phase Implementations

TraceCollector: tracing system (records all collaboration steps)

Aggregation Implementations:
    WeightedVoting, SequentialRefinement, ConfidenceBased,
    MultidimensionalVoting, EnsembleFusion, AdaptiveSelection

Collaboration Phase Implementations:
    AsyncThinking, Integration, ExpertCommittee, HierarchicalReview
    ChainOfThoughtBranching, CompetitiveEvaluation, PerspectiveRotation,
    AdversarialImprovement, RoleBasedWorkflow
    BaseDebate: StructuredCritique, SynthesisOriented, RoleBasedDebate
    Bagging, UncertaintyBasedCollaboration, StackedGeneralization
Exception Hierarchy

AiEnsembleSuiteError (Base)
├── ModelError
├── ConfigurationError
│   └── ValidationError
├── CollaborationError
└── AggregationError
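Expressed as Python classes, the exception hierarchy in the diagram would look roughly like the sketch below. The class names follow the exception list in the class diagram, and the placement of ValidationError under the configuration error is read off the hierarchy diagram. The actual definitions live in exceptions/errors.py and may carry extra fields or behavior, so treat this as an approximation. The describe helper is hypothetical and only shows how the hierarchy lets callers handle errors at different levels of detail.

```python
class AiEnsembleSuiteError(Exception):
    """Base class for all errors in this sketch of the hierarchy."""


class ModelError(AiEnsembleSuiteError):
    """Problems loading or running a model."""


class ConfigurationError(AiEnsembleSuiteError):
    """Problems with the supplied configuration."""


class ValidationError(ConfigurationError):
    """A configuration value failed validation."""


class CollaborationError(AiEnsembleSuiteError):
    """A collaboration phase failed."""


class AggregationError(AiEnsembleSuiteError):
    """An aggregation strategy failed."""


def describe(error: Exception) -> str:
    """Hypothetical helper: map an error to a coarse category for logging."""
    # The subclass check must come before any check for the shared base class.
    if isinstance(error, ConfigurationError):
        return "configuration problem"
    if isinstance(error, AiEnsembleSuiteError):
        return "ensemble runtime problem"
    return "unrelated error"
```

Because ValidationError derives from ConfigurationError, a handler for configuration errors also catches validation failures.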
ENSEMBLE CLASS (main user-facing interface & orchestrator)
    Public API:          initialize(), shutdown(), ask(), configure()
    Core control flow:   _execute_collaboration_phases(), _aggregate_results(), _get_phase_class()
    Context management:  __aenter__(), __aexit__(), _initialization_lock

ConfigManager (config handling)
    load(), update(), validate(), get_collaboration_mode(), get_aggregation_strategy()

ModelManager (model loading/inference)
    initialize(), shutdown(), get_model(), run_inference()

TemplateManager (prompt template management)
    get_template(), render_template()

Model Registry (loaded GGUF models)

COLLABORATION PHASES
    Simple phases:   AsyncThinking, Integration, ExpertCommittee, HierarchicalReview
    Complex phases:  CompetitiveEvaluation, PerspectiveRotation, ChainOfThoughtBranching,
                     AdversarialImprovement, RoleBasedWorkflow
    Debate types:    StructuredCritique, SynthesisOriented, RoleBasedDebate
    Machine learning-oriented phases:
                     Bagging, UncertaintyBasedCollaboration, StackedGeneralization

AGGREGATION STRATEGIES
    WeightedVoting, SequentialRefinement, ConfidenceBased,
    MultidimensionalVoting, EnsembleFusion, AdaptiveSelection

UTILITY COMPONENTS
    TraceCollector (execution tracing)
    Logger (structured logging)
    Exceptions: AiEnsembleSuiteError, ConfigurationError, ModelError,
                CollaborationError, AggregationError, ValidationError

Connections:
    Ensemble             -> ConfigManager, ModelManager, TemplateManager
    ModelManager         -> Model Registry
    ConfigManager        -> AGGREGATION STRATEGIES
    COLLABORATION PHASES -> AGGREGATION STRATEGIES
ai_ensemble_suite/
├── __init__.py                      # Package exports (Ensemble class)
├── ensemble.py                      # Main Ensemble class implementation
│
├── config/
│   ├── __init__.py                  # Configuration package exports
│   ├── config_manager.py            # Core configuration handling
│   ├── defaults.py                  # Default config values
│   ├── schema.py                    # JSON Schema for config validation
│   ├── template_manager.py          # Manages prompt templates
│   └── utils.py                     # Configuration utilities
│
├── models/
│   ├── __init__.py                  # Model package exports
│   ├── model_manager.py             # Core model management
│   ├── llm_interface.py             # Common LLM interface
│   ├── gguf_model.py                # GGUF format model implementation
│   ├── llama_cpp.py                 # llama.cpp specific implementation
│   └── metadata.py                  # Model metadata handling
│
├── collaboration/
│   ├── __init__.py                  # Collaboration package exports
│   ├── base.py                      # BaseCollaborationPhase abstract class
│   ├── async_thinking.py            # Independent parallel thinking
│   ├── integration.py               # Result integration approach
│   ├── expert_committee.py          # Expert committee pattern
│   ├── hierarchical_review.py       # Hierarchical review pattern
│   ├── competitive_evaluation.py    # Competitive evaluation pattern
│   ├── perspective_rotation.py      # Perspective rotation pattern
│   ├── chain_of_thought.py          # Chain-of-thought implementation
│   ├── adversarial_improvement.py   # Adversarial improvement pattern
│   ├── role_based_workflow.py       # Role-based workflow pattern
│   ├── structured_debate.py         # Structured debate with subtypes
│   ├── bagging.py                   # ML-inspired bagging approach
│   ├── uncertaintybased.py          # Uncertainty-based collaboration
│   └── stackedgeneralization.py     # Stacked generalization approach
│
├── aggregation/
│   ├── __init__.py                  # Aggregation package exports
│   ├── base.py                      # BaseAggregator abstract class
│   ├── weighted_voting.py           # Weighted voting implementation
│   ├── sequential_refinement.py     # Sequential refinement pattern
│   ├── confidence_based.py          # Confidence-based aggregation
│   ├── multidimensional_voting.py   # Multi-dimensional voting
│   ├── ensemble_fusion.py           # Ensemble fusion approach
│   └── adaptive_selection.py        # Adaptive selection strategy
│
├── exceptions/
│   ├── __init__.py                  # Exception exports
│   └── errors.py                    # Custom exception definitions
│
└── utils/
    ├── __init__.py                  # Utilities package exports
    ├── logging.py                   # Logging configuration
    ├── tracing.py                   # Execution tracing (TraceCollector)
    ├── concurrency.py               # Concurrency utilities
    ├── prompt_tools.py              # Prompt manipulation utilities
    └── validators.py                # Validation utilities
class Ensemble:
    """Coordinates the collaboration of multiple AI models for complex tasks."""

    def __init__(self, config_path: Optional[str] = None, config_dict: Optional[Dict[str, Any]] = None) -> None:
        """Initialize the Ensemble orchestration layer."""

    async def initialize(self) -> None:
        """Load models and prepare the ensemble for processing queries."""

    async def shutdown(self) -> None:
        """Release resources used by the ensemble."""

    async def ask(self, query: str, **kwargs: Any) -> Union[str, Dict[str, Any]]:
        """Process a query through the configured collaboration and aggregation pipeline."""

    def configure(self, config_dict: Dict[str, Any]) -> None:
        """Update the ensemble's configuration dynamically."""
class ModelManager:
    """Manages the loading, execution, and lifecycle of GGUF models."""

    def __init__(self, config_manager: ConfigProvider, max_workers: Optional[int] = None) -> None:
        """Initialize the ModelManager."""

    async def initialize(self) -> None:
        """Initialize the ModelManager: Instantiates models and loads them asynchronously."""

    async def shutdown(self) -> None:
        """Shutdown the ModelManager: Unloads models and shuts down the executor."""

    async def run_inference(self, model_id: str, prompt: str, **kwargs: Any) -> Dict[str, Any]:
        """Run inference on a specific model using its generate method via thread pool."""
class BaseCollaborationPhase(ABC):
    """Abstract base class for collaboration phases."""

    def __init__(self, model_manager: "ModelManager", config_manager: "ConfigManager", phase_name: str) -> None:
        """Initialize the collaboration phase."""

    @abstractmethod
    async def execute(self, query: str, context: Dict[str, Any], trace_collector: Optional[TraceCollector] = None) -> Dict[str, Any]:
        """Execute the collaboration phase."""
class BaseAggregator(ABC):
    """Abstract base class for aggregation strategies."""

    def __init__(self, config_manager: "ConfigManager", strategy_name: str, model_manager: Optional["ModelManager"] = None, strategy_config_override: Optional[Dict[str, Any]] = None) -> None:
        """Initialize the aggregator."""

    @abstractmethod
    async def aggregate(self, outputs: Dict[str, Dict[str, Any]], context: Dict[str, Any], trace_collector: Optional[TraceCollector] = None) -> Dict[str, Any]:
        """Aggregate the outputs from collaboration phases."""
Contributions are welcome! Please feel free to submit a Pull Request.
- Follow PEP 8 and PEP 484 (Type Hinting)
- Use Google Python Style Guide for formatting and documentation
- Apply Google Docstrings for all modules, classes, and functions
- Write tests for new features
- Format code using black (configured in pyproject.toml)
- Run type checking with mypy
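For contributors adding a new collaboration phase or aggregation strategy, the sketch below shows the general shape of a subclass with type hints and docstrings. To stay self-contained it defines stand-in base classes that mirror the signatures listed in the API Reference; in real code you would import the library's own base classes instead. The example classes (EchoModelManager, DraftAndReviewPhase, LastPhaseAggregator) and the dictionary keys "text", "output", "draft", "response" and "source_phase" are all hypothetical; the library's actual result schema may differ.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class BaseCollaborationPhase(ABC):
    """Stand-in mirroring the documented phase interface (not the library class)."""

    def __init__(self, model_manager: Any, config_manager: Any, phase_name: str) -> None:
        self.model_manager = model_manager
        self.config_manager = config_manager
        self.phase_name = phase_name

    @abstractmethod
    async def execute(
        self, query: str, context: Dict[str, Any], trace_collector: Optional[Any] = None
    ) -> Dict[str, Any]:
        """Execute the collaboration phase."""


class BaseAggregator(ABC):
    """Stand-in mirroring the documented aggregator interface (not the library class)."""

    def __init__(
        self,
        config_manager: Any,
        strategy_name: str,
        model_manager: Optional[Any] = None,
        strategy_config_override: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.config_manager = config_manager
        self.strategy_name = strategy_name
        self.model_manager = model_manager
        self.strategy_config = strategy_config_override or {}

    @abstractmethod
    async def aggregate(
        self,
        outputs: Dict[str, Dict[str, Any]],
        context: Dict[str, Any],
        trace_collector: Optional[Any] = None,
    ) -> Dict[str, Any]:
        """Aggregate the outputs from collaboration phases."""


class EchoModelManager:
    """Hypothetical model manager whose run_inference echoes the prompt."""

    async def run_inference(self, model_id: str, prompt: str, **kwargs: Any) -> Dict[str, Any]:
        return {"text": f"{model_id} says: {prompt}"}


class DraftAndReviewPhase(BaseCollaborationPhase):
    """Hypothetical phase: one model drafts an answer, a second reviews the draft."""

    async def execute(
        self, query: str, context: Dict[str, Any], trace_collector: Optional[Any] = None
    ) -> Dict[str, Any]:
        draft = await self.model_manager.run_inference("drafter", query)
        review = await self.model_manager.run_inference(
            "reviewer", f"Review this draft: {draft['text']}"
        )
        return {"output": review["text"], "draft": draft["text"]}


class LastPhaseAggregator(BaseAggregator):
    """Hypothetical strategy: return the output of the last phase that ran."""

    async def aggregate(
        self,
        outputs: Dict[str, Dict[str, Any]],
        context: Dict[str, Any],
        trace_collector: Optional[Any] = None,
    ) -> Dict[str, Any]:
        if not outputs:
            raise ValueError("no phase outputs to aggregate")
        last_phase = next(reversed(outputs))  # dicts preserve insertion order
        return {"response": outputs[last_phase].get("output", ""), "source_phase": last_phase}
```

A new subclass only needs to implement the single abstract method, which keeps custom phases and strategies small and easy to unit test with a fake model manager like the one above.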
Note: Tests are currently broken. They were written early in development and are now out of date. I will update them later.
# Install test dependencies
pip install ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=ai_ensemble_suite
- Refinement and cleanup of the current codebase
- Rewriting the existing tests due to API changes from early code
- Adding OpenAI API compatibility for local LLM servers like LM Studio, Ollama
- Support for Internet API providers
This project is licensed under the MIT License - see the LICENSE file for details.
Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible.