Releases: defog-ai/defog
v1.0.0
This release stabilizes the defog-python library and significantly expands its capabilities across data extraction, agent orchestration, database support, and multimodal interactions.
TLDR
The Defog Python library has undergone a significant architectural transformation to provide greater autonomy and flexibility:
- Removed Defog API dependencies: The library now operates independently, without requiring a Defog API key. SQL generation happens directly through LLM APIs (OpenAI, Anthropic, etc.), giving you more control over your AI providers.
- Local data storage: Configuration and data are now stored locally in your home directory instead of on remote services, enhancing privacy and reducing external dependencies.
- Streamlined architecture: CLI components and static files have been removed to create a more focused, lightweight library.
- Backward compatibility: While these changes are substantial, backward compatibility is maintained through deprecation warnings for previously required parameters, allowing a smooth transition to the new architecture.
These changes represent a fundamental shift toward a more independent, flexible library that puts you in control of your SQL generation workflow while reducing external dependencies.
Improvements
This release includes several architectural improvements and code quality enhancements that strengthen the foundation of our Python SDK:
- Unified Orchestrator Architecture: We've consolidated multiple orchestrator classes into a single, configurable Orchestrator class. This simplification maintains backward compatibility while making enhanced features available through configuration options rather than requiring different implementations.
- Modernized LLM Provider System: The internal architecture for LLM providers has been refactored to reduce code duplication and improve maintainability, resulting in a more efficient codebase with the same functionality.
- Updated AI Model Support: Gemini Pro 2.5 has been updated from preview to stable, ensuring you have access to the latest stable AI model capabilities.
Data Extraction Framework
- Comprehensive Data Extraction Suite: Extract structured data from multiple sources with our new extractors:
  - TextDataExtractor: Identify and extract structured data from unstructured text documents using multiple LLM providers
  - HTMLDataExtractor: Extract tables, lists, product cards, and key-value pairs from HTML content, with automatic image data extraction from embedded visuals
  - ImageDataExtractor: Convert charts, tables, and visual data from images into structured, analyzable data
  - PDFDataExtractor: Transform PDF documents into SQL-friendly structured data with proper typing and descriptive field names
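To make the extraction idea concrete, here is a minimal stdlib-only sketch of pulling table rows out of HTML. The real HTMLDataExtractor is LLM-driven and handles far more structures; this only shows the target shape of the output:

```python
from html.parser import HTMLParser

class TableRowCollector(HTMLParser):
    """Collect <table> rows as lists of cell strings (illustration only;
    not defog's HTMLDataExtractor)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # completed rows
        self._row = None      # row currently being built
        self._in_cell = False
        self._cell = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

html = "<table><tr><th>name</th><th>price</th></tr><tr><td>widget</td><td>9.99</td></tr></table>"
parser = TableRowCollector()
parser.feed(html)
# parser.rows == [["name", "price"], ["widget", "9.99"]]
```

Structured rows like these are what make downstream steps, such as loading into SQL, straightforward.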
Enhanced Agent Capabilities
- Advanced Multi-Agent Orchestration: Build sophisticated AI systems with our new orchestration framework, which enables:
  - Automatic creation of specialized subagents for complex tasks
  - Parallel execution by default for significantly improved performance
  - Shared context storage and cross-agent memory sharing
  - Configurable reasoning depth via the new `reasoning_effort` parameter
  - Rich, color-coded logging for better visibility into agent operations
  - Robust error handling with automatic retries and fallback mechanisms
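Parallel-by-default subagent execution can be sketched with a thread pool, since LLM-backed subagent calls are I/O-bound. Here `run_subagent` is a stand-in function, not a defog API:

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str) -> str:
    """Stand-in for an LLM-backed subagent call; real calls spend most
    of their time waiting on network I/O, so concurrency helps."""
    return f"done: {task}"

tasks = ["summarize report", "extract tables", "draft SQL"]
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() preserves input order even though tasks run concurrently
    results = list(pool.map(run_subagent, tasks))
```

With independent subtasks, total latency approaches that of the slowest subagent rather than the sum of all of them.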
Database Connectivity Expansion
- New Database Connectors: Connect to more database types with our expanded support:
  - SQLite: Enable zero-setup local development and testing
  - DuckDB: Query and analyze data with this high-performance analytical database
- SQL Agent Tools: Convert natural language to SQL for local databases, including PostgreSQL, MySQL, BigQuery, Snowflake, Databricks, SQL Server, and Redshift
- Local SQL Generation: Generate SQL queries without sending data to external APIs, enhancing privacy and control
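Zero-setup local execution with SQLite needs nothing beyond the Python standard library. In the sketch below, the SQL string stands in for what an LLM would generate from a question like "total sales by region"; only the question and schema would go to the LLM, while the data stays local:

```python
import sqlite3

# In-memory database: no server, no files, no setup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0)],
)

# Stand-in for LLM-generated SQL; the rows themselves never leave the process.
sql = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
rows = conn.execute(sql).fetchall()
# rows == [("east", 100.0), ("west", 250.0)]
```

The same pattern extends to DuckDB for analytical workloads, with the connection line swapped out.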
Multimodal and Provider Support
- Comprehensive Image Support:
  - Send images alongside text in chat messages across all major LLM providers
  - Return images from tools with automatic format detection for different providers
  - Process visual content with a unified API that handles multiple image formats
- Expanded LLM Provider Support:
  - Mistral AI: Full support for all Mistral models with function calling and structured outputs
  - DeepSeek: Dedicated provider with proper support for function calling and JSON mode
  - Consistent cost tracking across all providers
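Under the hood, cross-provider image support comes down to packaging image bytes into each provider's message format. The sketch below follows the base64 content-block shape used by Anthropic-style message APIs; treat the exact key layout as an assumption and check your provider's documentation:

```python
import base64

def image_block(data: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes as a base64 content block.

    The key layout mirrors Anthropic-style message content; other
    providers use different shapes, which is exactly the variation a
    unified API abstracts away.
    """
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }

block = image_block(b"\x89PNG\r\n", "image/png")
```

A unified layer would build the right shape per provider from the same `(bytes, media_type)` input.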
Content Analysis Tools
- YouTube Transcript Tool: Convert videos to timestamped transcripts with speaker identification
- PDF Analysis Tool: Process documents with smart features like input caching and automatic chunking
- Schema Documentation: Automatically generate and save table and column descriptions in databases
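Automatic chunking for long documents is typically a sliding window with overlap, so text at a chunk boundary is shared with the next chunk and no passage loses its context. A minimal character-based version (sizes here are arbitrary for the example; defog's actual chunking strategy may differ):

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks; the overlap repeats the tail
    of each chunk at the head of the next for context continuity."""
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

parts = chunk_text("abcdefghij", size=4, overlap=2)
# parts == ["abcd", "cdef", "efgh", "ghij", "ij"]
```

In practice, chunking on token counts or sentence boundaries rather than raw characters gives better results with LLMs, but the windowing logic is the same.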
Each enhancement is designed to make complex AI and data operations more accessible, efficient, and powerful while maintaining backward compatibility with existing code.