=== Data Machine ===
Contributors: chubes4
Tags: ai, automation, content, workflow, pipeline
Requires at least: 6.2
Tested up to: 6.8
Requires PHP: 8.0
Stable tag: 0.1.2
License: GPLv2 or later
License URI: https://www.gnu.org/licenses/gpl-2.0.html
AI-first WordPress plugin for content processing workflows with visual pipeline builder and multi-provider AI integration.
Features:
- Tool-First AI with enhanced multi-turn conversation management and duplicate detection
- Visual Pipeline Builder with drag & drop interface and status detection
- Multi-Provider AI (OpenAI, Anthropic, Google, Grok, OpenRouter) with 5-tier directive system
- REST API Flow Trigger Endpoint with authentication and error handling
- Centralized Engine Data Architecture with unified filter access pattern
- Enhanced Unified Handler Filter System with shared functionality patterns
- AIStepConversationManager with Turn Tracking, temporal context, and validation
- AIStepToolParameters with buildForHandlerTool() and centralized parameter building
- Clean Content Processing with structured data packet architecture
- Modular WordPress Publish Handler with specialized processing components
- Universal Handler Settings Template with metadata-based auth detection
- Complete AutoSave System with cache invalidation
- Performance Optimizations: 50% query reduction in handler settings operations
- Advanced Centralized Cache Management with granular controls and action-based architecture
Requirements: WordPress 6.2+, PHP 8.0+, Composer (for development)
Pipeline+Flow: Pipelines are reusable templates, Flows are configured instances
Example: WordPress Content Enhancement System
- Pipeline Template: Fetch → AI → Update (defines workflow structure with system prompt "You are a content optimizer. Analyze existing WordPress content and enhance it with better SEO, readability, and comprehensive information using research tools.")
- Flow A: WordPress Local (old blog posts) → AI + Google Search tool → WordPress Update (weekly)
- Flow B: WordPress Local (draft pages) → AI + Local Search tool → WordPress Update (daily)
- Flow C: WordPress Local (product pages) → AI + WebFetch tool → WordPress Update (bi-weekly)
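The template/instance split in the example above can be pictured with plain PHP arrays. This is only an illustrative data model: the array keys (`steps`, `source`, `tools`, `schedule`) and the `describe_flow()` helper are hypothetical and do not reflect how the plugin stores pipelines or flows.

```php
<?php
// Illustrative data model only: keys and helper are assumptions for this sketch,
// not Data Machine's storage format.
$pipeline = [
    'name'  => 'Content Optimizer',
    'steps' => ['Fetch', 'AI', 'Update'], // step structure shared by every flow
];

// Each flow reuses the same steps but picks its own source, tools, and schedule.
$flows = [
    ['name' => 'Flow A', 'source' => 'old blog posts', 'tools' => ['Google Search'], 'schedule' => 'weekly'],
    ['name' => 'Flow B', 'source' => 'draft pages',    'tools' => ['Local Search'],  'schedule' => 'daily'],
    ['name' => 'Flow C', 'source' => 'product pages',  'tools' => ['WebFetch'],      'schedule' => 'bi-weekly'],
];

// Render one flow as a single line combining template and instance settings.
function describe_flow(array $pipeline, array $flow): string
{
    return sprintf(
        '%s (%s): %s on %s with %s, %s',
        $flow['name'],
        $pipeline['name'],
        implode(' -> ', $pipeline['steps']),
        $flow['source'],
        implode(', ', $flow['tools']),
        $flow['schedule']
    );
}

foreach ($flows as $flow) {
    echo describe_flow($pipeline, $flow), PHP_EOL;
}
```

Changing the `steps` entry once would change the structure of all three flows, which is the point of keeping the template separate from its instances.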
Development:
- Clone to `/wp-content/plugins/data-machine/`
- Run `composer install`
- Activate plugin
- Configure AI provider at Settings → Data Machine
Production:
- Run `./build.sh` to create `/dist/data-machine.zip`
- Install via WordPress admin
- Configure AI provider and tools
Google Search (optional):
- Create Custom Search Engine + get API key
- Add credentials at Settings → Data Machine → Tool Configuration
- Free tier: 100 queries/day
OAuth Providers:
- Twitter: OAuth 1.0a
- Reddit/Facebook/Threads/Google Sheets: OAuth2
- Bluesky: App Password
Auth via `/dm-oauth/{provider}/` popup flow.
Example: Document Processing
- Create Pipeline Template: "Document Processing" (Fetch → AI → Publish)
- Add System Prompt: "Extract key insights and create structured WordPress posts with proper headings, summaries, and tags"
- Create Flow Instance: Files handler → AI → WordPress handler
- Configure Flow: Upload PDFs, set scheduling, configure WordPress settings
- Result: Automatic WordPress posts with clean formatting and taxonomy
Content Enhancement: Pipeline (Fetch → AI → Update) + Flow (WordPress Local → AI + tools → WordPress Update)
- Template defines step structure, flow selects specific handlers and tools
- Uses `source_url` from engine data to target specific content
Document Processing: Pipeline (Fetch → AI → Publish) + Flow (Files → AI + tools → WordPress)
- Template provides workflow, flow configures file handling and publishing destination
- Flow-isolated file storage with automatic cleanup
Research Workflows: Pipeline (Fetch → AI → Publish) + Flow (Google Sheets → AI + WebFetch → WordPress)
- Template structures workflow, flow defines data source and research tools
- Multi-turn AI conversations for complex content creation
Multi-Platform Publishing: Pipeline (Fetch → AI → Publish → AI → Publish) + Flow Configuration
- Template structures sequential publishing workflow
- Flow configures RSS/Reddit → AI → Twitter → AI → Facebook publishing chain
- Engine data maintains source attribution throughout workflow
WordPress Content Enhancement: Pipeline (Fetch → AI → Update) + Multiple Enhancement Flows
- Pipeline: "Content Optimizer" (Fetch → AI → Update)
- Flow A: WordPress Local (old posts) → AI + Google Search tool → WordPress Update (weekly SEO refresh)
- Flow B: WordPress Local (draft content) → AI + WebFetch tool → WordPress Update (research enhancement)
- Flow C: WordPress Local (product pages) → AI + Local Search + WordPress Post Reader → WordPress Update (internal linking)
Automated News Publishing: Pipeline (Fetch → AI → Publish) + Multiple Source Flows
- Pipeline: "News Feed" (Fetch → AI → Publish)
- Flow A: TechCrunch RSS → AI → WordPress (hourly tech news)
- Flow B: Reddit r/webdev → AI → WordPress (daily development updates)
- Flow C: Industry Google Sheets → AI → WordPress (weekly reports)
Note: Update workflows require `source_url` (provided by fetch handlers or AI tools like Local Search/WordPress Post Reader). AI tools enable multi-turn conversations for complex research and analysis tasks.
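Because an update step has nothing to modify without a target, a flow might check for `source_url` before running. The sketch below assumes engine data is available as an associative array with an optional `source_url` key; that shape and the `has_update_target()` helper are assumptions for illustration, not the plugin's API.

```php
<?php
// Hypothetical guard: returns true only when engine data carries a usable source_url.
// The array shape is an assumption; the plugin exposes engine data via its own filter.
function has_update_target(array $engine_data): bool
{
    $url = $engine_data['source_url'] ?? '';

    // FILTER_VALIDATE_URL returns the URL on success and false otherwise.
    return is_string($url) && filter_var($url, FILTER_VALIDATE_URL) !== false;
}

var_dump(has_update_target(['source_url' => 'https://example.com/old-post/']));
var_dump(has_update_target(['image_url' => 'https://example.com/a.png']));
```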
For detailed examples and technical specifications, see CLAUDE.md
// Pipeline creation and execution
$pipeline_id = apply_filters('dm_create_pipeline', null, ['pipeline_name' => 'My Pipeline']);
$step_id = apply_filters('dm_create_step', null, ['step_type' => 'fetch', 'pipeline_id' => $pipeline_id]);
$flow_id = apply_filters('dm_create_flow', null, ['pipeline_id' => $pipeline_id]);
do_action('dm_run_flow_now', $flow_id, 'manual');
// AI integration
$response = apply_filters('ai_request', [
    'messages' => [['role' => 'user', 'content' => $prompt]],
    'model' => 'gpt-5-mini'
], 'openai');
Trigger flow execution via REST API endpoint (`POST /wp-json/dm/v1/trigger`):
# Trigger any flow via REST API
curl -X POST https://example.com/wp-json/dm/v1/trigger \
  -H "Content-Type: application/json" \
  -u username:application_password \
  -d '{"flow_id": 123}'

# Success Response
{
  "success": true,
  "flow_id": 123,
  "flow_name": "My Flow",
  "message": "Flow triggered successfully."
}

# Error Response (403 Forbidden)
{
  "code": "rest_forbidden",
  "message": "You do not have permission to trigger flows.",
  "data": {"status": 403}
}

# Error Response (404 Not Found)
{
  "code": "invalid_flow",
  "message": "Flow not found.",
  "data": {"status": 404}
}
Implementation: `inc/Engine/Rest/Trigger.php`
Requirements: WordPress application password or cookie authentication with `manage_options` capability
Context: Triggers flows via `dm_run_flow_now` action with `'rest_api_trigger'` context
For complete REST API documentation, see docs/api-reference/rest-api.md. For technical specifications, see CLAUDE.md.
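The same trigger call can be made from PHP on another server. The following is a minimal sketch using only core PHP functions (no WordPress required); the helper names `build_trigger_request()` and `send_trigger_request()` are made up for this example, and the endpoint, payload, and Basic authentication mirror the curl example rather than any additional documented behavior.

```php
<?php
// Build the request pieces for POST /wp-json/dm/v1/trigger.
// Helper names are hypothetical; the endpoint and payload follow the curl example.
function build_trigger_request(string $site_url, int $flow_id, string $user, string $app_password): array
{
    return [
        'url'     => rtrim($site_url, '/') . '/wp-json/dm/v1/trigger',
        'method'  => 'POST',
        'headers' => [
            'Content-Type: application/json',
            // HTTP Basic auth, equivalent to curl's -u username:application_password
            'Authorization: Basic ' . base64_encode($user . ':' . $app_password),
        ],
        'body'    => json_encode(['flow_id' => $flow_id]),
    ];
}

// Send the request and return the decoded JSON body, or null on transport/parse failure.
function send_trigger_request(array $request): ?array
{
    $context = stream_context_create([
        'http' => [
            'method'        => $request['method'],
            'header'        => implode("\r\n", $request['headers']),
            'content'       => $request['body'],
            'ignore_errors' => true, // still read the body on 403/404 responses
            'timeout'       => 15,
        ],
    ]);

    $raw = @file_get_contents($request['url'], false, $context);
    if ($raw === false) {
        return null;
    }

    $decoded = json_decode($raw, true);
    return is_array($decoded) ? $decoded : null;
}

$request = build_trigger_request('https://example.com', 123, 'username', 'application_password');
echo $request['url'], PHP_EOL;
```

Passing `$request` to `send_trigger_request()` would perform the call; a caller could then branch on the `success` or `code` keys seen in the responses above.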
Complete extension framework supporting Fetch, Publish, Update handlers, AI tools, and Database services with filter-based auto-discovery.
See CLAUDE.md for development guides and technical specifications.
Fetch Sources:
- Local/remote files
- RSS feeds (timeframe/keyword filtering)
- Reddit posts (timeframe/keyword filtering)
- WordPress Local (timeframe/keyword filtering)
- WordPress Media (with parent post content integration, timeframe/keyword filtering)
- WordPress API (timeframe/keyword filtering)
- Google Sheets
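Several of the sources above support timeframe/keyword filtering. The behavior of the plugin's own filters is not specified here, so the following is only a standalone sketch of what such filtering might look like. The function names, the comma-separated keyword format, and the "match any keyword" rule are all assumptions.

```php
<?php
// Hypothetical helpers; these are NOT the plugin's dm_keyword_search_match or
// dm_timeframe_limit filters, only an illustration of the idea.

// True when the content contains at least one keyword (case-insensitive).
// An empty keyword list is treated as "no filtering".
function matches_any_keyword(string $content, string $keywords): bool
{
    $terms = array_filter(
        array_map('trim', explode(',', $keywords)),
        static fn(string $term): bool => $term !== ''
    );

    if ($terms === []) {
        return true;
    }

    foreach ($terms as $term) {
        if (stripos($content, $term) !== false) {
            return true;
        }
    }

    return false;
}

// True when the item is no older than $max_age_seconds; 0 or less disables the limit.
function within_timeframe(int $item_timestamp, int $max_age_seconds, int $now): bool
{
    return $max_age_seconds <= 0 || ($now - $item_timestamp) <= $max_age_seconds;
}

var_dump(matches_any_keyword('WordPress 6.8 released', 'php, wordpress'));
var_dump(within_timeframe(time() - 7200, 3600, time()));
```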
Publish Destinations:
- Twitter, Bluesky, Threads, Facebook
- WordPress
- Google Sheets
Update Handlers:
- WordPress Update (existing post/page modification via source_url from engine data filter access)
AI Providers:
- OpenAI, Anthropic, Google, Grok, OpenRouter (200+ models)
General Tools:
- Google Search, Local Search
- WebFetch (50K character limit)
- WordPress Post Reader
Architecture Highlights:
- Centralized Engine Data:
  - `dm_engine_data` filter provides unified access to source_url, image_url
  - Clean separation between AI data packets and handler engine parameters
- Universal Handler Filters:
  - Shared functionality (`dm_timeframe_limit`, `dm_keyword_search_match`, `dm_data_packet`)
  - Eliminates code duplication across multiple handlers
- Tool-First AI Integration:
  - Multi-turn conversation management with `AIStepConversationManager`
  - Unified parameter building via `AIStepToolParameters`
- Modular WordPress Publisher:
  - Specialized components (`FeaturedImageHandler`, `TaxonomyHandler`, `SourceUrlHandler`)
  - Configuration hierarchy system
- Complete AutoSave System:
  - Single `dm_auto_save` action handles pipeline persistence, flow synchronization, and cache invalidation
- Filter-Based Discovery:
  - All components self-register via WordPress filters maintaining consistent architectural patterns
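The self-registration pattern can be sketched in a few lines. The hook name `example_handlers` and the handler array shape are hypothetical and chosen only for illustration; the `requires_auth` key echoes the metadata flag this readme mentions for auth detection. The small `add_filter()`/`apply_filters()` stand-ins exist only so the sketch runs outside WordPress and are far simpler than the real functions.

```php
<?php
// Minimal stand-ins so the sketch runs outside WordPress. Inside WordPress the
// real add_filter()/apply_filters() already exist and these are skipped.
if (!function_exists('add_filter')) {
    $GLOBALS['sketch_filters'] = [];

    function add_filter(string $hook, callable $callback): void
    {
        $GLOBALS['sketch_filters'][$hook][] = $callback;
    }

    function apply_filters(string $hook, $value)
    {
        foreach ($GLOBALS['sketch_filters'][$hook] ?? [] as $callback) {
            $value = $callback($value);
        }
        return $value;
    }
}

// Each component registers itself; 'example_handlers' is a hypothetical hook name.
add_filter('example_handlers', function (array $handlers): array {
    $handlers['rss'] = ['type' => 'fetch', 'label' => 'RSS', 'requires_auth' => false];
    return $handlers;
});

add_filter('example_handlers', function (array $handlers): array {
    $handlers['twitter'] = ['type' => 'publish', 'label' => 'Twitter', 'requires_auth' => true];
    return $handlers;
});

// Discovery: collect everything that registered, without instantiating anything.
$handlers = apply_filters('example_handlers', []);

// Metadata-based auth detection: read a flag instead of creating auth providers.
$needs_auth = array_keys(array_filter($handlers, fn(array $h): bool => $h['requires_auth']));

echo implode(', ', array_keys($handlers)), PHP_EOL;
echo implode(', ', $needs_auth), PHP_EOL;
```

Reading a flag from registered metadata is cheap, which is one plausible reason a design like this can avoid instantiating auth providers just to learn whether a handler needs credentials.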
All handlers are fully functional with OAuth authentication where required and comprehensive error handling
For detailed specifications, see CLAUDE.md
Use Cases:
- Content marketing automation
- News monitoring and alerts
- Document processing and extraction
- Social media management
- Content repurposing
- Research automation
- WordPress workflow integration
Pages: Pipelines, Flows, Jobs, Logs
Settings (WordPress Settings → Data Machine):
- Engine Mode (headless), page controls, tool toggles
- Site Context toggle (WordPress info injection)
- Job data cleanup on failure toggle (debugging)
- File retention settings (1-90 days)
- 5-Tier AI Directive System: Auto-registering directive classes with priority spacing for comprehensive AI context
- AIStepConversationManager: Multi-turn conversation state with turn tracking, chronological ordering, and duplicate detection
- AIStepToolParameters: Centralized parameter building with buildForHandlerTool() for unified tool execution
- Tool configuration (API keys, OAuth)
- WordPress defaults (post types, taxonomies, author, status)
- Three-layer tool management (global → modal → validation)
Features: Drag & drop, auto-save, status indicators, real-time monitoring
composer install # Development setup
composer test # Run tests (PHPUnit configured, test files not yet implemented)
./build.sh # Production build to /dist/data-machine.zip
Architecture:
- PSR-4 Autoloading: Composer-managed dependency structure
- Filter-Based Service Discovery: WordPress hooks for component registration
- Unified Handler Filter System:
  - Centralized cross-cutting filters (`dm_timeframe_limit`, `dm_keyword_search_match`, `dm_data_packet`)
- Centralized Engine Data: `EngineData.php` filter providing unified `dm_engine_data` access with clean AI data packets
- Centralized Cache System: Actions/Cache.php with comprehensive WordPress action-based clearing and granular methods
- 5-Tier AI Directive System: Auto-registering directive classes with priority spacing from PluginCoreDirective to SiteContextDirective
- Intelligent Tool Discovery: UpdateStep and PublishStep with exact handler matching and partial name matching
- Advanced Conversation Management: AIStepConversationManager with turn tracking and AIStepToolParameters for unified execution
- AutoSave System:
  - Complete pipeline persistence and flow synchronization
- Modular WordPress Publisher:
  - Components (`FeaturedImageHandler`, `TaxonomyHandler`, `SourceUrlHandler`) with configuration hierarchy
- Universal Handler Settings:
  - Template system with metadata-based auth detection (`requires_auth` flag)
  - Eliminates auth provider instantiation overhead
- Performance Optimizations:
  - Handler settings modal load: 50% query reduction (single flow config query, metadata-based auth check)
  - Handler settings save: 50% query reduction (memory-based config building)
  - AJAX status system: Flow-scoped (`FlowStatusAjax`) and pipeline-wide (`PipelineStatusAjax`) handlers
  - Status detection contexts: `pipeline_step_status` and `flow_step_status` for targeted checks
  - Composer-managed ai-http-client dependency
- REST API Integration:
  - Flow trigger endpoint with authentication and comprehensive error handling
  - Complete REST API documentation with integration examples
See CLAUDE.md for complete technical specifications.
GPL v2+ - License
Developer: Chris Huber
Documentation: CLAUDE.md