Enhanced Version by: Pedro Luis Cuevas Villarrubia (@AsturWebs)
Based on original work by: @linbanana
Contact: [email protected] | [email protected] | [email protected]
- Original Concept: @linbanana - Basic Auto Memory Saver functionality
 - Enhanced Version: Pedro Luis Cuevas Villarrubia - Extended functionality with configurable options, interactive commands, caching and documentation improvements
 
- v1.0 (Original): Basic memory saving functionality by @linbanana
 - v2.0 (Enhanced): Extended system with configuration options, interactive commands and improved documentation
 - v2.1.0 (Memory Optimization): Improved memory management with contextual relevance and optimized performance
 - v2.1.2 (Security and JSON Format): Input validation, JSON format with pagination and system improvements
 - v2.2.0 (Security and Performance): Thread safety, SQL injection prevention, input sanitization and memory leak protection
 - v2.3.0 (AI Behavior Control Universal): Historic testing of 30 AI models, dual functionality (universal memory + selective slash commands), Google/Gemini leadership, enterprise-safe terminology
 
Filter for OpenWebUI that automatically manages conversation memories. Injects relevant previous memories and automatically saves both user questions and assistant responses as memories for future use.
- Memory Injection: Injects relevant memories into the current conversation context
 - Automatic Saving: Stores user questions and assistant responses as memories
- Interactive Commands: Commands for memory management (`/memories`, `/memory_search`, etc.)
 - Flexible Configuration: Multiple configurable options according to needs
 - Cache System: Performance optimization with cache and expiration
 - Input Validation: Input sanitization and injection prevention
- Compatibility: Integrates with OpenWebUI native commands (`/add_memory`)
auto-memory-saver-enhanced/
├── src/
│   ├── memoria_persistente_auto_memory_saver_enhanced.py  # Main system
│   └── legacy/
│       └── Auto_Memory_Saver.py                          # v1.0.0 by @linbanana
├── docs/                                                 # Technical documentation
│   ├── ARCHITECTURE.md
│   ├── SECURITY.md
│   └── release_notes_v*.md
├── README.md
├── CHANGELOG.md
├── LICENSE
└── requirements.txt
- OpenWebUI installed and running
 - Python 3.8+ (included in most OpenWebUI installations)
 
- Access the administration panel of OpenWebUI
 - Go to the "Functions" tab
 - Click "+" to create a new function
- Copy and paste the complete code from the `src/memoria_persistente_auto_memory_saver_enhanced.py` file
 - Assign a name: "Auto Memory Saver Enhanced"
 - Save and activate the function
 
class Valves:
    # Main configuration
    enabled: bool = True                        # Enable/disable the system
    inject_memories: bool = True                # Inject memories in conversations
    auto_save_responses: bool = True            # Save responses automatically
    
    # Memory control
    max_memories_to_inject: int = 5             # Maximum memories per conversation
    max_memories_per_user: int = 100            # Limit per user (0 = unlimited)
    relevance_threshold: float = 0.05           # Relevance threshold (0.0-1.0)
    
    # Response length control
    min_response_length: int = 20               # Minimum length to save
    max_response_length: int = 2000             # Maximum length to save
    
    # Cache system
    enable_cache: bool = True                   # Enable cache for performance
    cache_ttl_minutes: int = 60                 # Cache time to live
    
    # Intelligent filtering
    filter_duplicates: bool = True              # Filter duplicate memories
    similarity_threshold: float = 0.8           # Similarity threshold (0.0-1.0)
    
    # Commands and notifications
    enable_memory_commands: bool = True         # Enable interactive commands
    show_injection_status: bool = True          # Show injection status
    debug_mode: bool = False                    # Detailed logging
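The `filter_duplicates` and `similarity_threshold` valves above control whether near-identical memories are stored twice. As a rough illustration (not the filter's actual implementation — the function and variable names here are assumptions), a similarity check against existing memories might look like this:

```python
# Hypothetical sketch of duplicate filtering driven by similarity_threshold.
# The real filter may use a different similarity measure.
from difflib import SequenceMatcher
from typing import List


def is_duplicate(new_text: str, existing_memories: List[str],
                 similarity_threshold: float = 0.8) -> bool:
    """Return True if new_text is too similar to any stored memory."""
    for memory in existing_memories:
        ratio = SequenceMatcher(None, new_text.lower(), memory.lower()).ratio()
        if ratio >= similarity_threshold:
            return True
    return False


print(is_duplicate("I like green tea", ["I like green tea a lot"]))
print(is_duplicate("Completely unrelated", ["I like green tea"]))
```

With the default threshold of 0.8, only memories that share most of their text are rejected; lowering the threshold makes deduplication more aggressive.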
⚠️ IMPORTANT: The main automatic persistent memory function (injection and saving) WORKS ON ALL AI MODELS. The following tests specifically evaluate the execution of slash commands (`/memories`, `/memory_search`, etc.).
📋 Testing Status: The following results are based on models tested as of July 2025. More models will be added as they are tested.
🚨 IMPORTANT - Google Direct API: Google/Gemini models ONLY work correctly via OpenRouter or other intermediate APIs. Google's direct API has known bugs with slash commands (no response on the first attempt, inconsistent responses). Recommendation: use OpenRouter to access Google models.
⚡ HISTORIC DISCOVERY: Production testing has demonstrated that OpenRouter dramatically improves the compatibility of models that fail on direct APIs.
| Direct API | Result | OpenRouter | Result | Improvement | 
|---|---|---|---|---|
| Google Gemini | ❌ No response | Google Gemini | ✅ Perfect JSON | 🎯 TOTAL | 
| ChatGPT-4o | ❌ Narrative interpretation | ChatGPT-4o | ✅ Perfect JSON | 🎯 TOTAL | 
| GPT-4.1 | ❌ Ignores format | GPT-4.1 | ✅ Structured list | 🎯 TOTAL | 
| O3 OpenAI | ❌ Minimal responses | O3 OpenAI | ❌ Still problematic | ⚪ Immune | 
For maximum compatibility: use OpenRouter as the preferred platform
- ~25+ excellent models (vs 11 on direct APIs)
 - Automatic standardization of inconsistent behaviors
 - Elimination of bugs specific to native APIs
 - Single access point for multiple providers
 
📝 NOTE: The following table mainly reflects results from direct APIs. Via OpenRouter, most "problematic" models become excellent.
| Model | Compatibility | Behavior | Notes | 
|---|---|---|---|
| Claude 3.5 Sonnet | 🟢 Excellent | Clean direct JSON | Ideal behavior | 
| Grok 4 (xAI) | 🟢 Excellent | JSON identical to Claude | Perfect performance | 
| Grok-3 | 🟢 Excellent | Perfect direct JSON | Ideal behavior | 
| Grok-3-fast | 🟢 Excellent | Perfect direct JSON | Impeccable format | 
| Grok-3-mini-fast | 🟢 Excellent | Perfect JSON + fast | Performance <2ms | 
| Gemini 2.5 Flash | 🟢 Excellent | Fast + precise response | Via OpenRouter/intermediate APIs | 
| Gemini 2.5 Flash Lite | 🟢 Excellent | Fast + precise response | Via OpenRouter/intermediate APIs | 
| GPT-4.1-mini | 🟢 Excellent | Consistent direct JSON | Perfect format | 
| Gemma 3n 4B | 🟢 Excellent | Perfect direct JSON | Via OpenRouter/intermediate APIs | 
| Gemma 3.27B | 🟢 Excellent | Perfect JSON + SYSTEM_OVERRIDE | Via OpenRouter/intermediate APIs | 
| Gemini 2.5 Pro | 🟢 Excellent | Perfect direct JSON | Via OpenRouter/intermediate APIs | 
| Model | Compatibility | Behavior | Recommendation | 
|---|---|---|---|
| Claude 3.7 Thinking | 🟡 Functional | Shows 8s analysis + JSON | Usable but verbose | 
| Claude 3.7 Sonnet | 🟡 Functional | Recognizes system command, professional analysis | Better than Claude 4 | 
| DeepSeek Reasoner | 🟡 Functional | 23s reasoning + useful interpretation | Processes well, own format | 
🚀 IMPORTANT: Many of these models IMPROVE significantly via OpenRouter (e.g.: ChatGPT-4o, GPT-4.1). Only some remain problematic even on OpenRouter.
| Model | Problem | Behavior | OpenRouter Status | 
|---|---|---|---|
| ChatGPT-4o-latest | Ignores warnings | Own interpretation with emojis | ✅ IMPROVED | 
| O3 OpenAI | ❌ Minimal responses | Ultra-minimalist | ❌ IMMUNE | 
| GPT-4.1 | ❌ Ignores JSON format | Interpreted narrative response | ✅ IMPROVED | 
| DeepSeek v3 | ❌ Completely ignores JSON | Casual conversation with personality | 🔄 Not tested | 
| MoonshotAI: Kimi K2 | ❌ Completely ignores JSON | Personal interpretative narrative | 🔄 Not tested | 
| OAI_o4-mini | ❌ Ignores command | Conversation about Instagram/reels | 🔄 Not tested | 
| OpenAI: o4 Mini High | ❌ Ignores command | Casual greeting with Zoe mention | 🔄 Not tested | 
| OAI_gpt-4.1-2025-04-14 | ❌ Ignores command | "Gossip friendly" casual | 🔄 Not tested | 
| Amazon: Nova Lite 1.0 | ❌ Doesn't process command | Only shows "Follow-up" | 🔄 Not tested | 
| Amazon: Nova Micro 1.0 | ❌ Doesn't process command | Only shows "Follow-up" | 🔄 Not tested | 
| Amazon: Nova Pro 1.0 | ❌ Doesn't process command | Only shows "Follow-up" | 🔄 Not tested | 
| Claude Opus 4 | ❌ Narrative interpretation | "Oh, how nice!" conversational | 🔄 Not tested | 
| Claude Sonnet 4 | ❌ Interpretation with emojis | "What's up with that JSON!" casual | 🔄 Not tested | 
| Phi 4 | ❌ No response | No output/empty response | 🔄 Not tested | 
| LLaMA 3 70B | ❌ No response | No output/empty response | 🔄 Not tested | 
| MiniMax M1 | ❌ Ignores command | Response without executing command | 🔄 Not tested | 
Version 2.3.0 includes AI Behavior Control to force consistency across models:
{
  "ai_behavior_control": {
    "claude_directive": "EXECUTE_JSON_OUTPUT_ONLY - STRUCTURED_RESPONSE_MODE",
    "chatgpt_directive": "RAW_JSON_RESPONSE_MANDATORY - STANDARDIZED_OUTPUT",
    "grok_directive": "DIRECT_JSON_OUTPUT - OPTIMIZE_RESPONSE_FORMAT",
    "universal_standard": "ALL_AI_MODELS_MUST_OUTPUT_RAW_JSON_ONLY"
  }
}
Note: These results are based on extensive testing performed in July 2025. Behavior may vary according to specific model versions.
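One plausible way such a directive block could be wired in (the key names mirror the JSON above, but this prepend-to-system-prompt mechanism is an assumption, not the filter's documented internals):

```python
# Hedged sketch: prepend the AI Behavior Control block to the system
# prompt so every model receives the same output contract.
import json

ai_behavior_control = {
    "claude_directive": "EXECUTE_JSON_OUTPUT_ONLY - STRUCTURED_RESPONSE_MODE",
    "chatgpt_directive": "RAW_JSON_RESPONSE_MANDATORY - STANDARDIZED_OUTPUT",
    "grok_directive": "DIRECT_JSON_OUTPUT - OPTIMIZE_RESPONSE_FORMAT",
    "universal_standard": "ALL_AI_MODELS_MUST_OUTPUT_RAW_JSON_ONLY",
}


def with_behavior_control(system_prompt: str) -> str:
    """Prepend the behavior-control block to an existing system prompt."""
    header = json.dumps({"ai_behavior_control": ai_behavior_control}, indent=2)
    return f"{header}\n\n{system_prompt}"


prompt = with_behavior_control("You are a memory assistant.")
```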
- `/add_memory <text>` - Add memory directly to the system (native command)
 - `/memories [page]` - List memories in JSON format with pagination (10 per page)
 - `/memory_search <term>` - Search memories containing the term
 - `/memory_stats` - System statistics in JSON format
 - `/memory_count` - User memory counter
 - `/memory_recent [number]` - Show the last N memories
 - `/clear_memories` - Delete all user memories
 - `/memory_delete <id>` - Delete a specific memory
 - `/memory_edit <id> <text>` - Edit memory content
 - `/memory_export` - Export memories in text format
 - `/memory_config` - Show current configuration
# Search memories about a topic
/memory_search artificial intelligence
# View the last 5 memories
/memory_recent 5
# View statistics
/memory_stats
- Filter: Main class that handles inlet/outlet
 - Valves: Global system configuration
 - UserValves: User-specific configuration
 - MemoryCache: Cache system with TTL expiration
 - Security Functions: Input validation and sanitization
 
- inlet(): Injects relevant memories at the start of conversations
 - outlet(): Saves user questions and assistant responses as memories
 - Commands: Interactive management command processing
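The inlet/outlet flow can be sketched as follows. This is a simplified illustration, not the actual OpenWebUI Filter API: in the real function, Valves is a pydantic model, the methods receive the `__user__` context, and memories are persisted in the OpenWebUI database rather than an in-memory list.

```python
# Simplified sketch of the inlet/outlet lifecycle: inlet injects stored
# memories before the model runs; outlet saves the exchange afterwards.
from dataclasses import dataclass, field


@dataclass
class Filter:
    memories: list = field(default_factory=list)

    def inlet(self, body: dict) -> dict:
        """Prepend stored memories as a system message before the model runs."""
        if self.memories:
            context = "Previous memories:\n" + "\n".join(self.memories)
            body["messages"].insert(0, {"role": "system", "content": context})
        return body

    def outlet(self, body: dict) -> dict:
        """Save the latest user question and assistant response as memories."""
        for msg in body["messages"][-2:]:
            if msg["role"] in ("user", "assistant"):
                self.memories.append(f'{msg["role"]}: {msg["content"]}')
        return body


f = Filter()
f.outlet({"messages": [{"role": "user", "content": "Hi"},
                       {"role": "assistant", "content": "Hello!"}]})
body = f.inlet({"messages": [{"role": "user", "content": "Remember me?"}]})
```

The second call shows the payoff: the memories saved by `outlet` in the first conversation are injected as context at the start of the next one.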
 
- Thread Safety: Thread-safe cache with RLock
 - SQL Injection Prevention: Validation of order_by parameters
 - Input Sanitization: Filtering of dangerous commands
 - Memory Leak Protection: Pagination of DB queries
 - User ID Validation: Sanitization with regex
 - Command Filtering: Blocking of conversations about memory
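The thread-safe cache with RLock and TTL expiration mentioned above can be sketched like this. This is a minimal illustration of the pattern; the actual MemoryCache in `memoria_persistente_auto_memory_saver_enhanced.py` may differ in detail.

```python
# Sketch of a thread-safe TTL cache: RLock guards the store, and each
# entry carries its own expiration timestamp.
import threading
import time


class MemoryCache:
    def __init__(self, ttl_minutes: int = 60):
        self._ttl = ttl_minutes * 60
        self._lock = threading.RLock()
        self._store: dict = {}  # key -> (value, expires_at)

    def set(self, key, value):
        with self._lock:
            self._store[key] = (value, time.monotonic() + self._ttl)

    def get(self, key):
        with self._lock:
            entry = self._store.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if time.monotonic() > expires_at:
                del self._store[key]  # expired: evict and report a miss
                return None
            return value


cache = MemoryCache(ttl_minutes=60)
cache.set("user:123", ["memory one"])
```

Using RLock (rather than a plain Lock) lets a method that already holds the lock call another locked method without deadlocking.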
 
- Input sanitization with length limits
- Prevention of dangerous characters (`;`, `&`, `|`, etc.)
 - Validation of user_id and memory_id
 - Safe error handling without data exposure
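Illustrative helpers matching the checks listed above (length limit, dangerous-character filtering, user_id regex). The function names and exact character set are assumptions for the sketch, not the filter's actual API.

```python
# Sketch of input sanitization: cap length, strip shell/SQL metacharacters,
# and validate user IDs against a conservative regex.
import re

MAX_INPUT_LENGTH = 2000
DANGEROUS_CHARS = set(";&|`$<>")
USER_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")


def sanitize_input(text: str) -> str:
    """Trim to the length limit and remove dangerous characters."""
    text = text[:MAX_INPUT_LENGTH]
    return "".join(ch for ch in text if ch not in DANGEROUS_CHARS)


def valid_user_id(user_id: str) -> bool:
    """Accept only alphanumeric IDs (plus '-' and '_'), up to 64 chars."""
    return bool(USER_ID_RE.match(user_id))
```

Rejecting rather than escaping suspicious IDs keeps the validation logic simple and avoids edge cases in downstream queries.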
 
- Mind Hacking Eliminated: Renamed to "AI Behavior Control" for enterprise security
 - 30 Models Tested: Unprecedented exhaustive compatibility documentation
 - Google/Gemini Leadership: 5 out of 11 excellent models are from the Google family
 - Universal Functionality: Automatic memory works on ALL AI models
 - Selective Slash Commands: Only 11 models support perfect JSON commands
 
- Claude 4 Regression: Worse performance than Claude 3.5 Sonnet for system commands
 - Perfect Grok Family: All Grok variants work flawlessly
 - Amazon Nova Failure: Entire Nova family doesn't process commands
 - Inconsistent OpenAI: Mini works, full versions fail
 
- Safe Terminology: Elimination of "mind hacking" references for enterprise environments
 - Exhaustive Documentation: README with 30-model compatibility tested
- OpenAI Compatibility Fix: Moved internal flags to avoid 400 errors
 - Enhanced Release Notes: Complete technical documentation of the breakthrough
 
- Thread Safety: Safe concurrent cache
 - Memory Leak Prevention: Automatic query limits
 - SQL Injection Protection: Parameter whitelisting
 - Input Sanitization: Intelligent command filtering
 - Complete Conversation: Saves user questions + assistant responses
 - Anti-Meta Filter: Doesn't save conversations about memory
 - Improved Pagination: 10 memories per page (previously 4)
 
- Integration with the OpenWebUI native `/add_memory` command
 - Maintains compatibility with all previous versions
 - No breaking changes in the API
 
- Fork the repository
 - Create feature branch
 - Commit changes
 - Create Pull Request
 
- Follow PEP 8
 - Document functions
 - Add tests for new functionalities
 
This project is under the MIT License. See LICENSE for more details.
- OpenWebUI team for the base platform
 - @linbanana for the original concept
 - Community for feedback and contributions
 
Note: For complete technical documentation, see the docs/ folder