@pditommaso
Contributor

@pditommaso pditommaso commented Sep 27, 2025

Summary

This PR implements automatic cache invalidation through Docker Distribution webhooks, replacing manual cache management with real-time, event-driven cache clearing.

Key Features

Docker Distribution Webhook Integration

  • Endpoint: POST /api/webhook/registry
  • Event Processing: Automatically processes Docker Distribution notification events
  • Selective Invalidation: Only invalidates cache for repositories that received new container pushes

Smart Event Filtering

The webhook endpoint intelligently filters Docker Distribution events:

  • Processes: Manifest push events (final step of container deployment)

    • application/vnd.docker.distribution.manifest.v2+json
    • application/vnd.docker.distribution.manifest.list.v2+json
    • application/vnd.oci.image.manifest.v1+json
    • application/vnd.oci.image.index.v1+json
  • Ignores: Layer pushes, blob uploads, pull events, and other intermediate events
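
As a rough illustration of the filter described above, the sketch below accepts only `push` events whose media type is one of the four manifest types listed. The type and function names (`Event`, `Target`, `isManifestPush`) are assumptions made for this example and are not taken from the PR's code.

```go
package main

import "fmt"

// Target and Event mirror only the fields shown in this PR's example payload.
// The type names are assumptions for this sketch.
type Target struct {
	MediaType  string `json:"mediaType"`
	Repository string `json:"repository"`
	Digest     string `json:"digest"`
	Tag        string `json:"tag"`
}

type Event struct {
	Action string `json:"action"`
	Target Target `json:"target"`
}

// manifestMediaTypes holds the four manifest media types listed in the description.
var manifestMediaTypes = map[string]bool{
	"application/vnd.docker.distribution.manifest.v2+json":      true,
	"application/vnd.docker.distribution.manifest.list.v2+json": true,
	"application/vnd.oci.image.manifest.v1+json":                true,
	"application/vnd.oci.image.index.v1+json":                   true,
}

// isManifestPush (hypothetical name) reports whether an event is a push of a
// manifest, which is the only kind of event that should trigger invalidation.
func isManifestPush(e Event) bool {
	return e.Action == "push" && manifestMediaTypes[e.Target.MediaType]
}

func main() {
	layer := Event{Action: "push", Target: Target{MediaType: "application/octet-stream", Repository: "library/nginx"}}
	manifest := Event{Action: "push", Target: Target{MediaType: "application/vnd.oci.image.manifest.v1+json", Repository: "library/nginx"}}
	fmt.Println(isManifestPush(layer), isManifestPush(manifest))
}
```

A lookup table keeps the check short and makes it easy to add a media type later without touching the condition.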

Repository-Specific Cache Invalidation

Instead of clearing all cache data, the system now supports:

  • Selective clearing: Only removes cached data for affected repositories
  • Batch processing: Deduplicates multiple events for the same repository
  • Efficient invalidation: Clears repository list, tags, and image info atomically
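
The batch deduplication mentioned above could look roughly like the following. This is a minimal sketch, assuming the repository names have already been extracted from the accepted events; `uniqueRepositories` is a hypothetical helper, not a function from the PR.

```go
package main

import "fmt"

// uniqueRepositories (hypothetical helper) returns each repository name once,
// in first-seen order, so that a batch with several events for the same
// repository triggers a single invalidation.
func uniqueRepositories(repos []string) []string {
	seen := make(map[string]struct{}, len(repos))
	out := make([]string, 0, len(repos))
	for _, r := range repos {
		if r == "" {
			continue // skip events without a repository name
		}
		if _, ok := seen[r]; ok {
			continue
		}
		seen[r] = struct{}{}
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(uniqueRepositories([]string{"alpine", "ubuntu", "alpine"}))
}
```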

Docker Distribution Configuration

To enable automatic cache invalidation, configure your Docker Distribution registry:

# registry.yml
notifications:
  endpoints:
    - name: staticreg-webhook
      url: http://your-staticreg-host/api/webhook/registry
      headers:
        Authorization: [Bearer your-token]  # Optional
      timeout: 5s
      threshold: 3
      backoff: 1s

Event Flow

  1. Container Push: User pushes container to registry
  2. Distribution Event: Registry generates notification events (layers, manifest, etc.)
  3. Webhook Delivery: Registry sends events to staticreg webhook endpoint
  4. Event Filtering: Staticreg processes only manifest push events
  5. Cache Invalidation: Cache for affected repository is cleared
  6. Immediate Refresh: Background sync immediately fetches fresh data

API Changes

New Endpoint

  • POST /api/webhook/registry - Receives Docker Distribution webhook notifications

Removed Endpoint

  • POST /api/cache/invalidate - No longer needed with automatic webhooks

Implementation Details

Event Structure Support

{
  "events": [
    {
      "id": "uuid",
      "timestamp": "2024-01-01T00:00:00Z", 
      "action": "push",
      "target": {
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "repository": "library/nginx",
        "digest": "sha256:abc123",
        "tag": "latest"
      }
    }
  ]
}

Response Format

{
  "message": "Webhook processed successfully",
  "eventsProcessed": 2,
  "repositoriesInvalidated": 1
}

Benefits

  • Real-time Cache Updates: Cache invalidates immediately when containers are pushed
  • Selective Invalidation: Only affected repositories are cleared, preserving other cached data
  • No Manual Intervention: Eliminates need for manual cache management
  • Efficient Resource Usage: Avoids unnecessary full cache clears
  • Detailed Monitoring: Comprehensive logging for webhook event processing

Test Plan

  • Code compiles without errors
  • Webhook endpoint accepts Docker Distribution event format
  • Event filtering correctly processes only manifest pushes
  • Repository-specific cache invalidation works
  • Batch event processing with deduplication
  • Comprehensive logging and error handling
  • Integration testing with actual Docker Distribution registry
  • Performance testing with high-volume webhook events

🤖 Generated with Claude Code

pditommaso and others added 2 commits September 27, 2025 17:27
Add POST /api/cache/invalidate endpoint to clear both HTTP response cache and registry data cache.

This allows manual cache invalidation when the registry content has been updated, providing immediate cache clearing without waiting for the automatic refresh interval.

Changes:
- Add ClearCache() method to async registry for clearing all cached data
- Add CacheManager to coordinate cache clearing operations
- Add cache invalidate handler and route at POST /api/cache/invalidate
- Update server constructor to accept async registry for cache management

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

Replace manual cache invalidation with automatic webhook-based cache clearing.

This commit adds:
- Docker Distribution webhook endpoint at POST /api/webhook/registry
- Selective repository cache invalidation (vs full cache clear)
- Event filtering to only process manifest push events
- Support for Docker Distribution event envelope format
- Detailed logging and response metrics

The webhook endpoint processes Docker Distribution events and automatically
invalidates cache for repositories when new containers are pushed, eliminating
the need for manual cache management or waiting for refresh intervals.

Event Filtering:
- Only processes "push" actions with manifest media types
- Ignores layer pushes, pulls, and other registry events
- Supports both Docker v2 and OCI manifest formats
- Deduplicates multiple events for the same repository in a batch

Configuration:
Configure Docker Distribution registry.yml:
```yaml
notifications:
  endpoints:
    - name: staticreg-webhook
      url: http://your-staticreg-host/api/webhook/registry
      timeout: 5s
      threshold: 3
      backoff: 1s
```

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@pditommaso pditommaso requested a review from fntlnz September 27, 2025 15:40
@pditommaso
Contributor Author

@claude review

@claude

claude bot commented Sep 30, 2025

Claude encountered an error —— View job


I'll analyze this and get back to you.

@pditommaso
Contributor Author

@claude try it again

@claude

claude bot commented Sep 30, 2025

Claude finished @pditommaso's task —— View job


Code Review: Docker Distribution webhook integration for cache invalidation

I've thoroughly reviewed the PR implementation. The feature adds smart, event-driven cache invalidation that replaces manual cache management with automatic webhook-triggered updates. Here's my detailed analysis:

Strengths

Architecture & Design:

  • Well-structured separation of concerns with dedicated CacheManager and webhook packages
  • Smart event filtering to process only manifest push events, avoiding unnecessary cache invalidations
  • Repository-specific cache invalidation prevents clearing unrelated cached data
  • Proper deduplication logic in server.go:203 prevents multiple invalidations for the same repository

Code Quality:

  • Comprehensive error handling with proper logging throughout
  • Good use of structured logging with contextual information
  • Thread-safe implementation using existing mutex patterns in async_registry.go:263-279
  • Clean REST API design following existing patterns

⚠️ Issues & Recommendations

1. Incomplete HTTP Cache Invalidation (pkg/server/cache_manager.go:43-47)
The TODO comments highlight a significant limitation:

// TODO: Consider creating a custom cache store that exposes Clear()
// TODO: Consider implementing selective HTTP cache invalidation

Impact: HTML responses remain cached even after repository updates, potentially showing stale data until cache expiration.

Recommendation: Implement custom cache store or use cache keys that can be invalidated by repository name.

2. Missing Authentication/Authorization (pkg/server/server.go:178)
The webhook endpoint lacks any security measures:

Security Risk: Any client can trigger cache invalidation, potentially causing DoS through excessive invalidations.

Recommendation: Add webhook signature validation or token-based authentication:

// Add to registryWebhookHandler
if !validateWebhookSignature(ctx.Request, expectedSignature) {
    ctx.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid signature"})
    return
}

3. Missing Input Validation (pkg/server/server.go:181)
No validation on webhook payload size or structure beyond JSON parsing.

Recommendation: Add payload size limits and field validation.

4. Error Handling Gap (pkg/server/server.go:220)
The handler continues processing after a cache invalidation failure but does not report partial failures in the response.

Recommendation: Include failed repositories in response for better observability.

5. Potential Race Condition (pkg/registry/async/async_registry.go:262-279)
ClearRepositoryCache() iterates over concurrent map while modifying it:

c.imageInfo.Range(func(key imageInfoKey, value registry.ImageInfo) bool {
    if key.repo == repository {
        c.imageInfo.Delete(key) // Modifying during iteration
    }
    return true
})

Recommendation: Collect keys first, then delete:

var keysToDelete []imageInfoKey
c.imageInfo.Range(func(key imageInfoKey, value registry.ImageInfo) bool {
    if key.repo == repository {
        keysToDelete = append(keysToDelete, key)
    }
    return true
})
for _, key := range keysToDelete {
    c.imageInfo.Delete(key)
}

🔍 Minor Issues

Code Style:

  • pkg/webhook/events.go:73-76: Long conditional could be extracted to a slice for better readability
  • pkg/server/server.go:96: Missing error handling for route registration

Logging:

  • Consider using different log levels for webhook events (Debug for filtered events, Info for processed ones)

🧪 Testing Gaps

The PR lacks tests for the new functionality:

  • No unit tests for webhook event processing
  • No integration tests for cache invalidation flow
  • No tests for the new CacheManager component

Recommendation: Add tests covering:

  • Webhook payload parsing and validation
  • Event filtering logic
  • Cache invalidation scenarios
  • Error handling paths

📊 Performance Considerations

  • Webhook processing is synchronous - consider async processing for high-volume scenarios
  • ClearRepositoryCache() may be expensive with many images - consider background processing
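
If asynchronous processing turns out to be needed, a buffered channel drained by a single worker is one simple option. The sketch below is an assumption about how that could be structured, not part of the PR; `invalidationQueue` and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// invalidationQueue (hypothetical) decouples webhook handling from the
// potentially slow cache clearing by handing repository names to one worker.
type invalidationQueue struct {
	ch chan string
	wg sync.WaitGroup
}

// newInvalidationQueue starts a worker that calls invalidate for each
// queued repository until the queue is closed.
func newInvalidationQueue(size int, invalidate func(string)) *invalidationQueue {
	q := &invalidationQueue{ch: make(chan string, size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for repo := range q.ch {
			invalidate(repo)
		}
	}()
	return q
}

// Enqueue returns false when the buffer is full, so the HTTP handler can
// answer immediately instead of blocking the registry's delivery.
func (q *invalidationQueue) Enqueue(repo string) bool {
	select {
	case q.ch <- repo:
		return true
	default:
		return false
	}
}

// Close stops accepting work and waits for queued items to be processed.
func (q *invalidationQueue) Close() {
	close(q.ch)
	q.wg.Wait()
}

func main() {
	var cleared []string
	q := newInvalidationQueue(8, func(repo string) { cleared = append(cleared, repo) })
	q.Enqueue("alpine")
	q.Enqueue("ubuntu")
	q.Close() // after Close returns, the worker has finished writing to cleared
	fmt.Println(cleared)
}
```

A full buffer is reported to the caller rather than silently dropped, which leaves the decision (retry, log, or return an error status so the registry redelivers) to the handler.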

Security Assessment

  • No obvious security vulnerabilities beyond the authentication gap
  • Input sanitization relies on JSON unmarshaling, which is generally safe
  • No SQL injection or XSS risks in this component

🚀 Overall Assessment

This is a solid implementation that successfully replaces manual cache management with event-driven automation. The code follows good Go practices and integrates well with the existing architecture. The main concerns are the incomplete HTTP cache invalidation and missing webhook authentication, both of which are acknowledged in the TODO comments.

Recommendation: Approve with follow-up. The core functionality is sound and provides significant value. Address the authentication and HTTP cache invalidation issues in a follow-up PR.


@munishchouhan
Member

Notification and cache invalidation is working:

time=2025-10-10T15:10:23.637+02:00 level=INFO msg="starting process to synchronize repositories"
time=2025-10-10T15:25:23.648+02:00 level=INFO msg="starting process to synchronize repositories"
time=2025-10-10T15:36:24.121+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:36:24.424+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:36:25.666+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:36:25.974+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:36:26.285+02:00 level=INFO msg="Processing push event for repository" repository=alpine digest=sha256:92a29b8e530685cb620b9aced7c2d447d091885f1c5a3ace8d98fb5855687d05 tag=latest
time=2025-10-10T15:36:26.285+02:00 level=INFO msg="Repository cache invalidated" repository=alpine
time=2025-10-10T15:36:26.285+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=1 repositoriesInvalidated=1
time=2025-10-10T15:36:43.729+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:36:44.036+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:36:44.347+02:00 level=INFO msg="Processing push event for repository" repository=alpine digest=sha256:92a29b8e530685cb620b9aced7c2d447d091885f1c5a3ace8d98fb5855687d05 tag=latest
time=2025-10-10T15:36:44.347+02:00 level=INFO msg="Repository cache invalidated" repository=alpine
time=2025-10-10T15:36:44.347+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=1 repositoriesInvalidated=1
time=2025-10-10T15:40:23.660+02:00 level=INFO msg="starting process to synchronize repositories"
time=2025-10-10T15:40:24.656+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:24.661+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:24.665+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:24.670+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:24.673+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:53.941+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:54.245+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:55.484+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:55.794+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:40:56.103+02:00 level=INFO msg="Processing push event for repository" repository=ubuntu digest=sha256:42d298b1504b9192b3b3564916ab544e9c09410280639ed4cfbc7269e3c9887f tag=latest
time=2025-10-10T15:40:56.103+02:00 level=INFO msg="Repository cache invalidated" repository=ubuntu
time=2025-10-10T15:40:56.103+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=1 repositoriesInvalidated=1
time=2025-10-10T15:43:06.783+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:43:07.089+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=0 repositoriesInvalidated=0
time=2025-10-10T15:43:07.399+02:00 level=INFO msg="Processing push event for repository" repository=alpine digest=sha256:92a29b8e530685cb620b9aced7c2d447d091885f1c5a3ace8d98fb5855687d05 tag=3.22.2
time=2025-10-10T15:43:07.399+02:00 level=INFO msg="Repository cache invalidated" repository=alpine
time=2025-10-10T15:43:07.399+02:00 level=INFO msg="Webhook processing completed" totalEvents=1 eventsProcessed=1 repositoriesInvalidated=1

@pditommaso
Contributor Author

@fntlnz can you give a final review when you have a chance?

@munishchouhan
Member

munishchouhan commented Oct 10, 2025

Validated the functionality:
[Screenshot, 2025-10-10 at 18:15:41]

Pushed a new image:

% docker push localhost:5000/ubuntu:25.10
The push refers to repository [localhost:5000/ubuntu]
d2804ccd00ae: Pushed
25.10: digest: sha256:edc1546a15cd1719d35588ffc7cecd0a54d916a347bc37956827f7e10b0f2574 size: 529
[Screenshot, 2025-10-10 at 18:16:33]

@munishchouhan munishchouhan reopened this Oct 10, 2025
@munishchouhan munishchouhan marked this pull request as ready for review October 13, 2025 08:03
@fntlnz
Copy link
Collaborator

fntlnz commented Oct 13, 2025

reviewing this during the day! thanks for the patience!

Collaborator

@fntlnz fntlnz left a comment

I like the approach of using the webhook to invalidate the staticreg caches.

Beyond my inline comments: I have not reproduced it, but I'm fairly sure there is a race condition between invalidation and the next cycle that repopulates the data, which would likely show no results on the rendered page.

I am not confident about merging it right now.

The caching mechanism in staticreg has two layers:

  • an internal in-memory cache of the registry view
  • a cache of the rendered HTML

This approach seems to clear only the internal in-memory cache, so I would say the final implementation should look like this:

  • the webhook is authenticated or exposed via an internal port
  • the refresh interval is higher than the current default and serves only to catch inconsistencies
  • the HTTP cache goes away
  • optionally, the in-memory cache moves to something like Redis or S3, so that only one staticreg instance has to listen for those notifications; as it stands, every instance has to receive them

Also, ClearRepositoryCache does not seem to me to fit the current architecture, which was meant to fully resync everything rather than work per repository.

apiRoutes := r.Group("/api")
{
	apiRoutes.GET("/search", serverImpl.SearchHandler)
	apiRoutes.POST("/webhook/registry", registryWebhookHandler(cacheManager, log))
Collaborator

This endpoint should probably be bound to a different listening address to avoid exposing it over the internet, or it should use the auth mechanism that Docker Distribution notifications provide.

Contributor Author

Let's keep it simple; no need for auth.

Member

We could use a separate port.

}
}

func registryWebhookHandler(cacheManager *CacheManager, log *slog.Logger) gin.HandlerFunc {
Collaborator

This endpoint needs to go in ServerImpl, and the cache manager should likely be a dependency of it.

}

func (c *Async) ClearRepositoryCache(repository string) {
c.reposMutex.Lock()
Collaborator

If we need to support this model, we can probably rework the data structures a bit instead of adding this logic.

@pditommaso
Contributor Author

Can we split the blocking needs from improvements that can be done in a second step? We need this capability to go live.

@pditommaso
Contributor Author

@fntlnz please 🙏 👇

#42 (comment)


3 participants