@munishchouhan
Member

@munishchouhan munishchouhan commented Oct 15, 2025

Summary

This PR adds Docker Distribution webhook integration to enable container registry pull metrics tracking and cache invalidation. The implementation consists of:

  • Webhook endpoint (/api/webhook/registry) that processes Docker Distribution events
  • Pull metrics tracking that records manifest pull events to PostgreSQL
  • Smart cache invalidation that automatically clears repository caches on push events
  • Optional database support with graceful degradation when database is unavailable

Key Changes

Webhook Integration (pkg/webhook/)

  • events.go: Defines Docker Distribution event structs with helper methods to identify manifest push/pull events
  • webhookservice.go: Interface contract for webhook operations (cache invalidation + database persistence)
  • webhookserviceadapter.go: Adapter implementation that coordinates cache invalidation and database operations

Cache Management (pkg/server/)

  • cache_manager.go: New cache manager that controls both async registry cache and HTTP response cache
  • Supports repository-specific invalidation and full cache clearing
  • server.go: Integrated webhook handler that processes push events for cache invalidation and pull events for metrics
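For reviewers, a minimal sketch of the two invalidation modes described above. It is not the code in cache_manager.go: both caches are modeled as plain maps keyed by repository, and the names `NewCacheManager`, `Put`, `InvalidateRepository`, `ClearAll`, and `Cached` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// CacheManager is a simplified stand-in: both caches are plain maps
// keyed by repository name (an assumption, not the PR's actual layout).
type CacheManager struct {
	mu        sync.RWMutex
	registry  map[string][]string // repository -> cached tag list
	responses map[string][]byte   // repository -> cached HTTP response body
}

func NewCacheManager() *CacheManager {
	return &CacheManager{
		registry:  map[string][]string{},
		responses: map[string][]byte{},
	}
}

// Put stores registry data and a rendered response for one repository.
func (c *CacheManager) Put(repo string, tags []string, page []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.registry[repo] = tags
	c.responses[repo] = page
}

// InvalidateRepository drops one repository from both caches.
func (c *CacheManager) InvalidateRepository(repo string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.registry, repo)
	delete(c.responses, repo)
}

// ClearAll empties both caches.
func (c *CacheManager) ClearAll() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.registry = map[string][]string{}
	c.responses = map[string][]byte{}
}

// Cached reports whether either cache still holds data for repo.
func (c *CacheManager) Cached(repo string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	_, inRegistry := c.registry[repo]
	_, inResponses := c.responses[repo]
	return inRegistry || inResponses
}

func main() {
	cm := NewCacheManager()
	cm.Put("alpine", []string{"latest"}, []byte("<html>alpine</html>"))
	cm.Put("busybox", []string{"1.36"}, []byte("<html>busybox</html>"))

	cm.InvalidateRepository("alpine")
	fmt.Println(cm.Cached("alpine"), cm.Cached("busybox")) // false true

	cm.ClearAll()
	fmt.Println(cm.Cached("busybox")) // false
}
```

A single lock over both maps keeps a repository from being visible in one cache but not the other while an invalidation is in progress.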

Database Layer (pkg/db/)

  • postgres.go: PostgreSQL connection pool with graceful failure handling
  • Logs warnings if DATABASE_URL is not set, allowing the app to run without database
  • Pull events are stored as JSONB for flexibility

Infrastructure

  • main.go: Database pool initialization with proper cleanup
  • cmd/serve.go: Wired cache manager through server initialization
  • GitHub Actions workflow for Claude PR assistant
  • Makefile updates for macOS ARM64 development

Test Plan

  • Verify webhook endpoint accepts Docker Distribution event payloads
  • Confirm manifest push events trigger cache invalidation for affected repositories
  • Validate manifest pull events are stored in database (when DATABASE_URL is configured)
  • Test graceful degradation when database is unavailable
  • Verify repository listings reflect changes after push events
  • Check that concurrent webhook requests are handled correctly
  • Test with multi-event batches to ensure deduplication works

munishchouhan and others added 5 commits October 14, 2025 14:21
Add POST /api/cache/invalidate endpoint to clear both HTTP response cache and registry data cache.

This allows manual cache invalidation when the registry content has been updated, providing immediate cache clearing without waiting for the automatic refresh interval.

Changes:
- Add ClearCache() method to async registry for clearing all cached data
- Add CacheManager to coordinate cache clearing operations
- Add cache invalidate handler and route at POST /api/cache/invalidate
- Update server constructor to accept async registry for cache management

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Replace manual cache invalidation with automatic webhook-based cache clearing.

This commit adds:
- Docker Distribution webhook endpoint at POST /api/webhook/registry
- Selective repository cache invalidation (vs full cache clear)
- Event filtering to only process manifest push events
- Support for Docker Distribution event envelope format
- Detailed logging and response metrics

The webhook endpoint processes Docker Distribution events and automatically
invalidates cache for repositories when new containers are pushed, eliminating
the need for manual cache management or waiting for refresh intervals.

Event Filtering:
- Only processes "push" actions with manifest media types
- Ignores layer pushes, pulls, and other registry events
- Supports both Docker v2 and OCI manifest formats
- Deduplicates multiple events for the same repository in a batch

Configuration:
Configure Docker Distribution registry.yml:
```yaml
notifications:
  endpoints:
    - name: staticreg-webhook
      url: http://your-staticreg-host/api/webhook/registry
      timeout: 5s
      threshold: 3
      backoff: 1s
```

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@munishchouhan munishchouhan changed the base branch from master to feat/cache-invalidate-endpoint October 15, 2025 12:16
@munishchouhan munishchouhan changed the base branch from feat/cache-invalidate-endpoint to master October 15, 2025 12:17
@munishchouhan munishchouhan marked this pull request as draft October 15, 2025 12:22
@munishchouhan munishchouhan self-assigned this Oct 15, 2025
@munishchouhan
Member Author

@pditommaso @fntlnz since the webhook is coupled with the cache manager,
I have made the changes for metrics on top of it.

Please let me know the decision for the following:

  1. How do we want to move forward with cache invalidation?
  2. Currently I am storing the full event. What do we want to save in the DB?

Example db data:

id |event_time                   |repo_name|tag   |actor_name|event_payload                                                                                                                                                                                                                                                  |created_at                   |
---+-----------------------------+---------+------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------+
106|2025-10-14 18:36:33.942 +0200|alpine   |latest|          |{"id": "c9cd486b-8f88-45d0-84e1-55f9cfd22e11", "actor": {}, "action": "pull", "source": {"addr": "10981c11d243:5000", "instanceID": "ff85536d-9cfb-4f18-a565-b807782ec47b"}, "target": {"tag": "latest", "url": "http://localhost:5000/v2/alpine/manifests/sha2|2025-10-14 18:36:33.952 +0200|
107|2025-10-14 18:36:33.950 +0200|alpine   |latest|          |{"id": "d4f99e71-fd01-4b5a-988d-67ee95642679", "actor": {}, "action": "pull", "source": {"addr": "10981c11d243:5000", "instanceID": "ff85536d-9cfb-4f18-a565-b807782ec47b"}, "target": {"tag": "latest", "url": "http://localhost:5000/v2/alpine/manifests/sha2|2025-10-14 18:36:33.960 +0200|

@pditommaso
Contributor

@munishchouhan the event is stored in a JSONB column, right? Can you dump a full JSON object here?

Thinking more about this: pushes are rare, but there will be a LOT of pull events. We therefore need to plan carefully for this case, with fully async event handling and tracking of only the essential data.

@munishchouhan
Member Author

munishchouhan commented Oct 16, 2025

@pditommaso the DB schema has been added to the PR.
Below is the JSON. I am on vacation until 30 Oct and will pick this up after that.

{
   "id":"1b149cb4-9405-4daf-a07e-973bc92f8887",
   "actor":{},
   "action":"pull",
   "source":{
      "addr":"be53ed94df9c:5000",
      "instanceID":"2c62a37a-a73c-4ee9-9ed3-554500c58108"
   },
   "target":{
      "tag":"latest",
      "url":"http://localhost:5000/v2/alpine/manifests/sha256:92a29b8e530685cb620b9aced7c2d447d091885f1c5a3ace8d98fb5855687d05",
      "size":527,
      "digest":"sha256:92a29b8e530685cb620b9aced7c2d447d091885f1c5a3ace8d98fb5855687d05",
      "length":527,
      "mediaType":"application/vnd.docker.distribution.manifest.v2+json",
      "repository":"alpine"
   },
   "request":{
      "id":"ff15a5e1-7fc7-4a90-8ce1-c90b06a6d177",
      "addr":"172.17.0.1:56990",
      "host":"localhost:5000",
      "method":"HEAD",
      "useragent":"docker/28.4.0 go/go1.24.7 git-commit/249d679 kernel/6.10.14-linuxkit os/linux arch/arm64 UpstreamClient(Docker-Client/28.4.0 \\(darwin\\))"
   },
   "timestamp":"2025-10-16T10:45:38.465271793Z"
}
