Container image chunking with multiple strategies for optimized storage and transfer.
Takes container images and breaks them into smaller, reusable chunks with:
- Multiple strategies: Fixed-size, file-boundary chunking
- Filesystem awareness: Extract complete metadata
- Automatic deduplication: Reuse identical chunks
- Stargz support: Build lazy-loading format
# Setup environment
./scripts/setup-all.sh
# Basic chunking
go run cmd/chunker/main.go -image test-alpine -verbose
# With stargz output
go run cmd/chunker/main.go -image test-alpine -stargz -verbose

| Flag | Default | Description |
|---|---|---|
| -image | required | Image name to chunk |
| -strategy | fixed-size | fixed-size, file-boundary |
| -chunk-size-kb | 64 | Chunk size in KB |
| -filesystem | true | Extract filesystem metadata |
| -stargz | false | Build stargz format |
| -verbose | false | Show detailed output |

go run cmd/chunker/main.go -image test-alpine -strategy file-boundary -filesystem=true -stargz -verbose
Output:
🎯 Chunking test-alpine with file-boundary strategy
Chunk Size: 64 KB
Filesystem Aware: true
✅ Image Parsed Successfully
Manifest: 501d4f3b800d...
Architecture: arm64
📄 Manifest Analysis:
Schema Version: 2
Config Blob: bb088... (1275 bytes)
Layer Count: 3
Layers:
1. 6e174... (4130750 bytes)
2. e0ba9... (135 bytes)
3. 212e2... (121 bytes)
🔧 Layer Processing:
1. Processing 6e174226... (4130750 bytes) → 86 chunks → 298de5de...
2. Processing e0ba9b9a... (135 bytes) → 1 chunks → 2fcb0491...
3. Processing 212e2d0e... (121 bytes) → 1 chunks → e83c7c57...
📊 Chunking Summary:
Total Chunks: 88 (created 81 new, 7 deduplicated)
Total Indexes: 3 (created 3 new)
🎯 Building stargz format...
Building layer 1/3... ✅ test-alpine-layer-1.stargz (3.9 MB)
Building layer 2/3... ✅ test-alpine-layer-2.stargz (135 bytes)
Building layer 3/3... ✅ test-alpine-layer-3.stargz (121 bytes)
✅ Stargz format complete! Created 3 stargz files
✅ Complete! Created 88 chunks across 3 indexes
# Count chunks and indexes
echo "Chunks: $(ls data/registry-data/chunks | wc -l)"
echo "Indexes: $(ls data/registry-data/indexes | wc -l)"
echo "Stargz: $(ls data/registry-data/stargz | wc -l)"
Output:
Chunks: 81
Indexes: 3
Stargz: 3
./scripts/push_stargz_proper.sh test-alpine
Output:
🚀 Pushing test-alpine as proper stargz layers to ECR...
Using temp directory: /tmp/tmp.XXXXXX
Added stargz layer: 298de5de1234... (3981KB)
Added stargz layer: 2fcb04915678... (1KB)
Added stargz layer: e83c7c579abc... (1KB)
✅ Successfully pushed proper stargz layers to: 299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal:test-alpine-stargz-proper
./scripts/cleanup.sh
Output:
🧹 Cleaning up chunking data...
Removed 81 chunks
Removed 3 indexes
Removed OCI images
Removed 3 stargz files
Removed statistics
✅ Clean slate ready!
Fixed-Size (-strategy fixed-size)
- Splits at fixed byte boundaries
- Good for deduplication
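As a rough illustration of why fixed byte boundaries pair well with deduplication, here is a minimal sketch, not the tool's actual implementation. It assumes chunks are identified by their SHA-256 digest (consistent with the "SHA256 named" chunk files) and the names `splitFixed`, `chunkID`, and `dedupe` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// splitFixed cuts data into consecutive chunks of at most size bytes.
// The final chunk may be shorter. Hypothetical helper for illustration.
func splitFixed(data []byte, size int) [][]byte {
	if size <= 0 {
		return nil
	}
	var chunks [][]byte
	for start := 0; start < len(data); start += size {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, data[start:end])
	}
	return chunks
}

// chunkID returns the hex-encoded SHA-256 digest of a chunk.
func chunkID(chunk []byte) string {
	sum := sha256.Sum256(chunk)
	return hex.EncodeToString(sum[:])
}

// dedupe stores each chunk under its digest and reports how many
// chunks were newly stored and how many were already present.
func dedupe(store map[string][]byte, chunks [][]byte) (created, reused int) {
	for _, c := range chunks {
		id := chunkID(c)
		if _, ok := store[id]; ok {
			reused++
			continue
		}
		store[id] = c
		created++
	}
	return created, reused
}

func main() {
	const chunkSize = 64 * 1024 // mirrors the -chunk-size-kb default of 64

	// Zero-filled input: the three full chunks are byte-identical,
	// and the 100-byte tail forms a fourth, different chunk.
	data := make([]byte, 3*chunkSize+100)

	store := map[string][]byte{}
	chunks := splitFixed(data, chunkSize)
	created, reused := dedupe(store, chunks)
	fmt.Printf("total=%d created=%d reused=%d\n", len(chunks), created, reused)
	// prints "total=4 created=2 reused=2"
}
```

The created and reused counts play the same role as the "created ... new, ... deduplicated" figures in the chunking summary.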
File-Boundary (-strategy file-boundary)
- Never splits individual files
- Better for file-level caching
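One way to picture the file-boundary rule is a greedy packer that closes a chunk whenever the next file would push it past the target size. This is only a sketch under assumptions: the `fileEntry` type and `packByFile` function are hypothetical, and letting an oversized file form a chunk of its own is a guess at how "never splits individual files" could be honored, not a description of the tool's behavior.

```go
package main

import "fmt"

// fileEntry is a hypothetical stand-in for one file in a layer.
type fileEntry struct {
	Name string
	Size int
}

// packByFile groups consecutive files into chunks of roughly target bytes.
// A file is never split: a file larger than target ends up alone in a chunk.
func packByFile(files []fileEntry, target int) [][]fileEntry {
	var chunks [][]fileEntry
	var current []fileEntry
	used := 0
	for _, f := range files {
		// Close the current chunk if adding this file would exceed the target.
		if len(current) > 0 && used+f.Size > target {
			chunks = append(chunks, current)
			current, used = nil, 0
		}
		current = append(current, f)
		used += f.Size
	}
	if len(current) > 0 {
		chunks = append(chunks, current)
	}
	return chunks
}

func main() {
	files := []fileEntry{
		{"etc/hostname", 10},
		{"etc/passwd", 30},
		{"bin/busybox", 100}, // larger than the target, kept whole
		{"etc/group", 20},
	}
	for i, chunk := range packByFile(files, 64) {
		fmt.Printf("chunk %d:", i+1)
		for _, f := range chunk {
			fmt.Printf(" %s", f.Name)
		}
		fmt.Println()
	}
}
```

Because every chunk ends on a file boundary, a cache keyed by chunk can serve a whole file without reassembling it from pieces of several chunks.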
Filesystem Awareness (-filesystem=true or -filesystem=false)
- true: Process individual files with metadata
- false: Treat as binary data
data/registry-data/
├── chunks/ # Individual chunk files (SHA256 named)
├── indexes/ # Index files mapping chunks to layers
└── stargz/ # Stargz format files
- test-alpine: Basic Alpine Linux
- test-alpine-2: Alpine variant
- test-variant-a: Test variant A
- test-variant-b: Test variant B
- ubuntu-overlap: Ubuntu for overlap testing
Prerequisites:
go install github.com/google/go-containerregistry/cmd/crane@latest
aws configure
Registry: 299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal
Tag format: {image-name}-stargz-proper
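The registry and tag format combine into a full image reference. The small sketch below shows the composition; the `stargzRef` helper is hypothetical and is not part of the push script.

```go
package main

import "fmt"

// registry is the ECR repository listed above.
const registry = "299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal"

// stargzRef builds the reference for a pushed image using the
// {image-name}-stargz-proper tag format. Hypothetical helper.
func stargzRef(image string) string {
	return fmt.Sprintf("%s:%s-stargz-proper", registry, image)
}

func main() {
	fmt.Println(stargzRef("test-alpine"))
	// prints "299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal:test-alpine-stargz-proper"
}
```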