
ayush-panta/chunked-containers


Chunked Containers

Container image chunking with multiple strategies for optimized storage and transfer.

What It Does

Takes container images and breaks them into smaller, reusable chunks with:

  • Multiple strategies: Fixed-size, file-boundary chunking
  • Filesystem awareness: Extract complete metadata
  • Automatic deduplication: Reuse identical chunks
  • Stargz support: Build lazy-loading format

Quick Start

# Setup environment
./scripts/setup-all.sh

# Basic chunking
go run cmd/chunker/main.go -image test-alpine -verbose

# With stargz output
go run cmd/chunker/main.go -image test-alpine -stargz -verbose

Commands

Flag             Default      Description
-image           required     Image name to chunk
-strategy        fixed-size   fixed-size, file-boundary
-chunk-size-kb   64           Chunk size in KB
-filesystem      true         Extract filesystem metadata
-stargz          false        Build stargz format
-verbose         false        Show detailed output
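
The flag table above can be sketched with Go's standard flag package. This is a minimal illustration, not the repository's actual cmd/chunker/main.go: the config struct, the parseArgs function, and the validation rules are assumptions, and only the flag names and defaults come from the table.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// config mirrors the flag table; the struct itself is hypothetical.
type config struct {
	Image       string
	Strategy    string
	ChunkSizeKB int
	Filesystem  bool
	Stargz      bool
	Verbose     bool
}

// parseArgs parses command-line arguments (without the program name).
func parseArgs(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("chunker", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // errors are returned, not printed
	fs.StringVar(&cfg.Image, "image", "", "Image name to chunk (required)")
	fs.StringVar(&cfg.Strategy, "strategy", "fixed-size", "fixed-size or file-boundary")
	fs.IntVar(&cfg.ChunkSizeKB, "chunk-size-kb", 64, "Chunk size in KB")
	fs.BoolVar(&cfg.Filesystem, "filesystem", true, "Extract filesystem metadata")
	fs.BoolVar(&cfg.Stargz, "stargz", false, "Build stargz format")
	fs.BoolVar(&cfg.Verbose, "verbose", false, "Show detailed output")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	// Assumed validation: the table marks -image as required.
	if cfg.Image == "" {
		return cfg, errors.New("-image is required")
	}
	if cfg.Strategy != "fixed-size" && cfg.Strategy != "file-boundary" {
		return cfg, fmt.Errorf("unknown strategy %q", cfg.Strategy)
	}
	if cfg.ChunkSizeKB <= 0 {
		return cfg, errors.New("-chunk-size-kb must be positive")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

If the tool does use the flag package, boolean flags must be written as -filesystem=false. With a space (-filesystem false), the flag is set to true and "false" becomes a positional argument that stops flag parsing.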

Sample Workflow

1. Chunk Image with Verbose Output

go run cmd/chunker/main.go -image test-alpine -strategy file-boundary -filesystem=true -stargz -verbose

Output:

🎯 Chunking test-alpine with file-boundary strategy
   Chunk Size: 64 KB
   Filesystem Aware: true

✅ Image Parsed Successfully
   Manifest: 501d4f3b800d...
   Architecture: arm64

📄 Manifest Analysis:
   Schema Version: 2
   Config Blob: bb088... (1275 bytes)
   Layer Count: 3

   Layers:
     1. 6e174... (4130750 bytes)
     2. e0ba9... (135 bytes)
     3. 212e2... (121 bytes)

🔧 Layer Processing:
   1. Processing 6e174226... (4130750 bytes) → 86 chunks → 298de5de...
   2. Processing e0ba9b9a... (135 bytes) → 1 chunks → 2fcb0491...
   3. Processing 212e2d0e... (121 bytes) → 1 chunks → e83c7c57...

📊 Chunking Summary:
   Total Chunks: 88 (created 81 new, 7 deduplicated)
   Total Indexes: 3 (created 3 new)

🎯 Building stargz format...
   Building layer 1/3... ✅ test-alpine-layer-1.stargz (3.9 MB)
   Building layer 2/3... ✅ test-alpine-layer-2.stargz (135 bytes)
   Building layer 3/3... ✅ test-alpine-layer-3.stargz (121 bytes)
✅ Stargz format complete! Created 3 stargz files

✅ Complete! Created 88 chunks across 3 indexes

2. Verify Results

# Count chunks and indexes
echo "Chunks: $(ls data/registry-data/chunks | wc -l)"
echo "Indexes: $(ls data/registry-data/indexes | wc -l)"
echo "Stargz: $(ls data/registry-data/stargz | wc -l)"

Output:

Chunks: 81
Indexes: 3
Stargz: 3

3. Push to Registry (Optional)

./scripts/push_stargz_proper.sh test-alpine

Output:

🚀 Pushing test-alpine as proper stargz layers to ECR...
Using temp directory: /tmp/tmp.XXXXXX
  Added stargz layer: 298de5de1234... (3981KB)
  Added stargz layer: 2fcb04915678... (1KB)
  Added stargz layer: e83c7c579abc... (1KB)
✅ Successfully pushed proper stargz layers to: 299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal:test-alpine-stargz-proper

4. Clean Previous Runs

./scripts/cleanup.sh

Output:

🧹 Cleaning up chunking data...
   Removed       81 chunks
   Removed        3 indexes
   Removed OCI images
   Removed        3 stargz files
   Removed statistics
✅ Clean slate ready!

Strategies

Fixed-Size (-strategy fixed-size)

  • Splits at fixed byte boundaries (size set by -chunk-size-kb)
  • Good for deduplication when identical data lands on the same chunk boundaries
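
As a rough illustration of fixed-size splitting, the sketch below cuts a byte slice into consecutive chunks and hashes each one. The splitFixed function is hypothetical and is not taken from the repository. It only shows the general technique.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// splitFixed cuts data into consecutive chunks of at most size bytes.
// The returned slices alias data; nothing is copied.
func splitFixed(data []byte, size int) [][]byte {
	if size <= 0 {
		return nil
	}
	var chunks [][]byte
	for start := 0; start < len(data); start += size {
		end := start + size
		if end > len(data) {
			end = len(data) // last chunk may be shorter
		}
		chunks = append(chunks, data[start:end])
	}
	return chunks
}

func main() {
	data := make([]byte, 150*1024) // 150 KB of zeros as stand-in layer data
	for i, c := range splitFixed(data, 64*1024) {
		fmt.Printf("chunk %d: %d bytes, sha256 %x\n", i, len(c), sha256.Sum256(c))
	}
}
```

With this input the first two chunks are byte-identical, so they hash to the same digest and a content-addressed store would keep only one copy.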

File-Boundary (-strategy file-boundary)

  • Never splits individual files
  • Better for file-level caching
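
One way to keep files whole is to pack them greedily up to a target size and start a new chunk whenever the next file would overflow it. The sketch below assumes that approach. The fileEntry type and groupFiles function are hypothetical, and the repository may handle oversized files differently.

```go
package main

import "fmt"

// fileEntry is a hypothetical record for one file in a layer.
type fileEntry struct {
	Path string
	Size int64
}

// groupFiles packs files, in order, into chunks of roughly target bytes.
// A file is never split: one larger than target gets a chunk to itself.
func groupFiles(files []fileEntry, target int64) [][]fileEntry {
	var chunks [][]fileEntry
	var cur []fileEntry
	var curSize int64
	for _, f := range files {
		// Close the current chunk if adding f would exceed the target.
		if len(cur) > 0 && curSize+f.Size > target {
			chunks = append(chunks, cur)
			cur, curSize = nil, 0
		}
		cur = append(cur, f)
		curSize += f.Size
	}
	if len(cur) > 0 {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	files := []fileEntry{
		{"bin/a", 30}, {"bin/b", 30}, {"lib/c", 50}, {"lib/big", 100}, {"etc/d", 10},
	}
	for i, c := range groupFiles(files, 64) {
		fmt.Printf("chunk %d: %v\n", i, c)
	}
}
```

Because chunk edges follow file edges, a changed file only invalidates the chunk that contains it, which is what makes this layout suited to file-level caching.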

Filesystem Awareness (-filesystem=true/false)

  • true: Process individual files with metadata
  • false: Treat as binary data

Output Structure

data/registry-data/
├── chunks/      # Individual chunk files (SHA256 named)
├── indexes/     # Index files mapping chunks to layers  
└── stargz/      # Stargz format files
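
Naming chunk files by their SHA256 digest is what makes deduplication automatic: identical content maps to the same file name. The sketch below shows that idea under assumptions. The chunkName and storeChunk functions are hypothetical, and the exact on-disk naming (plain hex digest, no extension) is a guess.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// chunkName returns the hex SHA256 digest used as the chunk's file name.
func chunkName(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// storeChunk writes data to dir/<sha256> unless that file already exists.
// created reports whether a new file was written (false means deduplicated).
func storeChunk(dir string, data []byte) (name string, created bool, err error) {
	name = chunkName(data)
	if err = os.MkdirAll(dir, 0o755); err != nil {
		return name, false, err
	}
	path := filepath.Join(dir, name)
	if _, err = os.Stat(path); err == nil {
		return name, false, nil // same content already stored
	} else if !errors.Is(err, os.ErrNotExist) {
		return name, false, err
	}
	if err = os.WriteFile(path, data, 0o644); err != nil {
		return name, false, err
	}
	return name, true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "chunks-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	created, deduped := 0, 0
	for _, c := range [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("aaaa")} {
		_, isNew, err := storeChunk(dir, c)
		if err != nil {
			panic(err)
		}
		if isNew {
			created++
		} else {
			deduped++
		}
	}
	fmt.Printf("created %d new, %d deduplicated\n", created, deduped)
}
```

The same effect explains the sample run above, where 88 chunks (81 new, 7 deduplicated) leave 81 files in chunks/.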

Available Images

  • test-alpine - Basic Alpine Linux
  • test-alpine-2 - Alpine variant
  • test-variant-a - Test variant A
  • test-variant-b - Test variant B
  • ubuntu-overlap - Ubuntu for overlap testing

Registry Push

Prerequisites:

go install github.com/google/go-containerregistry/cmd/crane@latest
aws configure

Registry: 299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal
Tag format: {image-name}-stargz-proper
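
The registry and tag format combine into a full image reference. A small sketch follows. The stargzRef helper is hypothetical, while the registry string and the -stargz-proper suffix come from this README.

```go
package main

import "fmt"

// registry is the repository named in this README.
const registry = "299170649678.dkr.ecr.us-east-1.amazonaws.com/hackathon/personal"

// stargzRef builds the full image reference for a pushed image,
// following the {image-name}-stargz-proper tag format.
func stargzRef(image string) string {
	return fmt.Sprintf("%s:%s-stargz-proper", registry, image)
}

func main() {
	fmt.Println(stargzRef("test-alpine"))
}
```

For test-alpine this yields the same reference shown in the push output above.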

About

Container image chunking with multiple strategies for CCT Hackweek
