A high-performance vector similarity search engine with LSH (Locality-Sensitive Hashing) optimization, written in Go.
- Fast LSH-based similarity search with configurable parameters
- SIMD-optimized vector operations for maximum performance
- Thread-safe concurrent operations with built-in synchronization
- Memory-efficient storage with parallel processing support
- Clean Go API with comprehensive test coverage
git clone https://github.com/colesmcintosh/vectorvault.git
cd vectorvault
go mod tidy
Requirements: Go 1.21+
package main

import (
	"fmt"
	"log"

	// Go only allows internal/ packages to be imported from code inside
	// the vectorvault module, so run this from within the cloned repository.
	"github.com/colesmcintosh/vectorvault/internal/vectorstore"
)

func main() {
	// Create vector store with default LSH parameters
	vs := vectorstore.New(vectorstore.DefaultLSHParams())

	// Add vectors with string keys
	vs.Add("document_1", []float64{0.1, 0.8, 0.3, 0.9})
	vs.Add("document_2", []float64{0.2, 0.7, 0.4, 0.8})
	vs.Add("document_3", []float64{0.9, 0.1, 0.7, 0.2})

	// Search for the 5 most similar vectors
	query := []float64{0.15, 0.75, 0.35, 0.85}
	results, err := vs.Search(query, 5)
	if err != nil {
		log.Fatal(err)
	}

	// Display results
	for _, result := range results {
		fmt.Printf("Key: %s, Similarity: %.4f\n", result.Key, result.Similarity)
	}
}
Tune performance and accuracy by configuring LSH parameters:
params := vectorstore.LSHParams{
	NumHashTables:    6,   // More tables = better recall, more memory
	NumHashFunctions: 8,   // More functions = better precision, slower hashing
	BucketWidth:      4.0, // Larger width = more matches, less precision
}
vs := vectorstore.New(params)
All operations are thread-safe and can be called concurrently:
// Safe concurrent operations
go vs.Add("key1", vector1)
go vs.Search(queryVector, 10)
go vs.Delete("key2")
Run the included examples to see VectorVault in action:
# Basic performance benchmark
go run cmd/vectorstore/main.go
# Text similarity demo
go run examples/text_similarity/main.go
# Semantic search with OpenAI embeddings (requires API key)
export OPENAI_API_KEY='your-api-key'
go run examples/semantic_search/main.go
├── cmd/vectorstore/ # Performance benchmarks
├── examples/ # Usage examples
│ ├── text_similarity/ # Basic text similarity
│ └── semantic_search/ # OpenAI semantic search
├── internal/vectorstore/ # Core implementation
└── pkg/vectormath/ # Public vector utilities
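VectorVault is optimized for both speed and memory efficiency. Timings come from the benchmark output below (the -10 suffix means GOMAXPROCS=10) and will vary by hardware:
- Vector addition: ~3.8µs per operation
- Memory usage: ~1.8KB per vector
- Search: ~31µs per query, using parallel processing with SIMD optimization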
BenchmarkVectorStore/Add-10 300000 3824 ns/op
BenchmarkVectorStore/Search-10 50000 31245 ns/op
# Run all tests
go test ./...
# With coverage
go test -cover ./...
# Benchmarks only
go test -bench=. ./internal/vectorstore
- Fork the repository
- Create a feature branch (git checkout -b feature/name)
- Make changes with tests
- Use conventional commit messages:
  feat(search): add parallel processing
  fix(store): resolve race condition
  docs: update API examples
- Submit a pull request
MIT License - see LICENSE file for details.