mrz1836/go-sanitize

🛁 go-sanitize

Lightweight Go library providing robust string sanitization and normalization utilities

📦 Installation

go-sanitize requires a supported release of Go.

go get -u github.com/mrz1836/go-sanitize

💡 Usage

Here is a basic example of how to use go-sanitize in your Go project:

package main

import (
	"fmt"

	"github.com/mrz1836/go-sanitize"
)

func main() {
	// Strip every character that is not a letter or a digit
	input := "Hello, World! @2025"
	sanitized := sanitize.AlphaNumeric(input, false) // pass true to keep spaces

	// Prints: Sanitized String: HelloWorld2025
	fmt.Println("Sanitized String:", sanitized)
}

  • Explore additional usage examples for practical integration patterns
  • Review benchmark results to assess performance characteristics
  • Examine the comprehensive test suite for validation and coverage
  • Fuzz tests are available to ensure robustness against unexpected inputs
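
To make the effect of the second argument in the usage example concrete, here is a minimal standard-library sketch of alphanumeric filtering with an optional "keep spaces" flag. The helper keepAlphaNumeric is hypothetical and is not part of go-sanitize; it assumes ASCII-only input, and the library's own handling of non-ASCII letters may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// keepAlphaNumeric is a hypothetical helper (not the go-sanitize API).
// It keeps ASCII letters and digits, plus spaces when keepSpaces is true.
func keepAlphaNumeric(s string, keepSpaces bool) string {
	var b strings.Builder
	b.Grow(len(s)) // the result is never longer than the input
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			b.WriteRune(r)
		case r == ' ' && keepSpaces:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(keepAlphaNumeric("Hello, World! @2025", false)) // HelloWorld2025
	fmt.Println(keepAlphaNumeric("Hello, World! @2025", true))  // Hello World 2025
}
```

With the flag set to false the spaces are dropped along with the punctuation; with true only the punctuation is removed.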

📚 Documentation

View the generated documentation

Heads up! go-sanitize is intentionally light on dependencies. The only external package it uses is the excellent testify suite, and that's just for our tests. You can drop this library into your projects without dragging along extra baggage.


Features

  • Alpha and alphanumeric sanitization with optional spaces
  • Bitcoin and Bitcoin Cash address sanitizers
  • Custom regular expression helper for arbitrary patterns
  • Precompiled regex sanitizer for repeated patterns
  • Decimal, domain, email and IP address normalization
  • HTML and XML stripping with script removal
  • URI, URL and XSS sanitization
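
As an illustration of what "IP address normalization" can mean in practice, the sketch below trims the input and round-trips it through the standard library's net.ParseIP, returning an empty string for anything that does not parse. The helper normalizeIP is hypothetical and only approximates the idea; it is not the library's implementation, and the library's handling of invalid input may differ.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeIP is a hypothetical helper (not the go-sanitize API).
// It returns the canonical text form of a valid IPv4 or IPv6 address,
// or an empty string when the input is not a valid address.
func normalizeIP(s string) string {
	ip := net.ParseIP(strings.TrimSpace(s))
	if ip == nil {
		return ""
	}
	return ip.String()
}

func main() {
	fmt.Println(normalizeIP(" 192.168.0.1 "))                           // 192.168.0.1
	fmt.Println(normalizeIP("2001:0db8:0000:0000:0000:0000:0000:0001")) // 2001:db8::1
	fmt.Println(normalizeIP("not-an-ip") == "")                         // true
}
```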

Functions

  • Alpha: Remove non-alphabetic characters, optionally keep spaces
  • AlphaNumeric: Remove non-alphanumeric characters, optionally keep spaces
  • BitcoinAddress: Filter input to valid Bitcoin address characters
  • BitcoinCashAddress: Filter input to valid Bitcoin Cash address characters
  • Custom: Use a custom regex to filter input (legacy)
  • CustomCompiled: Use a precompiled custom regex to filter input (suggested)
  • Decimal: Keep only decimal or float characters
  • Domain: Sanitize domain, optionally preserving case and removing www
  • Email: Normalize an email address
  • FirstToUpper: Capitalize the first letter of a string
  • FormalName: Keep only formal name characters
  • HTML: Strip HTML tags
  • IPAddress: Return sanitized and valid IPv4 or IPv6 address
  • Numeric: Remove all but numeric digits
  • PhoneNumber: Keep digits and plus signs for phone numbers
  • PathName: Sanitize to a path-friendly name
  • Punctuation: Allow letters, numbers and basic punctuation
  • ScientificNotation: Keep characters valid in scientific notation
  • Scripts: Remove scripts, iframe and object tags
  • SingleLine: Replace line breaks and tabs with spaces
  • Time: Keep only valid time characters
  • URI: Keep characters allowed in a URI
  • URL: Keep characters allowed in a URL
  • XML: Strip XML tags
  • XSS: Remove common XSS attack strings
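
The list above distinguishes a legacy regex helper from a precompiled one. The standard-library sketch below shows the general trade-off between compiling a pattern on every call and compiling it once for reuse. The function names are hypothetical, and the assumption that the legacy helper compiles its pattern per call is inferred from the naming, not confirmed by the source.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonDigits is compiled once at package initialization and reused by every call.
var nonDigits = regexp.MustCompile(`[^0-9]`)

// stripWithPattern is a hypothetical helper that compiles the pattern on
// every call, which is convenient but repeats the compilation cost.
// It panics if the pattern is not a valid regular expression.
func stripWithPattern(s, pattern string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(s, "")
}

// stripWithCompiled is a hypothetical helper that reuses an already
// compiled expression, so repeated calls skip the compilation step.
func stripWithCompiled(s string, re *regexp.Regexp) string {
	return re.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(stripWithPattern("Order #4521-B", `[^0-9]`)) // 4521
	fmt.Println(stripWithCompiled("Order #4521-B", nonDigits)) // 4521
}
```

Both calls return the same result; the second form is usually preferable when the same pattern is applied many times.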

Additional Documentation & Repository Management

Library Deployment

This project uses goreleaser for streamlined binary and library deployment to GitHub. To get started, install it via:

brew install goreleaser

The release process is defined in the .goreleaser.yml configuration file.

To generate a snapshot (non-versioned) release for testing purposes, run:

make release-snap

Before tagging a new version, update the release metadata in the CITATION.cff file:

make citation version=0.2.1

Then create and push a new Git tag using:

make tag version=x.y.z

This process ensures consistent, repeatable releases with properly versioned artifacts and citation metadata.

Makefile Commands

View all makefile commands

make help

List of all current commands:

bench                 ## Run all benchmarks in the Go application
build-go              ## Build the Go application (locally)
citation              ## Update version in CITATION.cff (use version=X.Y.Z)
clean-mods            ## Remove all the Go mod cache
coverage              ## Show test coverage
diff                  ## Show git diff and fail if uncommitted changes exist
fumpt                 ## Run fumpt to format Go code
generate              ## Run go generate in the base of the repo
godocs                ## Trigger GoDocs tag sync
govulncheck-install   ## Install govulncheck (pass VERSION= to override)
govulncheck           ## Scan for vulnerabilities
help                  ## Display this help message
install-go            ## Install using go install with specific version
install-releaser      ## Install GoReleaser
install-stdlib        ## Install the Go standard library for the host platform
install-template      ## Kick-start a fresh copy of go-template (run once!)
install               ## Install the application binary
lint-version          ## Show the golangci-lint version
lint                  ## Run the golangci-lint application (install if not found)
loc                   ## Total lines of code table
mod-download          ## Download Go module dependencies
mod-tidy              ## Clean up go.mod and go.sum
pre-build             ## Pre-build all packages to warm cache
release-snap          ## Build snapshot binaries
release-test          ## Run release dry-run (no publish)
release               ## Run production release (requires github_token)
tag-remove            ## Remove local and remote tag (use version=X.Y.Z)
tag-update            ## Force-update tag to current commit (use version=X.Y.Z)
tag                   ## Create and push a new tag (use version=X.Y.Z)
test-ci-no-race       ## CI test suite without race detector
test-ci               ## CI test runs tests with race detection and coverage (no lint - handled separately)
test-cover-race       ## Runs unit tests with race detector and outputs coverage
test-cover            ## Unit tests with coverage (no race)
test-fuzz             ## Run fuzz tests only (no unit tests)
test-no-lint          ## Run only tests (no lint)
test-parallel         ## Run tests in parallel (faster for large repos)
test-race             ## Unit tests with race detector (no coverage)
test-short            ## Run tests excluding integration tests (no lint)
test                  ## Default testing uses lint + unit tests (fast)
uninstall             ## Uninstall the Go binary
update-linter         ## Upgrade golangci-lint (macOS only)
update-releaser       ## Reinstall GoReleaser
update                ## Update dependencies
vet-parallel          ## Run go vet in parallel (faster for large repos)
vet                   ## Run go vet only on your module packages

GitHub Workflows

🎛️ The Workflow Control Center

All GitHub Actions workflows in this repository are powered by a single configuration file: .env.shared – your one-stop shop for tweaking CI/CD behavior without touching a single YAML file! 🎯

This magical file controls everything from:

  • 🚀 Go version matrix (test on multiple versions or just one)
  • 🏃 Runner selection (Ubuntu or macOS, your wallet decides)
  • 🔬 Feature toggles (coverage, fuzzing, linting, race detection)
  • 🛡️ Security tool versions (gitleaks, nancy, govulncheck)
  • 🤖 Auto-merge behaviors (how aggressive should the bots be?)
  • 🏷️ PR management rules (size labels, auto-assignment, welcome messages)

Pro tip: Want to disable code coverage? Just flip ENABLE_CODE_COVERAGE=false in .env.shared and push. No YAML archaeology required!


Workflow Name                     Description
auto-merge-on-approval.yml        Automatically merges PRs after approval and all required checks, following strict rules.
codeql-analysis.yml               Analyzes code for security vulnerabilities using GitHub CodeQL.
dependabot-auto-merge.yml         Automatically merges Dependabot PRs that meet all requirements.
fortress.yml                      Runs the GoFortress security and testing workflow, including linting, testing, releasing, and vulnerability checks.
pull-request-management.yml       Labels PRs by branch prefix, assigns a default user if none is assigned, and welcomes new contributors with a comment.
scorecard.yml                     Runs OpenSSF Scorecard to assess supply chain security.
stale.yml                         Warns about (and optionally closes) inactive issues and PRs on a schedule or manual trigger.
sync-labels.yml                   Keeps GitHub labels in sync with the declarative manifest at .github/labels.yml.
update-python-dependencies.yml    Updates Python dependencies for pre-commit hooks in the repository.
update-pre-commit-hooks.yml       Automatically updates versions for pre-commit hooks.

Updating Dependencies

To update all dependencies (Go modules, linters, and related tools), run:

make update

This command ensures all dependencies are brought up to date in a single step, including Go modules and any tools managed by the Makefile. It is the recommended way to keep your development environment and CI in sync with the latest versions.


🧪 Examples & Tests

All unit tests and examples run via GitHub Actions and use Go version 1.24.x. View the configuration file.

Run all tests (fast):

make test

Run all tests with race detector (slower):

make test-race

⚡ Benchmarks

Run the Go benchmarks:

make bench

Benchmark Results

Benchmark                  Iterations    ns/op      B/op   allocs/op
Alpha                      14,018,806    84.89      24     1
Alpha_WithSpaces           12,664,946    94.25      24     1
AlphaNumeric               9,161,546     130.6      32     1
AlphaNumeric_WithSpaces    7,978,879     150.8      32     1
BitcoinAddress             8,843,929     137.1      48     1
BitcoinCashAddress         5,892,612     196.2      48     1
Custom (Legacy)            938,733       1,249.0    913    16
CustomCompiled             1,576,502     762.3      96     5
Decimal                    16,285,825    73.91      24     1
Domain                     4,784,115     251.6      176    3
Domain_PreserveCase        5,594,325     213.9      160    2
Domain_RemoveWww           4,771,556     251.0      176    3
Email                      8,380,172     144.2      48     2
Email_PreserveCase         13,468,302    90.06      24     1
FirstToUpper               57,342,418    20.60      16     1
FormalName                 14,557,754    83.12      24     1
HTML                       2,558,787     468.5      48     3
IPAddress                  11,388,638    102.7      32     2
IPAddress_IPV6             3,434,715     350.9      96     2
Numeric                    22,661,516    52.92      16     1
PhoneNumber                17,502,224    68.84      24     1
PathName                   13,881,150    86.58      24     1
Punctuation                7,377,070     162.3      48     1
ScientificNotation         19,399,621    61.62      24     1
Scripts                    2,060,790     580.6      16     1
SingleLine                 9,777,549     123.5      32     1
Time                       21,270,655    55.92      16     1
URI                        9,005,937     133.4      32     1
URL                        8,989,400     135.2      32     1
XML                        4,351,617     275.7      48     3
XSS                        3,302,917     362.9      40     2

Performance benchmarks for the core functions in this library, executed on an Apple M1 Max (ARM64). Most functions run in under 200 ns/op, and most make a single allocation per call. The legacy Custom function is the slowest at 1,249.0 ns/op and 16 allocs/op, compared with 762.3 ns/op and 5 allocs/op for CustomCompiled.
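
For readers who want to see where columns such as ns/op, B/op and allocs/op come from, the sketch below measures a small stand-in function with testing.Benchmark from the Go standard library. The function keepDigits is a hypothetical example and not the library's code; the numbers it produces depend on the machine and are not comparable to the results reported here.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// keepDigits is a hypothetical stand-in for a sanitizer (not the go-sanitize API).
// It keeps only the ASCII digits of s.
func keepDigits(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for i := 0; i < len(s); i++ {
		if c := s[i]; c >= '0' && c <= '9' {
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	// testing.Benchmark runs the function with increasing b.N until the
	// timing is stable, without needing the "go test" command.
	result := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			_ = keepDigits("Phone: +1 (555) 010-9999")
		}
	})

	fmt.Printf("iterations=%d ns/op=%d B/op=%d allocs/op=%d\n",
		result.N, result.NsPerOp(), result.AllocedBytesPerOp(), result.AllocsPerOp())
}
```

In a real project the same loop would normally live in a _test.go file as a BenchmarkXxx function and be run through make bench or go test -bench.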


πŸ› οΈ Code Standards

Read more about this Go project's code standards.


🤖 AI Compliance

This project documents expectations for AI assistants using a few dedicated files:

  • AGENTS.md: canonical rules for coding style, workflows, and pull requests used by Codex.
  • CLAUDE.md: quick checklist for the Claude agent.
  • .cursorrules: machine-readable subset of the policies for Cursor and similar tools.
  • sweep.yaml: rules for Sweep, a tool for code review and pull request management.

Edit AGENTS.md first when adjusting these policies, and keep the other files in sync within the same pull request.


👥 Maintainers

MrZ

🤝 Contributing

View the contributing guidelines and please follow the code of conduct.

How can I help?

All kinds of contributions are welcome 🙌! The most basic way to show your support is to star 🌟 the project, or to raise issues 💬. You can also support this project by becoming a sponsor on GitHub 👏 or by making a bitcoin donation to ensure this journey continues indefinitely! 🚀


📝 License

License
