feat: Multi-database and AWS S3 storage support #100
Conversation
- Replace POSTGRES_* env vars with generic DB_* naming convention
- Add support for MySQL alongside PostgreSQL and SQLite
- Add DB_TYPE environment variable to specify database engine
- Configure optional dependencies for pg and mysql2 drivers
- Maintain backward compatibility with SQLite as fallback

- Implement StorageManager with automatic S3/local detection
- Support S3 storage for assets, collections, and storage.json
- Add CloudFront integration and URL handling fixes
- Maintain full backward compatibility with local storage

…Move storage files to dedicated directory, rename S3Storage to AwsS3Storage, remove redundant files, integrate CloudStorage into StorageManager, fix built-in collections asset processing, unify S3/local logic, improve collection listing format consistency, fix URL issues, add detailed logging

- Add STORAGE_TYPE environment variable for explicit storage configuration
- Support local file system and AWS S3 storage with configurable switching
- Add multi-database support with PostgreSQL, MySQL, and SQLite fallback
- Update clean-world.mjs script to support all database types
- Improve environment variable documentation with clear configuration sections
- Add proper validation for required S3 configuration when using AWS storage
- Maintain backward compatibility with existing configurations
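For context, a minimal sketch of the storage selection these commits describe (a sketch only, paraphrased from the commit messages; the LocalStorage class, the createStorage function, and the constructor options are assumptions, not the actual implementation):

// Sketch only: pick a storage backend from STORAGE_TYPE (names are illustrative)
import { LocalStorage } from './LocalStorage.js'
import { AwsS3Storage } from './AwsS3Storage.js'

export function createStorage() {
  const storageType = process.env.STORAGE_TYPE || 'local'
  if (storageType === 'aws') {
    // S3 configuration must be present when using AWS storage
    return new AwsS3Storage({
      bucket: process.env.S3_BUCKET_NAME,
      region: process.env.S3_REGION,
    })
  }
  // default: local file system, keeping backward compatibility
  return new LocalStorage()
}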
Just some quick observations, let me know your thoughts.
.env.example
Outdated
S3_REGION=eu-west-1
S3_ASSETS_PREFIX=
S3_COLLECTIONS_PREFIX=
S3_STORAGE_PREFIX=
These changes would default to Postgres for new people attempting to quickly spin up new worlds, and I think we should stick with SQLite as the default so that people don't have to spend a lot of time launching a Postgres Docker container etc.
Additionally, this adds a lot of new environment variables that might be simplified by just using URIs, like this:
# by default set to empty which uses sqlite in the world folder
DB_TYPE=
DB_URL=
# optionally configured to connect to a remote postgres database
DB_TYPE=pg
DB_URL=postgres://username:password@localhost:5432/database?schema=public
I don't think many people use MySQL, so I'd be happier to just not support that right now.
That works for me. I've updated both the code and the .env.example file to simplify the database setup as you recommended. Only two environment variables are used now:
- DB_TYPE (leave empty for SQLite, or set to pg for Postgres)
- DB_URL (leave empty for SQLite, or provide a Postgres connection URI)
MySQL support and all the old related variables have been removed. This makes the default experience much simpler for new users, who can now spin up worlds with SQLite out of the box.
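For illustration, a minimal sketch of what that driver selection could look like (assuming a knex-style db.js; the createDb signature and the SQLite client name are assumptions, not the PR's actual code):

// Sketch only: pick the database client from DB_TYPE / DB_URL (names are illustrative)
import Knex from 'knex'
import path from 'path'

export function createDb(worldDir) {
  const dbType = process.env.DB_TYPE || ''
  const dbUrl = process.env.DB_URL || ''
  if (dbType === 'pg' && dbUrl) {
    // remote Postgres via a single connection URI
    return Knex({ client: 'pg', connection: dbUrl })
  }
  // default: SQLite file inside the world folder
  return Knex({
    client: 'better-sqlite3', // assumption: any SQLite knex client works here
    connection: { filename: path.join(worldDir, 'db.sqlite') },
    useNullAsDefault: true,
  })
}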
scripts/clean-world-s3.mjs
Outdated
* - For SQLite: Uses local ./[world]/db.sqlite file
*
* Storage configuration:
* - STORAGE_TYPE must be set to 'aws'
S3 has become a bit of a standard; for example, Cloudflare R2 can actually use standard S3 client libraries for managing files on R2. So perhaps 'aws' should just be called 's3'?
You're right!
I've updated the codebase to use s3 instead of aws for the storage type and environment variables. This makes the storage integration more generic and compatible with any S3-compatible provider (such as CloudFlare R2, DigitalOcean Spaces, etc.), not just AWS S3.
All relevant code, environment variables, and documentation have been updated accordingly.
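For reference, a hedged sketch of how an S3-compatible endpoint can be wired up with the same client (the S3_ENDPOINT variable and credential variable names here are assumptions for illustration, not something the PR defines):

// Sketch only: the AWS SDK v3 S3 client also accepts a custom endpoint,
// which is what makes providers like Cloudflare R2 or DigitalOcean Spaces work.
import { S3Client } from '@aws-sdk/client-s3'

const s3 = new S3Client({
  region: process.env.S3_REGION || 'auto',
  endpoint: process.env.S3_ENDPOINT, // e.g. an R2 or Spaces endpoint URL (illustrative)
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  },
})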
scripts/clean-world-s3.mjs
Outdated
// Close database connection before exiting
await db.destroy()
process.exit()
Maybe this could be done in the future, but rather than maintaining two separate world-clean scripts, this might be better abstracted into functions like deleteFileS3() and deleteFileLocal(), so that any changes to the general logic in a shared world-clean script would automatically apply to both.
I've refactored the clean-world scripts to use shared utilities with functions like deleteFileS3() and deleteFileLocal().
Now both storage types stay in sync automatically when we update the cleanup logic - no more maintaining two separate scripts! The code is much cleaner and easier to maintain.
The scripts are organized in a nice modular structure (scripts/clean-world/) but still work exactly the same way with npm run world:clean.
Thanks for the feedback - this is definitely a much better approach!
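For illustration, the shared helpers could look roughly like this (the function names come from the discussion above; the bodies and parameters are assumptions, not the PR's exact code):

// Sketch only: shared deletion helpers used by both cleanup paths
import path from 'path'
import fs from 'fs-extra'
import { DeleteObjectCommand } from '@aws-sdk/client-s3'

export function deleteFileLocal(assetsDir, fileAsset) {
  // local file system: remove the asset file from the world's assets directory
  fs.removeSync(path.join(assetsDir, fileAsset))
}

export async function deleteFileS3(s3Client, bucket, prefix, fileAsset) {
  // S3: delete the object under the configured assets prefix
  await s3Client.send(new DeleteObjectCommand({ Bucket: bucket, Key: `${prefix}${fileAsset}` }))
}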
Again, it's highly likely that this will work with Cloudflare R2 as it's S3-compatible, so we could just call this S3Storage.js etc. and drop the AWS naming.
Is it possible to rebase this branch so that it doesn't have a bunch of dev commits merged back on top of it? This helps a lot to keep the git history clean.
…to sqlite, remove MySQL support and old env vars. Update .env.example accordingly.
… for S3-compatible storage support (CloudFlare R2, DigitalOcean Spaces, etc.)
… and modular structure
- Create clean-world/ directory with modular scripts
- Add clean-world-utils.mjs with shared functions (deleteFileLocal, deleteFileS3, etc.)
- Refactor clean-world-local.mjs and clean-world-s3.mjs to use shared utilities
- Update main clean-world.mjs to dynamically import appropriate script
- Add README.md with documentation for the new structure
- Fix path resolution issues for world directories
- Maintain backward compatibility with npm run world:clean

This addresses the reviewer's suggestion to avoid code duplication and ensure both storage types stay in sync when cleanup logic changes.
…mes/numinia-hyperfy2 into feat/full-storage-and-db-support # Conflicts: # .env.example # scripts/clean-world.mjs # src/server/storage/StorageManager.js
…alizing the storage manager
I did a rebase of the branch with dev and I think I didn't mess up anything 🤞!
package.json
Outdated
"pg": "^8.16.0", | ||
"@aws-sdk/client-s3": "^3.685.0", | ||
"@aws-sdk/s3-request-presigner": "^3.685.0", | ||
"pg": "^8.16.0" |
pg here twice
.env.example
Outdated
S3_REGION=eu-west-1
S3_ASSETS_PREFIX=
S3_COLLECTIONS_PREFIX=
S3_STORAGE_PREFIX=
Can we do URIs for S3 too?
Implemented S3 URIs as suggested!
Now you can use STORAGE_URL=s3://bucket-name/folder-path/assets/ with standard AWS credentials (just like the DB_URL approach).
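A rough sketch of how such a URI can be parsed (illustrative only; the PR's actual parsing and the parseStorageUrl name are assumptions):

// Sketch only: derive bucket, prefix and region from STORAGE_URL
function parseStorageUrl(storageUrl) {
  if (!storageUrl || !storageUrl.startsWith('s3://')) {
    return { type: 'local' } // no URL means local file system storage
  }
  const url = new URL(storageUrl) // e.g. s3://bucket-name/folder-path/assets/?region=us-east-1
  return {
    type: 's3',
    bucket: url.hostname,                     // bucket-name
    prefix: url.pathname.replace(/^\//, ''),  // folder-path/assets/
    region: url.searchParams.get('region') || undefined,
  }
}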
scripts/clean-world.mjs
Outdated
} else {
  console.log('Running local cleanup...')
  await import('./clean-world/clean-world-local.mjs')
}
We're still fragmenting here: any changes to the way cleaning works need to be carefully updated in both of these, and they are likely to diverge because they are very different from each other. This is a risky piece of code.
I think this could be mitigated by having a single control flow that uses an abstraction to read and write from the db and files.
for example we have this code currently:
let blueprints = new Set()
const blueprintRows = await db('blueprints')
for (const row of blueprintRows) {
  const blueprint = JSON.parse(row.data)
  blueprints.add(blueprint)
}
this would work with both database options.
but then we have this:
console.log(`deleting ${filesToDelete.length} assets`)
for (const fileAsset of filesToDelete) {
  const fullPath = path.join(assetsDir, fileAsset)
  if (!DRY_RUN) {
    fs.removeSync(fullPath)
  }
  console.log('delete asset:', fileAsset)
}
which only works with the filesystem, but could be rewritten to work with both (this is an example only):
console.log(`deleting ${filesToDelete.length} assets`)
for (const fileAsset of filesToDelete) {
  if (!DRY_RUN) {
    await storage.deleteAsset(fileAsset)
  }
  console.log('delete asset:', fileAsset)
}
this would allow us to have a SINGLE script for world cleaning instead of needing to keep two in sync and risk them diverging/breaking
I understand your concern about uncontrolled growth - if we end up needing more storage types (Azure, GCP, etc.), we'd indeed need to rethink the architecture toward something more modular.
I've consolidated everything into a single script (scripts/clean-world.mjs) that completely eliminates duplication. The solution maintains the simple and direct style of the original code, but adds conditional AWS S3 support.
Does this approach seem better? It maintains simplicity while completely eliminating the risk you mentioned.
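As an illustration of the conditional approach described above (a sketch only, reusing the filesToDelete, DRY_RUN, bucket, prefix, and client variables already defined in the script; not the exact code in the PR):

// Sketch only: one control flow, with the S3/local difference isolated at the delete call
for (const fileAsset of filesToDelete) {
  if (!DRY_RUN) {
    if (storageType === 's3') {
      await s3.send(new DeleteObjectCommand({ Bucket: bucket, Key: `${assetsPrefix}${fileAsset}` }))
    } else {
      fs.removeSync(path.join(assetsDir, fileAsset))
    }
  }
  console.log('delete asset:', fileAsset)
}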
…storage support
- Resolved conflicts in clean-world.mjs: unified script supports both local and S3 storage
- Resolved conflicts in db.js: flexible database configuration (SQLite default, PostgreSQL via env vars)
- Resolved conflicts in index.js: integrated StorageManager for multi-storage support
- Added S3 storage capabilities via StorageManager architecture
- Maintained backward compatibility with local filesystem storage
- All database operations now support both SQLite and PostgreSQL
- Deleted scripts/clean-world/clean-world-local.mjs (functionality moved to unified script)
- Deleted scripts/clean-world/clean-world-s3.mjs (functionality moved to unified script)
- Deleted scripts/clean-world/clean-world-utils.mjs (functions now inline)
- Deleted scripts/clean-world/README.md (documentation for removed fragmented approach)
- Removed empty scripts/clean-world/ directory

All functionality is now consolidated in scripts/clean-world.mjs with conditional logic for local/S3 storage.
- Added S3 URI parsing (s3://bucket/prefix/?region=us-east-1)
- Support for STORAGE_URL environment variable
- Maintains backward compatibility with individual S3_* env vars
- Auto-detection of S3 from STORAGE_URL (no STORAGE_TYPE needed)
- Support for AWS standard credentials (AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY)
- Updated StorageManager to use unified configuration approach
- Updated clean-world.mjs script to support URI configuration
- Added comprehensive .env.example with configuration examples
- Follows AWS credential chain best practices (env vars, profiles, IAM roles)

Examples:
STORAGE_URL=s3://bucket-name/assets/?region=us-east-1
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=secret...

This addresses the maintainer's request for unified URI-based configuration similar to DB_URL approach, making S3 configuration cleaner and more intuitive.
Remove all legacy individual environment variables and adopt a clean, consistent URI-based configuration system:

Database Configuration:
- Remove DB_TYPE - auto-detect from DB_URL (postgres:// vs SQLite default)
- Simplified: Only DB_URL needed for PostgreSQL, empty for SQLite

Storage Configuration:
- Remove STORAGE_TYPE, S3_BUCKET_NAME, S3_REGION, S3_*_PREFIX - all in URI now
- Simplified: Only STORAGE_URL needed (s3://bucket/assets/ vs local default)
- Use standard AWS_ACCESS_KEY_ID instead of S3_ACCESS_KEY_ID

Updated Components:
- StorageManager: Only accepts STORAGE_URL, no fallback to individual vars
- clean-world.mjs: Simplified storage/DB config parsing
- db.js: Auto-detect PostgreSQL from postgres:// protocol
- .env.example: Clean and minimal like original, 4 vars instead of 10+

Benefits:
- Consistent with maintainer's request for URI-based config like DB_URL
- Cleaner configuration: one URI per service (DB_URL, STORAGE_URL)
- Auto-detection eliminates need for TYPE variables
- Standard AWS credential naming
- Much simpler .env.example file

Example configuration:
DB_URL=postgres://user:pass@host:port/db
STORAGE_URL=s3://bucket/assets/?region=us-east-1
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=secret...
- Fix SSL certificate errors by properly using CloudFront URLs
- Fix PostgreSQL schema configuration with proper search_path
- Fix database migrations to use remote storage instead of local files
- Fix clean-world script to support PostgreSQL schemas
- Remove redundant CloudFront URL configuration
- Improve database configuration structure
Description
This PR introduces comprehensive database and storage flexibility to the Hyperfy server infrastructure. The changes add multi-database support (PostgreSQL, MySQL, SQLite) and AWS S3 storage capabilities, improving deployment flexibility and scalability options.
Components affected:
Key improvements:
- DB_TYPE configuration supporting PostgreSQL, MySQL, and SQLite
- STORAGE_TYPE configuration for local file system or AWS S3 storage

Fixes # (issue)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe your tests:
Test Configuration:
Checklist: