A Python CLI tool to upload files to an AWS S3 bucket using a presigned URL. Designed for data engineering workflows, automation pipelines, and robust CLI-based file ingestion.
Built with portability, reproducibility, and cloud-native best practices in mind.
- Upload any file to S3 via presigned URL.
- AWS authentication via named profile (from `~/.aws/credentials`).
- Supports dynamic S3 bucket names and upload paths.
- Well-structured CLI with helpful error messages and validation.
- Fully typed and tested Python codebase.
- Compatible across macOS, Linux, and Windows.
- The maximum supported file size is 5 GB, due to the use of presigned URLs with S3 `put_object` operations.
- Multipart uploads (required for files >5 GB) are not yet supported.
- Uploads are unauthenticated beyond the presigned URL: anyone with the URL can upload during its validity period (300 seconds).
- No built-in retry or resumable uploads: transient network issues may cause failures.
- The CLI assumes the AWS profile is configured locally (via `~/.aws/credentials`).
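For context, the 300-second window comes from the expiry set when the URL is presigned. A minimal sketch of that step with boto3 (the profile, bucket, and key names here are placeholders, not the tool's exact internals):

```python
import boto3

# Minimal sketch: presign a single put_object call with a 300-second expiry.
# Profile, bucket, and key names are placeholders.
session = boto3.Session(profile_name="default")
s3 = session.client("s3")

url = s3.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "my-bucket-name", "Key": "uploads/my-file.csv"},
    ExpiresIn=300,  # seconds the URL remains valid
)
print(url)
```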
Before using this tool, make sure the following are set up:
- Python 3.13+ installed on your system
- An AWS S3 bucket you have access to (create one if needed)
- An AWS CLI installation and configuration:
  - Install the AWS CLI
  - Run `aws configure` to set up your credentials
- An AWS profile properly configured in `~/.aws/credentials` and `~/.aws/config` (see the sanity-check sketch after this list):
  - You can use the default profile or a named profile via `--profile` when using this tool
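A quick way to confirm a profile resolves to valid credentials before running the tool (a sketch; `my-profile` is a placeholder for your own profile name):

```python
import boto3

# Sanity check: confirm the named profile resolves to usable credentials.
# "my-profile" is a placeholder; substitute your profile or omit for the default.
session = boto3.Session(profile_name="my-profile")
identity = session.client("sts").get_caller_identity()
print("Authenticated as:", identity["Arn"])
```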
**Warning**
For safety and best practices, we strongly recommend using an AWS profile with least privilege when running this tool. This means:
- Create a dedicated IAM user or role for uploads
- Attach a custom policy that only allows `s3:PutObject` on the specific bucket or path
- Avoid using your root credentials or overly permissive policies like `AdministratorAccess`
This reduces the risk of accidental or malicious access to other AWS resources in your account.
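For illustration only, a least-privilege policy scoped to a single upload prefix could look roughly like this (the bucket name and prefix are placeholders you would replace with your own):

```python
import json

# Example least-privilege policy: allow put_object only under one prefix.
# Bucket name and prefix are placeholders.
upload_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket-name/uploads/*",
        }
    ],
}

# Paste the JSON into the IAM console or attach it via the CLI/SDK.
print(json.dumps(upload_policy, indent=2))
```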
This tool uses presigned URLs to upload files directly to Amazon S3, instead of routing large files through API Gateway or other intermediaries.
| | API Gateway as a proxy | Presigned URLs with API Gateway | CloudFront with Lambda@Edge |
|---|---|---|---|
| Max Object Size | 10 MB | 5 GB (5 TB with multipart upload) | 5 GB |
| Client Complexity | Single HTTP Request | Multiple HTTP Requests | Single HTTP Request |
| Authorization Options | Amazon Cognito, IAM, Lambda Authorizer | Amazon Cognito, IAM, Lambda Authorizer | Lambda@Edge |
| Throttling | API Key Throttling | API Key Throttling | Custom Throttling |
Source: https://aws.amazon.com/blogs/compute/patterns-for-building-an-api-to-upload-files-to-amazon-s3/
Presigned URLs strike a great balance between security, upload size, and simplicity, making them ideal for data engineering ingestion pipelines.
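To make the simplicity concrete: once a presigned URL has been issued, the upload itself is a single HTTP PUT of the file body. A rough sketch using the `requests` library (the URL and file path are placeholders):

```python
import requests

# Single HTTP PUT of the file body against an already-issued presigned URL.
# The URL and file path below are placeholders.
presigned_url = "https://my-bucket-name.s3.amazonaws.com/uploads/my-file.csv?..."
with open("local-file.csv", "rb") as f:
    response = requests.put(presigned_url, data=f)
response.raise_for_status()  # raises if S3 rejected the upload (e.g. expired URL)
```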
Clone the repo and install dependencies using Poetry:
poetry install --with dev,test
Alternatively, you can use:
make install
You'll also need Python 3.13+ installed.
Run the CLI:
poetry run python -m uploader.main \
--bucket-name my-bucket-name \
--upload-path uploads/my-file.csv \
--file-location ~/Downloads/local-file.csv \
--profile default
Or, using the Makefile:
make run ARGS="--bucket-name my-bucket-name \
--upload-path uploads/my-file.csv \
--file-location ~/Downloads/local-file.csv \
--profile default"
Arguments:
- `--bucket-name`: Target S3 bucket to upload to (required)
- `--upload-path`: S3 key where the file should be stored (required)
- `--file-location`: Path to the local file to upload (required)
- `--profile`: AWS CLI profile to use (optional, defaults to `default`)
Run all tests:
make test
Generate test coverage report:
make test-coverage
This project uses the following tools:
- `black` for formatting
- `flake8` for linting
- `isort` for sorting imports
- `mypy` for type checks
- `pre-commit` to run all tools automatically before each commit
Run the pre-commit hooks without making a commit:
pre-commit run --all-files
s3-file-uploader-cli/
├── src/
│   └── uploader/             # Main logic (presign, upload, logging)
├── tests/                    # Unit tests for all modules
├── .github/workflows/        # CI pipeline with linting + test coverage
├── .pre-commit-config.yaml
├── Makefile                  # Common tasks (run, test, lint, etc)
└── pyproject.toml            # Tooling config (Poetry, mypy, black, etc)
- Support multipart uploads for large files (>5 GB).
- Add retry and backoff logic on upload failures (see the sketch after this list).
- Add logging hooks to external monitoring systems.
- Add support for directory-level uploads.
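For the retry/backoff item, the intended shape is an exponential backoff loop around the PUT. A minimal sketch of the idea (illustrative only, not part of the current codebase; all names are placeholders):

```python
import time

import requests


def upload_with_retry(presigned_url: str, file_path: str, max_attempts: int = 3) -> None:
    """Illustrative retry loop with exponential backoff; not yet part of the tool."""
    for attempt in range(1, max_attempts + 1):
        try:
            with open(file_path, "rb") as f:
                response = requests.put(presigned_url, data=f)
            response.raise_for_status()
            return
        except requests.RequestException:
            if attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # back off: 2s, 4s, 8s, ...
```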
Feel free to fork, contribute, or reach out if you find this project helpful!