πŸ“€ s3-file-uploader-cli


A Python CLI tool to upload files to an AWS S3 bucket using a presigned URL. Designed for data engineering workflows, automation pipelines, and robust CLI-based file ingestion.

Built with portability, reproducibility, and cloud-native best practices in mind.


πŸš€ Features

  • Upload any file to S3 via presigned URL.
  • AWS authentication via named profile (from ~/.aws/credentials).
  • Supports dynamic S3 bucket names and upload paths.
  • Well-structured CLI with helpful error messages and validation.
  • Fully typed and tested Python codebase.
  • Compatible across macOS, Linux, and Windows.

⚠️ Limitations

  • The maximum supported file size is 5 GB, due to the use of presigned URLs with S3 put_object operations.
  • Multipart uploads (required for files >5 GB) are not yet supported.
  • Uploads are unauthenticated beyond the presigned URL β€” anyone with the URL can upload during its validity period (300 seconds).
  • No built-in retry or resumable uploads β€” transient network issues may cause failures.
  • The CLI assumes the AWS profile is configured locally (via ~/.aws/credentials).
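
Under the hood, a presigned PUT URL of this kind is typically generated with boto3. The following is a minimal illustrative sketch, not necessarily this tool's exact code; the bucket, key, and profile names are placeholders:

import boto3

# Create a session from a locally configured AWS profile (placeholder name).
session = boto3.Session(profile_name="default")
s3 = session.client("s3")

# Ask S3 to sign a PUT request for a specific bucket/key pair. The URL is
# only valid for ExpiresIn seconds (300 here, matching the window above).
url = s3.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "my-bucket-name", "Key": "uploads/my-file.csv"},
    ExpiresIn=300,
)
print(url)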

βœ… Prerequisites

Before using this tool, make sure the following are set up:

  • Python 3.13+ installed on your system
  • An AWS S3 bucket you have access to (create one if needed)
  • The AWS CLI installed and configured
  • An AWS profile set up in ~/.aws/credentials and ~/.aws/config:
    • You can use the default profile or pass a named profile via --profile when running this tool

Warning

For safety and best practices, we strongly recommend using an AWS profile with least privilege when running this tool.

This means:

  • Create a dedicated IAM user or role for uploads
  • Attach a custom policy that only allows s3:PutObject on the specific bucket or path
  • Avoid using your root credentials or overly permissive policies like AdministratorAccess

This reduces the risk of accidental or malicious access to other AWS resources in your account.
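
For illustration, a scoped-down policy along these lines grants only the upload permission this tool needs (the bucket name and key prefix are placeholders):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket-name/uploads/*"
    }
  ]
}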


🧠 Why Use Presigned URLs?

This tool uses presigned URLs to upload files directly to Amazon S3, instead of routing large files through API Gateway or other intermediaries.

|                       | API Gateway as a proxy                 | Presigned URLs with API Gateway        | CloudFront with Lambda@Edge |
|-----------------------|----------------------------------------|----------------------------------------|-----------------------------|
| Max Object Size       | 10 MB                                  | 5 GB (5 TB with multipart upload)      | 5 GB                        |
| Client Complexity     | Single HTTP Request                    | Multiple HTTP Requests                 | Single HTTP Request         |
| Authorization Options | Amazon Cognito, IAM, Lambda Authorizer | Amazon Cognito, IAM, Lambda Authorizer | Lambda@Edge                 |
| Throttling            | API Key Throttling                     | API Key Throttling                     | Custom Throttling           |

Source: https://aws.amazon.com/blogs/compute/patterns-for-building-an-api-to-upload-files-to-amazon-s3/

Presigned URLs strike a great balance between security, upload size, and simplicity, making them ideal for data engineering ingestion pipelines.
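
On the client side, uploading through a presigned URL is a single HTTP PUT against the URL itself; no AWS credentials are needed by the uploader. A minimal sketch using the requests library (the URL and file path are placeholders, and this is not necessarily this tool's exact code):

import requests

# Placeholder: a presigned URL such as the one sketched in the
# Limitations section above.
url = "https://my-bucket-name.s3.amazonaws.com/uploads/my-file.csv?X-Amz-Signature=..."

# Stream the file body in a single HTTP PUT.
with open("local-file.csv", "rb") as f:
    response = requests.put(url, data=f)

# S3 returns 200 on success; anything else means the upload was rejected
# or the URL has expired.
response.raise_for_status()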


πŸ“¦ Installation

Clone the repository and install its dependencies using Poetry:

git clone https://github.com/Djirlic/s3-file-uploader-cli.git
cd s3-file-uploader-cli
poetry install --with dev,test

Alternatively, you can use:

make install

You’ll also need to have Python 3.13+ installed.


βš™οΈ Usage

Run the CLI:

poetry run python -m uploader.main \
  --bucket-name my-bucket-name \
  --upload-path uploads/my-file.csv \
  --file-location ~/Downloads/local-file.csv \
  --profile default

Or, equivalently, via the Makefile:

make run ARGS="--bucket-name my-bucket-name \
  --upload-path uploads/my-file.csv \
  --file-location ~/Downloads/local-file.csv \
  --profile default"

Arguments:

  • --bucket-name: Target S3 bucket to upload to (required)
  • --upload-path: S3 key where file should be stored (required)
  • --file-location: Path to the local file to upload (required)
  • --profile: AWS CLI profile to use (optional, defaults to default)

πŸ§ͺ Testing & Coverage

Run all tests:

make test

Generate test coverage report:

make test-coverage

🧼 Code Style & Quality

This project uses the following tools:

  • black for formatting
  • flake8 for linting
  • isort for sorting imports
  • mypy for type checks
  • pre-commit to run all tools automatically before each commit

Run all pre-commit hooks without making a commit:

pre-commit run --all-files

πŸ“ Project Structure

s3-file-uploader-cli/
├── src/
│   └── uploader/            # Main logic (presign, upload, logging)
├── tests/                   # Unit tests for all modules
├── .github/workflows/       # CI pipeline with linting + test coverage
├── .pre-commit-config.yaml
├── Makefile                 # Common tasks (run, test, lint, etc.)
└── pyproject.toml           # Tooling config (Poetry, mypy, black, etc.)

πŸ“Œ Future Improvements

  • Support multipart uploads for files larger than 5 GB.
  • Add retry and backoff logic on upload failures.
  • Add logging hooks to external monitoring systems.
  • Add support for directory-level uploads.

Feel free to fork, contribute, or reach out if you find this project helpful! πŸ™Œ
