Overview

This directory holds the code for an Apache Beam pipeline written in Python.

Requirements

Usage

Set up virtual environment

python3 -m venv .venv
source .venv/bin/activate

Install dependencies

pip install -U -r requirements.txt

Run Word Count

Run the following command to execute the pipeline on your local machine.

python3 main.py --source resources/catsum.txt --output /tmp/wordcount/output

Alternatively, read the input from Google Cloud Storage. This assumes you have already run gcloud auth application-default login.

python3 main.py --source 'gs://apache-beam-samples/shakespeare/*' --output /tmp/wordcount/output
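
The contents of main.py are not shown in this README. As a rough illustration only, a word count script that accepts the --source and --output flags used above might be structured as in the sketch below. The function names (parse_args, extract_words, run), the tokenizing rule, and the "word: count" output format are all assumptions, not the actual implementation.

```python
import argparse
import re


def parse_args(argv=None):
    """Split the command line into script options and pass-through Beam options."""
    parser = argparse.ArgumentParser(description="Word count sketch (hypothetical)")
    parser.add_argument("--source", required=True, help="Input file path or glob")
    parser.add_argument("--output", required=True, help="Output file prefix")
    # Flags this parser does not know (e.g. --runner) are returned separately
    # so they can be handed to Beam's PipelineOptions.
    return parser.parse_known_args(argv)


def extract_words(line):
    """Return the lowercase words found in one line of text (assumed rule)."""
    return re.findall(r"[a-z']+", line.lower())


def run(argv=None):
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    known_args, beam_args = parse_args(argv)
    with beam.Pipeline(options=PipelineOptions(beam_args)) as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText(known_args.source)
            | "Split" >> beam.FlatMap(extract_words)
            | "PairWithOne" >> beam.Map(lambda word: (word, 1))
            | "Count" >> beam.CombinePerKey(sum)
            | "Format" >> beam.MapTuple(lambda word, count: f"{word}: {count}")
            | "Write" >> beam.io.WriteToText(known_args.output)
        )
```

In a sketch like this, calling run() under an if __name__ == "__main__": guard would turn the module into a script matching the commands above. Because WriteToText treats the --output value as a file prefix, results would land in sharded files such as /tmp/wordcount/output-00000-of-00001 rather than in a single file named output.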