Chapter 2 example error #66

@jhancock1229

When I try to run the code at the end of Chapter 2, I get the following warnings:

WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: timed out
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 2 of 3. Reason: timed out
WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 3 of 3. Reason: timed out
WARNING:google.auth._default:Authentication failed using Compute Engine authentication due to unavailable metadata server.
WARNING:apache_beam.internal.gcp.auth:Unable to find default credentials to use: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Connecting anonymously.
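
The last warning names the GOOGLE_APPLICATION_CREDENTIALS environment variable. A minimal sketch for checking, before running the pipeline, whether that variable is set and points at an existing file. The helper name `describe_credentials` is hypothetical (it is not part of Beam or google-auth), and it only inspects the environment; it does not validate the key itself.

```python
import os


def describe_credentials(env=None):
    """Report whether GOOGLE_APPLICATION_CREDENTIALS points at an existing file.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "unset"
    if not os.path.isfile(path):
        return "missing file"
    return "ok"


if __name__ == "__main__":
    # Prints one of: unset, missing file, ok
    print(describe_credentials())
```

If this reports "unset", the variable was not exported in the shell that launches the script, which would be consistent with the "Unable to find default credentials" warning above.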

I know it's related to pulling kinglear.txt from Google Cloud Storage. Any tips on how to resolve this? By the way, here is the source code I copied out of the book:

import os
import re
import apache_beam as beam
from apache_beam.io import ReadFromText
from apache_beam.io import WriteToText
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions

input_file = "gs://dataflow-samples/shakespeare/kinglear.txt"
# Expand "~" explicitly; Beam does not expand it in local paths.
output_file = os.path.expanduser("~/coding/machine-learning/output.txt")

pipeline_options = PipelineOptions()

with beam.Pipeline(options=pipeline_options) as p:
    lines = p | ReadFromText(input_file)
    counts = (
        lines
        | 'Split' >> beam.FlatMap(lambda x: re.findall(r'[A-Za-z\']+', x))
        | 'PairWithOne' >> beam.Map(lambda x: (x, 1))
        | 'GroupAndSum' >> beam.CombinePerKey(sum)
    )
    def format_result(word_count):
        (word, count) = word_count
        return "{}: {}".format(word, count)
    
    output = counts | 'Format' >> beam.Map(format_result)

    output | WriteToText(output_file)
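
To check the transform logic separately from the Cloud Storage read, the same steps (split, count per word, format) can be sketched in plain Python on an in-memory sample. This is not Beam and not the book's code; `count_words` and the sample lines are made up for illustration, and only the regular expression and `format_result` mirror the pipeline above.

```python
import re
from collections import Counter


def count_words(lines):
    """Split each line with the pipeline's regex and count occurrences per word."""
    counts = Counter()
    for line in lines:
        counts.update(re.findall(r"[A-Za-z']+", line))
    return counts


def format_result(word_count):
    """Format a (word, count) pair the same way the pipeline's Format step does."""
    word, count = word_count
    return "{}: {}".format(word, count)


if __name__ == "__main__":
    # Hypothetical sample input standing in for kinglear.txt.
    sample = ["Nothing will come of nothing", "Speak again"]
    for item in sorted(count_words(sample).items()):
        print(format_result(item))
```

Note that the regex is case-sensitive, so "Nothing" and "nothing" are counted as separate words, as they would be in the pipeline.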
