How to get evaluation metrics in output logs #392

Open
@MelissaKR

Hi,

This is my first time working with SageMaker. I successfully trained a model; however, I'm having difficulty getting it to output evaluation metrics to the log files.

Here is a snippet of my model:

def metric_fn(label_ids, predicted_labels):
    # Each tf.compat.v1.metrics.* call returns a (value_op, update_op) pair.
    accuracy = tf.compat.v1.metrics.accuracy(label_ids, predicted_labels)
    recall = tf.compat.v1.metrics.recall(label_ids, predicted_labels)
    precision = tf.compat.v1.metrics.precision(label_ids, predicted_labels)

    return {"eval_accuracy": accuracy,
            "precision": precision,
            "recall": recall}

# Excerpt from inside the model function:
if mode == tf.estimator.ModeKeys.EVAL:
    eval_metrics = metric_fn(label_ids, predicted_labels)
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics)

And this is how the model is fit:

estimator = TensorFlow(
    entry_point='script.py',
    source_dir=[#Source_dir],
    train_instance_type='ml.m5.2xlarge',
    train_instance_count=4,
    output_path=s3_output_location,
    hyperparameters=hyperparameters,
    role=role,
    py_version='py3',
    framework_version='1.15.2',
    sagemaker_session=sess,
    metric_definitions=[{'Name': 'eval-accuracy', 'Regex': 'eval-accuracy=(\d\.\d+)'},
                        {'Name': 'precision', 'Regex': 'precision=(\d\.\d+)'},
                        {'Name': 'recall', 'Regex': 'recall=(\d\.\d+)'}],
    enable_sagemaker_metrics=True,
    distributions={'parameter_server': {'enabled': True}})

When training finishes, I don't see any of these metrics in the logs or in the 'Training jobs' section. This is how the Metrics section looks:

Metrics
Name           Regex
eval-accuracy  eval-accuracy=(\d.\d+)
precision      precision=(\d.\d+)
recall         recall=(\d.\d+)

I don't know why this is so obscure. I've run the script multiple times with SageMaker, with no luck so far. I'd appreciate any help!
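
One way to narrow this down locally is to run the regexes from metric_definitions against a line of the training log and see whether they capture anything. The sketch below is only illustrative: `sample_log_line` is an assumed example of how an Estimator evaluation summary might look (key, space, equals sign, space, value), not actual output from this job, and `find_metrics` and `spaced_definitions` are hypothetical names introduced here. If the real log lines resemble the sample, patterns such as `precision=(\d\.\d+)` would not match because of the spaces around `=`, and `eval-accuracy` would not match a key spelled `eval_accuracy`.

```python
import re

# Regexes copied from the metric_definitions argument in the estimator call.
metric_definitions = [
    {"Name": "eval-accuracy", "Regex": r"eval-accuracy=(\d\.\d+)"},
    {"Name": "precision", "Regex": r"precision=(\d\.\d+)"},
    {"Name": "recall", "Regex": r"recall=(\d\.\d+)"},
]

# ASSUMPTION: a made-up log line in the "key = value" style; replace it with a
# real line copied from the job's CloudWatch logs before drawing conclusions.
sample_log_line = (
    "INFO:tensorflow:Saving dict for global step 100: "
    "eval_accuracy = 0.91, global_step = 100, loss = 0.35, "
    "precision = 0.88, recall = 0.84"
)


def find_metrics(definitions, line):
    """Return {metric name: captured string, or None if the regex did not match}."""
    results = {}
    for definition in definitions:
        match = re.search(definition["Regex"], line)
        results[definition["Name"]] = match.group(1) if match else None
    return results


# Hypothetical alternative patterns that follow the spacing and key spelling
# used in the sample line above.
spaced_definitions = [
    {"Name": "eval_accuracy", "Regex": r"eval_accuracy = (\d\.\d+)"},
    {"Name": "precision", "Regex": r"precision = (\d\.\d+)"},
    {"Name": "recall", "Regex": r"recall = (\d\.\d+)"},
]

found = find_metrics(metric_definitions, sample_log_line)
found_spaced = find_metrics(spaced_definitions, sample_log_line)

for name, value in found.items():
    print(f"original  {name}: {value}")
for name, value in found_spaced.items():
    print(f"spaced    {name}: {value}")
```

Running the same check against a line copied from the actual job logs would show whether the metric definitions can match at all, independent of the rest of the training setup.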
