
Optimize models using Automatic Model Tuning

Introduction

When training ML models, hyperparameter tuning is the step used to find the best-performing version of a model. In this lab you will apply the random search strategy of Automated Hyperparameter Tuning to train a BERT-based natural language processing (NLP) classifier. The model analyzes customer feedback and classifies the messages into positive (1), neutral (0), and negative (-1) sentiments.

Amazon SageMaker supports Automated Hyperparameter Tuning. It runs multiple training jobs on the training dataset using the hyperparameter ranges specified by the user, then chooses the combination of hyperparameters that leads to the best model candidate. The choice is based on an objective metric, e.g. maximizing the validation accuracy.

To choose the hyperparameter combinations, SageMaker supports two tuning strategies: random and Bayesian search. This capability can be further extended by providing an implementation of a custom tuning strategy as a Docker container.

In this lab you will perform the following three steps:

1. Configure the dataset and the Hyperparameter Tuning Job.
2. Run the Tuning Job.
3. Evaluate the results.

First, let's install and import the required modules.

# please ignore warning messages during the installation
!pip install --disable-pip-version-check -q sagemaker==2.35.0
!conda install -q -y pytorch==1.6.0 -c pytorch
!pip install --disable-pip-version-check -q transformers==3.5.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /opt/conda

  added / updated specs:
    - pytorch==1.6.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2023.05.30 |       h06a4308_0         120 KB
    certifi-2022.12.7          |   py37h06a4308_0         150 KB
    cudatoolkit-10.2.89        |       hfd86e86_1       365.1 MB
    ninja-1.10.2               |       h06a4308_5           8 KB
    ninja-base-1.10.2          |       hd09550d_5         109 KB
    pytorch-1.6.0              |py3.7_cuda10.2.89_cudnn7.6.5_0       537.7 MB  pytorch
    ------------------------------------------------------------
                                           Total:       903.1 MB

The following NEW packages will be INSTALLED:

  cudatoolkit        pkgs/main/linux-64::cudatoolkit-10.2.89-hfd86e86_1 None
  ninja              pkgs/main/linux-64::ninja-1.10.2-h06a4308_5 None
  ninja-base         pkgs/main/linux-64::ninja-base-1.10.2-hd09550d_5 None
  pytorch            pytorch/linux-64::pytorch-1.6.0-py3.7_cuda10.2.89_cudnn7.6.5_0 None

The following packages will be UPDATED:

  ca-certificates    conda-forge::ca-certificates-2022.12.~ --> pkgs/main::ca-certificates-2023.05.30-h06a4308_0 None

The following packages will be SUPERSEDED by a higher-priority channel:

  certifi            conda-forge/noarch::certifi-2022.12.7~ --> pkgs/main/linux-64::certifi-2022.12.7-py37h06a4308_0 None


Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Retrieving notices: ...working... done
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

import boto3
import sagemaker
import pandas as pd
import botocore

config = botocore.config.Config(user_agent_extra='dlai-pds/c3/w1')

# low-level service client of the boto3 session
sm = boto3.client(service_name='sagemaker', 
                  config=config)

sess = sagemaker.Session(sagemaker_client=sm)

bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = sess.boto_region_name

1. Configure dataset and Hyperparameter Tuning Job (HPT)

1.1. Configure dataset

Let's set up the paths and copy the data to the S3 bucket:

processed_train_data_s3_uri = 's3://{}/transformed/data/sentiment-train/'.format(bucket)
processed_validation_data_s3_uri = 's3://{}/transformed/data/sentiment-validation/'.format(bucket)
processed_test_data_s3_uri = 's3://{}/transformed/data/sentiment-test/'.format(bucket)

Upload the data to the S3 bucket:

!aws s3 cp --recursive ./data/sentiment-train $processed_train_data_s3_uri
!aws s3 cp --recursive ./data/sentiment-validation $processed_validation_data_s3_uri
!aws s3 cp --recursive ./data/sentiment-test $processed_test_data_s3_uri
upload: data/sentiment-train/part-algo-1-womens_clothing_ecommerce_reviews.tsv to s3://sagemaker-us-east-1-610140168408/transformed/data/sentiment-train/part-algo-1-womens_clothing_ecommerce_reviews.tsv
upload: data/sentiment-validation/part-algo-1-womens_clothing_ecommerce_reviews.tsv to s3://sagemaker-us-east-1-610140168408/transformed/data/sentiment-validation/part-algo-1-womens_clothing_ecommerce_reviews.tsv
upload: data/sentiment-test/part-algo-1-womens_clothing_ecommerce_reviews.tsv to s3://sagemaker-us-east-1-610140168408/transformed/data/sentiment-test/part-algo-1-womens_clothing_ecommerce_reviews.tsv

Check the existence of those files in the S3 bucket:

!aws s3 ls --recursive $processed_train_data_s3_uri
2023-06-12 02:35:27    4894416 transformed/data/sentiment-train/part-algo-1-womens_clothing_ecommerce_reviews.tsv
!aws s3 ls --recursive $processed_validation_data_s3_uri
2023-06-12 02:35:28     276522 transformed/data/sentiment-validation/part-algo-1-womens_clothing_ecommerce_reviews.tsv
!aws s3 ls --recursive $processed_test_data_s3_uri
2023-06-12 02:35:29     273414 transformed/data/sentiment-test/part-algo-1-womens_clothing_ecommerce_reviews.tsv

Exercise 1

Set up a dictionary of the input training and validation data channels, wrapping the corresponding S3 locations in a TrainingInput object.

Instructions: Pass the S3 input paths for training and validation data into the TrainingInput function

TrainingInput(s3_data=...)

to construct the Amazon SageMaker channels for S3 input data sources. Then put the corresponding channels into the dictionary.

from sagemaker.inputs import TrainingInput

data_channels = {
    ### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
    'train': TrainingInput(s3_data=processed_train_data_s3_uri), # Replace None
    'validation': TrainingInput(s3_data=processed_validation_data_s3_uri) # Replace None
    ### END SOLUTION - DO NOT delete this comment for grading purposes
}

There is no need to create a test data channel, as the test data is used later at the evaluation stage and does not need to be wrapped into the sagemaker.inputs.TrainingInput function.

1.2. Configure Hyperparameter Tuning Job

Model hyperparameters need to be set prior to starting the model training, as they control the learning process. You will set some of the hyperparameters as static - they will not be explored during the tuning job. For the non-static hyperparameters you will set ranges of possible values to be explored.

First, configure the static hyperparameters, including the instance type, instance count, maximum sequence length, etc. For the purposes of this lab, you will use a relatively small instance type. Refer to the SageMaker documentation for additional instance types that may work for your use cases outside of this lab.

max_seq_length=128 # maximum number of input tokens passed to BERT model
freeze_bert_layer=False # if True, freeze the BERT layers and train only the classifier head
epochs=3
train_steps_per_epoch=50
validation_batch_size=64
validation_steps_per_epoch=50
seed=42

train_instance_count=1
train_instance_type='ml.c5.9xlarge'
train_volume_size=256
input_mode='File'
run_validation=True

Some of these will be passed into the PyTorch estimator and tuner in the hyperparameters argument. Let's set up the dictionary for that:

hyperparameters_static={
    'freeze_bert_layer': freeze_bert_layer,
    'max_seq_length': max_seq_length,
    'epochs': epochs,
    'train_steps_per_epoch': train_steps_per_epoch,
    'validation_batch_size': validation_batch_size,
    'validation_steps_per_epoch': validation_steps_per_epoch,
    'seed': seed,
    'run_validation': run_validation
}

Configure the hyperparameter ranges to explore in the Tuning Job. The range values typically come from prior experience, research papers, or models built for tasks similar to yours.

from sagemaker.tuner import IntegerParameter
from sagemaker.tuner import ContinuousParameter
from sagemaker.tuner import CategoricalParameter

hyperparameter_ranges = {
    'learning_rate': ContinuousParameter(0.00001, 0.00005, scaling_type='Linear'), # specifying continuous variable type, the tuning job will explore the range of values
    'train_batch_size': CategoricalParameter([128, 256]), # specifying categorical variable type, the tuning job will explore only listed values
}
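The import list above also includes IntegerParameter, the third supported parameter type, which is not used in this lab. As an illustration only (the values below are hypothetical and not part of the graded solution), an integer-valued range could be declared like this:

# illustrative only: an integer-valued hyperparameter range, e.g. for the number
# of epochs; the tuning job would explore the whole numbers from 2 to 4
hyperparameter_ranges_example = {
    'epochs': IntegerParameter(2, 4),
}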

1.3. Set up evaluation metrics

Choose loss and accuracy as the evaluation metrics. The regular expressions in the Regex fields will capture the metric values that the algorithm emits into its logs.

metric_definitions = [
     {'Name': 'validation:loss', 'Regex': 'val_loss: ([0-9.]+)'},
     {'Name': 'validation:accuracy', 'Regex': 'val_acc: ([0-9.]+)'},
]

For example, these sample log lines...

[step: 100] val_loss: 0.76 - val_acc: 70.92%

...will produce the following metrics in CloudWatch:

validation:loss = 0.76

validation:accuracy = 70.92

In the Tuning Job, you will be maximizing validation accuracy as the objective metric.
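
You can verify locally that the regular expressions capture the intended values. The following minimal sketch applies the metric_definitions patterns to the sample log line above using Python's re module (this check runs only in the notebook and is not part of the Tuning Job):

import re

log_line = '[step: 100] val_loss: 0.76 - val_acc: 70.92%'

# apply the same Regex patterns that SageMaker will use on the training logs
for metric in metric_definitions:
    match = re.search(metric['Regex'], log_line)
    if match:
        print('{} = {}'.format(metric['Name'], match.group(1)))

# expected output:
# validation:loss = 0.76
# validation:accuracy = 70.92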

2. Run Tuning Job

2.1. Set up the RoBERTa and PyTorch script to run on SageMaker

Prepare the PyTorch model to run as a SageMaker Training Job. The estimator takes a separate Python file as its entry point; this script is called during the training. You can open and review this file: src/train.py.

For more information on the PyTorchEstimator, see the documentation here: https://sagemaker.readthedocs.io/

from sagemaker.pytorch import PyTorch as PyTorchEstimator
# Note: renaming the PyTorch estimator is not required, but it improves
# code clarity, especially when several modules of 'sagemaker.pytorch' are used

estimator = PyTorchEstimator(
    entry_point='train.py',
    source_dir='src',
    role=role,
    instance_count=train_instance_count,
    instance_type=train_instance_type,
    volume_size=train_volume_size,
    py_version='py3',
    framework_version='1.6.0',
    hyperparameters=hyperparameters_static,
    metric_definitions=metric_definitions,
    input_mode=input_mode,
)

2.2. Launch the Hyperparameter Tuning Job

A hyperparameter tuning job runs a series of training jobs, each testing a combination of hyperparameters against a given objective metric (i.e. validation:accuracy). In this lab, you will use a Random search strategy to determine the combinations of hyperparameters - within the specified ranges - to use for each training job within the tuning job. For more information on hyperparameter tuning search strategies, please see the following documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html

When the tuning job completes, you can select the hyperparameters used by the best-performing training job relative to the objective metric.

The max_jobs parameter is a stopping criterion that limits the total number of training jobs (and therefore hyperparameter combinations) to run within the tuning job.

The max_parallel_jobs parameter limits the number of training jobs (and therefore hyperparameter combinations) to run in parallel within the tuning job. This parameter is often used in combination with the Bayesian search strategy when you want to test a smaller set of training jobs (fewer than max_jobs), learn from them, and then apply Bayesian methods to determine the hyperparameters for the next set of training jobs. Bayesian methods can improve hyperparameter-tuning performance in some cases.

The early_stopping_type parameter is used by SageMaker hyperparameter tuning jobs to automatically stop a training job if it is not improving the objective metric (i.e. validation:accuracy) relative to previous training jobs within the tuning job. For more information on early stopping, please see the following documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-early-stopping.html.

Exercise 2

Set up the Hyperparameter Tuner.

Instructions: Use the HyperparameterTuner constructor, passing the variables defined above. Please use the tuning strategy 'Random'.

tuner = HyperparameterTuner(
    estimator=..., # estimator
    hyperparameter_ranges=..., # hyperparameter ranges
    metric_definitions=..., # metric definitions
    strategy='...', # tuning strategy
    objective_type='Maximize',
    objective_metric_name='validation:accuracy',
    max_jobs=2, # maximum number of jobs to run
    max_parallel_jobs=2, # maximum number of jobs to run in parallel
    early_stopping_type='Auto' # early stopping criteria
)
from sagemaker.tuner import HyperparameterTuner

tuner = HyperparameterTuner(
    ### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
    estimator=estimator, # Replace None
    hyperparameter_ranges=hyperparameter_ranges, # Replace None
    metric_definitions=metric_definitions, # Replace None
    strategy='Random', # Replace None
    ### END SOLUTION - DO NOT delete this comment for grading purposes
    objective_type='Maximize',
    objective_metric_name='validation:accuracy',
    max_jobs=2, # maximum number of jobs to run
    max_parallel_jobs=2, # maximum number of jobs to run in parallel
    early_stopping_type='Auto' # early stopping criteria
)

Exercise 3

Launch the SageMaker Hyper-Parameter Tuning (HPT) Job.

Instructions: Use the tuner.fit function, passing the configured train and validation inputs (data channels).

tuner.fit(
    inputs=..., # train and validation input
    include_cls_metadata=False, # to be set as false if the algorithm cannot handle unknown hyperparameters
    wait=False # do not wait for the job to complete before continuing
)
tuner.fit(
    ### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
    inputs=data_channels, # Replace None
    ### END SOLUTION - DO NOT delete this comment for grading purposes
    include_cls_metadata=False,
    wait=False
)

2.3. Check Tuning Job status

You can see the Tuning Job status in the console. Let's get the Tuning Job name to construct the link.

tuning_job_name = tuner.latest_tuning_job.job_name
print(tuning_job_name)
pytorch-training-230612-0238

Check the status of the Tuning Job.

from IPython.core.display import display, HTML

display(HTML('<b>Review <a target="blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/hyper-tuning-jobs/{}">Hyper-Parameter Tuning Job</a></b>'.format(region, tuning_job_name)))

Review Hyper-Parameter Tuning Job
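
You can also query the status programmatically with the boto3 service client created earlier; describe_hyper_parameter_tuning_job is the corresponding SageMaker API call. A minimal sketch:

tuning_job_description = sm.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuning_job_name
)

print('Tuning Job status: {}'.format(tuning_job_description['HyperParameterTuningJobStatus']))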

Wait for the Tuning Job to complete.

This cell will take approximately 20-30 minutes to run.

%%time

tuner.wait()
.................................................................................................................................................................................................................................................................................................................................................!
CPU times: user 1.55 s, sys: 160 ms, total: 1.71 s
Wall time: 28min 54s

Wait until the ^^ Tuning Job ^^ completes above

The results of the SageMaker Hyperparameter Tuning Job are available through the analytics() method of the tuner object. Its dataframe() function converts the results directly into a pandas dataframe. You can explore the results with the following lines of code:

import time

time.sleep(10) # slight delay to allow the analytics to be calculated

df_results = tuner.analytics().dataframe()
df_results.shape
(2, 8)
df_results.sort_values('FinalObjectiveValue', ascending=0)
   learning_rate train_batch_size                            TrainingJobName TrainingJobStatus  FinalObjectiveValue          TrainingStartTime            TrainingEndTime  TrainingElapsedTimeSeconds
1       0.000022            "128" pytorch-training-230612-0238-001-e2bd7fda          Completed            71.480003  2023-06-12 02:39:29+00:00  2023-06-12 03:03:36+00:00                      1447.0
0       0.000025            "256" pytorch-training-230612-0238-002-0346b365          Completed            68.750000  2023-06-12 02:40:05+00:00  2023-06-12 03:04:11+00:00                      1446.0

When training and tuning at scale, it is important to continuously monitor and use the right compute resources. While you have the flexibility of choosing different compute options, how do you choose the specific instance types and sizes to use? There is no standard answer for this: it comes down to understanding the workload and running empirical tests to determine the best compute resources for the training.

SageMaker Training Jobs emit CloudWatch metrics for resource utilization. You can review them in the AWS console:

  • open the link
  • notice that you are in the section Amazon SageMaker -> Hyperparameter tuning jobs
  • have a look at the list of the Training jobs below and click on one of them
  • scroll down to the Monitor section and review the available metrics
from IPython.core.display import display, HTML

display(HTML('<b>Review Training Jobs of the <a target="blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/hyper-tuning-jobs/{}">Hyper-Parameter Tuning Job</a></b>'.format(region, tuning_job_name)))

Review Training Jobs of the Hyper-Parameter Tuning Job
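
If you prefer to pull these utilization metrics programmatically, here is a sketch using the CloudWatch get_metric_statistics API. It assumes the '/aws/sagemaker/TrainingJobs' namespace and the '<job-name>/algo-1' Host dimension that SageMaker uses for single-instance training jobs:

import boto3
from datetime import datetime, timedelta

cw = boto3.client('cloudwatch', config=config)

# take the first training job of the tuning job as an example
example_training_job_name = df_results['TrainingJobName'].iloc[0]

response = cw.get_metric_statistics(
    Namespace='/aws/sagemaker/TrainingJobs',
    MetricName='CPUUtilization',
    Dimensions=[{'Name': 'Host', 'Value': '{}/algo-1'.format(example_training_job_name)}],
    StartTime=datetime.utcnow() - timedelta(hours=3),
    EndTime=datetime.utcnow(),
    Period=300, # 5-minute aggregation
    Statistics=['Average'],
)

# print the datapoints in chronological order
for datapoint in sorted(response['Datapoints'], key=lambda p: p['Timestamp']):
    print(datapoint['Timestamp'], datapoint['Average'])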

3. Evaluate the results

An important part of developing a model is evaluating the model with a test data set - one that the model has never seen during its training process. The final metrics resulting from this evaluation can be used to compare competing machine learning models. The higher the value of these metrics, the better the model is able to generalize.

3.1. Show the best candidate

Exercise 4

Show the best candidate - the one with the highest accuracy result.

Instructions: Use the sort_values function to sort the results by accuracy, which is stored in the column FinalObjectiveValue. Pass ascending=0 and apply head(1) to select the top row.

df_results.sort_values(
    '...', # column name for sorting
    ascending=0).head(1)
df_results.sort_values(
    ### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
    'FinalObjectiveValue', # Replace None
    ### END SOLUTION - DO NOT delete this comment for grading purposes
    ascending=0).head(1)
   learning_rate train_batch_size                            TrainingJobName TrainingJobStatus  FinalObjectiveValue          TrainingStartTime            TrainingEndTime  TrainingElapsedTimeSeconds
1       0.000022            "128" pytorch-training-230612-0238-001-e2bd7fda          Completed            71.480003  2023-06-12 02:39:29+00:00  2023-06-12 03:03:36+00:00                      1447.0

3.2. Evaluate the best candidate

Let's pull the information about the best candidate from the dataframe and then take the Training Job name from the column TrainingJobName.

best_candidate = df_results.sort_values('FinalObjectiveValue', ascending=0).iloc[0]

best_candidate_training_job_name = best_candidate['TrainingJobName']
print('Best candidate Training Job name: {}'.format(best_candidate_training_job_name))
Best candidate Training Job name: pytorch-training-230612-0238-001-e2bd7fda

Exercise 5

Show accuracy result for the best candidate.

Instructions: Use the example in the cell above.

### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
best_candidate_accuracy = best_candidate['FinalObjectiveValue'] # Replace all None
### END SOLUTION - DO NOT delete this comment for grading purposes

print('Best candidate accuracy result: {}'.format(best_candidate_accuracy))
Best candidate accuracy result: 71.4800033569336

You can use the function describe_training_job of the service client to get some more information about the best candidate. The result is in dictionary format. Let's check that it has the same Training Job name:

best_candidate_description = sm.describe_training_job(TrainingJobName=best_candidate_training_job_name)

best_candidate_training_job_name2 = best_candidate_description['TrainingJobName']

print('Training Job name: {}'.format(best_candidate_training_job_name2))
Training Job name: pytorch-training-230612-0238-001-e2bd7fda

Exercise 6

Pull the Tuning Job and Training Job Amazon Resource Name (ARN) from the best candidate training job description.

Instructions: Print the keys of the best candidate Training Job description dictionary, choose the ones related to the Tuning Job and Training Job ARN and print their values.

print(best_candidate_description.keys())
dict_keys(['TrainingJobName', 'TrainingJobArn', 'TuningJobArn', 'ModelArtifacts', 'TrainingJobStatus', 'SecondaryStatus', 'HyperParameters', 'AlgorithmSpecification', 'RoleArn', 'InputDataConfig', 'OutputDataConfig', 'ResourceConfig', 'StoppingCondition', 'CreationTime', 'TrainingStartTime', 'TrainingEndTime', 'LastModifiedTime', 'SecondaryStatusTransitions', 'FinalMetricDataList', 'EnableNetworkIsolation', 'EnableInterContainerTrafficEncryption', 'EnableManagedSpotTraining', 'TrainingTimeInSeconds', 'BillableTimeInSeconds', 'ProfilingStatus', 'WarmPoolStatus', 'ResponseMetadata'])
### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
best_candidate_tuning_job_arn = best_candidate_description['TuningJobArn'] # Replace None
best_candidate_training_job_arn = best_candidate_description['TrainingJobArn'] # Replace None
### END SOLUTION - DO NOT delete this comment for grading purposes
print('Best candidate Tuning Job ARN: {}'.format(best_candidate_tuning_job_arn))
print('Best candidate Training Job ARN: {}'.format(best_candidate_training_job_arn))
Best candidate Tuning Job ARN: arn:aws:sagemaker:us-east-1:610140168408:hyper-parameter-tuning-job/pytorch-training-230612-0238
Best candidate Training Job ARN: arn:aws:sagemaker:us-east-1:610140168408:training-job/pytorch-training-230612-0238-001-e2bd7fda

Pull the path of the best candidate model in the S3 bucket. You will need it later to set up the Processing Job for the evaluation.

model_tar_s3_uri = sm.describe_training_job(TrainingJobName=best_candidate_training_job_name)['ModelArtifacts']['S3ModelArtifacts']
print(model_tar_s3_uri)
s3://sagemaker-us-east-1-610140168408/pytorch-training-230612-0238-001-e2bd7fda/output/model.tar.gz

To perform model evaluation you will use a scikit-learn-based Processing Job. This is essentially a generic Python Processing Job with scikit-learn pre-installed. You can specify the version of scikit-learn you wish to use. Also pass the SageMaker execution role, processing instance type and instance count.

from sagemaker.sklearn.processing import SKLearnProcessor

processing_instance_type = "ml.c5.2xlarge"
processing_instance_count = 1

processor = SKLearnProcessor(
    framework_version="0.23-1",
    role=role,
    instance_type=processing_instance_type,
    instance_count=processing_instance_count,
    max_runtime_in_seconds=7200,
)

The model evaluation Processing Job will be running the Python code from the file src/evaluate_model_metrics.py. You can open and review the file.

Launch the Processing Job, passing the parameters defined above, the path of the custom script, and the S3 bucket location of the test data.

from sagemaker.processing import ProcessingInput, ProcessingOutput

processor.run(
    code="src/evaluate_model_metrics.py",
    inputs=[
        ProcessingInput(  
            input_name="model-tar-s3-uri",                        
            source=model_tar_s3_uri,                               
            destination="/opt/ml/processing/input/model/"
        ),
        ProcessingInput(
            input_name="evaluation-data-s3-uri",
            source=processed_test_data_s3_uri,                                    
            destination="/opt/ml/processing/input/data/",
        ),
    ],
    outputs=[
        ProcessingOutput(s3_upload_mode="EndOfJob", output_name="metrics", source="/opt/ml/processing/output/metrics"),
    ],
    arguments=["--max-seq-length", str(max_seq_length)],
    logs=True,
    wait=False,
)
Job Name:  sagemaker-scikit-learn-2023-06-12-03-14-44-443
Inputs:  [{'InputName': 'model-tar-s3-uri', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-610140168408/pytorch-training-230612-0238-001-e2bd7fda/output/model.tar.gz', 'LocalPath': '/opt/ml/processing/input/model/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'evaluation-data-s3-uri', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-610140168408/transformed/data/sentiment-test/', 'LocalPath': '/opt/ml/processing/input/data/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-610140168408/sagemaker-scikit-learn-2023-06-12-03-14-44-443/input/code/evaluate_model_metrics.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'metrics', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-east-1-610140168408/sagemaker-scikit-learn-2023-06-12-03-14-44-443/output/metrics', 'LocalPath': '/opt/ml/processing/output/metrics', 'S3UploadMode': 'EndOfJob'}}]

You can see information about the Processing Job using the describe function. The result is in dictionary format. Let's pull the Processing Job name:

scikit_processing_job_name = processor.jobs[-1].describe()["ProcessingJobName"]

print('Processing Job name: {}'.format(scikit_processing_job_name))
Processing Job name: sagemaker-scikit-learn-2023-06-12-03-14-44-443

Exercise 7

Pull the Processing Job status from the Processing Job description.

Instructions: Print the keys of the Processing Job description dictionary, choose the one related to the status of the Processing Job and print the value of it.

print(processor.jobs[-1].describe().keys())
dict_keys(['ProcessingInputs', 'ProcessingOutputConfig', 'ProcessingJobName', 'ProcessingResources', 'StoppingCondition', 'AppSpecification', 'RoleArn', 'ProcessingJobArn', 'ProcessingJobStatus', 'LastModifiedTime', 'CreationTime', 'ResponseMetadata'])
### BEGIN SOLUTION - DO NOT delete this comment for grading purposes
scikit_processing_job_status = processor.jobs[-1].describe()['ProcessingJobStatus'] # Replace None
### END SOLUTION - DO NOT delete this comment for grading purposes
print('Processing job status: {}'.format(scikit_processing_job_status))
Processing job status: InProgress

Review the created Processing Job in the AWS console.

Instructions:

  • open the link
  • notice that you are in the section Amazon SageMaker -> Processing Jobs
  • check the name of the Processing Job, its status and other available information

from IPython.core.display import display, HTML

display(
    HTML(
        '<b>Review <a target="blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/processing-jobs/{}">Processing Job</a></b>'.format(
            region, scikit_processing_job_name
        )
    )
)

Review Processing Job

Wait about 5 minutes, then review the CloudWatch Logs. You may open the file src/evaluate_model_metrics.py again and compare the code with its outputs in the CloudWatch Logs.

from IPython.core.display import display, HTML

display(
    HTML(
        '<b>Review <a target="blank" href="https://console.aws.amazon.com/cloudwatch/home?region={}#logStream:group=/aws/sagemaker/ProcessingJobs;prefix={};streamFilter=typeLogStreamPrefix">CloudWatch Logs</a> after about 5 minutes</b>'.format(
            region, scikit_processing_job_name
        )
    )
)

Review CloudWatch Logs after about 5 minutes

After the completion of the Processing Job you can also review the output in the S3 bucket.

from IPython.core.display import display, HTML

display(
    HTML(
        '<b>Review <a target="blank" href="https://s3.console.aws.amazon.com/s3/buckets/{}/{}/?region={}&tab=overview">S3 output data</a> after the Processing Job has completed</b>'.format(
            bucket, scikit_processing_job_name, region
        )
    )
)

Review S3 output data after the Processing Job has completed

Monitor the Processing Job:

from pprint import pprint

running_processor = sagemaker.processing.ProcessingJob.from_processing_name(
    processing_job_name=scikit_processing_job_name, sagemaker_session=sess
)

processing_job_description = running_processor.describe()

pprint(processing_job_description)
{'AppSpecification': {'ContainerArguments': ['--max-seq-length', '128'],
                      'ContainerEntrypoint': ['python3',
                                              '/opt/ml/processing/input/code/evaluate_model_metrics.py'],
                      'ImageUri': '683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3'},
 'CreationTime': datetime.datetime(2023, 6, 12, 3, 14, 44, 964000, tzinfo=tzlocal()),
 'LastModifiedTime': datetime.datetime(2023, 6, 12, 3, 18, 26, 397000, tzinfo=tzlocal()),
 'ProcessingInputs': [{'AppManaged': False,
                       'InputName': 'model-tar-s3-uri',
                       'S3Input': {'LocalPath': '/opt/ml/processing/input/model/',
                                   'S3CompressionType': 'None',
                                   'S3DataDistributionType': 'FullyReplicated',
                                   'S3DataType': 'S3Prefix',
                                   'S3InputMode': 'File',
                                   'S3Uri': 's3://sagemaker-us-east-1-610140168408/pytorch-training-230612-0238-001-e2bd7fda/output/model.tar.gz'}},
                      {'AppManaged': False,
                       'InputName': 'evaluation-data-s3-uri',
                       'S3Input': {'LocalPath': '/opt/ml/processing/input/data/',
                                   'S3CompressionType': 'None',
                                   'S3DataDistributionType': 'FullyReplicated',
                                   'S3DataType': 'S3Prefix',
                                   'S3InputMode': 'File',
                                   'S3Uri': 's3://sagemaker-us-east-1-610140168408/transformed/data/sentiment-test/'}},
                      {'AppManaged': False,
                       'InputName': 'code',
                       'S3Input': {'LocalPath': '/opt/ml/processing/input/code',
                                   'S3CompressionType': 'None',
                                   'S3DataDistributionType': 'FullyReplicated',
                                   'S3DataType': 'S3Prefix',
                                   'S3InputMode': 'File',
                                   'S3Uri': 's3://sagemaker-us-east-1-610140168408/sagemaker-scikit-learn-2023-06-12-03-14-44-443/input/code/evaluate_model_metrics.py'}}],
 'ProcessingJobArn': 'arn:aws:sagemaker:us-east-1:610140168408:processing-job/sagemaker-scikit-learn-2023-06-12-03-14-44-443',
 'ProcessingJobName': 'sagemaker-scikit-learn-2023-06-12-03-14-44-443',
 'ProcessingJobStatus': 'InProgress',
 'ProcessingOutputConfig': {'Outputs': [{'AppManaged': False,
                                         'OutputName': 'metrics',
                                         'S3Output': {'LocalPath': '/opt/ml/processing/output/metrics',
                                                      'S3UploadMode': 'EndOfJob',
                                                      'S3Uri': 's3://sagemaker-us-east-1-610140168408/sagemaker-scikit-learn-2023-06-12-03-14-44-443/output/metrics'}}]},
 'ProcessingResources': {'ClusterConfig': {'InstanceCount': 1,
                                           'InstanceType': 'ml.c5.2xlarge',
                                           'VolumeSizeInGB': 30}},
 'ProcessingStartTime': datetime.datetime(2023, 6, 12, 3, 18, 26, 394000, tzinfo=tzlocal()),
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '2367',
                                      'content-type': 'application/x-amz-json-1.1',
                                      'date': 'Mon, 12 Jun 2023 03:19:32 GMT',
                                      'x-amzn-requestid': '617a6701-8bdd-4c06-9091-c8fefea9ed4f'},
                      'HTTPStatusCode': 200,
                      'RequestId': '617a6701-8bdd-4c06-9091-c8fefea9ed4f',
                      'RetryAttempts': 0},
 'RoleArn': 'arn:aws:iam::610140168408:role/sagemaker-studio-vpc-firewall-us-east-1-sagemaker-execution-role',
 'StoppingCondition': {'MaxRuntimeInSeconds': 7200}}

Wait for the Processing Job to complete.

This cell will take approximately 5-10 minutes to run.

%%time

running_processor.wait(logs=False)
.............!CPU times: user 43 ms, sys: 25.2 ms, total: 68.2 ms
Wall time: 1min 5s

Please wait until ^^ Processing Job ^^ completes above

3.3. Inspect the processed output data

Let's take a look at the results of the Processing Job. Get the S3 bucket location of the output metrics:

processing_job_description = running_processor.describe()

output_config = processing_job_description["ProcessingOutputConfig"]
for output in output_config["Outputs"]:
    if output["OutputName"] == "metrics":
        processed_metrics_s3_uri = output["S3Output"]["S3Uri"]

print(processed_metrics_s3_uri)
s3://sagemaker-us-east-1-610140168408/sagemaker-scikit-learn-2023-06-12-03-14-44-443/output/metrics

List the content of the folder:

!aws s3 ls $processed_metrics_s3_uri/
2023-06-12 03:20:39      19519 confusion_matrix.png
2023-06-12 03:20:39         56 evaluation.json

The test accuracy can be pulled from the evaluation.json file.

import json
from pprint import pprint

metrics_json = sagemaker.s3.S3Downloader.read_file("{}/evaluation.json".format(
    processed_metrics_s3_uri
))

print('Test accuracy: {}'.format(json.loads(metrics_json)))
Test accuracy: {'metrics': {'accuracy': {'value': 0.7346278317152104}}}
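
Since the structure of the JSON is known from the output above, you can also extract just the accuracy value, for example:

# pull only the accuracy value out of the JSON structure shown above
test_accuracy = json.loads(metrics_json)['metrics']['accuracy']['value']
print('Test accuracy: {:.4f}'.format(test_accuracy))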

Copy the image with the confusion matrix, generated during the model evaluation, into the local folder generated/.

!aws s3 cp $processed_metrics_s3_uri/confusion_matrix.png ./generated/

import time
time.sleep(10) # Slight delay for our notebook to recognize the newly-downloaded file
download: s3://sagemaker-us-east-1-610140168408/sagemaker-scikit-learn-2023-06-12-03-14-44-443/output/metrics/confusion_matrix.png to generated/confusion_matrix.png

Show and review the confusion matrix, which is a table of all combinations of true (actual) and predicted labels. Each cell contains the number of reviews for the corresponding combination of sentiments. You can see that the highest counts appear in the diagonal cells, where the predicted sentiment equals the actual one.

%%html

<img src='./generated/confusion_matrix.png'>
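
For reference, a confusion matrix like this one can be built with scikit-learn. The sketch below uses made-up labels and is not the code from src/evaluate_model_metrics.py; it only illustrates the technique (note that the label order -1, 0, 1 matches the sentiment classes of this lab):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# hypothetical true and predicted sentiment labels, for illustration only
y_true = [-1, 0, 1, 1, 0, -1, 1, 0]
y_pred = [-1, 0, 1, 0, 0, -1, 1, 1]

labels = [-1, 0, 1]
cm = confusion_matrix(y_true, y_pred, labels=labels)

fig, ax = plt.subplots()
ax.imshow(cm, cmap='Blues')
ax.set_xticks(np.arange(len(labels)))
ax.set_xticklabels(labels)
ax.set_yticks(np.arange(len(labels)))
ax.set_yticklabels(labels)
ax.set_xlabel('Predicted label')
ax.set_ylabel('True label')

# annotate each cell with its review count
for i in range(len(labels)):
    for j in range(len(labels)):
        ax.text(j, i, cm[i, j], ha='center', va='center')

plt.show()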

Upload the notebook to the S3 bucket for grading purposes.

Note: you may need to click the "Save" button before the upload.

!aws s3 cp ./C3_W1_Assignment.ipynb s3://$bucket/C3_W1_Assignment_Learner.ipynb

