Brief Guide to Amazon SageMaker Algorithms (Part 4/4): Time Series & Computer Vision

Technical Deep Dive
February 4, 2025
by
Aaron West

Image produced with Amazon Bedrock

This article is the fourth and final installment in a series exploring Amazon SageMaker's machine learning algorithms, focusing on those tailored for time series forecasting and computer vision. It covers key concepts, functionalities, and practical applications of algorithms such as Image Classification, Object Detection, Semantic Segmentation, and DeepAR. From leveraging recurrent neural networks for precise time series predictions to utilizing convolutional neural networks for advanced image analysis, this guide provides a comprehensive overview of SageMaker's capabilities in these domains.

Whether you're aiming to forecast demand trends across products or enable machines to interpret and act on visual data, these tools offer innovative solutions within the AWS ecosystem. As with the previous entry, actionable code examples accompany the explanations. If you missed earlier parts, Part 1 covers text processing, Part 2 delves into structured data, and Part 3 highlights unsupervised learning. This final chapter concludes the series, offering a focused resource for professional development and practical applications in cloud-based machine learning.

Image Classification

SageMaker offers two supervised learning algorithms for image classification, one based on MXNet and another utilizing TensorFlow. Both accept images in .jpg, .jpeg, or .png format and output a probability for each class label. They support training from scratch or through transfer learning, allowing fine-tuning of pre-trained models even with limited datasets.

MXNet Image Classification

This algorithm employs convolutional neural networks (CNNs) for classifying images. It supports multi-label classification, enabling the assignment of multiple labels to a single image. Training data can be provided in Apache MXNet RecordIO format or as individual image files. For optimal performance, GPU instances such as ml.p2 and ml.p3 are recommended for training, while both CPU and GPU instances can be used for inference.
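
As a concrete starting point, the sketch below configures the built-in algorithm with the SageMaker Python SDK. This is a minimal sketch, not a production recipe: the role ARN, bucket paths, and hyperparameter values are placeholders to adapt to your own account and dataset.

import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::your-account-id:role/your-sagemaker-role"  # placeholder

# Retrieve the built-in image classification container for the current region
container = image_uris.retrieve("image-classification", session.boto_region_name)

estimator = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",  # GPU instance, as recommended for training
    output_path="s3://your-bucket-name/ic-output",
)

# Transfer learning: start from pretrained weights rather than training from scratch
estimator.set_hyperparameters(
    use_pretrained_model=1,
    num_classes=2,              # number of labels in your dataset
    num_training_samples=1000,  # total training images
    image_shape="3,224,224",
    epochs=10,
    learning_rate=0.001,
)

# RecordIO-formatted channels prepared ahead of time in S3
estimator.fit({
    "train": TrainingInput("s3://your-bucket-name/ic/train", content_type="application/x-recordio"),
    "validation": TrainingInput("s3://your-bucket-name/ic/validation", content_type="application/x-recordio"),
})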

TensorFlow Image Classification

Leveraging pre-trained models from TensorFlow Hub, this algorithm facilitates transfer learning by attaching a classification layer to the selected model. The classification layer includes a dropout layer and a dense layer with L2 regularization. Users can fine-tune the entire network or just the classification layer based on their specific datasets. The algorithm supports all CPU and GPU instances for training, with a recommendation for GPU instances with more memory when using large batch sizes.
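
Because the TensorFlow variant is surfaced through SageMaker JumpStart, a fine-tuning job can be launched with the JumpStartEstimator. This is a minimal sketch under a few assumptions: the model ID is one illustrative TensorFlow Hub option, and the hyperparameter shown assumes this model family's JumpStart defaults, so verify both against the JumpStart catalog for your chosen model.

from sagemaker.jumpstart.estimator import JumpStartEstimator

# Illustrative model ID; browse SageMaker JumpStart for current TensorFlow Hub models
model_id = "tensorflow-ic-imagenet-mobilenet-v2-100-224-classification-4"

estimator = JumpStartEstimator(
    model_id=model_id,
    role="arn:aws:iam::your-account-id:role/your-sagemaker-role",  # placeholder
    instance_type="ml.p3.2xlarge",
    hyperparameters={
        "epochs": "5",
        "train_only_top_layer": "True",  # fine-tune just the classification layer
    },
)

# The training channel expects one subdirectory per class label,
# e.g. s3://your-bucket-name/tf-ic/train/cats/ and .../dogs/
estimator.fit({"training": "s3://your-bucket-name/tf-ic/train"})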

Object Detection

SageMaker also provides object detection algorithms built on MXNet and TensorFlow. Both are designed to identify and localize objects within an image, outputting bounding boxes, class labels, and confidence scores for each detected object. They support training from scratch or transfer learning, allowing users to adapt pre-trained models to their specific datasets.

MXNet Object Detection

The MXNet Object Detection algorithm is based on the Single Shot Multibox Detector (SSD) framework and uses convolutional neural networks (CNNs) like VGG-16 and ResNet-50 as base networks. This algorithm supports tasks where identifying the position and scale of multiple objects within a single image is critical. Input data can be provided in Apache MXNet RecordIO format or as individual image files in .jpg or .png formats.

To achieve optimal training performance, AWS recommends using GPU instances such as ml.p2 and ml.p3. For inference, both CPU and GPU instances are supported, providing flexibility for different deployment needs. This algorithm's design allows it to scale well for complex object detection use cases.
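
The configuration mirrors the image classification setup, swapping in the object detection container and SSD-specific hyperparameters. As before, the role, bucket paths, and hyperparameter values below are placeholders rather than tuned settings.

import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::your-account-id:role/your-sagemaker-role"  # placeholder

container = image_uris.retrieve("object-detection", session.boto_region_name)

estimator = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path="s3://your-bucket-name/od-output",
)

estimator.set_hyperparameters(
    base_network="resnet-50",   # SSD base network; vgg-16 is also supported
    use_pretrained_model=1,     # transfer learning from pretrained weights
    num_classes=20,
    num_training_samples=1000,
    mini_batch_size=16,
    epochs=30,
    learning_rate=0.001,
)

estimator.fit({
    "train": TrainingInput("s3://your-bucket-name/od/train", content_type="application/x-recordio"),
    "validation": TrainingInput("s3://your-bucket-name/od/validation", content_type="application/x-recordio"),
})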

TensorFlow Object Detection

The TensorFlow Object Detection algorithm supports pre-trained models from the TensorFlow Model Garden, such as MobileNet, ResNet, Inception, and EfficientNet. These models can be fine-tuned for object detection tasks using custom datasets, enabling precise identification and localization of objects. The algorithm adds a detection head to the base model, allowing it to output bounding boxes alongside class labels and confidence scores.

The TensorFlow algorithm accepts input images in .jpg, .jpeg, or .png formats and supports all GPU instances for training, with a recommendation to use higher-memory GPUs for large batch sizes. Both CPU and GPU instances can be used for inference, making it adaptable to various environments and workflows.
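
Once either variant is trained, the estimator can be deployed to a real-time endpoint. The sketch below continues from the MXNet training sketch above and assumes that algorithm's response format, where each prediction is a [class_index, confidence, xmin, ymin, xmax, ymax] array with coordinates normalized to the image dimensions; the TensorFlow variant returns a different JSON structure, so adjust the parsing accordingly.

from sagemaker.serializers import IdentitySerializer
from sagemaker.deserializers import JSONDeserializer

# Deploy the trained estimator to a real-time endpoint
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",  # CPU is sufficient for many inference workloads
    serializer=IdentitySerializer(content_type="image/jpeg"),
    deserializer=JSONDeserializer(),
)

with open("test.jpg", "rb") as f:
    response = predictor.predict(f.read())

# Keep detections above an illustrative confidence threshold
for class_idx, score, xmin, ymin, xmax, ymax in response["prediction"]:
    if score > 0.5:
        print(f"class {int(class_idx)} @ {score:.2f}: ({xmin:.2f}, {ymin:.2f}) to ({xmax:.2f}, {ymax:.2f})")

predictor.delete_endpoint()  # clean up to avoid ongoing charges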

Semantic Segmentation

SageMaker's Semantic Segmentation algorithm is designed for detailed image analysis, taking a pixel-level approach that assigns each pixel a class label from a predefined set. This method is ideal for applications requiring a fine-grained understanding of a scene, such as autonomous driving, medical imaging, and robot vision. It leverages deep neural network architectures; the available options are Fully-Convolutional Network (FCN), Pyramid Scene Parsing (PSP), and DeepLabV3.

Because the algorithm classifies every pixel in an image, it also provides information about the shapes and boundaries of objects in the scene. The output is called a segmentation mask, which is a grayscale image where each pixel corresponds to a class label. The architecture consists of a backbone (encoder) that extracts feature maps and a decoder that generates the segmentation mask. Supported backbones include ResNet50 and ResNet101, which can be fine-tuned using pretrained weights from ImageNet or trained from scratch with your data. Segmentation masks can be output as PNG images or probability maps, making the algorithm flexible for various computer vision applications.

  • 🧠 Uses deep learning algorithms to analyze images and identify objects.
  • 🖼️ Tags each pixel in an image with a class label using predefined classes, enabling fine-grained image analysis.
  • 🤖 Supports deep neural network architectures: Fully-Convolutional Network (FCN), Pyramid Scene Parsing (PSP), and DeepLabV3.
  • 📊 Supports both file and pipe input modes for efficient data processing during training.
  • 📂 Requires data in S3, organized into train and validation channels with JPEG images, plus annotation channels with PNG masks.
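
Putting these pieces together, here is a minimal training sketch for the built-in semantic segmentation algorithm. The role, bucket paths, and hyperparameter values are placeholders to adapt to your own dataset.

import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::your-account-id:role/your-sagemaker-role"  # placeholder

container = image_uris.retrieve("semantic-segmentation", session.boto_region_name)

estimator = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path="s3://your-bucket-name/ss-output",
)

estimator.set_hyperparameters(
    algorithm="fcn",              # or "psp", "deeplab"
    backbone="resnet-50",         # or "resnet-101"
    use_pretrained_model="True",  # ImageNet-pretrained encoder weights
    num_classes=21,
    num_training_samples=1000,
    epochs=10,
)

# Four channels: JPEG images plus PNG annotation masks for train and validation
estimator.fit({
    "train": TrainingInput("s3://your-bucket-name/ss/train", content_type="image/jpeg"),
    "validation": TrainingInput("s3://your-bucket-name/ss/validation", content_type="image/jpeg"),
    "train_annotation": TrainingInput("s3://your-bucket-name/ss/train_annotation", content_type="image/png"),
    "validation_annotation": TrainingInput("s3://your-bucket-name/ss/validation_annotation", content_type="image/png"),
})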

DeepAR

SageMaker's DeepAR is a supervised learning algorithm tailored for time series forecasting, offering a robust solution for predicting scalar time series data. By leveraging recurrent neural networks (RNNs), DeepAR captures temporal dependencies and shared patterns across multiple related time series, enabling highly accurate forecasts. Unlike traditional methods that treat each time series independently, DeepAR trains a single model on related datasets, making it particularly effective in scenarios where multiple time series share similar dynamics, such as retail demand forecasting, server load predictions, or financial analysis.

  • 📊 DeepAR excels in scenarios with numerous related time series, such as predicting demand across multiple products or server loads.
  • 🔄 Uses RNNs to process sequential data, capturing temporal dependencies and long-term patterns.
  • 📚 Trains on related time series datasets, learning shared underlying patterns and relationships.
  • 🔮 During prediction, DeepAR generates forecasts for both existing and new time series that exhibit similar characteristics to the training data.
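
DeepAR expects training data as JSON Lines (Parquet is also supported), with one time series per line: "start" marks the first timestamp, "target" holds the observed values, and "cat" and "dynamic_feat" are optional fields. The snippet below writes two toy series; the values and category indices are purely illustrative.

import json

series = [
    {"start": "2024-01-01 00:00:00", "target": [112.0, 118.5, 121.3, 119.8], "cat": [0]},
    {"start": "2024-01-01 00:00:00", "target": [45.2, 44.1, 47.6, 46.0], "cat": [1]},
]

# One JSON object per line, the format DeepAR's train channel expects
with open("train.json", "w") as f:
    for record in series:
        f.write(json.dumps(record) + "\n")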

SageMaker Workflows

SageMaker Workflows automate machine learning pipelines, ensuring a repeatable and scalable process for tasks such as data preprocessing, model training, and deployment. Key benefits include:

  • Automation: Simplifies end-to-end ML workflows, reducing manual steps.
  • Scalability: Enables efficient handling of large datasets and multiple ML processes.
  • Reproducibility: Tracks and manages pipeline runs, ensuring consistent results.

SageMaker Workflow Example

The following pipeline demonstrates a complete SageMaker workflow for time series forecasting using the DeepAR algorithm. This example includes data preprocessing, model training, and pipeline execution.

import sagemaker
from sagemaker import image_uris
from sagemaker.workflow.steps import ProcessingStep, TrainingStep
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.processing import ScriptProcessor, ProcessingOutput, ProcessingInput
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.parameters import ParameterString

# Initialize SageMaker session and define role
session = sagemaker.Session()
role = "arn:aws:iam::your-account-id:role/forecasting-role"
bucket_uri = "s3://your-bucket-name"

# Define input data parameter
input_data_uri = ParameterString(
    name="input_data_uri",
    default_value=f"{bucket_uri}/AAPL.csv",
)

# Processing step for data preparation (image URI is region-specific)
processor = ScriptProcessor(
    role=role,
    image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:1.2-1-cpu-py3",
    instance_count=1,
    instance_type="ml.m5.large",
    command=["python3"],
)
processing_step = ProcessingStep(
    name="TimeSeriesDataProcessing",
    processor=processor,
    inputs=[
        ProcessingInput(
            source=input_data_uri,
            destination="/opt/ml/processing/input/",
        )
    ],
    outputs=[
        ProcessingOutput(
            output_name="processed_data",
            source="/opt/ml/processing/output",
            destination=f"{bucket_uri}/processed_data",
        )
    ],
    code="scripts/processing.py",
)

# Training step using DeepAR
estimator = Estimator(
    image_uri=image_uris.retrieve("forecasting-deepar", session.boto_region_name),
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    output_path=f"{bucket_uri}/output",
    hyperparameters={
        "time_freq": "H",
        "context_length": "40",
        "prediction_length": "20",
        "num_cells": "40",
        "num_layers": "2",
        "likelihood": "gaussian",
        "epochs": "100",
        "mini_batch_size": "32",
        "learning_rate": "0.001",
    },
)
training_step = TrainingStep(
    name="TimeSeriesTraining",
    estimator=estimator,
    inputs={
        "train": TrainingInput(
            s3_data=f"{bucket_uri}/processed_data",
            content_type="json",
        )
    },
)

# Create the pipeline
pipeline = Pipeline(
    name="TimeSeriesPipeline",
    steps=[processing_step, training_step],
    parameters=[input_data_uri],
)

# Register and execute the pipeline
pipeline.upsert(role_arn=role)
execution = pipeline.start(
    parameters={"input_data_uri": f"{bucket_uri}/AAPL.csv"}
)
print(execution.arn)

The ScriptProcessor is responsible for preprocessing raw time series data, including tasks such as cleaning, formatting, and optimizing the input data into a structure suitable for training. It's important to align the preprocessing steps with your specific inference goals, as these goals often dictate how the data should be encoded and structured to meet situational requirements effectively.

The ProcessingStep encapsulates the data preprocessing logic, linking the raw input data from S3 to the ScriptProcessor and storing the processed output back in S3. This step ensures the pipeline can handle raw datasets in a repeatable and modular fashion, making it easy to adapt to changes in data formats or preprocessing requirements.
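
For reference, scripts/processing.py might look something like the sketch below. The column names (Date, Close) assume a typical daily stock CSV and are hypothetical for this example, as is the hourly resampling, which simply matches the time_freq of "H" configured in the estimator above.

# scripts/processing.py -- a minimal sketch; adjust column names for your data
import json
import pandas as pd

INPUT_PATH = "/opt/ml/processing/input/AAPL.csv"
OUTPUT_PATH = "/opt/ml/processing/output/train.json"

df = pd.read_csv(INPUT_PATH, parse_dates=["Date"])
df = df.sort_values("Date").set_index("Date")

# Resample to an hourly frequency to match the estimator's time_freq of "H",
# forward-filling gaps so the target series is contiguous
hourly = df["Close"].resample("H").ffill().dropna()

record = {
    "start": str(hourly.index[0]),
    "target": [float(v) for v in hourly.values],
}

# Write a single time series in DeepAR's JSON Lines format
with open(OUTPUT_PATH, "w") as f:
    f.write(json.dumps(record) + "\n")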

The TrainingStep defines the training process using SageMaker's built-in DeepAR algorithm. The Estimator specifies the training job's configuration, including the algorithm container, compute instance type, and hyperparameters. The processed data from the ProcessingStep is used as the training input, ensuring a seamless handoff between steps in the pipeline.
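
After pipeline.start() returns, the execution handle can be used to monitor progress, for example:

# Block until the run finishes, then inspect each step's outcome
execution.wait()

for step in execution.list_steps():
    print(step["StepName"], step["StepStatus"])

# Full execution metadata, including failure reasons if any step failed
print(execution.describe()["PipelineExecutionStatus"])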

Further Resources

For those looking to deepen their understanding of SageMaker workflows and algorithms, the official Amazon SageMaker SDK documentation provides the most up-to-date information. It offers detailed guidance on configuring pipelines, tuning model hyperparameters, and deploying scalable solutions tailored to real-world machine learning tasks.

The algorithms section offers an in-depth look at all of SageMaker's built-in algorithms, including Image Classification, Object Detection, Semantic Segmentation, and DeepAR. This resource covers critical information such as input/output specifications, parameter details, and implementation strategies essential for production-ready solutions.

For hands-on examples, the Amazon SageMaker Examples repository contains several Jupyter notebooks showcasing how to implement SageMaker workflows, train models (including DeepAR), and integrate them into pipelines.

If you're interested in exploring some of my work, feel free to check out my GitHub, where you'll find the code I wrote to demonstrate the workflow above.
