Ray Serve Inference¶
Pre-built Docker images for deploying ML models with Ray Serve on AWS. Available in CPU and GPU variants, built on Amazon Linux 2023 with Python 3.13.
Latest Announcements¶
April 23, 2026 — Initial release of AWS Deep Learning Containers for Ray Serve.
Pull Commands¶
Default:
docker pull 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda-v1.0.0
docker pull 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cpu-v1.0.0
SageMaker:
docker pull 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda-v1.0.0
docker pull 763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cpu-v1.0.0
Default images are tested on EC2 instances. See Available Images for all image URIs and Getting Started for authentication instructions.
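For reference, Docker authentication to this registry typically uses the standard ECR login flow (region and account ID taken from the pull commands above):
aws ecr get-login-password --region us-west-2 \
  | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-west-2.amazonaws.com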
Packages¶
For package versions included in each release, see the Release Notes.
Deployment Guide¶
Model Package Structure¶
Package your model as a model.tar.gz with the following layout:
model.tar.gz/
├── config.yaml # Ray Serve application config
├── deployment.py # Your @serve.deployment class
└── code/
└── requirements.txt # Runtime dependencies (optional, installed at startup)
Model weights can optionally be placed at the tarball root alongside config.yaml and deployment.py (extracted to /opt/ml/model/ at runtime) if
your model doesn't download them at startup.
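For example, assuming the files above live in a local my-model/ directory (the directory name is arbitrary), the tarball can be built with:
cd my-model
tar czf ../model.tar.gz .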
The config.yaml references your deployment module:
applications:
- name: my-app
route_prefix: /
import_path: deployment:app
deployments:
- name: MyDeployment
ray_actor_options:
num_gpus: 1
Set num_gpus to the number of GPUs allocated per replica (0 for CPU-only deployments).
The import_path follows the format module:variable — deployment refers to deployment.py in the model package, and app is the bound
deployment defined at the bottom of that file:
# deployment.py
from ray import serve
@serve.deployment(num_replicas=1)
class MyDeployment:
def __init__(self):
# Load model weights, initialize pipeline, etc.
...
async def __call__(self, request):
# Handle inference request
...
app = MyDeployment.bind()
Deployment Paths¶
The entrypoint resolves the serve target in this priority order:
| Method | Platform | How |
|---|---|---|
| CLI argument | EC2 only | docker run <image> deployment:app — overrides config.yaml |
| config.yaml | EC2 + SageMaker | Auto-detected at /opt/ml/model/config.yaml |
| SM_RAYSERVE_APP env var | SageMaker only | Fallback when no config.yaml is present |
EC2 Environment Variables¶
| Variable | Default | Description |
|---|---|---|
| RAY_SERVE_HTTP_HOST | 127.0.0.1 | Set to 0.0.0.0 to expose the endpoint outside the container |
| RAY_SERVE_HTTP_PORT | 8000 | HTTP port for Ray Serve |
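As a sketch, serving on a different port looks like this (the my-model directory and port 9000 are arbitrary choices):
docker run -d \
  --shm-size=2g \
  -p 9000:9000 \
  -v $(pwd)/my-model:/opt/ml/model \
  -e RAY_SERVE_HTTP_HOST=0.0.0.0 \
  -e RAY_SERVE_HTTP_PORT=9000 \
  763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cpu-v1.0.0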
Runtime Dependencies¶
Place a code/requirements.txt in your model package. It is installed automatically before the Ray cluster starts. On SageMaker,
CodeArtifact is supported via the CA_REPOSITORY_ARN environment variable.
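For illustration, a code/requirements.txt might look like the following (the packages and versions here are placeholders, not a tested set):
# code/requirements.txt (installed before the Ray cluster starts)
pydantic==2.8.2
pandas>=2.0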
Examples¶
EC2 Deployment¶
Each example below includes the full model package files. The first three download weights automatically on startup. The tabular example requires pre-trained weights — substitute your own trained model.
Sentiment Analysis¶
Classify text sentiment using DistilBERT. Model weights download automatically from HuggingFace on first startup.
Create the model package:
Save nlp-model/config.yaml:
applications:
- name: distilbert
route_prefix: /
import_path: deployment:app
deployments:
- name: DistilBERTSentiment
ray_actor_options:
num_gpus: 1
Save nlp-model/deployment.py:
import torch
from ray import serve
from transformers import pipeline
@serve.deployment(num_replicas=1)
class DistilBERTSentiment:
def __init__(self):
self.device = 0 if torch.cuda.is_available() else -1
self.classifier = pipeline(
"sentiment-analysis",
model="distilbert-base-uncased-finetuned-sst-2-english",
device=self.device,
)
async def __call__(self, request):
data = await request.json()
text = data.get("text", "")
results = self.classifier([text] if isinstance(text, str) else text)
return {"predictions": results}
app = DistilBERTSentiment.bind()
Run the container and send a request:
docker run -d --gpus all \
--shm-size=2g \
-p 8000:8000 \
-v $(pwd)/nlp-model:/opt/ml/model \
-e RAY_SERVE_HTTP_HOST=0.0.0.0 \
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda-v1.0.0
until curl -sf http://localhost:8000/-/healthz > /dev/null; do sleep 5; done
curl -X POST http://localhost:8000/ \
-H "Content-Type: application/json" \
-d '{"text": "This product is absolutely amazing!"}'
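The response wraps the pipeline output, so expect something of this shape (label and score are illustrative):
{"predictions": [{"label": "POSITIVE", "score": 0.9998}]}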
Image Classification¶
Classify images using DenseNet-161 with ImageNet weights from torchvision. Weights download automatically on first startup.
Create the model package:
Save cv-model/config.yaml:
applications:
- name: densenet
route_prefix: /
import_path: deployment:app
deployments:
- name: DenseNetClassifier
ray_actor_options:
num_gpus: 1
Multi-GPU required: the deployment below autoscales up to max_replicas: 2 with 1 GPU per replica, so reaching full scale needs two GPUs. Reduce max_replicas to 1 on single-GPU instances.
Save cv-model/deployment.py:
import io
from PIL import Image
from ray import serve
@serve.deployment(
autoscaling_config={"min_replicas": 1, "max_replicas": 2},
)
class DenseNetClassifier:
def __init__(self):
import torch
import torchvision.models as models
import torchvision.transforms as transforms
self.model = models.densenet161(weights=models.DenseNet161_Weights.IMAGENET1K_V1)
self.model.eval()
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
self.model.to(self.device)
self.transform = transforms.Compose(
[
transforms.Resize(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
]
)
self.class_names = models.DenseNet161_Weights.IMAGENET1K_V1.meta["categories"]
async def __call__(self, request):
import torch
image_bytes = await request.body()
image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
input_tensor = self.transform(image).unsqueeze(0).to(self.device)
with torch.no_grad():
output = self.model(input_tensor)
probabilities = torch.nn.functional.softmax(output[0], dim=0)
top5_prob, top5_idx = torch.topk(probabilities, 5)
predictions = [
{
"class_id": int(top5_idx[i]),
"class_name": self.class_names[int(top5_idx[i])],
"probability": float(top5_prob[i]),
}
for i in range(5)
]
return {"predictions": predictions}
app = DenseNetClassifier.bind()
Run the container and classify an image:
docker run -d --gpus all \
--shm-size=2g \
-p 8000:8000 \
-v $(pwd)/cv-model:/opt/ml/model \
-e RAY_SERVE_HTTP_HOST=0.0.0.0 \
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda-v1.0.0
# Download test image while container starts up
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
until curl -sf http://localhost:8000/-/healthz > /dev/null; do sleep 5; done
curl -X POST http://localhost:8000/ \
-H "Content-Type: image/jpeg" \
--data-binary @kitten.jpg
Example response:
{
"predictions": [
{"class_id": 281, "class_name": "tabby", "probability": 0.5312},
{"class_id": 282, "class_name": "tiger_cat", "probability": 0.2198},
{"class_id": 285, "class_name": "Egyptian_cat", "probability": 0.1065},
{"class_id": 287, "class_name": "lynx", "probability": 0.0742},
{"class_id": 283, "class_name": "Persian_cat", "probability": 0.0391}
]
}
Audio Transcription¶
Transcribe speech using Wav2Vec2 with FFmpeg backend. Weights download automatically on first startup.
Create the model package:
Save audio-model/config.yaml:
applications:
- name: wav2vec2
route_prefix: /
import_path: deployment:app
deployments:
- name: Wav2Vec2Transcription
ray_actor_options:
num_gpus: 1
Save audio-model/deployment.py:
import base64
import io
from ray import serve
@serve.deployment(num_replicas=1)
class Wav2Vec2Transcription:
def __init__(self):
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
self.processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
self.model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
self.model.to(self.device)
self.model.eval()
async def __call__(self, request):
import torch
import torchaudio
content_type = request.headers.get("content-type", "")
if "audio/wav" in content_type:
audio_bytes = await request.body()
else:
data = await request.json()
audio_bytes = base64.b64decode(data.get("audio", data.get("data")))
waveform, sample_rate = torchaudio.load(io.BytesIO(audio_bytes), backend="ffmpeg")
if sample_rate != 16000:
waveform = torchaudio.transforms.Resample(sample_rate, 16000)(waveform)
if waveform.shape[0] > 1:
waveform = torch.mean(waveform, dim=0, keepdim=True)
inputs = self.processor(
waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt"
)
inputs = {k: v.to(self.device) for k, v in inputs.items()}
with torch.no_grad():
logits = self.model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = self.processor.batch_decode(predicted_ids)[0]
return {"transcription": transcription.strip()}
app = Wav2Vec2Transcription.bind()
Run the container and transcribe audio:
docker run -d --gpus all \
--shm-size=2g \
-p 8000:8000 \
-v $(pwd)/audio-model:/opt/ml/model \
-e RAY_SERVE_HTTP_HOST=0.0.0.0 \
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda-v1.0.0
until curl -sf http://localhost:8000/-/healthz > /dev/null; do sleep 5; done
curl -X POST http://localhost:8000/ \
-H "Content-Type: audio/wav" \
--data-binary @audio.wav
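The response carries the decoded text; wav2vec2-base-960h emits uppercase transcriptions, so expect something like (illustrative):
{"transcription": "HELLO WORLD"}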
Tabular Classification¶
Classify Iris species from feature vectors using a small PyTorch neural network. This example requires pre-trained weights (iris_model.pth and
norm_params.json) in the model directory — substitute your own trained model.
Create the model package:
Save tabular-model/config.yaml:
applications:
- name: iris
route_prefix: /
import_path: deployment:app
deployments:
- name: IrisClassifier
ray_actor_options:
num_gpus: 0
Save tabular-model/deployment.py:
import json
import os
from ray import serve
@serve.deployment(num_replicas=1)
class IrisClassifier:
def __init__(self):
import torch
import torch.nn as nn
class IrisModel(nn.Module):
def __init__(self):
super().__init__()
self.fc1 = nn.Linear(4, 16)
self.fc2 = nn.Linear(16, 8)
self.fc3 = nn.Linear(8, 3)
self.relu = nn.ReLU()
def forward(self, x):
return self.fc3(self.relu(self.fc2(self.relu(self.fc1(x)))))
model_dir = "/opt/ml/model"
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
self.model = IrisModel()
self.model.load_state_dict(
torch.load(os.path.join(model_dir, "iris_model.pth"), map_location=self.device)
)
self.model.to(self.device)
self.model.eval()
with open(os.path.join(model_dir, "norm_params.json")) as f:
norm = json.load(f)
self.mean = torch.tensor(norm["mean"]).to(self.device)
self.std = torch.tensor(norm["std"]).to(self.device)
self.classes = norm["class_names"]
async def __call__(self, request):
import torch
data = await request.json()
features = data.get("features", data.get("data"))
x = torch.tensor([features], dtype=torch.float32).to(self.device)
x_norm = (x - self.mean) / self.std
with torch.no_grad():
probs = torch.softmax(self.model(x_norm), dim=1)
pred_idx = torch.argmax(probs, dim=1).item()
return {
"prediction": self.classes[pred_idx],
"confidence": float(probs[0][pred_idx]),
"probabilities": {cls: float(probs[0][i]) for i, cls in enumerate(self.classes)},
}
app = IrisClassifier.bind()
Run the container (CPU — no GPU needed for tabular):
docker run -d \
--shm-size=2g \
-p 8000:8000 \
-v $(pwd)/tabular-model:/opt/ml/model \
-e RAY_SERVE_HTTP_HOST=0.0.0.0 \
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cpu-v1.0.0
until curl -sf http://localhost:8000/-/healthz > /dev/null; do sleep 5; done
curl -X POST http://localhost:8000/ \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'
Example response:
{
"prediction": "setosa",
"confidence": 0.9847,
"probabilities": {"setosa": 0.9847, "versicolor": 0.0112, "virginica": 0.0041}
}
SageMaker Deployment¶
Prerequisites¶
- AWS CLI configured with appropriate permissions
- An IAM execution role with SageMaker and ECR permissions:
# Create role
aws iam create-role --role-name SageMakerExecutionRole \
--assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"sagemaker.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
# Attach policies
aws iam attach-role-policy --role-name SageMakerExecutionRole \
--policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
aws iam attach-role-policy --role-name SageMakerExecutionRole \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
aws iam attach-role-policy --role-name SageMakerExecutionRole \
--policy-arn arn:aws:iam::aws:policy/AmazonElasticContainerRegistryPublicFullAccess
- SageMaker Python SDK v2 (v3 drops the Model, Predictor, and Serializer APIs used below):
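# Pin to SDK v2; any 2.x release provides the APIs used below
pip install "sagemaker>=2,<3"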
Deploy¶
To deploy on SageMaker, package your model directory as a tarball, upload to S3, and deploy using the
SageMaker Python SDK. The tarball is automatically downloaded and extracted to /opt/ml/model/ before the
container starts. The container exposes a SageMaker-compatible adapter on port 8080 with /ping (health check) and /invocations (inference)
endpoints.
SageMaker endpoint deployment takes several minutes and incurs costs for the running instance. Remember to delete endpoints when done.
Sentiment Analysis¶
Package the model directory from the EC2 example, upload to S3, and deploy:
cd nlp-model
tar czf /tmp/nlp-model.tar.gz .
aws s3 cp /tmp/nlp-model.tar.gz s3://<BUCKET>/models/nlp-sentiment/model.tar.gz
import json
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer
predictor = Model(
image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda",
role="arn:aws:iam::<ACCOUNT>:role/SageMakerExecutionRole",
model_data="s3://<BUCKET>/models/nlp-sentiment/model.tar.gz",
predictor_cls=Predictor,
).deploy(
instance_type="ml.g5.xlarge",
initial_instance_count=1,
endpoint_name="ray-serve-nlp",
serializer=JSONSerializer(),
inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
wait=True,
)
response = predictor.predict({"text": "I love this so much, best purchase ever!"})
result = json.loads(response) # predictor.predict() returns raw bytes
# {"predictions": [{"label": "POSITIVE", "score": 0.9991}]}
GPU deploys require inference_ami_version — the default SageMaker host AMI ships NVIDIA drivers that are incompatible with CUDA 12.9 images. CPU deploys do not need this. See the ProductionVariant API reference for valid values.
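Deployed endpoints can also be invoked without the SDK; a sketch using the AWS CLI (v2 syntax, reusing the endpoint name from above):
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name ray-serve-nlp \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"text": "This product is absolutely amazing!"}' \
  response.json
cat response.json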
When done, delete the endpoint to stop incurring costs:
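predictor.delete_endpoint()  # removes the endpoint and its endpoint config
predictor.delete_model()     # removes the SageMaker model resource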
The other EC2 examples (image classification, audio, tabular) deploy the same way — package the model directory as a tarball, upload to S3, and use
the same SDK pattern. Use IdentitySerializer for binary inputs (images, audio) and the CPU image (serve-ml-sagemaker-cpu) for CPU-only models like
tabular.
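As a sketch, deploying the image-classification package with binary input would look like this (the bucket path, role ARN, and endpoint name are placeholders):
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer
predictor = Model(
    image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda",
    role="arn:aws:iam::<ACCOUNT>:role/SageMakerExecutionRole",
    model_data="s3://<BUCKET>/models/cv-densenet/model.tar.gz",
    predictor_cls=Predictor,
).deploy(
    instance_type="ml.g5.xlarge",
    initial_instance_count=1,
    endpoint_name="ray-serve-cv",
    serializer=IdentitySerializer(content_type="image/jpeg"),  # raw bytes pass through unchanged
    inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
    wait=True,
)
with open("kitten.jpg", "rb") as f:
    response = predictor.predict(f.read())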
Direct App Import¶
For models that define a Ray Serve app directly in Python without a config.yaml, pass the module:app import path directly.
On EC2, pass it as a CLI argument:
docker run -d --gpus all \
--shm-size=2g \
-p 8000:8000 \
-v /path/to/model:/opt/ml/model \
-e RAY_SERVE_HTTP_HOST=0.0.0.0 \
-e RAYSERVE_NUM_GPUS=1 \
763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-cuda-v1.0.0 \
deployment:app
On SageMaker, set the SM_RAYSERVE_APP environment variable. Package your model directory the same way as the sentiment example (tarball uploaded to
S3), but omit config.yaml. The deployment.py must be at the tarball root — SM_RAYSERVE_APP=deployment:app resolves the module from
/opt/ml/model/.
from sagemaker.model import Model
from sagemaker.predictor import Predictor
model = Model(
image_uri="763104351884.dkr.ecr.us-west-2.amazonaws.com/ray:serve-ml-sagemaker-cuda",
role="arn:aws:iam::<ACCOUNT>:role/SageMakerExecutionRole",
model_data="s3://<BUCKET>/models/mnist/model.tar.gz",
predictor_cls=Predictor,
env={"SM_RAYSERVE_APP": "deployment:app", "RAYSERVE_NUM_GPUS": "1"},
)
predictor = model.deploy(
instance_type="ml.g5.xlarge",
initial_instance_count=1,
endpoint_name="ray-serve-mnist",
inference_ami_version="al2-ami-sagemaker-inference-gpu-3-1",
wait=True,
)
Without a config.yaml, there is no ray_actor_options to set num_gpus. Instead, the deployment code reads RAYSERVE_NUM_GPUS at import time:
import os
from ray import serve
num_gpus = int(os.environ.get("RAYSERVE_NUM_GPUS", "0"))
@serve.deployment(ray_actor_options={"num_gpus": num_gpus})
class MyDeployment:
...
Release Notes¶
See Ray Release Notes for version history and changelogs.