
A3

Accelerator Optimized

Purpose-built for machine learning training, inference, and other GPU-accelerated workloads. Powered by NVIDIA H100 GPUs with high-bandwidth GPU-to-GPU interconnects.

Specifications

vCPUs: 208

Memory: 1872 GB

Network: Up to 1800 Gbps (GPU-to-GPU)

Synthetic Benchmarks

| Category | Metric | Result | Source |
|---|---|---|---|
| gpu | GPU Count (H100) | 8 GPUs | GCP Documentation |
| gpu | GPU Memory (Total) | 640 GB HBM3 | GCP Documentation |
| gpu | GPU Interconnect Bandwidth | 1,800 Gbps | GCP Documentation |
| gpu | FP8 Performance | 15.8 PFLOPS | NVIDIA H100 Specs |
| gpu | Training Throughput (ResNet-50) | 12,800 images/s | Community Benchmarks |
| gpu | Inference Latency (Llama 2 7B) | 5.4 ms/token | Community Benchmarks |
| network | Max Bandwidth | 1,800 Gbps | GCP Documentation |
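The latency and memory figures above can be cross-checked with simple arithmetic. The sketch below uses hypothetical helper names and assumes single-stream decoding, where throughput is just the reciprocal of per-token latency; batched serving would give different numbers.

```python
def tokens_per_second(latency_ms_per_token: float) -> float:
    """Convert per-token latency (ms) to single-stream throughput (tokens/s)."""
    return 1000.0 / latency_ms_per_token


def memory_per_gpu(total_gb: float, gpu_count: int) -> float:
    """Split total GPU memory evenly across the GPUs in the VM."""
    return total_gb / gpu_count


# 5.4 ms/token from the Llama 2 7B inference latency row
print(round(tokens_per_second(5.4), 1))  # 185.2

# 640 GB HBM3 across 8 H100 GPUs
print(memory_per_gpu(640, 8))  # 80.0
```

The first result lines up with the roughly 185 tokens/s reported for Llama 2 7B on a single H100 in the workload table.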

Workload Performance

Workload Performance Summary

Best result per workload category (mixed units -- see table below for details)

| Workload | Metric | Machine Type | Result | Notes |
|---|---|---|---|---|
| ML Inference | Llama 2 7B Tokens/sec | a3-highgpu-8g | 185 tokens/s | Llama 2 7B, FP16, single H100 GPU, vLLM |
| ML Inference | ResNet-50 Images/sec | a3-highgpu-8g | 4,200 images/s | ResNet-50, batch size 64, FP16, single H100 GPU |
| ML Inference | Stable Diffusion Images/min | a3-highgpu-8g | 28 images/min | Stable Diffusion XL, 1024x1024, 30 steps, single H100 GPU |
| ML Training | GPT-2 Training Throughput | a3-highgpu-8g | 48,000 tokens/s | GPT-2 medium, 8x H100, DeepSpeed ZeRO-3, FP16 |
| ML Training | ResNet-50 Training Images/sec | a3-highgpu-8g | 12,800 images/s | ResNet-50, 8x H100, mixed precision, PyTorch DDP |
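The training rows report totals across 8 GPUs, while the inference rows are single-GPU results, so the two are not directly comparable. As a rough aid, the sketch below (hypothetical names) divides each training total by the GPU count. These are averages that assume an even split across GPUs, not measured single-GPU results.

```python
# Training totals from the table above, each measured on 8x H100.
TRAINING_RESULTS = {
    "GPT-2 medium (tokens/s)": 48_000,
    "ResNet-50 (images/s)": 12_800,
}
GPU_COUNT = 8


def per_gpu_average(total: float, gpu_count: int) -> float:
    """Average throughput per GPU, assuming an even split across GPUs."""
    return total / gpu_count


for name, total in TRAINING_RESULTS.items():
    print(f"{name}: {per_gpu_average(total, GPU_COUNT):,.0f} per GPU")
# GPT-2 medium (tokens/s): 6,000 per GPU
# ResNet-50 (images/s): 1,600 per GPU
```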

Pricing

On-demand pricing in us-central1. Spot and committed-use discounts shown for comparison.

| Machine Type | vCPUs | Memory | On-Demand/hr | Spot/hr | 1yr CUD/hr | 3yr CUD/hr |
|---|---|---|---|---|---|---|
| a3-highgpu-8g | 208 | 1872 GB | $29.387 | $8.816 | $18.514 | $13.224 |
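To compare the pricing options, it can help to express each rate as a discount off on-demand and as an approximate monthly cost. The following is a minimal sketch using the rates listed above; the function names and the 730 hours-per-month figure are assumptions for illustration, not part of any billing API.

```python
ON_DEMAND = 29.387  # USD/hr, a3-highgpu-8g in us-central1 (from the table)
RATES = {"spot": 8.816, "cud_1yr": 18.514, "cud_3yr": 13.224}
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def discount_pct(on_demand: float, rate: float) -> float:
    """Percentage saved relative to the on-demand hourly rate."""
    return round((1 - rate / on_demand) * 100, 1)


def monthly_cost(rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Approximate cost of running one VM continuously for a month."""
    return round(rate * hours, 2)


for name, rate in RATES.items():
    print(name, discount_pct(ON_DEMAND, rate), monthly_cost(rate))

# Discounts work out to about 70% (spot), 37% (1yr CUD), and 55% (3yr CUD).
```

Spot capacity can be preempted, so the spot rate suits interruptible jobs such as checkpointed training more than latency-sensitive serving.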
