A3
Accelerator Optimized
Purpose-built for machine learning training, inference, and GPU-accelerated workloads. Powered by NVIDIA GPUs with high-bandwidth interconnects.
Specifications
| Specification | Value |
|---|---|
| vCPUs | 208 |
| Memory | 1872 GB |
| Network | Up to 1,800 Gbps (GPU-to-GPU) |
Synthetic Benchmarks
| Category | Metric | Result | Source |
|---|---|---|---|
| GPU | GPU Count (H100) | 8 GPUs | GCP Documentation |
| GPU | GPU Memory (Total) | 640 GB HBM3 | GCP Documentation |
| GPU | GPU Interconnect Bandwidth | 1,800 Gbps | GCP Documentation |
| GPU | FP8 Performance (Total) | 15.8 PFLOPS | NVIDIA H100 Specs |
| GPU | Training Throughput (ResNet-50) | 12,800 images/s | Community Benchmarks |
| GPU | Inference Latency (Llama 2 7B) | 5.4 ms/token | Community Benchmarks |
| Network | Max Bandwidth | 1,800 Gbps | GCP Documentation |
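The aggregate figures in the table can be broken down per GPU, and the latency figure converts directly into a throughput figure. The Python sketch below does that arithmetic. It assumes the memory and FP8 totals are split evenly across the 8 GPUs and that the 5.4 ms/token latency describes a single request stream. The function names are illustrative and not part of any GCP or NVIDIA tool.

```python
# Figures copied from the Synthetic Benchmarks table.
GPU_COUNT = 8
TOTAL_GPU_MEMORY_GB = 640
TOTAL_FP8_PFLOPS = 15.8        # assumed to be the total across all 8 GPUs
LATENCY_MS_PER_TOKEN = 5.4     # assumed to be single-stream latency


def per_gpu(total: float, gpu_count: int = GPU_COUNT) -> float:
    """Split an aggregate figure evenly across the GPUs."""
    return total / gpu_count


def tokens_per_second(latency_ms_per_token: float) -> float:
    """Convert per-token latency (ms) into tokens per second for one stream."""
    return 1000.0 / latency_ms_per_token


memory_per_gpu_gb = per_gpu(TOTAL_GPU_MEMORY_GB)
fp8_per_gpu_pflops = per_gpu(TOTAL_FP8_PFLOPS)
llama_tokens_per_s = tokens_per_second(LATENCY_MS_PER_TOKEN)

print(f"Memory per GPU: {memory_per_gpu_gb:.0f} GB")
print(f"FP8 per GPU:    {fp8_per_gpu_pflops:.3f} PFLOPS")
print(f"Llama 2 7B:     {llama_tokens_per_s:.0f} tokens/s")
```

The last value rounds to 185 tokens/s, which agrees with the Llama 2 7B throughput listed under Workload Performance.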
Workload Performance
Workload Performance Summary
Best result per workload and metric. Units differ between rows; see the table below for details.
| Workload | Metric | Machine Type | Result | Notes |
|---|---|---|---|---|
| ML Inference | Llama 2 7B Tokens/sec | a3-highgpu-8g | 185 tokens/s | Llama 2 7B, FP16, single H100 GPU, vLLM |
| ML Inference | ResNet-50 Images/sec | a3-highgpu-8g | 4,200 images/s | ResNet-50, batch size 64, FP16, single H100 GPU |
| ML Inference | Stable Diffusion Images/min | a3-highgpu-8g | 28 images/min | Stable Diffusion XL, 1024x1024, 30 steps, single H100 GPU |
| ML Training | GPT-2 Training Throughput | a3-highgpu-8g | 48,000 tokens/s | GPT-2 medium, 8x H100, DeepSpeed ZeRO-3, FP16 |
| ML Training | ResNet-50 Training Images/sec | a3-highgpu-8g | 12,800 images/s | ResNet-50, 8x H100, mixed precision, PyTorch DDP |
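The rows above mix single-GPU inference results with 8-GPU training results, so the raw numbers are not directly comparable. A minimal Python sketch of one way to put them on a per-GPU basis follows. The `Result` class and `per_gpu` helper are hypothetical names introduced for illustration, and dividing by GPU count gives only an average, since multi-GPU training rarely scales perfectly linearly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    workload: str
    value: float
    unit: str
    gpus_used: int  # taken from the Notes column


# Values copied from the Workload Performance Summary table.
RESULTS = [
    Result("Llama 2 7B inference", 185, "tokens/s", 1),
    Result("ResNet-50 inference", 4200, "images/s", 1),
    Result("Stable Diffusion XL inference", 28, "images/min", 1),
    Result("GPT-2 medium training", 48000, "tokens/s", 8),
    Result("ResNet-50 training", 12800, "images/s", 8),
]


def per_gpu(result: Result) -> float:
    """Average throughput per GPU (assumes an even split across GPUs)."""
    return result.value / result.gpus_used


for r in RESULTS:
    print(f"{r.workload}: {per_gpu(r):,.0f} {r.unit} per GPU")
```

Under that even-split assumption, the training rows work out to 6,000 tokens/s and 1,600 images/s per GPU.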
Pricing
On-demand pricing in us-central1. Spot and committed-use discount (CUD) prices are shown for comparison.
| Machine Type | vCPUs | Memory | On-Demand/hr | Spot/hr | 1yr CUD/hr | 3yr CUD/hr |
|---|---|---|---|---|---|---|
| a3-highgpu-8g | 208 | 1872 GB | $29.387 | $8.816 | $18.514 | $13.224 |
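The hourly prices can be turned into discount percentages and into a rough cost per unit of work. The Python sketch below is one way to do that, using the GPT-2 training throughput from the workload table as an example. It assumes the whole machine runs at the benchmarked rate for the full hour and ignores storage, networking, and other charges. The helper names are hypothetical.

```python
# Hourly prices copied from the pricing table (us-central1).
ON_DEMAND_PER_HOUR = 29.387
OTHER_PRICES_PER_HOUR = {
    "Spot": 8.816,
    "1yr CUD": 18.514,
    "3yr CUD": 13.224,
}

# GPT-2 medium training throughput on 8x H100, from the workload table.
GPT2_TOKENS_PER_SECOND = 48_000


def discount_pct(price: float, baseline: float = ON_DEMAND_PER_HOUR) -> float:
    """Percentage saved relative to the on-demand price."""
    return (1 - price / baseline) * 100


def cost_per_million_units(hourly_price: float, units_per_second: float) -> float:
    """Cost of processing one million units at a sustained rate."""
    units_per_hour = units_per_second * 3600
    return hourly_price / units_per_hour * 1_000_000


for name, price in OTHER_PRICES_PER_HOUR.items():
    print(f"{name}: {discount_pct(price):.1f}% below on-demand")

on_demand_cost = cost_per_million_units(ON_DEMAND_PER_HOUR, GPT2_TOKENS_PER_SECOND)
print(f"GPT-2 training, on-demand: ${on_demand_cost:.3f} per million tokens")
```

With the listed prices, the discounts come out to roughly 70%, 37%, and 55% below on-demand for Spot, 1-year CUD, and 3-year CUD respectively.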