A3

Accelerator Optimized

Purpose-built for machine learning training, inference, and other GPU-accelerated workloads. Powered by NVIDIA H100 GPUs with high-bandwidth GPU-to-GPU interconnects.

Specifications

vCPUs	208
Memory	1872 GB
Network	Up to 1,800 Gbps (GPU-to-GPU)

Category	Metric	Result	Source
gpu	GPU Count (H100)	8 GPUs	GCP Documentation
gpu	GPU Memory (Total)	640 GB HBM3	GCP Documentation
gpu	GPU Interconnect Bandwidth	1,800 Gbps	GCP Documentation
gpu	FP8 Performance	15.8 PFLOPS	NVIDIA H100 Specs
gpu	Training Throughput (ResNet-50)	12,800 images/s	Community Benchmarks
gpu	Inference Latency (Llama 2 7B)	5.4 ms/token	Community Benchmarks
network	Max Bandwidth	1,800 Gbps	GCP Documentation
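
As a rough cross-check of the aggregate GPU figures in the table above, the totals can be divided by the GPU count to get per-GPU values. The following is a minimal sketch, not an official calculation: the constant and function names are hypothetical, and it assumes the totals are split evenly across the 8 GPUs.

```python
# Aggregate figures copied from the specifications table above.
GPU_COUNT = 8
TOTAL_HBM_GB = 640          # GPU Memory (Total)
TOTAL_FP8_PFLOPS = 15.8     # FP8 Performance


def per_gpu(total: float, count: int = GPU_COUNT) -> float:
    """Divide an aggregate figure evenly across GPUs (an assumption)."""
    return total / count


hbm_per_gpu_gb = per_gpu(TOTAL_HBM_GB)          # 80.0 GB per GPU
fp8_per_gpu_pflops = per_gpu(TOTAL_FP8_PFLOPS)  # about 1.975 PFLOPS per GPU

print(f"HBM3 per GPU: {hbm_per_gpu_gb:.0f} GB")
print(f"FP8 per GPU:  {fp8_per_gpu_pflops:.3f} PFLOPS")
```

Under that even-split assumption, each GPU has 80 GB of HBM3 and roughly 1.975 PFLOPS of FP8 compute.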

Best result per workload category (mixed units -- see table below for details)

Workload	Metric	Machine Type	Result	Notes
ML Inference	Llama 2 7B Tokens/sec	a3-highgpu-8g	185 tokens/s	Llama 2 7B, FP16, single H100 GPU, vLLM
ML Inference	ResNet-50 Images/sec	a3-highgpu-8g	4,200 images/s	ResNet-50, batch size 64, FP16, single H100 GPU
ML Inference	Stable Diffusion Images/min	a3-highgpu-8g	28 images/min	Stable Diffusion XL, 1024x1024, 30 steps, single H100 GPU
ML Training	GPT-2 Training Throughput	a3-highgpu-8g	48,000 tokens/s	GPT-2 medium, 8x H100, DeepSpeed ZeRO-3, FP16
ML Training	ResNet-50 Training Images/sec	a3-highgpu-8g	12,800 images/s	ResNet-50, 8x H100, mixed precision, PyTorch DDP
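
The throughput and latency figures quoted on this page can be related with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the latency conversion assumes a single decoding stream (so latency per token is the inverse of throughput), and the per-GPU training numbers assume work is spread evenly across the 8 GPUs.

```python
def ms_per_token(tokens_per_second: float) -> float:
    """Convert single-stream throughput to latency per token in milliseconds."""
    return 1000.0 / tokens_per_second


def per_gpu_throughput(total: float, gpu_count: int) -> float:
    """Average throughput per GPU, assuming an even split across GPUs."""
    return total / gpu_count


# Llama 2 7B inference: 185 tokens/s on a single H100.
llama_latency_ms = ms_per_token(185)

# 8x H100 training results from the table above.
resnet_train_per_gpu = per_gpu_throughput(12_800, 8)  # images/s per GPU
gpt2_train_per_gpu = per_gpu_throughput(48_000, 8)    # tokens/s per GPU

print(f"Llama 2 7B latency: {llama_latency_ms:.1f} ms/token")
print(f"ResNet-50 training: {resnet_train_per_gpu:.0f} images/s per GPU")
print(f"GPT-2 training:     {gpt2_train_per_gpu:.0f} tokens/s per GPU")
```

The 185 tokens/s result works out to about 5.4 ms/token, which agrees with the inference latency listed in the specifications table.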

On-demand pricing in us-central1. Spot and committed-use discounts shown for comparison.

Machine Type	vCPUs	Memory	On-Demand/hr	Spot/hr	1yr CUD/hr	3yr CUD/hr
a3-highgpu-8g	208	1872 GB	$29.387	$8.816	$18.514	$13.224
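
To compare the pricing options, the discount of each rate relative to on-demand can be computed directly from the table. This is a minimal sketch using only the prices listed above; the names are hypothetical, and the 730 hours per month figure is an assumed billing convention, not something stated on this page.

```python
# Hourly prices for a3-highgpu-8g in us-central1, copied from the table above.
ON_DEMAND = 29.387
PRICES = {"spot": 8.816, "cud_1yr": 18.514, "cud_3yr": 13.224}

GPU_COUNT = 8
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def discount_pct(price: float, baseline: float = ON_DEMAND) -> float:
    """Percentage saved relative to the baseline hourly price."""
    return (1 - price / baseline) * 100


discounts = {name: round(discount_pct(p), 1) for name, p in PRICES.items()}
on_demand_per_gpu_hour = ON_DEMAND / GPU_COUNT
on_demand_monthly = ON_DEMAND * HOURS_PER_MONTH

for name, pct in discounts.items():
    print(f"{name}: {pct:.1f}% below on-demand")
print(f"On-demand per GPU-hour: ${on_demand_per_gpu_hour:.3f}")
print(f"On-demand per month:    ${on_demand_monthly:,.2f}")
```

With the listed prices, spot comes out at about 70% below on-demand, the 1-year commitment at about 37%, and the 3-year commitment at about 55%.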
